Yioop_V9.5_Source_Code_Documentation

MachineModel extends Model
in package

This class is used to handle database results related to machine administration

Tags
author

Chris Pollett

Table of Contents

DEFAULT_DESCRIPTION_LENGTH  = 150
Default maximum character length of a search summary
MAX_SNIPPET_TITLE_LENGTH  = 20
MIN_SNIPPET_LENGTH  = 100
SNIPPET_LENGTH_LEFT  = 20
SNIPPET_LENGTH_RIGHT  = 40
SNIPPET_TITLE_LENGTH  = 20
$any_fields  : array<string|int, mixed>
These fields, if present in $search_array (used by getRows()) but with value "-1", will be skipped as part of the WHERE clause but will be used for the ORDER BY clause
$cache  : object
Cache object to be used if we are doing caching
$db  : object
Reference to a DatasourceManager
$db_name  : string
Name of the search engine database
$edited_page_summaries  : array<string|int, mixed>
Associative array of page summaries which might be used to override default page summaries if set.
$private_db  : object
Reference to a private DatasourceManager
$private_db_name  : string
Name of the private search engine database
$search_table_column_map  : array<string|int, mixed>
Associations of the form name of field for web forms => database column names/abbreviations
$web_site  : object
Reference to a WebSite object in use to serve pages (if any)
__construct()  : mixed
Sets up the database manager that will be used and name of the search engine database
addMachine()  : mixed
Add a machine to the database using provided string
boldKeywords()  : string
Given a string, wraps in bold html tags a set of key words it contains.
checkMachineExists()  : bool
Checks whether there is a machine whose given field(s) equal the given value(s)
createIfNecessaryDirectory()  : int
Creates a directory and sets it to world permission if it doesn't already exist
deleteMachine()  : mixed
Delete a machine by its name
fileGetContents()  : string
Either a wrapper for file_get_contents, or if a WebSite object is being used to serve pages, it reads the file in using blocking I/O file_get_contents() and caches it before returning its string contents.
filePutContents()  : mixed
Either a wrapper for file_put_contents, or if a WebSite object is being used to serve pages, writes $data to the persistent file with name $filename. Saves a copy in the RAM cache if there is a copy already there.
formatSinglePageResult()  : array<string|int, mixed>
Given a page summary, extracts snippets which are related to a set of search words. For each snippet, bold faces the search terms, and then creates a new summary array.
fromCallback()  : string
Controls which tables, and the names of the tables, underlie the given model and should be used in a getRows call. This defaults to the single table whose name is whatever is before Model in the name of the model. For example, by default on FooModel this method would return "FOO". If different behavior is needed, this can be overridden in subclasses of Model.
getChannels()  : array<string|int, mixed>
Returns an array of channels used by at least one machine
getDbmsList()  : array<string|int, mixed>
Gets a list of all DBMS that work with the search engine
getFetchersQueueServerRatio()  : int
Returns the total number of active fetchers divided by the number of queue servers across all machines, or 1 if this is smaller than 1
getJobsList()  : array<string|int, mixed>
Returns a list of the media jobs present on this server and whether they are running
getJobStatus()  : bool
Returns whether or not a media job is currently scheduled to be periodically run
getLog()  : string
Get either a fetcher or queue_server log for a machine
getMachineList()  : mixed
Returns all the machine names stored in the DB
getMachineStatuses()  : array<string|int, mixed>
Returns the statuses of the fetchers and queue_servers of machines in the machine table, as well as the names and URLs of these machines
getQueueServerNames()  : array<string|int, mixed>
Returns a list of the queue_server (not mirrors) names
getQueueServerUrls()  : array<string|int, mixed>
Returns urls for all the queue_servers (not mirrors) stored in the DB
getRows()  : array<string|int, mixed>
Gets a range of rows which match the provided search criteria from the provided table
getSnippets()  : string
Given a string, extracts snippets of text related to a given set of key words. For a given word, a snippet is a window of characters to its left and right that is less than a maximum total number of characters.
getUserId()  : string
Get the user_id associated with a given username (In base class as used as an internal method in both signin and user models)
isSingleLocalhost()  : bool
Used to determine if an action involves just one yioop instance on the current local machine or not
loginDbms()  : bool
Returns whether the provided dbms needs a login and password or not (sqlite and sqlite3 do not)
postQueryCallback()  : array<string|int, mixed>
Called after getRows has retrieved all the rows that it would retrieve but before they are returned to give one last place where they could be further manipulated. This callback is used to make parallel network calls to get the status of each machine returned by getRows. The default for this method is to leave the rows that would be returned unchanged
restartCrashedFetchers()  : mixed
Used to restart any fetchers which the user turned on, but which happened to have crashed. (Crashes are usually caused by CURL or memory issues)
rowCallback()  : array<string|int, mixed>
Called after a row is retrieved by getRows from the database to perform some manipulation that would be useful for this model.
searchArrayToWhereOrderClauses()  : array<string|int, mixed>
Creates the WHERE and ORDER BY clauses for a query of a Yioop table such as USERS, ROLE, GROUP, which have associated search web forms. Searches are case insensitive
selectCallback()  : string
Controls which columns, and the names of those columns, from the tables underlying the given model should be returned from a getRows call.
setJobStatus()  : mixed
Sets whether a media job should be periodically run or not
translateDb()  : mixed
Used to get the translation of a string_id stored in the database to the given locale.
update()  : mixed
Used to start or stop a queue_server, fetcher, mirror instance on a machine managed by the current one
whereCallback()  : string
Controls the WHERE clause of the SQL query that underlies the given model and should be used in a getRows call.
getJobNameFromPath()  : string
Returns the name of a job from its class file path

Constants

DEFAULT_DESCRIPTION_LENGTH

Default maximum character length of a search summary

public mixed DEFAULT_DESCRIPTION_LENGTH = 150

MAX_SNIPPET_TITLE_LENGTH

public mixed MAX_SNIPPET_TITLE_LENGTH = 20

MIN_SNIPPET_LENGTH

public mixed MIN_SNIPPET_LENGTH = 100

SNIPPET_LENGTH_LEFT

public mixed SNIPPET_LENGTH_LEFT = 20

SNIPPET_LENGTH_RIGHT

public mixed SNIPPET_LENGTH_RIGHT = 40

SNIPPET_TITLE_LENGTH

public mixed SNIPPET_TITLE_LENGTH = 20

Properties

$any_fields

These fields, if present in $search_array (used by getRows()) but with value "-1", will be skipped as part of the WHERE clause but will be used for the ORDER BY clause

public array<string|int, mixed> $any_fields = []

$cache

Cache object to be used if we are doing caching

public static object $cache

$db

Reference to a DatasourceManager

public object $db

$db_name

Name of the search engine database

public string $db_name

$edited_page_summaries

Associative array of page summaries which might be used to override default page summaries if set.

public array<string|int, mixed> $edited_page_summaries = null

$private_db

Reference to a private DatasourceManager

public object $private_db

$private_db_name

Name of the private search engine database

public string $private_db_name

$search_table_column_map

Associations of the form name of field for web forms => database column names/abbreviations

public array<string|int, mixed> $search_table_column_map = ["name" => "NAME"]

$web_site

Reference to a WebSite object in use to serve pages (if any)

public object $web_site

Methods

__construct()

Sets up the database manager that will be used and name of the search engine database

public __construct([string $db_name = CDB_NAME ][, bool $connect = true ][, mixed $web_site = null ]) : mixed
Parameters
$db_name : string = CDB_NAME

the name of the database for the search engine

$connect : bool = true

whether to connect to the database by default after making the datasource class

$web_site : mixed = null
Return values
mixed

addMachine()

Add a machine to the database using provided string

public addMachine(string $name, string $url, int $channel, int $num_fetchers[, string $parent = "" ]) : mixed
Parameters
$name : string

the name of the machine to be added

$url : string

the url of this machine

$channel : int

-1 if this machine is not running a queue_server or mirror; otherwise its channel (a value >= 0)

$num_fetchers : int

how many managed fetchers are on this machine

$parent : string = ""

if this machine replicates some other machine, then the name of the parent

Return values
mixed

boldKeywords()

Given a string, wraps in bold html tags a set of key words it contains.

public boldKeywords(string $text, array<string|int, mixed> $words) : string
Parameters
$text : string

haystack string to look for the key words

$words : array<string|int, mixed>

an array of words to bold face

Return values
string

the resulting string after boldfacing has been applied
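
The idea can be illustrated with a minimal standalone sketch. The function name boldKeywordsSketch is hypothetical, and the case-insensitive regular-expression approach is an assumption rather than Yioop's actual implementation.

```php
<?php
/**
 * Hypothetical stand-in for boldKeywords(); not Yioop's code.
 * Wraps each occurrence of each word in <b> tags, ignoring case.
 */
function boldKeywordsSketch(string $text, array $words): string
{
    foreach ($words as $word) {
        if ($word === "") {
            continue; // an empty pattern would match everywhere
        }
        // preg_quote() keeps characters such as "." or "+" literal;
        // $0 re-inserts the matched text with its original casing
        $pattern = '/' . preg_quote($word, '/') . '/iu';
        $text = preg_replace($pattern, '<b>$0</b>', $text);
    }
    return $text;
}

$bolded = boldKeywordsSketch("Yioop is a PHP search engine",
    ["search", "php"]);
```

One limitation of this sketch: a later keyword such as "b" would also match inside the tags inserted for an earlier keyword, so a real implementation needs to guard against that.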

checkMachineExists()

Checks whether there is a machine whose given field(s) equal the given value(s)

public checkMachineExists(mixed $fields, mixed $values) : bool
Parameters
$fields : mixed

field (string) or fields (array of strings) to use to look up machines (either name, url, channel)

$values : mixed

value (string) or values (array of strings) for that field

Return values
bool

whether or not such a machine exists
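
Because $fields and $values may each be a single string or an array, a lookup of this kind first has to normalize both to arrays. The sketch below shows one way to do that and to build a parameterized query. The function name machineLookupSketch, the COUNT(*) query shape, and the upper-cased column names are assumptions for illustration; only the MACHINE table name follows from the fromCallback() default documented for this class.

```php
<?php
/**
 * Hypothetical helper: turns field/value pairs into a parameterized
 * existence query. Returns [sql string, list of bound values].
 */
function machineLookupSketch($fields, $values): array
{
    // accept either a scalar or an array for both arguments
    $fields = (array) $fields;
    $values = array_values((array) $values);
    if (count($fields) !== count($values)) {
        throw new InvalidArgumentException(
            "fields and values must pair up");
    }
    $conditions = [];
    foreach ($fields as $field) {
        // assumed convention: column name is the upper-cased field name
        $conditions[] = strtoupper($field) . " = ?";
    }
    $sql = "SELECT COUNT(*) AS NUM FROM MACHINE WHERE " .
        implode(" AND ", $conditions);
    return [$sql, $values];
}

[$sql, $params] = machineLookupSketch(["name", "channel"], ["alpha", 0]);
[$single_sql, $single_params] =
    machineLookupSketch("url", "http://localhost/");
```

Field names are interpolated into the SQL here, so a real caller would restrict them to a known list such as name, url, and channel.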

createIfNecessaryDirectory()

Creates a directory and sets it to world permission if it doesn't already exist

public createIfNecessaryDirectory(string $directory) : int
Parameters
$directory : string

name of directory to create

Return values
int

-1 on failure, 0 if already existed, 1 if created
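
The three return codes can be sketched as follows. The function name createIfNecessaryDirectorySketch is hypothetical, and reading "world permission" as mode 0777 is an assumption.

```php
<?php
/**
 * Hypothetical stand-in showing the documented return codes:
 * -1 on failure, 0 if the directory already existed, 1 if created.
 */
function createIfNecessaryDirectorySketch(string $directory): int
{
    if (file_exists($directory)) {
        return 0;
    }
    // @ suppresses the warning so failure is reported via the return code
    if (!@mkdir($directory)) {
        return -1;
    }
    // assumption: "world permission" means read/write/execute for all
    @chmod($directory, 0777);
    return 1;
}

$dir = sys_get_temp_dir() . "/machine_model_sketch_" . uniqid();
$first = createIfNecessaryDirectorySketch($dir);  // newly created
$second = createIfNecessaryDirectorySketch($dir); // already there
rmdir($dir);
```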

deleteMachine()

Delete a machine by its name

public deleteMachine(string $machine_name) : mixed
Parameters
$machine_name : string

the name of the machine to delete

Return values
mixed

fileGetContents()

Either a wrapper for file_get_contents, or if a WebSite object is being used to serve pages, it reads the file in using blocking I/O file_get_contents() and caches it before returning its string contents.

public fileGetContents(string $filename[, bool $force_read = false ]) : string

Note this function assumes that only the web server is performing I/O with this file. filemtime() can be used to see if a file on disk has been changed, and then $force_read = true can be used to force re-reading the file into the cache.

Parameters
$filename : string

name of file to get contents of

$force_read : bool = false

whether to force the file to be read from persistent storage rather than the cache

Return values
string

contents of the file given by $filename
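
The interaction between the cache and $force_read can be sketched with a small in-memory cache. The class FileCacheSketch is hypothetical and stands in for the WebSite-backed cache, whose internals are not described here.

```php
<?php
/**
 * Hypothetical read-through file cache; not the WebSite class.
 */
final class FileCacheSketch
{
    /** @var array<string, string> cached contents keyed by file name */
    private array $cache = [];

    public function get(string $filename, bool $force_read = false): string
    {
        if ($force_read || !array_key_exists($filename, $this->cache)) {
            // blocking read from persistent storage
            $contents = file_get_contents($filename);
            $this->cache[$filename] =
                ($contents === false) ? "" : $contents;
        }
        return $this->cache[$filename];
    }
}

$file = tempnam(sys_get_temp_dir(), "fgc");
file_put_contents($file, "v1");
$reader = new FileCacheSketch();
$a = $reader->get($file);        // first call reads from disk
file_put_contents($file, "v2");  // file changes behind the cache
$b = $reader->get($file);        // stale cached copy is returned
$c = $reader->get($file, true);  // forced re-read picks up the change
unlink($file);
```

The stale read in the middle is why the note above assumes that only the web server writes to the file.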

filePutContents()

Either a wrapper for file_put_contents, or if a WebSite object is being used to serve pages, writes $data to the persistent file with name $filename. Saves a copy in the RAM cache if there is a copy already there.

public filePutContents(string $filename, string $data) : mixed
Parameters
$filename : string

name of file to write to persistent storage

$data : string

string of data to store in file

Return values
mixed

formatSinglePageResult()

Given a page summary, extracts snippets which are related to a set of search words. For each snippet, bold faces the search terms, and then creates a new summary array.

public formatSinglePageResult(array<string|int, mixed> $page[, array<string|int, mixed> $words = null ][, int $description_length = self::DEFAULT_DESCRIPTION_LENGTH ]) : array<string|int, mixed>
Parameters
$page : array<string|int, mixed>

a single search result summary

$words : array<string|int, mixed> = null

keywords (typically what was searched on)

$description_length : int = self::DEFAULT_DESCRIPTION_LENGTH

length of the description

Return values
array<string|int, mixed>

$page which has been snippified and bold faced

fromCallback()

Controls which tables, and the names of the tables, underlie the given model and should be used in a getRows call. This defaults to the single table whose name is whatever is before Model in the name of the model. For example, by default on FooModel this method would return "FOO". If different behavior is needed, this can be overridden in subclasses of Model.

public fromCallback([mixed $args = null ]) : string
Parameters
$args : mixed = null

any additional arguments which should be used to determine these tables

Return values
string

a comma separated list of tables suitable for a SQL query
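
The default naming rule can be sketched as a standalone function. The name defaultTableName is hypothetical, and the handling of namespaced class names is an assumption.

```php
<?php
/**
 * Hypothetical helper: derives the default table name from a model
 * class name by dropping the namespace and the trailing "Model".
 */
function defaultTableName(string $class_name): string
{
    // keep only the part after the last namespace separator, if any
    $pos = strrpos($class_name, "\\");
    $short = ($pos === false) ? $class_name
        : substr($class_name, $pos + 1);
    // strip a trailing "Model" and upper-case what is left
    return strtoupper(preg_replace('/Model$/', '', $short));
}

$foo_table = defaultTableName("FooModel");
$machine_table =
    defaultTableName("seekquarry\\yioop\\models\\MachineModel");
```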

getChannels()

Returns an array of channels used by at least one machine

public getChannels() : array<string|int, mixed>
Return values
array<string|int, mixed>

of integer server labels

getDbmsList()

Gets a list of all DBMS that work with the search engine

public getDbmsList() : array<string|int, mixed>
Return values
array<string|int, mixed>

Names of available data sources

getFetchersQueueServerRatio()

Returns the total number of active fetchers divided by the number of queue servers across all machines, or 1 if this is smaller than 1

public getFetchersQueueServerRatio() : int
Return values
int

average number of fetchers currently turned 'on' per queue server
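
As a rough illustration of the arithmetic, the sketch below takes the two counts as plain integers. The function name is hypothetical, and rounding down with integer division as well as returning 1 when there are no queue servers are assumptions; the description above only fixes the lower bound of 1.

```php
<?php
/**
 * Hypothetical stand-in: fetchers per queue server, never below 1.
 */
function fetchersQueueServerRatioSketch(int $active_fetchers,
    int $queue_servers): int
{
    if ($queue_servers <= 0) {
        return 1; // assumed fallback; avoids division by zero
    }
    // assumed rounding: integer division, then clamp to at least 1
    return max(1, intdiv($active_fetchers, $queue_servers));
}
```

For example, 6 active fetchers over 2 queue servers give 3, while 1 fetcher over 4 queue servers is clamped to 1.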

getJobsList()

Returns a list of the media jobs present on this server and whether they are running

public getJobsList() : array<string|int, mixed>
Return values
array<string|int, mixed>

[job_name => status, ...]

getJobStatus()

Returns whether or not a media job is currently scheduled to be periodically run

public getJobStatus(string $job) : bool
Parameters
$job : string

the job to see if running or not

Return values
bool

whether scheduled to be periodically run or not

getLog()

Get either a fetcher or queue_server log for a machine

public getLog(string $machine_name, int $id, string $type[, string $filter = "" ]) : string
Parameters
$machine_name : string

the name of the machine to get the log file for

$id : int

if a fetcher, which instance on the machine

$type : string

one of queue_server, fetcher, mirror, or MediaUpdater

$filter : string = ""

only lines of the log containing this string are returned

Return values
string

containing the last MachineController::LOG_LISTING_LEN bytes of the log record

getMachineList()

Returns all the machine names stored in the DB

public getMachineList() : mixed

Return values
mixed

array of machine names

getMachineStatuses()

Returns the statuses of the fetchers and queue_servers of machines in the machine table, as well as the names and URLs of these machines

public getMachineStatuses([array<string|int, mixed> $machines = [] ]) : array<string|int, mixed>
Parameters
$machines : array<string|int, mixed> = []

an array of machines to check the status for

Return values
array<string|int, mixed>

a list of machines, together with all their properties and the statuses of their fetchers and queue_servers

getQueueServerNames()

Returns a list of the queue_server (not mirrors) names

public getQueueServerNames() : array<string|int, mixed>
Return values
array<string|int, mixed>

of machine names

getQueueServerUrls()

Returns urls for all the queue_servers (not mirrors) stored in the DB

public getQueueServerUrls(string $crawl_time[, int $channel = -1 ]) : array<string|int, mixed>
Parameters
$crawl_time : string

timestamp of a crawl, used to see the machines used in that crawl

$channel : int = -1

only return QueueServers on this channel

Return values
array<string|int, mixed>

machine urls

getRows()

Gets a range of rows which match the provided search criteria from the provided table

public getRows(int $limit, int $num, int &$total[, array<string|int, mixed> $search_array = [] ][, array<string|int, mixed> $args = null ]) : array<string|int, mixed>
Parameters
$limit : int

starting row from the potential results to return

$num : int

number of rows after start row to return

$total : int

gets set with the total number of rows that can be returned by the given database query

$search_array : array<string|int, mixed> = []

each element of this is a quadruple: the name of a field, what comparison to perform, a value to check, and an order (ascending/descending) to sort by

$args : array<string|int, mixed> = null

additional values which may be used to get rows (what these are will typically depend on the subclass implementation)

Return values
array<string|int, mixed>
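
The roles of $limit, $num, and the by-reference $total can be sketched against an in-memory array instead of a database. The function getRowsSketch and its $all_rows argument are hypothetical.

```php
<?php
/**
 * Hypothetical paging helper: $all_rows stands in for everything the
 * database query could return.
 */
function getRowsSketch(array $all_rows, int $limit, int $num,
    int &$total): array
{
    // $total reports the full match count, not the page size
    $total = count($all_rows);
    // $limit is the starting row, $num the number of rows to return
    return array_slice($all_rows, $limit, $num);
}

$machines = ["alpha", "beta", "gamma", "delta", "epsilon"];
$total = 0;
$page = getRowsSketch($machines, 2, 2, $total);
```

Here $page holds the third and fourth machine names and $total is 5, which is what a caller needs to render pagination controls.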

getSnippets()

Given a string, extracts snippets of text related to a given set of key words. For a given word, a snippet is a window of characters to its left and right that is less than a maximum total number of characters.

public getSnippets(string $text, array<string|int, mixed> $words, string $description_length) : string

There is also a rule that a snippet should avoid ending in the middle of a word

Parameters
$text : string

haystack to extract snippet from

$words : array<string|int, mixed>

keywords to look for in the haystack

$description_length : string

length of the description desired

Return values
string

a concatenation of the extracted snippets of each word
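
A single-word version of the windowing idea can be sketched as below, using the SNIPPET_LENGTH_LEFT and SNIPPET_LENGTH_RIGHT values listed under Constants. The function snippetSketch is hypothetical; how the real method trims to word boundaries, joins snippets for several words, and applies $description_length is not shown here and is not implied by this sketch.

```php
<?php
const SNIPPET_LENGTH_LEFT = 20;   // value from the Constants section
const SNIPPET_LENGTH_RIGHT = 40;  // value from the Constants section

/**
 * Hypothetical helper: returns a window around the first occurrence
 * of $word, shrunk so it does not start or end inside a word.
 * Uses byte offsets; multibyte text would need the mb_* functions.
 */
function snippetSketch(string $text, string $word): string
{
    if ($word === "") {
        return "";
    }
    $pos = stripos($text, $word);
    if ($pos === false) {
        return "";
    }
    $word_end = $pos + strlen($word);
    $start = max(0, $pos - SNIPPET_LENGTH_LEFT);
    $end = min(strlen($text), $word_end + SNIPPET_LENGTH_RIGHT);
    // window starts mid-word: advance to the next word start
    if ($start > 0 && $text[$start - 1] !== " ") {
        $space = strpos($text, " ", $start);
        if ($space !== false && $space < $pos) {
            $start = $space + 1;
        }
    }
    // window ends mid-word: retreat to the previous space
    if ($end < strlen($text) && $text[$end] !== " ") {
        $space = strrpos(substr($text, 0, $end), " ");
        if ($space !== false && $space > $word_end) {
            $end = $space;
        }
    }
    return substr($text, $start, $end - $start);
}

$text = "The quick brown fox jumps over the lazy dog while the " .
    "crawler fetches pages from many hosts";
$fox = snippetSketch($text, "fox");
$hosts = snippetSketch($text, "hosts");
```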

getUserId()

Get the user_id associated with a given username (In base class as used as an internal method in both signin and user models)

public getUserId(string $username) : string
Parameters
$username : string

the username to look up

Return values
string

the corresponding userid

isSingleLocalhost()

Used to determine if an action involves just one yioop instance on the current local machine or not

public isSingleLocalhost(array<string|int, mixed> $machine_urls[, string $index_timestamp = -1 ]) : bool
Parameters
$machine_urls : array<string|int, mixed>

urls of yioop instances to which the action applies

$index_timestamp : string = -1

if a timestamp is given, checks whether the index has declared itself to be a no network index.

Return values
bool

whether it involves a single local yioop instance (true) or not (false)

loginDbms()

Returns whether the provided dbms needs a login and password or not (sqlite and sqlite3 do not)

public loginDbms(string $dbms) : bool
Parameters
$dbms : string

the name of a database management system

Return values
bool

true if needs a login and password; false otherwise
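
A minimal sketch of such a check follows, assuming that sqlite and sqlite3 are the only file-based systems that need no credentials. The function name loginDbmsSketch and the case-insensitive comparison are assumptions.

```php
<?php
/**
 * Hypothetical stand-in: file-based SQLite variants need no login,
 * every other DBMS name is assumed to need one.
 */
function loginDbmsSketch(string $dbms): bool
{
    return !in_array(strtolower($dbms), ["sqlite", "sqlite3"], true);
}
```

A configuration form could use the result to decide whether to show the username and password fields.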

postQueryCallback()

Called after getRows has retrieved all the rows that it would retrieve but before they are returned to give one last place where they could be further manipulated. This callback is used to make parallel network calls to get the status of each machine returned by getRows. The default for this method is to leave the rows that would be returned unchanged

public postQueryCallback(array<string|int, mixed> $rows) : array<string|int, mixed>
Parameters
$rows : array<string|int, mixed>

that have been calculated so far by getRows

Return values
array<string|int, mixed>

$rows after this final manipulation

restartCrashedFetchers()

Used to restart any fetchers which the user turned on, but which happened to have crashed. (Crashes are usually caused by CURL or memory issues)

public restartCrashedFetchers() : mixed
Return values
mixed

rowCallback()

Called after a row is retrieved by getRows from the database to perform some manipulation that would be useful for this model.

public rowCallback(array<string|int, mixed> $row, mixed $args) : array<string|int, mixed>

For example, in CrawlModel, after a row representing a crawl mix has been gotten, this is used to perform an additional query to marshal its components. By default this method just returns this row unchanged.

Parameters
$row : array<string|int, mixed>

row as retrieved from database query

$args : mixed

additional arguments that might be used by this callback

Return values
array<string|int, mixed>

$row after callback manipulation
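
How rowCallback() and postQueryCallback() might fit together can be sketched without a database. Both classes below are hypothetical, and the "unknown" status is a placeholder for the parallel network status lookups that this model performs.

```php
<?php
/**
 * Hypothetical pipeline: each row passes through rowCallback(), then
 * the whole result set passes through postQueryCallback().
 */
class RowPipelineSketch
{
    public function rowCallback(array $row, $args): array
    {
        return $row; // default: row unchanged
    }

    public function postQueryCallback(array $rows): array
    {
        return $rows; // default: rows unchanged
    }

    public function process(array $raw_rows, $args = null): array
    {
        $rows = [];
        foreach ($raw_rows as $row) {
            $rows[] = $this->rowCallback($row, $args);
        }
        return $this->postQueryCallback($rows);
    }
}

class StatusPipelineSketch extends RowPipelineSketch
{
    public function postQueryCallback(array $rows): array
    {
        foreach ($rows as $i => $row) {
            // placeholder for a real per-machine status request
            $rows[$i]["STATUS"] = "unknown";
        }
        return $rows;
    }
}

$raw = [["NAME" => "alpha"], ["NAME" => "beta"]];
$plain = (new RowPipelineSketch())->process($raw);
$with_status = (new StatusPipelineSketch())->process($raw);
```

Doing the status lookups once over the full row set, rather than per row, is what makes batching or parallel requests possible.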

searchArrayToWhereOrderClauses()

Creates the WHERE and ORDER BY clauses for a query of a Yioop table such as USERS, ROLE, GROUP, which have associated search web forms. Searches are case insensitive

public searchArrayToWhereOrderClauses(array<string|int, mixed> $search_array[, array<string|int, mixed> $any_fields = ['status'] ]) : array<string|int, mixed>
Parameters
$search_array : array<string|int, mixed>

each element of this is a quadruple: the name of a field, what comparison to perform, a value to check, and an order (ascending/descending) to sort by

$any_fields : array<string|int, mixed> = ['status']

these fields if present in search array but with value "-1" will be skipped as part of the where clause but will be used for order by clause

Return values
array<string|int, mixed>

string for where clause, string for order by clause
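
The sketch below shows how quadruples could be turned into the two clauses, including the "-1" rule for $any_fields. Everything beyond the parameter descriptions is an assumption: the function name whereOrderSketch, the comparison names "=" and "CONTAINS", the "ASC"/"DESC" order values, the upper-cased column names, and the quote-doubling escape, which only stands in for whatever escaping the datasource provides.

```php
<?php
/**
 * Hypothetical stand-in: builds [WHERE clause, ORDER BY clause] from
 * quadruples of [field, comparison, value, sort direction].
 */
function whereOrderSketch(array $search_array,
    array $any_fields = ['status']): array
{
    $where_parts = [];
    $order_parts = [];
    foreach ($search_array as [$field, $comparison, $value, $direction]) {
        $column = strtoupper($field); // assumed field => column mapping
        // "-1" on an any-field means "match anything": no WHERE term
        $skip_where = in_array($field, $any_fields, true)
            && (string) $value === "-1";
        if (!$skip_where && (string) $value !== "") {
            // lower-casing both sides makes the match case insensitive
            $escaped = str_replace("'", "''",
                strtolower((string) $value));
            $where_parts[] = ($comparison === "CONTAINS")
                ? "LOWER($column) LIKE '%$escaped%'"
                : "LOWER($column) = '$escaped'";
        }
        if ($direction === "ASC" || $direction === "DESC") {
            $order_parts[] = "$column $direction";
        }
    }
    $where = $where_parts
        ? " WHERE " . implode(" AND ", $where_parts) : "";
    $order_by = $order_parts
        ? " ORDER BY " . implode(", ", $order_parts) : "";
    return [$where, $order_by];
}

[$where, $order_by] = whereOrderSketch([
    ["name", "CONTAINS", "Alpha", "ASC"],
    ["status", "=", "-1", "DESC"],
]);
```

In this example the status entry contributes to ORDER BY only, because its value is "-1" and status is listed in $any_fields.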

selectCallback()

Controls which columns, and the names of those columns, from the tables underlying the given model should be returned from a getRows call.

public selectCallback([mixed $args = null ]) : string

This defaults to *, but in general will be overridden in subclasses of Model

Parameters
$args : mixed = null

any additional arguments which should be used to determine the columns

Return values
string

a comma separated list of columns suitable for a SQL query

setJobStatus()

Sets whether a media job should be periodically run or not

public setJobStatus(string $job, bool $status) : mixed
Parameters
$job : string

the job whose status is to be set

$status : bool

(true or non-empty) means periodically run the job, false means don't run the job.

Return values
mixed

translateDb()

Used to get the translation of a string_id stored in the database to the given locale.

public translateDb(string $string_id, string $locale_tag) : mixed
Parameters
$string_id : string

id to translate

$locale_tag : string

to translate to

Return values
mixed

translation if found; $string_id otherwise

update()

Used to start or stop a queue_server, fetcher, mirror instance on a machine managed by the current one

public update(string $machine_name, string $action, int $id, string $type) : mixed
Parameters
$machine_name : string

name of machine

$action : string

"start" or "stop"

$id : int

id of process type to update (usually the number of a fetcher on a particular machine)

$type : string

type of process whose status is to be changed: QueueServer, Fetcher, or MediaUpdater

Return values
mixed

whereCallback()

Controls the WHERE clause of the SQL query that underlies the given model and should be used in a getRows call.

public whereCallback([mixed $args = null ]) : string

This defaults to an empty WHERE clause.

Parameters
$args : mixed = null

additional arguments that might be used to construct the WHERE clause.

Return values
string

a SQL WHERE clause
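
One way the select, from, and where callbacks could be combined into a query is sketched below. The classes QuerySketch and GadgetQuerySketch, the buildQuery() method, and the GADGET table with its CHANNEL column are all hypothetical; fromCallback() is left abstract only to keep the sketch short.

```php
<?php
/**
 * Hypothetical base class: assembles a query from the three callbacks.
 */
abstract class QuerySketch
{
    public function selectCallback($args = null): string
    {
        return "*"; // documented default
    }

    abstract public function fromCallback($args = null): string;

    public function whereCallback($args = null): string
    {
        return ""; // documented default: empty WHERE clause
    }

    public function buildQuery($args = null): string
    {
        $sql = "SELECT " . $this->selectCallback($args) .
            " FROM " . $this->fromCallback($args);
        $where = $this->whereCallback($args);
        return ($where === "") ? $sql : $sql . " " . $where;
    }
}

final class GadgetQuerySketch extends QuerySketch
{
    public function fromCallback($args = null): string
    {
        return "GADGET";
    }

    public function whereCallback($args = null): string
    {
        return "WHERE CHANNEL >= 0";
    }
}

$query = (new GadgetQuerySketch())->buildQuery();
```

Subclasses change the query only by overriding callbacks, so the assembly logic stays in one place.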

getJobNameFromPath()

Returns the name of a job from its class file path

private getJobNameFromPath(string $job_path) : string
Parameters
$job_path : string

class file path of job

Return values
string

name of a job


        
