Yioop_V9.5_Source_Code_Documentation

Utility.php

SeekQuarry/Yioop -- Open Source Pure PHP Search Engine, Crawler, and Indexer

Copyright (C) 2009 - 2023 Chris Pollett chris@pollett.org

LICENSE:

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

END LICENSE

A library of string, error reporting, log, hash, time, and conversion functions

Tags
author

Chris Pollett chris@pollett.org

license

https://www.gnu.org/licenses/ GPL3

link
https://www.seekquarry.com/
copyright

2009 - 2023

filesource

Interfaces, Classes, Traits and Enums

Mod9Constants
Mini-class (so not in its own file) used to hold encode/decode info related to Mod9 encoding (a variant of Simplified-9 specific to Yioop).

Table of Contents

addRegexDelimiters()  : string
Adds delimiters to a regex that may or may not have them
preg_search()  : mixed
Searches for a PCRE pattern in a subject from a given offset; returns the position of the first match if found, -1 otherwise.
preg_offset_replace()  : string
Replaces a pcre pattern with a replacement in $subject starting from some offset.
parse_ini_with_fallback()  : array<string|int, mixed>
Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks if parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.
getIniAssignMatch()  : mixed
Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.
charCopy()  : mixed
Copies from $source string beginning at position $start, $length many bytes to destination string
vByteEncode()  : string
Encodes an integer using variable byte coding.
vByteDecode()  : int
Decodes from a string using variable byte coding an integer.
appendUnary()  : mixed
Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.
decodeUnary()  : int
Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.
appendBits()  : string
Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.
decodeBits()  : int
Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a result of this operation, $start_bit_offset is advanced by the number of bits that were able to be decoded.
appendGamma()  : string
Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.
decodeGammaList()  : array<string|int, mixed>
Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.
appendRiceSequence()  : string
Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.
decodeRiceSequence()  : array<string|int, mixed>
Decodes up to $num_decode Rice encoded difference list of integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start.
encodePositionList()  : string
Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.
decodePositionList()  : array<string|int, mixed>
Decodes up to $num_decode term in document position integers from string $input under the assumption $input is encoded as per encodePositionList.
encode255()  : string
Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE
decode255()  : string
Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE
encodeUnderscore()  : string
Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=
decodeUnderscore()  : string
Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -
packEncode255()  : string
Encodes a list of strings as their @see encode255 versions separated by \xFF's
unpackDecode255()  : array<string|int, mixed>
Decodes a list of strings from a string that was encoded as the @see encode255 versions of its elements separated by \xFF's
packPosting()  : string
Makes a packed integer string from a docindex and the number of occurrences of a word in the document with that docindex.
unpackPosting()  : array<string|int, mixed>
Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.
addDocIndexPostings()  : string
This method is used while appending one index shard to another.
deltaList()  : array<string|int, mixed>
Computes the difference of a list of integers.
deDeltaList()  : array<string|int, mixed>
Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function
encodeModified9()  : string
Encodes a sequence of integers x, such that 1 <= x <= 2<<28-1 as a string. NOTICE x>=1.
packListModified9()  : string
Packs the contents of a single word of a sequence being encoded using Modified9.
nextPostString()  : string
Returns the next complete posting string from $input_string beginning at offset.
decodeModified9()  : array<string|int, mixed>
Decodes a sequence of positive integers from a string that has been encoded using Modified 9
unpackListModified9()  : array<string|int, mixed>
Decode a single word with high two bits off according to modified 9
docIndexModified9()  : int
Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.
unpackInt()  : int
Unpacks an int from a 4 char string
packInt()  : string
Packs an int into a 4 char string
unpackFloat()  : float
Unpacks a float from a 4 char string
packFloat()  : string
Packs a float into a four char string
renameSerializedObject()  : string
Used to change the namespace of a serialized php object (assumes doesn't have nested subobjects)
getDomFromString()  : DOMDocument
Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML
getTags()  : array<string|int, mixed>
Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument
toHexString()  : string
Converts a string to a string where each char has been replaced by its hexadecimal equivalent
toIntString()  : string
Converts a string to a string where each char has been replaced by an integer equivalent
toBinString()  : string
Converts a string to a string where each char has been replaced by its binary equivalent
metricToInt()  : int
Converts a string of the form some int followed by K, M, or G to the corresponding integer.
intToMetric()  : string
Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)
crawlLog()  : mixed
Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode, it will log to stdout, otherwise it will use error_log. When logging to file $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be a log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example a call: crawlLog("\n\nInitialize logger..", $this->process_name, true); says $this->process_name should be checked for liveness as part of any subsequent logging activity such as a call crawlLog("Another Message"); (note subsequent calls don't need to specify the process name).
makeTimestamp()  : string
Used to make a log file entry time string of format: entry number, time in r format.
crawlTimeoutLog()  : bool
Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).
crawlHash()  : string
Computes an 8 byte hash of a string for use in storing documents.
crawlHashWord()  : string
Used to create a 20 byte hash of a string (typically a word or phrase with a wikipedia page). Format is 8 byte crawlHash of term (md5 of term two halves XOR'd), followed by a \x00, followed by the first 11 characters from the term. If there are not enough chars to make 20 bytes, then the string is padded with \x00s to 20 bytes.
canonicalTerm()  : string
Takes a $term that might have come from a document and converts it to a string of 16 bytes which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.
compareWordHashes()  : int
Used to compare two ids for index dictionary lookup. Ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.
base64Hash()  : string
Converts a crawl hash number to something closer to base64 coded, but in a form that doesn't get confused in urls or DBs
unbase64Hash()  : string
Decodes a crawl hash number from base64 to raw ASCII
webencode()  : string
Encodes a string in a format suitable for post data (mainly, base64, but str_replace data that might mess up post in result)
webdecode()  : string
Decodes a string encoded by webencode
crawlCrypt()  : string
The crawlCrypt function is used to encrypt passwords stored in the database.
partitionByHash()  : array<string|int, mixed>
Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling
calculatePartition()  : int
Used by a controller to say which queue_server should receive a given input
changeInMicrotime()  : float
Measures the change in time in seconds between two timestamps to microsecond precision
microTimestamp()  : string
Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)
checkTimeInterval()  : int
Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration
convertPixels()  : int
Converts a CSS unit string into its equivalent in pixels. This is used by @see SvgProcessor.
countFiles()  : int
Returns the number of files in a folder
makePath()  : bool
Creates folders along a filesystem path if they don't exist
deleteFileOrDir()  : mixed
This is a callback function used in the process of recursively deleting a directory
setWorldPermissions()  : mixed
This is a callback function used in the process of recursively chmoding to 777 all files in a folder
fileInfo()  : an
This is a callback function used in the process of recursively calculating an array of file modification times and files sizes for a directory
orderCallback()  : int
Callback function used to sort documents by a field
stringOrderCallback()  : int
Callback function used to sort documents by a field where field is assumed to be a string
stringROrderCallback()  : int
Callback function used to sort documents by a field where field is assumed to be a string
rorderCallback()  : int
Callback function used to sort documents by a field in reverse order
lessThan()  : int
Callback to check if $a is less than $b
greaterThan()  : int
Callback to check if $a is greater than $b
e()  : mixed
shorthand for echo
remoteAddress()  : mixed
Compute the real remote address of the incoming connection including forwarding
readInput()  : string
Used to read a line of input from the command-line
readPassword()  : string
Used to read a line of input from the command-line (on unix machines without echoing it)
readMessage()  : string
Used to read several lines from the terminal up until a last line consisting of just a "."
mimeType()  : string
Returns the mime type of the provided file name if it can be determined.
generalIsA()  : bool
Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the 3 param version (last param true) of the PHP is_a function that came into being with Version 5.3.9.
stripAttributes()  : string
Given the contents of a start XML/HTML tag, strips out all the attributes not listed in $safe_attribute_list
parseCsv()  : array<string|int, mixed>
Used to parse into a two dimensional array a string that contains CSV data.
arraytoCsv()  : string
Converts an array of values to a comma separated value formatted string.
diff()  : string
Computes a Unix-style diff of two strings. That is it only outputs lines which disagree between the two strings. It outputs +line if a line occurs in the second but not first string and -line if a line occurs in the first string but not the second.
computeLCS()  : mixed
Computes the longest common subsequence of two arrays
extractLCSFromTable()  : mixed
Extracts from a table of longest common sequence moves (probably calculated by @see computeLCS) and a starting coordinate $i, $j in that table, a longest common subsequence
tail()  : array<string|int, mixed>
Returns an array of the last $num_lines many lines out of a file
lineFilter()  : array<string|int, mixed>
Given an array of lines returns a subarray of those lines containing the filter string or filter array
logLineTimestamp()  : int
Tries to extract a timestamp from a line which is presumed to come from a Yioop log file
isPositiveInteger()  : bool
Returns whether an input can be parsed to a positive integer
measureCall()  : mixed
Used to measure the memory footprint in bytes and time spent calling a method of an object. It also records the number of times the method has been called.
measureObject()  : mixed
Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file) where save_statistics_file is the name of the file you want to store statistics to.
measureObjectCall()  : mixed
General method called by @see measureCall and @see measureObject. Used to measure the memory footprint in bytes of an object, or the memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, it just calls the method without any recording or timing. To initialize, an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to, should be done.
variableClone()  : mixed
Makes a deep copy of a variable regardless of its type
garbageCollect()  : int
Runs various system garbage collection functions and returns number of bytes freed.
utf8SafeSaveHtml()  : string
The dom method saveHTML has a tendency to replace UTF-8, non-ascii characters with html entities. This function is supposed to do the save while avoiding that replacement.
utf8WordWrap()  : string
A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters

Functions

addRegexDelimiters()

Adds delimiters to a regex that may or may not have them

addRegexDelimiters(string $expression) : string
Parameters
$expression : string

a regex

Return values
string

regex with delimiters added if they were not there

preg_search()

Searches for a PCRE pattern in a subject from a given offset; returns the position of the first match if found, -1 otherwise.

preg_search(string $pattern, string $subject, int $offset[, bool $return_match = false ]) : mixed
Parameters
$pattern : string

a Perl compatible regular expression

$subject : string

to search for pattern in

$offset : int

character offset into $subject to begin searching from

$return_match : bool = false

whether to return as well what the match was for the pattern

Return values
mixed

if $return_match is false then the integer position of first match, otherwise, it returns the ordered pair [$pos, $match].
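
A minimal sketch of the return convention described above, built on PHP's preg_match with PREG_OFFSET_CAPTURE. The name sketchPregSearch is hypothetical and this is not the Yioop code; the offset is treated as a byte offset, and what the pair form returns when nothing matches is an assumption.

```php
<?php
// Hypothetical stand-in for illustration; not the Yioop implementation.
function sketchPregSearch(string $pattern, string $subject, int $offset,
    bool $return_match = false)
{
    // PREG_OFFSET_CAPTURE turns each match into [matched text, byte position]
    $found = preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE,
        $offset);
    if (!$found) {
        // Assumption: the pair form reports -1 together with an empty match
        return $return_match ? [-1, ""] : -1;
    }
    [$match, $pos] = $matches[0];
    return $return_match ? [$pos, $match] : $pos;
}

var_dump(sketchPregSearch('/b+/', "abba abbba", 3)); // int(6)
```

The sketch expects a pattern that already carries delimiters.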

preg_offset_replace()

Replaces a pcre pattern with a replacement in $subject starting from some offset.

preg_offset_replace(string $pattern, string $replacement, string $subject, int $offset) : string
Parameters
$pattern : string

a Perl compatible regular expression

$replacement : string

what to replace the pattern with

$subject : string

to search for pattern in

$offset : int

character offset into $subject to begin searching from

Return values
string

result of the replacements

parse_ini_with_fallback()

Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks if parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.

parse_ini_with_fallback(string $file) : array<string|int, mixed>
Parameters
$file : string

filename of ini data to parse into an array

Return values
array<string|int, mixed>

data parsed from file
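
The disabled-or-not check described above could look roughly like the following. Both function names are hypothetical, and the fallback parser itself is left out; this only sketches how the two paths might be selected.

```php
<?php
// Hypothetical helper: is $name in a comma separated disable_functions list?
function sketchIsDisabled(string $name, string $disabled_list): bool
{
    $disabled = array_map('trim', explode(",", $disabled_list));
    return in_array($name, $disabled, true);
}

// Hypothetical wrapper showing the two paths described above
function sketchParseIni(string $file): array
{
    $disabled_list = (string) ini_get("disable_functions");
    if (function_exists("parse_ini_file") &&
        !sketchIsDisabled("parse_ini_file", $disabled_list)) {
        $data = parse_ini_file($file, true);
        return ($data === false) ? [] : $data;
    }
    // A hand-written, line-by-line ini parser would go here
    return [];
}
```

The function_exists test is included because recent PHP versions can hide disabled functions entirely rather than leave them defined.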

getIniAssignMatch()

Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.

getIniAssignMatch(string $matches) : mixed
Parameters
$matches : string

produced by a preg_match in parse_ini_with_fallback

Return values
mixed

value of ini file assignment

charCopy()

Copies from $source string beginning at position $start, $length many bytes to destination string

charCopy(string $source, string &$destination, int $start, int $length[, string $timeout_msg = "" ]) : mixed
Parameters
$source : string

string to copy from

$destination : string

string to copy to

$start : int

starting offset

$length : int

number of bytes to copy

$timeout_msg : string = ""

message to print if taking more than 30 seconds

Return values
mixed

vByteEncode()

Encodes an integer using variable byte coding.

vByteEncode(int $pos_int) : string
Parameters
$pos_int : int

integer to encode

Return values
string

a string of 1-5 chars depending on how big $pos_int was

vByteDecode()

Decodes from a string using variable byte coding an integer.

vByteDecode(string $str, int &$offset) : int
Parameters
$str : string

string to use for decoding

$offset : int

byte offset into string where the var int is stored

Return values
int

the decoded integer
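
To make variable byte coding concrete, here is a sketch of one common convention: seven payload bits per byte, low-order group first, with the high bit set on every byte except the last. The function names are hypothetical and the byte layout is an assumption; Yioop's actual layout may differ.

```php
<?php
// Sketch of one common variable byte convention; not the Yioop code.
function sketchVByteEncode(int $pos_int): string
{
    $out = "";
    while ($pos_int >= 0x80) {
        // low 7 bits plus a continuation flag in the high bit
        $out .= chr(($pos_int & 0x7F) | 0x80);
        $pos_int >>= 7;
    }
    return $out . chr($pos_int); // final byte has the high bit clear
}

function sketchVByteDecode(string $str, int &$offset): int
{
    $value = 0;
    $shift = 0;
    do {
        $byte = ord($str[$offset++]);
        $value |= ($byte & 0x7F) << $shift;
        $shift += 7;
    } while ($byte & 0x80);
    return $value;
}
```

Under this convention a nonnegative 32 bit value needs between one and five bytes, which agrees with the 1-5 chars stated for vByteEncode.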

appendUnary()

Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.

appendUnary(int $number, mixed $input, mixed &$start_bit_offset[, mixed $just_bit_offset = false ]) : mixed
Parameters
$number : int

number to append

$input : mixed
$start_bit_offset : mixed
$just_bit_offset : mixed = false
Return values
mixed

either the resulting string or its length

decodeUnary()

Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.

decodeUnary(string $input, int &$start_bit_offset) : int
Parameters
$input : string

the string that we want to decode a unary number from

$start_bit_offset : int

the starting bit offset in $input to start decoding from. After the call it will be the position after the decode

Return values
int

the decoded unary number
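
The unary convention above (n-1 0's followed by a 1) can be illustrated on text strings of "0" and "1" characters, which are easier to read than packed bytes. The names below are hypothetical and the text representation is purely for illustration; the real functions work on bit offsets into a binary string.

```php
<?php
// Illustration on "0"/"1" text strings, not packed bits.
function sketchUnary(int $n): string
{
    // n >= 1: n-1 zeros, then a terminating one
    return str_repeat("0", $n - 1) . "1";
}

function sketchDecodeUnary(string $bits, int &$offset): int
{
    $end = ($offset < strlen($bits)) ? strpos($bits, "1", $offset) : false;
    if ($end === false) {
        throw new InvalidArgumentException("no terminating 1 bit found");
    }
    $n = $end - $offset + 1;
    $offset = $end + 1; // position just after the decoded number
    return $n;
}
```

For example, sketchUnary(4) yields "0001", and decoding "00011" from offset 0 gives 4 and then 1.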

appendBits()

Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.

appendBits(int $number, string $input, int &$start_bit_offset[,  $num_bits = -1 ]) : string
Parameters
$number : int

to append

$input : string

the string to append to.

$start_bit_offset : int

starting location to begin append from

$num_bits : = -1

number of bits of $number to append.

Return values
string

resulting string

decodeBits()

Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a result of this operation, $start_bit_offset is advanced by the number of bits that were able to be decoded.

decodeBits(string $input, int &$start_bit_offset, int $num_bits) : int
Parameters
$input : string

string to decode bits from

$start_bit_offset : int

bit offset to start decoding from in $input

$num_bits : int

number of bits to try to decode

Return values
int

the number decoded
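
A sketch of reading bits at a bit offset from a binary string follows. The name sketchDecodeBits is hypothetical, and reading each byte most significant bit first is an assumption about bit order, not something the description above specifies.

```php
<?php
// Sketch: read up to $num_bits bits, most significant bit of each byte first.
function sketchDecodeBits(string $input, int &$start_bit_offset,
    int $num_bits): int
{
    $value = 0;
    $total_bits = 8 * strlen($input);
    while ($num_bits-- > 0 && $start_bit_offset < $total_bits) {
        $byte = ord($input[$start_bit_offset >> 3]);   // byte holding the bit
        $bit = ($byte >> (7 - ($start_bit_offset & 7))) & 1;
        $value = ($value << 1) | $bit;
        $start_bit_offset++;                           // advance per bit read
    }
    return $value;
}
```

Because the offset only advances for bits actually read, a request that runs past the end of the input leaves the offset at the end of the string.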

appendGamma()

Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.

appendGamma(int $number, string $input, int &$start_bit_offset) : string
Parameters
$number : int

to append

$input : string

the string to append to.

$start_bit_offset : int

starting bit location to begin append from

Return values
string

resulting string

decodeGammaList()

Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.

decodeGammaList(string $input, int &$start_bit_offset, int $num_decode) : array<string|int, mixed>
Parameters
$input : string

the string to decode from

$start_bit_offset : int

starting bit location to decode from

$num_decode : int

number of int's to decode

Return values
array<string|int, mixed>

decoded int's
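
For readers unfamiliar with gamma codes, the sketch below uses the textbook Elias gamma form on "0"/"1" text strings: a number whose binary form has k+1 bits is written as k zeros followed by that binary form. The names are hypothetical, and whether Yioop uses exactly this variant is an assumption.

```php
<?php
// Textbook Elias gamma on "0"/"1" text strings; illustration only.
function sketchGamma(int $n): string
{
    $binary = decbin($n); // n >= 1, so this starts with "1"
    return str_repeat("0", strlen($binary) - 1) . $binary;
}

function sketchDecodeGammaList(string $bits, int &$offset,
    int $num_decode): array
{
    $out = [];
    $len = strlen($bits);
    while ($num_decode-- > 0 && $offset < $len) {
        $one = strpos($bits, "1", $offset);
        if ($one === false) {
            break;
        }
        $width = $one - $offset + 1; // bits in the binary part
        if ($one + $width > $len) {
            break;                   // truncated code
        }
        $out[] = bindec(substr($bits, $one, $width));
        $offset = $one + $width;
    }
    return $out;
}
```

With this form 1, 2 and 5 become "1", "010" and "00101" respectively.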

appendRiceSequence()

Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.

appendRiceSequence(array<string|int, mixed> $int_sequence, int $modulus, string $output, int &$start_bit_offset[, int $delta_start = -1 ]) : string

Encoding is done as a difference list. If $delta_start is set to a value >= 0 then the first gap is assumed to be from int $delta_start

Parameters
$int_sequence : array<string|int, mixed>

int's to append

$modulus : int

i in the 2^i modulus to use for Rice code

$output : string

the string to append to.

$start_bit_offset : int

starting bit location to begin append from

$delta_start : int = -1

if >= 0 previous int to use for difference list otherwise the first integer is encoded as itself rather than a difference

Return values
string

resulting string

decodeRiceSequence()

Decodes up to $num_decode Rice encoded difference list of integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start.

decodeRiceSequence(string $input, int &$start_bit_offset, int $num_decode[, int $delta_start = -1 ]) : array<string|int, mixed>
Parameters
$input : string

the string to decode from

$start_bit_offset : int

starting bit location to decode from

$num_decode : int

number of int's to decode

$delta_start : int = -1

if >= 0 previous int to use for difference list otherwise the first integer is decoded as itself rather than a difference

Return values
array<string|int, mixed>

decoded int's
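
The combination of a difference list with Rice coding can be sketched on "0"/"1" text strings as follows. For a 2^i modulus, each gap is split into a quotient written in unary (that many zeros, then a 1) and an i bit remainder. The names are hypothetical, the modulus exponent $i is passed explicitly to both functions for simplicity, and the exact bit layout Yioop uses is an assumption.

```php
<?php
// Sketch of Rice coding a nondecreasing list as gaps; illustration only.
function sketchRiceEncode(array $ints, int $i, int $delta_start = -1): string
{
    $bits = "";
    $previous = max($delta_start, 0); // < 0 means first int is coded as itself
    foreach ($ints as $value) {
        $gap = $value - $previous;
        $previous = $value;
        $bits .= str_repeat("0", $gap >> $i) . "1";      // quotient in unary
        if ($i > 0) {                                     // i bit remainder
            $bits .= str_pad(decbin($gap & ((1 << $i) - 1)), $i, "0",
                STR_PAD_LEFT);
        }
    }
    return $bits;
}

function sketchRiceDecode(string $bits, int $i, int $num_decode,
    int $delta_start = -1): array
{
    $out = [];
    $offset = 0;
    $previous = max($delta_start, 0);
    while ($num_decode-- > 0 && $offset < strlen($bits)) {
        $one = strpos($bits, "1", $offset);
        if ($one === false) {
            break;
        }
        $gap = ($one - $offset) << $i;
        if ($i > 0) {
            $gap |= bindec(substr($bits, $one + 1, $i));
        }
        $offset = $one + 1 + $i;
        $previous += $gap;   // undo the difference list
        $out[] = $previous;
    }
    return $out;
}
```

With $i = 2, the list 3, 7, 18 has gaps 3, 4, 11, which this sketch writes as "111", "0100" and "00111".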

encodePositionList()

Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.

encodePositionList(array<string|int, mixed> $positions) : string
Parameters
$positions : array<string|int, mixed>

integer term positions

Return values
string

encoded position list

decodePositionList()

Decodes up to $num_decode term in document position integers from string $input under the assumption $input is encoded as per encodePositionList.

decodePositionList(string $input, int $num_decode) : array<string|int, mixed>
Parameters
$input : string

string to decode from

$num_decode : int

number of integers to decode

Tags
see
encodePositionList

Return values
array<string|int, mixed>

decoded positions

encode255()

Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE

encode255(string $str) : string
Parameters
$str : string

to be encoded

Return values
string

encoded string without \xFF

decode255()

Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE

decode255(string $str) : string
Parameters
$str : string

to be decoded

Return values
string

decoded string
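
A sketch of this escaping scheme, assuming the escape pairs are \xFE\xFD for \xFE and \xFE\xFE for \xFF. The function names are hypothetical; strtr is used here because it makes a single left-to-right pass and never rescans its own output. Yioop's own code may be written differently.

```php
<?php
// Sketch: escape \xFE and \xFF so the result never contains \xFF.
function sketchEncode255(string $str): string
{
    return strtr($str, ["\xFE" => "\xFE\xFD", "\xFF" => "\xFE\xFE"]);
}

// Inverse mapping, applied in one left-to-right pass
function sketchDecode255(string $str): string
{
    return strtr($str, ["\xFE\xFE" => "\xFF", "\xFE\xFD" => "\xFE"]);
}

$encoded = sketchEncode255("a\xFFb\xFE");
var_dump(strpos($encoded, "\xFF")); // bool(false)
```

Since \xFF never appears in encoded output, it is free to serve as a separator between encoded strings.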

encodeUnderscore()

Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=

encodeUnderscore(string $str) : string
Parameters
$str : string

to be encoded

Return values
string

encoded string without _

decodeUnderscore()

Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -

decodeUnderscore(string $str) : string
Parameters
$str : string

to be decoded

Return values
string

decoded string
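
The underscore mapping above can be sketched with strtr, which scans the input once from left to right. That matters for decoding: an input such as "--=" has to be read as "--" followed by "=", not as "-" followed by "-=". The function names are hypothetical and this is not necessarily how Yioop implements it.

```php
<?php
// Sketch: - becomes --, _ becomes -=, so the result has no underscore.
function sketchEncodeUnderscore(string $str): string
{
    return strtr($str, ["-" => "--", "_" => "-="]);
}

// Inverse mapping; strtr consumes each two character pair exactly once
function sketchDecodeUnderscore(string $str): string
{
    return strtr($str, ["--" => "-", "-=" => "_"]);
}

echo sketchEncodeUnderscore("a_b-c"), "\n"; // a-=b--c
```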

packEncode255()

Encodes a list of strings as their @see encode255 versions separated by \xFF's

packEncode255(array<string|int, mixed> $strs) : string
Parameters
$strs : array<string|int, mixed>

strings to encode as a single string

Return values
string

encoded list

unpackDecode255()

Decodes a list of strings from a string that was encoded as the @see encode255 versions of its elements separated by \xFF's

unpackDecode255(string $encoded_strs) : array<string|int, mixed>
Parameters
$encoded_strs : string

string to decode into a list of strings

Return values
array<string|int, mixed>

decoded list
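
Packing a list then reduces to escaping each element and joining with \xFF, as in this sketch. The function names are hypothetical and the escape pairs (\xFE\xFD for \xFE, \xFE\xFE for \xFF) are an assumption about the encode255 mapping.

```php
<?php
// Sketch: escape each string so it has no \xFF, then join with \xFF.
function sketchPackEncode255(array $strs): string
{
    $escape = ["\xFE" => "\xFE\xFD", "\xFF" => "\xFE\xFE"];
    return implode("\xFF", array_map(fn($s) => strtr($s, $escape), $strs));
}

// Split on the \xFF separators, then undo the escaping of each piece
function sketchUnpackDecode255(string $encoded_strs): array
{
    $unescape = ["\xFE\xFE" => "\xFF", "\xFE\xFD" => "\xFE"];
    return array_map(fn($s) => strtr($s, $unescape),
        explode("\xFF", $encoded_strs));
}
```

One caveat of this particular sketch: an empty list packs to the empty string, which unpacks to a list containing one empty string.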

packPosting()

Makes a packed integer string from a docindex and the number of occurrences of a word in the document with that docindex.

packPosting(int $doc_index, array<string|int, mixed> $position_list[, bool $delta = true ]) : string
Parameters
$doc_index : int

index (i.e., a count of which document it is rather than a byte offset) of a document in the document string

$position_list : array<string|int, mixed>

integer positions word occurred in that doc

$delta : bool = true

if true then stores the position_list as a sequence of differences (a delta list)

Return values
string

a modified9 (our compression scheme) packed string containing this info.

unpackPosting()

Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.

unpackPosting(string $posting, int &$offset[, bool $dedelta = true ]) : array<string|int, mixed>
Parameters
$posting : string

a string containing a doc index position list pair encoded using modified9

$offset : int

an offset into the string where the modified9 posting is encoded

$dedelta : bool = true

if true then assumes the list is a sequence of differences (a delta list) and undoes the difference to get the original sequence

Return values
array<string|int, mixed>

consisting of integer doc_index and a subarray consisting of integer positions of word in doc.

addDocIndexPostings()

This method is used while appending one index shard to another.

addDocIndexPostings(string &$postings, int $add_offset) : string

Given a string of postings, adds $add_offset to each offset to the document map in each posting.

Parameters
$postings : string

a string of index shard postings

$add_offset : int

a fixed amount to add to each posting's doc map offset

Return values
string

$new_postings where each doc offset has had $add_offset added to it

deltaList()

Computes the difference of a list of integers.

deltaList(array<string|int, mixed> $list) : array<string|int, mixed>

i.e., (a1, a2, a3, a4) becomes (a1, a2-a1, a3-a2, a4-a3)

Parameters
$list : array<string|int, mixed>

a nondecreasing list of integers

Return values
array<string|int, mixed>

the corresponding list of differences of adjacent integers

deDeltaList()

Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function

deDeltaList(array<string|int, mixed> &$delta_list) : array<string|int, mixed>
Parameters
$delta_list : array<string|int, mixed>

a list of nonnegative integers

Tags
see
deltaList
Return values
array<string|int, mixed>

a nondecreasing list of integers
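
The pair of transformations (a1, a2, a3, a4) to (a1, a2-a1, a3-a2, a4-a3) and back can be sketched as below. The function names are hypothetical and the sketch takes its arguments by value for simplicity.

```php
<?php
// Sketch: differences of adjacent entries, first entry kept as is.
function sketchDeltaList(array $list): array
{
    $deltas = [];
    $previous = 0;
    foreach ($list as $value) {
        $deltas[] = $value - $previous;
        $previous = $value;
    }
    return $deltas;
}

// Inverse: running sums of the differences
function sketchDeDeltaList(array $delta_list): array
{
    $list = [];
    $sum = 0;
    foreach ($delta_list as $delta) {
        $sum += $delta;
        $list[] = $sum;
    }
    return $list;
}

print_r(sketchDeltaList([3, 5, 5, 12])); // differences 3, 2, 0, 7
```

Gaps in a nondecreasing list tend to be small numbers, which is what makes the later bit-level codes compact.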

encodeModified9()

Encodes a sequence of integers x, such that 1 <= x <= 2<<28-1 as a string. NOTICE x>=1.

encodeModified9(array<string|int, mixed> $list) : string

The encoded string is a sequence of 4 byte words (packed int's). The high order 2 bits of a given word indicate whether or not to look at the next word. The codes are as follows: 11 start of encoded string, 10 continue four more bytes, 01 end of encoded, and 00 indicates whole sequence encoded in one word.

After the high order 2 bits, the next most significant bits indicate the format of the current word. There are nine possibilities: 00 - 1 28 bit number, 01 - 2 14 bit numbers, 10 - 3 9 bit numbers, 1100 - 4 6 bit numbers, 1101 - 5 5 bit numbers, 1110 - 6 4 bit numbers, 11110 - 7 3 bit numbers, 111110 - 12 2 bit numbers, 111111 - 24 1 bit numbers.

Parameters
$list : array<string|int, mixed>

a list of positive integers satisfying the above

Return values
string

encoded string
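
As a quick arithmetic check on the nine word formats listed above, the sketch below adds up the 2 continue bits, the format prefix and the payload for each layout to confirm that every one fits in a 32 bit word. The names are hypothetical. The helper for reading the continue bits assumes words are stored big-endian, which is an assumption rather than something stated above.

```php
<?php
// Layouts transcribed from the description above:
// [format prefix bits, how many numbers, bits per number]
$sketch_layouts = [
    ["00", 1, 28], ["01", 2, 14], ["10", 3, 9],
    ["1100", 4, 6], ["1101", 5, 5], ["1110", 6, 4],
    ["11110", 7, 3], ["111110", 12, 2], ["111111", 24, 1],
];

// Bits used in a 32 bit word: 2 continue bits + prefix + payload
function sketchBitsUsed(string $prefix, int $count, int $width): int
{
    return 2 + strlen($prefix) + $count * $width;
}

// Assumption: big-endian words, so the continue bits are the top two bits
// of the first byte (3 start, 2 continue, 1 end, 0 single word sequence)
function sketchContinueBits(string $word): int
{
    return ord($word[0]) >> 6;
}

foreach ($sketch_layouts as [$prefix, $count, $width]) {
    printf("%-6s %2d x %2d bits, %2d of 32 bits used\n", $prefix, $count,
        $width, sketchBitsUsed($prefix, $count, $width));
}
```

Some layouts leave a few bits of the word unused; for instance the 7 x 3 bit layout uses only 28 of the 32 bits.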

packListModified9()

Packs the contents of a single word of a sequence being encoded using Modified9.

packListModified9(int $continue_bits, int $cnt, array<string|int, mixed> $pack_list) : string
Parameters
$continue_bits : int

the high order 2 bits of the word

$cnt : int

the number of elements that will be packed in this word

$pack_list : array<string|int, mixed>

a list of positive integers to pack into word

Tags
see
encodeModified9
Return values
string

encoded 4 byte string

nextPostString()

Returns the next complete posting string from $input_string beginning at offset.

nextPostString(string &$input_string, int &$offset) : string

Does not do any decoding.

Parameters
$input_string : string

a string of postings

$offset : int

an offset to this string which will be updated after call

Return values
string

undecoded posting

decodeModified9()

Decodes a sequence of positive integers from a string that has been encoded using Modified 9

decodeModified9(string $input_string, int &$offset) : array<string|int, mixed>
Parameters
$input_string : string

string to decode from

$offset : int

where to start in the string; after the decode, it points to the position just past what was decoded.

Tags
see
encodeModified9
Return values
array<string|int, mixed>

sequence of positive integers that were decoded

unpackListModified9()

Decodes a single word, with its high two bits off, according to Modified 9

unpackListModified9(string $encoded_list) : array<string|int, mixed>
Parameters
$encoded_list : string

four byte string to decode

Return values
array<string|int, mixed>

sequence of integers that results from the decoding.

docIndexModified9()

Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.

docIndexModified9(int $encoded_list) : int
Parameters
$encoded_list : int

in the just described format

Return values
int

a doc index into an index shard document map.

unpackInt()

Unpacks an int from a 4 char string

unpackInt(string $str) : int
Parameters
$str : string

where to extract int from

Return values
int

extracted integer

packInt()

Packs an int into a 4 char string

packInt(int $my_int) : string
Parameters
$my_int : int

the integer to pack

Return values
string

the packed string

unpackFloat()

Unpacks a float from a 4 char string

unpackFloat(string $str) : float
Parameters
$str : string

where to extract float from

Return values
float

extracted float

packFloat()

Packs a float into a four char string

packFloat(float $my_float) : string
Parameters
$my_float : float

the float to pack

Return values
string

the packed string
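
The four pack/unpack helpers above can be approximated with PHP's built-in pack() and unpack(). This is a minimal sketch, assuming big-endian byte order (the "N" and "G" format codes); the byte order Yioop actually uses is not stated here, and the sketch* names are hypothetical.

```php
<?php
// Assumption: big-endian byte order ("N" for 32 bit ints, "G" for
// 32 bit floats). The library's real byte order is not documented here.
function sketchPackInt(int $my_int): string
{
    return pack("N", $my_int);
}

function sketchUnpackInt(string $str): int
{
    return unpack("N", $str)[1];
}

function sketchPackFloat(float $my_float): string
{
    return pack("G", $my_float);
}

function sketchUnpackFloat(string $str): float
{
    return unpack("G", $str)[1];
}
```

Because a packed float keeps only 32 bits, values such as 0.1 come back slightly changed, while values like 1.5 that are exactly representable survive the round trip.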

renameSerializedObject()

Used to change the namespace of a serialized PHP object (assumes the object doesn't have nested subobjects)

renameSerializedObject(string $class_name, string $object_string) : string
Parameters
$class_name : string

new fully qualified name with namespace

$object_string : string

serialized object

Return values
string

serialized object with new name

getDomFromString()

Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML

getDomFromString(string $to_parse) : DOMDocument
Parameters
$to_parse : string

the string to parse a DOMDocument from

Return values
DOMDocument

computed based on the provided string

getTags()

Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument

getTags(DOMDocument $dom, string $query) : array<string|int, mixed>
Parameters
$dom : DOMDocument

document to run xpath query on

$query : string

xpath query to run

Return values
array<string|int, mixed>

of DOMDocuments one for each node matching the xpath query in the original DOMDocument

toHexString()

Converts a string to string where each char has been replaced by its hexadecimal equivalent

toHexString(string $str) : string
Parameters
$str : string

what we want rewritten in hex

Return values
string

the hexified string

toIntString()

Converts a string to a string where each char has been replaced by its integer equivalent

toIntString(string $str) : string
Parameters
$str : string

what we want rewritten as integers

Return values
string

the string of integer values

toBinString()

Converts a string to string where each char has been replaced by its binary equivalent

toBinString(string $str) : string
Parameters
$str : string

what we want rewritten in binary

Return values
string

the binary string
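
The three conversions above can be sketched as follows. The sketch* names are hypothetical, and the choice of a space between per-character values in the integer and binary forms is an assumption made for readability; the separator Yioop uses is not given here.

```php
<?php
// Each character becomes two hex digits, with no separator.
function sketchToHexString(string $str): string
{
    return bin2hex($str);
}

// Each character becomes its decimal byte value (space separated here
// by assumption).
function sketchToIntString(string $str): string
{
    if ($str === "") {
        return "";
    }
    return implode(" ", array_map("ord", str_split($str)));
}

// Each character becomes eight binary digits (space separated here
// by assumption).
function sketchToBinString(string $str): string
{
    if ($str === "") {
        return "";
    }
    $bytes = array_map(fn($c) => sprintf("%08b", ord($c)), str_split($str));
    return implode(" ", $bytes);
}
```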

metricToInt()

Converts a string of the form some int followed by K, M, or G.

metricToInt(string $metric_num) : int

into its integer equivalent. For example, 4K would become 4000, 16M would become 16000000, and 1G would become 1000000000. Note that K, M, and G are treated as powers of ten, not powers of two.

Parameters
$metric_num : string

metric number to convert

Return values
int

number the metric string corresponded to
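
A minimal sketch of the conversion just described, using base 10 multipliers. The name sketchMetricToInt is hypothetical, and accepting lowercase suffixes is an assumption of this sketch.

```php
<?php
function sketchMetricToInt(string $metric_num): int
{
    // Base 10 multipliers, as stated in the description above
    $multipliers = ["K" => 1000, "M" => 1000000, "G" => 1000000000];
    $metric_num = trim($metric_num);
    $suffix = strtoupper(substr($metric_num, -1));
    if (isset($multipliers[$suffix])) {
        return intval(substr($metric_num, 0, -1)) * $multipliers[$suffix];
    }
    // No recognized suffix: treat the whole string as a plain integer
    return intval($metric_num);
}
```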

intToMetric()

Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)

intToMetric(int $num) : string
Parameters
$num : int

number to convert

Return values
string

metric string the number corresponds to

crawlLog()

Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode it will log to stdout; otherwise, it will use error_log. When logging to a file, $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example, the call crawlLog("\n\nInitialize logger..", $this->process_name, true); says that $this->process_name should be checked for liveness as part of any subsequent logging activity, such as the call crawlLog("Another Message"); (note that subsequent calls don't need to specify the process name).

crawlLog(string $msg[, string $lname = null ][, bool $check_process_handler = false ]) : mixed
Parameters
$msg : string

message to log. If empty then no message written

$lname : string = null

name of log file in the LOG_DIR directory. Rotated logs also use this as their basename, followed by a number, and are gzipped. (Older versions of Yioop used bzip. Some distros don't have bzip but do have gzip, and gzip was already being used elsewhere in Yioop, so bzip was replaced to remove the dependency.)

$check_process_handler : bool = false

false by default. Once a call has set this to true, subsequent calls that leave it false will still invoke processHandler to check how long the code has run since processHandler was last called.

Return values
mixed

makeTimestamp()

Used to make a log file entry time string of format: entry number, time in r format.

makeTimestamp([int $time = -1 ]) : string
Parameters
$time : int = -1

a unix timestamp

Return values
string

[line_count_in_log r_formatted_date]

crawlTimeoutLog()

Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).

crawlTimeoutLog(mixed $msg) : bool
Parameters
$msg : mixed

usually a string to be printed out after the timeout period. If $msg === true then clears the timeout cache

Return values
bool

whether a log message was written
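
The throttling behavior described above can be sketched with a static variable that remembers when a message was last written. The name sketchTimeoutLog, the explicit $now argument (passed in so the sketch is deterministic; real code would call time()), the $log array standing in for the log file, and the choice to only start the clock on the first call are all assumptions of this sketch.

```php
<?php
/**
 * Hypothetical throttled logger. Appends $msg to $log only if more than
 * $timeout seconds have passed since the last write (or since the clock
 * was started by the first call).
 */
function sketchTimeoutLog($msg, int $now, array &$log, int $timeout = 30): bool
{
    static $last_time = null;
    if ($msg === true) {
        $last_time = null; // clear the remembered time
        return false;
    }
    if ($last_time === null) {
        $last_time = $now; // first call only starts the clock
        return false;
    }
    if ($now - $last_time > $timeout) {
        $log[] = $msg;
        $last_time = $now;
        return true;
    }
    return false;
}
```

Inside a long loop, a call on every iteration then costs only a comparison, and a progress message appears roughly once per timeout period.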

crawlHash()

Computes an 8 byte hash of a string for use in storing documents.

crawlHash(string $string[, bool $raw = false ]) : string

An eight byte hash was chosen so that the odds of collision even for a few billion documents via the birthday problem are still reasonable. If the raw flag is set to false then an 11 byte base64 encoding of the 8 byte hash is returned. The hash is calculated as the xor of the two halves of the 16 byte md5 of the string. (8 bytes takes less storage which is useful for keeping more doc info in memory)

Parameters
$string : string

the string to hash

$raw : bool = false

whether to leave raw or base 64 encode

Return values
string

the hash of $string
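
The hash construction described above (XOR of the two halves of the raw md5, optionally base64 encoded to 11 characters) can be sketched as follows. The name sketchCrawlHash is hypothetical. Plain base64 with the trailing "=" stripped is used here as an assumption; the exact alphabet of the real encoding is not specified in this entry.

```php
<?php
function sketchCrawlHash(string $string, bool $raw = false): string
{
    $md5 = md5($string, true); // 16 raw bytes
    // PHP's ^ operator on two strings XORs them byte by byte
    $hash = substr($md5, 0, 8) ^ substr($md5, 8, 8);
    if ($raw) {
        return $hash;
    }
    // 8 bytes encode to 12 base64 chars, one of which is "=" padding
    return rtrim(base64_encode($hash), "=");
}
```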

crawlHashWord()

Used to create a 20 byte hash of a string (typically a word or phrase with a Wikipedia page). The format is the 8 byte crawlHash of the term (md5 of the term with its two halves XOR'd), followed by a \x00, followed by the first 11 characters of the term. If there are not enough characters to make 20 bytes, then the string is padded with \x00s to 20 bytes.

crawlHashWord(string $string[, bool $raw = false ]) : string
Parameters
$string : string

word to hash

$raw : bool = false

whether to base64Hash the result

Return values
string

first 8 bytes of md5 of $string concatenated with \x00 to indicate the hash is of a word not a phrase concatenated with the padded to 11 byte $meta_string.

canonicalTerm()

Takes a $term that might have come from a document and converts it to a string of 16 bytes, which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.

canonicalTerm(string $term) : string

Base64 used to make this all nice and printable.

Parameters
$term : string

term to be made into a canonical form

Return values
string

canonicalized version of the term, as described above.
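
The 16 byte layout described above works out because 6 bytes base64 encode to exactly 8 characters, so 7 + 1 + 8 = 16. A minimal sketch follows, with two assumptions labeled in the code: that terms of 16 bytes or fewer take the padded form, and that "first 6 chars of its md5 hash" means the first 6 raw bytes. The name sketchCanonicalTerm is hypothetical.

```php
<?php
function sketchCanonicalTerm(string $term): string
{
    // Assumption: short terms are kept and padded with underscores
    if (strlen($term) <= 16) {
        return str_pad($term, 16, "_");
    }
    // Assumption: the hash part uses the first 6 raw md5 bytes,
    // which base64 encode to 8 characters with no padding
    $hash = base64_encode(substr(md5($term, true), 0, 6));
    return substr($term, 0, 7) . "_" . $hash;
}
```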

compareWordHashes()

Used to compare two ids for index dictionary lookup. The ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.

compareWordHashes(string $id1, string $id2) : int
Parameters
$id1 : string

20 byte word id to compare

$id2 : string

20 byte word id to compare

Return values
int

negative if $id1 smaller, positive if bigger, and 0 if same

base64Hash()

Converts a crawl hash number to something close to base64 encoding, but modified so that it doesn't get confused in URLs or DBs

base64Hash(string $string) : string
Parameters
$string : string

a hash to base64 encode

Return values
string

the encoded hash

unbase64Hash()

Decodes a crawl hash number from base64 to raw ASCII

unbase64Hash(string $base64) : string
Parameters
$base64 : string

a hash to decode

Return values
string

the decoded hash
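
One common way to make base64 safe for URLs is to swap the "+" and "/" characters for "-" and "_" and drop the "=" padding. The sketch below uses that substitution as an assumption; it may differ from the mapping base64Hash and unbase64Hash actually apply. The sketch* names are hypothetical.

```php
<?php
// Assumed URL-safe mapping: "+" to "-", "/" to "_", padding removed.
function sketchBase64Hash(string $string): string
{
    return rtrim(strtr(base64_encode($string), "+/", "-_"), "=");
}

function sketchUnbase64Hash(string $base64): string
{
    // base64_decode in its default non-strict mode accepts input
    // whose "=" padding has been stripped
    return base64_decode(strtr($base64, "-_", "+/"));
}
```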

webencode()

Encodes a string in a format suitable for post data (mainly, base64, but str_replace data that might mess up post in result)

webencode(string $str) : string
Parameters
$str : string

string to encode

Return values
string

encoded string

webdecode()

Decodes a string encoded by webencode

webdecode(string $str) : string
Parameters
$str : string

string to decode

Return values
string

decoded string

crawlCrypt()

The crawlCrypt function is used to encrypt passwords stored in the database.

crawlCrypt(string $string[, int $salt = null ]) : string

It tries to use the best version of the Blowfish variant of PHP's crypt function available on the current system.

Parameters
$string : string

the string to encrypt

$salt : int = null

salt value to be used (needed to verify if a password is valid)

Return values
string

the crypted string where crypting is done using crawlHash

partitionByHash()

Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling

partitionByHash(array<string|int, mixed> $table, string $field, int $num_partition, int $instance[, object $callback = null ]) : array<string|int, mixed>
Parameters
$table : array<string|int, mixed>

an array of rows of associative arrays which a queue_server might need to process

$field : string

column of $table whose values should be used for partitioning

$num_partition : int

number of queue_servers to choose between

$instance : int

the id of the particular server we are interested in

$callback : object = null

function or static method that might be applied to input before deciding the responsible queue_server. For example, if input was a url we might want to get the host before deciding on the queue_server

Return values
array<string|int, mixed>

the reduced table that the $instance queue_server is responsible for

calculatePartition()

Used by a controller to say which queue_server should receive a given input

calculatePartition(string $input, int $num_partition[, object $callback = null ]) : int
Parameters
$input : string

can be viewed as a key that might be processed by a queue_server. For example, in some cases input might be a url and we want to determine which queue_server should be responsible for queuing that url

$num_partition : int

number of queue_servers to choose between

$callback : object = null

function or static method that might be applied to input before deciding the responsible queue_server. For example, if the input was a url we might want to get the host before deciding on the queue_server

Return values
int

id of server responsible for input
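
The two partitioning functions above follow a standard pattern: hash a key, reduce it modulo the number of servers, and keep the rows that land on a given server. The sketch below illustrates that pattern only. The sketch* names are hypothetical, and crc32 is a stand-in hash chosen for this example, not the hash Yioop uses.

```php
<?php
function sketchCalculatePartition(string $input, int $num_partition,
    ?callable $callback = null): int
{
    if ($callback !== null) {
        $input = $callback($input); // e.g., reduce a url to its host
    }
    // crc32 is a stand-in hash chosen for this sketch only
    return abs(crc32($input)) % $num_partition;
}

function sketchPartitionByHash(array $table, string $field,
    int $num_partition, int $instance, ?callable $callback = null): array
{
    $rows = [];
    foreach ($table as $row) {
        $partition = sketchCalculatePartition($row[$field],
            $num_partition, $callback);
        if ($partition === $instance) {
            $rows[] = $row;
        }
    }
    return $rows;
}
```

Passing a callback that extracts the host means every url from the same site is assigned to the same server, whatever its path.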

changeInMicrotime()

Measures the change in time in seconds between two timestamps to microsecond precision

changeInMicrotime(string $start[, string $end = null ]) : float
Parameters
$start : string

starting time with microseconds

$end : string = null

ending time with microseconds, if null use current time

Return values
float

time difference in seconds
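
A minimal sketch of the time difference computation, assuming both timestamps are strings in the "fraction seconds" format returned by PHP's microtime() when called without arguments (for example "0.25000000 1700000000"). That format, and the name sketchChangeInMicrotime, are assumptions of this sketch.

```php
<?php
function sketchChangeInMicrotime(string $start, ?string $end = null): float
{
    $end = $end ?? microtime();
    // Assumed format: "<fractional part> <whole seconds>"
    [$start_micro, $start_sec] = explode(" ", $start);
    [$end_micro, $end_sec] = explode(" ", $end);
    // Subtract whole seconds and fractions separately to keep precision
    return ((float) $end_sec - (float) $start_sec)
        + ((float) $end_micro - (float) $start_micro);
}
```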

microTimestamp()

Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)

microTimestamp() : string
Return values
string

timestamp, to microsecond precision, of the time in seconds since the start of the current epoch

checkTimeInterval()

Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration

checkTimeInterval(string $start_time, string $duration[, int $time = -1 ]) : int
Parameters
$start_time : string

string of the form (HH:mm)

$duration : string

string containing an int in seconds

$time : int = -1

a Unix timestamp.

Return values
int

-1 if the time of day of $time is not within the given interval. Otherwise, the Unix timestamp at which the interval will be over for the same day as $time.
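
The interval check above can be sketched as: find the moment $start_time occurs on the same calendar day as $time, add the duration, and test whether $time falls inside. The name sketchCheckTimeInterval is hypothetical; the sketch uses PHP's default timezone and does not handle intervals that began on the previous day.

```php
<?php
function sketchCheckTimeInterval(string $start_time, string $duration,
    int $time = -1): int
{
    if ($time === -1) {
        $time = time();
    }
    // Start of the interval on the same calendar day as $time
    $start = strtotime(date("Y-m-d", $time) . " " . $start_time);
    if ($start === false) {
        return -1;
    }
    $end = $start + intval($duration);
    return ($time >= $start && $time < $end) ? $end : -1;
}
```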

convertPixels()

Converts a CSS unit string into its equivalent in pixels. This is used by SvgProcessor.

convertPixels(string $value) : int
Parameters
$value : string

a number followed by a legal CSS unit

Return values
int

a number in pixels
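
For CSS absolute units, the conversion rests on the fixed ratios 1in = 96px = 72pt = 6pc = 2.54cm. The sketch below covers only those absolute units; treating an unknown or missing unit as pixels is an assumption, and the name sketchConvertPixels is hypothetical. Relative units such as em are not handled.

```php
<?php
function sketchConvertPixels(string $value): int
{
    // CSS absolute units expressed in pixels (1in = 96px)
    $px_per_unit = ["px" => 1.0, "in" => 96.0, "pt" => 96.0 / 72.0,
        "pc" => 16.0, "cm" => 96.0 / 2.54, "mm" => 96.0 / 25.4];
    if (!preg_match('/^\s*([0-9.]+)\s*([a-z]*)\s*$/i', $value, $matches)) {
        return 0;
    }
    $unit = strtolower($matches[2]);
    // Assumption: unknown or missing units are treated as pixels
    $factor = $px_per_unit[$unit] ?? 1.0;
    return (int) round(((float) $matches[1]) * $factor);
}
```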

countFiles()

Returns the number of files in a folder

countFiles(string $folder) : int
Parameters
$folder : string

path to folder to count

Return values
int

number of files

makePath()

Creates folders along a filesystem path if they don't exist

makePath(string $path) : bool
Parameters
$path : string

a file system path

Return values
bool

success or failure

deleteFileOrDir()

This is a callback function used in the process of recursively deleting a directory

deleteFileOrDir(string $file_or_dir) : mixed
Parameters
$file_or_dir : string

the filename or directory name to be deleted

Tags
see
DatasourceManager::unlinkRecursive()
Return values
mixed

setWorldPermissions()

This is a callback function used in the process of recursively chmoding to 777 all files in a folder

setWorldPermissions(string $file) : mixed
Parameters
$file : string

the filename or directory name to be chmod

Tags
see
DatasourceManager::setWorldPermissionsRecursive()
Return values
mixed

fileInfo()

This is a callback function used in the process of recursively calculating an array of file modification times and files sizes for a directory

fileInfo(string $file) : array
Parameters
$file : string

a name of a file in the file system

Return values
array

an array whose single element contains an associative array with the size and modification time of the file

orderCallback()

Callback function used to sort documents by a field

orderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: orderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc bigger, 1 otherwise
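
The "initialize, then sort" calling convention above can be implemented with a static variable that remembers the field between calls. The following is a sketch of that pattern only. The name sketchOrderCallback is hypothetical, and the sketch assumes the compared items are associative arrays, as the $order_field description suggests.

```php
<?php
function sketchOrderCallback($doc_a, $doc_b, ?string $order_field = null): int
{
    static $field = "a";
    if ($order_field !== null) {
        $field = $order_field; // initialization call: remember the field
        return 0;
    }
    // -1 if the first doc is bigger, so bigger values sort first
    return ($doc_a[$field] > $doc_b[$field]) ? -1 : 1;
}

$docs = [["score" => 2], ["score" => 9], ["score" => 5]];
$tmp = [];
sketchOrderCallback($tmp, $tmp, "score"); // initialize the sort field
usort($docs, "sketchOrderCallback");      // $docs is now ordered 9, 5, 2
```

The static variable is what lets usort, which passes only two arguments, still sort by a field chosen at run time.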

stringOrderCallback()

Callback function used to sort documents by a field where the field is assumed to be a string

stringOrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: stringOrderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc smaller, 1 otherwise

stringROrderCallback()

Callback function used to sort documents by a field where the field is assumed to be a string

stringROrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: stringROrderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc bigger, 1 otherwise

rorderCallback()

Callback function used to sort documents by a field in reverse order

rorderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: rorderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

1 if first doc bigger, -1 otherwise

lessThan()

Callback to check if $a is less than $b

lessThan(float $a, float $b) : int

Used to help sort document results returned in PhraseModel called in IndexArchiveBundle

Parameters
$a : float

first value to compare

$b : float

second value to compare

Tags
see
IndexArchiveBundle::getSelectiveWords()
see
PhraseModel::getPhrasePageResults()
Return values
int

-1 if $a is less than $b; 1 otherwise

greaterThan()

Callback to check if $a is greater than $b

greaterThan(float $a, float $b) : int

Used to help sort document results returned in PhraseModel called in IndexArchiveBundle

Parameters
$a : float

first value to compare

$b : float

second value to compare

Tags
see
IndexArchiveBundle::getSelectiveWords()
see
PhraseModel::getTopPhrases()
Return values
int

-1 if $a is greater than $b; 1 otherwise

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

remoteAddress()

Compute the real remote address of the incoming connection including forwarding

remoteAddress() : mixed
Return values
mixed

readInput()

Used to read a line of input from the command-line

readInput() : string
Return values
string

from the command-line

readPassword()

Used to read a line of input from the command-line (on unix machines without echoing it)

readPassword() : string
Return values
string

from the command-line

readMessage()

Used to read several lines from the terminal up until a last line consisting of just a "."

readMessage() : string
Return values
string

from the command-line
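
Reading until a line consisting of just "." can be sketched as a loop over fgets(). The name sketchReadMessage is hypothetical, and taking the stream as a parameter (rather than always reading STDIN) is a choice made here so the sketch can be exercised without a terminal.

```php
<?php
/**
 * Hypothetical helper: collects lines from $stream until a line that
 * is just "." (or end of input) and returns them joined together.
 */
function sketchReadMessage($stream): string
{
    $message = "";
    while (($line = fgets($stream)) !== false) {
        if (rtrim($line, "\r\n") === ".") {
            break; // terminator line is not part of the message
        }
        $message .= $line;
    }
    return $message;
}
```

On the command line this would be called as sketchReadMessage(STDIN).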

mimeType()

Returns the mime type of the provided file name if it can be determined.

mimeType(string $file_name[, bool $use_extension = false ]) : string
Parameters
$file_name : string

(name of file including path to figure out mime type for)

$use_extension : bool = false

whether to just try to guess from the file extension rather than looking at the file

Return values
string

mime type or unknown if can't be determined

generalIsA()

Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the 3 param version (last param true) of the PHP is_a function that came into being with Version 5.3.9.

generalIsA(mixed $class_1, mixed $class_2) : bool
Parameters
$class_1 : mixed

object or string class name to see if in class2

$class_2 : mixed

object or string class name to see if contains class1

Return values
bool

equal or contains class

stripAttributes()

Given the contents of a start XML/HTML tag, strips out all the attributes not listed in $safe_attribute_list

stripAttributes(string $start_tag_contents[, array<string|int, mixed> $safe_attribute_list = [] ]) : string
Parameters
$start_tag_contents : string

the contents of an HTML/XML tag. I.e., if the tag was <tag stuff> then $start_tag_contents could be stuff

$safe_attribute_list : array<string|int, mixed> = []

a list of attributes which should be kept

Return values
string

containing only safe attributes and their values

parseCsv()

Used to parse into a two dimensional array a string that contains CSV data.

parseCsv(string $csv_string) : array<string|int, mixed>
Parameters
$csv_string : string

string with csv data

Return values
array<string|int, mixed>

two dimensional array of elements from csv

arraytoCsv()

Converts an array of values to a comma separated value formatted string.

arraytoCsv(array<string|int, mixed> $arr) : string
Parameters
$arr : array<string|int, mixed>

values to convert

Return values
string

CSV string after conversion
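
The two CSV helpers above can be sketched with standard quoting rules: a field containing a comma, a double quote, or a newline is wrapped in double quotes, and embedded double quotes are doubled. The sketch* names are hypothetical, and the parsing side splits on line breaks first, so it assumes fields do not contain embedded newlines.

```php
<?php
// Converts one row of values into a single CSV line.
function sketchArrayToCsv(array $arr): string
{
    $fields = [];
    foreach ($arr as $value) {
        $value = (string) $value;
        if (strpbrk($value, ",\"\n") !== false) {
            $value = '"' . str_replace('"', '""', $value) . '"';
        }
        $fields[] = $value;
    }
    return implode(",", $fields);
}

// Parses CSV text into a two dimensional array, one row per line.
// Assumes no field contains an embedded line break.
function sketchParseCsv(string $csv_string): array
{
    $rows = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($csv_string)) as $line) {
        $rows[] = str_getcsv($line, ",", '"', "");
    }
    return $rows;
}
```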

diff()

Computes a Unix-style diff of two strings. That is it only outputs lines which disagree between the two strings. It outputs +line if a line occurs in the second but not first string and -line if a line occurs in the first string but not the second.

diff(string $data1, string $data2[, bool $html = false ]) : string
Parameters
$data1 : string

first string to compare

$data2 : string

second string to compare

$html : bool = false

whether to output html highlighting

Return values
string

representing info about where $data1 and $data2 don't match
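
A line diff of this kind is usually built on a longest common subsequence (LCS) of the two line arrays: lines in the LCS are common and produce no output, while the rest are reported with - or +. The sketch below uses a textbook dynamic programming table. The name sketchDiff is hypothetical, the sketch omits the $html option, and it is not necessarily how diff() and computeLCS() are implemented.

```php
<?php
function sketchDiff(string $data1, string $data2): string
{
    $a = explode("\n", $data1);
    $b = explode("\n", $data2);
    $m = count($a);
    $n = count($b);
    // $len[$i][$j] = LCS length of $a[$i..] and $b[$j..]
    $len = array_fill(0, $m + 1, array_fill(0, $n + 1, 0));
    for ($i = $m - 1; $i >= 0; $i--) {
        for ($j = $n - 1; $j >= 0; $j--) {
            $len[$i][$j] = ($a[$i] === $b[$j]) ? $len[$i + 1][$j + 1] + 1 :
                max($len[$i + 1][$j], $len[$i][$j + 1]);
        }
    }
    // Walk the table, emitting only the lines that differ
    $out = [];
    $i = 0;
    $j = 0;
    while ($i < $m && $j < $n) {
        if ($a[$i] === $b[$j]) {
            $i++;
            $j++;
        } elseif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $out[] = "-" . $a[$i++];
        } else {
            $out[] = "+" . $b[$j++];
        }
    }
    while ($i < $m) {
        $out[] = "-" . $a[$i++];
    }
    while ($j < $n) {
        $out[] = "+" . $b[$j++];
    }
    return implode("\n", $out);
}
```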

computeLCS()

Computes the longest common subsequence of two arrays

computeLCS(array<string|int, mixed> $lines1, array<string|int, mixed> $lines2, int $offset) : mixed
Parameters
$lines1 : array<string|int, mixed>

an array of lines to compute LCS of

$lines2 : array<string|int, mixed>

an array of lines to compute LCS of

$offset : int

an offset to shift over array addresses in output by

Return values
mixed

extractLCSFromTable()

Extracts a longest common subsequence from a table of longest common subsequence moves (probably calculated by computeLCS) and a starting coordinate $i, $j in that table

extractLCSFromTable(array<string|int, mixed> $lcs_moves, array<string|int, mixed> $lines, int $i, int $j, int $offset, array<string|int, mixed> &$lcs) : mixed
Parameters
$lcs_moves : array<string|int, mixed>

a table of moves computed by computeLCS

$lines : array<string|int, mixed>

from first of the two arrays computing LCS of

$i : int

a line number in string 1

$j : int

a line number in string 2

$offset : int

a number to add to each line number output into $lcs. This is useful if we have trimmed off the initially common lines from our two strings we are trying to compute the LCS of

$lcs : array<string|int, mixed>

an array of triples (index_string1, index_string2, line). The indexes indicate the line number in each string, and line is the line common to the two strings

Return values
mixed

tail()

Returns an array of the last $num_lines many lines out of a file

tail(string $file_name, string $num_lines) : array<string|int, mixed>
Parameters
$file_name : string

name of file to return lines from

$num_lines : string

number of lines to retrieve

Return values
array<string|int, mixed>

retrieved lines

lineFilter()

Given an array of lines returns a subarray of those lines containing the filter string or filter array

lineFilter(string $lines, mixed $filters[, bool $case_insensitive = true ]) : array<string|int, mixed>
Parameters
$lines : string

to search

$filters : mixed

either string to filter lines with or an array of strings (any of which can be present to pass the filter)

$case_insensitive : bool = true

whether search should be done case insensitively or not.

Return values
array<string|int, mixed>

lines containing the string
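
The filtering described above amounts to keeping each line in which any one of the filter strings occurs. A minimal sketch follows; the name sketchLineFilter is hypothetical, and it takes $lines as an array, following the prose description rather than the string type shown in the signature.

```php
<?php
function sketchLineFilter(array $lines, $filters,
    bool $case_insensitive = true): array
{
    $filters = (array) $filters; // a single string becomes a one item list
    $find = $case_insensitive ? "stripos" : "strpos";
    $kept = [];
    foreach ($lines as $line) {
        foreach ($filters as $filter) {
            if ($find($line, $filter) !== false) {
                $kept[] = $line;
                break; // one matching filter is enough
            }
        }
    }
    return $kept;
}
```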

logLineTimestamp()

Tries to extract a timestamp from a line which is presumed to come from a Yioop log file

logLineTimestamp(string $line) : int
Parameters
$line : string

to search

Return values
int

timestamp of that log entry

isPositiveInteger()

Returns whether an input can be parsed to a positive integer

isPositiveInteger(mixed $input) : bool
Parameters
$input : mixed
Return values
bool

whether $input can be parsed to a positive integer.
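
One way to perform such a check in PHP is filter_var() with a minimum range. The sketch below takes "positive" to mean 1 or greater, which is an assumption, and sketchIsPositiveInteger is a hypothetical name; the real function may accept a different set of inputs.

```php
<?php
function sketchIsPositiveInteger($input): bool
{
    // FILTER_VALIDATE_INT returns false for non-integers and for
    // values below min_range
    return filter_var($input, FILTER_VALIDATE_INT,
        ["options" => ["min_range" => 1]]) !== false;
}
```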

measureCall()

Used to measure the memory footprint in bytes and time spent calling a method of an object. It also records the number of times the method has been called.

measureCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed

Just calls the method without any recording or timing until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

Parameters
$object : object

name of object whose method we want to call and measure

$method : string

method we're calling

$arguments : mixed = []
$call_name : string = ""

name to use when outputting stats for this call, defaults to $method.

Return values
mixed

whatever the method would normally return when called as above

measureObject()

Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

measureObject(object $object[, string $save_file = "" ][, mixed $class_name = "" ]) : mixed
Parameters
$object : object

name of object whose size we want to measure

$save_file : string = ""

statistics file to write info to

$class_name : mixed = ""
Return values
mixed

measureObjectCall()

General method called by measureCall and measureObject. Used to measure the memory footprint in bytes of an object, or the memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, it just calls the method without any recording or timing. To initialize, make an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

measureObjectCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed
Parameters
$object : object

name of object whose method we want to call and measure

$method : string

method we're calling

$arguments : mixed = []
$call_name : string = ""

name to use when outputting stats for this call, defaults to $method.

Return values
mixed

whatever the method would normally return when called as above

variableClone()

Makes a deep copy of a variable regardless of its type

variableClone(mixed $var) : mixed
Parameters
$var : mixed

variable to deep copy

Return values
mixed

the deep copy

garbageCollect()

Runs various system garbage collection functions and returns number of bytes freed.

garbageCollect() : int
Return values
int

number of bytes freed

utf8SafeSaveHtml()

The DOM method saveHTML has a tendency to replace UTF-8, non-ASCII characters with HTML entities. This function is supposed to save the document while avoiding that replacement.

utf8SafeSaveHtml(DOMDocument $dom) : string

What it does is to first save the dom, then it replaces htmlentities of the form &single_char; or &#some_number; with the UTF-8 they correspond to. It leaves all other entities as they are

Parameters
$dom : DOMDocument
Return values
string

output of saving html

utf8WordWrap()

A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters

utf8WordWrap(string $string[, int $width = 75 ][, string $break = " " ][, bool $cut = false ]) : string
Parameters
$string : string

the input string

$width : int = 75

the number of characters at which the string will be wrapped

$break : string = " "

string used to break a line into two

$cut : bool = false

whether to always force wrap at $width characters even if word hasn't ended

Return values
string

the given string wrapped at the specified length
