Yioop_V9.5_Source_Code_Documentation

LocaleFunctions.php

SeekQuarry/Yioop -- Open Source Pure PHP Search Engine, Crawler, and Indexer

Copyright (C) 2009 - 2023 Chris Pollett chris@pollett.org

LICENSE:

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

END LICENSE

Tags
author

Chris Pollett chris@pollett.org

license

https://www.gnu.org/licenses/ GPL3

link
https://www.seekquarry.com/
copyright

2009 - 2023

filesource

Table of Contents

localesWithStopwordsList()  : array<string|int, mixed>
Returns an array of locales that have a stop words list and a stop words remover method
localeTagToIso639_2Tag()  : string
Converts a $locale_tag (major-minor) to an ISO 639-2 language tag
guessLocale()  : string
Attempts to guess the user's locale based on the request, session, and user-agent data
guessLocaleFromString()  : string
Attempts to guess the user's locale based on a string sample
checkQuery()  : string
Tries to find whether query belongs to a programming language
guessLangEncoding()  : string
Tries to guess at a language tag based on the name of a character encoding
guessEncodingHtmlXml()  : mixed
Tries to guess the encoding used for an Html document
convertUtf8IfNeeded()  : mixed
Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
tl()  : string
Translate the supplied arguments into the current locale.
setLocaleObject()  : mixed
Sets the language to be used for locale settings
getLocaleTag()  : string
Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
getLocaleDirection()  : string
Returns the text direction of the current locale's language.
getLocaleQueryStatistics()  : array<string|int, mixed>
Returns the query statistics info for the current locale.
getBlockProgression()  : string
Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages, like classical Chinese, list them from right to left.
getWritingMode()  : string
Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
w1256ToUTF8()  : string
Convert the string $str encoded in Windows-1256 into UTF-8
utf8chr()  : string
Given a Unicode code point, converts it to UTF-8
formatDateByLocale()  : string
Function for formatting a date string based on the locale.

Functions

localesWithStopwordsList()

Returns an array of locales that have a stop words list and a stop words remover method

localesWithStopwordsList() : array<string|int, mixed>
Return values
array<string|int, mixed>

list of locales that have a stop words list

localeTagToIso639_2Tag()

Converts a $locale_tag (major-minor) to an ISO 639-2 language tag

localeTagToIso639_2Tag(string $locale_tag) : string
Parameters
$locale_tag : string

the locale tag to convert

Return values
string

corresponding ISO 639-2 language tag
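
The conversion described above can be pictured as a table lookup on the major part of the locale tag. The following is only a sketch under stated assumptions: the function name sketchLocaleToIso639_2, the tiny lookup table, the choice of ISO 639-2/T codes, and the fallback of returning the major part unchanged are all hypothetical and are not taken from Yioop's source.

```php
<?php
/**
 * Hypothetical stand-in for localeTagToIso639_2Tag(); not Yioop's code.
 * Maps the major part of a locale tag such as "en-US" or "en_US" to a
 * three-letter ISO 639-2/T code using a small illustrative table.
 */
function sketchLocaleToIso639_2(string $locale_tag): string
{
    // Illustrative subset only; a real table would cover many more languages.
    $table = [
        'ar' => 'ara',
        'de' => 'deu',
        'en' => 'eng',
        'fr' => 'fra',
    ];
    // Keep only the major part (the text before the first "-" or "_").
    $major = strtolower(preg_split('/[-_]/', $locale_tag)[0]);
    // Assumption of this sketch: unknown tags fall back to the major part.
    return $table[$major] ?? $major;
}
```

For example, sketchLocaleToIso639_2("en-US") looks up "en" in the table, so the minor part of the tag plays no role in this sketch.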

guessLocale()

Attempts to guess the user's locale based on the request, session, and user-agent data

guessLocale() : string
Return values
string

IANA language tag of the guessed locale

guessLocaleFromString()

Attempts to guess the user's locale based on a string sample

guessLocaleFromString(string $phrase_string[, string $locale_tag = null ]) : string
Parameters
$phrase_string : string

string sample used to make the guess

$locale_tag : string = null

language tag to fall back on if no guess can be made; if not provided, the current locale's value is used

Return values
string

IANA language tag of the guessed locale

checkQuery()

Tries to find whether query belongs to a programming language

checkQuery(string $query) : string
Parameters
$query : string

query entered by user

Return values
string

$lang programming language that the provided query belongs to

guessLangEncoding()

Tries to guess at a language tag based on the name of a character encoding

guessLangEncoding(string $encoding) : string
Parameters
$encoding : string

a character encoding name

Return values
string

guessed language tag

guessEncodingHtmlXml()

Tries to guess the encoding used for an Html document

guessEncodingHtmlXml(string $html[, string $return_loc_info = false ]) : mixed
Parameters
$html : string

the HTML or XML document whose encoding is to be guessed

$return_loc_info : string = false

if meta http-equiv info was used to find the encoding, then if $return_loc_info is true, we return the location of charset substring. This allows converting to UTF-8 later so cached pages will display correctly and redirects without char encoding won't be given a different hash.

Return values
mixed

either a string or an array. If a string, the guessed encoding; if an array, the guessed encoding, the start_pos of where the charset info came from, and its length
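
To make the two return shapes concrete, here is a minimal sketch of charset sniffing with an optional location result. It is not Yioop's implementation: the name sketchGuessEncoding, the 2048-byte window, the regular expression, the UTF-8 fallback, and the decision to report the position of the encoding name itself are all assumptions made for illustration. It only handles charset=... declarations, not XML encoding="..." declarations.

```php
<?php
/**
 * Hypothetical stand-in for guessEncodingHtmlXml(); not Yioop's code.
 * Returns either the encoding name, or [encoding, start_pos, length]
 * when $return_loc_info is true and a charset declaration was found.
 */
function sketchGuessEncoding(string $html, bool $return_loc_info = false)
{
    // Assumption: declarations appear near the start of the document.
    $head = substr($html, 0, 2048);
    $pattern = '/charset\s*=\s*["\']?\s*([A-Za-z0-9_\-:.]+)/i';
    if (preg_match($pattern, $head, $matches, PREG_OFFSET_CAPTURE)) {
        // With PREG_OFFSET_CAPTURE each group is [text, byte offset].
        [$name, $start_pos] = $matches[1];
        if ($return_loc_info) {
            return [strtoupper($name), $start_pos, strlen($name)];
        }
        return strtoupper($name);
    }
    // Assumption of this sketch: default to UTF-8 when nothing is declared.
    return 'UTF-8';
}
```

With the location in hand, a caller could rewrite the declared charset in place after converting the page, for instance with substr_replace($html, "UTF-8", $start_pos, $length).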

convertUtf8IfNeeded()

Converts page data in a site associative array to UTF-8 if it is not already in UTF-8

convertUtf8IfNeeded(array<string|int, mixed> &$site, string $page_field, string $encoding_field[, function $log_function = "" ]) : mixed
Parameters
$site : array<string|int, mixed>

an associative array of info about a web site

$page_field : string

the field in the associative array that contains the $site's web page as a string.

$encoding_field : string

the field in the associative array that contains the character encoding the page is currently in

$log_function : function = ""

a callback function used to write log messages with, if desired.

Return values
mixed

tl()

Translate the supplied arguments into the current locale.

tl() : string

This function takes a variable number of arguments. The first being an identifier to translate. Additional arguments are used to interpolate values in for %s's in the translation.

Return values
string

translated string
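
The variable-argument behaviour described above can be sketched as an identifier lookup followed by %s interpolation. This is an illustration only: the function sketchTl, the SKETCH_TRANSLATIONS table, and the explicit $locale_tag parameter are hypothetical (the documented tl() uses the current locale rather than taking a tag), and returning the identifier when no translation exists is an assumption.

```php
<?php
// Hypothetical translation table; the real strings come from Yioop's locale data.
const SKETCH_TRANSLATIONS = [
    'fr-FR' => [
        'greeting' => 'Bonjour %s, vous avez %s messages',
    ],
];

/**
 * Hypothetical stand-in for tl(); not Yioop's code.
 * The first argument after the locale is the identifier to translate;
 * any further arguments replace the %s placeholders in order.
 */
function sketchTl(string $locale_tag, string $id, ...$args): string
{
    // Assumption: fall back to the identifier itself if untranslated.
    $format = SKETCH_TRANSLATIONS[$locale_tag][$id] ?? $id;
    // Only interpolate when values were supplied.
    return $args ? vsprintf($format, $args) : $format;
}
```

A call such as sketchTl("fr-FR", "greeting", "Ana", 3) fills the two placeholders from left to right.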

setLocaleObject()

Sets the language to be used for locale settings

setLocaleObject(string $locale_tag) : mixed
Parameters
$locale_tag : string

the tag of the language to use to determine locale settings

Return values
mixed

getLocaleTag()

Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.

getLocaleTag() : string
Return values
string

the tag of the language currently being used for locale settings

getLocaleDirection()

Returns the text direction of the current locale's language.

getLocaleDirection() : string
Return values
string

ltr or rtl, depending on whether the language is left-to-right or right-to-left

getLocaleQueryStatistics()

Returns the query statistics info for the current locale.

getLocaleQueryStatistics() : array<string|int, mixed>
Return values
array<string|int, mixed>

consisting of queries and elapsed times for locale computations

getBlockProgression()

Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages, like classical Chinese, list them from right to left.

getBlockProgression() : string
Return values
string

tb, lr, or rl, depending on the current locale's block progression

getWritingMode()

Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).

getWritingMode() : string
Return values
string

the locale's writing mode
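
As a rough illustration of how a direction and a block progression could combine into a writing mode, consider the sketch below. Only the English case (lr-tb) is stated in this documentation; the function name sketchWritingMode, its explicit parameters, and the values produced for the other combinations (rl-tb, tb-rl, tb-lr) are assumptions.

```php
<?php
/**
 * Hypothetical stand-in for getWritingMode(); not Yioop's code.
 * $direction is "ltr" or "rtl"; $block_progression is "tb", "lr", or "rl".
 */
function sketchWritingMode(string $direction, string $block_progression): string
{
    if ($block_progression === 'tb') {
        // Horizontal lines stacked top to bottom, e.g. English: lr-tb.
        return ($direction === 'rtl') ? 'rl-tb' : 'lr-tb';
    }
    // Assumption: vertical lines run top to bottom, and the blocks
    // advance in the direction given by the block progression.
    return 'tb-' . $block_progression;
}
```

Under these assumptions a right-to-left language with top-to-bottom blocks would yield rl-tb, and a locale whose blocks progress from right to left would yield tb-rl.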

w1256ToUTF8()

Convert the string $str encoded in Windows-1256 into UTF-8

w1256ToUTF8(string $str) : string
Parameters
$str : string

Windows-1256 string to convert

Return values
string

the UTF-8 equivalent

utf8chr()

Given a Unicode code point, converts it to UTF-8

utf8chr(int $code) : string
Parameters
$code : int

the codepoint to convert

Return values
string

the corresponding UTF-8 string
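
For readers unfamiliar with the encoding, the following sketch shows the standard bit layout used to turn a code point into one to four UTF-8 bytes. It is an illustration of the general technique, not Yioop's source: the name sketchUtf8Chr is hypothetical, and the sketch assumes 0 <= $code <= 0x10FFFF without validating the range or rejecting surrogates.

```php
<?php
/**
 * Hypothetical stand-in for utf8chr(); not Yioop's code.
 * Assumes $code is a valid Unicode code point (0 to 0x10FFFF).
 */
function sketchUtf8Chr(int $code): string
{
    if ($code < 0x80) {
        // 1 byte: 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6)) .
            chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12)) .
            chr(0x80 | (($code >> 6) & 0x3F)) .
            chr(0x80 | ($code & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18)) .
        chr(0x80 | (($code >> 12) & 0x3F)) .
        chr(0x80 | (($code >> 6) & 0x3F)) .
        chr(0x80 | ($code & 0x3F));
}
```

Each continuation byte carries six bits of the code point, which is why the shifts step down by 6 and the mask is 0x3F.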

formatDateByLocale()

Function for formatting a date string based on the locale.

formatDateByLocale( $timestamp,  $locale_tag) : string
Parameters
$timestamp :

the crawl time, as a timestamp, to format

$locale_tag :

the tag of the locale whose date format should be used

Return values
string

formatted date string
