LocaleFunctions.php
SeekQuarry/Yioop -- Open Source Pure PHP Search Engine, Crawler, and Indexer
Copyright (C) 2009 - 2023 Chris Pollett chris@pollett.org
LICENSE:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
END LICENSE
Table of Contents
- localesWithStopwordsList() : array<string|int, mixed>
- Returns an array of locales that have a stop words list and a stop words remover method
- localeTagToIso639_2Tag() : string
- Converts a $locale_tag (major-minor) to an ISO 639-2 language name
- guessLocale() : string
- Attempts to guess the user's locale based on the request, session, and user-agent data
- guessLocaleFromString() : string
- Attempts to guess the user's locale based on a string sample
- checkQuery() : string
- Tries to find whether query belongs to a programming language
- guessLangEncoding() : string
- Tries to guess at a language tag based on the name of a character encoding
- guessEncodingHtmlXml() : mixed
- Tries to guess the encoding used for an Html document
- convertUtf8IfNeeded() : mixed
- Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
- tl() : string
- Translate the supplied arguments into the current locale.
- setLocaleObject() : mixed
- Sets the language to be used for locale settings
- getLocaleTag() : string
- Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
- getLocaleDirection() : string
- Returns the current language direction.
- getLocaleQueryStatistics() : array<string|int, mixed>
- Returns the query statistics info for the current locale.
- getBlockProgression() : string
- Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.
- getWritingMode() : string
- Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
- w1256ToUTF8() : string
- Convert the string $str encoded in Windows-1256 into UTF-8
- utf8chr() : string
- Converts a Unicode code point to UTF-8
- formatDateByLocale() : string
- Function for formatting a date string based on the locale.
Functions
localesWithStopwordsList()
Returns an array of locales that have a stop words list and a stop words remover method
localesWithStopwordsList() : array<string|int, mixed>
Return values
array<string|int, mixed> —list of locales that have a stop words list
localeTagToIso639_2Tag()
Converts a $locale_tag (major-minor) to an ISO 639-2 language name
localeTagToIso639_2Tag(string $locale_tag) : string
Parameters
- $locale_tag : string
-
the locale tag to convert
Return values
string —corresponding ISO 639-2 language tag
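As a rough illustration of the conversion described above, the following sketch maps a few locale tags to ISO 639-2 codes. The function name `localeTagToIso639Sketch`, the small lookup table, and the fallback behavior are assumptions made for illustration; they are not taken from Yioop's source, which may cover more locales and treat unknown tags differently.

```php
<?php
/**
 * Hypothetical sketch: map a locale tag such as "en-US" to an
 * ISO 639-2 code. Not Yioop's actual implementation.
 */
function localeTagToIso639Sketch(string $locale_tag): string
{
    // Illustrative subset of ISO 639-2 (terminology) codes only.
    $map = ['en' => 'eng', 'fr' => 'fra', 'de' => 'deu', 'ar' => 'ara'];
    // Keep only the major part of the tag ("en" from "en-US" or "en_US").
    $major = strtolower(preg_split('/[-_]/', $locale_tag)[0]);
    // Assumption: fall back to the major part when no mapping is known.
    return $map[$major] ?? $major;
}
```

With this sketch, `localeTagToIso639Sketch('en-US')` yields `eng`, and an unmapped tag is returned as its lowercased major part.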
guessLocale()
Attempts to guess the user's locale based on the request, session, and user-agent data
guessLocale() : string
Return values
string —IANA language tag of the guessed locale
guessLocaleFromString()
Attempts to guess the user's locale based on a string sample
guessLocaleFromString(string $phrase_string[, string $locale_tag = null ]) : string
Parameters
- $phrase_string : string
-
string sample used to make the guess
- $locale_tag : string = null
-
language tag to fall back on if no guess can be made; if not provided, the current locale's value is used
Return values
string —IANA language tag of the guessed locale
checkQuery()
Tries to find whether query belongs to a programming language
checkQuery(string $query) : string
Parameters
- $query : string
-
query entered by user
Return values
string —$lang programming language for the query provided
guessLangEncoding()
Tries to guess at a language tag based on the name of a character encoding
guessLangEncoding(string $encoding) : string
Parameters
- $encoding : string
-
a character encoding name
Return values
string —guessed language tag
guessEncodingHtmlXml()
Tries to guess the encoding used for an Html document
guessEncodingHtmlXml(string $html[, string $return_loc_info = false ]) : mixed
Parameters
- $html : string
-
the HTML or XML document whose encoding is to be guessed
- $return_loc_info : string = false
-
if meta http-equiv info was used to find the encoding, then if $return_loc_info is true, we return the location of charset substring. This allows converting to UTF-8 later so cached pages will display correctly and redirects without char encoding won't be given a different hash.
Return values
mixed —either a string or an array. If a string, the guessed encoding; if an array, the guessed encoding, the start_pos of where the charset info came from, and its length
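The two return shapes described above can be sketched as follows. This is a minimal illustration, not Yioop's implementation: the function name `guessEncodingSketch`, the regular expression, the UTF-8 fallback, and the choice to report the position of the encoding name itself are all assumptions.

```php
<?php
/**
 * Hypothetical sketch: look for a charset declaration in markup.
 * Returns the encoding name, or [encoding, start_pos, length] when
 * $return_loc_info is true.
 */
function guessEncodingSketch(string $html, bool $return_loc_info = false)
{
    // Matches charset=... in <meta> tags and encoding="..." in XML declarations.
    $pattern = '/(?:charset|encoding)\s*=\s*["\']?([A-Za-z0-9_\-]+)/i';
    if (preg_match($pattern, $html, $matches, PREG_OFFSET_CAPTURE)) {
        // With PREG_OFFSET_CAPTURE each match is [text, byte offset].
        [$encoding, $start_pos] = $matches[1];
        $length = strlen($encoding);
        $encoding = strtoupper($encoding);
        return $return_loc_info ? [$encoding, $start_pos, $length] : $encoding;
    }
    // Assumption: default to UTF-8 when no declaration is found.
    return $return_loc_info ? ['UTF-8', -1, 0] : 'UTF-8';
}
```

Returning the offset and length lets a caller later rewrite the declared charset in place, which is the use the parameter description mentions.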
convertUtf8IfNeeded()
Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
convertUtf8IfNeeded(array<string|int, mixed> &$site, string $page_field, string $encoding_field[, function $log_function = "" ]) : mixed
Parameters
- $site : array<string|int, mixed>
-
an associative array of info about a web site
- $page_field : string
-
the field in the associative array that contains the $site's web page as a string.
- $encoding_field : string
-
the field in the associative array that contains the character encoding the page is currently in
- $log_function : function = ""
-
a callback function used to write log messages with, if desired.
Return values
mixed —
tl()
Translate the supplied arguments into the current locale.
tl() : string
This function takes a variable number of arguments, the first being an identifier to translate. Additional arguments are interpolated in place of the %s placeholders in the translation.
Return values
string —translated string
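The variadic interpolation described above might look roughly like the sketch below. It is only an illustration: the name `tlSketch`, the explicit `$translations` array (standing in for wherever the current locale's strings are stored), the fallback to the identifier, and the identifier `search_results_count` are all hypothetical.

```php
<?php
/**
 * Hypothetical sketch of variadic translation with %s interpolation.
 * Unlike tl(), this takes the translation table as an explicit argument.
 */
function tlSketch(array $translations, string $identifier, ...$args): string
{
    // Assumption: fall back to the identifier itself when no translation exists.
    $template = $translations[$identifier] ?? $identifier;
    // Substitute the remaining arguments for the %s placeholders, in order.
    return $args ? vsprintf($template, $args) : $template;
}

// Usage with a made-up identifier and translation string.
$translations = ['search_results_count' => '%s results for %s'];
echo tlSketch($translations, 'search_results_count', 10, 'php'), "\n";
```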
setLocaleObject()
Sets the language to be used for locale settings
setLocaleObject(string $locale_tag) : mixed
Parameters
- $locale_tag : string
-
the tag of the language to use to determine locale settings
Return values
mixed —
getLocaleTag()
Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
getLocaleTag() : string
Return values
string —the tag of the language currently being used for locale settings
getLocaleDirection()
Returns the current language direction.
getLocaleDirection() : string
Return values
string —ltr or rtl depending on if the language is left-to-right or right-to-left
getLocaleQueryStatistics()
Returns the query statistics info for the current locale.
getLocaleQueryStatistics() : array<string|int, mixed>
Return values
array<string|int, mixed> —consisting of queries and elapsed times for locale computations
getBlockProgression()
Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.
getBlockProgression() : string
Return values
string —tb, lr, or rl depending on the current locale's block progression
getWritingMode()
Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
getWritingMode() : string
Return values
string —the locale's writing mode
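One way to picture how a direction and a block progression combine into a writing mode is the sketch below. It takes both values as parameters rather than reading them from the current locale. The name `writingModeSketch` and the handling of vertical progressions (following the legacy CSS values `tb-rl` and `tb-lr`) are assumptions, not a description of Yioop's code.

```php
<?php
/**
 * Hypothetical sketch: combine a direction ("ltr" or "rtl") with a
 * block progression ("tb", "lr", or "rl") into a writing mode string.
 */
function writingModeSketch(string $direction, string $block_progression): string
{
    if ($block_progression === 'tb') {
        // Horizontal text: "ltr" becomes "lr" and "rtl" becomes "rl".
        $inline = ($direction === 'rtl') ? 'rl' : 'lr';
        return $inline . '-tb';
    }
    // Assumption: vertical text runs top-to-bottom within each column,
    // and columns advance in the block progression direction.
    return 'tb-' . $block_progression;
}
```

For English, `writingModeSketch('ltr', 'tb')` gives `lr-tb`, matching the example in the description above.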
w1256ToUTF8()
Convert the string $str encoded in Windows-1256 into UTF-8
w1256ToUTF8(string $str) : string
Parameters
- $str : string
-
Windows-1256 string to convert
Return values
string —the UTF-8 equivalent
utf8chr()
Converts a Unicode code point to UTF-8
utf8chr(int $code) : string
Parameters
- $code : int
-
the codepoint to convert
Return values
string —the corresponding UTF-8 string
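For readers unfamiliar with the encoding, the standard code point to UTF-8 conversion can be sketched as below. The name `utf8chrSketch` is hypothetical and the code is not copied from Yioop; it also performs no validation (for example, it does not reject surrogates or values above 0x10FFFF).

```php
<?php
/**
 * Hypothetical sketch: encode a Unicode code point as a UTF-8 byte string.
 */
function utf8chrSketch(int $code): string
{
    if ($code < 0x80) {
        // 1 byte: 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12))
            . chr(0x80 | (($code >> 6) & 0x3F))
            . chr(0x80 | ($code & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18))
        . chr(0x80 | (($code >> 12) & 0x3F))
        . chr(0x80 | (($code >> 6) & 0x3F))
        . chr(0x80 | ($code & 0x3F));
}
```

For example, the code point 0x20AC (the euro sign) encodes to the three bytes E2 82 AC.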
formatDateByLocale()
Function for formatting a date string based on the locale.
formatDateByLocale($timestamp, $locale_tag) : string
Parameters
Return values
string —formatted date string