Application
Interfaces, Classes, Traits and Enums
- Compressor
- A Compressor is used to apply a filter to objects before they are stored into a WebArchive. The filter is assumed to be invertible, and the typical intention is that the filter carries out some kind of string compression.
- CrawlConstants
- Shared constants and enums used by components that are involved in the crawling process
- MediaConstants
- Shared constants and enums used by components that are involved in media-related operations
- Notifier
- A Notifier is an object which will be notified by a priority queue when the index of some data item in the queue, viewed as an array, has changed.
- ConfigureTool
- Provides a command-line interface for configuring a Yioop instance.
- CreditConfig
- Class containing methods used to handle payment processing when keyword advertising is enabled.
- AdminController
- Controller used to handle admin functionalities such as modifying login and password, and CREATE, UPDATE, and DELETE operations for users, roles, locales, and crawls
- ApiController
- Controller used mainly for handling JS requests for Help Wiki Pages
- ArchiveController
- Fetcher machines also act as archives for complete caches of web pages; this controller is used to handle access to these web page caches
- ClassifierController
- This class handles XmlHttpRequests to label documents during classifier construction.
- AccountaccessComponent
- Component of the Yioop control panel used to handle activities for managing accounts, users, roles, and groups, i.e., the settings of users and groups, what roles and groups a user has, what roles and users a group has, and what activities make up a role. It is used by AdminController.
- ChatbotComponent
- Provides the AdminController activity that allows users to create Chat Bot Stories. A Chat Bot story is a collection of patterns (expression, trigger state, remote call, result state, responses) that govern how a chat bot will behave under various circumstances
- Component
- Base component class for all components on the SeekQuarry site. A component consists of a collection of activities and their auxiliary methods that can be used by a controller
- CrawlComponent
- This component is used to provide activities for the admin controller related to configuring and performing a web or archive crawl
- SocialComponent
- Provides activities to AdminController related to creating, updating blogs (and blog entries), static web pages, and crawl mixes.
- StoreComponent
- Component of the Yioop control panel used to handle activities for managing advertisements, i.e., creating, activating/deactivating, and editing advertisements. It is used by AdminController.
- SystemComponent
- This component is used to handle activities related to the configuration of a Yioop installation, translations of text strings in the installation, as well as control of specifying what machines make up the installation and which processes they run.
- Controller
- Base controller class for all controllers on the SeekQuarry site.
- CrawlController
- Controller used to manage networked installations of Yioop where there might be multiple QueueServers and a NameServer. Commands sent to the name server web page are mapped out to queue_servers using this controller. Each method of the controller essentially mimics one method of CrawlModel, PhraseModel, or in general anything that extends ParallelModel, and is used to proxy that information through a result web page back to the name_server.
- FetchController
- This class handles data coming to a queue_server from a fetcher. Basically, it receives the data from the fetcher and saves it into various files for later processing by the queue server.
- GroupController
- Controller used to handle user group activities outside of the admin panel setting. This either could be because the admin panel is "collapsed" or because the request concerns a wiki page.
- JobsController
- This class is used to handle requests from a MediaUpdater to a name server. There are three main types of requests: getUpdateProperties and, for any job that the MediaUpdater might be running, its getTasks and putTasks requests. getUpdateProperties is supposed to provide configuration settings for the MediaUpdater. A MediaUpdater might be running several periodic jobs. The getTasks request of a job is used to see if there is any new work available of that job type on the name server. A putTasks request is used to handle any computed data sent back from a MediaUpdater to the name server.
- MachineController
- This class handles requests from a computer that is managing several fetchers and queue_servers. This controller might be used to start or stop fetchers and queue_servers, as well as to get the status of the active fetchers.
- RegisterController
- Controller used to handle account registration and retrieval for the Yioop website. Also handles data from suggest-a-url forms.
- ResourceController
- Used to serve resources such as images, css, or scripts from APP_DIR
- SearchController
- Controller used to handle search requests to the SeekQuarry search site. Used to both get and display search results.
- StaticController
- This controller is used by the Yioop web site to display PUBLIC_GROUP_ID pages more like static, forward-facing pages.
- TestsController
- Controller used to handle search requests to the SeekQuarry search site. Used to both get and display search results.
- StockBot
- This class demonstrates a simple Stock Chat Bot using the Yioop ChatBot APIs for Yioop Discussion Groups.
- WeatherBot
- This class demonstrates a simple Weather Chat Bot using the Yioop ChatBot APIs for Yioop Discussion Groups.
- ArcTool
- Command line program that allows one to examine the content of the WebArchiveBundles and IndexArchiveBundles of Yioop crawls.
- ClassifierTool
- Class used to encapsulate all the activities of the ClassifierTool.php command line script. This script allows one to automate the building and testing of classifiers, providing an alternative to the web interface when
- PageIterator
- This class provides the same interface as an iterator over crawl mixes, but simply iterates over an array.
- ClassifierTrainer
- This class is used to finalize a classifier via the web interface.
- DictionaryUpdater
- Fetcher
- This class is responsible for fetching web pages for the SeekQuarry/Yioop search engine
- MediaUpdater
- Separate process/command-line script which can be used to update news sources for Yioop and also handle other kinds of activities such as video conversion. This is an alternative to using the web app for updating. It makes use of the web app's code.
- Mirror
- This class is responsible for syncing crawl archives between machines using the SeekQuarry/Yioop search engine
- QueueServer
- Command line program responsible for managing Yioop crawls.
- AnalyticsManager
- Used to set and get SQL query and search query timing statistics between models and index_bundle_iterators
- ArcArchiveBundleIterator
- Used to iterate through the records of a collection of arc files stored in a WebArchiveBundle folder. Arc is the file format of the Internet Archive (see http://www.archive.org/web/researcher/ArcFileFormat.php). Iteration would be for the purpose of making an index of these records.
- ArchiveBundleIterator
- Abstract class used to model iterating documents indexed in a WebArchiveBundle or set of such bundles.
- DatabaseBundleIterator
- Used to iterate through the records that result from an SQL query to a database
- MediaWikiArchiveBundleIterator
- Used to iterate through a collection of .xml.bz2 media wiki files stored in a WebArchiveBundle folder. Here these media wiki files contain the kinds of documents used by Wikipedia. Iteration would be for the purpose of making an index of these records.
- MixArchiveBundleIterator
- Used to do an archive crawl based on the results of a crawl mix.
- OdpRdfArchiveBundleIterator
- Used to iterate through the records of a collection of one or more Open Directory RDF files stored in a WebArchiveBundle folder. Open Directory files can be found at http://rdf.dmoz.org/ . Iteration would be for the purpose of making an index of these records.
- TextArchiveBundleIterator
- Used to iterate through the records of a collection of text or compressed text-oriented records
- WarcArchiveBundleIterator
- Used to iterate through the records of a collection of warc files stored in a WebArchiveBundle folder. Warc is the newer file format used by the Internet Archive and others for digital preservation (see http://www.digitalpreservation.gov/formats/fdd/fdd000236.shtml and http://archive-access.sourceforge.net/warc/). Iteration is done for the purpose of making an index of these records.
- WebArchiveBundleIterator
- Class used to model iterating documents indexed in a WebArchiveBundle. This would typically be for the purpose of re-indexing these documents.
- BloomFilterBundle
- A BloomFilterBundle is a directory of BloomFilterFiles.
- BloomFilterFile
- Code used to manage a bloom filter in memory and in a file.
- BPlusTree
- This class implements the B+-tree structure over an existing file system
- BZip2BlockIterator
- This class is used to allow one to iterate through a Bzip2 file.
- BinaryFeatures
- A concrete Features subclass that represents a document as a binary vector where a one indicates that a feature is present in the document, and a zero indicates that it is not. The absent features are ignored, so the binary vector is actually sparse, containing only those feature indices where the value is one.
- ChiSquaredFeatureSelection
- A subclass of FeatureSelection that implements chi-squared feature selection.
- Classifier
- The primary interface for building and using classifiers. An instance of this class represents a single classifier in memory, but the class also provides static methods to manage classifiers on disk.
- ClassifierAlgorithm
- An abstract class shared by classification algorithms that implement a common interface.
- Features
- Manages a dataset's features, providing a standard interface for converting documents to feature vectors, and for accessing feature statistics.
- FeatureSelection
- This is an abstract class that specifies an interface for selecting top features from a dataset.
- LassoRegression
- Implements the logistic regression text classification algorithm using lasso regression and a cyclic coordinate descent optimization step.
- InvertedData
- Stores a data matrix in an inverted index on columns with non-zero entries.
- NaiveBayes
- Implements the Naive Bayes text classification algorithm.
- SparseMatrix
- A sparse matrix implementation based on an associative array of associative arrays.
- WeightedFeatures
- A concrete Features subclass that represents a document as a vector of feature weights, where weights are computed using a modified form of TF * IDF. This feature mapping is experimental, and may not work correctly.
- GzipCompressor
- Implementation of a Compressor using GZIP/GUNZIP as the filter.
- NonCompressor
- Implementation of a trivial Compressor.
- ComputerVision
- Class used to encapsulate various methods related to computer vision that might be useful for indexing documents. These include recognizing text in images
- ContextTagger
- Abstract, base context tagger class.
- CrawlDaemon
- Used to run scripts as a daemon on *nix systems
- CrawlQueueBundle
- Encapsulates the data structures needed to have a queue of urls to crawl
- DoubleIndexBundle
- A DoubleIndexBundle encapsulates and provides methods for two IndexDocumentBundles used to store a repeating crawl. One of these bundles is used to handle current search queries, while the other is used to store an ongoing crawl. Once the crawl time has been reached, the roles of the two bundles are swapped.
- FeedArchiveBundle
- Subclass of IndexArchiveBundle with bloom filters to make it easy to check if a news feed item has been added to the bundle already before adding it
- FeedDocumentBundle
- Subclass of IndexDocumentBundle with bloom filters to make it easy to check if a news feed item has been added to the bundle already before adding it
- FetchGitRepositoryUrls
- Library of functions used to fetch Git internal urls
- FetchUrl
- Code used to manage HTTP or Gopher requests from one or more URLs
- FileCache
- Library of functions used to implement a simple file cache
- HashTable
- Code used to manage a memory efficient hash table. Weights for the queue must be floats.
- DisjointIterator
- Used to iterate over the documents which occur in a set of disjoint iterators all belonging to the same index
- DocIterator
- Used to iterate through all the documents and links associated with an IndexArchiveBundle. It iterates through each doc or link regardless of the words it contains. It also makes it easy to get the summaries of these documents.
- GroupIterator
- This iterator is used to group together documents or document parts which share the same url. For instance, a link document item and the document that it links to will both be stored in the IndexArchiveBundle by the QueueServer. This iterator would combine both these items into a single document result with the sum of their scores, and a summary, if returned, containing text from both sources. The iterator's purpose is vaguely analogous to a SQL GROUP BY clause.
- IndexBundleIterator
- Abstract class used to model iterating documents indexed in an IndexArchiveBundle or set of such bundles.
- IntersectIterator
- Used to iterate over the documents which occur in all of a set of iterator results
- NegationIterator
- Used to iterate over the documents which don't occur in a set of iterator results
- NetworkIterator
- This iterator is used to handle querying a network of queue_servers with regard to a query
- UnionIterator
- Used to iterate over the documents which occur in any of a set of WordIterator results
- WordIterator
- Used to iterate through the documents associated with a word in an IndexArchiveBundle. It also makes it easy to get the summaries of these documents.
- IndexArchiveBundle
- Encapsulates a set of web page summaries and an inverted word-index of terms from these summaries which allow one to search for summaries containing a particular word.
- IndexDictionary
- Data structure used to store entries of the form: word id, index shard generation, posting list offset, and length of posting list. It has entries for all words stored in a given IndexArchiveBundle. There might be multiple entries for a given word_id if it occurs in more than one index shard in the given IndexArchiveBundle.
- IndexDocumentBundle
- Encapsulates a set of web page documents and an inverted word-index of terms from these documents which allow one to search for documents containing a particular word.
- AddressesPlugin
- Used to extract emails, phone numbers, and addresses from a web page.
- IndexingPlugin
- Base indexing plugin class. An indexing plugin allows a developer to do additional processing on web pages during a crawl, then, after the web crawl is over, do post processing on the additional data that was collected. For example, during a crawl one might analyze web pages and mark those that have recipes on them with the meta word recipe:all, then, after the crawl is over, do post processing such as clustering the recipes found and adding additional meta words to retrieve recipes by principal ingredient.
- WordfilterPlugin
- WordFilterPlugin is used to filter documents by terms during a crawl.
- IndexManager
- Class used to manage open IndexArchiveBundles while performing a query. Ensures an easy place to obtain references to these bundles and ensures only one object per bundle is instantiated in a Singleton-esque way.
- IndexShard
- Data structure used to store one generation worth of the word document index (inverted index). This data structure consists of three main components: word entries, word_doc entries, and document entries.
- JavascriptUnitTest
- Super class of all the test classes testing Javascript functions.
- LinearAlgebra
- Class useful for handling linear algebra operations on associative arrays with key => value pairs where the value is a number.
- LinearHashTable
- This class implements a linear hash table for storing records that use PackedTableTools for their format
- MailServer
- A small class for communicating with an SMTP server. Used to avoid configuration issues that might be needed with PHP's built-in mail() function.
- AnalyticsJob
- A media job used to periodically calculate summary statistics about group, thread, page, and query impressions.
- BulkEmailJob
- MediaJob class for sending out emails from a Yioop instance (either in response to account registrations or in response to group posts and similar activities)
- FeedsUpdateJob
- A media job to download and index feeds from various search sources (RSS, HTML scraper, etc). The idea is that this job runs once an hour to get the latest news, movies, and weather from those sources.
- MediaJob
- Base class for jobs to be carried out by a MediaUpdater process.
- PodcastDownloadJob
- A media job to periodically download Podcasts and store them as resources of a Wiki Page
- RecommendationJob
- RecommendationJob recommends trending threads, as well as threads and groups which are relevant based on the user's viewing history
- TrendingHighlightsJob
- A media job to compute trending terms from the feed search sources, and to generate a list of top feed items for the landing page for the different subsearches displayed on the landing page.
- VideoConvertJob
- Media Job used to convert videos uploaded to the wiki or group feeds to a common format (mp4)
- WikiThumbDetailJob
- A media job to add thumbnails and animated thumbnails for wiki page media resources that have just been viewed in the browser. This is detected by the method GroupModel::getGroupPageResourceUrls, which writes a file GroupModel::NEEDS_THUMBS_DIR . L\crawlHash($folder) . ".txt" with information about the folders needing thumbs.
- NamedEntityContextTagger
- Machine learning based named entity recognizer.
- NWordGrams
- Library of functions used to create and extract n word grams
- PackedTableTools
- A collection of methods to encode and decode records according to a signature.
- PageRuleParser
- Has methods to parse user-defined page rules to apply to documents to be indexed.
- PartialZipArchive
- Used to extract files from an initial segment or a fragment of a ZIP Archive.
- PartitionDocumentBundle
- A partition document bundle is a collection of partitions, each of which in turn can hold a concatenated sequence of compressed documents, and which are managed together. It is a successor format to the earlier WebArchiveBundle of Yioop. The partition document bundle stores individual records using a record format defined via the PackedTableTools class.
- PartOfSpeechContextTagger
- Machine learning based Part of Speech tagger.
- PersistentStructure
- A PersistentStructure is a data structure which every so many operations will be saved to secondary storage (such as disk).
- PhraseParser
- Library of functions used to manipulate words and phrases
- PriorityQueue
- Code used to manage a memory efficient priority queue.
- BmpProcessor
- Used to create crawl summary information for BMP and ICO files
- CompressedProcessor
- Used to create crawl summary information for a gz compressed file whose uncompressed form has a processor we index.
- DocProcessor
- Used to create crawl summary information for binary DOC files
- DocxProcessor
- Used to create crawl summary information for DOCX files
- EpubProcessor
- Used to create crawl summary information for EPUB files (those served as application/epub+zip)
- GifProcessor
- Used to create crawl summary information for GIF files
- GitXmlProcessor
- Parent class common to all processors used to create crawl summary information that involves basically text data
- GopherProcessor
- Used to create crawl summary information for gopher protocol pages
- HtmlProcessor
- Used to create crawl summary information for HTML files
- IconProcessor
- Used to create crawl summary information for BMP and ICO files
- ImageProcessor
- Base abstract class common to all processors used to create crawl summary information from images
- JavaProcessor
- Parent class common to all processors used to create crawl summary information that involves basically text data
- JpgProcessor
- Used to create crawl summary information for JPEG files
- PageProcessor
- Base class common to all processors of web page data
- PdfProcessor
- Used to create crawl summary information for PDF files
- PngProcessor
- Used to create crawl summary information for PNG files
- PptProcessor
- Used to create crawl summary information for PPT files
- PptxProcessor
- Used to create crawl summary information for PPTX files
- PythonProcessor
- Parent class common to all processors used to create crawl summary information that involves basically text data
- RobotProcessor
- Processor class used to extract information from robots.txt files
- RssProcessor
- Used to create crawl summary information for RSS or Atom files
- RtfProcessor
- Used to create crawl summary information for RTF files
- SitemapProcessor
- Used to create crawl summary information for sitemap files
- SvgProcessor
- Used to create crawl summary information for SVG files. This class is a little bit weird in that it generates thumbs like the image processor classes, but when it gives up on the data it falls back to text processor handling.
- TextProcessor
- Parent class common to all processors used to create crawl summary information that involves basically text data
- VideoProcessor
- Base abstract class common to all processors used to create crawl summary information from videos
- XlsxProcessor
- Used to create crawl summary information for XLSX files
- XmlProcessor
- Used to create crawl summary information for XML files (those served as text/xml)
- ScraperManager
- Class used by html processors to detect if a page matches a particular signature such as that of a content management system, and also to provide scraping mechanisms for the content of such a page
- StochasticTermSegmenter
- Class for segmenting terms using Stochastic Finite State Word Segmentation
- StringArray
- Memory efficient implementation of persistent arrays
- SuffixTree
- Data structure used to maintain a suffix tree for a passage of words.
- CentroidSummarizer
- Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. This is done by the @see getSummary method. getSummary does this by splitting the document into sentences and computing inverse sentence frequency (should be ISF, but we call it IDF) scores for each term. It then computes an average document vector (which we call the centroid) with components (total number of occurrences of term) * (IDF score of term).
- CentroidWeightedSummarizer
- Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. This is done by the @see getSummary method. To generate a summary, a normalized term frequency vector is computed for each sentence. An average vector is then computed by summing these and renormalizing the result.
- GraphBasedSummarizer
- Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. The method @see getSummary is used to obtain such a summary. In GraphBasedSummarizer's implementation of this method, sentences are ranked using a page rank style algorithm based on sentence adjacencies calculated using a distortion score between pairs of sentences (@see LinearAlgebra::distortion for details on this).
- ScrapeSummarizer
- Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing.
- Summarizer
- Base class for all summarizers. A summarizer's chief method is getSummary, which is supposed to take a text or XML document and produce a summary of that document up to PageProcessor::$max_description_len many characters. Summarizers also contain various methods to generate a word cloud from such a summary.
- Trie
- Implements a trie data structure which can be used to store terms read from a dictionary in a succinct way
- UnitTest
- Base class for all the SeekQuarry/Yioop engine Unit tests
- UrlParser
- Library of functions used to manipulate and to extract components from urls
- Mod9Constants
- Mini-class (so not in its own file) used to hold encode/decode info related to Mod9 encoding (a variant of Simplified-9 specific to Yioop).
- VersionManager
- VersionManager can be used to create and manage versions of files in a folder so that a user can revert the files to any version desired, back to the time the folder was first placed under management. It is used by Yioop's Wiki system to handle versions of image and other media resources for a Wiki page.
- WebArchive
- Code used to manage web archive files
- WebArchiveBundle
- A web archive bundle is a collection of web archives which are managed together. It is useful to split data across several archive files rather than just store it in one, for both read efficiency and to keep file sizes from getting too big. In some places we are using 4 byte ints to store file offsets, which restricts the size of the files we can use for web archives.
- WebSite
- A single file, low dependency, pure PHP web server and web routing engine class.
- WebException
- Exception generated when a running WebSite script calls webExit()
- WikiParser
- Class with methods to parse mediawiki documents, both within Yioop, and when Yioop indexes mediawiki dumps as from Wikipedia.
- Tokenizer
- Arabic specific tokenization code. In particular, it has a stemmer. The stemmer is my stab at porting Ljiljana Dolamic's (University of Neuchatel, www.unine.ch/info/clef/) C stemming algorithm (see http://members.unine.ch/jacques.savoy/clef). That algorithm maps all stems to ASCII. Instead, I tried to leave everything using Arabic characters.
- Tokenizer
- Bengali specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- German specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Greek specific tokenization code. Contains a list of Greek stop words used in making word clouds. It also has a Greek stemmer.
- Tokenizer
- This class has a collection of methods for English locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation), and a part of speech tagger (for question answering). The stemmer is my stab at implementing the Porter Stemmer algorithm (presented at http://tartarus.org/~martin/PorterStemmer/def.txt). The code is based on the non-thread safe C version given by Martin Porter.
- Tokenizer
- Spanish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Persian specific tokenization code. In particular, it has a stemmer. The stemmer is a modified variant (handling prefixes slightly differently) of my stab at porting Nick Patch's Perl port, https://metacpan.org/pod/Lingua::Stem::UniNE::FA, of the stemming algorithm by Ljiljana Dolamic and Jacques Savoy of the University of Neuchâtel. The Java version of this is at http://members.unine.ch/jacques.savoy/clef/persianStemmerUnicode.txt (beware of Java's handling of Unicode).
- Tokenizer
- This class has a collection of methods for French locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation). The stemmer is my stab at re-implementing the stemmer algorithm given at http://snowball.tartarus.org and was inspired by http://snowball.tartarus.org/otherlangs/french_javascript.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
- Tokenizer
- Hebrew specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Hindi specific tokenization code. In particular, it has a stemmer. The stemmer is my stab at porting Ljiljana Dolamic's (University of Neuchatel, www.unine.ch/info/clef/) Java stemming algorithm (see http://members.unine.ch/jacques.savoy/clef/HindiStemmerLight.java.txt). Here, given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
- Tokenizer
- Indonesian specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Italian specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Japanese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Kannada specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Korean specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- This class has a collection of methods for Dutch locale specific tokenization. In particular, it has a stemmer.
- Tokenizer
- Polish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- This class has a collection of methods for Portuguese locale specific tokenization. In particular, it has a stemmer implementing the Snowball Stemming algorithm presented in http://snowball.tartarus.org/algorithms/portuguese/stemmer.html
- Tokenizer
- This class has a collection of methods for Russian locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation). The stemmer is a modification (with bug fixes) of Dennis Kreminsky's stemmer from: http://snowball.tartarus.org/otherlangs/russian_php5.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
- Tokenizer
- Telugu specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Thai specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Tagalog (spoken in the Philippines) specific tokenization code.
- Tokenizer
- Turkish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- Tokenizer
- Vietnamese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram. For Vietnamese, neither char gramming nor stemming seemed to make sense, so for now this file is blank.
- Tokenizer
- Chinese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
- ActivityModel
- This class is used to handle db results related to Administration Activities
- AdvertisementModel
- This class is used to handle database statements related to Advertisements
- BotModel
- BotModel is used to handle database statements related to Bot User stories. A Bot User Story consists of a sequence of patterns for what a bot should do when another user posts a request to the bot (a message beginning with @bot_name) in a discussion group.
- CaptchaModel
- This class is used to handle the captcha settings for Yioop
- CrawlModel
- This class is used to handle getting/setting crawl parameters, CRUD operations on current crawls, start, stop, status of crawls, getting cache files out of crawls, determining what is the default index to be used, marshalling/unmarshalling crawl mixes, and handling data from suggest-a-url forms
- CreditModel
- This class is used to manage Advertising Credits a user may purchase or spend
- CronModel
- Used to remember the last time the web app ran periodic activities
- DatasourceManager
- This abstract class defines the interface through which the seek_quarry program communicates with a database and the filesystem.
- MysqlManager
- Mysql DatasourceManager
- PdoManager
- Pdo DatasourceManager
- Sqlite3Manager
- SQLite3 DatasourceManager
- GroupModel
- This class is used to handle db results related to Group Administration.
- ImpressionModel
- Model used to keep track of analytics and user experience activities that users carry out on a Yioop web site. For analytics, things that might be tracked are wiki page views, queries, and query outcomes. For UX, the impression model keeps track of the recent groups a user has visited to provide better bread crumb drop downs, makes the manage account landing page list more relevant groups, and determines whether a media item has started to be watched, been completely watched, etc.
- LocaleModel
- Used to encapsulate information about a locale (data about a language in a given region).
- MachineModel
- This class is used to handle db results related to Machine Administration
- Model
- This is a base class for all models in the SeekQuarry search engine. It provides support functions for formatting search results
- ParallelModel
- Base class of models that need access to data from multiple queue servers. Subclasses include @see CrawlModel and @see PhraseModel.
- PhraseModel
- This class is used to handle results for a given phrase search
- ProfileModel
- This class is used to handle getting and saving the Profile.php of the current search engine instance
- RoleModel
- This class is used to handle db results related to Role Administration
- ScraperModel
- Used to manage data related to the SCRAPER database table.
- SearchverticalsModel
- This class manages the editing of search verticals. This includes allowing one to specify that a search result should be filtered from the results of a query, altering the title and description of a result from how it is stored in a particular index, and creating, updating, and deleting knowledge wiki results. To handle these activities, this class leverages the existing group wiki system of Yioop. Edited and filtered search results correspond to group feed entries in a Search Group. Edited knowledge wiki entries correspond to wiki entries in the Search Group.
- SigninModel
- This class is used to handle db results needed for a user to login
- SourceModel
- Used to manage data related to video, news, and other search sources. Also used to manage data about available subsearches seen in SearchView
- TrendingModel
- This class is used to handle db results related to Group Administration. Groups are collections of users who might access a common blog/news feed and set of pages. This class also controls adding and deleting entries to a group feed and does limited access control checks of these operations.
- UserModel
- This class is used to handle database statements related to User Administration
- VisitorModel
- Used to keep track of ip address of failed account creation and login attempts
- AdminView
- View used to draw activity list and current activity for a logged in user
- ApiView
- View used to draw and allow editing of wiki page when not in the admin view (so activities panel on side is not present.) This is also used to draw wiki pages for public groups when not logged in.
- ComponentView
- Base class for views created by adding elements to top, sub-top, same, opposite, center columns, or bottom positions
- CrawlstatusView
- This view is used to display information about crawls that have been made by this seek_quarry instance
- AdminbarElement
- Element used to draw the navigation bar on admin pages.
- AdminElement
- Element used to render the admin interface for a logged in user of Yioop
- AdminmenuElement
- Element responsible for drawing the side menu portion of an admin page. This allows the user to signout or select from among allowed admin activities
- ApiElement
- Element responsible for drawing wiki pages in either admin or wiki view. It is also responsible for rendering wiki history pages, and listings of wiki pages available for a group
- AppearanceElement
- Element responsible for drawing the screen used to set up the search engine appearance.
- BotstoryElement
- Element responsible for displaying the bot story features that someone can use to create their own chat bot.
- ConfigureElement
- Element responsible for drawing the screen used to set up the search engine
- CrawloptionsElement
- Element responsible for displaying options about how a crawl will be performed. For instance, what the seed sites for the crawl are, what sites are allowed to be crawled, what sites must not be crawled, etc.
- DisplayadvertisementElement
- Element responsible for displaying the advertisement on the search results page
- EditclassifierElement
- This element renders the initial edit page for a classifier, where the user can update the classifier label and find documents to label and add to the training set. The page displays some initial statistics and a form for finding documents in any existing index, but after that it is heavily modified by JavaScript in response to user actions and XmlHttpRequests made to the server.
- EditlocalesElement
- Element responsible for displaying the form where users can input string translations for a given locale
- EditmixElement
- Element responsible for displaying info about a given crawl mix
- Element
- Base Element Class.
- FooterElement
- Element responsible for drawing footer links on search view and static view pages
- GroupbarElement
- Element used to draw the navigation bar on group feed and wiki pages.
- GroupElement
- Element used to present group feed and wiki pages for the Yioop website
- GroupfeedElement
- Element responsible for drawing the feeds a user is subscribed to
- GroupmenuElement
- Element responsible for drawing the menu side bar for group and wiki pages. These options include recently viewed wiki pages, groups, and threads
- HeaderElement
- Element responsible for drawing the header on admin, group, and search views
- HelpElement
- This element is used to display the list of available activities in the AdminView
- LanguageElement
- Element used to display available languages in the settings view
- MachinelogElement
- Element responsible for displaying the queue_server or fetcher log of a machine
- ManageaccountElement
- Element responsible for displaying the user account features that someone can modify for their own SeekQuarry/Yioop account.
- ManageadvertisementsElement
- Element responsible for displaying advertisements information that someone can create, view, and modify for their own SeekQuarry/Yioop account.
- ManageclassifiersElement
- This element renders the page that lists classifiers, provides a form to create new ones, and provides per-classifier action links to edit, finalize, and delete the associated classifier.
- ManagecrawlsElement
- Element responsible for displaying info about starting, stopping, deleting, and using a crawl. It makes use of the CrawlStatusView
- ManagecreditsElement
- Element responsible for displaying Ad credits purchase form and recent transaction table
- ManagegroupsElement
- Used to draw the admin screen on which users can create groups, delete groups and add and delete users and roles to a group
- ManagelocalesElement
- This Element is responsible for drawing screens in the Admin View related to localization. Namely, the ability to create and delete locales and set their text writing mode, as well as the ability to modify translations within a locale.
- ManagemachinesElement
- Used to draw the admin screen on which admin users can add/delete and manage machines which might act as fetchers or queue_servers.
- ManagerolesElement
- Used to draw the admin screen on which admin users can create roles, delete roles and add and delete activities from roles
- ManageusersElement
- Element responsible for drawing the activity screen for User manipulation in the AdminView.
- MediajobsElement
- Element used to draw toggles indicating which jobs the Media Updater will run and letting the user turn these jobs on/off.
- MixcrawlsElement
- Element responsible for displaying info to allow a user to create a crawl mix or edit an existing one
- PageOptionsElement
- This element is used to render the Page Options admin activity. This activity lets a user control the amount of web pages downloaded, the recrawl frequency, the file types, etc. of the pages crawled
- PaginationElement
- Element responsible for drawing the sequence of available pages for search results.
- QuerystatsElement
- Element responsible for displaying statistics about recent queries that have been run against the search engine
- ResultsEditorElement
- Element used to control how urls are filtered out of search results (if desired) after a crawl has already been performed.
- ScrapersElement
- Contains the forms for managing Web Page Scrapers.
- SearchbarElement
- Element used to draw the navigation bar on search pages.
- SearchcalloutElement
- Element responsible for drawing search wiki callouts for search results
- SearchElement
- Element used to present search results. It also contains the landing page search box for people to type searches into
- SearchmenuElement
- Element responsible for drawing the side menu with sign in/create account, search source options, search settings, and tool info for search pages
- SearchsourcesElement
- This element renders the forms for managing search sources for news, etc.
- SecurityElement
- Element used to handle configurations of Yioop related to authentication, captchas, and recovery of missing passwords
- ServersettingsElement
- Element used to draw forms to set up the various external servers that might be connected with a Yioop installation
- SideadvertisementElement
- Element used to draw an external server advertisement (if there is one) as a column on the opposite side of a search results page
- StatisticsElement
- Draws an element displaying statistical information about a web crawl such as number of hosts visited, distribution of file sizes, distribution of file type, distribution of languages, etc
- TopadvertisementElement
- This element is used to draw the keyword advertisement above search results (if present)
- TrendingElement
- Class to draw statistics and charts about trending news feed terms
- UsermessagesElement
- Element responsible for drawing the feeds a user is subscribed to
- WelcomemenuElement
- Element responsible for drawing the side menu portion of an admin page. This allows the user to signout or select from among allowed admin activities
- WikiElement
- Element responsible for drawing wiki pages in group view. It is also responsible for rendering wiki history pages, and listings of wiki pages available for a group
- FeedstatusView
- This view is drawn to refresh a group feed that has recently been posted to. Redrawing is invoked from a client script every so many seconds.
- FetchView
- This view is displayed by the fetch_controller.php to send information to a fetcher about things like what to crawl next
- GroupView
- View used to draw and allow editing of group feeds when not in the admin view (so activities panel on side is not present.) This is also used to draw group feeds for public feeds when not logged in.
- CloseHelper
- This helper class is used to handle closing an option window for an activity
- EmojipickerHelper
- This helper class is used to handle drawing of Emojis in the Usermessages activity
- FeedsHelper
- Helper used to draw links and snippets for RSS feeds
- FiletypeHelper
- This helper class is used to render the filetype based on the supplied mimetype. It is mainly intended to be used in outputting webpage results for non html pages.
- FileUploadHelper
- This helper is used to render a drag and drop file upload region
- GrouplistHelper
- This helper class is used to draw grouped view discussions for group feeds on a variety of elements such as ManageaccountElement, GroupfeedElement, ManagegroupsElement.
- HamburgerHelper
- This helper class is used to draw the hamburger menu symbol and associated link to the settings menu
- HelpbuttonHelper
- This helper class is used to draw the help button for context sensitive help.
- Helper
- Base Helper Class.
- IconlinkHelper
- This helper class is used to draw icon buttons and links
- ImagesHelper
- Helper used to draw thumbnail strips for images
- OptionsHelper
- This helper class is used to draw select options form elements
- PaginationHelper
- This helper class is used to handle pagination of search results
- PagingtableHelper
- Used to create links to go backward/forwards and search a database table. An HTML table with data representing a database table might have millions of rows, so we want to limit what the user actually gets at one time and just allow the user to "page" through in increments of 10, 20, 50, 100, 200 rows at a time.
- SearchformHelper
- Used to draw the form to do advanced search for items in a user, group, locale, etc folder
- ToggleHelper
- This helper class is used to draw an On-Off switch in a web page
- VideosHelper
- Helper used to draw thumbnail strips for images
- VideourlHelper
- Helper used to draw thumbnails for video sites
- ApiLayout
- Layout used for the seek_quarry Website including pages such as search landing page and settings page
- Layout
- Base layout Class. Layouts are used to render the headers and footer of the page on which a View lives
- RssLayout
- Layout used for the seek_quarry Website including pages such as search landing page and settings page
- WebLayout
- Layout used for the seek_quarry Website including pages such as search landing page and settings page
- MachinestatusView
- This view is used to display information about the on/off state of the queue_servers and fetchers managed by this instance of Yioop.
- MediadetailView
- View used to draw and allow editing of a single media resource when a media resource gallery is drawn in detail view.
- NocacheView
- This view is drawn when someone clicks on the cached link of a web-page for which no cache is available
- RecoverView
- This View is responsible for drawing the screen for recovering a forgotten password
- RegisterView
- Draws the page that allows a user to register for an account
- ResendEmailView
- This View is responsible for drawing the screen for resending the confirm account link
- RssView
- Web page used to present search results. It also contains the search box for people to type searches into
- SearchView
- Web page used to present search results. It also contains the search box for people to type searches into
- SigninView
- This View is responsible for drawing the login screen for the admin panel of the Seek Quarry app
- StaticView
- This View is responsible for drawing forward-facing wiki pages in a more static cleaned up way
- SuggestView
- View responsible for drawing the form where a user can suggest a URL
- TestsView
- Draws the view on which people can control their search settings such as num links per screen and the language settings
- View
- Base View Class. A View is used to display the output of controller activity
- BloomFilterFileTest
- Used to test that the BloomFilterFile class provides the basic functionality of a persistent set. I.e., we can insert things into it, and we can do membership testing
- BmpProcessorTest
- UnitTest for the BmpProcessor class. A BmpProcessor is used to process a .bmp file and extract a summary from it. This class tests the processing of a .bmp file.
- BPlusTreeTest
- Yioop B+-tree Unit Class
- CrawlQueueBundleTest
- UnitTest for the CrawlQueueBundle class.
- DeTokenizerTest
- Code used to test the German stemming algorithm.
- DocxProcessorTest
- UnitTest for the DocxProcessor class. It is used to process docx files which are a zip of an xml-based format
- ElTokenizerTest
- Code used to test the Greek stemming algorithm.
- EnTokenizerTest
- Code used to test the English stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/porter/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/porter/output.txt. Code uses original Porter stemmer, not Porter 2
- EpubProcessorTest
- UnitTest for the EpubProcessor class. An EpubProcessor is used to process a .epub (ebook publishing standard) file and extract summary from it. This class tests the processing of an .epub file format by EpubProcessor.
- EsTokenizerTest
- Code used to test the Spanish stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/spanish/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/spanish/output.txt
- FaTokenizerTest
- Code used to test the Persian stemming algorithm. The inputs for the algorithm came from the sample text file for the Hamshahri Collection found at http://ece.ut.ac.ir/DBRG/Hamshahri/download.html The stemmed results come from the Java program that the PHP stemmer is based off of at http://members.unine.ch/jacques.savoy/clef/persianStemmerArabic.txt
- FetchUrlTest
- Used to test auxiliary functions related to downloading pages with the FetchUrl class.
- FrTokenizerTest
- Code used to test the French stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/french/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/french/output.txt
- HashTableTest
- Used to test that the HashTable class properly stores key value pairs, handles insert, deletes, collisions okay. It should also detect when table is full
- HiTokenizerTest
- Code used to test the Hindi stemming algorithm. The inputs for the algorithm came from the sample text file for the The stemmed results come from the Java program that the PHP stemmer is based off of at http://members.unine.ch/jacques.savoy/clef/HindiStemmerLight.java.txt which has since been modified to try to improve accuracy
- IconProcessorTest
- UnitTest for the IconProcessor class. An IconProcessor is used to process a .ico file and extract a summary from it. This class tests the processing of a .ico file.
- IndexDictionaryTest
- Used to test that the IndexDictionary class can properly add shards and retrieve correct posting slice ranges in the shards.
- IndexDocumentBundleTest
- Used to test that the IndexDocumentBundle class can properly add and retrieve documents. Check its prepareMethod correctly deduplicates documents before inverted index creation. Tests inverted index creation and adding terms to IndexDocumentBundle's BPlusTree. Check look up of documents according to term.
- IndexManagerTest
- Used to run unit tests for the IndexManager class. IndexManager acts as a resource manager for the open indexes used to process a query.
- IndexShardTest
- Used to test that the IndexShard class can properly add new documents and retrieve those documents by word. Checks that doc offsets can be updated, shards can be saved and reloaded
- ItTokenizerTest
- My code for testing the Italian stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/italian/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/italian/output.txt
- LinearHashTableTest
- Used to test that the LinearHashTable class properly stores key value pairs, handles insert, deletes, retrievals okay.
- NlTokenizerTest
- Code used to test the Dutch stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/Dutch/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/Dutch/output.txt
- PackedTableToolsTest
- Used to test the PackedTableTools class. PackedTableTools are used for reading and storing rows with respect to some signature
- PdfProcessorTest
- UnitTest for the PdfProcessor class. A PdfProcessor is used to process a .pdf file and extract a summary from it. This class tests the processing of a .pdf file.
- PhraseParserTest
- Used to test the PhraseParser class. Want to make sure bigram extracting works correctly
- PptxProcessorTest
- UnitTest for the PptxProcessor class. It is used to process pptx files which are a zip of an xml-based format
- PriorityQueueTest
- Used to test the PriorityQueue class that is used to figure out which URL to crawl next
- PtTokenizerTest
- Code used to test the Portuguese stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/porter/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/porter/output.txt Code uses original Porter stemmer, not Porter 2
- QueueServerTest
- Used to test functions related to scheduling websites to crawl for a web crawl (the responsibility of a QueueServer)
- RuTokenizerTest
- Code used to test the Russian stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/russian/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/russian/output.txt
- ScraperManagerTest
- Code used to test Web Scrapers.
- Sha1JavascriptTest
- Used to test the Javascript implementation of the sha1 function.
- StringArrayTest
- Used to test that the StringArray class properly stores/retrieves values, and can handle loading and saving
- TrieTest
- Used to test that the Trie class properly stores words that could be used for an autosuggest dictionary
- UrlParserTest
- Used to test the UrlParser class. For now, want to see that the method canonicalLink is working correctly and that isPathMemberRegexPaths (used in robot_processor.php) works
- UtilityTest
- Used to test the various methods in utility, in particular, those related to posting lists and time.
- VersionManagerTest
- UnitTests for the VersionManager class.
- WebArchiveTest
- UnitTest for the WebArchive class. A web archive is used to store array-based objects persistently to a file. This class tests storing and retrieving from such an archive.
- WikiParserTest
- Tests the functionality of WikiParser used when processing Wikipedia dumps and used for Yioop's internal wiki infrastructure
- WordIteratorTest
- Tests the functionality of the WordIterator class used to iterate over documents in an IndexDocumentBundle containing a term.
- XlsxProcessorTest
- Used to test that the XlsxProcessor class provides the basic functionality of getting the title, description, languages and links
- ZhTokenizerTest
- Used to test Named Entity Tagging and Part of Speech Tagging for the Chinese Language. Word segmentation is already tested in
- CrawlcontrolsElement
- Used to draw the control buttons on manage account, manage crawls, etc. pages
- SocialcontrolsElement
- Used to draw the control buttons on manage account, manage groups, group feed, etc. pages
- LRUCache
- Implements a least recently used cache
- DescriptionUpdateJob
- A media job to periodically update descriptions of Wiki resources using Description Search Sources
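As a rough illustration of the least recently used policy named in the LRUCache entry above, the following PHP sketch evicts the entry that was touched longest ago once a fixed capacity is exceeded. The class name SketchLruCache and its get, put, and keys methods are assumptions made for this example; they are not taken from Yioop's LRUCache class, whose interface is not shown in this listing.

```php
<?php
// Hypothetical illustration of least-recently-used eviction. This is NOT
// Yioop's LRUCache class; the class and method names are assumptions.
class SketchLruCache
{
    /** @var array key => value, ordered from least to most recently used */
    private $items = [];
    /** @var int maximum number of entries kept in the cache */
    private $capacity;

    public function __construct(int $capacity)
    {
        $this->capacity = max(1, $capacity);
    }

    /** Returns the cached value for $key, or null if it is not cached */
    public function get($key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null;
        }
        // Re-insert the entry so it moves to the most recently used end
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value;
        return $value;
    }

    /** Stores $value under $key, evicting the oldest entry if needed */
    public function put($key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            // PHP arrays keep insertion order, so the first key is the
            // least recently used one
            reset($this->items);
            unset($this->items[key($this->items)]);
        }
    }

    /** Returns the cached keys from least to most recently used */
    public function keys(): array
    {
        return array_keys($this->items);
    }
}

$cache = new SketchLruCache(2);
$cache->put("a", 1);
$cache->put("b", 2);
$cache->get("a");    // "a" becomes the most recently used entry
$cache->put("c", 3); // capacity exceeded, so "b" is evicted
```

The sketch relies on PHP arrays preserving insertion order, which keeps the code short; a linked list plus hash map would be the usual choice if entries had to be moved without re-inserting them.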
Table of Contents
- QUERY_AGENT_NAME = "QUERY_CACHER"
- TIME_BETWEEN_REQUEST_IN_SECONDS = 5
- YIOOP_URL = "http://localhost/"
- Script to run a sequence of queries against a Yioop instance so that they can be cached
- e() : mixed
- shorthand for echo
- getTrainingFileNames() : array<string|int, mixed>
- Returns an array of filenames to be used for training the current task in TokenTool
- makeKwikiEntriesGetSeedSites() : mixed
- Generates knowledge wiki callouts for search results pages based on the first paragraph of a Wikipedia Page that matches a given query.
- getNextPage() : mixed
- Gets the next wiki page from a file handle pointing to the wiki dump file
- removeTags() : string
- Removes all occurrences of open/close tag pairs from $text
- getBraceTag() : array<string|int, mixed>
- Get a substring offset pair matching the input open close brace tag pattern
- getTagOffsetPage() : mixed
- Get the outer contents of an xml open/close tag pair from a text source together with a new offset location after
- getTopPages() : array<string|int, mixed>
- Returns title and page counts of the top $max_pages many entries in a $page_count_file for a locale $locale_tag
- smartOpen() : array<string|int, mixed>
- Gets a read file handle for $file_open appropriate for whether it is uncompressed, bz2 compressed, or gz compressed. It also returns function pointers to the functions needed to do reading and closing for the file handle.
- translateLocale() : mixed
- Translates Yioop web app strings to a given locale ($locale_tag) and writes the LOCALE_DIR/$locale_tag/configure.ini file for these translations.
- wikiHeaderPageToString() : string
- Converts an array of wiki header information and a wiki page contents string into a string suitable to be stored into the GROUP_PAGE_HISTORY database table.
- translatePhrase() : mixed
- Translates a string from English to a given locale using an online translation tool.
- makeNWordGramsFiles() : mixed
- Makes an n or all word gram Bloom filter based on the supplied arguments and writes it into the resources folder of the given locale. Wikipedia files are assumed to have been placed in the PREP_DIR before this is run.
- makeSuggestTrie() : mixed
- Makes a trie that can be used to make word suggestions as someone enters terms into the Yioop! search box. Outputs the result into the file suggest_trie.txt.gz in the supplied locale dir
- fileWithTrim() : array<string|int, mixed>
- Reads file into an array or outputs file not found. For each entry in array trims it. Any blank lines are deleted
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
- exceptionErrorHandler() : mixed
- Error handler so that errors are caught as exceptions too
- webError() : mixed
- Used to handle request errors in non-cli, non-webserver redirect case
- clean() : bool
- Used to clean trailing whitespace from files in a folder or just from a file given in the command line. It also removes final ?> characters to make php files conform with suggested coding guidelines. Similarly, it adds a space between if, for, foreach, etc. and ( if not present to match PHP coding guidelines
- copyright() : bool
- Updates the copyright info (assuming in Yioop docs format) on files in supplied sub-folder/file. That is, it changes strings matching /2009 - \d\d\d\d/ to 2009 - current_year in those files/file.
- longlines() : bool
- Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file
- needsdocs() : bool
- Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file
- replace() : bool
- Performs a search and replace for given pattern in files in supplied sub-folder/file
- search() : bool
- Performs a search for given pattern in files in supplied sub-folder/file
- unit() : bool
- Used to run or list Yioop unit tests given in $args
- changeCopyrightFile() : mixed
- Callback function applied to each file in the directory being traversed by @see copyright(). It checks if the file has the extension of a code file and if so trims whitespace from its lines and then updates the lines of the form 2009 - \d\d\d\d to the supplied copyright year
- cleanLinesFile() : mixed
- Callback function applied to each file in the directory being traversed by @see clean().
- searchFile() : mixed
- Callback function applied to each file in the directory being traversed by @see search(). Searches $filename matching $pattern and outputs line numbers and lines
- replaceFile() : mixed
- Callback function applied to each file in the directory being traversed by @see replace(). Searches $filename matching $pattern. Depending on $mode ($arg[2] as described in replace()), it outputs and replaces with $replace
- mapPath() : mixed
- Applies the function $callback to each file in $path
- excludedPath() : bool
- Checks if $path is amongst a list of paths which should be ignored
- bootstrap() : mixed
- Main entry point to the Yioop web app.
- checkCookieConsent() : bool
- Checks if a cookie consent form was obtained. This function returns true if a session cookie was received from the browser, or a form variable saying cookies are okay was received, or the cookie Yioop profile says the consent mechanism is disabled
- configureRewrites() : mixed
- Used to set up and handle url rewriting for the Yioop Web app
- routeAppFile() : bool
- Used to handle routes that will eventually just serve files from the APP_DIR. These include files like css, scripts, suggest tries, images, and videos.
- routeBaseFile() : bool
- Used to handle routes that will eventually just serve files from the BASE_DIR. These include files like css, scripts, images, and robots.txt.
- routeDirect() : bool
- Used to route page requests to fixed Public Group wiki pages that should always be present. For example, the 404 page.
- directUrl() : string
- Given the name of a fixed public group static page creates the url where it can be accessed in this instance of Yioop, making use of the defined variable REDIRECTS_ON.
- routeBlog() : bool
- Used to route page requests for the website's public blog
- routeFeeds() : bool
- Used to route page requests for pages corresponding to a group, user, or thread feed. If redirects are on, then urls ending with /feed_type/id map to a page for the id'th item of that feed_type
- feedsUrl() : string
- Given the type of feed, the identifier of the feed instance, and which controller is being used creates the url where that feed item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
- routeUserMessages() : bool
- routeController() : bool
- Used to route page requests to end-user controllers such as register, admin. urls ending with /controller_name will be routed to that controller.
- controllerUrl() : string
- Given the name of a controller for which an easy end-user link is useful creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON. Examples of end-user controllers would be the admin, and register controllers.
- routeSubsearch() : bool
- Used to route page requests for subsearches such as news, video, and images (site owner can define others). Urls of the form /s/subsearch will go to the page handling the subsearch.
- subsearchUrl() : string
- Given the name of a subsearch creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON.
- routeSerpIcon() : bool
- Used to route requests for favicons for pages in search results
- serpIconUrl() : string
- Returns the url to request the favicon for a page in the search results, making use of the defined variable REDIRECTS_ON.
- routeSuggest() : bool
- Used to route requests for the suggest-a-url link on the tools page.
- suggestUrl() : string
- Return the url for the suggest-a-url link on the more tools page, making use of the defined variable REDIRECTS_ON.
- routeWiki() : bool
- Used to route page requests for pages corresponding to a wiki page of a group. If it is a wiki page for the public group viewed without being logged in, the route might come in as yioop_instance/p/page_name if redirects are on. If it is for a non-public wiki or a page accessed while logged in, the url will look like either: yioop_instance/group/group_id?a=wiki&page_name=some_name or yioop_instance/admin/group_id?a=wiki&page_name=some_name&csrf_token_string
- wikiUrl() : string
- Given the name of a wiki page, the group it belongs to, and which controller is being used creates the url where that wiki item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
- main() : mixed
- Command-line shell for testing the class
- tl() : mixed
- import a tl function into Controller Namespace
- e() : mixed
- shorthand for echo
- localesWithStopwordsList() : array<string|int, mixed>
- Returns an array of locales that have a stop words list and a stop words remover method
- localeTagToIso639_2Tag() : string
- Converts a $locale_tag (major-minor) to an ISO 639-2 language name
- guessLocale() : string
- Attempts to guess the user's locale based on the request, session, and user-agent data
- guessLocaleFromString() : string
- Attempts to guess the user's locale based on a string sample
- checkQuery() : string
- Tries to find whether query belongs to a programming language
- guessLangEncoding() : string
- Tries to guess at a language tag based on the name of a character encoding
- guessEncodingHtmlXml() : mixed
- Tries to guess the encoding used for an Html document
- convertUtf8IfNeeded() : mixed
- Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
- tl() : string
- Translate the supplied arguments into the current locale.
- setLocaleObject() : mixed
- Sets the language to be used for locale settings
- getLocaleTag() : string
- Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
- getLocaleDirection() : string
- Returns the current language directions.
- getLocaleQueryStatistics() : array<string|int, mixed>
- Returns the query statistics info for the current locale.
- getBlockProgression() : string
- Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.
- getWritingMode() : string
- Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
- w1256ToUTF8() : string
- Convert the string $str encoded in Windows-1256 into UTF-8
- utf8chr() : string
- Given a unicode codepoint convert it to UTF-8
- formatDateByLocale() : string
- Function for formatting a date string based on the locale.
- upgradeLocalesCheck() : mixed
- Checks to see if the locale data of Yioop! of a locale in the work dir is older than the currently running Yioop!
- upgradeLocales() : mixed
- If the locale data of Yioop! in the work directory is older than the currently running Yioop! then this function is called to at least try to copy the new strings into the old profile.
- upgradePublicHelpWiki() : mixed
- Used to force push the default Public and Wiki pages into the current database
- upgradeDatabaseWorkDirectoryCheck() : mixed
- Checks to see if the database data or work_dir folder of Yioop! is from an older version of Yioop! than the currently running Yioop!
- upgradeDatabaseWorkDirectory() : mixed
- If the database data of Yioop is older than the version of the currently running Yioop then this function is called to try to upgrade the database to the new version
- updateVersionNumber() : mixed
- Update the database version number to a new number
- getWikiHelpPages() : mixed
- Reads the Help articles from default db and returns the array of pages.
- addActivityAtId() : mixed
- Used to insert a new activity into the database at a given activity_id
- updateTranslationForStringId() : mixed
- Adds or replaces a translation for a database message string for a given IANA locale tag.
- addRegexDelimiters() : string
- Adds delimiters to a regex that may or may not have them
- preg_search() : mixed
- Searches for a PCRE pattern in a subject from a given offset; returns the position of the first match if found, -1 otherwise.
- preg_offset_replace() : string
- Replaces a pcre pattern with a replacement in $subject starting from some offset.
- parse_ini_with_fallback() : array<string|int, mixed>
- Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks if parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.
- getIniAssignMatch() : mixed
- Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.
- charCopy() : mixed
- Copies from $source string beginning at position $start, $length many bytes to destination string
- vByteEncode() : string
- Encodes an integer using variable byte coding.
- vByteDecode() : int
- Decodes from a string using variable byte coding an integer.
- appendUnary() : mixed
- Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.
- decodeUnary() : int
- Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.
- appendBits() : string
- Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.
- decodeBits() : int
- Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a side effect, $start_bit_offset is increased by the number of bits that were able to be decoded.
- appendGamma() : string
- Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.
- decodeGammaList() : array<string|int, mixed>
- Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.
- appendRiceSequence() : string
- Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.
- decodeRiceSequence() : array<string|int, mixed>
- Decodes up to $num_decode rice encoded difference list of integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start;
- encodePositionList() : string
- Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.
- decodePositionList() : array<string|int, mixed>
- Decodes up to $num_decode term in document position integers from string $input under the assumption $input is encoded as per
- encode255() : string
- Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE
- decode255() : string
- Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE
- encodeUnderscore() : string
- Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=
- decodeUnderscore() : string
- Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -
- packEncode255() : string
- Encodes a list of strings as their @see encode255 versions separated by \xFF's
- unpackDecode255() : array<string|int, mixed>
- Decodes a list of strings from a string that was encoded as the @see encode255 versions of its elements separated by \xFF's
- packPosting() : string
- Makes a packed integer string from a docindex and the number of occurrences of a word in the document with that docindex.
- unpackPosting() : array<string|int, mixed>
- Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.
- addDocIndexPostings() : string
- This method is used while appending one index shard to another.
- deltaList() : array<string|int, mixed>
- Computes the difference of a list of integers.
- deDeltaList() : array<string|int, mixed>
- Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function
- encodeModified9() : string
- Encodes a sequence of integers x, such that 1 <= x <= 2<<28-1 as a string. NOTICE x>=1.
- packListModified9() : string
- Packs the contents of a single word of a sequence being encoded using Modified9.
- nextPostString() : string
- Returns the next complete posting string from $input_string beginning at offset.
- decodeModified9() : array<string|int, mixed>
- Decodes a sequence of positive integers from a string that has been encoded using Modified 9
- unpackListModified9() : array<string|int, mixed>
- Decode a single word with high two bits off according to modified 9
- docIndexModified9() : int
- Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.
- unpackInt() : int
- Unpacks an int from a 4 char string
- packInt() : string
- Packs an int into a 4 char string
- unpackFloat() : float
- Unpacks a float from a 4 char string
- packFloat() : string
- Packs a float into a four char string
- renameSerializedObject() : string
- Used to change the namespace of a serialized php object (assumes doesn't have nested subobjects)
- getDomFromString() : DOMDocument
- Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML
- getTags() : array<string|int, mixed>
- Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument
- toHexString() : string
- Converts a string to a string where each char has been replaced by its hexadecimal equivalent
- toIntString() : string
- Converts a string to a string where each char has been replaced by its integer equivalent
- toBinString() : string
- Converts a string to a string where each char has been replaced by its binary equivalent
- metricToInt() : int
- Converts a string of the form some int followed by K, M, or G to an integer.
- intToMetric() : string
- Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)
- crawlLog() : mixed
- Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode it will log to stdout; otherwise it will use error_log. When logging to a file, $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be a log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example, a call: crawlLog("\n\nInitialize logger..", $this->process_name, true); says $this->process_name should be checked for liveness as part of any subsequent logging activity such as a call crawlLog("Another Message"); (note subsequent calls don't need to specify the process name).
- makeTimestamp() : string
- Used to make a log file entry time string of format: entry number, time in r format.
- crawlTimeoutLog() : bool
- Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).
- crawlHash() : string
- Computes an 8 byte hash of a string for use in storing documents.
- crawlHashWord() : string
- Used to create a 20 byte hash of a string (typically a word or phrase with a wikipedia page). Format is 8 byte crawlHash of term (md5 of term two halves XOR'd), followed by a \x00, followed by the first 11 characters from the term. If there are not enough chars to make 20 bytes, then the string is padded with \x00s to 20 bytes.
- canonicalTerm() : string
- Takes a $term that might have come from a document and converts it to a string of 16 bytes which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.
- compareWordHashes() : int
- Used to compare two ids for index dictionary lookup. ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.
- base64Hash() : string
- Converts a crawl hash number to something closer to base64 coded, but so that it doesn't get confused in urls or DBs
- unbase64Hash() : string
- Decodes a crawl hash number from base64 to raw ASCII
- webencode() : string
- Encodes a string in a format suitable for post data (mainly, base64, but str_replace data that might mess up post in result)
- webdecode() : string
- Decodes a string encoded by webencode
- crawlCrypt() : string
- The crawlCrypt function is used to encrypt passwords stored in the database.
- partitionByHash() : array<string|int, mixed>
- Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling
- calculatePartition() : int
- Used by a controller to say which queue_server should receive a given input
- changeInMicrotime() : float
- Measures the change in time in seconds between two timestamps to microsecond precision
- microTimestamp() : string
- Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)
- checkTimeInterval() : int
- Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration
- convertPixels() : int
- Converts a CSS unit string into its equivalent in pixels. This is used by @see SvgProcessor.
- countFiles() : int
- Returns the number of files in a folder
- makePath() : bool
- Creates folders along a filesystem path if they don't exist
- deleteFileOrDir() : mixed
- This is a callback function used in the process of recursively deleting a directory
- setWorldPermissions() : mixed
- This is a callback function used in the process of recursively chmoding to 777 all files in a folder
- fileInfo() : an
- This is a callback function used in the process of recursively calculating an array of file modification times and files sizes for a directory
- orderCallback() : int
- Callback function used to sort documents by a field
- stringOrderCallback() : int
- Callback function used to sort documents by a field where the field is assumed to be a string
- stringROrderCallback() : int
- Callback function used to sort documents by a field where the field is assumed to be a string
- rorderCallback() : int
- Callback function used to sort documents by a field in reverse order
- lessThan() : int
- Callback to check if $a is less than $b
- greaterThan() : int
- Callback to check if $a is greater than $b
- e() : mixed
- shorthand for echo
- remoteAddress() : mixed
- Compute the real remote address of the incoming connection including forwarding
- readInput() : string
- Used to read a line of input from the command-line
- readPassword() : string
- Used to read a line of input from the command-line (on unix machines without echoing it)
- readMessage() : string
- Used to read several lines from the terminal up until a last line consisting of just a "."
- mimeType() : string
- Returns the mime type of the provided file name if it can be determined.
- generalIsA() : bool
- Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the 3 param version (last param true) of the PHP is_a function that came into being with Version 5.3.9.
- stripAttributes() : string
- Given the contents of a start XML/HTML tag strips out all the attributes not listed in $safe_attribute_list
- parseCsv() : array<string|int, mixed>
- Used to parse into a two dimensional array a string that contains CSV data.
- arraytoCsv() : string
- Converts an array of values to a comma separated value formatted string.
- diff() : string
- Computes a Unix-style diff of two strings. That is it only outputs lines which disagree between the two strings. It outputs +line if a line occurs in the second but not first string and -line if a line occurs in the first string but not the second.
- computeLCS() : mixed
- Computes the longest common subsequence of two arrays
- extractLCSFromTable() : mixed
- Extracts from a table of longest common sequence moves (probably calculated by @see computeLCS) and a starting coordinate $i, $j in that table, a longest common subsequence
- tail() : array<string|int, mixed>
- Returns an array of the last $num_lines many lines out of a file
- lineFilter() : array<string|int, mixed>
- Given an array of lines returns a subarray of those lines containing the filter string or filter array
- logLineTimestamp() : int
- Tries to extract a timestamp from a line which is presumed to come from a Yioop log file
- isPositiveInteger() : bool
- Returns whether an input can be parsed to a positive integer
- measureCall() : mixed
- Used to measure the memory footprint in bytes and time spent calling a method of an object. It also records the number of times the method has been called.
- measureObject() : mixed
- Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.
- measureObjectCall() : mixed
- General method called by @see measureCall and @see measureObject. Used to measure the memory footprint in bytes of an object, or the memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, it just calls the method without any recording or timing. To initialize, an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to, should be done.
- variableClone() : mixed
- Makes a deep copy of a variable regardless of its type
- garbageCollect() : int
- Runs various system garbage collection functions and returns number of bytes freed.
- utf8SafeSaveHtml() : string
- The dom method saveHTML has a tendency to replace UTF-8, non-ascii characters with html entities. This method is supposed to save while avoiding that replacement.
- utf8WordWrap() : string
- A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters
- upgradeDatabaseVersion1() : mixed
- Upgrades a Version 0 version of the Yioop database to a Version 1 version
- upgradeDatabaseVersion2() : mixed
- Upgrades a Version 1 version of the Yioop database to a Version 2 version
- upgradeDatabaseVersion3() : mixed
- Upgrades a Version 2 version of the Yioop database to a Version 3 version
- upgradeDatabaseVersion4() : mixed
- Upgrades a Version 3 version of the Yioop database to a Version 4 version
- upgradeDatabaseVersion5() : mixed
- Upgrades a Version 4 version of the Yioop database to a Version 5 version
- upgradeDatabaseVersion6() : mixed
- Upgrades a Version 5 version of the Yioop database to a Version 6 version
- upgradeDatabaseVersion7() : mixed
- Upgrades a Version 6 version of the Yioop database to a Version 7 version
- upgradeDatabaseVersion8() : mixed
- Upgrades a Version 7 version of the Yioop database to a Version 8 version
- upgradeDatabaseVersion9() : mixed
- Upgrades a Version 8 version of the Yioop database to a Version 9 version
- upgradeDatabaseVersion10() : mixed
- Upgrades a Version 9 version of the Yioop database to a Version 10 version
- upgradeDatabaseVersion11() : mixed
- Upgrades a Version 10 version of the Yioop database to a Version 11 version
- upgradeDatabaseVersion12() : mixed
- Upgrades a Version 11 version of the Yioop database to a Version 12 version
- upgradeDatabaseVersion13() : mixed
- Upgrades a Version 12 version of the Yioop database to a Version 13 version
- upgradeDatabaseVersion14() : mixed
- Upgrades a Version 13 version of the Yioop database to a Version 14 version
- upgradeDatabaseVersion15() : mixed
- Upgrades a Version 14 version of the Yioop database to a Version 15 version
- upgradeDatabaseVersion16() : mixed
- Upgrades a Version 15 version of the Yioop database to a Version 16 version
- upgradeDatabaseVersion17() : mixed
- Upgrades a Version 16 version of the Yioop database to a Version 17 version
- upgradeDatabaseVersion18() : mixed
- Upgrades a Version 17 version of the Yioop database to a Version 18 version
- upgradeDatabaseVersion19() : mixed
- Upgrades a Version 18 version of the Yioop database to a Version 19 version. This update has been superseded by the Version20 update and so its contents have been eliminated.
- upgradeDatabaseVersion20() : mixed
- Upgrades a Version 19 version of the Yioop database to a Version 20 version. This is a major upgrade as the user table has changed. This also acts as a cumulative upgrade since version 0.98. It involves a web form that has only been localized to English
- upgradeDatabaseVersion21() : mixed
- Upgrades a Version 20 version of the Yioop database to a Version 21 version
- upgradeDatabaseVersion22() : mixed
- Upgrades a Version 21 version of the Yioop database to a Version 22 version
- upgradeDatabaseVersion23() : mixed
- Upgrades a Version 22 version of the Yioop database to a Version 23 version
- upgradeDatabaseVersion24() : mixed
- Upgrades a Version 23 version of the Yioop database to a Version 24 version
- upgradeDatabaseVersion25() : mixed
- Upgrades a Version 24 version of the Yioop database to a Version 25 version. This version upgrade includes creation of a Help group that holds help pages.
- upgradeDatabaseVersion26() : mixed
- Upgrades a Version 25 version of the Yioop database to a Version 26 version. This version upgrade includes updating the Help pages in the database to work with the changes to the way hyperlinks are specified in wiki markup.
- upgradeDatabaseVersion27() : mixed
- Upgrades a Version 26 version of the Yioop database to a Version 27 version
- upgradeDatabaseVersion28() : mixed
- Upgrades a Version 27 version of the Yioop database to a Version 28 version
- upgradeDatabaseVersion29() : mixed
- Upgrades a Version 28 version of the Yioop database to a Version 29 version
- upgradeDatabaseVersion30() : mixed
- Upgrades a Version 29 version of the Yioop database to a Version 30 version
- upgradeDatabaseVersion31() : mixed
- Upgrades a Version 30 version of the Yioop database to a Version 31 version
- upgradeDatabaseVersion32() : mixed
- Upgrades a Version 31 version of the Yioop database to a Version 32 version
- upgradeDatabaseVersion33() : mixed
- Upgrades a Version 32 version of the Yioop database to a Version 33 version
- upgradeDatabaseVersion34() : mixed
- Upgrades a Version 33 version of the Yioop database to a Version 34 version
- upgradeDatabaseVersion35() : mixed
- Upgrades a Version 34 version of the Yioop database to a Version 35 version
- upgradeDatabaseVersion36() : mixed
- Upgrades a Version 35 version of the Yioop database to a Version 36 version
- upgradeDatabaseVersion37() : mixed
- Upgrades a Version 36 version of the Yioop database to a Version 37 version
- upgradeDatabaseVersion38() : mixed
- Upgrades a Version 37 version of the Yioop database to a Version 38 version
- upgradeDatabaseVersion39() : mixed
- Upgrades a Version 38 version of the Yioop database to a Version 39 version
- upgradeDatabaseVersion40() : mixed
- Upgrades a Version 39 version of the Yioop database to a Version 40 version
- upgradeDatabaseVersion41() : mixed
- Upgrades a Version 40 version of the Yioop database to a Version 41 version
- upgradeDatabaseVersion42() : mixed
- Upgrades a Version 41 version of the Yioop database to a Version 42 version
- upgradeDatabaseVersion43() : mixed
- Upgrades a Version 42 version of the Yioop database to a Version 43 version
- upgradeDatabaseVersion44() : mixed
- Upgrades a Version 43 version of the Yioop database to a Version 44 version
- upgradeDatabaseVersion45() : mixed
- Upgrades a Version 44 version of the Yioop database to a Version 45 version
- upgradeDatabaseVersion46() : mixed
- Upgrades a Version 45 version of the Yioop database to a Version 46 version
- upgradeDatabaseVersion47() : mixed
- Upgrades a Version 46 version of the Yioop database to a Version 47 version
- upgradeDatabaseVersion48() : mixed
- Upgrades a Version 47 version of the Yioop database to a Version 48 version
- upgradeDatabaseVersion49() : mixed
- Upgrades a Version 48 version of the Yioop database to a Version 49 version
- upgradeDatabaseVersion50() : mixed
- Upgrades a Version 49 version of the Yioop database to a Version 50 version
- upgradeDatabaseVersion51() : mixed
- Upgrades a Version 50 version of the Yioop database to a Version 51 version
- upgradeDatabaseVersion52() : mixed
- Upgrades a Version 51 version of the Yioop database to a Version 52 version
- upgradeDatabaseVersion53() : mixed
- Upgrades a Version 52 version of the Yioop database to a Version 53 version
- upgradeDatabaseVersion54() : mixed
- Upgrades a Version 53 version of the Yioop database to a Version 54 version
- upgradeDatabaseVersion55() : mixed
- Upgrades a Version 54 version of the Yioop database to a Version 55 version
- upgradeDatabaseVersion57() : mixed
- Upgrades a Version 56 version of the Yioop database to a Version 57 version
- upgradeDatabaseVersion58() : mixed
- Upgrades a Version 57 version of the Yioop database to a Version 58 version
- upgradeDatabaseVersion59() : mixed
- Upgrades a Version 58 version of the Yioop database to a Version 59 version
- upgradeDatabaseVersion60() : mixed
- Upgrades a Version 59 version of the Yioop database to a Version 60 version
- upgradeDatabaseVersion61() : mixed
- Upgrades a Version 60 version of the Yioop database to a Version 61 version
- upgradeDatabaseVersion62() : mixed
- Upgrades a Version 61 version of the Yioop database to a Version 62 version
- upgradeDatabaseVersion64() : mixed
- Upgrades a Version 63 version of the Yioop database to a Version 64 version
- upgradeDatabaseVersion65() : mixed
- Upgrades a Version 64 version of the Yioop database to a Version 65 version
- upgradeDatabaseVersion66() : mixed
- Upgrades a Version 65 version of the Yioop database to a Version 66 version
- upgradeDatabaseVersion67() : mixed
- Upgrades a Version 66 version of the Yioop database to a Version 67 version
- upgradeDatabaseVersion68() : mixed
- Upgrades a Version 67 version of the Yioop database to a Version 68 version
- upgradeDatabaseVersion69() : mixed
- Upgrades a Version 68 version of the Yioop database to a Version 69 version
- upgradeDatabaseVersion70() : mixed
- Upgrades a Version 69 version of the Yioop database to a Version 70 version
- upgradeDatabaseVersion71() : mixed
- Upgrades a Version 70 version of the Yioop database to a Version 71 version
- upgradeDatabaseVersion72() : mixed
- Upgrades a Version 71 version of the Yioop database to a Version 72 version
- upgradeDatabaseVersion73() : mixed
- Upgrades a Version 72 version of the Yioop database to a Version 73 version
- upgradeDatabaseVersion74() : mixed
- Upgrades a Version 73 version of the Yioop database to a Version 74 version
- upgradeDatabaseVersion75() : mixed
- Upgrades a Version 74 version of the Yioop database to a Version 75 version
- upgradeDatabaseVersion76() : mixed
- Upgrades a Version 75 version of the Yioop database to a Version 76 version
- upgradeDatabaseVersion77() : mixed
- Upgrades a Version 76 version of the Yioop database to a Version 77 version
- upgradeDatabaseVersion78() : mixed
- Upgrades a Version 77 version of the Yioop database to a Version 78 version
- upgradeDatabaseVersion79() : mixed
- Upgrades a Version 78 version of the Yioop database to a Version 79 version
- upgradeDatabaseVersion80() : mixed
- Upgrades a Version 79 version of the Yioop database to a Version 80 version
- upgradeDatabaseVersion81() : mixed
- Upgrades a Version 80 version of the Yioop database to a Version 81 version
- webExit() : mixed
- Function to call instead of exit() to indicate that the script processing the current web page is done processing. Use this rather than exit(), as exit() will also terminate WebSite.
- makeTableCallback() : mixed
- Callback used by a preg_replace_callback in nextPage to make a table
- citeCallback() : string
- Used to convert {{cite }} to a numbered link to a citation
- fixLinksCallback() : string
- Used to change spaces to underscores in links generated from our earlier matching rules
- base64EncodeCallback() : string
- Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.
- spaceEncodeCallback() : string
- Callback used to encode the contents of pre tags so they won't accidentally get sub-pre tags because a bunch of leading lines have spaces
- spanEncodeCallback() : string
- Callback used to encode the contents of span tags so newlines within them don't accidentally get treated as new wiki paragraphs
- base64DecodeCallback() : string
- Callback used to base64 decode the contents of previously base64 encoded (@see base64EncodeCallback) nowiki tags after all mediawiki substitutions have been done
- spaceDecodeCallback() : string
- Cleans up pre tags after other wiki rules applied
- lessThanLocale() : int
- Function for comparing two locale arrays by locale tag so can sort
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
- tl() : string
- Translate the supplied arguments into the current locale.
- e() : mixed
- shorthand for echo
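The escaping scheme described above for encode255(), decode255(), packEncode255() and unpackDecode255() can be sketched as follows. This is a minimal Python illustration of the stated byte mappings, not Yioop's PHP source; the snake_case function names are hypothetical stand-ins, and the handling of malformed input is an assumption.

```python
def encode255(data: bytes) -> bytes:
    """Escape 0xFE and 0xFF so that the result never contains 0xFF."""
    # Escape 0xFE first, so the 0xFE bytes introduced for 0xFF
    # are not escaped a second time.
    return data.replace(b"\xfe", b"\xfe\xfd").replace(b"\xff", b"\xfe\xfe")


def decode255(data: bytes) -> bytes:
    """Invert encode255; assumes the input is well formed."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0xFE and i + 1 < len(data):
            # 0xFE 0xFD stands for 0xFE, 0xFE 0xFE stands for 0xFF.
            out.append(0xFE if data[i + 1] == 0xFD else 0xFF)
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def pack_encode255(items: list[bytes]) -> bytes:
    """Join the escaped items with 0xFF, which cannot occur inside them."""
    return b"\xff".join(encode255(item) for item in items)


def unpack_decode255(packed: bytes) -> list[bytes]:
    """Split on the 0xFF separators and unescape each piece."""
    return [decode255(piece) for piece in packed.split(b"\xff")]
```

Because an escaped item can never contain \xFF, that byte is free to act as an unambiguous separator when several strings are packed together.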
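The 20 byte key layout described above for crawlHashWord() (an 8 byte hash, a \x00 byte, then up to 11 characters of the term, padded with \x00) can be illustrated with a short Python sketch. It follows only what the description states; the helper names are hypothetical, terms are treated as raw bytes, and Yioop's actual crawlHash() may apply further options or encoding not shown here.

```python
import hashlib


def crawl_hash8(term: bytes) -> bytes:
    """8 byte hash: the two halves of the 16 byte md5 digest XOR'd together."""
    digest = hashlib.md5(term).digest()
    return bytes(a ^ b for a, b in zip(digest[:8], digest[8:]))


def crawl_hash_word(term: bytes) -> bytes:
    """20 byte key: 8 byte hash + \\x00 + first 11 bytes of term, \\x00 padded."""
    key = crawl_hash8(term) + b"\x00" + term[:11]
    return key.ljust(20, b"\x00")
```

Keeping a readable prefix of the term after the hash means that keys with the same hash can still be told apart by their suffix.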
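The relationship between deltaList() and deDeltaList() listed above, where one computes successive differences and the other reconstructs the original list, can be sketched in Python as below. The names are hypothetical, and treating the first element as a difference from 0 is an assumption rather than something the descriptions state.

```python
def delta_list(values: list[int]) -> list[int]:
    """Replace each integer by its difference from the previous one."""
    deltas = []
    last = 0  # assumption: the first element is taken relative to 0
    for value in values:
        deltas.append(value - last)
        last = value
    return deltas


def de_delta_list(deltas: list[int]) -> list[int]:
    """Invert delta_list by accumulating a running sum."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values
```

For an increasing list such as document positions, the differences are small numbers, which is what makes the gamma, Rice, and Modified9 codes listed above effective.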
Constants
QUERY_AGENT_NAME
public
mixed
QUERY_AGENT_NAME
= "QUERY_CACHER"
TIME_BETWEEN_REQUEST_IN_SECONDS
public
mixed
TIME_BETWEEN_REQUEST_IN_SECONDS
= 5
YIOOP_URL
Script to run a sequence of queries against a Yioop instance so that they can be cached
public
mixed
YIOOP_URL
= "http://localhost/"
Functions
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed
getTrainingFileNames()
Returns an array of filenames to be used for training the current task in TokenTool
getTrainingFileNames(array<string|int, mixed> $command_line_args[, int $start_index = 4 ]) : array<string|int, mixed>
Parameters
- $command_line_args : array<string|int, mixed>
-
supplied to TokenTool.php. Assume array of the format: [ ... max_file_names_to_consider, file_glob1, file_glob2, ...]
- $start_index : int = 4
-
index in $command_line_args of max_file_names_to_consider
Return values
array<string|int, mixed> —$file_names of files with training data
makeKwikiEntriesGetSeedSites()
Generates knowledge wiki callouts for search results pages based on the first paragraph of a Wikipedia page that matches a given query.
makeKwikiEntriesGetSeedSites(string $locale_tag, string $page_count_file, string $wiki_dump_file, int $max_entries, int $max_seed_sites) : mixed
Also generates an initial list of potential seed sites for a crawl based off urls scraped from the wiki pages.
Parameters
- $locale_tag : string
-
the IANA language tag of the locale to create knowledge wiki entries and seed sites for
- $page_count_file : string
-
the file name of a wiki page count dump file (or folder of such files). Such a file contains the names of wiki pages and how many times they were accessed
- $wiki_dump_file : string
-
a dump of wikipedia pages and meta pages
- $max_entries : int
-
maximum number of kwiki entries to create. Will pick the one with the highest counts in $page_count_file
- $max_seed_sites : int
-
maximum number of seed sites to add to Yioop's set of seed sites. Again chooses those with highest page count score
Return values
mixed
getNextPage()
Gets the next wiki page from a file handle pointing to the wiki dump file
getNextPage(resource $fr, function $read, int $block_size, mixed &$input_buffer) : mixed
Parameters
- $fr : resource
-
file handle (might be a compressed file handle, for example, corresponding to gzopen or bzopen)
- $read : function
-
a function for reading from the given file handle
- $block_size : int
-
size of blocks to use when reading
- $input_buffer : mixed
Return values
mixed
removeTags()
Removes all occurrences of an open/close tag pair from $text
removeTags(string $text, string $open, string $close) : string
Parameters
- $text : string
-
to remove tag pair from
- $open : string
-
string pattern for open tag
- $close : string
-
string pattern for close tag
Return values
string —text after tag removed
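The tag removal described above can be sketched as follows. This is a minimal illustration, not the Yioop implementation: the name removeTagsSketch is hypothetical, and it assumes $open and $close are non-empty literal strings rather than regular expressions.

```php
<?php
/**
 * Hypothetical sketch: repeatedly deletes the first $open ... $close span
 * until no complete pair remains. Assumes $open and $close are non-empty
 * literal strings (the documented function may accept patterns instead).
 */
function removeTagsSketch(string $text, string $open, string $close): string
{
    $start = strpos($text, $open);
    while ($start !== false) {
        $end = strpos($text, $close, $start + strlen($open));
        if ($end === false) {
            break; // unmatched open tag: leave the rest untouched
        }
        // splice out everything from the open tag through the close tag
        $text = substr($text, 0, $start) .
            substr($text, $end + strlen($close));
        $start = strpos($text, $open, $start);
    }
    return $text;
}

echo removeTagsSketch("a <ref>x</ref>b <ref>y</ref>c", "<ref>", "</ref>");
// prints "a b c"
```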
getBraceTag()
Get a substring offset pair matching the input open close brace tag pattern
getBraceTag(string $page, string $brace_open, string $brace_close, string $tag, int $offset) : array<string|int, mixed>
Parameters
- $page : string
-
source text to search for the tag in. For example, lala {{infobox {{blah yoyoy}} }} dada.
- $brace_open : string
-
character sequence starting the tag region. For example {{
- $brace_close : string
-
character sequence ending the tag region. For example }}
- $tag : string
-
tag that might be associated with the opening of the sequence. For example infobox.
- $offset : int
-
offset to start searching from
Return values
array<string|int, mixed> —ordered pair [substring containing the brace tag, offset after the tag]. For example, given "lala {{infobox {{blah yoyoy}} }} dada" as input and searching on {{, }}, infobox, 0, the result would be ["{{infobox {{blah yoyoy}}", 31]
getTagOffsetPage()
Get the outer contents of an xml open/close tag pair from a text source together with a new offset location after
getTagOffsetPage(string $page, string $tag, int $offset) : mixed
Parameters
- $page : string
-
text source to search the tag pair in
- $tag : string
-
the xml tag to look for
- $offset : int
-
offset to start searching after for the open/close pair
Return values
mixed
getTopPages()
Returns title and page counts of the top $max_pages many entries in a $page_count_file for a locale $locale_tag
getTopPages(string $page_count_file, string $locale_tag, int $max_pages[, array<string|int, mixed> $title_counts = [] ]) : array<string|int, mixed>
Parameters
- $page_count_file : string
-
page count file to use to search for title counts with respect to a locale
- $locale_tag : string
-
locale to get top pages for
- $max_pages : int
-
number of pages
- $title_counts : array<string|int, mixed> = []
-
title counts that might have come from analyzing a previous file. These will be in the output and contribute to $max_pages
Return values
array<string|int, mixed> —$title_counts wiki page titles => num_views associative array
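The selection step described above (keep only the highest-count titles, carrying over counts from a previous file) might look like the sketch below. The function name topTitleCounts is hypothetical, parsing of the page count file is left out, and summing counts for a title seen in both inputs is an assumption rather than documented behaviour.

```php
<?php
/**
 * Hypothetical sketch: merges newly counted titles into earlier counts and
 * keeps only the $max_pages entries with the largest view counts.
 * Assumption: a title present in both arrays has its counts added.
 */
function topTitleCounts(array $new_counts, int $max_pages,
    array $title_counts = []): array
{
    foreach ($new_counts as $title => $views) {
        $title_counts[$title] = ($title_counts[$title] ?? 0) + $views;
    }
    arsort($title_counts); // sort by count, descending, keeping the keys
    // true preserves keys even when a title looks numeric
    return array_slice($title_counts, 0, $max_pages, true);
}

print_r(topTitleCounts(["A" => 5, "B" => 1, "C" => 3], 2, ["B" => 10]));
```

Here B ends with 11 views and A with 5, so those two titles are kept and C is dropped.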
smartOpen()
Gets a read file handle for $file_name appropriate for whether it is uncompressed, bz2 compressed, or gz compressed. It also returns function pointers to the functions needed to do reading and closing for the file handle.
smartOpen(string $file_name) : array<string|int, mixed>
Parameters
- $file_name : string
-
name of the file to get a read file handle for
Return values
array<string|int, mixed> —[file_handle, read_function_ptr, close_function_ptr]
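One way to produce such a triple is to choose the open, read, and close functions from the file extension. The sketch below is illustrative only: smartOpenSketch is a hypothetical name, deciding by extension is an assumption (the real function may inspect the file differently), and the zlib and bz2 extensions must be loaded for the compressed cases.

```php
<?php
/**
 * Hypothetical sketch: picks open/read/close functions from the file
 * extension and returns [file_handle, read_function, close_function].
 * Assumes zlib and bz2 are loaded when .gz or .bz2 files are passed in.
 */
function smartOpenSketch(string $file_name): array
{
    $extension = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    if ($extension === "gz") {
        return [gzopen($file_name, "r"), "gzread", "gzclose"];
    }
    if ($extension === "bz2") {
        return [bzopen($file_name, "r"), "bzread", "bzclose"];
    }
    return [fopen($file_name, "r"), "fread", "fclose"];
}

// Usage: the caller does not need to know which kind of file it got.
$path = tempnam(sys_get_temp_dir(), "demo");
file_put_contents($path, "hello");
[$handle, $read, $close] = smartOpenSketch($path);
echo $read($handle, 1024);
$close($handle);
unlink($path);
```

Returning the function names alongside the handle lets code such as getNextPage() read blocks in the same way from any of the three file types.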
translateLocale()
Translates Yioop web app strings to a given locale ($locale_tag) and writes the LOCALE_DIR/$locale_tag/configure.ini file for these translations.
translateLocale(string $locale_tag, int $with_wiki_pages) : mixed
Currently, translations are done using the Yandex.translate (https://translate.yandex.com/) API.
Parameters
- $locale_tag : string
-
of locale to translate
- $with_wiki_pages : int
-
if this is <=0, public and help wiki pages are not translated, if it is 1, they are translated to the locale if the locale does not already have a translation. If it is >1 then it is force translated to locale.
Return values
mixed
wikiHeaderPageToString()
Converts an array of wiki header information and a wiki page contents string into a string suitable to be stored in the GROUP_PAGE_HISTORY database table.
wikiHeaderPageToString(array<string|int, mixed> $wiki_header, string $wiki_page_data) : string
Parameters
- $wiki_header : array<string|int, mixed>
-
of wiki header information
- $wiki_page_data : string
-
mediawiki data
Return values
string —suitable to be stored in GROUP_PAGE_HISTORY
translatePhrase()
Translates a string from English to a given locale using an online translation tool.
translatePhrase(string $translate_text, string $locale_tag) : mixed
Parameters
- $translate_text : string
-
text to be translated
- $locale_tag : string
-
locale to translate to
Return values
mixed —translated string on success, false otherwise
makeNWordGramsFiles()
Makes an n or all word gram Bloom filter based on the supplied arguments and writes it into the resources folder of the given locale. Wikipedia files are assumed to have been placed in the PREP_DIR before this is run.
makeNWordGramsFiles(array<string|int, mixed> $args) : mixed
Parameters
- $args : array<string|int, mixed>
-
command line arguments with first two elements of $argv removed. For details on which arguments do what see the $usage variable
Return values
mixed
makeSuggestTrie()
Makes a trie that can be used to make word suggestions as someone enters terms into the Yioop! search box. Outputs the result into the file suggest_trie.txt.gz in the supplied locale dir
makeSuggestTrie(string $dict_file, string $locale, string $end_marker) : mixed
Parameters
- $dict_file : string
-
where the word list is stored, one word per line
- $locale : string
-
which locale to write the suggest file to
- $end_marker : string
-
used to indicate end of word in the trie
Return values
mixed
fileWithTrim()
Reads file into an array or outputs file not found. For each entry in array trims it. Any blank lines are deleted
fileWithTrim($file_name) : array<string|int, mixed>
Parameters
Return values
array<string|int, mixed> —of trimmed lines
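A minimal sketch of this read, trim, and filter sequence is shown below. The name fileWithTrimSketch is hypothetical, and returning an empty array after reporting a missing file is an assumption; the documentation only says that "file not found" is output.

```php
<?php
/**
 * Hypothetical sketch: reads a file into an array of trimmed, non-blank
 * lines. Assumption: a missing file is reported and yields an empty array.
 */
function fileWithTrimSketch(string $file_name): array
{
    if (!file_exists($file_name)) {
        echo "$file_name not found\n";
        return [];
    }
    $lines = array_map('trim', file($file_name));
    // drop lines that are empty after trimming, then reindex from 0
    return array_values(array_filter($lines, fn($line) => $line !== ""));
}
```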
tl()
Translate the supplied arguments into the current locale.
tl() : string
This function is a convenience copy of the same function
Tags
Return values
string —translated string
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed
tl()
Translate the supplied arguments into the current locale.
tl() : string
This function is a convenience copy of the same function
Tags
Return values
string —translated string
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed
exceptionErrorHandler()
Error handler so that errors can be caught as exceptions too
exceptionErrorHandler(int $errno, string $errstr, string $errfile, int $errline) : mixed
Parameters
- $errno : int
-
number code of error
- $errstr : string
-
text of error message
- $errfile : string
-
filename of file in which error occurred
- $errline : int
-
line number of error
Return values
mixed
webError()
Used to handle request errors in non-cli, non-webserver redirect case
webError() : mixed
Return values
mixed
clean()
Used to clean trailing whitespace from files in a folder or just from a file given in the command line. It also removes final ?> characters to make PHP files conform with suggested coding guidelines. Similarly, it adds a space between if, for, foreach, etc. and ( if not present, to match PHP coding guidelines
clean(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file
Return values
bool —$no_instructions false if should output CodeTool.php instructions
copyright()
Updates the copyright info (assuming in Yioop docs format) on files in supplied sub-folder/file. That is, it changes strings matching /2009 - \d\d\d\d/ to 2009 - current_year in those files/file.
copyright(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file
Return values
bool —$no_instructions false if should output CodeTool.php instructions
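The substitution described above can be sketched with a single preg_replace call using the pattern quoted in the description. The function name updateCopyrightYears and the sample comment line are hypothetical; file traversal and extension checks are left out.

```php
<?php
/**
 * Hypothetical sketch: rewrites the end year of "2009 - YYYY" strings.
 * If $set_year is false the current year is used, otherwise its int value.
 */
function updateCopyrightYears(string $contents, $set_year = false): string
{
    $year = ($set_year === false) ? date("Y") : (int) $set_year;
    return preg_replace('/2009 - \d\d\d\d/', "2009 - $year", $contents);
}

echo updateCopyrightYears(" * Copyright (C) 2009 - 2019 Example", 2024);
// prints " * Copyright (C) 2009 - 2024 Example"
```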
longlines()
Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file
longlines(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file
Return values
bool —$no_instructions false if should output CodeTool.php instructions
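The per-file check behind such a report could be sketched as below. The name findLongLines is hypothetical, the sketch works on a string rather than walking a folder, and it measures length in bytes with strlen, which is an assumption about how "80 characters" is counted.

```php
<?php
/**
 * Hypothetical sketch: returns [line_number => line] for every line of
 * $contents longer than $max_length bytes.
 */
function findLongLines(string $contents, int $max_length = 80): array
{
    $long_lines = [];
    foreach (explode("\n", $contents) as $index => $line) {
        $line = rtrim($line, "\r"); // tolerate Windows line endings
        if (strlen($line) > $max_length) {
            $long_lines[$index + 1] = $line; // report 1-based line numbers
        }
    }
    return $long_lines;
}
```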
needsdocs()
Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file
needsdocs(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file
Return values
bool —$no_instructions false if should output CodeTool.php instructions
replace()
Performs a search and replace for given pattern in files in supplied sub-folder/file
replace(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file, $args[1] contains the regex to search for, $args[2] contains what it should be replaced with, $args[3] (defaults to effect) controls the mode of operation and is one of "effect", "change", or "interactive". effect shows line numbers and lines matching the pattern, but commits no changes; interactive prompts the user for each match whether to make the change; change does a global search and replace without output
Return values
bool —$no_instructions false if should output CodeTool.php instructions
search()
Performs a search for given pattern in files in supplied sub-folder/file
search(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
$args[0] contains path to sub-folder/file, $args[1] contains the regex to search for
Return values
bool —$no_instructions false if should output CodeTool.php instructions
unit()
Used to run or list Yioop unit tests given in $args
unit(array<string|int, mixed> $args) : bool
Parameters
- $args : array<string|int, mixed>
-
if empty, run all tests; if $args[0] == 'list' then list the available tests. If $args[0] == name_of_particular then run just that test. If $args[1] == name_of_particular case, then just run that test case of the particular test.
Return values
bool —whether $args made sense so could process
changeCopyrightFile()
Callback function applied to each file in the directory being traversed by @see copyright(). It checks if the file has the extension of a code file and if so trims whitespace from its lines and then updates the lines of the form 2009 - \d\d\d\d to the supplied copyright year
changeCopyrightFile(string $filename[, mixed $set_year = false ]) : mixed
Parameters
- $filename : string
-
name of file to check for copyright lines and update
- $set_year : mixed = false
-
if false then set the end of the copyright period to the current year, otherwise, if an int sets it to the value of the int
Return values
mixed
cleanLinesFile()
Callback function applied to each file in the directory being traversed by @see clean().
cleanLinesFile(string $filename) : mixed
Parameters
- $filename : string
-
name of file to clean lines for
Return values
mixed
searchFile()
Callback function applied to each file in the directory being traversed by @see search(). Searches $filename matching $pattern and outputs line numbers and lines
searchFile(string $filename[, mixed $set_pattern = false ]) : mixed
Parameters
- $filename : string
-
name of file to search in
- $set_pattern : mixed = false
-
if not false, then sets $set_pattern in $pattern to initialize the callback on subsequent calls. $pattern here is the search pattern
Return values
mixed
replaceFile()
Callback function applied to each file in the directory being traversed by @see replace(). Searches $filename matching $pattern. Depending on $mode ($args[3] as described in replace()), it outputs and replaces with $replace
replaceFile(string $filename[, mixed $set_pattern = false ][, mixed $set_replace = false ][, mixed $set_mode = false ]) : mixed
Parameters
- $filename : string
-
name of file to search and replace in
- $set_pattern : mixed = false
-
if not false, then sets $set_pattern in $pattern to initialize the callback on subsequent calls. $pattern here is the search pattern
- $set_replace : mixed = false
-
if not false, then sets $set_replace in $replace to initialize the callback on subsequent calls.
- $set_mode : mixed = false
-
if not false, then sets $set_mode in $mode to initialize the callback on subsequent calls.
Return values
mixed
mapPath()
Applies the function $callback to each file in $path
mapPath(string $path, string $callback) : mixed
Parameters
- $path : string
-
to apply map $callback to
- $callback : string
-
function name to call with filename of each file in path
Return values
mixed
excludedPath()
Checks if $path is amongst a list of paths which should be ignored
excludedPath($path) : bool
Parameters
Return values
bool —whether or not it should be ignored (true == ignore)
bootstrap()
Main entry point to the Yioop web app.
bootstrap([object $web_site = null ][, bool $start_new_session = true ]) : mixed
Initialization is done in a function to avoid polluting the global namespace with variables.
Parameters
- $web_site : object = null
- $start_new_session : bool = true
-
whether to start a session or not
Return values
mixed
checkCookieConsent()
Checks if a cookie consent form was obtained. This function returns true if a session cookie was received from the browser, or a form variable saying cookies are okay was received, or the Yioop profile says the cookie consent mechanism is disabled
checkCookieConsent() : bool
Return values
bool —cookie consent (true) else false
configureRewrites()
Used to set up and handle URL rewriting for the Yioop web app
configureRewrites(object $web_site) : mixed
Developers can add new routes by creating a Routes class in the app_dir with a static method getRoutes which should return an associative array of incoming_path => handler function
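A custom Routes class of the kind described here might look like the following sketch. Everything beyond the class name Routes and the static method getRoutes is an assumption made for illustration: the "status" path, the routeStatus handler, the handler signature (modelled on the route functions documented in this section), the absence of a namespace, and the dispatch code at the end.

```php
<?php
/**
 * Hypothetical Routes class. The path and handler name are made up.
 */
class Routes
{
    /** Maps incoming_path => name of the handler function */
    public static function getRoutes(): array
    {
        return [
            "status" => "routeStatus",
        ];
    }
}

/**
 * Hypothetical handler. Assumption: like the built-in route functions, it
 * receives the url parts and returns whether it could compute a route.
 */
function routeStatus(array $route_args): bool
{
    return true;
}

// Illustrative dispatch on the first path component (not Yioop's code)
$routes = Routes::getRoutes();
$route_args = explode("/", "status/now");
$handler = $routes[$route_args[0]] ?? null;
$handled = is_callable($handler) && $handler($route_args);
```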
Parameters
- $web_site : object
-
used to send error pages if configuration fails
Return values
mixed
routeAppFile()
Used to handle routes that will eventually just serve files from the APP_DIR. These include files like css, scripts, suggest tries, images, and videos.
routeAppFile(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash)
Return values
bool —whether was able to compute a route or not
routeBaseFile()
Used to handle routes that will eventually just serve files from the BASE_DIR. These include files like css, scripts, images, and robots.txt.
routeBaseFile(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
routeDirect()
Used to route page requests to fixed Public Group wiki pages that should always be present. For example, the 404 page.
routeDirect(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
directUrl()
Given the name of a fixed public group static page creates the url where it can be accessed in this instance of Yioop, making use of the defined variable REDIRECTS_ON.
directUrl(string $name[, bool $with_delim = false ][, bool $with_base_url = false ]) : string
Parameters
- $name : string
-
of static page
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
- $with_base_url : bool = false
-
whether to use SHORT_BASE_URL or BASE_URL (true).
Return values
string —url for the page in question
routeBlog()
Used to route page requests for the website's public blog
routeBlog(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
routeFeeds()
Used to route page requests for pages corresponding to a group, user, or thread feed. If redirects are on, then urls ending with /feed_type/id map to a page for the id'th item of that feed_type
routeFeeds(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
feedsUrl()
Given the type of feed, the identifier of the feed instance, and which controller is being used creates the url where that feed item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
feedsUrl(string $type, int $id[, bool $with_delim = false ][, string $controller = "group" ][, bool $use_short_base_url = true ]) : string
Parameters
- $type : string
-
of feed: group, user, user messages, thread
- $id : int
-
the identifier for that feed.
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
- $controller : string = "group"
-
which controller is being used to access the feed: usually admin or group
- $use_short_base_url : bool = true
-
whether to create the url as a relative url using C\SHORT_BASE_URL or as a full url using C\BASE_URL (the latter is useful for mail notifications)
Return values
string —url for the page in question
routeUserMessages()
routeUserMessages(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
routeController()
Used to route page requests to end-user controllers such as register, admin. urls ending with /controller_name will be routed to that controller.
routeController(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
controllerUrl()
Given the name of a controller for which an easy end-user link is useful creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON. Examples of end-user controllers would be the admin, and register controllers.
controllerUrl(string $name[, bool $with_delim = false ]) : string
Parameters
- $name : string
-
of controller
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
Return values
string —url for the page in question
routeSubsearch()
Used to route page requests for subsearches such as news, video, and images (the site owner can define others). Urls of the form /s/subsearch will go to the page handling the subsearch.
routeSubsearch(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
subsearchUrl()
Given the name of a subsearch creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON.
subsearchUrl(string $name[, bool $with_delim = false ]) : string
Examples of subsearches include news, video, and images. A site owner can add to these and delete from these.
Parameters
- $name : string
-
of subsearch
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
Return values
string —url for the page in question
routeSerpIcon()
Used to route requests for favicons for pages in search results
routeSerpIcon(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
serpIconUrl()
Return the url to request the favicon for a page in the search results, making use of the defined variable REDIRECTS_ON.
serpIconUrl(mixed $url, mixed $crawl_time[, bool $with_delim = false ]) : string
Parameters
- $url : mixed
- $crawl_time : mixed
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
Return values
string —url for the page in question
routeSuggest()
Used to route requests for the suggest-a-url link on the tools page.
routeSuggest(array<string|int, mixed> $route_args) : bool
If redirects are on, then /suggest routes to this suggest-a-url page.
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
suggestUrl()
Return the url for the suggest-a-url link on the more tools page, making use of the defined variable REDIRECTS_ON.
suggestUrl([bool $with_delim = false ]) : string
Parameters
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
Return values
string —url for the page in question
routeWiki()
Used to route page requests for pages corresponding to a wiki page of a group. If it is a wiki page for the public group viewed without being logged in, the route might come in as yioop_instance/p/page_name if redirects are on. If it is for a non-public wiki or a page accessed while logged in, the url will look like either: yioop_instance/group/group_id?a=wiki&page_name=some_name or yioop_instance/admin/group_id?a=wiki&page_name=some_name&csrf_token_string
routeWiki(array<string|int, mixed> $route_args) : bool
Parameters
- $route_args : array<string|int, mixed>
-
of url parts (split on slash).
Return values
bool —whether was able to compute a route or not
wikiUrl()
Given the name of a wiki page, the group it belongs to, and which controller is being used creates the url where that wiki item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
wikiUrl(string $name[, bool $with_delim = false ][, string $controller = "static" ][, int $id = C\PUBLIC_GROUP_ID ]) : string
Parameters
- $name : string
-
of wiki page
- $with_delim : bool = false
-
whether it should be terminated with nothing or ? or &
- $controller : string = "static"
-
which controller is being used to access the wiki page: usually static (for the public group), admin, or group
- $id : int = C\PUBLIC_GROUP_ID
-
the group the wiki page belongs to
Return values
string —url for the page in question
main()
Command-line shell for testing the class
main() : mixed
Return values
mixed
tl()
Imports a tl function into the Controller namespace
tl() : mixed
Return values
mixed
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed
localesWithStopwordsList()
Returns an array of locales that have a stop words list and a stop words remover method
localesWithStopwordsList() : array<string|int, mixed>
Return values
array<string|int, mixed> —list of locales that have a stopwords list;
localeTagToIso639_2Tag()
Converts a $locale_tag (major-minor) to an ISO 639-2 language name
localeTagToIso639_2Tag(string $locale_tag) : string
Parameters
- $locale_tag : string
-
want to convert
Return values
string —corresponding ISO 639-2 language tag
guessLocale()
Attempts to guess the user's locale based on the request, session, and user-agent data
guessLocale() : string
Return values
string —IANA language tag of the guessed locale
guessLocaleFromString()
Attempts to guess the user's locale based on a string sample
guessLocaleFromString(string $phrase_string[, string $locale_tag = null ]) : string
Parameters
- $phrase_string : string
-
used to make guess
- $locale_tag : string = null
-
language tag to use if can't guess -- if not provided uses current locale's value
Return values
string —IANA language tag of the guessed locale
checkQuery()
Tries to find whether query belongs to a programming language
checkQuery(string $query) : string
Parameters
- $query : string
-
query entered by user
Return values
string —$lang programming language for the query provided
guessLangEncoding()
Tries to guess at a language tag based on the name of a character encoding
guessLangEncoding(string $encoding) : string
Parameters
- $encoding : string
-
a character encoding name
Return values
string —guessed language tag
guessEncodingHtmlXml()
Tries to guess the encoding used for an Html document
guessEncodingHtmlXml(string $html[, string $return_loc_info = false ]) : mixed
Parameters
- $html : string
-
the HTML or XML document to guess the character encoding of
- $return_loc_info : string = false
-
if meta http-equiv info was used to find the encoding, then if $return_loc_info is true, we return the location of charset substring. This allows converting to UTF-8 later so cached pages will display correctly and redirects without char encoding won't be given a different hash.
Return values
mixed —either string or array if string then guessed encoding, if array guessed encoding, start_pos of where charset info came from, length
convertUtf8IfNeeded()
Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
convertUtf8IfNeeded(array<string|int, mixed> &$site, string $page_field, string $encoding_field[, function $log_function = "" ]) : mixed
Parameters
- $site : array<string|int, mixed>
-
an associative array of info about a web site
- $page_field : string
-
the field in the associative array that contains the $site's web page as a string.
- $encoding_field : string
-
the field in the associative array that contains the character encoding the page is currently in
- $log_function : function = ""
-
a callback function used to write log messages with, if desired.
Return values
mixed
tl()
Translate the supplied arguments into the current locale.
tl() : string
This function takes a variable number of arguments. The first being an identifier to translate. Additional arguments are used to interpolate values in for %s's in the translation.
Return values
string —translated string
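The variable-argument interpolation described above can be illustrated with vsprintf. This is only a sketch: tlSketch, the demo_results_found identifier, and the in-memory translation table are hypothetical stand-ins for the locale data the real function consults, and falling back to the identifier when no translation exists is an assumption.

```php
<?php
/**
 * Hypothetical sketch: looks up an identifier in a translation table and
 * fills any %s placeholders with the remaining arguments.
 */
function tlSketch(string $identifier, ...$values): string
{
    // Made-up translation table standing in for a locale's strings
    static $translations = [
        "demo_results_found" => "Showing %s of %s results",
    ];
    // Assumption: untranslated identifiers are returned as they are
    $template = $translations[$identifier] ?? $identifier;
    return vsprintf($template, $values);
}

echo tlSketch("demo_results_found", 10, 250);
// prints "Showing 10 of 250 results"
```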
setLocaleObject()
Sets the language to be used for locale settings
setLocaleObject(string $locale_tag) : mixed
Parameters
- $locale_tag : string
-
the tag of the language to use to determine locale settings
Return values
mixed
getLocaleTag()
Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
getLocaleTag() : string
Return values
string —the tag of the language currently being used for locale settings
getLocaleDirection()
Returns the current language direction.
getLocaleDirection() : string
Return values
string —ltr or rtl depending on if the language is left-to-right or right-to-left
getLocaleQueryStatistics()
Returns the query statistics info for the current locale.
getLocaleQueryStatistics() : array<string|int, mixed>
Return values
array<string|int, mixed> —consisting of queries and elapsed times for locale computations
getBlockProgression()
Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.
getBlockProgression() : string
Return values
string —tb, lr, or rl depending on the current locale's block progression
getWritingMode()
Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
getWritingMode() : string
Return values
string —the locale's writing mode
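How a direction (ltr or rtl) and a block progression (tb, lr, or rl) could combine into a writing mode is sketched below. Only the English case lr-tb comes from the description above; the function name writingModeSketch and the other mappings, which follow the legacy CSS writing-mode values, are assumptions.

```php
<?php
/**
 * Hypothetical sketch: combines a direction and a block progression into
 * a writing mode such as lr-tb. Values other than lr-tb are assumed.
 */
function writingModeSketch(string $direction,
    string $block_progression): string
{
    $inline = ($direction === "rtl") ? "rl" : "lr";
    if ($block_progression === "tb") {
        return "$inline-tb"; // horizontal lines stacked top to bottom
    }
    return "tb-$block_progression"; // vertical lines stacked lr or rl
}

echo writingModeSketch("ltr", "tb"); // prints "lr-tb"
```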
w1256ToUTF8()
Convert the string $str encoded in Windows-1256 into UTF-8
w1256ToUTF8(string $str) : string
Parameters
- $str : string
-
Windows-1256 string to convert
Return values
string —the UTF-8 equivalent
utf8chr()
Given a unicode codepoint convert it to UTF-8
utf8chr(int $code) : string
Parameters
- $code : int
-
the codepoint to convert
Return values
string —the corresponding UTF-8 string
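The conversion from a code point to UTF-8 bytes follows the standard one to four byte layout, which can be sketched as below. The name utf8chrSketch is hypothetical, and the sketch does not validate its input (surrogates and values above 0x10FFFF are not rejected).

```php
<?php
/**
 * Hypothetical sketch: encodes a Unicode code point as a UTF-8 byte
 * string. No validation of the code point is attempted.
 */
function utf8chrSketch(int $code): string
{
    if ($code < 0x80) { // 1 byte: 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) { // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) { // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12)) .
            chr(0x80 | (($code >> 6) & 0x3F)) .
            chr(0x80 | ($code & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18)) .
        chr(0x80 | (($code >> 12) & 0x3F)) .
        chr(0x80 | (($code >> 6) & 0x3F)) .
        chr(0x80 | ($code & 0x3F));
}

echo bin2hex(utf8chrSketch(0x639)); // Arabic letter ain, prints "d8b9"
```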
formatDateByLocale()
Function for formatting a date string based on the locale.
formatDateByLocale($timestamp, $locale_tag) : string
Parameters
Return values
string —formatted date string
upgradeLocalesCheck()
Checks to see if the locale data of Yioop! of a locale in the work dir is older than the currently running Yioop!
upgradeLocalesCheck(string $locale_tag) : mixed
Parameters
- $locale_tag : string
-
locale to check directory of
Return values
mixed
upgradeLocales()
If the locale data of Yioop! in the work directory is older than the currently running Yioop! then this function is called to at least try to copy the new strings into the old profile.
upgradeLocales() : mixed
Return values
mixed
upgradePublicHelpWiki()
Used to force push the default Public and Help wiki pages into the current database
upgradePublicHelpWiki(resource &$db) : mixed
Parameters
- $db : resource
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseWorkDirectoryCheck()
Checks to see if the database data or work_dir folder of Yioop! is from an older version of Yioop! than the currently running Yioop!
upgradeDatabaseWorkDirectoryCheck() : mixed
Return values
mixed
upgradeDatabaseWorkDirectory()
If the database data of Yioop is older than the version of the currently running Yioop then this function is called to try to upgrade the database to the new version
upgradeDatabaseWorkDirectory() : mixed
Return values
mixed
updateVersionNumber()
Update the database version number to a new number
updateVersionNumber(object &$db, int $number) : mixed
Parameters
- $db : object
-
datasource for Yioop database
- $number : int
-
the new database number
Return values
mixed
getWikiHelpPages()
Reads the Help articles from default db and returns the array of pages.
getWikiHelpPages() : mixed
Return values
mixed
addActivityAtId()
Used to insert a new activity into the database at a given activity_id
addActivityAtId(resource &$db, string $string_id, string $method_name, int $activity_id) : mixed
Inserting at an ID rather than at the end is useful since activities are displayed in admin panel in order of increasing id.
Parameters
- $db : resource
-
database handle where Yioop database stored
- $string_id : string
-
message identifier to give translations for the activity
- $method_name : string
-
admin_controller method to be called to perform this activity
- $activity_id : int
-
the id location at which to create this activity. Activities at and below this location will be shifted down by 1.
Return values
mixed
updateTranslationForStringId()
Adds or replaces a translation for a database message string for a given IANA locale tag.
updateTranslationForStringId(resource &$db, string $string_id, string $locale_tag, string $translation) : mixed
Parameters
- $db : resource
-
database handle where Yioop database stored
- $string_id : string
-
message identifier to give translation for
- $locale_tag : string
-
the IANA language tag to update the strings of
- $translation : string
-
the translation for $string_id in the language $locale_tag
Return values
mixed
addRegexDelimiters()
Adds delimiters to a regex that may or may not have them
addRegexDelimiters(string $expression) : string
Parameters
- $expression : string
-
a regex
Return values
string —regex with delimiters added if they were not already there
preg_search()
Searches for a PCRE pattern in a subject from a given offset; returns the position of the first match if found, -1 otherwise.
preg_search(string $pattern, string $subject, int $offset[, bool $return_match = false ]) : mixed
Parameters
- $pattern : string
-
a Perl compatible regular expression
- $subject : string
-
to search for pattern in
- $offset : int
-
character offset into $subject to begin searching from
- $return_match : bool = false
-
whether to return as well what the match was for the pattern
Return values
mixed —if $return_match is false then the integer position of first match, otherwise, it returns the ordered pair [$pos, $match].
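A minimal sketch of how such a search could be written on top of PHP's preg_match with the PREG_OFFSET_CAPTURE flag. The name sketchPregSearch is hypothetical and this is not the Yioop source; in particular, preg_match measures $offset in bytes, and the value returned on a failed search with $return_match set is an assumption.

```php
<?php
// Hypothetical sketch, not the Yioop implementation.
function sketchPregSearch(string $pattern, string $subject, int $offset = 0,
    bool $return_match = false)
{
    // PREG_OFFSET_CAPTURE turns each match into a [text, byte_position] pair
    $found = preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE,
        $offset);
    $pos = ($found === 1) ? $matches[0][1] : -1;
    $match = ($found === 1) ? $matches[0][0] : "";
    // assumption: on failure with $return_match set, return [-1, ""]
    return $return_match ? [$pos, $match] : $pos;
}
```

For example, sketchPregSearch('/b+/', 'aabbbcc', 3, true) starts looking at byte 3 and reports both the position and the matched text.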
preg_offset_replace()
Replaces a pcre pattern with a replacement in $subject starting from some offset.
preg_offset_replace(string $pattern, string $replacement, string $subject, int $offset) : string
Parameters
- $pattern : string
-
a Perl compatible regular expression
- $replacement : string
-
what to replace the pattern with
- $subject : string
-
to search for pattern in
- $offset : int
-
character offset into $subject to begin searching from
Return values
string —result of the replacements
parse_ini_with_fallback()
Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks whether parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.
parse_ini_with_fallback(string $file) : array<string|int, mixed>
Parameters
- $file : string
-
filename of ini data to parse into an array
Return values
array<string|int, mixed> —data parsed from file
getIniAssignMatch()
Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.
getIniAssignMatch(string $matches) : mixed
Parameters
- $matches : string
-
produced by a preg_match in parse_ini_with_fallback
Return values
mixed —value of ini file assignment
charCopy()
Copies from $source string beginning at position $start, $length many bytes to destination string
charCopy(string $source, string &$destination, int $start, int $length[, string $timeout_msg = "" ]) : mixed
Parameters
- $source : string
-
string to copy from
- $destination : string
-
string to copy to
- $start : int
-
starting offset
- $length : int
-
number of bytes to copy
- $timeout_msg : string = ""
-
message to print if taking more than 30 seconds
Return values
mixed
vByteEncode()
Encodes an integer using variable byte coding.
vByteEncode(int $pos_int) : string
Parameters
- $pos_int : int
-
integer to encode
Return values
string —a string of 1-5 chars depending on how big $pos_int was
vByteDecode()
Decodes from a string using variable byte coding an integer.
vByteDecode(string $str, int &$offset) : int
Parameters
- $str : string
-
string to use for decoding
- $offset : int
-
byte offset into string where the var int is stored
Return values
int —the decoded integer
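A minimal sketch of one common variable byte convention: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The function names are hypothetical and the byte order Yioop actually uses is an assumption here.

```php
<?php
// Hypothetical sketch of variable byte coding, not the Yioop implementation.
function sketchVByteEncode(int $pos_int): string
{
    $result = "";
    while ($pos_int > 127) {
        // emit low 7 bits with the continuation bit (128) set
        $result .= chr(128 | ($pos_int & 127));
        $pos_int >>= 7;
    }
    return $result . chr($pos_int); // final byte has the high bit clear
}

function sketchVByteDecode(string $str, int &$offset): int
{
    $result = 0;
    $shift = 0;
    $len = strlen($str);
    while ($offset < $len) {
        $byte = ord($str[$offset]);
        $offset++; // caller's offset ends up just past the decoded integer
        $result |= ($byte & 127) << $shift;
        if ($byte < 128) {
            break; // high bit clear: this was the last byte
        }
        $shift += 7;
    }
    return $result;
}
```

Under this convention a 32 bit value needs at most five bytes, which matches the 1-5 chars mentioned above.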
appendUnary()
Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.
appendUnary(int $number, mixed $input, mixed &$start_bit_offset[, mixed $just_bit_offset = false ]) : mixed
Parameters
- $number : int
-
number to append
- $input : mixed
- $start_bit_offset : mixed
- $just_bit_offset : mixed = false
Return values
mixed —either the resulting string or its length
decodeUnary()
Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.
decodeUnary(string $input, int &$start_bit_offset) : int
Parameters
- $input : string
-
the string that we want to decode a unary number from
- $start_bit_offset : int
-
the starting bit offset in $input to start decoding from. After the call it will be the position after the decode
Return values
int —the decoded unary number
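The unary convention described above (n-1 0's followed by a 1) can be sketched on a packed byte string as follows. The function names are hypothetical, the $just_bit_offset option is omitted, and numbering bits from the most significant bit of the first byte is an assumption.

```php
<?php
// Hypothetical sketch (not the Yioop source). Assumes bits are numbered from
// the most significant bit of byte 0, and that $number >= 1.
function sketchAppendUnary(int $number, string $input, int &$bit_offset): string
{
    $one_position = $bit_offset + $number - 1; // where the terminating 1 goes
    // grow the string with zero bytes if the code runs past its current end
    $input = str_pad($input, intdiv($one_position, 8) + 1, "\x00");
    for ($i = $bit_offset; $i < $one_position; $i++) {
        // clear bit $i so the n-1 leading bits are 0's
        $byte = intdiv($i, 8);
        $input[$byte] = chr(ord($input[$byte]) & ~(0x80 >> ($i % 8)) & 0xFF);
    }
    $byte = intdiv($one_position, 8);
    $input[$byte] = chr(ord($input[$byte]) | (0x80 >> ($one_position % 8)));
    $bit_offset = $one_position + 1;
    return $input;
}

function sketchDecodeUnary(string $input, int &$bit_offset): int
{
    $count = 1;
    $total_bits = 8 * strlen($input);
    while ($bit_offset < $total_bits) {
        $bit = ord($input[intdiv($bit_offset, 8)]) & (0x80 >> ($bit_offset % 8));
        $bit_offset++;
        if ($bit) {
            return $count; // found the terminating 1
        }
        $count++;
    }
    return $count; // ran off the end of the input without seeing a 1
}
```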
appendBits()
Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.
appendBits(int $number, string $input, int &$start_bit_offset[, $num_bits = -1 ]) : string
Parameters
- $number : int
-
to append
- $input : string
-
the string to append to.
- $start_bit_offset : int
-
starting location to begin append from
- $num_bits : = -1
-
number of bits of $number to append.
Return values
string —resulting string
decodeBits()
Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a result of this operation, $start_bit_offset is increased by the number of bits that were able to be decoded.
decodeBits(string $input, int &$start_bit_offset, int $num_bits) : int
Parameters
- $input : string
-
string to decode bits from
- $start_bit_offset : int
-
bit offset to start decoding from in $input
- $num_bits : int
-
number of bits to try to decode
Return values
int —the number decoded
appendGamma()
Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.
appendGamma(int $number, string $input, int &$start_bit_offset) : string
Parameters
- $number : int
-
to append
- $input : string
-
the string to append to.
- $start_bit_offset : int
-
starting bit location to begin append from
Return values
string —resulting string
decodeGammaList()
Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.
decodeGammaList(string $input, int &$start_bit_offset, int $num_decode) : array<string|int, mixed>
Parameters
- $input : string
-
the string to decode from
- $start_bit_offset : int
-
starting bit location to decode from
- $num_decode : int
-
number of int's to decode
Return values
array<string|int, mixed> —decoded int's
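A sketch of gamma coding, assuming the standard Elias gamma code: for n >= 1 with a k+1 bit binary representation, write k 0's followed by that representation. To keep the example short it works on a text string of '0' and '1' characters rather than on packed bytes; the function names are hypothetical and none of this is taken from the Yioop source.

```php
<?php
// Hypothetical sketch of Elias gamma coding on a text string of '0'/'1'
// characters (the documented functions operate on packed bit strings instead).
function sketchGammaEncode(int $number): string
{
    $binary = decbin($number); // e.g. 5 -> "101"
    // k leading zeros announce that k+1 significant bits follow
    return str_repeat("0", strlen($binary) - 1) . $binary;
}

function sketchGammaDecodeList(string $bits, int &$offset,
    int $num_decode): array
{
    $out = [];
    $len = strlen($bits);
    while (count($out) < $num_decode && $offset < $len) {
        $zeros = strspn($bits, "0", $offset); // length of the zero prefix
        $width = $zeros + 1;
        if ($offset + $zeros + $width > $len) {
            break; // truncated code: stop with what was decoded so far
        }
        $out[] = bindec(substr($bits, $offset + $zeros, $width));
        $offset += $zeros + $width; // move past the decoded integer
    }
    return $out;
}
```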
appendRiceSequence()
Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.
appendRiceSequence(array<string|int, mixed> $int_sequence, int $modulus, string $output, int &$start_bit_offset[, int $delta_start = -1 ]) : string
Encoding is done as a difference list. If $delta_start is set to a value >= 0 then the first gap is assumed to be measured from the int $delta_start.
Parameters
- $int_sequence : array<string|int, mixed>
-
int's to append
- $modulus : int
-
i in the 2^i modulus to use for Rice code
- $output : string
-
the string to append to.
- $start_bit_offset : int
-
starting bit location to begin append from
- $delta_start : int = -1
-
if >= 0 previous int to use for difference list otherwise the first integer is encoded as itself rather than a difference
Return values
string —resulting string
decodeRiceSequence()
Decodes up to $num_decode Rice encoded difference list integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start.
decodeRiceSequence(string $input, int &$start_bit_offset, int $num_decode[, int $delta_start = -1 ]) : array<string|int, mixed>
Parameters
- $input : string
-
the string to decode from
- $start_bit_offset : int
-
starting bit location to decode from
- $num_decode : int
-
number of int's to decode
- $delta_start : int = -1
-
if >= 0 previous int to use for difference list otherwise the first integer is decoded as itself rather than a difference
Return values
array<string|int, mixed> —decoded int's
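A sketch of Rice coding a difference list with modulus 2^i: each gap is split into a quotient (gap >> i) and an i bit remainder. Here the quotient q is written as q 0's followed by a 1, and the remainder follows in i bits. As an illustration it works on text strings of '0' and '1' characters, takes the modulus as an explicit argument when decoding, and uses hypothetical function names; the bit layout Yioop actually uses is not confirmed by this page.

```php
<?php
// Hypothetical sketch of Rice coding a nondecreasing list of nonnegative
// integers as gaps, on a text string of '0'/'1' characters.
function sketchRiceEncode(array $ints, int $modulus,
    int $delta_start = -1): string
{
    $bits = "";
    $previous = $delta_start;
    foreach ($ints as $value) {
        // first value is stored as itself unless $delta_start >= 0
        $gap = ($previous >= 0) ? $value - $previous : $value;
        $previous = $value;
        $quotient = $gap >> $modulus;
        $remainder = $gap & ((1 << $modulus) - 1);
        $bits .= str_repeat("0", $quotient) . "1";
        if ($modulus > 0) {
            $bits .= str_pad(decbin($remainder), $modulus, "0", STR_PAD_LEFT);
        }
    }
    return $bits;
}

function sketchRiceDecode(string $bits, int &$offset, int $num_decode,
    int $modulus, int $delta_start = -1): array
{
    $out = [];
    $previous = $delta_start;
    $len = strlen($bits);
    while (count($out) < $num_decode && $offset < $len) {
        $quotient = strspn($bits, "0", $offset);
        $next = $offset + $quotient + 1; // skip the zeros and the closing 1
        if ($next + $modulus > $len) {
            break; // truncated input
        }
        $remainder = ($modulus > 0) ?
            bindec(substr($bits, $next, $modulus)) : 0;
        $gap = ($quotient << $modulus) + $remainder;
        $previous = ($previous >= 0) ? $previous + $gap : $gap;
        $out[] = $previous;
        $offset = $next + $modulus;
    }
    return $out;
}
```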
encodePositionList()
Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.
encodePositionList(array<string|int, mixed> $positions) : string
Parameters
- $positions : array<string|int, mixed>
-
integer term positions
Return values
string —encoded position list
decodePositionList()
Decodes up to $num_decode term in document position integers from string $input under the assumption $input is encoded as per
decodePositionList(string $input, int $num_decode) : array<string|int, mixed>
Parameters
- $input : string
-
string to decode from
- $num_decode : int
-
number of integers to decode
Return values
array<string|int, mixed> —decoded positions
encode255()
Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE
encode255(string $str) : string
Parameters
- $str : string
-
to be encoded
Return values
string —encoded string without \xFF
decode255()
Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE
decode255(string $str) : string
Parameters
- $str : string
-
to be decoded
Return values
string —decoded string
encodeUnderscore()
Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=
encodeUnderscore(string $str) : string
Parameters
- $str : string
-
to be encoded
Return values
string —encoded string without _
decodeUnderscore()
Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -
decodeUnderscore(string $str) : string
Parameters
- $str : string
-
to be decoded
Return values
string —decoded string
packEncode255()
Encodes a list of strings as their @see encode255 versions separated by \xFF's
packEncode255(array<string|int, mixed> $strs) : string
Parameters
- $strs : array<string|int, mixed>
-
strings to encode as a single string
Return values
string —encoded list
unpackDecode255()
Decodes a list of strings from a string that was encoded as the @see encode255 versions of its elements separated by \xFF's
unpackDecode255(string $encoded_strs) : array<string|int, mixed>
Parameters
- $encoded_strs : string
-
string to decode into a list of strings
Return values
array<string|int, mixed> —decoded list
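A sketch of the escape scheme and of packing a list with it, assuming \xFE acts as the escape byte so that \xFE becomes \xFE\xFD and \xFF becomes \xFE\xFE. The sketch* names are hypothetical and this is not the Yioop source. strtr with an array makes a single left-to-right pass, so bytes produced by one replacement are never rescanned.

```php
<?php
// Hypothetical sketch of an escaping scheme that frees up \xFF as a separator.
function sketchEncode255(string $str): string
{
    return strtr($str, ["\xFE" => "\xFE\xFD", "\xFF" => "\xFE\xFE"]);
}

function sketchDecode255(string $str): string
{
    return strtr($str, ["\xFE\xFE" => "\xFF", "\xFE\xFD" => "\xFE"]);
}

function sketchPackEncode255(array $strs): string
{
    // encoded pieces never contain \xFF, so it is safe as a separator
    return implode("\xFF", array_map('sketchEncode255', $strs));
}

function sketchUnpackDecode255(string $encoded_strs): array
{
    return array_map('sketchDecode255', explode("\xFF", $encoded_strs));
}
```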
packPosting()
Makes a packed integer string from a docindex and the number of occurrences of a word in the document with that docindex.
packPosting(int $doc_index, array<string|int, mixed> $position_list[, bool $delta = true ]) : string
Parameters
- $doc_index : int
-
index (i.e., a count of which document it is rather than a byte offset) of a document in the document string
- $position_list : array<string|int, mixed>
-
integer positions word occurred in that doc
- $delta : bool = true
-
if true then stores the position_list as a sequence of differences (a delta list)
Return values
string —a modified9 (our compression scheme) packed string containing this info.
unpackPosting()
Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.
unpackPosting(string $posting, int &$offset[, bool $dedelta = true ]) : array<string|int, mixed>
Parameters
- $posting : string
-
a string containing a doc index position list pair encoded using modified9
- $offset : int
-
an offset into the string where the modified9 posting is encoded
- $dedelta : bool = true
-
if true then assumes the list is a sequence of differences (a delta list) and undoes the difference to get the original sequence
Return values
array<string|int, mixed> —consisting of integer doc_index and a subarray consisting of integer positions of word in doc.
addDocIndexPostings()
This method is used while appending one index shard to another.
addDocIndexPostings(string &$postings, int $add_offset) : string
Given a string of postings, adds $add_offset to each offset to the document map in each posting.
Parameters
- $postings : string
-
a string of index shard postings
- $add_offset : int
-
a fixed amount to add to each posting's doc map offset
Return values
string —$new_postings where each doc offset has had $add_offset added to it
deltaList()
Computes the difference of a list of integers.
deltaList(array<string|int, mixed> $list) : array<string|int, mixed>
i.e., (a1, a2, a3, a4) becomes (a1, a2-a1, a3-a2, a4-a3)
Parameters
- $list : array<string|int, mixed>
-
a nondecreasing list of integers
Return values
array<string|int, mixed> —the corresponding list of differences of adjacent integers
deDeltaList()
Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function
deDeltaList(array<string|int, mixed> &$delta_list) : array<string|int, mixed>
Parameters
- $delta_list : array<string|int, mixed>
-
a list of nonnegative integers
Return values
array<string|int, mixed> —a nondecreasing list of integers
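The two list transforms above can be sketched as follows. The sketch* names are hypothetical, and unlike the documented deDeltaList the inverse here takes its argument by value.

```php
<?php
// Hypothetical sketch of delta coding a nondecreasing list and undoing it.
function sketchDeltaList(array $list): array
{
    $deltas = [];
    $previous = 0;
    foreach ($list as $value) {
        $deltas[] = $value - $previous; // first entry is kept as a1 - 0 = a1
        $previous = $value;
    }
    return $deltas;
}

function sketchDeDeltaList(array $delta_list): array
{
    $list = [];
    $sum = 0;
    foreach ($delta_list as $delta) {
        $sum += $delta; // running sum rebuilds the original values
        $list[] = $sum;
    }
    return $list;
}
```

For instance (3, 7, 8, 20) becomes (3, 4, 1, 12), and the running sums of (3, 4, 1, 12) give back (3, 7, 8, 20).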
encodeModified9()
Encodes a sequence of integers x, such that 1 <= x <= 2<<28-1 as a string. NOTICE x>=1.
encodeModified9(array<string|int, mixed> $list) : string
The encoded string is a sequence of 4 byte words (packed int's). The high order 2 bits of a given word indicate whether or not to look at the next word. The codes are as follows: 11 start of encoded string, 10 continue four more bytes, 01 end of encoded, and 00 indicates whole sequence encoded in one word.
After the high order 2 bits, the next most significant bits indicate the format of the current word. There are nine possibilities: 00 - one 28 bit number, 01 - two 14 bit numbers, 10 - three 9 bit numbers, 1100 - four 6 bit numbers, 1101 - five 5 bit numbers, 1110 - six 4 bit numbers, 11110 - seven 3 bit numbers, 111110 - twelve 2 bit numbers, 111111 - twenty-four 1 bit numbers.
Parameters
- $list : array<string|int, mixed>
-
a list of positive integers satisfying the above
Return values
string —encoded string
packListModified9()
Packs the contents of a single word of a sequence being encoded using Modified9.
packListModified9(int $continue_bits, int $cnt, array<string|int, mixed> $pack_list) : string
Parameters
- $continue_bits : int
-
the high order 2 bits of the word
- $cnt : int
-
the number of elements that will be packed in this word
- $pack_list : array<string|int, mixed>
-
a list of positive integers to pack into word
Return values
string —encoded 4 byte string
nextPostString()
Returns the next complete posting string from $input_string beginning at offset.
nextPostString(string &$input_string, int &$offset) : string
Does not do any decoding.
Parameters
- $input_string : string
-
a string of postings
- $offset : int
-
an offset to this string which will be updated after call
Return values
string —undecoded posting
decodeModified9()
Decodes a sequence of positive integers from a string that has been encoded using Modified 9
decodeModified9(string $input_string, int &$offset) : array<string|int, mixed>
Parameters
- $input_string : string
-
string to decode from
- $offset : int
-
where to start in the string; after decoding, points to the position just after what was decoded.
Return values
array<string|int, mixed> —sequence of positive integers that were decoded
unpackListModified9()
Decode a single word with high two bits off according to modified 9
unpackListModified9(string $encoded_list) : array<string|int, mixed>
Parameters
- $encoded_list : string
-
four byte string to decode
Return values
array<string|int, mixed> —sequence of integers that results from the decoding.
docIndexModified9()
Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.
docIndexModified9(int $encoded_list) : int
Parameters
- $encoded_list : int
-
in the just described format
Return values
int —a doc index into an index shard document map.
unpackInt()
Unpacks an int from a 4 char string
unpackInt(string $str) : int
Parameters
- $str : string
-
where to extract int from
Return values
int —extracted integer
packInt()
Packs an int into a 4 char string
packInt(int $my_int) : string
Parameters
- $my_int : int
-
the integer to pack
Return values
string —the packed string
unpackFloat()
Unpacks a float from a 4 char string
unpackFloat(string $str) : float
Parameters
- $str : string
-
where to extract float from
Return values
float —extracted float
packFloat()
Packs a float into a four char string
packFloat(float $my_float) : string
Parameters
- $my_float : float
-
the float to pack
Return values
string —the packed string
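A sketch of the four packing helpers using PHP's pack and unpack. The format codes are assumptions: "N" (unsigned 32 bit, big endian) for ints and "f" (machine dependent single precision) for floats. The sketch* names are hypothetical.

```php
<?php
// Hypothetical sketch; the format codes Yioop uses are assumed, not confirmed.
function sketchPackInt(int $my_int): string
{
    return pack("N", $my_int); // 4 bytes, most significant byte first
}

function sketchUnpackInt(string $str): int
{
    return unpack("N", $str)[1]; // unnamed formats are returned under key 1
}

function sketchPackFloat(float $my_float): string
{
    return pack("f", $my_float); // 4 bytes, machine byte order
}

function sketchUnpackFloat(string $str): float
{
    return unpack("f", $str)[1];
}
```

Packing a float into four bytes loses precision for values that are not exactly representable in single precision.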
renameSerializedObject()
Used to change the namespace of a serialized php object (assumes doesn't have nested subobjects)
renameSerializedObject(string $class_name, string $object_string) : string
Parameters
- $class_name : string
-
new fully qualified name with namespace
- $object_string : string
-
serialized object
Return values
string —serialized object with new name
getDomFromString()
Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML
getDomFromString(string $to_parse) : DOMDocument
Parameters
- $to_parse : string
-
the string to parse a DOMDocument from
Return values
DOMDocument —computed based on the provided string
getTags()
Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument
getTags(DOMDocument $dom, string $query) : array<string|int, mixed>
Parameters
- $dom : DOMDocument
-
document to run xpath query on
- $query : string
-
xpath query to run
Return values
array<string|int, mixed> —of DOMDocuments one for each node matching the xpath query in the original DOMDocument
toHexString()
Converts a string to string where each char has been replaced by its hexadecimal equivalent
toHexString(string $str) : string
Parameters
- $str : string
-
what we want rewritten in hex
Return values
string —the hexified string
toIntString()
Converts a string to a string where each char has been replaced by an integer equivalent
toIntString(string $str) : string
Parameters
- $str : string
-
what we want rewritten as integers
Return values
string —the string of integers
toBinString()
Converts a string to string where each char has been replaced by its binary equivalent
toBinString(string $str) : string
Parameters
- $str : string
-
what we want rewritten in binary
Return values
string —the binary string
metricToInt()
Converts a string of the form some int followed by K, M, or G into its integer equivalent.
metricToInt(string $metric_num) : int
For example, 4K would become 4000, 16M would become 16000000, and 1G would become 1000000000. Note that K, M, and G are treated as powers of 10, not base 2.
Parameters
- $metric_num : string
-
metric number to convert
Return values
int —number the metric string corresponded to
intToMetric()
Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)
intToMetric(int $num) : string
Parameters
- $num : int
-
number to convert
Return values
string —the metric string corresponding to the number
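A sketch of both conversions using powers of 10. The sketch* names are hypothetical, and truncating (rather than rounding) when converting back to a metric string is an assumption, as is treating a lower case suffix the same as upper case.

```php
<?php
// Hypothetical sketch of decimal metric suffix conversion.
function sketchMetricToInt(string $metric_num): int
{
    $multipliers = ["K" => 1000, "M" => 1000000, "G" => 1000000000];
    $num = intval($metric_num); // reads the leading digits, e.g. "16M" -> 16
    $suffix = strtoupper(substr(trim($metric_num), -1));
    return $num * ($multipliers[$suffix] ?? 1);
}

function sketchIntToMetric(int $num): string
{
    $sizes = ["T" => 10 ** 12, "G" => 10 ** 9, "M" => 10 ** 6, "K" => 10 ** 3];
    foreach ($sizes as $suffix => $size) {
        if ($num >= $size) {
            // assumption: drop the fractional part instead of rounding
            return intdiv($num, $size) . $suffix;
        }
    }
    return (string)$num;
}
```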
crawlLog()
Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode, it will log to stdout; otherwise it will use error_log. When logging to file, $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be a log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example, a call: crawlLog("\n\nInitialize logger..", $this->process_name, true); says $this->process_name should be checked for liveness as part of any subsequent logging activity such as a call crawlLog("Another Message"); (note subsequent calls don't need to specify the process name).
crawlLog(string $msg[, string $lname = null ][, bool $check_process_handler = false ]) : mixed
Parameters
- $msg : string
-
message to log. If empty then no message written
- $lname : string = null
-
name of log file in the LOG_DIR directory; rotated logs will also use this as their basename, followed by a number, followed by a suffix indicating they are gzipped. (Older versions of Yioop used bzip. Some distros don't have bzip but do have gzip, and gzip was already being used elsewhere in Yioop, so bzip was replaced to remove the dependency.)
- $check_process_handler : bool = false
-
false by default. Once a call has set this to true, subsequent calls that leave it false will still call processHandler to check how long the code has run since the last time processHandler was called.
Return values
mixed
makeTimestamp()
Used to make a log file entry time string of format: entry number, time in r format.
makeTimestamp([int $time = -1 ]) : string
Parameters
- $time : int = -1
-
a unix timestamp
Return values
string —[line_count_in_log r_formatted_date]
crawlTimeoutLog()
Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).
crawlTimeoutLog(mixed $msg) : bool
Parameters
- $msg : mixed
-
usually a string with what to be printed out after the timeout period. If $msg === true then clears the timeout cache
Return values
bool —whether a log message was written
crawlHash()
Computes an 8 byte hash of a string for use in storing documents.
crawlHash(string $string[, bool $raw = false ]) : string
An eight byte hash was chosen so that the odds of collision even for a few billion documents via the birthday problem are still reasonable. If the raw flag is set to false then an 11 byte base64 encoding of the 8 byte hash is returned. The hash is calculated as the xor of the two halves of the 16 byte md5 of the string. (8 bytes takes less storage which is useful for keeping more doc info in memory)
Parameters
- $string : string
-
the string to hash
- $raw : bool = false
-
whether to leave raw or base 64 encode
Return values
string —the hash of $string
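A sketch of the hash construction described above: take the raw 16 byte md5, XOR its two 8 byte halves, and optionally base64 encode the result. The name sketchCrawlHash is hypothetical, and the URL-safe character substitutions and padding removal are assumptions about what the non-raw form looks like.

```php
<?php
// Hypothetical sketch of an 8 byte hash built from md5, not the Yioop source.
function sketchCrawlHash(string $string, bool $raw = false): string
{
    $md5 = md5($string, true); // true => 16 raw bytes rather than 32 hex chars
    // the ^ operator on two strings XORs them byte by byte
    $hash = substr($md5, 0, 8) ^ substr($md5, 8, 8);
    if ($raw) {
        return $hash;
    }
    // base64 of 8 bytes is 12 chars including one '=' pad; dropping the pad
    // leaves 11. Swapping '+' and '/' for '-' and '_' is an assumption here.
    return rtrim(strtr(base64_encode($hash), '+/', '-_'), '=');
}
```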
crawlHashWord()
Used to create a 20 byte hash of a string (typically a word or phrase with a wikipedia page). Format is 8 byte crawlHash of term (md5 of term two halves XOR'd), followed by a \x00, followed by the first 11 characters from the term. If there are not enough chars to make 20 bytes, then the string is padded with \x00s to 20 bytes.
crawlHashWord(string $string[, bool $raw = false ]) : string
Parameters
- $string : string
-
word to hash
- $raw : bool = false
-
whether to base64Hash the result
Return values
string —first 8 bytes of md5 of $string concatenated with \x00 to indicate the hash is of a word not a phrase concatenated with the padded to 11 byte $meta_string.
canonicalTerm()
Takes a $term that might have come from a document and converts it to a string of 16 bytes which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.
canonicalTerm(string $term) : string
Base64 used to make this all nice and printable.
Parameters
- $term : string
-
to made into a canonical form
Return values
string —canonicalized version of term as described above.
compareWordHashes()
Used to compare two ids for index dictionary lookup. Ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.
compareWordHashes(string $id1, string $id2) : int
Parameters
- $id1 : string
-
20 byte word id to compare
- $id2 : string
-
20 byte word id to compare
Return values
int —negative if $id1 smaller, positive if bigger, and 0 if same
base64Hash()
Converts a crawl hash number to something close to base64 coding, but altered so that it doesn't get confused in URLs or DBs
base64Hash(string $string) : string
Parameters
- $string : string
-
a hash to base64 encode
Return values
string —the encoded hash
unbase64Hash()
Decodes a crawl hash number from base64 to raw ASCII
unbase64Hash(string $base64) : string
Parameters
- $base64 : string
-
a hash to decode
Return values
string —the decoded hash
webencode()
Encodes a string in a format suitable for post data (mainly, base64, but str_replace data that might mess up post in result)
webencode(string $str) : string
Parameters
- $str : string
-
string to encode
Return values
string —encoded string
webdecode()
Decodes a string encoded by webencode
webdecode(string $str) : string
Parameters
- $str : string
-
string to decode
Return values
string —decoded string
crawlCrypt()
The crawlCrypt function is used to encrypt passwords stored in the database.
crawlCrypt(string $string[, int $salt = null ]) : string
It tries to use the best version of the Blowfish variant of php's crypt function available on the current system.
Parameters
- $string : string
-
the string to encrypt
- $salt : int = null
-
salt value to be used (needed to verify if a password is valid)
Return values
string —the crypted string where crypting is done using crawlHash
partitionByHash()
Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling
partitionByHash(array<string|int, mixed> $table, string $field, int $num_partition, int $instance[, object $callback = null ]) : array<string|int, mixed>
Parameters
- $table : array<string|int, mixed>
-
an array of rows of associative arrays which a queue_server might need to process
- $field : string
-
column of $table whose values should be used for partitioning
- $num_partition : int
-
number of queue_servers to choose between
- $instance : int
-
the id of the particular server we are interested in
- $callback : object = null
-
function or static method that might be applied to input before deciding the responsible queue_server. For example, if input was a url we might want to get the host before deciding on the queue_server
Return values
array<string|int, mixed> —the reduced table that the $instance queue_server is responsible for
calculatePartition()
Used by a controller to say which queue_server should receive a given input
calculatePartition(string $input, int $num_partition[, object $callback = null ]) : int
Parameters
- $input : string
-
can be viewed as a key that might be processed by a queue_server. For example, in some cases input might be a url and we want to determine which queue_server should be responsible for queuing that url
- $num_partition : int
-
number of queue_servers to choose between
- $callback : object = null
-
function or static method that might be applied to input before deciding the responsible queue_server. For example, if the input was a url we might want to get the host before deciding on the queue_server
Return values
int —id of server responsible for input
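A sketch of hash based partitioning as described for partitionByHash and calculatePartition: optionally transform the input with the callback, hash it, and reduce the hash modulo the number of queue_servers. The sketch* names are hypothetical and the md5-derived hash is an assumption; Yioop may use a different hash.

```php
<?php
// Hypothetical sketch of assigning keys to queue_servers by hashing.
function sketchCalculatePartition(string $input, int $num_partition,
    ?callable $callback = null): int
{
    if ($callback !== null) {
        $input = $callback($input); // e.g. reduce a url to its host
    }
    // 7 hex digits = 28 bits, so the value fits a PHP int on any platform
    return (int)(hexdec(substr(md5($input), 0, 7)) % $num_partition);
}

function sketchPartitionByHash(array $table, string $field,
    int $num_partition, int $instance, ?callable $callback = null): array
{
    // keep only the rows whose $field value hashes to this $instance
    return array_values(array_filter($table,
        function ($row) use ($field, $num_partition, $instance, $callback) {
            return sketchCalculatePartition($row[$field], $num_partition,
                $callback) === $instance;
        }));
}
```

Passing a callback that extracts the host keeps all urls from one site on the same queue_server.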
changeInMicrotime()
Measures the change in time in seconds between two timestamps to microsecond precision
changeInMicrotime(string $start[, string $end = null ]) : float
Parameters
- $start : string
-
starting time with microseconds
- $end : string = null
-
ending time with microseconds, if null use current time
Return values
float —time difference in seconds
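A sketch of the time difference computation, assuming the timestamps are strings in the "fraction seconds" form returned by PHP's microtime() when called without arguments. The name sketchChangeInMicrotime is hypothetical.

```php
<?php
// Hypothetical sketch; assumes timestamps look like "0.25000000 1700000000".
function sketchChangeInMicrotime(string $start, ?string $end = null): float
{
    $end = $end ?? microtime(); // default to the current time
    [$start_micro, $start_sec] = explode(" ", $start);
    [$end_micro, $end_sec] = explode(" ", $end);
    // subtract whole seconds and fractions separately to limit rounding error
    return ((float)$end_sec - (float)$start_sec) +
        ((float)$end_micro - (float)$start_micro);
}
```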
microTimestamp()
Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)
microTimestamp() : string
Return values
string —timestamp, to microsecond precision, of the time in seconds since the start of the current epoch
checkTimeInterval()
Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration
checkTimeInterval(string $start_time, string $duration[, int $time = -1 ]) : int
Parameters
- $start_time : string
-
string of the form (HH:mm)
- $duration : string
-
string containing an int in seconds
- $time : int = -1
-
a Unix timestamp.
Return values
int —-1 if the time of day of $time is not within the given interval. Otherwise, the Unix timestamp at which the interval will be over for the same day as $time.
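A sketch of the interval check for the simple case where the interval starts on the same day as $time. The name sketchCheckTimeInterval is hypothetical, the script's default timezone is used, and intervals that began on the previous day and run past midnight are not handled in this sketch.

```php
<?php
// Hypothetical sketch; only considers an interval starting on $time's own day.
function sketchCheckTimeInterval(string $start_time, string $duration,
    int $time = -1): int
{
    if ($time < 0) {
        $time = time();
    }
    [$hour, $minute] = array_map('intval', explode(":", $start_time));
    // timestamp of HH:mm on the same calendar day as $time
    $start = mktime($hour, $minute, 0, (int)date('n', $time),
        (int)date('j', $time), (int)date('Y', $time));
    $end = $start + (int)$duration;
    return ($time >= $start && $time < $end) ? $end : -1;
}
```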
convertPixels()
Converts a CSS unit string into its equivalent in pixels. This is used by @see SvgProcessor.
convertPixels(string $value) : int
Parameters
- $value : string
-
a number followed by a legal CSS unit
Return values
int —a number in pixels
countFiles()
Returns the number of files in a folder
countFiles(string $folder) : int
Parameters
- $folder : string
-
path to folder to count
Return values
int —number of files
makePath()
Creates folders along a filesystem path if they don't exist
makePath(string $path) : bool
Parameters
- $path : string
-
a file system path
Return values
bool —success or failure
deleteFileOrDir()
This is a callback function used in the process of recursively deleting a directory
deleteFileOrDir(string $file_or_dir) : mixed
Parameters
- $file_or_dir : string
-
the filename or directory name to be deleted
Return values
mixed
setWorldPermissions()
This is a callback function used in the process of recursively chmoding to 777 all files in a folder
setWorldPermissions(string $file) : mixed
Parameters
- $file : string
-
the filename or directory name to be chmod
Return values
mixed
fileInfo()
This is a callback function used in the process of recursively calculating an array of file modification times and files sizes for a directory
fileInfo(string $file) : array
Parameters
- $file : string
-
a name of a file in the file system
Return values
array —an array whose single element contains an associative array with the size and modification time of the file
orderCallback()
Callback function used to sort documents by a field
orderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int
Should be initialized before using in usort with a call like: orderCallback($tmp, $tmp, "field_want");
Parameters
- $word_doc_a : string
-
doc id of first document to compare
- $word_doc_b : string
-
doc id of second document to compare
- $order_field : string = null
-
which field of these associative arrays to sort by
Return values
int —-1 if first doc bigger 1 otherwise
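The initialize-then-sort pattern mentioned above can be sketched with a static variable that remembers the field between calls. The name sketchOrderCallback is hypothetical, and returning early from the initializing call is an assumption about how such a callback would avoid touching its placeholder arguments.

```php
<?php
// Hypothetical sketch of a usort callback whose sort field is set up front.
function sketchOrderCallback($word_doc_a, $word_doc_b,
    ?string $order_field = null): int
{
    static $field = "a"; // persists across calls
    if ($order_field !== null) {
        $field = $order_field; // initializing call: remember field and stop
        return -1;
    }
    // -1 when the first doc is bigger, so usort orders from largest down
    return ($word_doc_a[$field] > $word_doc_b[$field]) ? -1 : 1;
}

$docs = [["score" => 1], ["score" => 3], ["score" => 2]];
$tmp = [];
sketchOrderCallback($tmp, $tmp, "score"); // choose the field first
usort($docs, 'sketchOrderCallback');      // then sort with two-argument calls
```

usort only ever passes two arguments, which is why the field has to be stored by a separate call beforehand.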
stringOrderCallback()
Callback function used to sort documents by a field where the field is assumed to be a string
stringOrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int
Should be initialized before using in usort with a call like: stringOrderCallback($tmp, $tmp, "field_want");
Parameters
- $word_doc_a : string
-
doc id of first document to compare
- $word_doc_b : string
-
doc id of second document to compare
- $order_field : string = null
-
which field of these associative arrays to sort by
Return values
int —-1 if first doc smaller 1 otherwise
stringROrderCallback()
Callback function used to sort documents by a field in reverse order where the field is assumed to be a string
stringROrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int
Should be initialized before using in usort with a call like: stringROrderCallback($tmp, $tmp, "field_want");
Parameters
- $word_doc_a : string
-
doc id of first document to compare
- $word_doc_b : string
-
doc id of second document to compare
- $order_field : string = null
-
which field of these associative arrays to sort by
Return values
int —returns -1 if the first doc is bigger, 1 otherwise
rorderCallback()
Callback function used to sort documents by a field in reverse order
rorderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int
Should be initialized before using in usort with a call like: rorderCallback($tmp, $tmp, "field_want");
Parameters
- $word_doc_a : string
-
doc id of first document to compare
- $word_doc_b : string
-
doc id of second document to compare
- $order_field : string = null
-
which field of these associative arrays to sort by
Return values
int —returns 1 if the first doc is bigger, -1 otherwise
lessThan()
Callback to check if $a is less than $b
lessThan(float $a, float $b) : int
Used to help sort document results returned in PhraseModel called in IndexArchiveBundle
Parameters
- $a : float
-
first value to compare
- $b : float
-
second value to compare
Return values
int —-1 if $a is less than $b; 1 otherwise
greaterThan()
Callback to check if $a is greater than $b
greaterThan(float $a, float $b) : int
Used to help sort document results returned in PhraseModel called in IndexArchiveBundle
Parameters
- $a : float
-
first value to compare
- $b : float
-
second value to compare
Return values
int —-1 if $a is greater than $b; 1 otherwise
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed
remoteAddress()
Compute the real remote address of the incoming connection including forwarding
remoteAddress() : mixed
Return values
mixed
readInput()
Used to read a line of input from the command-line
readInput() : string
Return values
string —from the command-line
readPassword()
Used to read a line of input from the command-line (on unix machines without echoing it)
readPassword() : string
Return values
string —from the command-line
readMessage()
Used to read several lines from the terminal up until a last line consisting of just a "."
readMessage() : string
Return values
string —from the command-line
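The "read until a lone period" convention can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: sketchReadMessage is a hypothetical name, and it takes a stream argument (not part of the documented signature) so that it can be exercised without a terminal.

```php
<?php
/**
 * Hypothetical reader: collects lines from $stream until a line that
 * consists of just "." is seen, then returns the collected text.
 */
function sketchReadMessage($stream): string
{
    $message = "";
    while (($line = fgets($stream)) !== false) {
        if (rtrim($line, "\r\n") === ".") {
            break; // terminating line is not part of the message
        }
        $message .= $line;
    }
    return rtrim($message);
}
```

On the command line the stream would typically be STDIN.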
mimeType()
Returns the mime type of the provided file name if it can be determined.
mimeType(string $file_name[, bool $use_extension = false ]) : string
Parameters
- $file_name : string
-
(name of file including path to figure out mime type for)
- $use_extension : bool = false
-
whether to just try to guess from the file extension rather than looking at the file
Return values
string —mime type or unknown if can't be determined
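One way to picture the two lookup modes is the sketch below. It assumes PHP's fileinfo extension for content-based detection and uses a small, hypothetical extension table; sketchMimeType and the table contents are illustrative and not taken from Yioop.

```php
<?php
/**
 * Hypothetical mime type lookup: inspects the file when allowed and
 * possible, otherwise falls back to a guess from the file extension.
 */
function sketchMimeType(string $file_name, bool $use_extension = false): string
{
    // illustrative table only; a real table would be much larger
    $by_extension = [
        "html" => "text/html",
        "txt" => "text/plain",
        "png" => "image/png",
        "pdf" => "application/pdf",
    ];
    if (!$use_extension && is_readable($file_name) &&
        function_exists("mime_content_type")) {
        $type = mime_content_type($file_name);
        if ($type !== false) {
            return $type;
        }
    }
    $extension = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    return $by_extension[$extension] ?? "unknown";
}
```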
generalIsA()
Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the three-parameter version (last parameter true) of PHP's is_a function, which became available in PHP 5.3.9.
generalIsA(mixed $class_1, mixed $class_2) : bool
Parameters
- $class_1 : mixed
-
object or string class name to see if in class2
- $class_2 : mixed
-
object or string class name to see if contains class1
Return values
bool —equal or contains class
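The behavior being compared to can be seen with PHP's built-in is_a and its third parameter set to true. The wrapper below is a hypothetical stand-in (sketchGeneralIsA and the two sample classes are assumptions), shown only to make the object-or-string handling concrete.

```php
<?php
class SketchBase {}
class SketchChild extends SketchBase {}

/**
 * Hypothetical stand-in: accepts objects or class-name strings for both
 * arguments and defers to PHP's three-parameter is_a.
 */
function sketchGeneralIsA($class_1, $class_2)
{
    // is_a needs a class-name string as its second argument
    $target = is_object($class_2) ? get_class($class_2) : $class_2;
    // third argument true lets $class_1 be a class-name string
    return is_a($class_1, $target, true);
}
```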
stripAttributes()
Given the contents of a start XML/HTML tag, strips out all the attributes not listed in $safe_attribute_list
stripAttributes(string $start_tag_contents[, array<string|int, mixed> $safe_attribute_list = [] ]) : string
Parameters
- $start_tag_contents : string
-
the contents of an HTML/XML tag. I.e., if the tag was <tag stuff> then $start_tag_contents could be stuff
- $safe_attribute_list : array<string|int, mixed> = []
-
a list of attributes which should be kept
Return values
string —containing only safe attributes and their values
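A rough sketch of attribute whitelisting, assuming attributes appear as name="value", name='value', or name=value pairs. The function name and the regular expression are illustrative; valueless (boolean) attributes are simply dropped in this simplified version.

```php
<?php
/**
 * Hypothetical whitelist filter for the attribute part of a start tag.
 */
function sketchStripAttributes(string $start_tag_contents,
    array $safe_attribute_list = []): string
{
    $kept = [];
    // capture the attribute name and its (possibly quoted) value
    $pattern = '/([\w:-]+)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s"\']+)/';
    preg_match_all($pattern, $start_tag_contents, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        if (in_array(strtolower($match[1]), $safe_attribute_list)) {
            $kept[] = $match[1] . "=" . $match[2];
        }
    }
    return implode(" ", $kept);
}
```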
parseCsv()
Used to parse into a two dimensional array a string that contains CSV data.
parseCsv(string $csv_string) : array<string|int, mixed>
Parameters
- $csv_string : string
-
string with csv data
Return values
array<string|int, mixed> —two dimensional array of elements from csv
arraytoCsv()
Converts an array of values to a comma separated value formatted string.
arraytoCsv(array<string|int, mixed> $arr) : string
Parameters
- $arr : array<string|int, mixed>
-
values to convert
Return values
string —CSV string after conversion
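The two CSV helpers are inverses of each other for a single row. A minimal sketch using PHP's str_getcsv, with hypothetical function names and the simplifying assumption that quoted fields never contain line breaks:

```php
<?php
/**
 * Hypothetical parser: one inner array per line of CSV text.
 */
function sketchParseCsv(string $csv_string): array
{
    $rows = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($csv_string)) as $line) {
        // separator, enclosure and escape are passed explicitly
        $rows[] = str_getcsv($line, ",", '"', "\\");
    }
    return $rows;
}

/**
 * Hypothetical writer: quotes every value and doubles embedded quotes.
 */
function sketchArrayToCsv(array $arr): string
{
    $fields = [];
    foreach ($arr as $value) {
        $fields[] = '"' . str_replace('"', '""', (string)$value) . '"';
    }
    return implode(",", $fields);
}
```

Quoting every field keeps the writer short at the cost of slightly larger output.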
diff()
Computes a Unix-style diff of two strings. That is, it only outputs lines which differ between the two strings. It outputs +line if a line occurs in the second string but not the first, and -line if a line occurs in the first string but not the second.
diff(string $data1, string $data2[, bool $html = false ]) : string
Parameters
- $data1 : string
-
first string to compare
- $data2 : string
-
second string to compare
- $html : bool = false
-
whether to output html highlighting
Return values
string —representing info about where $data1 and $data2 don't match
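To make the +line/-line output concrete, here is a small line diff built on a longest common subsequence table. It is a sketch under assumptions, not the library's algorithm: sketchDiff is a hypothetical name, there is no HTML highlighting, and no trimming of common leading lines is attempted.

```php
<?php
/**
 * Hypothetical line diff: reports lines only in $data1 as "-line"
 * and lines only in $data2 as "+line"; common lines are omitted.
 */
function sketchDiff(string $data1, string $data2): string
{
    $lines1 = explode("\n", $data1);
    $lines2 = explode("\n", $data2);
    $m = count($lines1);
    $n = count($lines2);
    // $len[$i][$j] = LCS length of $lines1 from $i and $lines2 from $j
    $len = array_fill(0, $m + 1, array_fill(0, $n + 1, 0));
    for ($i = $m - 1; $i >= 0; $i--) {
        for ($j = $n - 1; $j >= 0; $j--) {
            $len[$i][$j] = ($lines1[$i] === $lines2[$j]) ?
                $len[$i + 1][$j + 1] + 1 :
                max($len[$i + 1][$j], $len[$i][$j + 1]);
        }
    }
    $out = [];
    $i = 0;
    $j = 0;
    while ($i < $m && $j < $n) {
        if ($lines1[$i] === $lines2[$j]) {
            $i++; // line is common to both strings
            $j++;
        } elseif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $out[] = "-" . $lines1[$i++];
        } else {
            $out[] = "+" . $lines2[$j++];
        }
    }
    while ($i < $m) {
        $out[] = "-" . $lines1[$i++];
    }
    while ($j < $n) {
        $out[] = "+" . $lines2[$j++];
    }
    return implode("\n", $out);
}
```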
computeLCS()
Computes the longest common subsequence of two arrays
computeLCS(array<string|int, mixed> $lines1, array<string|int, mixed> $lines2, int $offset) : mixed
Parameters
- $lines1 : array<string|int, mixed>
-
an array of lines to compute LCS of
- $lines2 : array<string|int, mixed>
-
an array of lines to compute LCS of
- $offset : int
-
an offset to shift over array addresses in output by
Return values
mixed
extractLCSFromTable()
Given a table of longest common subsequence moves (probably calculated by @see computeLCS) and a starting coordinate $i, $j in that table, extracts a longest common subsequence
extractLCSFromTable(array<string|int, mixed> $lcs_moves, array<string|int, mixed> $lines, int $i, int $j, int $offset, array<string|int, mixed> &$lcs) : mixed
Parameters
- $lcs_moves : array<string|int, mixed>
-
a table of moves computed by computeLCS
- $lines : array<string|int, mixed>
-
from first of the two arrays computing LCS of
- $i : int
-
a line number in string 1
- $j : int
-
a line number in string 2
- $offset : int
-
a number to add to each line number output into $lcs. This is useful if we have trimmed off the initially common lines from our two strings we are trying to compute the LCS of
- $lcs : array<string|int, mixed>
-
an array of triples (index_string1, index_string2, line); the indexes indicate the line number in each string, and line is the line common to the two strings
Return values
mixed
tail()
Returns an array of the last $num_lines many lines out of a file
tail(string $file_name, string $num_lines) : array<string|int, mixed>
Parameters
- $file_name : string
-
name of file to return lines from
- $num_lines : string
-
number of lines to retrieve
Return values
array<string|int, mixed> —retrieved lines
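A minimal sketch of the idea, assuming the file is small enough to read into memory at once. The name sketchTail and the integer type for the line count are assumptions made for the illustration.

```php
<?php
/**
 * Hypothetical tail: returns the last $num_lines lines of a file.
 */
function sketchTail(string $file_name, int $num_lines): array
{
    if ($num_lines <= 0) {
        return [];
    }
    // one array entry per line, with trailing newlines removed
    $lines = file($file_name, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return [];
    }
    return array_slice($lines, -$num_lines);
}
```

For very large log files, seeking backwards from the end of the file would avoid loading everything.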
lineFilter()
Given an array of lines returns a subarray of those lines containing the filter string or filter array
lineFilter(string $lines, mixed $filters[, bool $case_insensitive = true ]) : array<string|int, mixed>
Parameters
- $lines : string
-
to search
- $filters : mixed
-
either string to filter lines with or an array of strings (any of which can be present to pass the filter)
- $case_insensitive : bool = true
-
whether search should be done case insensitively or not.
Return values
array<string|int, mixed> —lines containing the string
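The filtering rule (keep a line if any filter string occurs in it) can be sketched as below. The function name is hypothetical, and the sketch takes the lines as an array, following the description rather than the string type shown in the signature.

```php
<?php
/**
 * Hypothetical filter: keeps lines containing at least one filter string.
 */
function sketchLineFilter(array $lines, $filters,
    bool $case_insensitive = true): array
{
    $filters = (array)$filters; // a single string becomes a one-item list
    $kept = [];
    foreach ($lines as $line) {
        foreach ($filters as $filter) {
            $position = $case_insensitive ?
                stripos($line, $filter) : strpos($line, $filter);
            if ($position !== false) {
                $kept[] = $line;
                break; // one matching filter is enough
            }
        }
    }
    return $kept;
}
```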
logLineTimestamp()
Tries to extract a timestamp from a line which is presumed to come from a Yioop log file
logLineTimestamp(string $line) : int
Parameters
- $line : string
-
to search
Return values
int —timestamp of that log entry
isPositiveInteger()
Returns whether an input can be parsed to a positive integer
isPositiveInteger(mixed $input) : bool
Parameters
- $input : mixed
Return values
bool —whether $input can be parsed to a positive integer.
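One possible way to perform such a check uses PHP's filter_var with a minimum range. This is a sketch with a hypothetical function name; the library may apply different parsing rules.

```php
<?php
/**
 * Hypothetical check: true when $input parses as an integer >= 1.
 */
function sketchIsPositiveInteger($input): bool
{
    return filter_var($input, FILTER_VALIDATE_INT,
        ["options" => ["min_range" => 1]]) !== false;
}
```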
measureCall()
Used to measure the memory footprint in bytes and the time spent calling a method of an object. It also records the number of times the method has been called.
measureCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed
Just calls the method without any recording or timing until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.
Parameters
- $object : object
-
name of object whose method we want to call and measure
- $method : string
-
method we're calling
- $arguments : mixed = []
- $call_name : string = ""
-
name to use when outputting stats for this call, defaults to $method.
Return values
mixed —whatever the method would normally return when called as above
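The kind of bookkeeping described (call count, elapsed time, memory change) can be sketched with microtime and memory_get_usage. In this hypothetical version the statistics live in an array passed by reference instead of a statistics file, and sketchMeasureCall is not the library's function.

```php
<?php
/**
 * Hypothetical wrapper: calls $object->$method(...$arguments) and
 * accumulates call count, seconds and bytes under $call_name.
 */
function sketchMeasureCall(array &$stats, object $object, string $method,
    array $arguments = [], string $call_name = "")
{
    $name = ($call_name === "") ? $method : $call_name;
    $start_time = microtime(true);
    $start_memory = memory_get_usage();
    $result = call_user_func_array([$object, $method], $arguments);
    $stats[$name]["calls"] = ($stats[$name]["calls"] ?? 0) + 1;
    $stats[$name]["time"] = ($stats[$name]["time"] ?? 0.0) +
        (microtime(true) - $start_time);
    $stats[$name]["memory"] = ($stats[$name]["memory"] ?? 0) +
        (memory_get_usage() - $start_memory);
    return $result; // the caller sees the method's own return value
}
```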
measureObject()
Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.
measureObject(object $object[, string $save_file = "" ][, mixed $class_name = "" ]) : mixed
Parameters
- $object : object
-
name of object whose size we want to measure
- $save_file : string = ""
-
statistics file to write info to
- $class_name : mixed = ""
Return values
mixed
measureObjectCall()
General method called by @see measureCall and @see measureObject. Used to measure the memory footprint in bytes of an object, or the memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, it just calls the method without any recording or timing. To initialize, make an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.
measureObjectCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed
Parameters
- $object : object
-
name of object whose method we want to call and measure
- $method : string
-
method we're calling
- $arguments : mixed = []
- $call_name : string = ""
-
name to use when outputting stats for this call, defaults to $method.
Return values
mixed —whatever the method would normally return when called as above
variableClone()
Makes a deep copy of a variable regardless of its type
variableClone(mixed $var) : mixed
Parameters
- $var : mixed
-
variable to deep copy
Return values
mixed —the deep copy
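A deep copy differs from PHP's clone in that nested objects are copied too. A minimal sketch, assuming every object involved is serializable (closures and resources are not); sketchVariableClone is a hypothetical name.

```php
<?php
/**
 * Hypothetical deep copy: objects are round-tripped through
 * serialize/unserialize, arrays are copied element by element.
 */
function sketchVariableClone($var)
{
    if (is_object($var)) {
        return unserialize(serialize($var));
    }
    if (is_array($var)) {
        return array_map('sketchVariableClone', $var);
    }
    return $var; // scalars and null are already copied by value
}
```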
garbageCollect()
Runs various system garbage collection functions and returns number of bytes freed.
garbageCollect() : int
Return values
int —number of bytes freed
utf8SafeSaveHtml()
The DOM method saveHTML has a tendency to replace UTF-8, non-ASCII characters with HTML entities. This function is meant to save the document while avoiding that replacement.
utf8SafeSaveHtml(DOMDocument $dom) : string
What it does is to first save the dom, then it replaces htmlentities of the form &single_char; or &#some_number; with the UTF-8 they correspond to. It leaves all other entities as they are
Parameters
- $dom : DOMDocument
Return values
string —output of saving html
utf8WordWrap()
A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters
utf8WordWrap(string $string[, int $width = 75 ][, string $break = "\n" ][, bool $cut = false ]) : string
Parameters
- $string : string
-
the input string
- $width : int = 75
-
the number of characters at which the string will be wrapped
- $break : string = "\n"
-
string used to break a line into two
- $cut : bool = false
-
whether to always force wrap at $width characters even if word hasn't ended
Return values
string —the given string wrapped at the specified length
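PHP's built-in wordwrap counts bytes, so multi-byte UTF-8 characters make lines wrap too early. The sketch below counts characters with the mbstring extension instead. It is a simplified illustration with a hypothetical name: words are split on single spaces only.

```php
<?php
/**
 * Hypothetical character-aware word wrap (requires mbstring).
 */
function sketchUtf8WordWrap(string $string, int $width = 75,
    string $break = "\n", bool $cut = false): string
{
    $lines = [];
    $current = "";
    foreach (explode(" ", $string) as $word) {
        $candidate = ($current === "") ? $word : $current . " " . $word;
        if (mb_strlen($candidate, "UTF-8") <= $width) {
            $current = $candidate; // word still fits on this line
            continue;
        }
        if ($current !== "") {
            $lines[] = $current;
        }
        $current = $word;
        // with $cut set, split words that are longer than a line
        while ($cut && mb_strlen($current, "UTF-8") > $width) {
            $lines[] = mb_substr($current, 0, $width, "UTF-8");
            $current = mb_substr($current, $width, null, "UTF-8");
        }
    }
    $lines[] = $current;
    return implode($break, $lines);
}
```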
upgradeDatabaseVersion1()
Upgrades a Version 0 version of the Yioop database to a Version 1 version
upgradeDatabaseVersion1(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion2()
Upgrades a Version 1 version of the Yioop database to a Version 2 version
upgradeDatabaseVersion2(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion3()
Upgrades a Version 2 version of the Yioop database to a Version 3 version
upgradeDatabaseVersion3(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion4()
Upgrades a Version 3 version of the Yioop database to a Version 4 version
upgradeDatabaseVersion4(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion5()
Upgrades a Version 4 version of the Yioop database to a Version 5 version
upgradeDatabaseVersion5(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion6()
Upgrades a Version 5 version of the Yioop database to a Version 6 version
upgradeDatabaseVersion6(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion7()
Upgrades a Version 6 version of the Yioop database to a Version 7 version
upgradeDatabaseVersion7(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion8()
Upgrades a Version 7 version of the Yioop database to a Version 8 version
upgradeDatabaseVersion8(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion9()
Upgrades a Version 8 version of the Yioop database to a Version 9 version
upgradeDatabaseVersion9(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion10()
Upgrades a Version 9 version of the Yioop database to a Version 10 version
upgradeDatabaseVersion10(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion11()
Upgrades a Version 10 version of the Yioop database to a Version 11 version
upgradeDatabaseVersion11(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion12()
Upgrades a Version 11 version of the Yioop database to a Version 12 version
upgradeDatabaseVersion12(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion13()
Upgrades a Version 12 version of the Yioop database to a Version 13 version
upgradeDatabaseVersion13(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion14()
Upgrades a Version 13 version of the Yioop database to a Version 14 version
upgradeDatabaseVersion14(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion15()
Upgrades a Version 14 version of the Yioop database to a Version 15 version
upgradeDatabaseVersion15(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion16()
Upgrades a Version 15 version of the Yioop database to a Version 16 version
upgradeDatabaseVersion16(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion17()
Upgrades a Version 16 version of the Yioop database to a Version 17 version
upgradeDatabaseVersion17(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion18()
Upgrades a Version 17 version of the Yioop database to a Version 18 version
upgradeDatabaseVersion18(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion19()
Upgrades a Version 18 version of the Yioop database to a Version 19 version. This update has been superseded by the Version 20 update, and so its contents have been eliminated.
upgradeDatabaseVersion19(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion20()
Upgrades a Version 19 version of the Yioop database to a Version 20 version. This is a major upgrade as the user table has changed. It also acts as a cumulative upgrade since version 0.98. It involves a web form that has only been localized to English
upgradeDatabaseVersion20(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion21()
Upgrades a Version 20 version of the Yioop database to a Version 21 version
upgradeDatabaseVersion21(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion22()
Upgrades a Version 21 version of the Yioop database to a Version 22 version
upgradeDatabaseVersion22(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion23()
Upgrades a Version 22 version of the Yioop database to a Version 23 version
upgradeDatabaseVersion23(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion24()
Upgrades a Version 23 version of the Yioop database to a Version 24 version
upgradeDatabaseVersion24(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion25()
Upgrades a Version 24 version of the Yioop database to a Version 25 version. This version upgrade includes the creation of a Help group that holds help pages.
upgradeDatabaseVersion25(object &$db) : mixed
The Help group is created with GROUP_ID=HELP_GROUP_ID. If a group with GROUP_ID=HELP_GROUP_ID already exists, that group is moved to the end of the GROUPS table (the max group id is used).
Parameters
- $db : object
-
data source to use to upgrade
Return values
mixed
upgradeDatabaseVersion26()
Upgrades a Version 25 version of the Yioop database to a Version 26 version. This version upgrade includes updating the Help pages in the database to work with the changes to the way hyperlinks are specified in wiki markup.
upgradeDatabaseVersion26(object &$db) : mixed
The changes were implemented so that all articles with page names containing %20 also work with '_' and vice versa.
Parameters
- $db : object
-
data source to use to upgrade
Return values
mixed
upgradeDatabaseVersion27()
Upgrades a Version 26 version of the Yioop database to a Version 27 version
upgradeDatabaseVersion27(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion28()
Upgrades a Version 27 version of the Yioop database to a Version 28 version
upgradeDatabaseVersion28(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion29()
Upgrades a Version 28 version of the Yioop database to a Version 29 version
upgradeDatabaseVersion29(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion30()
Upgrades a Version 29 version of the Yioop database to a Version 30 version
upgradeDatabaseVersion30(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion31()
Upgrades a Version 30 version of the Yioop database to a Version 31 version
upgradeDatabaseVersion31(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion32()
Upgrades a Version 31 version of the Yioop database to a Version 32 version
upgradeDatabaseVersion32(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion33()
Upgrades a Version 32 version of the Yioop database to a Version 33 version
upgradeDatabaseVersion33(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion34()
Upgrades a Version 33 version of the Yioop database to a Version 34 version
upgradeDatabaseVersion34(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion35()
Upgrades a Version 34 version of the Yioop database to a Version 35 version
upgradeDatabaseVersion35(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion36()
Upgrades a Version 35 version of the Yioop database to a Version 36 version
upgradeDatabaseVersion36(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion37()
Upgrades a Version 36 version of the Yioop database to a Version 37 version
upgradeDatabaseVersion37(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion38()
Upgrades a Version 37 version of the Yioop database to a Version 38 version
upgradeDatabaseVersion38(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion39()
Upgrades a Version 38 version of the Yioop database to a Version 39 version
upgradeDatabaseVersion39(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion40()
Upgrades a Version 39 version of the Yioop database to a Version 40 version
upgradeDatabaseVersion40(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion41()
Upgrades a Version 40 version of the Yioop database to a Version 41 version
upgradeDatabaseVersion41(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion42()
Upgrades a Version 41 version of the Yioop database to a Version 42 version
upgradeDatabaseVersion42(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion43()
Upgrades a Version 42 version of the Yioop database to a Version 43 version
upgradeDatabaseVersion43(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion44()
Upgrades a Version 43 version of the Yioop database to a Version 44 version
upgradeDatabaseVersion44(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion45()
Upgrades a Version 44 version of the Yioop database to a Version 45 version
upgradeDatabaseVersion45(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion46()
Upgrades a Version 45 version of the Yioop database to a Version 46 version
upgradeDatabaseVersion46(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion47()
Upgrades a Version 46 version of the Yioop database to a Version 47 version
upgradeDatabaseVersion47(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion48()
Upgrades a Version 47 version of the Yioop database to a Version 48 version
upgradeDatabaseVersion48(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion49()
Upgrades a Version 48 version of the Yioop database to a Version 49 version
upgradeDatabaseVersion49(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion50()
Upgrades a Version 49 version of the Yioop database to a Version 50 version
upgradeDatabaseVersion50(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion51()
Upgrades a Version 50 version of the Yioop database to a Version 51 version
upgradeDatabaseVersion51(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion52()
Upgrades a Version 51 version of the Yioop database to a Version 52 version
upgradeDatabaseVersion52(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion53()
Upgrades a Version 52 version of the Yioop database to a Version 53 version
upgradeDatabaseVersion53(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion54()
Upgrades a Version 53 version of the Yioop database to a Version 54 version
upgradeDatabaseVersion54(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion55()
Upgrades a Version 54 version of the Yioop database to a Version 55 version
upgradeDatabaseVersion55(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion57()
Upgrades a Version 56 version of the Yioop database to a Version 57 version
upgradeDatabaseVersion57(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion58()
Upgrades a Version 57 version of the Yioop database to a Version 58 version
upgradeDatabaseVersion58(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion59()
Upgrades a Version 58 version of the Yioop database to a Version 59 version
upgradeDatabaseVersion59(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion60()
Upgrades a Version 59 version of the Yioop database to a Version 60 version
upgradeDatabaseVersion60(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion61()
Upgrades a Version 60 version of the Yioop database to a Version 61 version
upgradeDatabaseVersion61(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion62()
Upgrades a Version 61 version of the Yioop database to a Version 62 version
upgradeDatabaseVersion62(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion64()
Upgrades a Version 63 version of the Yioop database to a Version 64 version
upgradeDatabaseVersion64(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion65()
Upgrades a Version 64 version of the Yioop database to a Version 65 version
upgradeDatabaseVersion65(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion66()
Upgrades a Version 65 version of the Yioop database to a Version 66 version
upgradeDatabaseVersion66(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion67()
Upgrades a Version 66 version of the Yioop database to a Version 67 version
upgradeDatabaseVersion67(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion68()
Upgrades a Version 67 version of the Yioop database to a Version 68 version
upgradeDatabaseVersion68(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion69()
Upgrades a Version 68 version of the Yioop database to a Version 69 version
upgradeDatabaseVersion69(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion70()
Upgrades a Version 69 version of the Yioop database to a Version 70 version
upgradeDatabaseVersion70(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade.
Return values
mixed
upgradeDatabaseVersion71()
Upgrades a Version 70 version of the Yioop database to a Version 71 version
upgradeDatabaseVersion71(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion72()
Upgrades a Version 71 version of the Yioop database to a Version 72 version
upgradeDatabaseVersion72(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion73()
Upgrades a Version 72 version of the Yioop database to a Version 73 version
upgradeDatabaseVersion73(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion74()
Upgrades a Version 73 version of the Yioop database to a Version 74 version
upgradeDatabaseVersion74(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion75()
Upgrades a Version 74 version of the Yioop database to a Version 75 version
upgradeDatabaseVersion75(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion76()
Upgrades a Version 75 version of the Yioop database to a Version 76 version
upgradeDatabaseVersion76(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion77()
Upgrades a Version 76 version of the Yioop database to a Version 77 version
upgradeDatabaseVersion77(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion78()
Upgrades a Version 77 version of the Yioop database to a Version 78 version
upgradeDatabaseVersion78(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion79()
Upgrades a Version 78 version of the Yioop database to a Version 79 version
upgradeDatabaseVersion79(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion80()
Upgrades a Version 79 version of the Yioop database to a Version 80 version
upgradeDatabaseVersion80(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
upgradeDatabaseVersion81()
Upgrades a Version 80 version of the Yioop database to a Version 81 version
upgradeDatabaseVersion81(object &$db) : mixed
Parameters
- $db : object
-
datasource to use to upgrade
Return values
mixed
webExit()
Function to call instead of exit() to indicate that the script processing the current web page is done processing. Use this rather than exit(), as exit() will also terminate WebSite.
webExit([string $err_msg = "" ]) : mixed
Parameters
- $err_msg : string = ""
-
error message to send on exiting
Return values
mixed
makeTableCallback()
Callback used by a preg_replace_callback in nextPage to make a table
makeTableCallback(array<string|int, mixed> $matches) : mixed
Parameters
- $matches : array<string|int, mixed>
-
of table cells
Return values
mixed
citeCallback()
Used to convert {{cite }} to a numbered link to a citation
citeCallback(array<string|int, mixed> $matches[, int $init = -1 ]) : string
Parameters
- $matches : array<string|int, mixed>
-
from regular expression to check for {{cite }}
- $init : int = -1
-
used to initialize counter for citations
Return values
string —a HTML link to citation in current document
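A callback that numbers citations needs a counter that survives between invocations. The sketch below shows one way to do that with a static variable and an initializing call. The function name, the regular expression, and the link markup are hypothetical and chosen only for illustration.

```php
<?php
/**
 * Hypothetical citation callback: each match becomes a numbered link.
 * Calling it with $init other than -1 resets the counter.
 */
function sketchCiteCallback(array $matches, int $init = -1): string
{
    static $counter = 1;
    if ($init != -1) {
        $counter = $init; // initializing call, produces no output
        return "";
    }
    $link = "<sup><a href=\"#cite_$counter\">[$counter]</a></sup>";
    $counter++;
    return $link;
}

sketchCiteCallback([], 1); // start numbering at 1
$out = preg_replace_callback('/\{\{cite[^}]*\}\}/', 'sketchCiteCallback',
    "A{{cite x}} B{{cite y}}");
```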
fixLinksCallback()
Used to change spaces to underscores in links generated from our earlier matching rules
fixLinksCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
from regular expression to check for links
Return values
string —result of correcting link
base64EncodeCallback()
Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.
base64EncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
$matches[1] should contain the contents of a nowiki tag
Return values
string —base 64 encoded contents surrounded by an escaped nowiki tag.
spaceEncodeCallback()
Callback used to encode the contents of pre tags so they won't accidentally get sub-pre tags because a bunch of leading lines have spaces
spaceEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
$matches[1] should contain the contents of a pre tag
Return values
string —encoded contents surrounded by an escaped pre tag.
spanEncodeCallback()
Callback used to encode the contents of span tags so that newlines within them don't accidentally get treated as new wiki paragraphs
spanEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
$matches[1] should contain the contents of a span tag
Return values
string —encoded contents surrounded by an escaped span tag.
base64DecodeCallback()
Callback used to base64 decode the contents of previously base64 encoded (@see base64EncodeCallback) nowiki tags after all mediawiki substitutions have been done
base64DecodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
$matches[1] should contain the contents of a nowiki tag
Return values
string —base 64 decoded, entity decoded contents.
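The encode and decode callbacks work as a pair: nowiki contents are hidden before the wiki rules run and restored afterwards. The following sketch shows that round trip under simplifying assumptions. The function names, the regular expressions, and the single stand-in bold rule are all hypothetical, and the entity decoding mentioned above is omitted.

```php
<?php
// Hypothetical sketch, not the Yioop implementation.
function nowikiEncodeSketch(array $matches): string
{
    // base64 output contains no wiki markup characters
    return "&lt;nowiki&gt;" . base64_encode($matches[1]) .
        "&lt;/nowiki&gt;";
}

function nowikiDecodeSketch(array $matches): string
{
    // restore the original text once all substitutions are done
    return base64_decode($matches[1]);
}

$page = "'''bold''' <nowiki>'''not bold'''</nowiki>";
// 1. protect nowiki contents
$page = preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/s',
    'nowikiEncodeSketch', $page);
// 2. stand-in for the wiki substitution rules (bold only)
$page = preg_replace("/'''(.*?)'''/", '<b>$1</b>', $page);
// 3. restore nowiki contents
$page = preg_replace_callback('/&lt;nowiki&gt;(.*?)&lt;\/nowiki&gt;/s',
    'nowikiDecodeSketch', $page);
echo $page, "\n"; // <b>bold</b> '''not bold'''
```

Because the protected text is base64 encoded during step 2, the bold rule cannot match inside it.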
spaceDecodeCallback()
Cleans up pre tags after the other wiki rules have been applied
spaceDecodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
-
$matches[1] should contain the contents of a pre tag
Return values
string —cleaned contents surrounded by a pre-formatted tag.
lessThanLocale()
Compares two locale arrays by locale tag so that a list of locales can be sorted
lessThanLocale(array<string|int, mixed> $a, array<string|int, mixed> $b) : int
Parameters
- $a : array<string|int, mixed>
-
an associative array of locale info
- $b : array<string|int, mixed>
-
an associative array of locale info
Return values
int —-1, 0, or 1 depending on which locale tag is alphabetically smaller, or 0 if they are the same
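A comparison function of this shape is typically passed to usort(). The sketch below assumes each locale array stores its tag under a 'LOCALE_TAG' key; that key name, the sample data, and the function name lessThanLocaleSketch are assumptions for illustration only.

```php
<?php
// Hypothetical sketch, not the Yioop implementation.
function lessThanLocaleSketch(array $a, array $b): int
{
    // strcmp may return any negative or positive int, so
    // normalize its result to -1, 0, or 1
    return strcmp($a['LOCALE_TAG'], $b['LOCALE_TAG']) <=> 0;
}

$locales = [
    ['LOCALE_TAG' => 'fr-FR', 'LOCALE_NAME' => 'French'],
    ['LOCALE_TAG' => 'en-US', 'LOCALE_NAME' => 'English'],
    ['LOCALE_TAG' => 'de', 'LOCALE_NAME' => 'German'],
];
usort($locales, 'lessThanLocaleSketch');
$tags = array_column($locales, 'LOCALE_TAG');
echo implode(", ", $tags), "\n"; // de, en-US, fr-FR
```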
tl()
Translate the supplied arguments into the current locale.
tl() : string
This function is a convenience copy of the same function
Return values
string —translated string
e()
shorthand for echo
e(string $text) : mixed
Parameters
- $text : string
-
string to send to the current output
Return values
mixed