WikiParser.php
SeekQuarry/Yioop -- Open Source Pure PHP Search Engine, Crawler, and Indexer
Copyright (C) 2009 - 2023 Chris Pollett chris@pollett.org
LICENSE:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
END LICENSE
Interfaces, Classes, Traits and Enums
- WikiParser
- Class with methods to parse MediaWiki documents, both within Yioop and when Yioop indexes MediaWiki dumps such as those from Wikipedia.
Table of Contents
- makeTableCallback() : mixed
- Callback used by a preg_replace_callback in nextPage to make a table
- citeCallback() : string
- Used to convert {{cite }} to a numbered link to a citation
- fixLinksCallback() : string
- Used to change spaces to underscores in links generated from our earlier matching rules
- base64EncodeCallback() : string
- Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.
- spaceEncodeCallback() : string
- Callback used to encode the contents of pre tags so they won't accidentally get nested pre tags when several of their lines begin with spaces
- spanEncodeCallback() : string
- Callback used to encode the contents of span tags so that newlines within them don't accidentally get treated as new wiki paragraphs
- base64DecodeCallback() : string
- Callback used to base64 decode the contents of nowiki tags previously encoded by base64EncodeCallback, after all MediaWiki substitutions have been done
- spaceDecodeCallback() : string
- Cleans up pre tags after the other wiki rules have been applied
Functions
makeTableCallback()
Callback used by a preg_replace_callback in nextPage to make a table
makeTableCallback(array<string|int, mixed> $matches) : mixed
Parameters
- $matches : array<string|int, mixed>
  array of table cells
Return values
mixed
citeCallback()
Used to convert {{cite }} to a numbered link to a citation
citeCallback(array<string|int, mixed> $matches[, int $init = -1 ]) : string
Parameters
- $matches : array<string|int, mixed>
  matches from the regular expression that checks for {{cite }}
- $init : int = -1
  used to initialize the counter for citations
Return values
string: an HTML link to the citation in the current document
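The counter behaviour described above can be sketched as follows. This is a minimal illustration, not Yioop's implementation: the function name `sketchCiteCallback`, the `#ref_N` anchor format, and the regular expression are all assumptions made for the example. It only shows how an `$init` argument can reset a static counter while ordinary calls number citations in order.

```php
<?php
/**
 * Hypothetical stand-in for a citation callback (not Yioop's code).
 * Calling it with $init >= 0 resets the counter and returns "";
 * otherwise each call emits the next numbered citation link.
 */
function sketchCiteCallback(array $matches, int $init = -1): string
{
    static $ref_count = 1;
    if ($init >= 0) {
        $ref_count = $init;
        return "";
    }
    // Assumed link format: a superscript anchor pointing at #ref_N
    $out = '<sup>[<a href="#ref_' . $ref_count . '">' . $ref_count .
        '</a>]</sup>';
    $ref_count++;
    return $out;
}

// Reset numbering before parsing a page
sketchCiteCallback([], 1);
$page = "Claim one{{cite web|title=A}} and claim two{{cite book|title=B}}.";
// preg_replace_callback passes only $matches, so $init keeps its default
$html = preg_replace_callback('/\{\{cite\s[^}]*\}\}/i',
    'sketchCiteCallback', $page);
echo $html, "\n";
```

Keeping the counter in a static variable lets the same callback be reused across pages, provided the caller resets it first.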
fixLinksCallback()
Used to change spaces to underscores in links generated from our earlier matching rules
fixLinksCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  matches from the regular expression that checks for links
Return values
string: result of correcting the link
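As a rough illustration of the space-to-underscore rewrite, the sketch below applies a callback to `href` attributes. The function name `sketchFixLinksCallback`, the pattern, and the assumption that `$matches[1]` holds the link target are all hypothetical; the actual matching rules used by WikiParser are not shown in this documentation.

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): rewrites spaces in a captured
 * href value as underscores.
 */
function sketchFixLinksCallback(array $matches): string
{
    // $matches[1] is assumed to be the href value captured by the pattern
    return 'href="' . str_replace(" ", "_", $matches[1]) . '"';
}

$html = '<a href="Main Page/Getting Started">Getting Started</a>';
$fixed = preg_replace_callback('/href="([^"]*)"/',
    'sketchFixLinksCallback', $html);
echo $fixed, "\n";
```

Only the link target changes; the visible link text keeps its spaces.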
base64EncodeCallback()
Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.
base64EncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  $matches[1] should contain the contents of a nowiki tag
Return values
string: base64-encoded contents surrounded by an escaped nowiki tag.
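A minimal sketch of the protection step, assuming the nowiki contents arrive in `$matches[1]` and that "escaped" means the tag's angle brackets are written as `&lt;` and `&gt;`. The function name `sketchBase64EncodeCallback` and the pattern are illustrative only. Because the base64 alphabet contains no quote, bracket, or brace characters, later wiki substitutions have nothing to match inside the encoded span.

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): hides nowiki contents from
 * later wiki replacements by base64 encoding them.
 */
function sketchBase64EncodeCallback(array $matches): string
{
    return "&lt;nowiki&gt;" . base64_encode($matches[1]) .
        "&lt;/nowiki&gt;";
}

$wiki = "Bold is written <nowiki>'''like this'''</nowiki>.";
$protected = preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/s',
    'sketchBase64EncodeCallback', $wiki);
echo $protected, "\n";
```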
spaceEncodeCallback()
Callback used to encode the contents of pre tags so they won't accidentally get nested pre tags when several of their lines begin with spaces
spaceEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  $matches[1] should contain the contents of a pre tag
Return values
string: encoded contents surrounded by an escaped pre tag.
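The documentation does not say how the contents are encoded, so the following is only one possible scheme: every space is swapped for a placeholder token, so that no line inside the block starts with a space and triggers the wiki rule that turns space-indented lines into preformatted text. The token `@@SP@@`, the function name `sketchSpaceEncodeCallback`, and the pattern are all assumptions.

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): replaces spaces inside a pre
 * block with a placeholder token so later rules see no leading spaces.
 */
function sketchSpaceEncodeCallback(array $matches): string
{
    // "@@SP@@" is an invented placeholder, chosen only for this example
    $out = str_replace(" ", "@@SP@@", $matches[1]);
    return "&lt;pre&gt;" . $out . "&lt;/pre&gt;";
}

$wiki = "<pre>\n  if (x) {\n    y();\n  }\n</pre>";
$encoded = preg_replace_callback('/<pre>(.*?)<\/pre>/s',
    'sketchSpaceEncodeCallback', $wiki);
echo $encoded, "\n";
```

A matching decode step would swap the token back to a space once the other wiki rules have run.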
spanEncodeCallback()
Callback used to encode the contents of span tags so that newlines within them don't accidentally get treated as new wiki paragraphs
spanEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  $matches[1] should contain the contents of a span tag
Return values
string: encoded contents surrounded by an escaped span tag.
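One way to keep newlines inside a span from being read as paragraph breaks is to replace them with a character reference before the paragraph rules run. The sketch below does this with `&#10;`; that choice, the function name `sketchSpanEncodeCallback`, the escaped span wrapper, and the attribute-free pattern are assumptions for illustration, not the documented behaviour.

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): removes literal newlines from
 * span contents by writing them as the character reference &#10;.
 */
function sketchSpanEncodeCallback(array $matches): string
{
    $out = str_replace(["\r\n", "\n"], "&#10;", $matches[1]);
    return "&lt;span&gt;" . $out . "&lt;/span&gt;";
}

$wiki = "Intro <span>first line\n\nsecond line</span> outro";
$encoded = preg_replace_callback('/<span>(.*?)<\/span>/s',
    'sketchSpanEncodeCallback', $wiki);
echo $encoded, "\n";
```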
base64DecodeCallback()
Callback used to base64 decode the contents of nowiki tags previously encoded by base64EncodeCallback, after all MediaWiki substitutions have been done
base64DecodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  $matches[1] should contain the base64-encoded contents of a nowiki tag
Return values
string: base64-decoded, entity-decoded contents.
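The restore step can be sketched as the inverse of the encoding. This example assumes the protected text is wrapped in an escaped nowiki tag and reads "entity decoded" as a call to PHP's `html_entity_decode`; the function name `sketchBase64DecodeCallback` and the pattern are hypothetical.

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): restores nowiki contents that
 * were base64 encoded before the wiki substitutions ran.
 */
function sketchBase64DecodeCallback(array $matches): string
{
    // Assumption: entities in the original text are decoded on the way out
    return html_entity_decode(base64_decode($matches[1]));
}

// Simulate text that was protected earlier in the pipeline
$encoded = "&lt;nowiki&gt;" .
    base64_encode("'''not bold''' &amp; &lt;b&gt;") . "&lt;/nowiki&gt;";
$restored = preg_replace_callback(
    '/&lt;nowiki&gt;(.*?)&lt;\/nowiki&gt;/s',
    'sketchBase64DecodeCallback', "See " . $encoded . " here.");
echo $restored, "\n";
```

The wiki markup inside the tag survives untouched because it was never visible to the substitution rules.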
spaceDecodeCallback()
Cleans up pre tags after the other wiki rules have been applied
spaceDecodeCallback(array<string|int, mixed> $matches) : string
Parameters
- $matches : array<string|int, mixed>
  $matches[1] should contain the contents of a pre tag
Return values
string: cleaned contents surrounded by a pre-formatted tag.