Yioop_V9.5_Source_Code_Documentation

archive_bundle_iterators

Interfaces, Classes, Traits and Enums

ArcArchiveBundleIterator
Used to iterate through the records of a collection of arc files stored in a WebArchiveBundle folder. Arc is the file format of the Internet Archive (http://www.archive.org/web/researcher/ArcFileFormat.php). Iteration would be for the purpose of making an index of these records.
ArchiveBundleIterator
Abstract class used to model iterating documents indexed in a WebArchiveBundle or set of such bundles.
DatabaseBundleIterator
Used to iterate through the records that result from an SQL query to a database.
MediaWikiArchiveBundleIterator
Used to iterate through a collection of .xml.bz2 MediaWiki files stored in a WebArchiveBundle folder. These MediaWiki files contain the kinds of documents used by Wikipedia. Iteration would be for the purpose of making an index of these records.
MixArchiveBundleIterator
Used to do an archive crawl based on the results of a crawl mix.
OdpRdfArchiveBundleIterator
Used to iterate through the records of a collection of one or more Open Directory RDF files stored in a WebArchiveBundle folder. Open Directory files can be found at http://rdf.dmoz.org/ . Iteration would be for the purpose of making an index of these records.
TextArchiveBundleIterator
Used to iterate through the records of a collection of text or compressed text-oriented records.
WarcArchiveBundleIterator
Used to iterate through the records of a collection of warc files stored in a WebArchiveBundle folder. Warc is the newer file format used by the Internet Archive and others for digital preservation (see http://www.digitalpreservation.gov/formats/fdd/fdd000236.shtml and http://archive-access.sourceforge.net/warc/). Iteration is done for the purpose of making an index of these records.
WebArchiveBundleIterator
Class used to model iterating documents indexed in a WebArchiveBundle. This would typically be for the purpose of re-indexing these documents.
