
in package
implements CrawlConstants


Chris Pollett

Interfaces, Classes, Traits and Enums

Shared constants and enums used by components that are involved in the crawling process

Table of Contents

getArchiveKind()  : string
Given a folder name, determines the kind of bundle (if any) it holds.
run()  : mixed
The main code for the dictionary updater. Updates the dictionary for the IndexDocumentBundle at $bundle_path running on channel $channel, from its current next_partition to process up to the current save partition. Partitions are groups of documents that have been downloaded, but whose words have not necessarily been added to the dictionary for the bundle.



Given a folder name, determines the kind of bundle (if any) it holds.

public static getArchiveKind(string $archive_path) : string

It does this based on the expected location of the description.txt file, or of arc_description.ini in the case of a non-Yioop archive.

$archive_path : string

the path to the archive folder

Return values

the archive bundle type, either: WebArchiveBundle or IndexArchiveBundle
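The detection idea described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name `guessArchiveKind`, the assumption that both files sit at the top level of the folder, and the returned labels are all hypothetical placeholders chosen for illustration.

```php
<?php
/**
 * Hypothetical helper (not the library's code): guesses what kind of
 * archive a folder holds from which description file it contains.
 */
function guessArchiveKind(string $archive_path): string
{
    // Assumption: a Yioop bundle keeps description.txt at the top level
    if (file_exists($archive_path . "/description.txt")) {
        return "yioop-bundle"; // placeholder label, not a real return value
    }
    // Assumption: a non-Yioop archive is described by arc_description.ini
    if (file_exists($archive_path . "/arc_description.ini")) {
        return "non-yioop-archive"; // placeholder label
    }
    // Neither file was found, so the folder holds no recognizable bundle
    return "unknown";
}
```

The real method may look for these files in other locations and returns bundle type names such as those listed under Return values; the sketch only shows the file-presence check.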


The main code for the dictionary updater. Updates the dictionary for the IndexDocumentBundle at $bundle_path running on channel $channel, from its current next_partition to process up to the current save partition. Partitions are groups of documents that have been downloaded, but whose words have not necessarily been added to the dictionary for the bundle.

public static run(int $channel, string $bundle_path) : mixed
$channel : int

the channel the crawl is running on. Used in naming lock files

$bundle_path : string

the path to the IndexDocumentBundle or FeedDocumentBundle we are adding dictionary info for
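
The catch-up behaviour described for this method, working from the next partition to process up to the save partition, might be sketched as follows. The function `catchUpPartitions` and its callback are hypothetical, and the sketch assumes the save partition itself is excluded because it may still be receiving documents; the source does not say whether that bound is inclusive.

```php
<?php
/**
 * Hypothetical sketch (not the library's code): visits every partition
 * from $next_partition up to, but not including, $save_partition.
 *
 * @return int the partition index to resume from on the next call
 */
function catchUpPartitions(int $next_partition, int $save_partition,
    callable $add_partition): int
{
    // Assumption: the save partition is still being written, so stop before it
    while ($next_partition < $save_partition) {
        $add_partition($next_partition); // e.g. add this partition's words
        $next_partition++;
    }
    return $next_partition;
}

// Usage: record which partitions would be processed for next = 3, save = 6
$processed = [];
$resume_at = catchUpPartitions(3, 6, function (int $i) use (&$processed) {
    $processed[] = $i;
});
```

Returning the resume index lets a caller persist its progress, so a later invocation only handles partitions saved since the last run.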

Return values

