Yioop_V9.5_Source_Code_Documentation

DictionaryUpdater
in package
implements CrawlConstants

Tags
author

Chris Pollett

Interfaces, Classes, Traits and Enums

CrawlConstants
Shared constants and enums used by components that are involved in the crawling process

Table of Contents

getArchiveKind()  : string
Given a folder name, determines the kind of bundle (if any) it holds.
run()  : mixed
The main code for the dictionary updater. Updates the dictionary for the IndexDocumentBundle at $bundle_path, running on channel $channel, from its current next_partition to process up to the current save partition. Partitions are groups of documents that have been downloaded, but whose words have not necessarily been added to the dictionary for the bundle.

Methods

getArchiveKind()

Given a folder name, determines the kind of bundle (if any) it holds.

public static getArchiveKind(string $archive_path) : string

It does this based on the expected location of the description.txt file, or of arc_description.ini (in the case of a non-Yioop archive).

Parameters
$archive_path : string

the path to the archive folder

Return values
string

the archive bundle type, either: WebArchiveBundle or IndexArchiveBundle
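
As an illustration of the file-location check described above, here is a minimal sketch. It is not Yioop's implementation: the function name sketchArchiveKind, the $markers map, and the relative marker paths in it are all assumptions made for this example, and the empty-string fallback is likewise hypothetical.

```php
<?php
/**
 * Hypothetical helper, not Yioop's code: returns the kind associated with
 * the first marker file found under $archive_path, or "" if none exists.
 */
function sketchArchiveKind(string $archive_path, array $markers): string
{
    foreach ($markers as $relative_path => $kind) {
        if (file_exists($archive_path . "/" . $relative_path)) {
            return $kind;
        }
    }
    return "";
}

// Assumed marker locations, chosen only for illustration.
$markers = [
    "description.txt" => "WebArchiveBundle",
    "summaries/description.txt" => "IndexArchiveBundle",
];

// Build a throwaway folder that matches the first marker.
$folder = sys_get_temp_dir() . "/archive_kind_demo_" . uniqid();
mkdir($folder);
file_put_contents($folder . "/description.txt", "demo");

$kind = sketchArchiveKind($folder, $markers);
echo $kind, "\n";

// Remove the throwaway folder again.
unlink($folder . "/description.txt");
rmdir($folder);
```

Keeping the marker paths in a map means the lookup order and locations can be changed without touching the function; the real method may well hard-code them.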

run()

The main code for the dictionary updater. Updates the dictionary for the IndexDocumentBundle at $bundle_path, running on channel $channel, from its current next_partition to process up to the current save partition. Partitions are groups of documents that have been downloaded, but whose words have not necessarily been added to the dictionary for the bundle.

public static run(int $channel, string $bundle_path) : mixed
Parameters
$channel : int

the channel the crawl is running on. Used in naming lock files

$bundle_path : string

the path to the IndexDocumentBundle or FeedDocumentBundle we are adding dictionary info for

Return values
mixed
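
The catch-up behaviour described for run() can be pictured with the following minimal sketch. It is not Yioop's code: the function sketchCatchUp and its callback are hypothetical, and it assumes the save partition itself is excluded from processing, which the documentation above does not state.

```php
<?php
/**
 * Hypothetical sketch, not Yioop's code: calls $add_partition on each
 * partition from $next_partition up to, but not including,
 * $save_partition (an assumption), and returns the new next partition.
 */
function sketchCatchUp(int $next_partition, int $save_partition,
    callable $add_partition): int
{
    while ($next_partition < $save_partition) {
        $add_partition($next_partition);
        $next_partition++;
    }
    return $next_partition;
}

$processed = [];
$next = sketchCatchUp(3, 6, function (int $partition) use (&$processed) {
    // Stand-in for adding the partition's words to the dictionary.
    $processed[] = $partition;
});
echo implode(",", $processed), " next=", $next, "\n";
```

With a next partition of 3 and a save partition of 6, the sketch visits partitions 3, 4, and 5 and reports 6 as the next partition to process.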

        
