Yioop_V9.5_Source_Code_Documentation

PriorityQueue extends StringArray
in package
implements CrawlConstants

Code used to manage a memory efficient priority queue.

Weights for the queue must be flaots. The queue itself is implemented using heaps

Tags
author

Chris Pollett

Interfaces, Classes, Traits and Enums

CrawlConstants
Shared constants and enums used by components that are involved in the crawling process

Table of Contents

DEFAULT_SAVE_FREQUENCY  = 50000
If not specified in the constructor, this will be the number of operations between saves
$adjust_weight_callback  : string
Name of a function to be called if an element in the priority has moved because its weight was adjusted this allows other data structures the ability to know where the element has moved to in the priority queue
$count  : int
Number of items that are currently stored in the queue
$data_size  : int
Size of each item in bytes to be stored
$filename  : string
Name of the file in which to store the PersistentStructure
$min_or_max  : string
When the polling the queue returns the least or most weighted value
$notifier  : object
An object that implements the Notifier interface (for instance, WebQueueArchive)
$num_values  : int
Number of values that can be stored in the priority queue
$save_frequency  : int
Number of operation between saves. If == -1 never save using checkSave
$string_array  : string
Character string used to store the packed data of the StringArray
$string_array_size  : int
Number of bytes of storage need by the string array
$unsaved_operations  : int
Number of operations since the last save
$value_size  : int
Number of bytes needed to store a value associated with a weight
$weight_size  : int
Number of bytes needed to store a weight in the queue
__construct()  : mixed
Makes a priority queue (implemented as an array heap) with the given operating parameters
adjustWeight()  : mixed
Add $delta to the $ith element in the priority queue and then adjusts the queue to store the heap property
checkSave()  : mixed
Add one to the unsaved_operations count. If this goes above the save_frquency then save the PersistentStructure to secondary storage
compare()  : int
Computes the difference of the two values $value1 and $value2
get()  : string
Looks up the ith item in the StringArray
getContents()  : array<string|int, mixed>
Return the contents of the priority queue as an array of value weight pairs.
getRow()  : array<string|int, mixed>
Gets the ith element of the PriorityQueue viewed as an array
insert()  : mixed
Inserts a new item into the priority queue.
load()  : object
Load a PersistentStructure from a file
normalize()  : mixed
Scales the weights of elements in the queue so that the sum fo the new weights is $new_total
peek()  : mixed
Gets the data stored at the ith location in the priority queue
percolateDown()  : mixed
If the ith element in the PriorityQueue violates the heap property with some child node (children should be of lower priority than the parent), this function tries modify the heap to restore the heap property.
percolateUp()  : int
If the $ith element in the PriorityQueue violates the heap property with its parent node (children should be of lower priority than the parent), this function tries modify the heap to restore the heap property.
poll()  : mixed
Removes and returns the ith element out of the Priority queue.
printContents()  : mixed
Pretty prints the contents of the queue viewed as an array.
put()  : mixed
Puts data into the ith item of the StringArray
putRow()  : mixed
Add data to the $i row of the priority queue viewed as an array Calls the notifier associated with this queue about the change in data's location
save()  : mixed
Save the PersistentStructure to its filename This method is generic but super memory inefficient, so reimplement for subclasses is needed
totalWeight()  : int
Computes and returns the weight of all items in priority queue

Constants

DEFAULT_SAVE_FREQUENCY

If not specified in the constructor, this will be the number of operations between saves

public int DEFAULT_SAVE_FREQUENCY = 50000

Properties

$adjust_weight_callback

Name of a function to be called if an element in the priority has moved because its weight was adjusted this allows other data structures the ability to know where the element has moved to in the priority queue

public string $adjust_weight_callback

$count

Number of items that are currently stored in the queue

public int $count

$data_size

Size of each item in bytes to be stored

public int $data_size

$filename

Name of the file in which to store the PersistentStructure

public string $filename

$min_or_max

When the polling the queue returns the least or most weighted value

public string $min_or_max

$notifier

An object that implements the Notifier interface (for instance, WebQueueArchive)

public object $notifier

$num_values

Number of values that can be stored in the priority queue

public int $num_values

$save_frequency

Number of operation between saves. If == -1 never save using checkSave

public int $save_frequency

$string_array

Character string used to store the packed data of the StringArray

public string $string_array

$string_array_size

Number of bytes of storage need by the string array

public int $string_array_size

$unsaved_operations

Number of operations since the last save

public int $unsaved_operations

$value_size

Number of bytes needed to store a value associated with a weight

public int $value_size

$weight_size

Number of bytes needed to store a weight in the queue

public int $weight_size = 4

Methods

__construct()

Makes a priority queue (implemented as an array heap) with the given operating parameters

public __construct(string $fname, int $num_values, int $value_size, string $min_or_max[, object $notifier = null ][, int $save_frequency = self::DEFAULT_SAVE_FREQUENCY ][, string $adjust_weight_callback = null ]) : mixed
Parameters
$fname : string

filename to store the data associated with the queue

$num_values : int

number of values the queue can hold

$value_size : int

the size in a bytes of a value

$min_or_max : string

whether this priority queue return least or most weight values when polled

$notifier : object = null

object to call when a value changes in the queue

$save_frequency : int = self::DEFAULT_SAVE_FREQUENCY

how often the data in the queue should be save to disk. (It's default location is RAM)

$adjust_weight_callback : string = null

name of a function to be called if an element in the priority has moved because its weight was adjusted this allows other data structures the ability to know where the element has moved to in the priority queue

Return values
mixed

adjustWeight()

Add $delta to the $ith element in the priority queue and then adjusts the queue to store the heap property

public adjustWeight(int $i, int $delta) : mixed
Parameters
$i : int

element whose weight should be adjusted

$delta : int

how much to change the weight by

Return values
mixed

checkSave()

Add one to the unsaved_operations count. If this goes above the save_frquency then save the PersistentStructure to secondary storage

public checkSave() : mixed
Return values
mixed

compare()

Computes the difference of the two values $value1 and $value2

public compare(int $value1, int $value2) : int

Which is subtracted from which is determined by whether this is a min_or_max priority queue

Parameters
$value1 : int

a value to take the difference between

$value2 : int

the other value

Return values
int

the differences

get()

Looks up the ith item in the StringArray

public get(int $i) : string
Parameters
$i : int

array index of item to look up

Return values
string

the looked-up item of length $this->data_size

getContents()

Return the contents of the priority queue as an array of value weight pairs.

public getContents() : array<string|int, mixed>
Return values
array<string|int, mixed>

contents of the queue

getRow()

Gets the ith element of the PriorityQueue viewed as an array

public getRow(int $i) : array<string|int, mixed>
Parameters
$i : int

element to get

Return values
array<string|int, mixed>

value stored in queue together with its weight as a two element array

insert()

Inserts a new item into the priority queue.

public insert(string $data, int $weight) : mixed
Parameters
$data : string

what to insert into the queue

$weight : int

how much the new data should be weighted

Return values
mixed

index location in queue where item was stored if successful, otherwise false.

load()

Load a PersistentStructure from a file

public static load(string $fname) : object
Parameters
$fname : string

the name of the file to load the PersistentStructure from

Return values
object

the PersistentStructure loaded

normalize()

Scales the weights of elements in the queue so that the sum fo the new weights is $new_total

public normalize([int $new_total = CNUM_URLS_QUEUE_RAM ]) : mixed

This function is used periodically to prevent the queue from being gummed up because all of the weights stored in it are too small.

Parameters
$new_total : int = CNUM_URLS_QUEUE_RAM

what the new sum of weights of elements in the queue will be after normalization

Return values
mixed

peek()

Gets the data stored at the ith location in the priority queue

public peek([int $i = 1 ]) : mixed
Parameters
$i : int = 1

location to return data from

Return values
mixed

array data if the value of $i is between 1 and count, false otherwise

percolateDown()

If the ith element in the PriorityQueue violates the heap property with some child node (children should be of lower priority than the parent), this function tries modify the heap to restore the heap property.

public percolateDown(int $i) : mixed
Parameters
$i : int

node to consider in restoring the heap property

Return values
mixed

percolateUp()

If the $ith element in the PriorityQueue violates the heap property with its parent node (children should be of lower priority than the parent), this function tries modify the heap to restore the heap property.

public percolateUp(int $i) : int
Parameters
$i : int

node to consider in restoring the heap property

Return values
int

final position $ith node ends up at

poll()

Removes and returns the ith element out of the Priority queue.

public poll([int $i = 1 ]) : mixed

Since this is a priority queue the first element in the queue will either be the min or max (depending on queue type) element stored. If $i is not in range an error message is written to the log. This operation also performs a check to see if the queue should be saved to disk

Parameters
$i : int = 1

element to get out of the queue

Return values
mixed

array data if the value of $i is between 1 and count, false otherwise

printContents()

Pretty prints the contents of the queue viewed as an array.

public printContents() : mixed
Return values
mixed

put()

Puts data into the ith item of the StringArray

public put(int $i, string $data) : mixed
Parameters
$i : int

array index of where to store data

$data : string

at least $this->data_size many bytes of data to store

Return values
mixed

putRow()

Add data to the $i row of the priority queue viewed as an array Calls the notifier associated with this queue about the change in data's location

public putRow(int $i, array<string|int, mixed> $row) : mixed
Parameters
$i : int

location to add data

$row : array<string|int, mixed>

data to add (a two element array in the form key, int value).

Return values
mixed

save()

Save the PersistentStructure to its filename This method is generic but super memory inefficient, so reimplement for subclasses is needed

public save() : mixed
Return values
mixed

totalWeight()

Computes and returns the weight of all items in priority queue

public totalWeight() : int
Return values
int

weight of all items stored in the priority queue


        

Search results