Yioop_V9.5_Source_Code_Documentation

BloomFilterFile extends PersistentStructure
in package

Code used to manage a Bloom filter in memory and in a file.

A Bloom filter is used to store a set of objects. It supports inserting items into the set and checking membership in the set. A membership check can return a false positive but never a false negative.

Tags
author

Chris Pollett

Table of Contents

DEFAULT_SAVE_FREQUENCY  = 50000
If not specified in the constructor, this will be the number of operations between saves
$count  : int
Number of items currently stored in this filter
$filename  : string
Name of the file in which to store the PersistentStructure
$filter  : string
Packed string used to store the Bloom filter
$filter_size  : int
Size in bits of the packed string array used to store the filter's contents
$max_gram_len  : int
Maximum length for an n-gram (only used when the Bloom filter stores n-grams)
$num_keys  : int
Number of bit positions in the Bloom filter used to say an item is in the filter
$save_frequency  : int
Number of operations between saves. If -1, checkSave never saves
$unsaved_operations  : int
Number of operations since the last save
__construct()  : mixed
Initializes the fields of the BloomFilter and its base PersistentStructure.
add()  : mixed
Inserts the provided item into the BloomFilter
checkSave()  : mixed
Adds one to the unsaved_operations count. If this goes above the save_frequency, saves the PersistentStructure to secondary storage
contains()  : bool
Checks if the BloomFilter contains the provided $value
getBit()  : bool
Looks up the value of the ith bit position in the filter
getHashBitPositionArray()  : int
Hashes $value to a bit position in the BloomFilter
load()  : object
Load a PersistentStructure from a file
save()  : mixed
Saves the PersistentStructure to its filename. This method is generic but very memory inefficient, so subclasses should reimplement it
setBit()  : mixed
Sets to true the ith bit position in the filter.

Constants

DEFAULT_SAVE_FREQUENCY

If not specified in the constructor, this will be the number of operations between saves

public int DEFAULT_SAVE_FREQUENCY = 50000

Properties

$count

Number of items currently stored in this filter

public int $count

$filename

Name of the file in which to store the PersistentStructure

public string $filename

$filter

Packed string used to store the Bloom filter

public string $filter

$filter_size

Size in bits of the packed string array used to store the filter's contents

public int $filter_size

$max_gram_len

Maximum length for an n-gram (only used when the Bloom filter stores n-grams)

public int $max_gram_len

$num_keys

Number of bit positions in the Bloom filter used to say an item is in the filter

public int $num_keys

$save_frequency

Number of operations between saves. If -1, checkSave never saves

public int $save_frequency

$unsaved_operations

Number of operations since the last save

public int $unsaved_operations

Methods

__construct()

Initializes the fields of the BloomFilter and its base PersistentStructure.

public __construct(string $fname, int $num_values[, int $save_frequency = self::DEFAULT_SAVE_FREQUENCY ]) : mixed
Parameters
$fname : string

name of the file to store the BloomFilter data in

$num_values : int

the maximum number of values that will be stored in the BloomFilter. The filter will be sized so that the odds of a false positive are roughly one over this value.

$save_frequency : int = self::DEFAULT_SAVE_FREQUENCY

how often to store the BloomFilter to disk

Return values
mixed
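
The sizing rule described for $num_values can be illustrated with the textbook Bloom filter formulas. The sketch below is an assumption about how such a size could be derived, not a copy of Yioop's constructor; the function name sketchBloomSize is hypothetical, and it assumes $num_values is greater than 1.

```php
<?php
/**
 * Hypothetical helper, not part of Yioop: textbook Bloom filter sizing
 * for $num_values items and a target false positive rate of
 * 1 / $num_values. Assumes $num_values > 1.
 */
function sketchBloomSize(int $num_values): array
{
    $p = 1.0 / $num_values; // target false positive rate
    // optimal number of bits: m = -n * ln(p) / (ln 2)^2
    $filter_size = (int) ceil(-$num_values * log($p) / (M_LN2 * M_LN2));
    // optimal number of bit positions per item: k = (m / n) * ln 2
    $num_keys = max(1, (int) round($filter_size / $num_values * M_LN2));
    return ["filter_size" => $filter_size, "num_keys" => $num_keys];
}

$size = sketchBloomSize(1000);
echo $size["filter_size"], " bits, ", $size["num_keys"], " keys\n";
```

Under these formulas both the bit count per item and the number of keys grow with the logarithm of $num_values, which is why a stricter false positive target costs more memory per stored item.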

add()

Inserts the provided item into the BloomFilter

public add(string $value) : mixed
Parameters
$value : string

item to add to filter

Return values
mixed

checkSave()

Adds one to the unsaved_operations count. If this goes above the save_frequency, saves the PersistentStructure to secondary storage

public checkSave() : mixed
Return values
mixed
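
The counting behaviour described for checkSave() can be sketched as follows. This is a minimal illustration, not Yioop's code: the class name SketchCounterSaver and the $saves counter are hypothetical, and the sketch assumes that a save is triggered once the count reaches the frequency (the description does not say whether the comparison is strict).

```php
<?php
/**
 * Hypothetical sketch, not Yioop's code: count operations and save once
 * the count reaches the configured frequency.
 */
class SketchCounterSaver
{
    public int $unsaved_operations = 0;
    public int $saves = 0; // stands in for real writes to disk

    public function __construct(public int $save_frequency = 50000)
    {
    }

    public function save(): void
    {
        $this->saves++;
    }

    public function checkSave(): void
    {
        $this->unsaved_operations++;
        // Assumption: -1 disables saving from checkSave entirely
        if ($this->save_frequency !== -1
            && $this->unsaved_operations >= $this->save_frequency) {
            $this->save();
            $this->unsaved_operations = 0;
        }
    }
}

$structure = new SketchCounterSaver(3);
for ($i = 0; $i < 7; $i++) {
    $structure->checkSave();
}
echo $structure->saves, "\n"; // prints 2
```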

contains()

Checks if the BloomFilter contains the provided $value

public contains(string $value) : bool
Parameters
$value : string

item to check for membership in the BloomFilter

Return values
bool

whether $value was in the filter or not
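
How add() and contains() relate can be shown with a small in-memory sketch. It is an illustration under stated assumptions rather than Yioop's implementation: the class SketchBloomFilter, its crc32-based hashing, and the use of a plain array instead of a packed string are all hypothetical choices made for readability (PHP 8.0 or later is assumed for constructor property promotion).

```php
<?php
/**
 * Hypothetical in-memory sketch, not Yioop's implementation: an item is
 * reported as present when all $num_keys of its bit positions are set.
 */
class SketchBloomFilter
{
    public array $bits = []; // set positions, kept as array keys for clarity
    public int $count = 0;

    public function __construct(public int $filter_size, public int $num_keys)
    {
    }

    /** Assumed hashing scheme: crc32 of the value salted with the key index */
    private function positions(string $value): array
    {
        $positions = [];
        for ($i = 0; $i < $this->num_keys; $i++) {
            $positions[] = crc32($i . ":" . $value) % $this->filter_size;
        }
        return $positions;
    }

    public function add(string $value): void
    {
        foreach ($this->positions($value) as $pos) {
            $this->bits[$pos] = true;
        }
        $this->count++;
    }

    public function contains(string $value): bool
    {
        foreach ($this->positions($value) as $pos) {
            if (empty($this->bits[$pos])) {
                return false; // one unset bit proves the item was never added
            }
        }
        return true; // all bits set: present, or a false positive
    }
}

$filter = new SketchBloomFilter(1024, 4);
$filter->add("https://www.example.com/");
var_dump($filter->contains("https://www.example.com/")); // bool(true)
```

An added item is always reported as present, while an item that was never added is usually, but not always, reported as absent.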

getBit()

Looks up the value of the ith bit position in the filter

public getBit(int $i) : bool
Parameters
$i : int

the position to look up

Return values
bool

the value of the looked up position

getHashBitPositionArray()

Hashes $value to a bit position in the BloomFilter

public getHashBitPositionArray(string $value, int $num_keys) : int
Parameters
$value : string

value to map to a bit position in the filter

$num_keys : int

number of bit positions in the Bloom filter used to say an item is in the filter

Return values
int

the bit position mapped to
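
One way to map a value to several bit positions is sketched below. The method name suggests one position per key, and the sketch assumes that reading; the function sketchHashBitPositions, its md5-based scheme, and the extra $filter_size parameter are hypothetical and not taken from Yioop.

```php
<?php
/**
 * Hypothetical sketch, not Yioop's code: derive $num_keys bit positions
 * for $value by hashing the value together with each key index.
 * $filter_size is an extra parameter added to keep the sketch standalone.
 */
function sketchHashBitPositions(string $value, int $num_keys,
    int $filter_size): array
{
    $positions = [];
    for ($i = 0; $i < $num_keys; $i++) {
        // the first 7 hex digits (28 bits) of md5 give a non-negative int
        $hash = hexdec(substr(md5($i . ":" . $value), 0, 7));
        $positions[] = $hash % $filter_size;
    }
    return $positions;
}

print_r(sketchHashBitPositions("example", 3, 1000));
```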

load()

Load a PersistentStructure from a file

public static load(string $fname) : object
Parameters
$fname : string

the name of the file to load the PersistentStructure from

Return values
object

the PersistentStructure loaded

save()

Saves the PersistentStructure to its filename. This method is generic but very memory inefficient, so subclasses should reimplement it

public save() : mixed
Return values
mixed
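
A generic save and load round trip can be sketched with PHP's built-in serialization. Whether PersistentStructure works exactly this way is an assumption; the class SketchPersistent and its $data field are hypothetical. The sketch also hints at why a generic approach can be memory hungry: the whole object is turned into one string before it is written.

```php
<?php
/**
 * Hypothetical sketch, not Yioop's code: persist an object with PHP's
 * serialize() and restore it with unserialize().
 */
class SketchPersistent
{
    public array $data = [];

    public function __construct(public string $filename)
    {
    }

    public function save(): void
    {
        // serialize() builds the whole string in memory before writing
        file_put_contents($this->filename, serialize($this));
    }

    public static function load(string $fname): object
    {
        return unserialize(file_get_contents($fname));
    }
}

$path = tempnam(sys_get_temp_dir(), "sketch");
$original = new SketchPersistent($path);
$original->data["seen"] = true;
$original->save();
$copy = SketchPersistent::load($path);
var_dump($copy->data["seen"]); // bool(true)
unlink($path);
```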

setBit()

Sets to true the ith bit position in the filter.

public setBit(int $i) : mixed
Parameters
$i : int

the position to set to true

Return values
mixed
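
The bit-level access that getBit() and setBit() describe can be sketched on a packed PHP string. The helper names sketchSetBit and sketchGetBit and the bit ordering within each byte are assumptions made for illustration, not details taken from Yioop.

```php
<?php
/**
 * Hypothetical sketch, not Yioop's code: bit $i lives in byte $i >> 3 of
 * a packed string, at offset $i & 7 within that byte.
 */
function sketchSetBit(string &$filter, int $i): void
{
    $byte = $i >> 3;
    $mask = 1 << ($i & 7);
    $filter[$byte] = chr(ord($filter[$byte]) | $mask);
}

function sketchGetBit(string $filter, int $i): bool
{
    $byte = $i >> 3;
    $mask = 1 << ($i & 7);
    return (ord($filter[$byte]) & $mask) !== 0;
}

$filter_size = 64; // size in bits
$filter = str_repeat("\x00", intdiv($filter_size + 7, 8)); // all bits clear
sketchSetBit($filter, 10);
var_dump(sketchGetBit($filter, 10)); // bool(true)
var_dump(sketchGetBit($filter, 11)); // bool(false)
```

Packing eight positions into each byte keeps the in-memory and on-disk size at roughly one eighth of $filter_size in bytes.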

        
