
BloomFilterFile extends PersistentStructure
in package

Code used to manage a Bloom filter in memory and in a file.

A Bloom filter is used to store a set of objects. It can support inserts into the set and it can also be used to check membership in the set.
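The insert and membership operations described above can be illustrated with a minimal in-memory sketch. This is not the BloomFilterFile implementation: the class name TinyBloom, the salted crc32 hashing, and the bit layout are all assumptions made for illustration, and a 64-bit PHP build (7.4 or later) is assumed.

```php
<?php
// Hypothetical, self-contained sketch of a Bloom filter.
// Not the BloomFilterFile code; hashing and layout are assumptions.
final class TinyBloom
{
    private string $bits;   // packed string, 8 bit positions per byte
    private int $size;      // number of bit positions in the filter
    private int $numKeys;   // bit positions set per item

    public function __construct(int $size, int $numKeys)
    {
        $this->size = $size;
        $this->numKeys = $numKeys;
        $this->bits = str_repeat("\0", intdiv($size + 7, 8));
    }

    /** Maps $value to $numKeys bit positions by salting it with the key index. */
    private function positions(string $value): array
    {
        $positions = [];
        for ($k = 0; $k < $this->numKeys; $k++) {
            $positions[] = (crc32($k . ':' . $value) & 0x7fffffff) % $this->size;
        }
        return $positions;
    }

    public function add(string $value): void
    {
        foreach ($this->positions($value) as $p) {
            $byte = intdiv($p, 8);
            $this->bits[$byte] = chr(ord($this->bits[$byte]) | (1 << ($p % 8)));
        }
    }

    public function contains(string $value): bool
    {
        foreach ($this->positions($value) as $p) {
            if ((ord($this->bits[intdiv($p, 8)]) & (1 << ($p % 8))) === 0) {
                return false; // one unset bit proves the item was never added
            }
        }
        return true; // all bits set: present, or a false positive
    }
}

$seen = new TinyBloom(1024, 3);
$seen->add("apple");
var_dump($seen->contains("apple"));
```

A membership check can report an item that was never added (a false positive), but it never misses an item that was added.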


Chris Pollett

Table of Contents

DEFAULT_SAVE_FREQUENCY  = 50000
If not specified in the constructor, this will be the number of operations between saves
$count  : int
Number of items currently stored in this filter
$filename  : string
Name of the file in which to store the PersistentStructure
$filter  : string
Packed string used to store the Bloom filter
$filter_size  : int
Size in bits of the packed string array used to store the filter's contents
$max_gram_len  : int
Maximum length for an n-gram (only used when the Bloom filter stores n-grams)
$num_keys  : int
Number of bit positions in the Bloom filter used to say an item is in the filter
$save_frequency  : int
Number of operations between saves. If equal to -1, checkSave never saves
$unsaved_operations  : int
Number of operations since the last save
__construct()  : mixed
Initializes the fields of the BloomFilter and its base PersistentStructure.
add()  : mixed
Inserts the provided item into the Bloom filter
checkSave()  : mixed
Add one to the unsaved_operations count. If this goes above the save_frequency, then save the PersistentStructure to secondary storage
contains()  : bool
Checks if the BloomFilter contains the provided $value
getBit()  : bool
Looks up the value of the ith bit position in the filter
getHashBitPositionArray()  : int
Hashes $value to a bit position in the BloomFilter
load()  : object
Load a PersistentStructure from a file
save()  : mixed
Save the PersistentStructure to its filename. This method is generic but very memory inefficient, so subclasses should reimplement it
setBit()  : mixed
Sets to true the ith bit position in the filter.



If not specified in the constructor, this will be the number of operations between saves

public int DEFAULT_SAVE_FREQUENCY = 50000



Number of items currently stored in this filter

public int $count


Name of the file in which to store the PersistentStructure

public string $filename


Packed string used to store the Bloom filter

public string $filter


Size in bits of the packed string array used to store the filter's contents

public int $filter_size


Maximum length for an n-gram (only used when the Bloom filter stores n-grams)

public int $max_gram_len


Number of bit positions in the Bloom filter used to say an item is in the filter

public int $num_keys


Number of operations between saves. If equal to -1, checkSave never saves

public int $save_frequency


Number of operations since the last save

public int $unsaved_operations



Initializes the fields of the BloomFilter and its base PersistentStructure.

public __construct(string $fname, int $num_values[, int $save_frequency = self::DEFAULT_SAVE_FREQUENCY ]) : mixed
$fname : string

name of the file to store the BloomFilter data in

$num_values : int

the maximum number of values that will be stored in the BloomFilter. The filter is sized so that the odds of a false positive are roughly one over this value

$save_frequency : int = self::DEFAULT_SAVE_FREQUENCY

how often to store the BloomFilter to disk

Return values
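The sizing rule stated for $num_values (false positive odds of roughly one over the number of values) can be reproduced with the textbook Bloom filter formulas. The sketch below is an assumption about how such a size could be derived, not a transcription of the constructor; the function names bloomParameters and falsePositiveRate are hypothetical.

```php
<?php
// Hypothetical sizing sketch using the standard Bloom filter formulas.
// For n values and a target false positive rate p = 1/n:
//   keys k = ceil(log2(n)),  bits m = ceil(k * n / ln 2)
function bloomParameters(int $numValues): array
{
    $numKeys = (int) ceil(log($numValues, 2));
    $filterSize = (int) ceil($numKeys * $numValues / log(2));
    return ['num_keys' => $numKeys, 'filter_size' => $filterSize];
}

// Expected false positive rate once $numValues items have been added:
//   (1 - e^(-k n / m))^k
function falsePositiveRate(int $numValues, int $numKeys, int $filterSize): float
{
    return (1 - exp(-$numKeys * $numValues / $filterSize)) ** $numKeys;
}

$params = bloomParameters(1000);
$rate = falsePositiveRate(1000, $params['num_keys'], $params['filter_size']);
printf("keys=%d bits=%d rate=%.6f\n",
    $params['num_keys'], $params['filter_size'], $rate);
```

With these formulas the filter ends up about half full of set bits when it holds the maximum number of values, so each of the k probes has a one in two chance of hitting a set bit and the overall rate is about 2^-k.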


Inserts the provided item into the Bloom filter

public add(string $value) : mixed
$value : string

item to add to filter

Return values


Add one to the unsaved_operations count. If this goes above the save_frequency, then save the PersistentStructure to secondary storage

public checkSave() : mixed
Return values
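The counting behaviour of checkSave can be sketched as follows. The class SaveCounter is hypothetical and its save() only counts calls instead of writing a file. Whether the real method saves on reaching the threshold or only after exceeding it is not stated precisely here, so the `>=` comparison is an assumption.

```php
<?php
// Hypothetical sketch of a save-every-N-operations counter.
final class SaveCounter
{
    public int $unsavedOperations = 0;
    public int $saves = 0;          // stands in for writes to secondary storage
    private int $saveFrequency;     // -1 means never save from checkSave

    public function __construct(int $saveFrequency)
    {
        $this->saveFrequency = $saveFrequency;
    }

    public function checkSave(): void
    {
        $this->unsavedOperations++;
        // Assumption: reaching the threshold triggers the save.
        if ($this->saveFrequency > 0
            && $this->unsavedOperations >= $this->saveFrequency) {
            $this->save();
            $this->unsavedOperations = 0;
        }
    }

    private function save(): void
    {
        $this->saves++;
    }
}

$counter = new SaveCounter(3);
for ($i = 0; $i < 7; $i++) {
    $counter->checkSave();
}
printf("saves=%d unsaved=%d\n", $counter->saves, $counter->unsavedOperations);
```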


Checks if the BloomFilter contains the provided $value

public contains(string $value) : bool
$value : string

item to check for membership in the BloomFilter

Return values

whether $value was in the filter or not


Looks up the value of the ith bit position in the filter

public getBit(int $i) : bool
$i : int

the position to look up

Return values

the value of the looked up position
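Looking up and setting a single bit in a packed string can be sketched as below. The free functions packedGetBit and packedSetBit are hypothetical, and the bit order within each byte (least significant bit first) is an assumption rather than a documented property of the class.

```php
<?php
// Hypothetical helpers for bit access in a packed string.
// Bit $i lives in byte ($i >> 3), at offset ($i & 7) within that byte.

function packedGetBit(string $filter, int $i): bool
{
    $byte = $i >> 3;
    $mask = 1 << ($i & 7);
    return (ord($filter[$byte]) & $mask) !== 0;
}

function packedSetBit(string &$filter, int $i): void
{
    $byte = $i >> 3;
    $mask = 1 << ($i & 7);
    // Strings are mutable by offset; rewrite the one affected byte.
    $filter[$byte] = chr(ord($filter[$byte]) | $mask);
}

$filter = str_repeat("\0", 4);   // 32 bit positions, all false
packedSetBit($filter, 10);
var_dump(packedGetBit($filter, 10));
var_dump(packedGetBit($filter, 9));
```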


Hashes $value to a bit position in the BloomFilter

public getHashBitPositionArray(string $value, int $num_keys) : int
$value : string

value to map to a bit position in the filter

$num_keys : int

number of bit positions in the Bloom filter used to say an item is in the filter

Return values

the bit position mapped to
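One common way to derive $num_keys bit positions from a single value is double hashing, where two base hashes are combined as h1 + j * h2. The sketch below uses that technique with an md5 digest. It is an assumption for illustration and is not claimed to be the hashing scheme this method uses; the function name hashBitPositions and the explicit $filterSize parameter are hypothetical, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical sketch: double hashing to obtain several bit positions.
function hashBitPositions(string $value, int $numKeys, int $filterSize): array
{
    $digest = md5($value, true);        // 16 raw bytes
    $parts = unpack('N2', $digest);     // two unsigned 32-bit ints, keys 1 and 2
    $h1 = $parts[1];
    $h2 = $parts[2] | 1;                // force odd so the step is never zero

    $positions = [];
    for ($j = 0; $j < $numKeys; $j++) {
        $positions[] = ($h1 + $j * $h2) % $filterSize;
    }
    return $positions;
}

print_r(hashBitPositions("example", 4, 1024));
```

The same value always maps to the same positions, which is what lets a later membership check probe exactly the bits that an earlier insert set.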


Load a PersistentStructure from a file

public static load(string $fname) : object
$fname : string

the name of the file to load the PersistentStructure from

Return values

the PersistentStructure loaded
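A generic load and save pair for a persistent object can be sketched with PHP's serialize and unserialize. That the class works this way is an assumption; TinyStructure is a hypothetical stand-in. The sketch also shows why a generic approach costs memory: the whole object is turned into one string before it is written.

```php
<?php
// Hypothetical sketch of generic object persistence via serialize().
class TinyStructure
{
    public string $filename;
    public array $data = [];

    public function __construct(string $fname)
    {
        $this->filename = $fname;
    }

    public function save(): void
    {
        // The full serialized copy exists in memory alongside the object.
        file_put_contents($this->filename, serialize($this));
    }

    public static function load(string $fname): object
    {
        return unserialize(file_get_contents($fname));
    }
}

$path = tempnam(sys_get_temp_dir(), 'ps_');
$structure = new TinyStructure($path);
$structure->data = ['count' => 3];
$structure->save();

$copy = TinyStructure::load($path);
unlink($path);                      // clean up the temporary file
var_dump($copy->data);
```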


Save the PersistentStructure to its filename. This method is generic but very memory inefficient, so subclasses should reimplement it

public save() : mixed
Return values


Sets to true the ith bit position in the filter.

public setBit(int $i) : mixed
$i : int

the position to set to true

Return values

