WebArchive
Code used to manage web archive files
Table of Contents
- WEB_ARCHIVE_VERSION = 1.0
- Version number to use in the WebArchive header if constructing a new archive
- $compressor : object
- Filter object used to compress/uncompress objects stored in archive
- $count : int
- Number of items in the archive
- $filename : string
- Filename used to store the web archive.
- $is_string : bool
- Says whether the archive is a string archive
- $iterator_pos : int
- Current offset of the archive's iterator within the web archive (each archive supports at most one iterator)
- $storage : string
- If archive is stored as a string rather than persistently to disk then $storage is used to hold the string
- $version : float
- version number of the current archive
- __construct() : mixed
- Makes or initializes a WebArchive object using the supplied parameters
- addObjects() : mixed
- Adds objects to the WebArchive
- close() : mixed
- Closes a file handle (which should be of a web archive)
- currentObjects() : array<string|int, mixed>
- Returns $num many objects from the web archive starting at the current iterator position, leaving the iterator position unchanged
- getObjects() : array<string|int, mixed>
- Gets $num many objects out of the web archive starting at byte $offset
- nextObjects() : array<string|int, mixed>
- Returns $num many objects from the web archive starting at the current iterator position. The iterator is advanced to the object after the last one returned.
- open() : resource
- Open the web archive file associated with this WebArchive object.
- readInfoBlock() : array<string|int, mixed>
- Read the info block associated with this web archive.
- reset() : mixed
- Resets the iterator for this web archive to the first object in the archive
- seekEndObjects() : int
- Seeks in the WebArchive file to the end of the last Object.
- writeInfoBlock() : mixed
- Serializes an info block, applies the compressor to it, and writes it at the end of the web archive. The info block is metadata for the archive stored at the end of the WebArchive file. The particular metadata is up to whoever is using the web archive; however, the count and archive version number are always stored.
Constants
WEB_ARCHIVE_VERSION
Version number to use in the WebArchive header if constructing a new archive
public
mixed
WEB_ARCHIVE_VERSION
= 1.0
Properties
$compressor
Filter object used to compress/uncompress objects stored in archive
public
object
$compressor
$count
Number of items in the archive
public
int
$count
$filename
Filename used to store the web archive.
public
string
$filename
$is_string
Says whether the archive is a string archive
public
bool
$is_string
$iterator_pos
Current offset of the archive's iterator within the web archive (each archive supports at most one iterator)
public
int
$iterator_pos
$storage
If archive is stored as a string rather than persistently to disk then $storage is used to hold the string
public
string
$storage
$version
version number of the current archive
public
float
$version
Methods
__construct()
Makes or initializes a WebArchive object using the supplied parameters
public
__construct(string $fname, string $compressor[, bool $fast_construct = false ][, bool $is_string = false ]) : mixed
Parameters
- $fname : string
-
filename to use to store archive to disk
- $compressor : string
-
what kind of Compressor object should be used to read and write objects in the archive
- $fast_construct : bool = false
-
controls whether the info block of the web archive is read as part of construction
- $is_string : bool = false
-
says whether the archive stores to string rather than a file
Return values
mixed
addObjects()
Adds objects to the WebArchive
public
addObjects(string $offset_field, array<string|int, mixed> &$objects[, array<string|int, mixed> $data = null ][, string $callback = null ][, bool $return_flag = true ]) : mixed
Parameters
- $offset_field : string
-
field in objects to return the byte offset at which they were stored
- $objects : array<string|int, mixed>
-
references to the objects that will be stored. The offset field in these references is adjusted in place if $return_flag is false
- $data : array<string|int, mixed> = null
-
data to write in the WebArchive's info block
- $callback : string = null
-
name of a callback $callback($data, $new_objects, $offset_field) used to modify $data before it is written to the info block. For instance, we can add offset info to data.
- $return_flag : bool = true
-
if true, rather than adjusting the offsets by reference, create copies of the objects, adjust the offsets of the copies, and return them
Return values
mixed —adjusted objects or void
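The callback described above can be sketched as follows. This is a minimal illustration, not code from the library: the function name `recordOffsets`, the `offsets` key, and the sample objects are hypothetical, and it assumes the callback hands back the modified $data as its return value.

```php
<?php
/**
 * Hypothetical callback with the shape $callback($data, $new_objects,
 * $offset_field). It copies each stored object's byte offset into the
 * info-block data. Assumption: the modified $data is returned.
 */
function recordOffsets($data, $new_objects, $offset_field)
{
    if (!is_array($data)) {
        $data = [];
    }
    foreach ($new_objects as $object) {
        if (isset($object[$offset_field])) {
            $data['offsets'][] = $object[$offset_field];
        }
    }
    return $data;
}

// Stand-ins for objects whose offset field has already been filled in.
$stored = [
    ['url' => 'http://example.com/a', 'offset' => 0],
    ['url' => 'http://example.com/b', 'offset' => 57],
];
$info = recordOffsets(null, $stored, 'offset');
```

With a real archive, the name of such a function would be passed as the $callback argument of addObjects().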
close()
Closes a file handle (which should be of a web archive)
public
close(resource $fh) : mixed
Parameters
- $fh : resource
-
filehandle to close
Return values
mixed
currentObjects()
Returns $num many objects from the web archive starting at the current iterator position, leaving the iterator position unchanged
public
currentObjects(int $num) : array<string|int, mixed>
Parameters
- $num : int
-
number of objects to return
Return values
array<string|int, mixed> —an array of objects from the web archive
getObjects()
Gets $num many objects out of the web archive starting at byte $offset
public
getObjects(int $offset, int $num[, bool $next_flag = true ][, resource $fh = null ][, int $max_size = CMAX_ARCHIVE_OBJECT_SIZE ]) : array<string|int, mixed>
If $next_flag is true, the archive iterator is advanced. If $fh is not null, it is assumed to be an open resource pointing to the archive (saving the time needed to open it).
Parameters
- $offset : int
-
a valid byte offset into a web archive
- $num : int
-
number of objects to return
- $next_flag : bool = true
-
whether to advance the archive iterator
- $fh : resource = null
-
either null or a file resource to the archive
- $max_size : int = CMAX_ARCHIVE_OBJECT_SIZE
-
maximum size a returned object should have; used as a sanity check against corrupted archives
Return values
array<string|int, mixed> —the $num objects beginning at $offset
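To illustrate the interplay of $offset, $num and $next_flag, here is a toy, string-backed stand-in. It is only a sketch: the class `ToyArchive`, its `add()` helper, and the record layout (a 4-byte big-endian length followed by a serialized object) are assumptions made for the example, not the actual WebArchive format, and the $fh and $max_size parameters are left out.

```php
<?php
// Toy string-backed archive. Assumed record layout for this sketch:
// a 4-byte big-endian length followed by a serialized object.
class ToyArchive
{
    public string $storage = "";
    public int $iterator_pos = 0;

    /** Appends one object and returns the byte offset it was stored at. */
    public function add($object): int
    {
        $offset = strlen($this->storage);
        $bytes = serialize($object);
        $this->storage .= pack("N", strlen($bytes)) . $bytes;
        return $offset;
    }

    /** Reads up to $num objects starting at byte $offset. */
    public function getObjects(int $offset, int $num, bool $next_flag = true): array
    {
        $objects = [];
        $total = strlen($this->storage);
        for ($i = 0; $i < $num && $offset + 4 <= $total; $i++) {
            $len = unpack("N", substr($this->storage, $offset, 4))[1];
            $objects[] = unserialize(substr($this->storage, $offset + 4, $len));
            $offset += 4 + $len;
        }
        if ($next_flag) {
            // Advance the iterator to just past the last object read.
            $this->iterator_pos = $offset;
        }
        return $objects;
    }
}

$archive = new ToyArchive();
foreach (['a', 'b', 'c'] as $item) {
    $archive->add(['id' => $item]);
}
$peek = $archive->getObjects(0, 2, false); // iterator is left where it was
$pos_after_peek = $archive->iterator_pos;
$first_two = $archive->getObjects(0, 2);   // iterator moves past the second object
$rest = $archive->getObjects($archive->iterator_pos, 2);
```

In this model, reading with $next_flag set to false behaves like a peek, while the default advances the iterator so that a later call can continue from $iterator_pos.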
nextObjects()
Returns $num many objects from the web archive starting at the current iterator position. The iterator is advanced to the object after the last one returned.
public
nextObjects(int $num) : array<string|int, mixed>
Parameters
- $num : int
-
number of objects to return
Return values
array<string|int, mixed> —an array of objects from the web archive
open()
Open the web archive file associated with this WebArchive object.
public
open([string $mode = "r" ]) : resource
Parameters
- $mode : string = "r"
-
read/write mode to open file with
Return values
resource —a file resource for the web archive
readInfoBlock()
Read the info block associated with this web archive.
public
readInfoBlock() : array<string|int, mixed>
The info block is metadata for the archive stored at the end of the WebArchive file. The particular metadata is up to whoever is using the web archive.
Return values
array<string|int, mixed> —the contents of the info block
reset()
Resets the iterator for this web archive to the first object in the archive
public
reset() : mixed
Return values
mixed
seekEndObjects()
Seeks in the WebArchive file to the end of the last Object.
public
seekEndObjects(resource $fh) : int
The last $compressed_int_len bytes of a WebArchive give the length of the info block in bytes.
Parameters
- $fh : resource
-
resource for the WebArchive file
Return values
int —offset length of info block
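The trailer lookup described above can be sketched like this. It is an illustration under stated assumptions rather than the library's code: the length field is taken to be a 4-byte big-endian integer (`LEN_BYTES` stands in for $compressed_int_len), the info block is left uncompressed, and `seekEndOfObjects` is a hypothetical helper working on an in-memory stream.

```php
<?php
// Assumption for this sketch: the trailing length is a 4-byte big-endian integer.
const LEN_BYTES = 4;

/** Seeks $fh to the end of the object region; returns the info block length. */
function seekEndOfObjects($fh): int
{
    // The final LEN_BYTES bytes hold the length of the info block.
    fseek($fh, -LEN_BYTES, SEEK_END);
    $len = unpack("N", fread($fh, LEN_BYTES))[1];
    // Step back over the length field and the info block itself.
    fseek($fh, -(LEN_BYTES + $len), SEEK_END);
    return $len;
}

// Build a miniature archive: object bytes, info block, then the length field.
$objects = "OBJECT-BYTES";
$info_block = serialize(['count' => 1, 'version' => 1.0]);
$fh = fopen("php://memory", "w+");
fwrite($fh, $objects . $info_block . pack("N", strlen($info_block)));

$info_len = seekEndOfObjects($fh);
$end_of_objects = ftell($fh);
fclose($fh);
```

After the call, the file position sits where the next object would be appended, which is also where the old info block begins.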
writeInfoBlock()
Serializes an info block, applies the compressor to it, and writes it at the end of the web archive. The info block is metadata for the archive stored at the end of the WebArchive file. The particular metadata is up to whoever is using the web archive; however, the count and archive version number are always stored.
public
writeInfoBlock([resource $fh = null ][, array<string|int, mixed> &$data = null ]) : mixed
Parameters
- $fh : resource = null
-
resource for the web archive file. If null, the web archive is opened first and closed once the data has been written
- $data : array<string|int, mixed> = null
-
data to write into the info block of the archive
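A round trip through an info block might look like the following sketch. Everything named here is an assumption made for illustration: `appendInfoBlock` and `readBackInfoBlock` are hypothetical helpers, gzcompress()/gzuncompress() (which need PHP's zlib extension) stand in for the archive's compressor object, the `count` and `version` keys are placeholders, and the trailing length is taken to be a 4-byte big-endian integer.

```php
<?php
// Hypothetical helpers; gzcompress()/gzuncompress() stand in for the
// archive's compressor object and a 4-byte length trailer is assumed.
function appendInfoBlock(string $archive, array $data, int $count, float $version): string
{
    // Count and version are always stored next to the caller's metadata.
    $data['count'] = $count;
    $data['version'] = $version;
    $block = gzcompress(serialize($data));
    return $archive . $block . pack("N", strlen($block));
}

function readBackInfoBlock(string $archive): array
{
    // The last four bytes give the length of the compressed info block.
    $len = unpack("N", substr($archive, -4))[1];
    $block = substr($archive, -4 - $len, $len);
    return unserialize(gzuncompress($block));
}

$archive = appendInfoBlock("OBJECT-BYTES", ['crawl' => 'demo'], 3, 1.0);
$info = readBackInfoBlock($archive);
```

Keeping the metadata at the end means new objects can be appended by overwriting the old info block and writing a fresh one afterwards.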