NaiveBayes
extends ClassifierAlgorithm
in package
Implements the Naive Bayes text classification algorithm.
This class also provides a method to sample a beta vector from a dataset, making it easy to generate several slightly-different classifiers for the same dataset in order to form classifier committees.
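A rough usage sketch follows, assuming that a SparseMatrix of training examples $X, a label array $y, and a new document's feature vector $x have already been prepared by the surrounding classifier code (none of that preparation is part of this class):

```php
// Hypothetical workflow; train(), classify(), and the weighting properties
// are documented below, everything else here is assumed context.
$nb = new NaiveBayes();
$nb->gamma = 1.0;    // weight on positive examples
$nb->epsilon = 1.0;  // weight on negative examples

$nb->train($X, $y);          // $X: SparseMatrix of examples, $y: 1/-1 labels
$score = $nb->classify($x);  // pseudo-probability that $x is a positive example

// For committees, sampleBeta() re-draws the beta vector from the training
// counts, so repeated calls yield slightly different classifiers.
```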
Table of Contents
- $beta : array<string|int, mixed>
- Beta vector of feature weights resulting from the training phase. The dot product of this vector with a feature vector yields the log likelihood that the feature vector describes a document belonging to the trained-for class.
- $debug : int
- Flag used to control the level of debug messages. For now, 0 means no messages; any other value causes messages to be output.
- $epsilon : float
- Parameter used to weight negative examples.
- $gamma : float
- Parameter used to weight positive examples.
- classify() : mixed
- Returns the pseudo-probability that a new instance is a positive example of the class the beta vector was trained to recognize. It only makes sense to try classification after at least some training has been done on a dataset that includes both positive and negative examples of the target class.
- log() : mixed
- Write a message to the log file, depending on the debug level for this subpackage.
- logit() : float
- Computes the log odds of a numerator and denominator, corresponding to the number of positive and negative examples exhibiting some feature.
- sampleBeta() : mixed
- Constructs beta by sampling from the Gamma distribution for each feature, parameterized by the number of times the feature appears in positive examples, with a scale/rate of 1. This function is used to construct classifier committees.
- sampleGammaDeviate() : float
- Computes a Gamma deviate with beta = 1 and an integral, small alpha. Under these assumptions, the deviate is just the sum of alpha exponential deviates. Each exponential deviate is the negative log of a uniform deviate, so the sum of the logs is the negative log of the product of the uniform deviates.
- train() : mixed
- Computes the beta vector from the given examples and labels. The examples are represented as a sparse matrix where each row is an example and each column a feature, and the labels as an array where each value is either 1 or -1, corresponding to a positive or negative example. Note that the first feature (column 0) corresponds to an intercept term, and is equal to 1 for every example.
Properties
$beta
Beta vector of feature weights resulting from the training phase. The dot product of this vector with a feature vector yields the log likelihood that the feature vector describes a document belonging to the trained-for class.
public
array<string|int, mixed>
$beta
$debug
Flag used to control the level of debug messages. For now, 0 means no messages; any other value causes messages to be output.
public
int
$debug
= 0
$epsilon
Parameter used to weight negative examples.
public
float
$epsilon
= 1.0
$gamma
Parameter used to weight positive examples.
public
float
$gamma
= 1.0
Methods
classify()
Returns the pseudo-probability that a new instance is a positive example of the class the beta vector was trained to recognize. It only makes sense to try classification after at least some training has been done on a dataset that includes both positive and negative examples of the target class.
public
classify(array<string|int, mixed> $x) : mixed
Parameters
- $x : array<string|int, mixed>
  feature vector represented by an associative array mapping features to their weights
Return values
mixed
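The $beta documentation above says the score underlying this value comes from the dot product of $beta with the feature vector. The following is only an illustration of how such a dot product could be turned into a pseudo-probability with the logistic function; the exact transformation classify() applies is not spelled out here:

```php
// Illustration only: dot product of a beta vector with a sparse feature
// vector, squashed into (0, 1) with the logistic function (assumed).
function pseudoProbability(array $beta, array $x): float
{
    $dot = 0.0;
    foreach ($x as $feature => $weight) {
        $dot += ($beta[$feature] ?? 0.0) * $weight;
    }
    return 1.0 / (1.0 + exp(-$dot));
}
```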
log()
Write a message to the log file, depending on the debug level for this subpackage.
public
log(string $message) : mixed
Parameters
- $message : string
  what to write to the log
Return values
mixed
logit()
Computes the log odds of a numerator and denominator, corresponding to the number of positive and negative examples exhibiting some feature.
public
logit(int $pos, int $neg) : float
Parameters
- $pos : int
  count of positive examples exhibiting some feature
- $neg : int
  count of negative examples exhibiting the same feature
Return values
float —log odds of seeing the feature in a positive example
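For example, a feature seen in 30 positive and 10 negative examples has log odds log(30/10), roughly 1.10. The plain, unsmoothed version of this computation is sketched below; the actual method may weight or smooth the counts (see $gamma and $epsilon above), so treat this only as the underlying idea:

```php
// Unsmoothed log odds of seeing a feature in a positive example.
// Assumes both counts are positive; real implementations typically smooth.
function plainLogit(int $pos, int $neg): float
{
    return log($pos / $neg);
}
echo plainLogit(30, 10); // ~1.0986, the feature favors the positive class
```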
sampleBeta()
Constructs beta by sampling from the Gamma distribution for each feature, parameterized by the number of times the feature appears in positive examples, with a scale/rate of 1. This function is used to construct classifier committees.
public
sampleBeta(object $features) : mixed
Parameters
- $features : object
  Features instance for the training set, used to determine how often a given feature occurs in positive and negative examples
Return values
mixed
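A sketch of how a committee might be assembled with this method, assuming $features is the Features instance for the training set, $x is a new document's feature vector, and sampleBeta() updates the classifier's $beta in place (the surrounding bookkeeping is not part of the documented API):

```php
// Hypothetical committee: each member re-samples beta, then scores the document.
$committee_size = 5;
$scores = [];
for ($i = 0; $i < $committee_size; $i++) {
    $nb->sampleBeta($features);   // assumed to re-draw $nb->beta from feature counts
    $scores[] = $nb->classify($x);
}
$committee_score = array_sum($scores) / $committee_size; // simple average vote
```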
sampleGammaDeviate()
Computes a Gamma deviate with beta = 1 and an integral, small alpha. Under these assumptions, the deviate is just the sum of alpha exponential deviates. Each exponential deviate is the negative log of a uniform deviate, so the sum of the logs is the negative log of the product of the uniform deviates.
public
sampleGammaDeviate(int $alpha) : float
Parameters
- $alpha : int
  parameter to the Gamma distribution (in practice, a count of occurrences of some feature)
Return values
float —a deviate from the Gamma distribution parameterized by $alpha
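The description above translates almost directly into code. A minimal sketch of the same idea, not necessarily the exact implementation used here:

```php
// Gamma(alpha, 1) deviate for a small integer alpha: the negative log of the
// product of alpha uniform deviates (equivalently, the sum of alpha
// exponential deviates).
function gammaDeviateSketch(int $alpha): float
{
    $product = 1.0;
    for ($i = 0; $i < $alpha; $i++) {
        $product *= (mt_rand() + 1) / (mt_getrandmax() + 1); // uniform in (0, 1]
    }
    return -log($product);
}
```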
train()
Computes the beta vector from the given examples and labels. The examples are represented as a sparse matrix where each row is an example and each column a feature, and the labels as an array where each value is either 1 or -1, corresponding to a positive or negative example. Note that the first feature (column 0) corresponds to an intercept term, and is equal to 1 for every example.
public
train(object $X, array<string|int, mixed> $y) : mixed
Parameters
- $X : object
  SparseMatrix of training examples
- $y : array<string|int, mixed>
  example labels, each either 1 (positive) or -1 (negative)
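As a sketch of the expected shapes, with plain arrays standing in for the classifier package's SparseMatrix (only the layout is illustrated here; column 0 is the intercept and is 1 for every example):

```php
// Conceptual layout of three training examples with two real features.
$rows = [
    [1, 2, 0],   // positive example (column 0 is the intercept)
    [1, 0, 3],   // positive example
    [1, 1, 1],   // negative example
];
$y = [1, 1, -1]; // labels: 1 = positive, -1 = negative

// In the real call, $X would be a SparseMatrix built from rows like these:
// $nb->train($X, $y);
```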