classifiers
Interfaces, Classes, Traits and Enums
- BinaryFeatures
- A concrete Features subclass that represents a document as a binary
vector in which a one indicates that a feature is present in the document and
a zero indicates that it is absent. Absent features are not stored, so the
vector is sparse: it contains only the indices of features whose value is
one.
- ChiSquaredFeatureSelection
- A subclass of FeatureSelection that implements chi-squared feature
selection.
- Classifier
- The primary interface for building and using classifiers. An instance of
this class represents a single classifier in memory, but the class also
provides static methods to manage classifiers on disk.
- ClassifierAlgorithm
- An abstract class shared by classification algorithms that implement a
common interface.
- Features
- Manages a dataset's features, providing a standard interface for converting
documents to feature vectors, and for accessing feature statistics.
- FeatureSelection
- An abstract class that specifies an interface for selecting the top
features from a dataset.
- LassoRegression
- Implements the logistic regression text classification algorithm with a
lasso penalty, optimized by cyclic coordinate descent.
- InvertedData
- Stores a data matrix in an inverted index on columns with non-zero entries.
- NaiveBayes
- Implements the Naive Bayes text classification algorithm.
- SparseMatrix
- A sparse matrix implementation based on an associative array of associative
arrays.
- WeightedFeatures
- A concrete Features subclass that represents a document as a
vector of feature weights, where weights are computed using a modified form
of TF * IDF. This feature mapping is experimental and may not work
correctly.
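
The sparse representations described for BinaryFeatures, SparseMatrix and InvertedData can be illustrated with a small sketch. It is written in Python purely for illustration and is not this package's API: the function names `binary_features` and `invert_columns`, the toy vocabulary, and the sample documents are all hypothetical. It assumes a document is already tokenized and that each known feature has an integer index.

```python
from collections import defaultdict


def binary_features(tokens, vocabulary):
    """Map a tokenized document to a sparse binary vector.

    Only the indices of features present in the document are stored,
    each with value 1; absent features are simply left out.
    """
    return {vocabulary[t]: 1 for t in set(tokens) if t in vocabulary}


def invert_columns(rows):
    """Build a column -> [row ids] index from a dict-of-dicts matrix.

    Only non-zero entries contribute, so each column lists exactly the
    rows (documents) in which that feature occurs.
    """
    index = defaultdict(list)
    for row_id in sorted(rows):
        for col in sorted(rows[row_id]):
            if rows[row_id][col] != 0:
                index[col].append(row_id)
    return dict(index)


# Hypothetical vocabulary: feature (term) -> feature index.
vocabulary = {"cheap": 0, "meeting": 1, "offer": 2, "report": 3}

# Hypothetical tokenized documents; "now" is not in the vocabulary.
docs = [
    ["cheap", "offer", "offer", "now"],
    ["meeting", "report"],
    ["cheap", "report"],
]

# Sparse matrix as a mapping of mappings: row id -> {column: value}.
matrix = {i: binary_features(d, vocabulary) for i, d in enumerate(docs)}

# Inverted index on columns with non-zero entries.
inverted = invert_columns(matrix)
```

Storing the matrix as a mapping of mappings keeps memory proportional to the number of non-zero entries. The inverted form makes it cheap to visit every document that contains a given feature, which is the access pattern a per-feature optimization step would need.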