onelearn.AMFClassifier

class onelearn.AMFClassifier(n_classes, n_estimators=10, step=1.0, loss='log', use_aggregation=True, dirichlet=None, split_pure=False, n_jobs=1, n_samples_increment=1024, random_state=None, verbose=False)[source]

Aggregated Mondrian Forest classifier for online learning. This algorithm is truly online, in the sense that a single pass is performed over the data and that predictions can be produced at any time.

Each node in a tree predicts according to the distribution of the labels it contains. This distribution is regularized using a “Jeffreys” prior with parameter dirichlet. For a class with count labels among the n_samples samples in a node, the prediction of the node is given by

(count + dirichlet) / (n_samples + dirichlet * n_classes)
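
As an illustration (the counts below are made up, not taken from the library internals), with n_classes=2 and the default dirichlet=0.5, a node holding 10 samples of which 7 belong to the class of interest predicts:

    count, n_samples = 7, 10         # illustrative node statistics
    dirichlet, n_classes = 0.5, 2    # default regularization for binary classification

    proba = (count + dirichlet) / (n_samples + dirichlet * n_classes)
    # proba == 7.5 / 11 ≈ 0.68, instead of the unregularized 7 / 10 = 0.7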

The prediction for a sample is computed as the aggregated predictions of all the subtrees along the path leading to the leaf node containing the sample. The aggregation weights are exponential weights with learning rate step and loss loss when use_aggregation is True.

This computation is performed exactly thanks to a context tree weighting algorithm. More details can be found in the paper cited in references below.

The final predictions are the average class probabilities predicted by each of the n_estimators trees in the forest.
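
A minimal usage sketch, assuming onelearn is installed; the synthetic stream, labels and batch size below are purely illustrative:

    import numpy as np
    from onelearn import AMFClassifier

    rng = np.random.RandomState(42)
    X = rng.randn(1000, 10)                       # illustrative stream of 1000 samples
    y = (X[:, 0] + X[:, 1] > 0).astype("int")     # illustrative binary labels

    amf = AMFClassifier(n_classes=2, random_state=42)

    # Single pass over the data, fed in mini-batches of 100 samples
    for start in range(0, X.shape[0], 100):
        amf.partial_fit(X[start:start + 100], y[start:start + 100])

    # Predictions can be produced at any time during the stream
    proba = amf.predict_proba(X[:5])              # array of shape (5, 2)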

Note

All the parameters of AMFClassifier become read-only after the first call to partial_fit.

References

  1. J. Mourtada, S. Gaiffas and E. Scornet, AMF: Aggregated Mondrian Forests for Online Learning, arXiv:1906.10529, 2019
__init__(n_classes, n_estimators=10, step=1.0, loss='log', use_aggregation=True, dirichlet=None, split_pure=False, n_jobs=1, n_samples_increment=1024, random_state=None, verbose=False)[source]

Instantiates an AMFClassifier instance.

Parameters:
  • n_classes (int) – Number of expected classes in the labels. This is required since we don’t know the number of classes in advance in an online setting.
  • n_estimators (int, default = 10) – The number of trees in the forest.
  • step (float, default = 1) – Step-size for the aggregation weights. Default is 1 for classification with the log-loss, which is usually the best choice.
  • loss ({"log"}, default = "log") – The loss used for the computation of the aggregation weights. Only “log” is supported for now, namely the log-loss for multi-class classification.
  • use_aggregation (bool, default = True) – Controls if aggregation is used in the trees. It is highly recommended to leave it as True.
  • dirichlet (float or None, default = None) – Regularization level of the class frequencies used for predictions in each node. Default is dirichlet=0.5 for n_classes=2 and dirichlet=0.01 otherwise.
  • split_pure (bool, default = False) – Controls if nodes that contain only samples of the same class (“pure” nodes) should be split. Default is False, namely pure nodes are not split, but True can sometimes be better.
  • n_jobs (int, default = 1) – Sets the number of threads used to grow the trees in parallel. The default is n_jobs=1, namely single-threaded. For now, this parameter has no effect and only a single thread can be used.
  • n_samples_increment (int, default = 1024) – Sets the minimum amount of memory which is pre-allocated each time extra memory is required for new samples and new nodes. Decreasing it can slow down training. If you know that each partial_fit will be called with approximately n samples, you can set n_samples_increment = n if n is larger than the default.
  • random_state (int or None, default = None) – Controls the randomness involved in the trees.
  • verbose (bool, default = False) – Controls the verbosity when fitting and predicting.
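
The sketch below instantiates the classifier with a few non-default settings; the particular values are illustrative, not recommendations:

    from onelearn import AMFClassifier

    amf = AMFClassifier(
        n_classes=3,        # required: number of expected classes in the labels
        n_estimators=50,    # more trees, at a higher memory and compute cost
        step=1.0,           # default step-size for the log-loss
        dirichlet=0.01,     # default regularization when n_classes > 2
        split_pure=True,    # also split nodes containing a single class
        random_state=0,     # reproducible trees
    )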

Methods

__init__(n_classes[, n_estimators, step, …]) Instantiates an AMFClassifier instance.
get_nodes_df(idx_tree)
partial_fit(X, y[, classes]) Updates the classifier with the given batch of samples.
partial_fit_helper(X, y) Updates the classifier with the given batch of samples.
predict_helper(X) Helper method for the predictions of the given feature vectors.
predict_proba(X) Predicts the class probabilities for the given feature vectors.
predict_proba_tree(X, tree) Predicts the class probabilities for the given feature vectors using a single tree at the given index tree.
weighted_depth_helper(X)

Attributes

dirichlet Regularization level of the class frequencies.
loss The loss used for the computation of the aggregation weights.
n_classes Number of expected classes in the labels.
n_estimators Number of trees in the forest.
n_features Number of features used during training.
n_jobs Number of threads used to grow the trees in parallel.
n_samples_increment Amount of memory pre-allocated each time extra memory is required.
random_state Controls the randomness involved in the trees.
split_pure Controls if nodes that contain only samples of the same class should be split.
step Step-size for the aggregation weights.
use_aggregation Controls if aggregation is used in the trees.
verbose Controls the verbosity when fitting and predicting.
dirichlet

Regularization level of the class frequencies.

Type:float or None
loss

The loss used for the computation of the aggregation weights.

Type:str
n_classes

Number of expected classes in the labels.

Type:int
n_estimators

Number of trees in the forest.

Type:int
n_features

Number of features used during training.

Type:int
n_jobs

Number of threads used to grow the trees in parallel.

Type:int
n_samples_increment

Amount of memory pre-allocated each time extra memory is required.

Type:int
partial_fit(X, y, classes=None)[source]

Updates the classifier with the given batch of samples.

Parameters:
  • X (np.ndarray, shape=(n_samples, n_features)) – Input features matrix.
  • y (np.ndarray) – Input labels vector.
  • classes (None) – Must not be used, only here for backwards compatibility.
Returns:

output – Updated instance of AMFClassifier

Return type:

AMFClassifier
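
Because partial_fit returns the updated classifier, successive updates can be chained; a small sketch with made-up mini-batches:

    import numpy as np
    from onelearn import AMFClassifier

    rng = np.random.RandomState(0)
    X1, y1 = rng.randn(64, 3), rng.randint(0, 2, size=64)   # first illustrative batch
    X2, y2 = rng.randn(64, 3), rng.randint(0, 2, size=64)   # second illustrative batch

    # Each call returns the same (updated) AMFClassifier instance
    amf = AMFClassifier(n_classes=2).partial_fit(X1, y1).partial_fit(X2, y2)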

partial_fit_helper(X, y)[source]

Updates the classifier with the given batch of samples.

Parameters:
  • X (np.ndarray, shape=(n_samples, n_features)) – Input features matrix.
  • y (np.ndarray) – Input labels vector.
Returns:

output – Updated instance of AMFClassifier

Return type:

AMFClassifier

predict_helper(X)[source]

Helper method for the predictions of the given feature vectors. This is used in the predict and predict_proba methods of AMFRegressor and AMFClassifier.

Parameters:X (np.ndarray, shape=(n_samples, n_features)) – Input features matrix to predict for.
Returns:output – Returns the predictions for the input features
Return type:np.ndarray
predict_proba(X)[source]

Predicts the class probabilities for the given feature vectors.

Parameters:X (np.ndarray, shape=(n_samples, n_features)) – Input features matrix to predict for.
Returns:output – Returns the predicted class probabilities for the input features
Return type:np.ndarray, shape=(n_samples, n_classes)
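
Only class probabilities are returned by the methods documented here, so hard class labels can be obtained with an argmax over the class axis; amf and X_test below stand for an already trained classifier and a test matrix (both hypothetical):

    import numpy as np

    proba = amf.predict_proba(X_test)   # shape (n_samples, n_classes)
    y_pred = np.argmax(proba, axis=1)   # most probable class for each sample
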
predict_proba_tree(X, tree)[source]

Predicts the class probabilities for the given feature vectors using a single tree at the given index tree. Should be used only for debugging or visualisation purposes.

Parameters:
  • X (np.ndarray, shape=(n_samples, n_features)) – Input features matrix to predict for.
  • tree (int) – Index of the tree, must be between 0 and n_estimators - 1
Returns:

output – Returns the predicted class probabilities for the input features

Return type:

np.ndarray, shape=(n_samples, n_classes)
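
For inspection, individual tree outputs can be compared to the forest output, which is their average (see the class description above); amf and X_test are again hypothetical stand-ins for a trained classifier and test data:

    import numpy as np

    proba_forest = amf.predict_proba(X_test)
    per_tree = np.stack(
        [amf.predict_proba_tree(X_test, t) for t in range(amf.n_estimators)]
    )

    # The forest prediction is the average of the per-tree class probabilities
    print(np.abs(per_tree.mean(axis=0) - proba_forest).max())   # expected to be close to 0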

random_state

Controls the randomness involved in the trees.

Type:int or None
split_pure

Controls if nodes that contain only samples of the same class should be split.

Type:bool
step

Step-size for the aggregation weights.

Type:float
use_aggregation

Controls if aggregation is used in the trees.

Type:bool
verbose

Controls the verbosity when fitting and predicting.

Type:bool