RAkELd: random label space partitioning with Label Powerset
class skmultilearn.ensemble.RakelD(base_classifier=None, labelset_size=3, base_classifier_require_dense=None)
Bases: skmultilearn.base.base.MLClassifierBase
Distinct RAndom k-labELsets multi-label classifier.
Divides the label space into disjoint partitions of size k, trains a Label Powerset classifier on each partition, and predicts by summing the results of all trained classifiers.
Parameters:
- base_classifier (sklearn.base) – the base classifier to use; it is stored under self.classifier for later access.
- base_classifier_require_dense ([bool, bool]) – whether the base classifier requires [input, output] matrices in dense representation; it is stored under self.require_dense.
- labelset_size (int) – the desired size of each partition, the parameter k in the paper. Defaults to 3, which the paper reports as giving the best results.
classifier_
the underlying classifier that performs the label space partitioning using a random clusterer, skmultilearn.ensemble.RandomLabelSpaceClusterer
Type: skmultilearn.ensemble.LabelSpacePartitioningClassifier
References
If you use this class please cite the paper introducing the method:
@ARTICLE{5567103,
  author={G. Tsoumakas and I. Katakis and I. Vlahavas},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  title={Random k-Labelsets for Multilabel Classification},
  year={2011},
  volume={23},
  number={7},
  pages={1079-1089},
  doi={10.1109/TKDE.2010.164},
  ISSN={1041-4347},
  month={July},
}
Examples
Here’s a simple example of how to use this class with a base classifier from scikit-learn to train non-overlapping classifiers, each on at most four labels:
    from sklearn.naive_bayes import GaussianNB
    from skmultilearn.ensemble import RakelD

    # X_train, y_train and X_test are assumed to be defined already
    classifier = RakelD(
        base_classifier=GaussianNB(),
        base_classifier_require_dense=[True, True],
        labelset_size=4
    )

    classifier.fit(X_train, y_train)
    # predict takes only the feature matrix
    prediction = classifier.predict(X_test)
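To make the partitioning idea in the class description more concrete, here is a minimal pure-Python sketch of the two steps it names: randomly splitting the label indices into disjoint groups of size k, then applying a Label Powerset encoding within one group. The helpers `partition_labels` and `label_powerset_encode` are hypothetical names used for illustration only; they are not part of scikit-multilearn and may differ from how the library partitions or encodes labels internally.

```python
import random


def partition_labels(n_labels, labelset_size, seed=None):
    """Randomly split label indices 0..n_labels-1 into disjoint groups.

    Hypothetical helper, not the library's implementation. Every group
    holds `labelset_size` labels except possibly the last, which holds
    whatever remains.
    """
    rng = random.Random(seed)
    indices = list(range(n_labels))
    rng.shuffle(indices)
    return [
        indices[start:start + labelset_size]
        for start in range(0, n_labels, labelset_size)
    ]


def label_powerset_encode(y_rows, label_ids):
    """Map each distinct 0/1 combination over `label_ids` to one class id.

    Hypothetical helper: class ids are assigned in order of first
    appearance, turning the multi-label problem restricted to
    `label_ids` into a single multi-class problem.
    """
    class_of = {}
    encoded = []
    for row in y_rows:
        combo = tuple(row[j] for j in label_ids)
        encoded.append(class_of.setdefault(combo, len(class_of)))
    return encoded, class_of


# Ten labels with k=3 give three groups of three and one group of one.
partitions = partition_labels(n_labels=10, labelset_size=3, seed=0)

# Encode a tiny indicator matrix restricted to labels 0 and 2.
y = [[1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 1, 0]]
encoded, class_of = label_powerset_encode(y, label_ids=[0, 2])
```

In this sketch each group would get its own multi-class base classifier trained on the encoded targets, which is the role `base_classifier` plays in the class above.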
fit(X, y)
Fit classifier to multi-label data
Parameters:
- X (numpy.ndarray or scipy.sparse) – input features, can be a dense or sparse matrix of size (n_samples, n_features)
- y (numpy.ndarray or scipy.sparse {0,1}) – binary indicator matrix with label assignments, shape (n_samples, n_labels)
Returns: fitted instance of self
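As a small illustration of the binary indicator format expected for y, the following sketch builds 0/1 rows from per-sample lists of label indices. The helper `to_indicator_matrix` is a hypothetical name, not a scikit-multilearn function, and it returns plain Python lists; in practice the result would presumably be wrapped in a numpy array or a scipy sparse matrix before being passed to fit.

```python
def to_indicator_matrix(label_lists, n_labels):
    """Turn per-sample lists of label indices into 0/1 rows of width n_labels.

    Hypothetical helper for illustration only.
    """
    matrix = []
    for labels in label_lists:
        row = [0] * n_labels
        for label in labels:
            row[label] = 1  # mark each assigned label
        matrix.append(row)
    return matrix


# Three samples, three labels: the second sample has no labels assigned.
y = to_indicator_matrix([[0, 2], [], [1]], n_labels=3)
```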
predict(X)
Predict label assignments
Parameters: X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns: binary indicator matrix with label assignments with shape (n_samples, n_labels)
Return type: scipy.sparse of int
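Because the partitions are disjoint, summing the per-partition results amounts to writing each partition's predictions into its own label columns. The sketch below shows that combination step in pure Python; `combine_partition_predictions` is a hypothetical helper, not the library's code, and it works on plain lists rather than the sparse matrices the method returns.

```python
def combine_partition_predictions(partitions, partition_predictions, n_labels):
    """Scatter per-partition predictions back into full-width label rows.

    Hypothetical helper for illustration only.
    partitions: list of disjoint lists of label indices.
    partition_predictions: for each partition, one row per sample with a
        0/1 value for each label in that partition, in the same order.
    """
    n_samples = len(partition_predictions[0])
    result = [[0] * n_labels for _ in range(n_samples)]
    for label_ids, rows in zip(partitions, partition_predictions):
        for sample, row in enumerate(rows):
            for label, value in zip(label_ids, row):
                # Disjoint partitions mean each cell receives one addend.
                result[sample][label] += value
    return result


partitions = [[2, 0], [1, 3]]
partition_predictions = [
    [[1, 0], [0, 1]],  # predictions for labels 2 and 0, two samples
    [[0, 1], [1, 1]],  # predictions for labels 1 and 3, two samples
]
full = combine_partition_predictions(partitions, partition_predictions, n_labels=4)
# full == [[0, 0, 1, 1], [1, 1, 0, 1]]
```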
predict_proba(X)
Predict label probabilities
Parameters: X (numpy.ndarray or scipy.sparse.csc_matrix) – input features of shape (n_samples, n_features)
Returns: matrix with the probability of each label assignment, shape (n_samples, n_labels)
Return type: scipy.sparse of float