Type Prediction
Supervised classifier to predict the type of a cluster.
- class gecco.types.TypeBinarizer(sklearn.preprocessing.MultiLabelBinarizer)
A MultiLabelBinarizer working with ClusterType instances.
- transform(y: List[ClusterType]) -> Iterable[Iterable[int]]
Transform the given label sets.
- Parameters:
y (iterable of iterables) – A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.
- Returns:
y_indicator (array or CSR matrix, shape (n_samples, n_classes)) – A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise.
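The indicator-matrix rule described above can be illustrated with a minimal pure-Python sketch. It does not use GECCO or scikit-learn; the function name binarize and the string labels are hypothetical stand-ins for ClusterType instances, chosen only to show the rule y_indicator[i, j] = 1 iff classes_[j] is in y[i].

```python
from typing import Hashable, Iterable, List, Sequence


def binarize(
    label_sets: Iterable[Iterable[Hashable]],
    classes: Sequence[Hashable],
) -> List[List[int]]:
    """Build rows such that rows[i][j] == 1 iff classes[j] is in label_sets[i]."""
    rows = []
    for labels in label_sets:
        members = set(labels)  # one membership set per sample
        rows.append([1 if cls in members else 0 for cls in classes])
    return rows


# Hypothetical class labels, in a fixed column order.
classes = ["NRP", "Polyketide", "Terpene"]
# One label set per sample; the last sample has no label at all.
samples = [{"Polyketide"}, {"NRP", "Polyketide"}, set()]

indicator = binarize(samples, classes)
print(indicator)
```

Each row corresponds to one sample and each column to one class, so a sample with several labels has several 1s in its row.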
- inverse_transform(yt: NDArray[numpy.bool_]) -> Iterable[ClusterType]
Transform the given indicator matrix into label sets.
- Parameters:
yt ({ndarray, sparse matrix} of shape (n_samples, n_classes)) – A matrix containing only 1s and 0s.
- Returns:
y (list of tuples) – The set of labels for each sample such that y[i] consists of classes_[j] for each yt[i, j] == 1.
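The reverse mapping can be sketched the same way. This is an illustration of the documented rule only, not GECCO's implementation: the helper debinarize and the string labels are hypothetical, and the sketch returns plain tuples as in the description above, whereas the annotated return type of this method is an iterable of ClusterType.

```python
from typing import Hashable, List, Sequence, Tuple


def debinarize(
    indicator: Sequence[Sequence[int]],
    classes: Sequence[Hashable],
) -> List[Tuple[Hashable, ...]]:
    """Return, for each row, the classes whose column holds a 1."""
    return [
        tuple(cls for cls, flag in zip(classes, row) if flag == 1)
        for row in indicator
    ]


# Hypothetical class labels, in the same column order as the matrix.
classes = ["NRP", "Polyketide", "Terpene"]
indicator = [[0, 1, 0], [1, 1, 0], [0, 0, 0]]

label_sets = debinarize(indicator, classes)
print(label_sets)
```

A row of zeros maps to an empty tuple, which represents a sample with no assigned type.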
- class gecco.types.TypeClassifier(object)
A wrapper to predict the type of a Cluster.
- classmethod trained(model_path: Optional[str] = None) -> TypeClassifier
Create a new TypeClassifier pre-trained with embedded data.
- Parameters:
model_path (str, optional) – The path to the model directory obtained with the gecco train command. If None is given, use the embedded training data.
- Returns:
TypeClassifier – A random forest model that can be used to perform cluster type predictions without training first.
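A short sketch of how the classmethod might be called, assuming the gecco package is installed. Only the documented signature trained(model_path=None) is used; the wrapper load_type_classifier is a hypothetical helper introduced for illustration.

```python
from typing import Optional


def load_type_classifier(model_path: Optional[str] = None):
    """Return a pre-trained TypeClassifier (hypothetical helper).

    With model_path=None the embedded training data is used; otherwise
    model_path should point to a directory produced by `gecco train`.
    """
    # Imported lazily so that defining this helper does not require GECCO;
    # calling it does require the package to be installed.
    from gecco.types import TypeClassifier

    return TypeClassifier.trained(model_path)
```

Calling load_type_classifier() with no argument would load the embedded model, while passing a directory path would load a model trained separately.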