API Reference

Data Model

ClusterType

An immutable container for the type of a gene cluster.

Strand

A flag to declare on which DNA strand a gene is located.

Domain

A conserved region within a protein.

Protein

A sequence of amino acids translated from a gene.

Gene

A nucleotide sequence coding for a protein.

Cluster

A sequence of contiguous genes.
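
To make the nesting of these entities concrete, here is a minimal sketch of how a domain, protein, gene and cluster could relate to one another. It does not reproduce GECCO's actual classes: every class name, field and coordinate convention below is a hypothetical stand-in chosen for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SketchStrand(Enum):
    """Hypothetical flag for the DNA strand a gene lies on."""
    CODING = 1
    REVERSE = -1


@dataclass(frozen=True)
class SketchDomain:
    """A conserved region, located by positions within its protein."""
    name: str
    start: int
    end: int


@dataclass
class SketchProtein:
    """An amino acid sequence carrying zero or more domains."""
    id: str
    seq: str
    domains: List[SketchDomain] = field(default_factory=list)


@dataclass
class SketchGene:
    """A nucleotide region (assumed 1-based, inclusive) coding for a protein."""
    start: int
    end: int
    strand: SketchStrand
    protein: SketchProtein


@dataclass
class SketchCluster:
    """A run of contiguous genes; its span is derived from its genes."""
    id: str
    genes: List[SketchGene] = field(default_factory=list)

    @property
    def start(self) -> int:
        return min(gene.start for gene in self.genes)

    @property
    def end(self) -> int:
        return max(gene.end for gene in self.genes)


# Two neighbouring genes grouped into one cluster.
g1 = SketchGene(100, 400, SketchStrand.CODING,
                SketchProtein("p1", "MKT", [SketchDomain("PF00001", 1, 3)]))
g2 = SketchGene(450, 900, SketchStrand.REVERSE, SketchProtein("p2", "MSA"))
cluster = SketchCluster("cluster_1", [g1, g2])
print(cluster.start, cluster.end)
```

In this sketch the cluster stores no coordinates of its own, so its span always stays consistent with the genes it contains.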

ClusterTable

A table storing condensed information from several clusters.

FeatureTable

A table storing condensed domain annotations from different genes.

ORF Extraction

ORFFinder

An abstract base class to provide a generic ORF finder.
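
As an illustration of the abstract-base-class pattern described here, the sketch below defines a generic finder interface and one deliberately naive implementation. The names `BaseFinder`, `NaiveFinder`, `find_orfs` and `min_codons` are assumptions made for this example and are not GECCO's API. The naive scan only looks at the forward strand and is far simpler than what Prodigal does.

```python
import abc
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class BaseFinder(abc.ABC):
    """Hypothetical interface: subclasses decide how ORFs are located."""

    @abc.abstractmethod
    def find_orfs(self, sequence: str) -> Iterator[Tuple[int, int]]:
        """Yield (start, end) half-open, 0-based coordinates of ORFs."""


class NaiveFinder(BaseFinder):
    """Scan the three forward frames for ATG ... stop stretches."""

    def __init__(self, min_codons: int = 2) -> None:
        # Minimum number of codons before the stop codon (assumed default).
        self.min_codons = min_codons

    def find_orfs(self, sequence: str) -> Iterator[Tuple[int, int]]:
        seq = sequence.upper()
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= self.min_codons:
                        yield (start, i + 3)  # include the stop codon
                    start = None


orfs = list(NaiveFinder().find_orfs("CCATGAAATTTTAGGG"))
print(orfs)  # [(2, 14)]
```

Because callers depend only on the abstract interface, a different finder can be swapped in without changing the code that consumes the ORFs.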

PyrodigalFinder

An ORFFinder that uses the Pyrodigal bindings to Prodigal.

Domain Annotation

PyHMMER

A domain annotator that uses pyhmmer.

BGC Detection

ClusterCRF

A wrapper around sklearn_crfsuite.CRF that works with the GECCO data model.

BGC Extraction

ClusterRefiner

A post-processor to extract contiguous clusters from CRF predictions.
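
The idea of extracting contiguous clusters from per-gene predictions can be sketched as a run-length scan over a list of probabilities. This is only an illustration under stated assumptions: the function name `contiguous_runs` and the 0.5 threshold are hypothetical, and the actual refiner may apply additional criteria that are not shown here.

```python
from typing import List, Sequence, Tuple


def contiguous_runs(
    probabilities: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, for illustration only
) -> List[Tuple[int, int]]:
    """Return (first, last) index pairs of maximal runs with p >= threshold."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, p in enumerate(probabilities):
        if p >= threshold:
            if start is None:
                start = i  # a new run opens at this gene
        elif start is not None:
            runs.append((start, i - 1))  # the run closed at the previous gene
            start = None
    if start is not None:
        # The last run extends to the end of the sequence.
        runs.append((start, len(probabilities) - 1))
    return runs


# One probability per gene, in genomic order.
probs = [0.1, 0.8, 0.9, 0.7, 0.2, 0.1, 0.6, 0.95]
print(contiguous_runs(probs))  # [(1, 3), (6, 7)]
```

Each returned pair marks the first and last gene of a candidate region, which could then be wrapped in a cluster object.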

Type Prediction

TypeBinarizer

A MultiLabelBinarizer working with ClusterType instances.
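
Multi-label binarization turns a set of labels per sample into a fixed-width 0/1 row, one column per known class. The sketch below shows the idea in plain Python rather than through scikit-learn or GECCO. The helper names `binarize` and `unbinarize` and the class names used are illustrative assumptions.

```python
from typing import Iterable, List, Sequence, Set


def binarize(samples: Iterable[Set[str]], classes: Sequence[str]) -> List[List[int]]:
    """Encode each label set as a 0/1 row following the order of `classes`."""
    return [[1 if c in labels else 0 for c in classes] for labels in samples]


def unbinarize(rows: Iterable[Sequence[int]], classes: Sequence[str]) -> List[Set[str]]:
    """Decode 0/1 rows back into label sets."""
    return [{c for c, bit in zip(classes, row) if bit} for row in rows]


# Illustrative class names; a cluster may carry several types, or none.
classes = ["NRP", "Polyketide", "Terpene"]
samples = [{"Polyketide"}, {"NRP", "Polyketide"}, set()]

encoded = binarize(samples, classes)
print(encoded)  # [[0, 1, 0], [1, 1, 0], [0, 0, 0]]
print(unbinarize(encoded, classes) == samples)  # True
```

Fixing the column order up front keeps the encoding stable across datasets, which matters when a classifier is trained on one set of clusters and applied to another.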

TypeClassifier

A wrapper to predict the type of a Cluster.

InterPro Metadata

InterPro

A subset of the InterPro database exposing domain metadata.

InterProEntry

A single entry in the InterPro database.