ORF Extraction

Generic protocol for ORF detection in DNA sequences.

class gecco.orf.ORFFinder(object)

An abstract base class to provide a generic ORF finder.

abstract find_genes()

Find all genes from a DNA sequence.
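The abstract-base-class pattern described above can be sketched as follows. This is a minimal illustration, not the GECCO source: the ORFFinder below is a local stand-in, the records argument and the (record_index, start, end) return shape are assumptions, and NaiveFinder is a hypothetical subclass that only scans the forward strand for ATG...stop pairs.

```python
import abc
from typing import Iterable, List, Tuple


class ORFFinder(abc.ABC):
    """Local stand-in for the documented abstract base class (assumption)."""

    @abc.abstractmethod
    def find_genes(self, records: Iterable[str]) -> List[Tuple[int, int, int]]:
        """Return (record_index, start, end) for every gene found."""


class NaiveFinder(ORFFinder):
    """Hypothetical finder: forward-strand ATG...stop scan, illustration only."""

    STOPS = {"TAA", "TAG", "TGA"}

    def find_genes(self, records: Iterable[str]) -> List[Tuple[int, int, int]]:
        genes = []
        for index, dna in enumerate(records):
            dna = dna.upper()
            # Scan each of the three forward reading frames codon by codon.
            for frame in range(3):
                start = None
                for pos in range(frame, len(dna) - 2, 3):
                    codon = dna[pos:pos + 3]
                    if start is None and codon == "ATG":
                        start = pos
                    elif start is not None and codon in self.STOPS:
                        # Half-open coordinates, stop codon included.
                        genes.append((index, start, pos + 3))
                        start = None
        return genes


genes = NaiveFinder().find_genes(["ATGAAATAA"])
```

Because find_genes is declared abstract, instantiating the base class directly raises a TypeError; only subclasses that implement the method can be created.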

class gecco.orf.PyrodigalFinder(ORFFinder)

An ORFFinder that uses the Pyrodigal bindings to Prodigal.

Prodigal is a fast and reliable protein-coding gene prediction tool for prokaryotic genomes, with support for draft genomes and metagenomes.

__init__(metagenome: bool = True, cpus: int = 0) → None

Create a new PyrodigalFinder instance.

  • metagenome (bool) – Whether to run Prodigal in metagenome mode. Defaults to True.

  • cpus (int) – The number of threads to use to run Pyrodigal in parallel. Pass 0 to use the number of CPUs on the machine.
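The convention for cpus described above (0 meaning "use every CPU on the machine") can be sketched with the standard library. Both helpers here, resolve_cpus and map_in_threads, are hypothetical names introduced for illustration; how PyrodigalFinder resolves the value internally is not shown in this page.

```python
import os
from multiprocessing.pool import ThreadPool
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def resolve_cpus(cpus: int = 0) -> int:
    """Hypothetical helper: map 0 (or less) to the machine's CPU count."""
    if cpus > 0:
        return cpus
    # os.cpu_count() may return None on some platforms, so fall back to 1.
    return os.cpu_count() or 1


def map_in_threads(func: Callable[[T], R], items: Iterable[T], cpus: int = 0) -> List[R]:
    """Apply func to every item using a pool of resolve_cpus(cpus) threads."""
    with ThreadPool(resolve_cpus(cpus)) as pool:
        return pool.map(func, items)


lengths = map_in_threads(len, ["ATG", "ATGA"], cpus=2)
```

A thread pool is used here only because the parameter is documented as a number of threads; the per-record function is a placeholder for the actual gene prediction call.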


find_genes(records, progress=None)

Find all genes contained in a sequence of DNA records.

  • records (iterable of SeqRecord) – An iterable of DNA records in which to find genes.

  • progress (callable, optional) – A progress callback with signature progress(record, total), called every time a record has been processed successfully, where record is the SeqRecord instance and total is the total number of records to process.
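A callback matching the progress(record, total) signature described above might look like the sketch below. To stay self-contained it uses plain strings in place of SeqRecord objects, and process_records is a hypothetical driver that only mimics how a finder could invoke the callback; neither name comes from GECCO.

```python
from typing import Callable, Iterable, Optional


class ProgressCounter:
    """Callable that records how many records have been reported so far."""

    def __init__(self) -> None:
        self.done = 0
        self.total: Optional[int] = None

    def __call__(self, record: object, total: int) -> None:
        self.done += 1
        self.total = total


def process_records(
    records: Iterable[object],
    progress: Optional[Callable[[object, int], None]] = None,
) -> None:
    """Hypothetical driver: call progress once per processed record."""
    records = list(records)  # materialize so the total is known up front
    total = len(records)
    for record in records:
        # Gene finding on the record would happen here.
        if progress is not None:
            progress(record, total)


counter = ProgressCounter()
process_records(["rec1", "rec2", "rec3"], progress=counter)
```

Since the callback receives the total on every call, it can drive a progress bar without knowing the number of records in advance.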