ORF Extraction
Generic protocol for ORF detection in DNA sequences.
- class gecco.orf.ORFFinder(object)
An abstract base class to provide a generic ORF finder.
- abstract find_genes(records: Iterable[Bio.SeqRecord.SeqRecord], progress: Optional[Callable[[Bio.SeqRecord.SeqRecord, int], None]] = None) -> Iterable[gecco.model.Gene]
Find all genes in the given DNA records.
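The abstract interface above can be illustrated with a small standalone sketch. It does not use GECCO or Biopython: `Record`, `ToyGene`, `ToyORFFinder` and `NaiveForwardFinder` are hypothetical stand-ins introduced only for this example, and the naive forward-strand scan is not how GECCO or Prodigal predict genes.

```python
import abc
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical stand-ins: an (identifier, sequence) tuple replaces
# Bio.SeqRecord.SeqRecord, and ToyGene replaces gecco.model.Gene.
Record = Tuple[str, str]


@dataclass
class ToyGene:
    record_id: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive (includes the stop codon)


class ToyORFFinder(abc.ABC):
    """Abstract interface mirroring the shape of ORFFinder.find_genes."""

    @abc.abstractmethod
    def find_genes(
        self,
        records: Iterable[Record],
        progress: Optional[Callable[[Record, int], None]] = None,
    ) -> Iterable[ToyGene]:
        ...


class NaiveForwardFinder(ToyORFFinder):
    """Toy finder: ATG-to-stop ORFs on the forward strand only."""

    STOPS = {"TAA", "TAG", "TGA"}

    def find_genes(self, records, progress=None) -> Iterator[ToyGene]:
        records = list(records)  # materialize to know the total
        for record in records:
            record_id, seq = record
            seq = seq.upper()
            for frame in range(3):
                start = None
                # Walk the frame codon by codon.
                for i in range(frame, len(seq) - 2, 3):
                    codon = seq[i:i + 3]
                    if start is None and codon == "ATG":
                        start = i
                    elif start is not None and codon in self.STOPS:
                        yield ToyGene(record_id, start, i + 3)
                        start = None
            # Report the record once it has been fully processed.
            if progress is not None:
                progress(record, len(records))
```

Any concrete finder written this way can be swapped for another one without changing the calling code, which is the point of the abstract base class.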
- class gecco.orf.PyrodigalFinder(ORFFinder)
An ORFFinder that uses the Pyrodigal bindings to Prodigal.
Prodigal is a fast and reliable protein-coding gene prediction tool for prokaryotic genomes, with support for draft genomes and metagenomes.
- __init__(metagenome: bool = True, mask: bool = False, cpus: int = 0) -> None
Create a new PyrodigalFinder instance.
- Parameters
metagenome (bool) – Whether or not to run Prodigal in metagenome mode, defaults to True.
mask (bool) – Whether or not to mask genes running across regions containing unknown nucleotides, defaults to False.
cpus (int) – The number of threads to use to run Pyrodigal in parallel. Pass 0 to use the number of CPUs on the machine.
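The convention that cpus=0 means "use every CPU on the machine" can be sketched as follows. `resolve_thread_count` is a hypothetical helper written for this example, not a GECCO function, and the handling of negative values is an assumption.

```python
import os


def resolve_thread_count(cpus: int = 0) -> int:
    """Hypothetical helper: map a ``cpus`` argument to a thread count."""
    if cpus < 0:
        # Assumption: negative values are treated as an error here.
        raise ValueError("cpus must be zero or a positive integer")
    if cpus == 0:
        # os.cpu_count() may return None on some platforms.
        return os.cpu_count() or 1
    return cpus
```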
- find_genes(records: typing.Iterable[Bio.SeqRecord.SeqRecord], progress: typing.Optional[typing.Callable[[Bio.SeqRecord.SeqRecord, int], None]] = None, *, pool_factory: typing.Union[typing.Type[multiprocessing.pool.Pool], typing.Callable[[typing.Optional[int]], multiprocessing.pool.Pool]] = multiprocessing.pool.ThreadPool) -> Iterator[gecco.model.Gene]
Find all genes contained in a sequence of DNA records.
- Parameters
records (iterable of SeqRecord) – An iterable of DNA records in which to find genes.
progress (callable, optional) – A progress callback of signature progress(record, total) that will be called every time a record has been processed successfully, with record being the SeqRecord instance, and total being the total number of records to process.
- Keyword Arguments
pool_factory (type) – The callable for creating pools, defaults to the multiprocessing.pool.ThreadPool class, but multiprocessing.pool.Pool is also supported.
- Yields
Gene – An iterator over all the genes found in the given records.
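The pool_factory and progress arguments described above follow a common pattern that can be sketched with the standard library alone. Everything in the example is hypothetical: `process_records` and `count_start_codons` are illustrative names, the `cpus` keyword is an assumption, (identifier, sequence) tuples stand in for SeqRecord objects, and counting ATG codons stands in for gene prediction. It is not the PyrodigalFinder implementation.

```python
import multiprocessing.pool
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord.
Record = Tuple[str, str]


def count_start_codons(record: Record) -> Tuple[Record, int]:
    """Toy per-record task standing in for gene prediction."""
    _, seq = record
    return record, seq.upper().count("ATG")


def process_records(
    records: Iterable[Record],
    progress: Optional[Callable[[Record, int], None]] = None,
    *,
    pool_factory: Callable[[Optional[int]], multiprocessing.pool.Pool]
    = multiprocessing.pool.ThreadPool,
    cpus: int = 2,
) -> Iterator[Tuple[str, int]]:
    """Run the toy task on every record using a caller-chosen pool."""
    records = list(records)  # materialize to know the total
    with pool_factory(cpus) as pool:
        # imap yields results in input order as they become available.
        for record, count in pool.imap(count_start_codons, records):
            if progress is not None:
                progress(record, len(records))
            yield record[0], count
```

Passing multiprocessing.pool.Pool as the factory runs the work in separate processes instead of threads. In that case the task function and the records must be picklable, which is why the toy task is defined at module level.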