Fasta
A module for parsing FASTA text files.
- class gfftk.fasta.FASTA(fasta_file)
Bases: object
Class for handling FASTA files.
- get_seq(contig)
Get the sequence for a contig.
- Args:
contig (str): Contig name
- Returns:
str: Sequence for the contig
- gfftk.fasta.fasta2dict(fasta, full_header=False)
Read FASTA file to dictionary.
Analogous to Biopython's SeqIO.to_dict(), but the returned dictionary is keyed by contig name and each value is the sequence string.
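The header-to-sequence mapping described above can be sketched in plain Python. This is an illustrative stand-in, not gfftk's implementation: the function name `read_fasta_to_dict` is hypothetical, and the sketch assumes that `full_header=False` means the key is the header text up to the first whitespace.

```python
import os
import tempfile


def read_fasta_to_dict(path, full_header=False):
    """Hypothetical stand-in for fasta2dict: map each header to its sequence string."""
    chunks = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                parts = header.split()
                # Assumption: without full_header, keep only the text before the first whitespace.
                name = header if full_header or not parts else parts[0]
                chunks[name] = []
            elif name is not None:
                # Sequence lines may be wrapped, so collect and join them later.
                chunks[name].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


# Small demonstration on a temporary two-record FASTA file.
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
    tmp.write(">contig1 example description\nACGT\nACGT\n>contig2\nGGCC\n")
    demo_path = tmp.name

demo_seqs = read_fasta_to_dict(demo_path)
demo_full = read_fasta_to_dict(demo_path, full_header=True)
os.remove(demo_path)
print(demo_seqs)
```

With gfftk installed, the equivalent call would presumably be `fasta2dict(path)`, with the result used the same way (for example `seqs["contig1"]`).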
- gfftk.fasta.fasta2headers(fasta, full_header=False)
Read FASTA file to a set of headers.
Simple function that reads a FASTA file and returns the set of contig names.
- gfftk.fasta.fasta2lengths(fasta, full_header=False)
Read FASTA file to dictionary of sequence lengths.
Reads a FASTA file (optionally gzipped) and returns a dictionary with contig header names as keys and sequence lengths as values.
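As a rough illustration of reading an optionally gzipped FASTA file into a length dictionary, the sketch below uses only the standard library. The name `read_fasta_lengths` is hypothetical, and two behaviours are assumptions rather than documented gfftk behaviour: gzip input is detected by a `.gz` file extension, and `full_header=False` truncates the header at the first whitespace.

```python
import gzip
import os
import shutil
import tempfile


def read_fasta_lengths(path, full_header=False):
    """Hypothetical stand-in for fasta2lengths: map each header to its sequence length."""
    # Assumption: compressed input is recognised by its ".gz" suffix.
    opener = gzip.open if path.endswith(".gz") else open
    lengths = {}
    name = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                header = line[1:]
                parts = header.split()
                name = header if full_header or not parts else parts[0]
                lengths[name] = 0
            elif line and name is not None:
                # Add the length of each wrapped sequence line.
                lengths[name] += len(line)
    return lengths


# Demonstration on a temporary gzipped FASTA file.
demo_dir = tempfile.mkdtemp()
demo_gz = os.path.join(demo_dir, "demo.fa.gz")
with gzip.open(demo_gz, "wt") as out:
    out.write(">contig1 desc\nACGT\nAC\n>contig2\nGGCCGG\n")

demo_lengths = read_fasta_lengths(demo_gz)
demo_headers = set(demo_lengths)  # the same pass also yields the set of contig names
shutil.rmtree(demo_dir)
print(demo_lengths)
```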
- gfftk.fasta.getSeqRegions(seqs, header, coordinates, coords=False)
From sequence dictionary return spliced coordinates.
Takes a sequence dictionary (i.e. from fasta2dict), the contig name (header), and the coordinates to fetch (a list of tuples).
- Parameters:
- Returns:
result – returns spliced DNA sequence
- Return type:
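The splicing idea behind getSeqRegions can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the helper name `splice_regions` is hypothetical, coordinates are assumed to be 1-based and inclusive (the GFF convention), regions are assumed to be joined in ascending order of start position, and the `coords` flag is not modelled because its behaviour is not described here.

```python
def splice_regions(seqs, header, coordinates):
    """Hypothetical sketch: concatenate the given regions of one contig."""
    contig = seqs[header]
    pieces = []
    # Assumption: (start, end) pairs are 1-based, inclusive, and joined in sorted order.
    for start, end in sorted(coordinates):
        pieces.append(contig[start - 1:end])
    return "".join(pieces)


demo_contigs = {"contig1": "AAACCCGGGTTT"}
# Two "exons": bases 1-3 and 7-9, deliberately given out of order.
spliced = splice_regions(demo_contigs, "contig1", [(7, 9), (1, 3)])
print(spliced)  # AAAGGG
```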
- gfftk.fasta.translate(dna, strand, phase, table=1)
Translates DNA sequence into proteins.
Takes a DNA (or rather cDNA) sequence and translates it into a protein (amino acid) sequence. It requires the DNA sequence, the strand, the translation phase, and the translation table.
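To make the roles of strand and phase concrete, here is a self-contained sketch of translation with the standard genetic code (NCBI table 1). It is not gfftk's implementation. The name `translate_sketch` is hypothetical, and the sketch assumes that strand is "+" or "-", that a minus-strand input is reverse-complemented before translation, and that phase is the number of leading bases to skip, as in GFF3. The real function may treat any of these differently.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate_sketch(dna, strand, phase):
    """Hypothetical sketch of strand- and phase-aware translation."""
    seq = dna.upper()
    if strand == "-":
        # Assumption: minus-strand input is reverse-complemented first.
        seq = seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    # Assumption: phase is the number of bases to skip before the first full codon.
    seq = seq[phase:]
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons with ambiguous bases are reported as "X".
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


print(translate_sketch("ATGGCCTAA", "+", 0))   # MA*
print(translate_sketch("CATGGCCTAA", "+", 1))  # MA*
print(translate_sketch("TTAGGCCAT", "-", 0))   # MA*
```

The table is built from the compact 64-character string rather than written out codon by codon, which keeps the sketch short. Supporting other translation tables would mean swapping in a different amino acid string.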