iCLIP.utils.TranscriptCoordInterconverter

class iCLIP.utils.TranscriptCoordInterconverter(transcript, introns=False)
Interconvert between genome domain and transcript domain coordinates.
As there are expected to be many calls against the same transcript, time can be saved by precomputation. Overlapping exons are merged.
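The merging step mentioned above can be illustrated with a minimal sketch. The helper `merge_overlapping` is a hypothetical name, not part of iCLIP, and the sketch assumes 0-based half-open `(start, end)` intervals and that abutting intervals are merged as well; the class may handle these details differently.

```python
def merge_overlapping(intervals):
    """Merge overlapping or abutting half-open (start, end) intervals.

    Hypothetical helper for illustration only; not the iCLIP implementation.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two overlapping exons followed by a separate one (made-up coordinates).
exons = [(100, 108), (105, 110), (112, 119)]
print(merge_overlapping(exons))  # [(100, 110), (112, 119)]
```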
Parameters:
- transcript (sequence of CGAT.GTF.Entry) – Set of GTF entries representing a transcript.
- introns (bool, optional) – Use introns instead of exons (see below).

transcript_id
Value of the transcript_id field of the transcript in question. Taken from the transcript_id field of the first entry in transcript.
Type: str

strand
Strand of the transcript. Taken from the strand field of the first entry in transcript.
Type: str

offset
Position of the start of the transcript in genome coordinates.
Type: int

genome_intervals
Coordinates of exons (or introns) in genome space, expressed as the difference from offset. These are sorted in transcript order (see below).
Type: list of tuples of int

transcript_intervals
Coordinates of exons (or introns) in transcript space, that is, the absolute distance from the transcription start site after splicing.
Type: list of tuples of int

length
Total length of the intervals (exons or introns) in the transcript.
Type: int

Notes
Imagine the following transcript:
chr1 protein_coding exon 100 108 . - . transcript_id "t1"; gene_id "g1";
chr1 protein_coding exon 112 119 . - . transcript_id "t1"; gene_id "g1";
chr1 protein_coding exon 124 131 . - . transcript_id "t1"; gene_id "g1";
We can visualise the relationship between the different coordinate domains as below:
Genome coordinates:   1         1         1         1
                      0         1         2         3
                      0123456789012345678901234567890
Transcript:           |<<<<<<|----|<<<<<|-----|<<<<<|
Transcript Coords:     2             1              0
                      10987654    3210987     6543210
with `introns=True`:          8765       43210

Thus the intervals representing the exons in the transcript domain are (0, 7), (7, 14), (14, 22), and the genome base 115 corresponds to transcript base 10.
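The arithmetic behind this example can be reproduced with a short standalone sketch. It does not use iCLIP: the function names are hypothetical, the exons are written as 0-based half-open intervals read off the diagram (an assumption), and genome positions are kept absolute rather than relative to offset.

```python
def build_transcript_intervals(exons, strand):
    """Return (exons in transcript order, matching transcript-space intervals).

    Hypothetical helper; exons are 0-based half-open (start, end) tuples.
    """
    # On the minus strand the transcript starts at the highest genome coordinate.
    ordered = sorted(exons, reverse=(strand == "-"))
    transcript_intervals = []
    pos = 0
    for start, end in ordered:
        transcript_intervals.append((pos, pos + end - start))
        pos += end - start
    return ordered, transcript_intervals


def genome_to_transcript(pos, exons, strand):
    """Map an absolute genome position inside an exon to a transcript position."""
    ordered, t_intervals = build_transcript_intervals(exons, strand)
    for (g_start, g_end), (t_start, _) in zip(ordered, t_intervals):
        if g_start <= pos < g_end:
            if strand == "-":
                # Count backwards from the last base of the exon.
                return t_start + (g_end - 1 - pos)
            return t_start + (pos - g_start)
    raise ValueError("position %d is not inside any interval" % pos)


EXONS = [(100, 108), (112, 119), (124, 131)]  # assumed from the diagram
_, t_intervals = build_transcript_intervals(EXONS, "-")
print(t_intervals)                            # [(0, 7), (7, 14), (14, 22)]
print(genome_to_transcript(115, EXONS, "-"))  # 10
```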
TranscriptCoordInterconverter.genome2transcript should be the inverse of TranscriptCoordInterconverter.transcript2genome.
That is if:
myConverter = TranscriptCoordInterconverter(transcript)
then:
myConverter.genome2transcript(myConverter.transcript2genome(x)) == x
and:
myConverter.transcript2genome(myConverter.genome2transcript(x)) == x
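The round-trip property can be checked on the example transcript with a self-contained sketch. The functions `g2t` and `t2g` below are hypothetical stand-ins written for this illustration, not the class methods, and they assume a minus-strand transcript with 0-based half-open exons. In this sketch the second identity is only tested for positions that fall inside an exon.

```python
# Exons of the example transcript in transcript order (minus strand),
# as 0-based half-open intervals; these values are an assumption.
EXONS = [(124, 131), (112, 119), (100, 108)]


def g2t(pos):
    """Genome position -> transcript position (minus strand, exonic only)."""
    offset = 0
    for start, end in EXONS:
        if start <= pos < end:
            return offset + (end - 1 - pos)
        offset += end - start
    raise ValueError("genome position %d is not exonic" % pos)


def t2g(pos):
    """Transcript position -> genome position (minus strand)."""
    offset = 0
    for start, end in EXONS:
        length = end - start
        if offset <= pos < offset + length:
            return end - 1 - (pos - offset)
        offset += length
    raise ValueError("transcript position %d is out of range" % pos)


total_length = sum(end - start for start, end in EXONS)

# Every transcript position survives the round trip through genome space.
for t in range(total_length):
    assert g2t(t2g(t)) == t

# Every exonic genome position survives the round trip through transcript space.
for start, end in EXONS:
    for g in range(start, end):
        assert t2g(g2t(g)) == g
```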

__init__(transcript, introns=False)
Precompute the conversions for each exon.

Methods
- genome2transcript(pos) – Convert a genome coordinate into a transcript coordinate.
- genome_interval2transcript(interval) – Convert an interval in genome coordinates into an interval in transcript coordinates.
- transcript2genome(pos) – Convert a transcript coordinate into a genome coordinate.
- transcript_interval2genome_intervals(interval) – Convert an interval in transcript coordinates into a list of intervals in genome coordinates.
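A single transcript interval can map to several genome intervals because it may span an exon junction. The sketch below shows the idea for a minus-strand transcript. The function name is hypothetical, the intervals are assumed to be 0-based and half-open, and the ordering of the returned list is a choice made for this example rather than documented behaviour of the class.

```python
def transcript_interval_to_genome(interval, exons_in_transcript_order):
    """Split a transcript-space interval into genome-space intervals.

    Hypothetical minus-strand-only sketch; not the iCLIP implementation.
    """
    t_start, t_end = interval
    pieces = []
    offset = 0  # transcript coordinate at which the current exon begins
    for g_start, g_end in exons_in_transcript_order:
        length = g_end - g_start
        # Portion of the requested interval that lies within this exon.
        lo = max(t_start, offset)
        hi = min(t_end, offset + length)
        if lo < hi:
            # On the minus strand transcript position t maps to
            # g_end - 1 - (t - offset), so the interval is mirrored.
            pieces.append((g_end - (hi - offset), g_end - (lo - offset)))
        offset += length
    return sorted(pieces)


# Example transcript, exons in transcript order (assumed coordinates).
EXONS = [(124, 131), (112, 119), (100, 108)]

# Transcript bases 5..9 cross the first exon junction.
print(transcript_interval_to_genome((5, 10), EXONS))  # [(116, 119), (124, 126)]
```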