iCLIP.meta.get_binding_matrix

iCLIP.meta.get_binding_matrix(bamfile, gtflike_iterator, align_at=0, bin_size=25, left_margin=500, right_margin=10000)

Get a matrix containing the binding counts across all the genes in the iterator, binned into equal-sized bins.

Transcripts/genes are collapsed to just their exons, but sequence upstream and downstream of the ends of the transcript/gene is added. Zero-count transcripts/genes are excluded.

Parameters:
  • bamfile (*_getter func) – Function used to access cross-links, as returned by make_getter()
  • gtflike_iterator (iter of CGAT.GTF.Entry) – iterator returning sequences of CGAT.GTF.Entry objects. Note overlapping exons will be merged.
  • align_at (int or pandas.Series, optional) – Position in the transcript at which to align the profiles from each transcript. The default (0) means the start of the gene/transcript. If an int, the same position is used for every gene/transcript. If a pandas.Series, the index should contain transcript_ids.
  • bin_size (int, optional) – Number of bases to include in a single bin. Depth will be summed across these bins.
  • left_margin, right_margin (int, optional) – Number of bases to include to the left and right of the alignment point.
Returns:

Columns are bins and rows are genes.

Return type:

pandas.DataFrame
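
Example:

The following is a minimal sketch of the binning and matrix layout described above. It is not the iCLIP implementation. It assumes the per-base cross-link counts for each transcript are already available as a pandas.Series indexed by position in exon-collapsed transcript coordinates. The helper bin_profile and the toy data are hypothetical and exist only for illustration. Small margins are used so the result is easy to read.

```python
import numpy as np
import pandas as pd


def bin_profile(counts, align_at=0, bin_size=25,
                left_margin=500, right_margin=10000):
    """Hypothetical helper: sum per-base counts into fixed-width bins.

    counts is assumed to be a pandas.Series of cross-link counts indexed
    by position in transcript coordinates.
    """
    # Express every position relative to the alignment point.
    rel = counts.index.to_numpy(dtype=int) - align_at
    values = counts.to_numpy(dtype=float)

    # Keep only bases inside [-left_margin, right_margin).
    keep = (rel >= -left_margin) & (rel < right_margin)

    # Bin 0 starts at -left_margin; each bin spans bin_size bases.
    bin_index = (rel[keep] + left_margin) // bin_size
    n_bins = int(np.ceil((left_margin + right_margin) / bin_size))
    sums = np.bincount(bin_index, weights=values[keep], minlength=n_bins)

    # Label each bin by its start position relative to the alignment point.
    labels = np.arange(n_bins) * bin_size - left_margin
    return pd.Series(sums.astype(int), index=labels)


# Toy per-transcript count profiles (assumed data, not real output).
profiles = {
    "tx1": pd.Series({0: 2, 10: 1, 30: 4}),
    "tx2": pd.Series({5: 3, 12: 1}),
    "tx3": pd.Series(dtype=float),  # no cross-links at all
}

# A Series keyed by transcript_id gives a different alignment point per row.
align_at = pd.Series({"tx1": 0, "tx2": 10, "tx3": 0})

# Rows are transcripts, columns are bins.
matrix = pd.DataFrame({
    tid: bin_profile(counts, align_at=align_at[tid],
                     bin_size=10, left_margin=10, right_margin=30)
    for tid, counts in profiles.items()
}).T

# Exclude zero-count transcripts.
matrix = matrix[matrix.sum(axis=1) > 0]
print(matrix)
```

In this sketch tx3 is dropped because it has no counts, and the count at position 30 of tx1 falls outside the right margin, so it does not contribute to any bin.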