ISAR: Isoform Structure Alignment Representation

The structures of eukaryotic genes are complex. Many coding sequences have been, and continue to be, observed through various experimental techniques. A convenient and comprehensive cross-species representation of isoforms can help in the comparative analysis of the expression, function and evolution of (alternative) transcripts. We address this need by introducing the Isoform Structure Alignment Representation (ISAR). ISAR combines a data structure (iDAG) with an algorithm for a consistent Multiple Sequence Alignment (MSA) of isoforms from sets of orthologous and paralogous genes that is aware of gene, transcript, and exon-intron structure. An efficient algorithm constructs ISAR(iDAG)s from large sets of gene and isoform sequences by successively integrating highly confident candidate alignments. The approach is based on partially ordered sets and novel operations that allow maximal consistent alignments to be represented in a sparse graph data structure. Candidate alignments can be obtained from diverse sources, so any set of given alignments can be integrated and converted into an ISAR. ISARs allow the systematic classification and detailed exploration of exon-intron structure across large sets of phylogenetic taxa, and the efficient prediction of new isoforms across phylogenetically distant species.

Isoform Structure Representation for human RAB1A, RAB1B and yeast YPT1

Quick Start

Start ISAR from the command line:

java -jar ISAR-1.0.jar -in <input files> -out <output file>

Compute an ISAR and produce a FASTA multiple alignment output and a graphical representation of the alignment for the sequences provided in the ISAR example directory:
java -jar ISAR-1.0.jar -in examples/RAB1A.fasta,examples/RAB1B.fasta,examples/YPT1.fasta \
     -out RAB -fasta -exon_view
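
In the example above, the input files are passed to -in as a single comma-separated list. For larger sets of sequences, that list can be assembled by a small helper. The following is a minimal sketch: join_inputs is a hypothetical helper name and not part of ISAR, and the script only prints the resulting command line so it can be inspected before it is run.

```shell
#!/bin/sh
# join_inputs is a hypothetical helper (not part of ISAR): it joins the
# given file names with commas, the separator used for -in above.
join_inputs() {
    out=""
    for f in "$@"; do
        if [ -z "$out" ]; then
            out="$f"
        else
            out="$out,$f"
        fi
    done
    printf '%s\n' "$out"
}

# Build the list from the example files named above.
inputs=$(join_inputs examples/RAB1A.fasta examples/RAB1B.fasta examples/YPT1.fasta)

# Print the command instead of executing it.
echo "java -jar ISAR-1.0.jar -in $inputs -out RAB -fasta -exon_view"
```

Removing the echo and the surrounding quotes runs the command directly, using only the options shown in the example above.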

Generate inputs and compute an ISAR for human and mouse PAX6 isoforms:
perl scripts/ -s homo_sapiens -g ENSG00000007372 > PAX6_human.fasta
perl scripts/ -s mus_musculus -g ENSMUSG00000027168 > PAX6_mouse.fasta

java -jar ISAR-1.0.jar -in PAX6_mouse.fasta,PAX6_human.fasta -out PAX6 -fasta -exon_view
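
In a multiple alignment written as FASTA, all rows are conventionally padded with gap characters to the same length. Assuming the output of the -fasta option follows this convention (an assumption, not confirmed here), a quick sanity check could look like the sketch below. check_alignment is a hypothetical helper, and the file name in the usage comment is a placeholder, since the file actually written for -out PAX6 may be named differently.

```shell
#!/bin/sh
# check_alignment is a hypothetical helper (not part of ISAR).
# It reads a FASTA file and reports whether all sequences have the same
# length, as expected for a gapped multiple alignment.
check_alignment() {
    awk '
        # Compare the record just finished against the first record.
        function finish_record() {
            if (!have) return
            n++
            if (n == 1) {
                expected = len
            } else if (len != expected) {
                printf "%s: length %d, expected %d\n", name, len, expected
                bad = 1
            }
        }
        # Header line: close the previous record and start a new one.
        /^>/ { finish_record(); have = 1; name = substr($0, 2); len = 0; next }
        # Sequence line: ignore whitespace and add up the residues and gaps.
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END {
            finish_record()
            if (n == 0) { print "no sequences found"; exit 1 }
            if (!bad) printf "%d sequences, alignment length %d\n", n, expected
            exit bad + 0
        }
    ' "$1"
}

# Usage (placeholder file name, adjust to the file ISAR actually wrote):
#   check_alignment PAX6.fasta
```

The function exits with a non-zero status when rows differ in length, so it can be used as a guard in a larger script.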

See the README in the ISAR distribution for further details.

Conserved Exon Skippings

We systematically identified exon skipping events and checked their conservation in 10 species ranging from human to yeast. The classifications are provided for download:

Conserved Events README


In case of problems or questions concerning ISAR, please do not hesitate to contact Robert Pesch or Gergely Csaba.