Note that throughout this analysis we consider "humanized" alignments: multiple alignments from which every column that is gapped in the human sequence has been removed. Ideally, this leaves us with the human protein-coding sequence and the homologous residues in the other species.
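As an illustration of what "humanizing" means, here is a minimal Python sketch. It is not the code used for this analysis: the function name humanize, the dictionary representation of an alignment, the row label "human" and the toy sequences are all assumptions made for the example.

```python
def humanize(alignment, reference="human", gap_chars="-."):
    """Drop every alignment column in which the reference row has a gap.

    `alignment` maps a species name to its aligned sequence; all rows
    must have the same length. (Hypothetical representation.)
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned rows must have the same length")
    ref_row = alignment[reference]
    # Indices of the columns where the reference (human) has a residue.
    keep = [i for i, ch in enumerate(ref_row) if ch not in gap_chars]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment, invented for illustration only.
toy = {
    "human": "ATG--CCGTAA",
    "dog":   "ATGAACC-TAA",
    "mouse": "ATG--CCGTGA",
}
humanized = humanize(toy)
```

After this step the human row contains no gaps, so column i of every row corresponds to position i of the human coding sequence; insertions relative to human (the dog "AA" in the toy example) are discarded.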
Having looked at many of the protein-coding* parts of the alignments generated by multiple genomic sequence aligners, we were dissatisfied: many of them do not look like good protein alignments (e.g. gap lengths that are not multiples of 3, internal stop codons, missing start codons). We realised that some could be improved by using a protein-aware aligner, and we have used the Prank alignment program to achieve this (see below for details of the procedure).
*according to the annotation of the human sequences
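The three symptoms listed above can be checked mechanically. The sketch below is only an illustration, not the procedure we used: the function name check_coding_row is hypothetical, and it assumes a single row of a humanized alignment that covers a complete coding sequence, so that columns 0-2, 3-5, and so on follow the human reading frame and the last codon is the expected stop.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_row(row):
    """Return a list of problems found in one row of a humanized coding alignment."""
    row = row.upper()
    problems = []
    # A gap run whose length is not a multiple of 3 implies a frameshift.
    for match in re.finditer(r"-+", row):
        length = len(match.group())
        if length % 3 != 0:
            problems.append(f"gap of length {length} at column {match.start()}")
    # Split into codons in the human frame, ignoring any trailing partial codon.
    usable = len(row) - len(row) % 3
    codons = [row[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("missing start codon")
    # Every codon except the last should be a sense codon.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"internal stop codon {codon} at codon {index}")
    return problems


# Toy rows, invented for illustration only.
clean = check_coding_row("ATGCCGTAA")
frameshift = check_coding_row("ATGCC-TAA")
internal_stop = check_coding_row("ATGTAGCCGTAA")
no_start = check_coding_row("---CCGTAA")
```

Codons that contain gap characters are never reported as stop codons here; a stricter check might want to flag them separately.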
You can see the re-aligned transcripts by going to this main index page and clicking on the link for the region you want (e.g. ENm001). Then find the gene/transcript combination you're interested in (left-hand column) and compare the alignments you get by clicking on 'alignment' in the "prank" (re-aligned) and "tba" (original alignment) columns. For example, for region ENm001, gene AC106873.3, transcript 004, compare the start of the dog sequence in the prank realignment (click here) and the tba alignment (click here).
Computer-readable versions of the re-aligned protein-coding regions are available from Ari Loytynoja (ari@ebi.ac.uk).
Some people may also be interested in the flanking regions of the exons, for example to study splice sites and the regions immediately upstream and downstream of start and stop codons. If you would like access to the exon-level re-alignments, including flanking regions, contact Ari Loytynoja (ari@ebi.ac.uk).
(We hope to improve this procedure by extending each organism's sequence by 50bp upstream and downstream, instead of relying on the TBA alignment at this stage. Note that even this will not help in cases where TBA has completely missed an exon in an organism, as simply extending the multiple alignment by 50bp in each direction will not generally find a missing exon.)
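For concreteness, the planned 50bp extension amounts to widening each organism's aligned interval and clamping it to the ends of the underlying sequence. The sketch below is purely illustrative: the function name extend_interval and the 0-based, half-open coordinate convention are assumptions, and nothing like this is part of the current procedure.

```python
def extend_interval(start, end, seq_length, flank=50):
    """Widen a 0-based, half-open interval [start, end) by `flank` bases
    on each side, without running off either end of the sequence."""
    if not 0 <= start <= end <= seq_length:
        raise ValueError("interval must lie within the sequence")
    return max(0, start - flank), min(seq_length, end + flank)


# An exon fragment well inside a 1000bp sequence, and one close to both ends.
inner = extend_interval(100, 250, 1000)
clamped = extend_interval(20, 980, 1000)
```

As noted above, this only helps when part of the exon has already been aligned; an exon that is missing entirely from an organism's row gives no interval to extend.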