Abstract HMG2002
Categorization and characterization
of transcript-confirmed constitutively and alternatively
spliced introns and exons from human.
Francis Clark (1)
and T. A. Thanaraj (2,3)
ABSTRACT
By spliced alignment of human DNA and transcript sequence
data, we constructed a data set of transcript-confirmed
exons and introns from 2793 genes, 796 of which (28%)
were seen to have multiple isoforms. We find that over
one third of human exons can translate in more than
one frame, and that this is highly correlated with G+C
content. Introns containing adenosine at donor site
position +3 (A3), rather than guanosine (G3), are more
common in low G+C regions, while the converse is true
in high G+C regions. These two classes of introns are
shown to have distinct lengths, consensus sequences,
and correlations among splice signals, leading to the
hypothesis that A3 donor sites are associated with exon
definition, and G3 donor sites with intron definition.
Minor classes of introns, including GC-AG, U12-type
GT-AG, weak, and putative AG-dependent introns, are identified
and characterized. Cassette exons are more prevalent
in low G+C regions, while exon isoforms are more prevalent
in high G+C regions. Cassette exon events outnumber
other alternative events, while exon isoform events
involve truncation twice as often as extension, and
occur at acceptor sites twice as often as at donor sites.
Alternative splicing is usually associated with weak
splice signals and, in a majority of cases, preserves
the coding frame. The reported characteristics of constitutive
and alternative splice signals, and the hypotheses offered
regarding alternative splicing and genome organization,
have important implications for experimental research
into RNA processing. The ‘AltExtron’ data
sets are available at http://www.bit.uq.edu.au/altExtron/
and http://www.ebi.ac.uk/asd/altextron/.
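
As an illustration of two terms used in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that donor site position +1 is the first base of the intron, so that position +3 is the third intronic base, and it assumes that an exon "can translate" in a frame when that frame contains no in-frame stop codon; the abstract does not state either definition, and the function names `donor_class` and `open_frames` are hypothetical.

```python
# Assumption: standard stop codons on the DNA (sense) strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def donor_class(intron_seq):
    """Classify an intron as 'A3', 'G3', or 'other' by its base at donor position +3.

    Assumes intron_seq starts at donor position +1 (e.g. the G of GT).
    """
    if len(intron_seq) < 3:
        return "other"
    base = intron_seq[2].upper()
    return {"A": "A3", "G": "G3"}.get(base, "other")


def open_frames(exon_seq):
    """Return the frame offsets (0, 1, 2) that contain no in-frame stop codon.

    This is an assumed reading of 'can translate in a frame'.
    """
    seq = exon_seq.upper()
    frames = []
    for offset in range(3):
        # Walk complete codons only, starting at the given offset.
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


if __name__ == "__main__":
    print(donor_class("GTAAGT"))     # intron beginning GTA...
    print(donor_class("GTGAGT"))     # intron beginning GTG...
    print(open_frames("ATGTAAGCC"))  # frame 0 contains the stop codon TAA
```

Under these assumptions, an exon for which `open_frames` returns more than one offset would count among those that can translate in more than one frame.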
(1) Advanced Computational Modelling Centre, University
of Queensland, 4072, Australia. email: fc@maths.uq.edu.au.
(2) European Bioinformatics Institute, Wellcome Trust
Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. email:
thanaraj@ebi.ac.uk.
(3) To whom correspondence may be addressed. email:
thanaraj@ebi.ac.uk.