
AltExtron Database - Flat files

AltExtron is a computer-generated, high-quality dataset of human transcript-confirmed constitutive and alternative exons and introns, together with the delineated events. This page links to the various data files; documentation of each file format can be found by following the individual file links. As part of the general documentation, illustrative diagrams of splice signals and alternative events are also available.

Note that although this page has three sections, all of the data form an integral part of the AltExtron database, which can also be queried through web interfaces (as described in the access pages).

Release 1 of the AltExtron database is available for nine species, all based on gene data from the February 2003 release of EMBL. The tar files below unpack into the same files as those listed in the next section of this page, and the documentation is the same.

Arabidopsis   arabi.webdata.tar (35 Mb)
C. elegans    celegans.webdata.tar (25 Mb)
Chicken       chicken.webdata.tar (800 Kb)
Cow           cow.webdata.tar (450 Kb)
Drosophila    droso.webdata.tar (30 Mb)
Human         human.webdata.tar (55 Mb)
Mouse         mouse.webdata.tar (10 Mb)
Rat           rat.webdata.tar (920 Kb)
Zebrafish     zfish.webdata.tar (2 Mb)
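
A minimal sketch of how one of these archives could be inspected before unpacking, assuming the tar file has already been downloaded to the local disk. The function name list_members and the local path in the usage comment are hypothetical and not part of the AltExtron distribution.

```python
import tarfile


def list_members(archive_path):
    """Return the names of the regular files stored in a tar archive."""
    with tarfile.open(archive_path, "r") as tar:
        # Keep only regular files; directories and links are skipped.
        return [member.name for member in tar.getmembers() if member.isfile()]


# Hypothetical usage, assuming human.webdata.tar is in the working directory:
#     for name in list_members("human.webdata.tar"):
#         print(name)
```

Listing the members first makes it easy to check that an archive contains the same files as those described in the prototype human section before extracting anything.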


            Confirmed           Genes with at least one confirmed
Species     Introns   Exons     Intron/exon   Alternative event   d/c
            (a)       (b)       (c)           (d)
Human       42922     29853     5584          2581                46%
Droso       33241     20595     7881          1302                16%
Mouse       11461     8288      1766          658                 37%
C.elegans   41214     23043     9552          598                 6%
Arabi       67455     51733     13302         758                 5.7%
Chicken     1115      811       228           42
Cow         594       367       177           20
Rat         1263      846       339           34
Zebrafish   1227      829       257           21
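
The d/c column is the share of genes with at least one confirmed intron/exon (c) that also show at least one alternative event (d). As a rough sketch, it can be recomputed from the counts in the table. The names GENE_COUNTS and alt_fraction_percent are illustrative only, and the printed values may differ from the table in the last digit depending on how the original figures were rounded.

```python
# (c, d) pairs copied from the table above, for the species with a listed d/c.
GENE_COUNTS = {
    "Human": (5584, 2581),
    "Droso": (7881, 1302),
    "Mouse": (1766, 658),
    "C.elegans": (9552, 598),
    "Arabi": (13302, 758),
}


def alt_fraction_percent(c, d):
    """Percentage of genes in column c that also appear in column d."""
    return 100.0 * d / c


if __name__ == "__main__":
    for species, (c, d) in GENE_COUNTS.items():
        print(f"{species:<10} {alt_fraction_percent(c, d):5.1f}%")
```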


Distribution of observed alternative events.
Species     Intron retention   Exon isoforms   Intron isoforms   Cassette exons
Human       383                2046            3132              3844
Droso       111                494             1294              797
Mouse       127                440             718               267
C.elegans   50                 229             424               267
Arabi       156                372             633               70
Chicken     4                  26              37                27
Cow         0                  12              20                25
Rat         1                  15              26                22
Zebrafish   0                  11              16                9

Prototype Human Data

The data files in this section belong to the prototype human AltExtron database. The tar files linked above for multiple organisms (from the latest build) are organised in a similar way to the files below.

The sequence and alignment data:

The intron and exon flat and table files:

The alternatively spliced introns and exons:

Other Data:

The data given above is the subject of the paper "Categorisation and characterization of transcript-confirmed constitutively and alternatively spliced introns and exons from human". Clark, F. and T.A. Thanaraj. (Human Molecular Genetics 2002 11: 451-464). [abstract]

Human-mouse comparison

The data given here are human introns and alternative splicing events that are conserved in mouse.

Notes on the data files - read this first

I. Results of matching human splice junctions with mouse transcript sequences.

I.1.1 Conserved Human Constitutive Introns
I.1.2 Sequences of human and mouse exon constructs for the Conserved Constit Introns
I.2.1 Conserved Human Alternative Introns
I.2.2 Sequences of human and mouse exon constructs for the Conserved Alt Introns

II. Results of the analysis of mapping the matching mouse transcript sequences onto mouse draft genome sequence.

II.1 Summary of the analysis of matching the mouse exon constructs with mouse genome sequence.
II.2.1 Human splice junctions that are unambiguously seen in mouse genes also
II.2.2 Notes on the above data file II.2.1.
II.2.2 Human splice junctions that are seen (with some ambiguity) in mouse genes also
II.2.3 Lengths of the conserved introns as seen in human, and in mouse

III. Delineating the conserved alternative splicing events.

III.1.1 Conserved Human Alternatively spliced Events
III.1.2 Effects on protein sequences due to conserved Human Alternatively spliced Events
III.1.3 Protein product names of the genes in which conserved alternative splice events were observed

The data given in this section is the subject of the paper "Conservation of Human Alternative splice Events in Mouse". Thanaraj, T.A., F. Clark and J. Muilu. Communicated

Human GC-AG introns

A specialised dataset on GC-AG intron isoforms.

These data are the subject of the paper "GC-AG alternative intron isoforms with weak donor sites show enhanced consensus at acceptor exon positions". Thanaraj, T.A. and F. Clark. Nucleic Acids Research 29:2581-2593 (2001). [abstract, full text pdf, full text html]

Mirror site

All data on this page are mirrored at the University of Queensland at the following URL: http://bit.uq.edu.au/altExtron/
