Experiments and array designs in ArrayExpress are given unique accession numbers in the following formats:
- E-XXXX-n for experiments
- A-XXXX-n for array designs
XXXX represents a four letter code and n is a number.
E.g. E-MEXP-568, A-UHNC-18. Some experiments also have secondary accession numbers.
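The format described above can be checked mechanically. The following is a minimal sketch in Python, assuming that the four letter code is always written in uppercase ASCII letters and that n is a plain decimal number. The names `ACCESSION_RE`, `Accession` and `parse_accession` are illustrative only and are not part of any ArrayExpress tool.

```python
import re
from typing import NamedTuple

# Assumed pattern: "E" or "A", a four letter uppercase code, then a number.
ACCESSION_RE = re.compile(r"([EA])-([A-Z]{4})-(\d+)")


class Accession(NamedTuple):
    kind: str    # "experiment" or "array design"
    code: str    # four letter data source code, e.g. "MEXP"
    number: int  # numeric part of the accession


def parse_accession(text: str) -> Accession:
    """Split an accession such as 'E-MEXP-568' into its three parts."""
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid accession: {text!r}")
    prefix, code, number = match.groups()
    kind = "experiment" if prefix == "E" else "array design"
    return Accession(kind, code, int(number))


print(parse_accession("E-MEXP-568"))
print(parse_accession("A-UHNC-18"))
```

A string that does not follow either format, such as "X-MEXP-568", raises a `ValueError` in this sketch.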
Accession numbers are generated when sufficient meta-data and data files are provided for a submission. Please refer to our FAQ for more details about obtaining an accession number for your data set.
The four letter code in the accession number indicates the data source. The source can be:
- ArrayExpress submission tools: MEXP for MIAMExpress (deprecated since July 2014), MTAB for Annotare, and TABM for Tab2MAGE (deprecated since January 2012).
- Data management tools / submission pipelines from other research organizations (see full list below).
Note that the 4 letter code does not necessarily tell you which organization performed the experiment or manufactured the array design.
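To illustrate how the code maps to a data source rather than to an organization, the sketch below looks up the three ArrayExpress submission tool codes named above and treats every other code as one that has to be looked up in the full list. The names `SUBMISSION_TOOLS` and `describe_source` are hypothetical, and the function assumes its input is already a well-formed accession.

```python
# ArrayExpress submission tool codes and their status, as listed above.
SUBMISSION_TOOLS = {
    "MEXP": ("MIAMExpress", "deprecated since July 2014"),
    "MTAB": ("Annotare", None),
    "TABM": ("Tab2MAGE", "deprecated since January 2012"),
}


def describe_source(accession: str) -> str:
    """Describe the data source implied by the four letter code."""
    # Assumes a well-formed accession such as "E-MEXP-568".
    code = accession.split("-")[1]
    if code in SUBMISSION_TOOLS:
        tool, status = SUBMISSION_TOOLS[code]
        return f"{tool} ({status})" if status else tool
    # Any other code has to be looked up in the full list of codes.
    return "other source (see the full list of codes)"


print(describe_source("E-MEXP-568"))
print(describe_source("A-UHNC-18"))
```

The result describes only the submission route. It says nothing about who performed the experiment or manufactured the array.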
Please note that some codes are no longer in active use: they are not used in accession numbers for new submissions but are still valid. Some URLs are dead, or point to webpages which are no longer maintained, but they are included in the table below for completeness.
|AFMX||Affymetrix data sets processed by an EBI in-house script|
|ATMX||Arabidopsis experiments and array designs submitted through the At-MIAMExpress submission tool||tool deprecated|
|BAIR||Biological Atlas of Insulin Resistance (BAIR) project||www.bair.org.uk|
|BASE||BASE microarray data management tool||base.thep.lu.se|
|BIOD||BioDiscovery microarray data management tool||www.biodiscovery.com|
|BUGS||Bacterial Microarray Group at St George's, University of London (BuG@S)||www.bugs.sgul.ac.uk|
|CAGE||Compendium of Arabidopsis Gene Expression plant developmental time series project||www.cagecompendium.org/index.htm and www.ebi.ac.uk/microarray/cage|
|CBIL||Computational Biology and Informatics Laboratory at University of Pennsylvania||www.cbil.upenn.edu|
|DKFZ||German Cancer Research Center||www.dkfz.de|
|DORD||DDBJ Omics Archive (DOR)||http://trace.ddbj.nig.ac.jp/dor|
|EMBL||European Molecular Biology Laboratory||www.embl.org|
|ERAD||European Read Archive Data (pipeline submission from the Wellcome Trust Sanger Institute; the "European Read Archive" has since been renamed the "Sequence Read Archive" (SRA) as part of INSDC)||NAR article on the SRA|
|FLYC||FlyChip Microarray services, Cambridge Systems Biology Centre, UK||www.flychip.org.uk|
|FPMI||Functional Pathogenomics of Mucosal Immunity project||www.pathogenomics.ca/fpmi|
|GEHB||GE Healthcare array designs||www.gehealthcare.com|
|GEOD||NCBI Gene Expression Omnibus (GEO)||www.ncbi.nlm.nih.gov/geo. See also How data is imported from GEO.|
|GEUV||Genetic European Variation in Health and Disease (GEUVADIS), A European Medical Sequencing Consortium||www.geuvadis.org/web/geuvadis/home|
|HGMP||Human Genome Mapping Project Resource Centre (now closed)||www.geneservice.co.uk/home|
|IPKG||Leibniz Institute of Plant Genetics and Crop Plant Research (IPK-Gatersleben)||www.ipk-gatersleben.de/Internet|
|JCVI||J. Craig Venter Institute||www.jcvi.org|
|JJRD||Johnson & Johnson Pharmaceutical Research and Development||www.jnjpharmarnd.com|
|MANP||Coding of experiment was manually prepared by EBI staff|
|MARS||Microarray Analysis and Retrieval System (MARS) from Graz University of Technology, Institute for Genomics and Bioinformatics||genome.tugraz.at|
|MAXD||University of Manchester, Microarray Group maxd software||www.bioinf.manchester.ac.uk/microarray/maxd|
|MEXP||EBI MIAMExpress webform submission tool (deprecated since July 2014)||www.ebi.ac.uk/miamexpress|
|MIMR||MiMiR data warehouse at the Microarray Centre, Clinical Sciences Centre, Medical Research Council||www.csc.mrc.ac.uk|
|MNIA||Laboratory of Genetics, National Institute on Aging, National Institutes of Health|
|MTAB||Experiments in MAGE-TAB format, submitted via the MTAB spreadsheet submission tool (retired in September 2014), or via Annotare||http://www.ebi.ac.uk/fg/annotare/login|
|MUGN||Integrated Functional Genomics in Mutant Mouse Models as Tools to Investigate the Complexity of Human Immunological Disease (MUGEN) project||www.mugen-noe.org and www.ebi.ac.uk/fg/mugen|
|NASC||European Arabidopsis Stock Centre||arabidopsis.info|
|NCMF||Netherlands Cancer Institute Central Microarray Facility||microarrays.nki.nl|
|RUBN||Gerry Rubin's lab||http://www.hhmi.org/research/groupleaders/rubin.html|
|RZPD||RZPD German Resource Center for Genome Research (no longer accessible)||www.rzpd.de (old URL)|
|SGRP||Saccharomyces Genome Resequencing Project, Wellcome Trust Sanger Institute||www.sanger.ac.uk/Teams/Team71/durbin/sgrp|
|SMDB||Stanford Microarray Database (moving to Princeton University, as of January 2013)||http://smd.stanford.edu|
|SNGR||Wellcome Trust Sanger Institute||www.sanger.ac.uk|
|TABM||EBI Tab2MAGE spreadsheet submission tool (deprecated since January 2012)|
|TIGR||The Institute for Genomic Research (now part of the J. Craig Venter Institute)||www.tigr.org|
|UCON||The Hutchison/MRC Research Centre||www.hutchison-mrc.cam.ac.uk|
|UHNC||University Health Network Canada||www.uhn.ca|
|UMCU||University Medical Center Utrecht||www.umcutrecht.nl/zorg|
|WMIT||Whitehead Institute for Biomedical Research / Massachusetts Institute of Technology||www.wi.mit.edu|
Some experiments or array designs have a secondary accession number which links
to another ArrayExpress experiment/array design, or to an external data source. The
secondary accession number is shown in the "Links" section on an experiment's page.
For example, for experiment
For array designs, the secondary accession number can be found in the header of the
actual design file (in tab-delimited text format), e.g. for array design
Common reasons for secondary accession numbers
- Data re-analysis: if the submitted data is a re-analysis of another dataset, the accession number of the original experiment will be the secondary accession number.
- NCBI GEO-imported data: experimental data imported from the NCBI Gene Expression Omnibus (GEO) will have the GEO series (prefix "GSE") or data set (prefix "GDS") identifier as the secondary accession number. Similarly, array designs will have the GEO platform (prefix "GPL") accession number. For GEO-imported data, the ArrayExpress primary accession is derived directly from the GEO secondary accession, e.g. GEO accession "GSE12345" is associated with ArrayExpress accession "E-GEOD-12345". See also How data is imported from GEO.
- High-throughput sequencing experiments: they will have a link to the European Nucleotide Archive (ENA) or European Genome-phenome Archive (EGA) where the raw data files (e.g. fastq reads) are kept.
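The correspondence between GEO and ArrayExpress accessions for imported data can be expressed as a small conversion. This is a minimal sketch: the "GSE" to "E-GEOD" rule follows the example given above, while the "GPL" to "A-GEOD" rule for array designs is an assumption made by analogy and should be verified before use. The names `GEO_TO_AE_PREFIX` and `geo_to_arrayexpress` are hypothetical.

```python
import re

# GEO prefix -> ArrayExpress accession prefix.
# "GSE" follows the example in the text; "GPL" -> "A-GEOD" is an
# assumption made by analogy, not something the text states.
GEO_TO_AE_PREFIX = {"GSE": "E-GEOD", "GPL": "A-GEOD"}


def geo_to_arrayexpress(geo_accession: str) -> str:
    """Derive the expected ArrayExpress accession from a GEO accession."""
    match = re.fullmatch(r"(GSE|GPL)(\d+)", geo_accession.strip())
    if match is None:
        raise ValueError(f"unsupported GEO accession: {geo_accession!r}")
    prefix, number = match.groups()
    return f"{GEO_TO_AE_PREFIX[prefix]}-{number}"


print(geo_to_arrayexpress("GSE12345"))  # E-GEOD-12345
```

GEO data set identifiers are deliberately rejected here, because the text gives no corresponding ArrayExpress accession pattern for them.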