Accession Codes

Experiments and array designs in ArrayExpress are given unique accession numbers in the following formats:

  • E-XXXX-n for experiments
  • A-XXXX-n for array designs

XXXX represents a four-letter code and n is a number.

E.g. E-MEXP-568, A-UHNC-18. Some experiments also have secondary accession numbers.
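The format described above can be checked mechanically. The following is a minimal sketch, not an official ArrayExpress tool: the names `ACCESSION_RE`, `Accession` and `parse_accession` are hypothetical, and the pattern assumes the four-letter code is always upper-case ASCII letters, as in the examples on this page.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, derived from the description above: a type letter
# (E for experiments, A for array designs), a four-letter code, and a
# number, separated by hyphens.
ACCESSION_RE = re.compile(r"([EA])-([A-Z]{4})-(\d+)")


class Accession(NamedTuple):
    kind: str    # "experiment" or "array design"
    source: str  # four-letter code, e.g. "MEXP"
    number: int  # numeric part, e.g. 568


def parse_accession(text: str) -> Optional[Accession]:
    """Split an accession into its parts, or return None if it does not match."""
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        return None
    kind = "experiment" if match.group(1) == "E" else "array design"
    return Accession(kind, match.group(2), int(match.group(3)))


print(parse_accession("E-MEXP-568"))
print(parse_accession("A-UHNC-18"))
```

Strings that do not follow the pattern (for example a GEO identifier such as "GSE12345") yield `None` rather than raising an error, which makes the function convenient for filtering mixed lists of identifiers.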

Accession numbers are generated when sufficient meta-data and data files are provided for a submission. Please refer to our FAQ for more details about obtaining an accession number for your data set.

The four-letter code in the accession number indicates the data source. The source can be:

  • ArrayExpress submission tools: MEXP for MIAMExpress (deprecated since July 2014), MTAB for Annotare, and TABM for Tab2MAGE (deprecated since January 2012).
  • Data management tools / submission pipelines from other research organizations.

Note that the four-letter code does not necessarily tell you which organization performed the experiment or manufactured the array design.
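As an illustration of reading the data source from an accession, here is a small sketch. The function `data_source` and the dictionary `SUBMISSION_TOOLS` are hypothetical names; the dictionary holds only the three ArrayExpress submission-tool codes mentioned above, and every other code falls back to a generic message.

```python
# Submission-tool codes as listed above; this is deliberately not the full
# list of four-letter codes, which could be added in the same way.
SUBMISSION_TOOLS = {
    "MEXP": "MIAMExpress (deprecated since July 2014)",
    "MTAB": "Annotare",
    "TABM": "Tab2MAGE (deprecated since January 2012)",
}


def data_source(accession: str) -> str:
    """Describe the data source indicated by the four-letter code."""
    parts = accession.strip().split("-")
    if len(parts) != 3 or len(parts[1]) != 4:
        raise ValueError(f"not an ArrayExpress accession: {accession!r}")
    # The code identifies the submission route, not necessarily the
    # organization that performed the experiment.
    return SUBMISSION_TOOLS.get(parts[1], "other source (see the full list of codes)")


print(data_source("E-MEXP-568"))
print(data_source("A-UHNC-18"))
```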

Full list of four-letter codes

Code Source URL
AFFY Affymetrix www.affymetrix.com
AFMX Affymetrix data sets processed by an EBI in-house script  
AGIL Agilent www.agilent.com
ATMX Arabidopsis experiments and array designs submitted through the At-MIAMExpress submission tool (tool deprecated)
BAIR Biological Atlas of Insulin Resistance (BAIR) project www.bair.org.uk
BASE BASE microarray data management tool base.thep.lu.se
BIOD BioDiscovery microarray data management tool www.biodiscovery.com
BUGS Bacterial Microarray Group at St George's, University of London (BuG@S) www.bugs.sgul.ac.uk
CAGE Compendium of Arabidopsis Gene Expression plant developmental time series project www.cagecompendium.org/index.htm and www.ebi.ac.uk/microarray/cage
CBIL Computational Biology and Informatics Laboratory at University of Pennsylvania www.cbil.upenn.edu
DKFZ German Cancer Research Center www.dkfz.de
EMBL European Molecular Biology Laboratory www.embl.org
ERAD European Read Archive Data (pipeline submission from the Wellcome Trust Sanger Institute; the "European Read Archive" has since been renamed the "Sequence Read Archive" (SRA) as part of INSDC) www.sanger.ac.uk (see also the NAR article on the SRA)
FLYC FlyChip Microarray services, Cambridge Systems Biology Centre, UK www.flychip.org.uk
FPMI Functional Pathogenomics of Mucosal Immunity project www.pathogenomics.ca/fpmi
GEHB GE Healthcare array designs www.gehealthcare.com
GEOD NCBI Gene Expression Omnibus (GEO) www.ncbi.nlm.nih.gov/geo. See also "How data is imported from GEO".
HGMP Human Genome Mapping Project Resource Centre (now closed) www.geneservice.co.uk/home
IPKG Leibniz Institute of Plant Genetics and Crop Plant Research (IPK-Gatersleben) www.ipk-gatersleben.de/Internet
JCVI J. Craig Venter Institute www.jcvi.org
JJRD Johnson & Johnson Pharmaceutical Research and Development www.jnjpharmarnd.com
LGCL LGC Limited www.lgc.co.uk
MANP Coding of experiment was manually prepared by EBI staff  
MARS Microarray Analysis and Retrieval System (MARS) from Graz University of Technology, Institute for Genomics and Bioinformatics genome.tugraz.at
MAXD University of Manchester, Microarray Group maxd software www.bioinf.manchester.ac.uk/microarray/maxd
MEXP EBI MIAMExpress submission tool www.ebi.ac.uk/miamexpress
MIMR MiMiR data warehouse at the Microarray Centre, Clinical Sciences Centre, Medical Research Council www.csc.mrc.ac.uk
MNIA Laboratory of Genetics, National Institute on Aging, National Institutes of Health lgsun.grc.nia.nih.gov
MTAB EBI MTAB spreadsheet submission tool www.ebi.ac.uk/cgi-bin/microarray/magetab.cgi
MUGN Integrated Functional Genomics in Mutant Mouse Models as Tools to Investigate the Complexity of Human Immunological Disease (MUGEN) project www.mugen-noe.org and www.ebi.ac.uk/fg/mugen
NASC European Arabidopsis Stock Centre arabidopsis.info
NCMF Netherlands Cancer Institute Central Microarray Facility microarrays.nki.nl
NGEN Nimblegen www.nimblegen.com
RUBN Gerry Rubin's lab http://www.hhmi.org/research/groupleaders/rubin.html
RZPD RZPD German Resource Center for Genome Research (no longer accessible) www.rzpd.de (old URL)
SGRP Saccharomyces Genome Resequencing Project, Wellcome Trust Sanger Institute www.sanger.ac.uk/Teams/Team71/durbin/sgrp
SMDB Stanford Microarray Database (moving to Princeton University, as of January 2013) http://smd.stanford.edu
SNGR Wellcome Trust Sanger Institute www.sanger.ac.uk
SYBR Sybaris Project http://www.sybaris-fp7.eu/
TABM EBI tab2mage spreadsheet submission tool (deprecated since January 2012) www.ebi.ac.uk/cgi-bin/microarray/tab2mage.cgi
TIGR The Institute for Genomic Research (now part of the J. Craig Venter Institute) www.tigr.org
TOXM Toxicogenomics experiments  
UCON The Hutchison/MRC Research Centre www.hutchison-mrc.cam.ac.uk
UHNC University Health Network Canada www.uhn.ca
UMCU University Medical Center Utrecht www.umcutrecht.nl/zorg
WMIT Whitehead Institute for Biomedical Research / Massachusetts Institute of Technology www.wi.mit.edu

 

Secondary accession numbers

Some experiments or array designs have a secondary accession number which links to another ArrayExpress experiment/array design, or to an external data source. The secondary accession number is shown in the "Links" section on an experiment's page. For example, for experiment E-GEOD-42281:

[Screenshot: secondary accession numbers in the "Links" section of an experiment page]

For array designs, the secondary accession number can be found in the header of the actual design file (in tab-delimited text format), e.g. for array design A-GEOD-9349:

[Screenshot: secondary accession number in the header of an array design file]

Common reasons for secondary accession numbers

  1. Data re-analysis: if the data provided is a re-analysis of another dataset, the accession number of the original experiment will be the secondary accession number.
  2. NCBI GEO-imported data: experimental data imported from the NCBI Gene Expression Omnibus (GEO) will have the GEO series (prefix "GSE") or data set (prefix "GDS") identifier as the secondary accession number. Similarly, array designs will have the GEO platform (prefix "GPL") accession number. For GEO-imported data, the ArrayExpress primary accession and the GEO secondary accession share the same number, e.g. GEO accession "GSE12345" is associated with ArrayExpress accession "E-GEOD-12345". See also "How data is imported from GEO".
  3. High-throughput sequencing experiments: they will have a link to the European Nucleotide Archive (ENA) or European Genome-phenome Archive (EGA) where the raw data files (e.g. fastq reads) are kept.
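The correspondence between GEO and ArrayExpress accessions described in item 2 can be sketched as follows. The function name `geo_to_arrayexpress` is hypothetical. The GSE rule follows the example given above; the GPL rule (GPL number to "A-GEOD-" number) is an assumption made by analogy and is not stated explicitly on this page. GEO data set ("GDS") identifiers are deliberately left unmapped because no rule for them is given here.

```python
import re
from typing import Optional

_GEO_RE = re.compile(r"(GSE|GPL)(\d+)")

# GSE -> E-GEOD follows the example in the text above.
# GPL -> A-GEOD is an ASSUMPTION by analogy, not a documented rule.
_PREFIXES = {"GSE": "E-GEOD-", "GPL": "A-GEOD-"}


def geo_to_arrayexpress(geo_accession: str) -> Optional[str]:
    """Return the presumed ArrayExpress accession for a GEO series or platform."""
    match = _GEO_RE.fullmatch(geo_accession.strip().upper())
    if match is None:
        return None  # not a GSE/GPL identifier (e.g. a GDS data set)
    return _PREFIXES[match.group(1)] + match.group(2)


print(geo_to_arrayexpress("GSE12345"))
```

A derived accession is only a guess at where the record would be found; whether a given GEO record has actually been imported still has to be checked against ArrayExpress itself.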