
About ArrayExpress

ArrayExpress is an international public archive for well-annotated data from array-based platforms, including gene expression, comparative genomic hybridization (CGH), chromatin immunoprecipitation (ChIP) and tiling array experiments, as well as RNA-Seq data generated on high-throughput sequencing platforms such as Solexa, SOLiD and 454.

ArrayExpress has three major goals:

  1. Serve the scientific community as an archive for data supporting publications
  2. Provide easy access to high-quality data in a standard format
  3. Facilitate the sharing of technical platforms, specifically microarray designs and experimental protocols

ArrayExpress has two major components:

  • Gene Expression Atlas - a semantically enriched database of summary statistics derived from meta-analysis. It serves queries for condition-specific gene expression patterns (e.g. genes over-expressed in a particular tissue or disease state) as well as broader exploratory searches for biologically interesting genes or samples. Visit the Atlas at www.ebi.ac.uk/gxa or read more about the project. The Atlas replaces the ArrayExpress Data Warehouse and offers improved GUIs and new functionality.

 

Figure: Diagram of the relationship between the ArrayExpress experiment archive and the Gene Expression Atlas.

 

 

ArrayExpress Experiment Archive content

Approximately 60% of the experiments in the ArrayExpress archive come from individual researchers who submit data through our submission tools MIAMExpress, Tab2MAGE and MAGE-TAB. The remaining experiments are submitted to us from microarray databases and tools at other organizations, such as the Stanford MicroArray Database. Experiment accession numbers have the format E-XXXX-n, where XXXX is a code for the source of the data. A list of all accession number codes and their sources can be found here: ArrayExpress accession codes
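
The E-XXXX-n accession format lends itself to simple validation. The following is a minimal sketch only, assuming that XXXX is exactly four uppercase letters and that n is a positive integer written in decimal digits; neither detail is specified above. The names parse_accession and Accession are hypothetical and are not part of any ArrayExpress tool, and the accession string in the usage note is purely illustrative.

```python
import re
from typing import NamedTuple, Optional

# Assumption: XXXX is four uppercase letters and n is a run of decimal digits.
ACCESSION_RE = re.compile(r"E-([A-Z]{4})-([0-9]+)")


class Accession(NamedTuple):
    """Hypothetical container for the two variable parts of an accession."""
    source_code: str  # the XXXX part, identifying the source of the data
    number: int       # the n part


def parse_accession(text: str) -> Optional[Accession]:
    """Split an accession such as E-XXXX-n into its source code and number.

    Returns None when the text does not follow the assumed pattern.
    """
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        return None
    return Accession(match.group(1), int(match.group(2)))
```

For instance, parse_accession("E-MEXP-31") would yield a source code of "MEXP" and a number of 31, while a string without the leading "E-" would be rejected. Looking up what a given source code means still requires the list of accession codes referred to above.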

Some experiments have also been extracted from the Gene Expression Omnibus (GEO) at the NCBI. More information about how we extract experiment information from GEO can be found here: GEO experiment import.

Data releases

The content of the ArrayExpress experiment archive is updated daily at 6am GMT.

 

Gene Expression Atlas content

The Gene Expression Atlas is built on a subset of experiments from the ArrayExpress experiment archive. The criteria we use for selecting experiments for inclusion in the Atlas are as follows:

  • Experiment must be public
  • Experiment must have 6 or more hybridizations
  • Experimental factors and associated factor values must be provided or can be added by the ArrayExpress curators
  • Adequate sample annotation must be provided
  • Processed data must be provided, or raw data that can be renormalized must be available
  • Array designs relating to the experiment must be re-annotated against Ensembl or UniProt (or have the potential for this to be done)
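
The criteria above can be read as a checklist applied to each experiment. The sketch below illustrates that reading under stated assumptions: the Experiment record and all of its field names are hypothetical and do not reflect the actual ArrayExpress schema or curation pipeline, and judgments such as "adequate sample annotation" are reduced to booleans that a curator would have to set.

```python
from dataclasses import dataclass
from typing import List

MIN_HYBRIDIZATIONS = 6  # threshold taken from the criteria listed above


@dataclass
class Experiment:
    """Hypothetical record; field names are illustrative only."""
    is_public: bool
    hybridizations: int
    factors_available: bool        # provided, or can be added by curators
    samples_annotated: bool        # adequate sample annotation
    has_processed_data: bool
    raw_data_renormalizable: bool
    arrays_reannotatable: bool     # re-annotated, or re-annotation is possible


def unmet_atlas_criteria(exp: Experiment) -> List[str]:
    """Return the criteria an experiment fails; an empty list means it qualifies."""
    checks = [
        (exp.is_public, "experiment must be public"),
        (exp.hybridizations >= MIN_HYBRIDIZATIONS,
         "experiment must have 6 or more hybridizations"),
        (exp.factors_available,
         "experimental factors and factor values are missing"),
        (exp.samples_annotated, "sample annotation is inadequate"),
        # Either processed data or renormalizable raw data is sufficient.
        (exp.has_processed_data or exp.raw_data_renormalizable,
         "neither processed data nor renormalizable raw data is available"),
        (exp.arrays_reannotatable,
         "array designs cannot be re-annotated against Ensembl or UniProt"),
    ]
    return [reason for passed, reason in checks if not passed]
```

Returning the list of failed criteria, rather than a single yes/no answer, makes it easier to see what would need to change for an experiment to be included.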

Array re-annotation

Array designs are re-annotated to ensure the gene annotation is as complete and up-to-date as possible. Arrays for species mapped by Ensembl (www.ensembl.org) are re-annotated bi-monthly after each Ensembl release. Arrays for species not covered by Ensembl are re-annotated via UniProt. The added annotation includes GO terms and IDs, synonyms, InterPro IDs and transcript IDs, among others.

Data releases

We aim to update the content of the Gene Expression Atlas on the 8th of every month.

 

For any further questions, please see our FAQ.
