ArrayExpress is a public repository of functional genomics data. The data are mainly generated from microarray-based assays, including gene expression, comparative genomic hybridisation (CGH), chromatin immunoprecipitation (ChIP) experiments and tiling arrays. ArrayExpress also accepts functional genomics data generated on high-throughput sequencing (HTS) platforms such as Illumina, SOLiD and 454 (1). The data are organised in a structured and standardised format (2) and are stored according to community guidelines, including the "Minimum Information About a Microarray Experiment" (MIAME, 3) and the "Minimum Information about a high-throughput SeQuencing Experiment" (MINSEQE), as recommended by the Functional Genomics Data Society. The use of guidelines and standards to collect data is of fundamental importance for publishing and disseminating scientific results, and for making them quickly and reliably accessible (4,5).
The data are organised into experiments, which are defined as collections of assays, often related to a scientific publication. The experiments are submitted by individual researchers, or are imported from other functional genomics databases such as the Gene Expression Omnibus (GEO) at the NCBI (6) and microarray databases such as the Stanford MicroArray Database (Figure 1). Each experiment is given a unique accession number when submitted.
For each experiment, the data files are stored together with the information describing the experiment (the "metadata"), such as sample characteristics, protocols and array design. This information is crucial because it allows the results to be interpreted and reanalysed. All data and metadata files can be downloaded. Once downloaded, microarray or HTS data files need to be analysed with dedicated data analysis software, such as Bioconductor, GenePattern or MeV, in order to extract meaningful biological information.
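The organisation described above (an experiment identified by an accession number, made up of assays, and stored together with its metadata) can be pictured as a small data structure. The following Python sketch is purely illustrative: the class names, field names, list of required metadata keys and the accession "EXAMPLE-0001" are hypothetical and do not reflect the actual ArrayExpress schema or accession format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Assay:
    """One assay, for example a single hybridisation or sequencing run."""
    name: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    """A collection of assays plus the metadata needed to interpret them."""
    accession: str   # unique identifier assigned at submission (format assumed)
    source: str      # e.g. "direct submission" or "GEO import"
    assays: List[Assay] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def all_files(self) -> List[str]:
        """Return every data file across all assays, in assay order."""
        return [f for assay in self.assays for f in assay.data_files]

    def missing_metadata(
        self,
        required: Sequence[str] = (
            "sample characteristics",
            "protocols",
            "array design",
        ),
    ) -> List[str]:
        """List required metadata keys that are absent or empty."""
        return [key for key in required if not self.metadata.get(key)]


# Hypothetical example record; the accession is a placeholder, not a real one.
exp = Experiment(
    accession="EXAMPLE-0001",
    source="direct submission",
    assays=[Assay("assay 1", ["a1.txt"]), Assay("assay 2", ["a2.txt"])],
    metadata={
        "sample characteristics": "liver tissue",
        "protocols": "labelling, hybridisation",
    },
)
print(exp.all_files())
print(exp.missing_metadata())
```

Keeping the metadata alongside the data files in one record mirrors the point made above: without sample characteristics, protocols and array design, the downloaded files cannot be reliably interpreted or reanalysed.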
Figure 1. Diagram showing how ArrayExpress is built. Data come from different sources, including individual researchers' submissions and the databases of other scientific organisations, such as GEO.