How are data stored in ArrayExpress?

In ArrayExpress, the information describing an experiment and the associated result files are stored in a standard spreadsheet-based format called MAGE-TAB. MAGE-TAB (7,8) is a simple tab-delimited format for sharing functional genomics data in accordance with the MIAME guidelines. It enables researchers without bioinformatics experience or support to manage, exchange and submit data easily.
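Because MAGE-TAB files are plain tab-delimited text, they can be read with standard tools as well as with spreadsheet software. The sketch below is a minimal illustration in Python, assuming an IDF-style layout with one field per row and the field name in the first column. The sample content and the helper name `parse_idf` are hypothetical and are not taken from a real submission.

```python
import csv
import io

# Hypothetical IDF-style content (assumption): one field per row,
# field name in the first column, values in the following columns.
IDF_TEXT = (
    "Investigation Title\tExample hypoxia experiment\n"
    "Person Last Name\tDoe\tRoe\n"
    "Experimental Factor Name\tgrowth condition\n"
    "SDRF File\texample.sdrf.txt\n"
)


def parse_idf(text):
    """Return a dict mapping each field name to its list of non-empty values."""
    fields = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        name = row[0].strip()
        values = [cell.strip() for cell in row[1:] if cell.strip()]
        fields.setdefault(name, []).extend(values)
    return fields


idf = parse_idf(IDF_TEXT)
print(idf["Person Last Name"])  # ['Doe', 'Roe']
```

A real IDF file contains many more fields than this; the point is only that each row can be handled as a field name followed by its values.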

For each experiment, several MAGE-TAB files are required to capture information about the experiment and its results. These files are:

| File name | Description | For microarray data | For HTS data |
| --- | --- | --- | --- |
| Investigation Description Format (IDF) file | Contains an overview of the experiment, including the authors' contact details, publication information, protocols and the experimental variables. | Generated at submission | Generated at submission |
| Sample and Data Relationship Format (SDRF) file | Describes all the sample characteristics (for example, genotype) and any treatment that the sample has been subjected to (for example, growth in low oxygen conditions), and links each sample to its corresponding data file. | Generated at submission | Generated at submission |
| Array Design Format (ADF) file | Describes how a microarray was manufactured and what was printed/synthesized at each position on the array. For commercially available microarrays, this file is provided by the array manufacturer. If the array design you used has already been described in the Archive, you do not need to submit it. | Submitted through MIAMExpress, if not already present in the Archive | N/A |
| Data files and data matrices - raw data | Raw data are the data collected at the source, which have not been subjected to any manipulation. Raw data can be in their original format (for example, Affymetrix .CEL files) or re-formatted as data matrices. | Generated by the scanner | Generated by the sequencing machine |
| Data files and data matrices - processed data | Processed data are data that have already been manipulated, and are often defined as the data on which the conclusions are based. Processed data are always stored as data matrices. | Generated by the author | Generated by the author |
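The way an SDRF file links each sample to its data file can be sketched as follows. This is a minimal illustration in Python, assuming a tab-delimited layout with a header row and one row per sample. The column headings, sample names, file names and the helper `files_by_sample` are illustrative assumptions, not content from a real submission.

```python
import csv
import io

# Hypothetical SDRF-style content (assumption): a header row, then one row per sample.
SDRF_TEXT = (
    "Source Name\tCharacteristics[genotype]\t"
    "Factor Value[growth condition]\tArray Data File\n"
    "sample 1\twild type\tnormal oxygen\tsample1.CEL\n"
    "sample 2\twild type\tlow oxygen\tsample2.CEL\n"
)


def files_by_sample(text, sample_column="Source Name", file_column="Array Data File"):
    """Return a dict mapping each sample name to the list of data files linked to it."""
    links = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        links.setdefault(row[sample_column], []).append(row[file_column])
    return links


links = files_by_sample(SDRF_TEXT)
for sample, files in links.items():
    print(sample, "->", ", ".join(files))
```

Each sample maps to a list because a sample may appear on more than one row, for example when it has several data files.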