
Details of construction of the Gene Expression Atlas

This page gives an overview of how the Gene Expression Atlas is constructed. Further details will be given in a forthcoming publication.

  1. The ArrayExpress Archive provides the original source of microarray data.
  2. The ArrayExpress Production Team (curators) select experiments from the archive to populate the Gene Expression Atlas, improving sample annotation by use of ontologies and performing gene re-annotation based on the latest genome builds in the process.
  3. Every experiment is treated individually as follows:
    1. Data are taken as normalized by the original submitters/authors.
    2. For every experimental variable ("factor"), a set of gene-wise linear models is constructed, the coefficients are moderated, and one-way multiple comparisons with the mean (MCM) contrasts are computed. [1]
    3. Post-hoc tests are used to compute and identify contrasts of interest with respective t-statistics and globally adjusted p-values.
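
The contrast step above can be illustrated with a minimal sketch for a single gene and a single factor. It is not the Atlas pipeline itself, which relies on limma [1]; it assumes a plain one-way model with a pooled residual variance, compares each condition with the unweighted mean of all condition means, and omits the moderation of coefficients entirely. The function name mcm_t_statistics and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean


def mcm_t_statistics(groups):
    """Unmoderated t-statistics for 'condition mean minus mean of all
    condition means', for one gene.

    groups: dict mapping condition name -> list of expression values.
    Assumption: ordinary pooled variance; no empirical Bayes moderation.
    """
    k = len(groups)
    means = {g: mean(values) for g, values in groups.items()}
    overall = mean(means.values())  # unweighted mean of the condition means
    n_total = sum(len(values) for values in groups.values())

    # Pooled residual variance of the one-way linear model.
    sse = sum((x - means[g]) ** 2 for g, values in groups.items() for x in values)
    s2 = sse / (n_total - k)

    t_stats = {}
    for g, values in groups.items():
        # Variance factor of the contrast m_g - (1/k) * sum_j m_j.
        others = sum(1 / len(w) for h, w in groups.items() if h != g)
        var_factor = (1 - 1 / k) ** 2 / len(values) + others / k ** 2
        t_stats[g] = (means[g] - overall) / sqrt(s2 * var_factor)
    return t_stats


# Hypothetical normalized expression values for one gene, factor "organism part".
example = {
    "liver": [8.0, 8.2, 7.8],
    "kidney": [5.0, 5.2, 4.8],
    "heart": [5.1, 4.9, 5.0],
}
t_by_condition = mcm_t_statistics(example)
for condition, t in t_by_condition.items():
    print(condition, round(t, 2))
```

A positive statistic means the gene is expressed above the mean across conditions in that condition, a negative one below it. With moderated statistics, as in limma, the variance estimate would additionally borrow information across genes.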

Thus, for each experiment-condition pair and each gene in the experiment, we have a direction and a p-value indicating the strength of differential expression, in contrast to the groups defined by the experimental variable to which the condition belongs.
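
As an illustration of the resulting per-gene, per-condition records, the following sketch turns a list of t-statistics and raw p-values into a direction and a globally adjusted p-value. The text does not name the adjustment method; the Benjamini-Hochberg procedure is assumed here purely for illustration, and the gene names, conditions and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def summarize(results):
    """results: list of (gene, condition, t_statistic, p_value) tuples.

    Returns {(gene, condition): (direction, adjusted_p)}, adjusting over
    all genes and conditions together (the assumed meaning of "globally").
    """
    adjusted = benjamini_hochberg([p for _, _, _, p in results])
    return {
        (gene, condition): ("up" if t > 0 else "down", p_adj)
        for (gene, condition, t, _), p_adj in zip(results, adjusted)
    }


# Hypothetical statistics for two genes and two conditions of one experiment.
results = [
    ("geneA", "liver", 6.1, 0.01),
    ("geneA", "kidney", -2.4, 0.04),
    ("geneB", "liver", -2.9, 0.03),
    ("geneB", "kidney", 7.5, 0.005),
]
summary = summarize(results)
for key, (direction, p_adj) in summary.items():
    print(key, direction, round(p_adj, 3))
```

Each entry of the dictionary corresponds to one gene in one experiment-condition pair, which is the unit described above.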

[1] The statistical models and computations are performed primarily with the aid of the Bioconductor package limma: Smyth, G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor, R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds.), Springer, New York, pages 397-420.

