Expression Atlas data in R
The expression data and metadata for experiments in Atlas are
available as pre-packaged R objects.
There are two ways to access this data:
- Using the ExpressionAtlas package, available from Bioconductor. This package lets you search Atlas and download the data you need from within an R session. See the package vignette for more information.
- By going to each experiment page in Expression Atlas and downloading the file containing the R object, which you can then load into R (see below).
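As a minimal sketch of the first route, the helper below wraps the getAtlasExperiment() function from the ExpressionAtlas package. The helper name fetchAtlasSummary is hypothetical, and the sketch assumes the package is installed and that a network connection is available when it is called.

```r
# Hypothetical helper: fetch one experiment summary via the ExpressionAtlas package.
# Assumes ExpressionAtlas is installed and the machine has network access.
fetchAtlasSummary <- function( accession ) {
    if( !requireNamespace( "ExpressionAtlas", quietly = TRUE ) ) {
        stop( "Please install the ExpressionAtlas package from Bioconductor first." )
    }
    # getAtlasExperiment() downloads the summary object for a single accession.
    ExpressionAtlas::getAtlasExperiment( accession )
}

# Example call (commented out because it needs network access):
# experimentSummary <- fetchAtlasSummary( "E-GEOD-38400" )
```

The package vignette remains the authoritative reference for searching Atlas and for downloading several experiments at once.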
If you don't want to use the ExpressionAtlas
package to access Atlas R data, you can download the file containing the R
object for an experiment by clicking the R button
at the top right of any differential experiment page.
Start an R session on your computer. For details on how to get and use R,
please see the R documentation.
To use the object you will need to install a few packages from Bioconductor: S4Vectors, IRanges, GenomicRanges and SummarizedExperiment.
If you have not already installed these packages, do this
by running the following two commands:
install.packages( "BiocManager" )
BiocManager::install( c( "S4Vectors", "IRanges", "GenomicRanges", "SummarizedExperiment" ) )
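If you are unsure whether the packages are already present, a quick check along the following lines may help. This is only a sketch using base R; the variable names requiredPackages and missingPackages are illustrative.

```r
# Packages named in the instructions above.
requiredPackages <- c( "S4Vectors", "IRanges", "GenomicRanges", "SummarizedExperiment" )

# installed.packages() returns a matrix with one row per installed package,
# so its row names are the names of everything currently installed.
missingPackages <- setdiff( requiredPackages, rownames( installed.packages() ) )

if( length( missingPackages ) == 0 ) {
    message( "All required packages are already installed." )
} else {
    message( "Still to install: ", paste( missingPackages, collapse = ", " ) )
}
```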
For more details about using these packages, please refer to Bioconductor.
Load the object you downloaded into your R session, e.g.:
load( "/path/to/E-GEOD-38400-atlasExperimentSummary.Rdata" )
This creates an object called experimentSummary. The experimentSummary
object is a SimpleList object (see the S4Vectors package).
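To see how load() brings a named object into the session, the sketch below saves a stand-in object to a temporary file and loads it back. The placeholder list is an assumption made purely for illustration; a real Atlas file contains a SimpleList rather than a plain list.

```r
# Stand-in for a downloaded file: save a plain list under the same object name.
experimentSummary <- list( rnaseq = "placeholder for the real object" )
exampleFile <- tempfile( fileext = ".Rdata" )
save( experimentSummary, file = exampleFile )
rm( experimentSummary )

# load() restores the saved objects and returns their names as a character vector.
loadedNames <- load( exampleFile )
print( loadedNames )

# The element names tell you which data types or array designs are present.
print( names( experimentSummary ) )
```

With a real download, names( experimentSummary ) is a quick way to find out which elements the file contains before you extract one.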
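Each element is one of three Bioconductor objects, depending on the technology used; these include RangedSummarizedExperiment for RNA-seq data and ExpressionSet for one-colour microarray data.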
Data from an RNA-seq experiment is contained in a single RangedSummarizedExperiment object in the SimpleList you have loaded.
The RangedSummarizedExperiment object is stored under the name "rnaseq", so you
can assign it to a new variable like this:
rSumExp <- experimentSummary$rnaseq
The RangedSummarizedExperiment object contains the following:
- Matrix of raw counts (not normalized), in the
assays slot, in an element named counts.
- Sample annotations, in the colData slot.
- Brief outline of methods, from QC of FASTQ files to production of
raw counts, in the metadata slot.
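The three components listed above can be pulled out with the standard accessor functions. The sketch below gathers them in one place; the helper name describeRnaseq is hypothetical, and it assumes the SummarizedExperiment and S4Vectors packages are installed.

```r
# Hypothetical helper: collect the main parts of a RangedSummarizedExperiment.
# Assumes SummarizedExperiment and S4Vectors are installed.
describeRnaseq <- function( rSumExp ) {
    list(
        # Raw (not normalized) count matrix: one row per gene, one column per sample.
        counts = SummarizedExperiment::assays( rSumExp )$counts,
        # Sample annotations: one row per sample.
        samples = SummarizedExperiment::colData( rSumExp ),
        # Outline of the methods used to produce the counts.
        methods = S4Vectors::metadata( rSumExp )
    )
}

# Example use, with rSumExp assigned from the experiment summary:
# parts <- describeRnaseq( rSumExp )
# dim( parts$counts )
# head( parts$samples )
```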
For more information on how to use a RangedSummarizedExperiment object, please see
the documentation from Bioconductor.
Data from a one-colour (single-channel) microarray experiment is stored
in one or more ExpressionSet objects in the SimpleList
you have loaded. There is one ExpressionSet per array design used in
the experiment, and the ExpressionSets are indexed by the ArrayExpress accession of the array design.
You can access each ExpressionSet via its array design accession, e.g.:
expressionSet <- experimentSummary[[ "A-AFFY-18" ]]
Each ExpressionSet object contains the following:
- Matrix of normalized intensity values, in the
assayData, accessed via:
exprs( expressionSet )
- Sample annotations, in the phenoData, accessed via:
pData( expressionSet )
- Brief outline of normalization method applied, in the
experimentData slot, accessed via:
experimentData( expressionSet )
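When an experiment uses more than one array design, it can be handy to summarise every ExpressionSet at once. The sketch below is one way to do that; the helper name summariseArrayDesigns is hypothetical, and it assumes that every element of the list is an ExpressionSet and that the Biobase package is installed.

```r
# Hypothetical helper: one row per array design, with probe and sample counts.
# Assumes every element is an ExpressionSet and that Biobase is installed.
summariseArrayDesigns <- function( experimentSummary ) {
    designs <- names( experimentSummary )
    data.frame(
        arrayDesign = designs,
        # Rows of the intensity matrix correspond to probes (or probe sets).
        probes = vapply( designs, function( d ) {
            nrow( Biobase::exprs( experimentSummary[[ d ]] ) )
        }, integer( 1 ) ),
        # Rows of the sample annotation table correspond to samples.
        samples = vapply( designs, function( d ) {
            nrow( Biobase::pData( experimentSummary[[ d ]] ) )
        }, integer( 1 ) ),
        row.names = NULL
    )
}

# Example use:
# summariseArrayDesigns( experimentSummary )
```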
For more information on how to use an ExpressionSet object, please see the
documentation from Bioconductor.