Differential Expression Atlas help
The Differential Atlas lets you ask questions about which genes are
up- or down-regulated in different experimental conditions. It is
based on a set of highly curated differential expression RNA-Seq and microarray
experiments from ArrayExpress that have been
re-processed using our in-house differential expression statistical
analysis pipeline.
By default, the maximum adjusted p-value is 0.05, and the minimum
log2 fold-change is 1. This means only genes with an adjusted
p-value below 0.05 and log2 fold-change above 1 (or below -1)
are displayed.
You may change the two cutoff values to whatever you like, to
relax constraints or make them stricter. Changing these values will affect the
number of genes displayed. Generally, the higher the chosen adjusted
p-value, and the lower the minimum log2 fold-change, the more
genes will be displayed.
Type any value from 0 to 1 in the Adjusted p-value cutoff
box to use as the maximum false discovery rate (FDR) -adjusted p-value
cutoff. Only genes that show differential expression with an adjusted
p-value at or below this cutoff will be returned.
Type any numerical value in the Log2 fold-change
cutoff box to use as the minimum absolute log2 fold-change.
Only genes that have a larger absolute log2 fold-change will be
displayed. When we say "absolute" value, we mean the value without a "+" or
"-" sign in front. For example, if you choose an absolute value of 2, then
genes with log2 fold-change of greater than +2 or
less than -2 are shown.
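The two cutoffs can be pictured as a simple filter. The following is only a sketch of the rule as described above, not the Atlas implementation: the Result type, the passes_cutoffs function and the gene names are hypothetical, and it assumes the p-value test is "at or below" while the fold-change test is strictly "larger than".

```python
from typing import NamedTuple


class Result(NamedTuple):
    """One differential expression call (hypothetical structure)."""
    gene: str
    adj_p: float    # FDR-adjusted p-value
    log2fc: float   # signed log2 fold-change


def passes_cutoffs(r: Result, max_adj_p: float = 0.05,
                   min_abs_log2fc: float = 1.0) -> bool:
    """True if the call is at or below the p-value cutoff and its
    absolute log2 fold-change is larger than the fold-change cutoff."""
    return r.adj_p <= max_adj_p and abs(r.log2fc) > min_abs_log2fc


# Made-up example values, for illustration only.
results = [
    Result("geneA", 0.01, 2.5),
    Result("geneB", 0.20, 3.0),   # fails the p-value cutoff
    Result("geneC", 0.03, -1.4),  # down-regulated, still passes
    Result("geneD", 0.04, 0.6),   # fails the fold-change cutoff
]
shown = [r.gene for r in results if passes_cutoffs(r)]
print(shown)
```

Raising the p-value cutoff or lowering the fold-change cutoff (for example, min_abs_log2fc=0.5) lets more of these genes through, which matches the behaviour described above.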
You can search with Ensembl gene symbols
(e.g. desat1), identifiers (e.g. FBgn0086687) or biotypes (e.g.
protein_coding), UniProt accessions (e.g. P35542), GO ("pheromone
biosynthetic process") or InterPro
terms (e.g. "Fatty acid desaturase, type 1"). A space-separated
list of gene attributes will bring back genes that match at least one of
the attributes in the query.
Use the radio buttons to the right of the Gene Query box to limit the search
to up-regulated genes, down-regulated genes, or both.
If the "Exact match" box is checked, only genes with annotations that fully
match your query will be returned. For example, to see if the desat1
gene is differentially expressed, check the "Exact match" box, type "desat1" and click "Search". If the
box is not checked, genes with annotations that contain words that fully match
your query will be returned. For example, to find results for genes
with "desaturase" in their annotations (e.g. "stearoyl-CoA 9-desaturase
activity", "Fatty acid/sphingolipid desaturase", …), uncheck the "Exact
match" box, type "desaturase" and click "Search".
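The difference between the two modes can be sketched for a single-word query as follows. This is an illustration only: the matches function is hypothetical, and both the case-insensitive comparison and the way annotations are split into words are assumptions rather than documented behaviour of the search.

```python
import re


def matches(annotation: str, term: str, exact: bool) -> bool:
    """Compare one annotation with a single-word query term.

    exact=True  : the whole annotation must equal the term.
    exact=False : some whole word of the annotation must equal the term.
    Case-insensitivity is an assumption made for this sketch.
    """
    a, t = annotation.lower(), term.lower()
    if exact:
        return a == t
    words = re.findall(r"[a-z0-9]+", a)  # split on spaces, "-", "/", etc.
    return t in words


print(matches("desat1", "desat1", exact=True))
print(matches("stearoyl-CoA 9-desaturase activity", "desaturase", exact=False))
print(matches("stearoyl-CoA 9-desaturase activity", "desaturase", exact=True))
```

In this sketch a partial word such as "desat" does not match "desat1" in either mode, because the non-exact mode still requires whole words to match.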
Many experiments contain more than one comparison. If you do
not select any comparisons, results for all comparisons in the experiment will be
displayed. To see the results for specific comparison(s) select the ones you wish
to see from the Comparison box:
Select one or more comparisons to search with, or start typing to see
suggestions. To see details of the groups of samples used in each comparison,
click the button.
Specific vs. non-specific search
If there is more than one comparison in an experiment, the default
Differential Atlas search reports genes with more "specific" differential
expression (with the Specific option selected by default). What we mean
by this is outlined below.
If no comparisons are selected…
If no comparisons are selected, the Specific search will rank
genes so that genes that are differentially expressed in just one
comparison come first, followed by genes differentially expressed in two
comparisons, then three and so on, reporting genes that are differentially
expressed in all available comparisons at the end of the list of results. Within
each group of N comparisons, genes are ordered by largest average absolute (i.e.
ignoring positive or negative status) log2 fold-change.
Only log2 fold-changes for differential expression in the
direction selected are considered for specificity. For example, if you search
for only up-regulated genes (by clicking the "up" radio button next to
the Gene Query box), only genes with positive log2 fold-changes are
used in the calculations for specificity.
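The ordering described above, for the case where no comparisons are selected, can be sketched like this. It is not the Atlas code: rank_specific and the input layout (each gene mapped to the log2 fold-changes of the comparisons in which it is differentially expressed) are hypothetical, and ties are broken by gene name purely to make the output deterministic.

```python
from statistics import mean


def rank_specific(de_calls, direction="both"):
    """Order genes by number of comparisons (fewest first), then by
    largest average absolute log2 fold-change.

    de_calls maps gene -> list of signed log2 fold-changes, one per
    comparison in which the gene is differentially expressed.
    """
    keyed = []
    for gene, fcs in de_calls.items():
        # Only fold-changes in the selected direction count.
        if direction == "up":
            fcs = [fc for fc in fcs if fc > 0]
        elif direction == "down":
            fcs = [fc for fc in fcs if fc < 0]
        if not fcs:
            continue
        avg_abs = mean(abs(fc) for fc in fcs)
        keyed.append((len(fcs), -avg_abs, gene))
    return [gene for _, _, gene in sorted(keyed)]


# Made-up values for illustration.
de_calls = {
    "geneA": [2.0, 3.0],
    "geneB": [1.5],
    "geneC": [-4.0],
    "geneD": [2.5, -1.2, 1.8],
}
print(rank_specific(de_calls))
print(rank_specific(de_calls, direction="up"))
```

With direction="up", geneC drops out and geneD counts as differentially expressed in two comparisons rather than three, because its negative fold-change is ignored.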
If at least one comparison is selected…
If at least one comparison is selected, the Specific search will
promote genes with larger average absolute log2 fold-change in the
selected comparison(s), and at the same time penalize genes with the same
or larger log2 fold-change in the un-selected comparison(s).
This is done using the "fold difference" between the selected and
un-selected absolute log2 fold-changes:
- Fold difference is calculated by dividing the average
absolute log2 fold-change of the selected comparison(s) by the
largest absolute log2 fold-change of the un-selected
comparison(s).
- If none of the un-selected comparisons' absolute log2
fold-changes are at or above the selected log2 fold-change
cutoff, the average of the selected comparisons' absolute log2 fold-changes
is divided by the cutoff instead.
Genes with the greatest fold difference in absolute log2 fold-changes
between selected and un-selected comparisons will be pushed towards the top
of the list. Genes that have a larger absolute log2 fold-change in the
un-selected comparisons are not shown.
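As a worked illustration of the fold difference, here is a minimal sketch under stated assumptions. The fold_difference function is hypothetical, the cutoff is assumed to be positive, and "larger in the un-selected comparisons" is read here as the largest un-selected absolute fold-change exceeding the selected average; the Atlas may apply the rule differently.

```python
from statistics import mean


def fold_difference(selected, unselected, cutoff=1.0):
    """Return the fold difference for one gene, or None if the gene
    would be hidden.

    selected / unselected: signed log2 fold-changes of the gene in the
    selected and un-selected comparisons. cutoff: the log2 fold-change
    cutoff chosen in the search form (assumed > 0).
    """
    if cutoff <= 0:
        raise ValueError("this sketch assumes a positive cutoff")
    avg_selected = mean(abs(fc) for fc in selected)
    max_unselected = max((abs(fc) for fc in unselected), default=0.0)
    if max_unselected > avg_selected:
        return None  # larger change elsewhere: not shown (our reading)
    # Fall back to the cutoff when no un-selected value reaches it.
    denominator = max_unselected if max_unselected >= cutoff else cutoff
    return avg_selected / denominator


print(fold_difference([4.0, -2.0], [1.5]))  # average 3.0 divided by 1.5
print(fold_difference([3.0], [0.4]))        # 0.4 is below the cutoff, so divide by 1.0
print(fold_difference([2.0], [-5.0]))       # hidden
```

Genes would then be ordered by this value, largest first.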
If the Specific option is not selected, the Differential Atlas search
will ignore specificity of differential expression when ordering the
results. In particular:
- If no comparisons are selected, genes with largest absolute
log2 fold-changes in all comparisons will be reported first.
- If at least one comparison is selected, genes with larger
absolute log2 fold-changes across the selected comparisons are reported first.
Log2 fold-changes in the un-selected comparisons are ignored in
the ordering of results.
Results for your query are shown in a heatmap table. The heatmap rows are
labelled with gene symbols (and design elements in the case of microarray
data). The columns are labelled with comparison descriptions. Mouse over the
column headings to see the full description.
The heatmap ranks genes by absolute (i.e. ignoring up- or down-regulated
status) log2 fold-change, with the largest at the top. If some genes
have identical log2 fold-changes, these are then ordered by adjusted
p-value, with the gene with the lowest adjusted p-value first.
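This two-level ordering amounts to sorting on a compound key. The sketch below uses made-up rows of (gene, log2 fold-change, adjusted p-value); it only illustrates the rule and is not taken from the Atlas code.

```python
# Each row: (gene, signed log2 fold-change, adjusted p-value). Made-up values.
rows = [
    ("geneA", 2.0, 0.01),
    ("geneB", -3.5, 0.04),
    ("geneC", 2.0, 0.001),
]

# Largest absolute log2 fold-change first; ties broken by lowest adjusted p-value.
ordered = sorted(rows, key=lambda r: (-abs(r[1]), r[2]))
print([gene for gene, _, _ in ordered])
```

geneB comes first despite being down-regulated, because only the absolute value is used; geneC precedes geneA because the two tie on fold-change and geneC has the lower adjusted p-value.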
Greater colour intensity means larger absolute log2 fold-change, as
illustrated by the gradient bars displayed above the
heatmap. Blue boxes indicate the gene is down-regulated, while red ones
mean up-regulated. Mouse over a filled box in the heatmap to see the
adjusted p-value and log2 fold-change (and t-statistic
for microarray data). To see the log2 fold-changes for all genes,
click the button. This
also reveals the maximum and minimum log2 fold-changes in the
experiment either side of the gradient bars. The gradient shows intensities
corresponding to up to the top 50 differentially expressed genes
currently displayed (rather than all genes returned by the query).
If you mouse over a gene name in the heatmap, synonyms, Gene Ontology terms, and InterPro terms are displayed.
Any terms that match your Gene Query term(s) are highlighted in yellow here.
Click the gene name to see more information about that gene, including other
Expression Atlas experiments in which it was found.
Two types of plots can be visualised by clicking one of the buttons in the header of the heatmap table. This brings up a
menu like that shown below. Different numbers of plots may be available for
An MA plot is generated for each comparison in every experiment. To see
it, select the item from the menu. This plot displays
average expression level for each gene (normalized microarray intensity level or
RNA-Seq log2 counts-per-million) on the x-axis against
log2 fold-change on the y-axis. This plot gives an overview
of the relationship between expression level and fold-change, and can be used
to assess any systematic biases in the data. Genes that were called as
differentially expressed at FDR < 0.05 are shown in red in the plot.
Gene set overlap analysis is performed using the Piano
package from Bioconductor. For each
comparison, overlap between terms from GO, InterPro, and Reactome and the set of
differentially expressed genes is tested using Fisher's exact
test with multiple testing correction (FDR < 0.1). Gene set overlap plots are available
only when statistically significant overlap of terms was detected. This means that for some experiments, the menu will not display plots for all three of the
aforementioned resources. To see a plot, select one of the following options from the menu:
- to see GO term overlap,
- to see InterPro term overlap,
- to see Reactome pathway term overlap.
Doing so will bring up a plot showing up to 10 of the top overlapping terms (nodes) from a list sorted by the
effect size (i.e. the number of observed genes divided by the number of expected genes annotated with a given term
within the differentially expressed set of genes).
The terms are linked by edges representing genes shared between them: the more genes shared between two terms, the thicker the edge.
The size of each node represents the proportion of
differentially expressed genes annotated with each term. Please see the Piano
documentation for more details.
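To make the effect size concrete, here is a small sketch in plain Python. It does not call Piano. The overlap_stats function is hypothetical, the expected count is assumed to be (differentially expressed genes x genes annotated with the term) / all genes, and the p-value shown is a one-sided Fisher's exact test (hypergeometric upper tail) without the multiple testing correction. Whether Piano uses exactly these definitions should be checked in its documentation.

```python
from math import comb


def overlap_stats(total_genes, term_genes, de_genes, overlap):
    """Effect size and one-sided Fisher's exact p-value for one term.

    total_genes: all genes considered
    term_genes : genes annotated with the term
    de_genes   : differentially expressed genes
    overlap    : differentially expressed genes annotated with the term
    """
    expected = de_genes * term_genes / total_genes
    effect_size = overlap / expected
    # Probability of seeing `overlap` or more annotated genes by chance.
    max_k = min(term_genes, de_genes)
    favourable = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, de_genes - k)
        for k in range(overlap, max_k + 1)
    )
    p_value = favourable / comb(total_genes, de_genes)
    return effect_size, p_value


# Made-up numbers: 1000 genes, 50 with the term, 100 differentially
# expressed, 15 of which carry the term (5 would be expected by chance).
effect, p = overlap_stats(1000, 50, 100, 15)
print(effect, p)
```

Here 15 observed against 5 expected gives an effect size of 3.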
If you select a gene name and a comparison heading from the heatmap, the
Ensembl Genome Browser Open button, to the left of the heatmap
table, will become clickable.
Clicking the Open button will take you to the Ensembl or Ensembl Genomes browser, which will
display the log2 fold-change for the comparison you selected in the
context of the genomic location of the gene you selected. For more details
about how to use the genome browsers, please see the relevant documentation for
Ensembl or Ensembl Genomes.
The top 50 differentially expressed genes resulting from your search
are displayed on the page. Click on the button to
download the full results of your query in tab-delimited format
with no ordering.
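Because the downloaded file is unordered, you may want to sort it yourself after loading. The sketch below reads tab-delimited text with Python's csv module. The column names (gene, adj_p_value, log2_fold_change) and the sample rows are hypothetical placeholders, so check the header of your actual download and adjust them.

```python
import csv
import io


def read_results(handle):
    """Parse tab-delimited results into a list of dicts.

    The column names used here are assumptions for illustration only.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "gene": row["gene"],
            "adj_p": float(row["adj_p_value"]),
            "log2fc": float(row["log2_fold_change"]),
        }
        for row in reader
    ]


# Stand-in for an opened download, e.g. open(path, newline="").
sample = io.StringIO(
    "gene\tadj_p_value\tlog2_fold_change\n"
    "geneA\t0.01\t1.2\n"
    "geneB\t0.002\t-3.4\n"
)
rows = read_results(sample)
rows.sort(key=lambda r: -abs(r["log2fc"]))  # largest absolute change first
print([r["gene"] for r in rows])
```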
You can download RNA-seq raw
counts (click the button), normalized microarray intensity data (click the button), and all statistical
analytics results for all comparisons in the experiment (click the button). All data is
provided as tab-delimited text files. You can also download data for each
experiment ready to load into R by
clicking the button on any experiment page. See our help page about Expression Atlas data in R
for more details.
Click the button to see the
results of quality assessment for the experiment data files.
For microarray experiments, this report is
generated by the arrayQualityMetrics
package from Bioconductor in R. Please see the arrayQualityMetrics documentation
for more details about the methods used. Briefly, outlier arrays are detected
using distance measures, box plots, and MA plots. Any array that is found to be
an outlier by all three of these methods is excluded from further analysis.
For RNA-seq experiments, the QC report is generated by the iRAP pipeline. Please see the iRAP
documentation for more details on the methods used.
The Analysis Methods page (e.g. for D.
melanogaster CDK8 and CycC mutants or Arabidopsis
lncRNA-mediated transcriptional silencing mutants) lists the data
analysis methods applied to the raw experimental data (FASTQ files
or microarray intensity data), to obtain the differential expression statistics
shown in the Differential Atlas. You can access it by clicking the button.
The Experiment Design page (e.g. for D.
melanogaster CDK8 and CycC mutants) is accessible via the button. Here you can see RNA-Seq processing run accessions
(from ENA) or microarray assay
accessions, along with their corresponding biological sample characteristics
and experimental variables. If you select a comparison from the "Comparison"
dropdown menu, the runs/assays used in that comparison will be highlighted in the
table. Runs/assays not used in the selected comparison are not highlighted.
Runs/assays assigned to the "Reference" group are highlighted in one colour, and
those assigned to the "Test" group are highlighted in another. The up- or
down-regulated expression calls shown in the heatmap are always from the
perspective of the "Test" group.
You can sort the experiment design table by clicking on a column. Click the
column again to reverse the sort order. To then sort by another column while
retaining the ordering of the first column, hold down the shift key and
click another column. The first selected column forms the primary sort order
and the shift+clicked columns form secondary, tertiary, etc. sort orders
in turn. If you shift+click a column to sort by it and then wish to undo this,
just shift+click it again until the column is unsorted.
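The primary/secondary behaviour is the same idea as a multi-key sort. One way to picture it, assuming nothing about how the page itself implements sorting, is to apply stable sorts from the last-chosen column back to the first. The multi_sort function, column names and row values below are hypothetical.

```python
from operator import itemgetter


def multi_sort(rows, columns):
    """Sort rows by several columns.

    columns: list of (column_name, descending) pairs, primary column first.
    Python's sort is stable, so sorting by the least significant column
    first and the primary column last gives the combined ordering.
    """
    result = list(rows)
    for name, descending in reversed(columns):
        result.sort(key=itemgetter(name), reverse=descending)
    return result


# Made-up experiment design rows.
rows = [
    {"run": "run3", "genotype": "wild type"},
    {"run": "run1", "genotype": "mutant"},
    {"run": "run2", "genotype": "wild type"},
    {"run": "run4", "genotype": "mutant"},
]

# Primary: genotype ascending (first click);
# secondary: run descending (shift+click twice).
table = multi_sort(rows, [("genotype", False), ("run", True)])
print([r["run"] for r in table])
```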
The experiment design table is also searchable: entering a keyword
automatically selects the subsection of the table that matches the keyword.
Click on the button on the
right-hand side to download the full results of the current query (with
no ordering applied) in tab-delimited format.
We value your feedback on what Expression Atlas is doing right, what does
not work or what could be achieved more intuitively. We would also be grateful
for any feedback on the analysis methods we have adopted: we are
passionate not only about intuitive data presentation and the highest level
of experimental metadata curation, but also about the quality and biological
validity of the data presented in the Differential Atlas.
To send us your feedback, please fill in the form accessed by clicking on
the Feedback link in the top-right of the page.