How to search for public experiments

1. Quick search
      1.1 Search by accession or keyword
      1.2 Search term expansion by Experimental Factor Ontology (EFO)
      1.3 Filtering results
2. Advanced search
      2.1 Combining search terms with "AND", "OR", "NOT"
      2.2 Restrict search by specifying the search space
      2.3 Filtering results by counts

1. Quick search

1.1 Search by accession or keyword

Enter an experiment accession number (e.g. E-MEXP-568) or keyword (e.g. RNAi, "breast cancer", "p53 knockout") in the query box at the top of the ArrayExpress Experiments page.

  • Put quotes around multiple keywords to find experiments where the words appear next to each other, e.g. "breast cancer".
  • Entering multiple words without quotes retrieves experiments that contain all of the keywords, though not necessarily next to each other, e.g. mouse leukemia.

ArrayExpress experiment search box

You can search for several accession numbers at once by separating them with a space, a comma, a semi-colon or a tab character:
ArrayExpress experiment search box with multi accession example
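As a rough illustration of the separator rule above, the sketch below splits and joins accession lists using the four separators named in the text. The function names `split_accessions` and `join_accessions` are made up for this example and are not part of ArrayExpress.

```python
import re


def split_accessions(text):
    """Split a string of accessions on spaces, commas, semi-colons or tabs.

    Hypothetical helper: it only mirrors the separator rule described in
    the text and does not check that each piece is a real accession.
    """
    parts = re.split(r"[ ,;\t]+", text.strip())
    # Drop empty pieces left by leading or trailing separators.
    return [part for part in parts if part]


def join_accessions(accessions, separator=" "):
    """Join accessions into one query string using an accepted separator."""
    if separator not in (" ", ",", ";", "\t"):
        raise ValueError("separator must be a space, comma, semi-colon or tab")
    return separator.join(accessions)
```

For instance, `join_accessions(["E-MEXP-568", "E-MTAB-1234"], ",")` produces a string that could be pasted into the query box.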



1.2 Search term expansion by Experimental Factor Ontology (EFO)

ArrayExpress search uses the Experimental Factor Ontology (EFO) to extend your query to synonyms (e.g. "cerebral cortex" and "adult brain cortex") and EFO child-terms (e.g. rib and vertebra as child-terms of bone). Where possible, term suggestions appear as soon as you type the first few letters. EFO terms are marked by "EFO" on the right-hand side. You can reveal some child-terms by clicking the "+" sign and then select a more specific term for your search:

EFO expansion with hea

EFO expansion with rna-s
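The idea of query expansion can be pictured with a toy lookup. The two dictionaries below are hypothetical stand-ins that contain only the examples mentioned on this page; they are not the real EFO, and `expand` is an illustrative function, not ArrayExpress code.

```python
# Toy stand-ins for the ontology, holding only the examples from this page.
SYNONYMS = {"cerebral cortex": ["adult brain cortex"]}
CHILDREN = {"bone": ["rib", "vertebra"]}


def expand(term):
    """Return the term followed by its synonyms and, recursively, its child-terms."""
    term = term.lower()
    expanded = [term]
    expanded += SYNONYMS.get(term, [])
    for child in CHILDREN.get(term, []):
        expanded += expand(child)
    return expanded
```

Under these assumptions, a query for "bone" would also match experiments annotated with "rib" or "vertebra", while a term with no entry is searched as typed.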

In the search results, matched search terms are highlighted depending on the nature of the match. For example, when using search term "leukemia":

Search results highlighting



1.3 Filtering results

Filtering options image

Experiments can be filtered using the drop down menus at the top of the results table. They can be filtered by:

  • organism
  • array design used
  • molecule (DNA, RNA, amplicon, metabolite, protein)
  • technology (array, high-throughput sequencing, mass spectrometry)
  • ArrayExpress data only - experiments submitted directly to ArrayExpress, not imported from the NCBI Gene Expression Omnibus (GEO). Use the 'ArrayExpress data only' checkbox to activate this filter. For more information about the data we import from GEO see the GEO data help page.

After selecting a filter, click the 'Filter' button on the right-hand side to apply it. To remove a filter, re-select the top option in the list and click the 'Filter' button again to requery ArrayExpress.



2. Advanced search

2.1 Combining search terms with "AND", "OR", "NOT"

Enter two or more keywords in the search box with the operators AND, OR or NOT. These operators must be entered in UPPERCASE. AND is the default in searches, so for example, a search for prostate breast will return hits with a match to 'prostate' AND 'breast'.

Search terms of more than one word must be entered inside quotes, otherwise only the first word will be searched for. For example, transcription AND Rattus norvegicus is effectively a search for "transcription" AND "Rattus"; to search for the full species name, enter transcription AND "Rattus norvegicus".

Note: If a field is not specified (see section 2.2 below), the term is searched for in every experiment field, including the experiment description, sample annotation, publication title, submitter's email address, protocol description, etc.

Here are some use cases:

No. | Search terms | How the terms are interpreted
1 | heart brain | Search for experiments which mention both "heart" and "brain". The "AND" relationship between search terms is implicit.
2 | heart, brain | The same as (1).
3 | heart AND brain | The explicit way of writing the same query as in (1) and (2).
4 | "heart brain" | Search for experiments which mention "heart brain" (two terms adjacent to each other).
5 | "heart and brain" | Search for experiments which mention "heart and brain" (three words strung together as such).
6 | heart OR brain | Search for experiments which mention either "heart" or "brain".
7 | heart NOT brain | Search for experiments which mention "heart" but not "brain".
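The quoting and operator rules in this section can be sketched as a small query builder. The helper names `quote_term` and `combine` are invented for this example; the sketch only assembles a query string and does not contact ArrayExpress.

```python
def quote_term(term):
    """Wrap a term in double quotes if it contains whitespace.

    Multi-word phrases must be quoted, otherwise only the first word
    is searched for. Already-quoted terms are returned unchanged.
    """
    term = term.strip()
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return term
    if any(ch.isspace() for ch in term):
        return f'"{term}"'
    return term


def combine(terms, operator="AND"):
    """Join terms with AND, OR or NOT, written in uppercase as required."""
    op = operator.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR or NOT")
    return f" {op} ".join(quote_term(term) for term in terms)
```

With this sketch, `combine(["transcription", "Rattus norvegicus"])` gives `transcription AND "Rattus norvegicus"`, which keeps the species name together as a phrase.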



2.2 Restrict search by specifying the search space

You can limit the search space to a certain "field" by writing the query in the format fieldname:value. A "field" can be, for example, the organism under study, the experimental factor or the assay technology used. Restricting the search space is useful when, for example, you want to find experiments performed on human samples. Writing organism:"Homo sapiens" avoids matching experiments which merely mention "Homo sapiens" or "human" in the experiment description but were performed with samples from another organism.

The search box detects a "fieldname:" prefix as you type and offers search hints.



Again, phrases of more than one word must be entered in quotes, otherwise only the first word will be searched for. Experimental Factor Ontology (EFO) expansion of search terms also applies where possible.

The fields that can be searched are:

Field name | Search scope | Example use case
accession | Experiment primary or secondary accession | accession:E-MTAB-1234
array | Array design accession or name | array:A-AFFY-33
ev (or ef) | Experimental variable (or factor): the name of the main variable under study in an experiment. E.g. if the variable is "sex" in a human study, the researchers would be comparing male and female samples, and "sex" is not merely an attribute the samples happen to have. Has EFO expansion. | ev:genotype
evv (or efv) | The value of an experimental variable (or factor). E.g. the values for the "genotype" factor can be "wild type", "p53-/-". Has EFO expansion. | evv:"wild type"
expdesign | Experiment design type, related to the questions being addressed by the study, e.g. "time series design", "stimulus or stress design", "genetic modification design". Has EFO expansion. | expdesign:"time series"
exptype | Experiment type, related to the assay technology used. See the full list of experiment types in ArrayExpress. Has EFO expansion. | exptype:"RNA-seq of coding RNA"
gxa | Presence/absence of an ArrayExpress experiment in the Expression Atlas. Use the values "true" and "false" respectively. | gxa:true
pmid | PubMed identifier for a publication. | pmid:16553887
sa | Sample attribute values. Has EFO expansion. | sa:fibroblast
sac | Sample attribute category. Find experiments that have a specific sample attribute defined, e.g. "age", "strain". Has EFO expansion. | sac:age
organism | Species of the samples. Can use the common name (e.g. "mouse") or the binomial (Latin) name (e.g. "Mus musculus"). Has EFO expansion. | organism:"homo sapiens"
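A fielded query can be assembled along the following lines. This is a minimal sketch: `field_query` and `KNOWN_FIELDS` are names chosen for this example, the field list is copied from the table above, and nothing here queries ArrayExpress itself.

```python
# Field names copied from the table above.
KNOWN_FIELDS = {
    "accession", "array", "ev", "ef", "evv", "efv", "expdesign",
    "exptype", "gxa", "pmid", "sa", "sac", "organism",
}


def field_query(field, value):
    """Build a fieldname:value query, quoting multi-word values."""
    if field not in KNOWN_FIELDS:
        raise ValueError(f"unknown field: {field}")
    value = str(value).strip()
    # Phrases of more than one word must be quoted, otherwise only
    # the first word is searched for.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{field}:{value}"
```

For example, `field_query("organism", "Homo sapiens")` yields `organism:"Homo sapiens"`, and the result can be combined with other terms using AND, OR or NOT as described in section 2.1.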



2.3 Filtering results by counts

You can also search for experiments that fulfil certain count criteria, e.g. those having more than 10 assays (hybridizations). Here are some examples:

Filter | Query format | What is filtered | Example
Number of assays | assaycount:[x TO y] | Filter on the number of assays, where x <= y and both values are between 0 and 99,999 (inclusive). To exclude the boundary values, use curly brackets, e.g. assaycount:{1 TO 5} will find experiments with 2-4 assays. A single number may also be given, e.g. assaycount:10 will find experiments with exactly 10 assays. | assaycount:[1 TO 5]
Number of experimental factors | efcount:[x TO y] | Filter on the number of experimental factors (the main variables under study in an experiment, e.g. "sex", "genotype", "strain"). | efcount:[1 TO 5]
Number of samples | samplecount:[x TO y] | Filter on the number of samples. | samplecount:[1 TO 5]
Number of sample attribute categories | sacount:[x TO y] | Filter on the number of sample attribute categories. A category can be "patient ID", "treatment", "sex", "diet". | sacount:[1 TO 5]
Raw data files | raw:true/false | Filter on the presence/absence of raw data files (native files obtained directly from a microarray scanner or sequencing machine). |
Processed data files | processed:true/false | Filter on the presence/absence of processed data files (e.g. normalised/transformed data). |
Presence of adequate meta-data (MIAME score) | miamescore:[x TO y] | Filter on the MIAME compliance score (maximum score is 5). | miamescore:[1 TO 5]
Release date | date:yyyy-mm-dd | Filter by release date, in any of the forms listed below. | date:[2008-01-01 2008-05-31]

Release date queries can take these forms:

  • date:2009-12-01 - will search for experiments released on 1 December 2009
  • date:2009* - will search for experiments released in 2009
  • date:[2008-01-01 2008-05-31] - will search for experiments released between 1 January and 31 May 2008
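The range syntax for counts and dates can be sketched as follows. The helpers `count_filter` and `date_range_filter` are hypothetical names for this example; they only format query strings in the notation shown above, and they do not enforce the 0 to 99,999 limit stated for assay counts.

```python
from datetime import date


def count_filter(field, low, high=None, inclusive=True):
    """Format a count filter such as assaycount:[1 TO 5].

    Square brackets include the boundary values; curly brackets
    exclude them. A single number matches that count exactly.
    """
    if high is None:
        return f"{field}:{low}"
    if low > high:
        raise ValueError("the lower bound must not exceed the upper bound")
    left, right = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{left}{low} TO {high}{right}"


def date_range_filter(start, end):
    """Format a release date range such as date:[2008-01-01 2008-05-31]."""
    if start > end:
        raise ValueError("the start date must not be after the end date")
    return f"date:[{start.isoformat()} {end.isoformat()}]"
```

For example, `count_filter("assaycount", 1, 5, inclusive=False)` gives `assaycount:{1 TO 5}`, which according to the table matches experiments with 2-4 assays.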