Programmatic access

REST-style queries to retrieve results in XML format
   1. Finding Experiments
       Keyword searches for experiments and data files
       Specifying particular fields for searching
       Retrieving detailed metadata of a specific experiment
   2. Finding Files
   3. Finding Protocols
   4. Sorting the output
   5. Format of XML results
JSON queries
Accessing private data
Changes to programmatic access since August 2016


REST-style queries to retrieve results in XML format

Experiments, and the protocols and files linked to them, can be searched by keyword, by specific fields (e.g. sample attributes or experiment types), or by conditions such as the number of assays (hybridizations) or the release date.

1. Finding Experiments

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments

This is the base URL for experiment searches. With no parameters, it returns an XML document describing the metadata of all public experiments in ArrayExpress.

If you know the experiment's accession number, the following syntax can be used (more on the format of ArrayExpress accession numbers):
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn

Multiple accessions can be retrieved at once using:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-MTAB-1234,E-MTAB-5678
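
As a minimal sketch of the accession-based URLs above, the following Python helper builds the request URL for zero, one or several accessions. The names BASE_URL, experiments_url and fetch_xml are illustrative and not part of ArrayExpress; fetch_xml assumes the endpoint is reachable from your network and that the response is UTF-8 encoded.

```python
from urllib.request import urlopen

BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page


def experiments_url(*accessions):
    """Build the experiments URL, optionally narrowed to specific accessions."""
    url = f"{BASE_URL}/experiments"
    if accessions:
        # Several accessions are joined with commas, as in the example above.
        url += "/" + ",".join(accessions)
    return url


def fetch_xml(url, timeout=60):
    """Download a URL and return the body as text (hypothetical helper).

    Assumes the response is UTF-8 encoded.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(experiments_url("E-MTAB-1234", "E-MTAB-5678"))
```

Calling fetch_xml(experiments_url("E-MTAB-1234")) would then download the XML for a single experiment.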


1.1 Keyword searches for experiments

Keyword searches across all fields of experiments, and of files linked to experiments, can be made using URLs of the following form:

One keyword: https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate

Multiple keywords: https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=some%20keywords
(or https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/some%20keywords)

A few points to note when using keyword search:

  • Accession number and keyword searches are case insensitive
  • Use an asterisk * as a multiple character wild card e.g. keywords=colo* will search for colon, colorectal, color etc
  • Use a question mark ? as a single character wild card e.g. keywords=te?t will search for text and test
  • Phrases of more than one word must be entered in quotes e.g. keywords="growth condition"
  • More than one keyword can be searched for using the '+' sign e.g. keywords=lung+cancer. The search treats these as 'AND' statements. See below for using OR and NOT.

More complex queries can be constructed using the operators AND, OR or NOT. All operators must be entered in UPPERCASE. AND is the default if no operator is specified.

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate+AND+breast

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate+breast (same as above)

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate+OR+breast

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate+NOT+breast
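
The keyword rules above can be sketched in Python as follows. The function keyword_url is a hypothetical helper, not an ArrayExpress client: it joins terms with an uppercase operator, wraps multi-word phrases in quotes, and URL-encodes the result so that spaces become '+'. It assumes the server decodes percent-encoded quotes (%22) in the usual way.

```python
from urllib.parse import quote_plus

BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page
OPERATORS = ("AND", "OR", "NOT")  # operators must be uppercase


def keyword_url(*terms, operator="AND"):
    """Build an experiments keyword query joined by AND, OR or NOT."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {OPERATORS}")
    # Phrases of more than one word must be entered in quotes.
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    expression = f" {operator} ".join(quoted)
    # quote_plus turns spaces into '+'; '*' is kept readable for wildcards.
    return f"{BASE_URL}/experiments?keywords=" + quote_plus(expression, safe="*")


print(keyword_url("prostate", "breast", operator="OR"))
print(keyword_url("growth condition"))
print(keyword_url("colo*"))
```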

1.2 Specifying particular fields for searching

The following field names can be used to specify the field in which a keyword is searched for.

Many of the free-text fields support search term expansion using the Experimental Factor Ontology (EFO). For example, if 'cancer' is entered, the search will be for the term "cancer", for synonyms of "cancer", and sub types of cancer listed in EFO ("lymphoma", "breast adenocarcinoma", etc).

Field name | What is searched? | Example
accession | Experiment primary ArrayExpress or secondary (GEO, ENA, EGA etc) accession | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?accession=E-MEXP-31
array | Array design accession or name (wildcards supported) | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?array=A-AFFY-33
expdesign | Experiment design type, related to the questions being addressed by the study, e.g. "time series design", "stimulus or stress design", "genetic modification design". Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?expdesign=dose+response
exptype | Experiment type, related to the assay technology used. List of experiment types in ArrayExpress. Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?exptype="RNA-seq of non coding RNA"
ef or ev | Experimental factor (also called experimental variable), the name of the main variable under study in an experiment. E.g. if the factor is "sex" in a human study, the researchers would be comparing between male and female samples, and "sex" is not merely an attribute the samples happen to have. Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?ef="cell type"
efv or evv | The value of an experimental factor. E.g. The values for "genotype" factor can be "wild type genotype", "p53-/-". Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?efv=HeLa
sa | Sample attribute values, e.g. "male", "liver". Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?sa=fibroblast
sac | Sample attribute category that is defined in an experiment, e.g. "age", "cell type", "disease". Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?sac=age
species | Species of the samples. Can use common name (e.g. "mouse") or binomial nomenclature/Latin names (e.g. "Mus musculus"). Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?species="homo sapiens"
pmid | PubMed identifier | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?pmid=16553887

There are several "Boolean" traits that can be used to filter experiments. These parameters accept "on/off", "1/0" or "true/false" as values.

Field name | What is filtered? | Example
gxa | Presence ("true") / absence ("false") of an ArrayExpress experiment in the Expression Atlas. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?gxa=true
directsub | If "true" only returns experiments directly submitted to ArrayExpress (i.e. not imported from GEO). For more information about how we import data from GEO see the GEO data help page. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?directsub=true
raw | Experiment has raw data available. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?raw=true
processed | Experiment has processed data available. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?processed=true

Experiments that fulfill certain count criteria, e.g. having more than 10 assays (hybridizations), can also be searched for. These searches use the following syntax:

Field name | What is filtered? | Example
assaycount [x TO y] | The number of assays, where x <= y and both values are between 0 and 99,999 (inclusive). To count excluding the values given use curly brackets e.g. assaycount={1 TO 5} will find experiments with 2-4 assays. Single numbers may also be given e.g. assaycount=10 will find experiments with 10 assays. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?assaycount=[1 TO 5]
samplecount [x TO y] | The number of samples | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?samplecount=[1 TO 5]
efcount [x TO y] | The number of experimental factors | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?efcount=[1 TO 5]
sacount [x TO y] | The number of sample attribute categories | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?sacount=[1 TO 5]
miamescore [x TO y] | The MIAME compliance score (maximum score is 5) | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?miamescore=[1 TO 5]
minseqescore [x TO y] | The MINSEQE compliance score (maximum score is 5) | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?minseqescore=[1 TO 5]
date | The release date of the experiment. Format is [YYYY-MM-DD]. Wildcards supported. For example: date=2009-12-01 will search for experiments released on 1st of Dec 2009; date=2009* will search for experiments released in 2009; date=[2008-01-01 2008-05-31] will search for experiments released between 1st of Jan and end of May 2008. | https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?date=[2008-01-01 2008-05-31]

To link different search criteria together use the '&' symbol. E.g.

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=glioblastoma&species="homo sapiens"

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?sa=fibroblast&species="mus musculus"
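
A short Python sketch of combining field filters and count ranges with '&' is given below. The helpers count_range and search_url are hypothetical names introduced for illustration. urlencode percent-encodes the quotes, brackets and spaces, which is assumed to be equivalent to the literal forms shown in the tables above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page


def count_range(low, high, inclusive=True):
    """Format a count range: [x TO y] includes the bounds, {x TO y} excludes them."""
    if not 0 <= low <= high <= 99999:
        raise ValueError("need 0 <= low <= high <= 99999")
    return f"[{low} TO {high}]" if inclusive else f"{{{low} TO {high}}}"


def search_url(resource="experiments", **fields):
    """Join field=value pairs with '&' and URL-encode the values."""
    return f"{BASE_URL}/{resource}?{urlencode(fields)}"


url = search_url(
    keywords="glioblastoma",
    species='"homo sapiens"',        # multi-word values are quoted
    assaycount=count_range(1, 5),    # between 1 and 5 assays, inclusive
)
print(url)
```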

1.3 Retrieving detailed metadata of a specific experiment

To retrieve more detailed metadata associated with an experiment, e.g. sample annotation, protocols and data files, use the following syntax.

Experiment metadata XML for the accession E-xxxx-nnnnn:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn

Files metadata XML for the experiment with accession E-xxxx-nnnnn:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/files

Samples metadata XML for the experiment with accession E-xxxx-nnnnn:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/samples

Protocols metadata XML for the experiment with accession E-xxxx-nnnnn:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/protocols
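
The four URLs above follow one pattern, which can be sketched in Python as below. The function experiment_detail_urls is a hypothetical helper, and E-MEXP-3682 is simply an accession used as an example elsewhere on this page.

```python
BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page
DETAIL_RESOURCES = ("files", "samples", "protocols")


def experiment_detail_urls(accession):
    """Return the experiment URL and its files/samples/protocols URLs."""
    base = f"{BASE_URL}/experiments/{accession}"
    urls = {"experiment": base}
    for resource in DETAIL_RESOURCES:
        urls[resource] = f"{base}/{resource}"
    return urls


for name, url in experiment_detail_urls("E-MEXP-3682").items():
    print(name, url)
```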

2. Finding Files

Files on the public ArrayExpress FTP site can be directly searched for using keywords and several field codes.

https://www.ebi.ac.uk/arrayexpress/xml/v3/files

This is the base URL for file searches. With no parameters, it returns an XML document describing the file metadata of all public experiments in ArrayExpress.

The following parameters are available for file searches:

Field name | What is filtered? | Example
keywords | Perform full text keyword search in experiment metadata. | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?keywords=cancer
accession | Experiment primary ArrayExpress or secondary (GEO, ENA, EGA etc) accession. Wildcard supported. | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?accession=E-MEXP-31
name | The file name | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?name=E-MTAB-3*
kind | The file type. Choose from: processed, raw, cel, adf, idf, sdrf, r-object | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?kind=raw
extension | The file extension (e.g. 'txt' or 'tar.gz') | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?extension=zip

A few points to note when searching for files:

Finding information about individual files from an experiment (not the zipped archive files) and their ftp location is possible via
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/samples
under the tag <file> and/or <scan> for sequencing experiments.

https://www.ebi.ac.uk/arrayexpress/xml/v3/files/E-xxxx-nnnnn
is equivalent to https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/files
and retrieves files metadata for the experiment with accession E-xxxx-nnnnn.

https://www.ebi.ac.uk/arrayexpress/xml/v3/files/some%20keywords
is equivalent to https://www.ebi.ac.uk/arrayexpress/xml/v3/files?keywords=some%20keywords
and retrieves files metadata for search keywords "some keywords".
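
As a rough sketch, file searches can be built in Python as follows. The helpers files_for_experiment and files_search are hypothetical; the list of allowed kind values is taken from the table above. Combining several file parameters with '&' is assumed to work in the same way as for experiment searches.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page
FILE_KINDS = {"processed", "raw", "cel", "adf", "idf", "sdrf", "r-object"}


def files_for_experiment(accession):
    """Short form of /experiments/<accession>/files."""
    return f"{BASE_URL}/files/{accession}"


def files_search(**filters):
    """Build a file search from keywords, accession, name, kind or extension."""
    kind = filters.get("kind")
    if kind is not None and kind not in FILE_KINDS:
        raise ValueError(f"kind must be one of {sorted(FILE_KINDS)}")
    # '*' is left unencoded so wildcard queries stay readable.
    return f"{BASE_URL}/files?{urlencode(filters, safe='*')}"


print(files_for_experiment("E-MEXP-31"))
print(files_search(name="E-MTAB-3*"))
print(files_search(kind="raw", extension="zip"))
```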

3. Finding Protocols

https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols

This is the base URL for protocol searches. With no parameters, it returns an XML document describing the protocols of all public experiments in ArrayExpress.

The following parameters are available for protocol searches:

Field name | What is filtered? | Example
keywords | Perform full text keyword search in protocols metadata. Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols?keywords=Trizol
accession | Protocol accession. Wildcard supported. | https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols?accession=P-GSE55015-2
experiment | The file name | https://www.ebi.ac.uk/arrayexpress/xml/v3/files?name=E-MTAB-3*
type | The protocol type. Choose from ontology terms. Has EFO expansion. | https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols?type="normalization data transformation protocol"
standard | If protocol is standard public protocol (Boolean, accepts "true/false", "0/1" or "on/off"). | https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols?standard=true

A few points to note when searching for protocols:

https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols/E-xxxx-nnnnn
is equivalent to https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-xxxx-nnnnn/protocols
and retrieves protocols metadata for the experiment with accession E-xxxx-nnnnn.

https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols/some%20keywords
is equivalent to https://www.ebi.ac.uk/arrayexpress/xml/v3/protocols?keywords=some%20keywords
and retrieves protocol metadata for search keywords "some keywords".

4. Sorting the output

The results of a query can be sorted on several fields in ascending or descending order using sortby=xxx and sortorder=ascending/descending. The fields that can be used for sorting are:

  • accession
  • name
  • assays
  • species
  • releasedate
  • fgem (for "final gene expression matrix", i.e. processed data)
  • raw
  • atlas

Example query:

https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments?keywords=prostate&sortby=accession&sortorder=ascending
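
The sorting parameters can be added to any query. A minimal Python sketch is shown below; sorted_search_url is a hypothetical helper that checks the sort field and order against the lists on this page before building the URL.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/arrayexpress/xml/v3"  # base URL from this page
SORT_FIELDS = ("accession", "name", "assays", "species",
               "releasedate", "fgem", "raw", "atlas")
SORT_ORDERS = ("ascending", "descending")


def sorted_search_url(sortby, sortorder="ascending", **fields):
    """Build an experiments query with sortby and sortorder appended."""
    if sortby not in SORT_FIELDS:
        raise ValueError(f"sortby must be one of {SORT_FIELDS}")
    if sortorder not in SORT_ORDERS:
        raise ValueError(f"sortorder must be one of {SORT_ORDERS}")
    # Search fields come first, then the two sorting parameters.
    params = dict(fields, sortby=sortby, sortorder=sortorder)
    return f"{BASE_URL}/experiments?{urlencode(params)}"


print(sorted_search_url("accession", keywords="prostate"))
```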

5. Format of XML results

The XML document returned from a search lists how many experiments were retrieved, followed by either information about each experiment or the protocols or files associated with an experiment, depending on the search made. The information about each experiment, file or protocol is within <experiment></experiment>, <file></file>, or <protocol></protocol> elements, respectively.

Example output of an experiments search:
https://www.ebi.ac.uk/arrayexpress/xml/v3/experiments/E-MEXP-3682
Example output for XML queries (experiments)

Example output of a files search:
https://www.ebi.ac.uk/arrayexpress/xml/v3/files/E-MEXP-3682
Example output for XML queries (files)
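
A returned document can be read with any XML parser. The Python sketch below uses the standard library and a made-up, heavily simplified stand-in for a response: only the <experiment> element name comes from this page, while the root element name and the <accession> child element are assumptions made for illustration. Check a real response before relying on them.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for a response. The root element and the <accession>
# child are assumptions; real documents contain many more fields.
SAMPLE_XML = """<experiments>
  <experiment><accession>E-MEXP-3682</accession></experiment>
  <experiment><accession>E-MEXP-31</accession></experiment>
</experiments>"""


def list_records(xml_text, tag="experiment", child="accession"):
    """Return the text of one child element for every <tag> record found."""
    root = ET.fromstring(xml_text)
    return [record.findtext(child) for record in root.iter(tag)]


print(list_records(SAMPLE_XML))
```

For file or protocol searches, pass tag="file" or tag="protocol" together with whichever child element you need.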

JSON queries

To retrieve the results of queries in JSON format, the base URL changes from https://www.ebi.ac.uk/arrayexpress/xml/v3/ to https://www.ebi.ac.uk/arrayexpress/json/v3/. All queries described above can be carried out and will produce a JSON-format file which can be downloaded.

Example experiment queries:

https://www.ebi.ac.uk/arrayexpress/json/v3/experiments?keywords=cancer+breast

https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-MEXP-31

Example file queries:

https://www.ebi.ac.uk/arrayexpress/json/v3/files?keywords=cancer+breast

https://www.ebi.ac.uk/arrayexpress/json/v3/files/E-MEXP-31

There is one extra parameter, 'jsonp', which enables JSONP: the JSON output will be prepended with the value of the jsonp parameter and wrapped in parentheses. For example:

https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-MEXP-31?jsonp=experiment
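
If a JSONP response has to be read outside a browser, the wrapper can be stripped before parsing. The Python sketch below shows one way to do this; parse_jsonp is a hypothetical helper, and the sample payload is made up rather than copied from a real response. Tolerating a trailing semicolon is a defensive assumption.

```python
import json


def parse_jsonp(text, callback):
    """Strip a 'callback( ... )' wrapper and parse the JSON inside it."""
    text = text.strip()
    if text.endswith(";"):
        # Tolerate an optional trailing semicolon (an assumption).
        text = text[:-1].rstrip()
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("text is not wrapped in the expected callback")
    return json.loads(text[len(prefix):-1])


# Made-up payload for illustration only.
sample = 'experiment({"experiments": {"total": 1}})'
print(parse_jsonp(sample, "experiment"))
```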

Accessing private data

Private data are usually pre-published/unpublished data. Access to private data is under password control.

  1. Use your client to retrieve the following URL (inserting the username and password provided to you by the ArrayExpress curators):
    http://www.ebi.ac.uk/arrayexpress/verify-login.txt?u=username&p=password
    If the login details are correct, this will return a login token that is unique to your username, IP address and client. If the login fails, you will get a blank page.
  2. Set up 2 cookies for the domain http://www.ebi.ac.uk/ :
    AeLoggedUser = username
    AeLoginToken = login token from URL in step 1
  3. Use these cookies when making all subsequent requests to ArrayExpress. It is important to use the same client that was used in step 1.
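
The three steps above can be sketched in Python with the standard library. The helper names are hypothetical. The sketch assumes that the token is returned as plain text in the response body and that sending both cookies in a single Cookie header is accepted. Both requests go through urllib, so the same client is used for step 1 and step 3.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

LOGIN_URL = "http://www.ebi.ac.uk/arrayexpress/verify-login.txt"  # from step 1


def login_url(username, password):
    """Step 1: URL that returns a login token for valid credentials."""
    return f"{LOGIN_URL}?{urlencode({'u': username, 'p': password})}"


def cookie_header(username, token):
    """Step 2: the two cookies, formatted as one Cookie header value."""
    return f"AeLoggedUser={username}; AeLoginToken={token}"


def fetch_private(url, username, password, timeout=60):
    """Step 3: request a URL with the login cookies (hypothetical helper)."""
    with urlopen(login_url(username, password), timeout=timeout) as response:
        # Assumes the token is the plain-text body of the response.
        token = response.read().decode("utf-8").strip()
    if not token:
        # A failed login returns a blank page.
        raise PermissionError("login failed: empty response")
    request = Request(url, headers={"Cookie": cookie_header(username, token)})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(cookie_header("username", "login-token"))
```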

Changes to programmatic access since August 2016

  • New protocols search function.
  • Detailed metadata for a specific experiment can be queried. This allows retrieval of individual sample information, including their annotation and linked file locations.
  • New search fields have been added: "sac" (sample attribute category) and "minseqescore".
  • The XML layout has slightly changed. E.g. file records are not wrapped in <experiment> tags anymore.