Programmatic access

1. REST-style queries to retrieve results in XML format
  1.1. Keyword searches for experiments and data files
  1.2. Specifying particular fields for searching
  1.3. Expanding searches using the Experimental Factor Ontology
  1.4. Construction of queries using AND, OR and NOT operators
  1.5. Filtering to get ArrayExpress direct submission data
  1.6. Filtering by counts
  1.7. Sorting
  1.8. Old-style queries
      1.8.1. Keyword searches
      1.8.2. Species searches
      1.8.3. Retrieving files
  1.9. Accessing private data
  1.10. Retrieve information about all public experiments
  1.11. Format of XML results
      1.11.1. Searches for experiments
      1.11.2. Searches for files
2. JSON web services format
3. Changes to programmatic access since September 2010

 

1. REST-style queries to retrieve results in XML format

Experiments and files linked to experiments can be searched for using keywords, by searching specific fields (e.g. sample attributes or experiment types), or by selecting experiments that fulfil certain conditions, such as having a given number of assays (hybridizations) or having been released on a particular date.

1.1. Keyword searches for experiments and files associated with experiments

Keyword searches of all fields for experiments and files linked to experiments can be made using the following format of URLs:

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate
http://www.ebi.ac.uk/arrayexpress/xml/v2/files?keywords=glioblastoma

 

A few points to note when using keyword search:

  • Accession number and keyword searches are case insensitive
  • Phrases of more than one word must be entered in quotes e.g. keywords="growth condition"
  • More than one keyword can be searched for using the '+' sign, e.g. keywords=lung+cancer. The search treats these as 'AND' statements. See below for using OR and NOT.
  • Use an asterisk * as a multiple character wild card e.g. keywords=colo* will search for colon, colorectal, color etc
  • Use a question mark ? as a single character wild card e.g. keywords=te?t will search for text and test
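
The keyword rules above can be sketched in Python. This is a minimal illustration rather than an official client: the helper name keyword_url is hypothetical, and it only builds the URL string with the standard library (no request is sent). It assumes the server treats percent-encoded quotes (%22) the same as literal quotes.

```python
from urllib.parse import urlencode

# Base URL taken from the examples above.
BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2"


def keyword_url(resource, *terms):
    """Build a keyword-search URL for 'experiments' or 'files' (hypothetical helper).

    Terms containing spaces are wrapped in double quotes so they are searched
    as phrases. Separate terms are joined with spaces, which urlencode()
    writes as '+', so the search treats them as AND.
    """
    if resource not in ("experiments", "files"):
        raise ValueError("resource must be 'experiments' or 'files'")
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    # safe="*?" leaves the wildcard characters unescaped in the query string.
    query = urlencode({"keywords": " ".join(quoted)}, safe="*?")
    return f"{BASE_URL}/{resource}?{query}"


print(keyword_url("experiments", "lung", "cancer"))
print(keyword_url("files", "colo*"))
print(keyword_url("experiments", "growth condition"))
```

Quoting is applied automatically only to terms that contain a space, so single words and wildcard patterns pass through unchanged.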


1.2 Specifying particular fields for searching

The following terms can be used to specify the field in which a term is searched for. Either experiments or files can be searched for in all cases by using the 'experiments' or 'files' term in the URL.

Field name | Searches | Example
accession | Experiment primary or secondary accession | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?accession=E-MEXP-31
array | Array design accession or name | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?array=A-AFFY-33
ef | Experimental factor, the name of the main variable under study in an experiment. E.g. if the factor is "sex" in a human study, the researchers would be comparing between male and female samples, and "sex" is not merely an attribute the samples happen to have. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?ef="cell type"
efv | The value of an experimental factor. E.g. The values for "genotype" factor can be "wild type genotype", "p53-/-". Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?efv=HeLa
expdesign | Experiment design type, related to the questions being addressed by the study, e.g. "time series design", "stimulus or stress design", "genetic modification design". Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?expdesign=dose+response
exptype | Experiment type, related to the assay technology used. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?exptype="RNA-seq of non coding RNA"
gxa | Presence ("true") / absence ("false") of an ArrayExpress experiment in the Expression Atlas. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?gxa=true
pmid | PubMed identifier | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?pmid=16553887
sa | Sample attribute values, e.g. "male", "liver". Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?sa=fibroblast
species | Species of the samples. Can use common name (e.g. "mouse") or binomial nomenclature/Latin names (e.g. "Mus musculus"). Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?species="homo sapiens"

To link different search criteria together use the '&' symbol. E.g.

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=glioblastoma&species="homo sapiens"
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?sa=fibroblast&species="mus musculus"
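
As a rough sketch, several field criteria can be combined programmatically by letting the standard library join them with '&'. The helper name field_query_url is hypothetical; the set of field names is copied from the table in this section (plus keywords), and the code only builds a URL string.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2"

# Field names listed in this section, plus the general 'keywords' field.
SEARCH_FIELDS = {
    "keywords", "accession", "array", "ef", "efv", "expdesign",
    "exptype", "gxa", "pmid", "sa", "species",
}


def field_query_url(resource, **criteria):
    """Join field=value criteria with '&' (hypothetical helper)."""
    unknown = set(criteria) - SEARCH_FIELDS
    if unknown:
        raise ValueError(f"unknown search field(s): {sorted(unknown)}")
    # urlencode() escapes quotes as %22 and spaces as '+'.
    return f"{BASE_URL}/{resource}?{urlencode(criteria)}"


print(field_query_url("experiments", sa="fibroblast", species='"mus musculus"'))
```

Rejecting unknown field names early is a design choice of this sketch; it catches misspelled parameters before a request is made.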


1.3 Expanding searches using the Experimental Factor Ontology

The Experimental Factor Ontology (EFO) is an application-focused ontology modelling the experimental factors in ArrayExpress. You can expand searches using EFO. For example, if 'cancer' is entered, the search will be for the term "cancer", for synonyms of "cancer", and for subtypes of cancer listed in EFO ("lymphoma", "breast adenocarcinoma", etc.).

Search terms are expanded using EFO by default for keyword-based searches and all relevant fields:

  • organism
  • exptype (for "experiment type")
  • expdesign (for "experiment design")
  • ef (for "experimental factor")
  • efv (for "experimental factor value")
  • sa (for "sample attribute")

8 May 2014 update: Please note that the parameter expandefo (which used to control whether EFO expansion is turned on or not) is no longer used.


1.4 Construction of queries using AND, OR and NOT operators

More complex queries can be constructed using the operators AND, OR or NOT. All operators must be entered in UPPERCASE. AND is the default if no operator is specified. Again, either experiments or files can be searched for in all cases by using the 'experiments' or 'files' term in the URL.

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+AND+breast
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+breast (same as above)
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+OR+breast
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+NOT+breast
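
Because operators must be uppercase, a small helper can normalise them before the URL is built. The following is a sketch under that one rule only; the function names are hypothetical and no request is sent.

```python
from urllib.parse import quote_plus

BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2"
OPERATORS = ("AND", "OR", "NOT")


def boolean_expression(first_term, *clauses):
    """Combine terms with AND/OR/NOT; each clause is an (operator, term) pair."""
    parts = [first_term]
    for operator, term in clauses:
        operator = operator.upper()  # operators must be entered in uppercase
        if operator not in OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        parts.extend([operator, term])
    return " ".join(parts)


def boolean_url(resource, first_term, *clauses):
    """Build a keywords URL; quote_plus() writes the spaces as '+'."""
    expression = boolean_expression(first_term, *clauses)
    return f"{BASE_URL}/{resource}?keywords={quote_plus(expression)}"


print(boolean_url("experiments", "prostate", ("not", "breast")))
```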


1.5 Filtering to get ArrayExpress direct submission data

Searches can be limited to experiments submitted directly to ArrayExpress (this excludes data imported from the Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/)) or to only the imported data. To do this, use the following syntax:

Field | What is searched | Example
directsub | only experiments directly submitted to ArrayExpress (directsub=true), or only experiments imported from the GEO database (directsub=false) | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?directsub=true

For more information about how we import data from GEO see the GEO data help page.


1.6 Filtering by counts

Experiments fulfilling certain count criteria can also be searched for, e.g. having more than 10 assays (hybridizations). These searches use the following syntax:

Filter | What is filtered | Example
assaycount:[x TO y] | filter on the number of assays, where x <= y and both values are between 0 and 99,999 (inclusive). To count excluding the values given, use curly brackets, e.g. assaycount:{1 TO 5} will find experiments with 2-4 assays. Single numbers may also be given, e.g. assaycount:10 will find experiments with 10 assays. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?assaycount=[1 TO 5]
efcount:[x TO y] | filter on the number of experimental factors | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?efcount=[1 TO 5]
samplecount:[x TO y] | filter on the number of samples | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?samplecount=[1 TO 5]
sacount:[x TO y] | filter on the number of sample attribute categories | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?sacount=[1 TO 5]
raw:true/false | filter on the presence/absence of raw data files | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?raw=true
processed:true/false | filter on the presence/absence of processed data files | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?processed=true
miamescore:[x TO y] | filter on the MIAME compliance score (maximum score is 5) | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?miamescore=[1 TO 5]
date:yyyy-mm-dd | filter by release date: date:2009-12-01 will search for experiments released on 1 December 2009; date:2009* will search for experiments released in 2009; date:[2008-01-01 2008-05-31] will search for experiments released between 1 January and the end of May 2008 | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?date=[2008-01-01 2008-05-31]
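
The bracket syntax for count ranges can be generated with a small helper. This is an illustrative sketch: the function name count_range is hypothetical, and applying the 0 to 99,999 bounds to every count filter (they are stated above only for assaycount) is an assumption. Square brackets include the end values and curly brackets exclude them.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2"


def count_range(low, high, inclusive=True):
    """Format a range as '[x TO y]' (inclusive) or '{x TO y}' (exclusive)."""
    # Assumption: the 0..99,999 limits given for assaycount apply to all counts.
    if not (0 <= low <= high <= 99999):
        raise ValueError("need 0 <= low <= high <= 99999")
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{opening}{low} TO {high}{closing}"


# Experiments with 1-5 assays that also have raw data files.
params = {"assaycount": count_range(1, 5), "raw": "true"}
url = f"{BASE_URL}/experiments?{urlencode(params)}"
print(url)
```

urlencode() percent-encodes the brackets and writes spaces as '+', which keeps the URL valid for clients that reject unescaped brackets or spaces.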


1.7 Sorting

The results of a query can be sorted on several fields in ascending or descending order using sortby=xxx and sortorder=ascending/descending. The fields that can be used for sorting are:

  • accession
  • name
  • assays
  • species
  • releasedate
  • fgem (for "final gene expression matrix", i.e. processed data)
  • raw
  • atlas

Example queries:

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate&sortby=accession&sortorder=ascending
http://www.ebi.ac.uk/arrayexpress/xml/v2/files?sa=heart&sortby=releasedate&sortorder=descending
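
A sorted query can be assembled in the same way as any other query, with sortby and sortorder appended as extra parameters. The sketch below uses a hypothetical helper name and checks the sort field against the list above before building the URL.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2"

# Sortable fields as listed in this section.
SORT_FIELDS = (
    "accession", "name", "assays", "species",
    "releasedate", "fgem", "raw", "atlas",
)


def sorted_query_url(resource, criteria, sortby, descending=False):
    """Append sortby/sortorder to a dict of search criteria (hypothetical helper)."""
    if sortby not in SORT_FIELDS:
        raise ValueError(f"cannot sort by {sortby!r}")
    params = dict(criteria)  # copy so the caller's dict is not modified
    params["sortby"] = sortby
    params["sortorder"] = "descending" if descending else "ascending"
    return f"{BASE_URL}/{resource}?{urlencode(params)}"


print(sorted_query_url("files", {"sa": "heart"}, "releasedate", descending=True))
```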


1.8 Old-style queries

The URL format for ArrayExpress searches was changed in September 2010 to allow searching of specific fields, use of AND, OR, NOT, counts of attributes and ordering of results.

The old-style URLs are still fully functional, however. The format of the pre-September 2010 URLs is described below.



1.8.1. Keyword searches

Keywords can be used to search for specific experiments, with the results returned in an XML document, using URLs with the 'experiments' term and the keywords=X format. A single experiment can also be requested by appending its accession to the URL. E.g.

http://www.ebi.ac.uk/arrayexpress/xml/experiments/E-MEXP-31
http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer

To narrow the search by using more than one keyword, separate terms by + e.g.

http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer+breast


1.8.2 Species searches

To narrow the search to experiments with samples of a particular species, include the 'species' term:

http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer&species=Homo+sapiens


 

1.8.3 Retrieving files

To retrieve the list of files associated with a set of experiments use the 'files' term in the URL

http://www.ebi.ac.uk/arrayexpress/xml/files?keywords=cancer+breast

To retrieve a list of files for a particular experiment format the query as follows:

http://www.ebi.ac.uk/arrayexpress/xml/files/E-MEXP-31

Also note:

  • Accession number and keyword searches are case insensitive.
  • Parts of words can also be searched, e.g. colo will retrieve results with matches to both colon and colorectal (and also color, colony etc.), unless there is a parameter wholewords=on in the query string, e.g.
http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer&species=Homo+sapiens&wholewords=on


1.9 Accessing private data

Private data are usually pre-published/unpublished data. Access to private data is under password control.

  1. Use your client to retrieve the following URL (inserting the username and password provided to you by the ArrayExpress curators):
    http://www.ebi.ac.uk/arrayexpress/verify-login.txt?u=username&p=password
    If the login details are correct, this will return a login token that is unique to your username, IP address and client. If the login fails, you will get a blank page.
  2. Set up two cookies for the domain http://www.ebi.ac.uk/:
    AeLoggedUser = username
    AeLoginToken = login token from URL in step 1
  3. Use these cookies when making all subsequent requests to ArrayExpress. It is important to use the same client that was used in step 1.
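
The three steps above might look as follows with Python's standard library. This is a sketch, not a tested client: the function names and the USER_AGENT value are hypothetical, and it assumes that "the same client" is identified by the User-Agent header and that the cookie values can be sent unencoded in a Cookie header. Nothing is requested until fetch_private() is called.

```python
import urllib.request
from urllib.parse import urlencode

LOGIN_URL = "http://www.ebi.ac.uk/arrayexpress/verify-login.txt"
USER_AGENT = "my-arrayexpress-client/0.1"  # hypothetical client identifier


def login_url(username, password):
    """Step 1: URL that returns the login token (or a blank page on failure)."""
    return f"{LOGIN_URL}?{urlencode({'u': username, 'p': password})}"


def cookie_header(username, token):
    """Step 2: the two cookies, formatted as a single Cookie header value."""
    return f"AeLoggedUser={username}; AeLoginToken={token}"


def fetch_private(url, username, password):
    """Steps 1-3: log in, then request `url` with the cookies set."""
    headers = {"User-Agent": USER_AGENT}
    request = urllib.request.Request(login_url(username, password), headers=headers)
    with urllib.request.urlopen(request) as response:
        token = response.read().decode("utf-8").strip()
    if not token:
        raise PermissionError("login failed: the server returned a blank page")
    # Reuse the same User-Agent, since the token is tied to the client.
    headers["Cookie"] = cookie_header(username, token)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Because the credentials travel in the query string of a plain http URL, callers may want to avoid logging the login URL.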


1.10 Retrieve information about all public experiments

To retrieve an XML file with information about all public experiments, remove the keyword part of the search:

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments


1.11 Format of XML results

The XML document returned from a search lists how many experiments were retrieved, followed by either information about each experiment or the files associated with each experiment, depending on the search made. In both cases, information about each experiment is in an <experiment></experiment> element.
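
As an illustration of reading such a result, the sketch below collects every <experiment> element with Python's xml.etree.ElementTree. Only the <experiment> element name comes from the description above; the root element, the <accession> child and the second accession in SAMPLE are made-up placeholders, so the real documents should be inspected before relying on any other element name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified result document for illustration only.
SAMPLE = """<experiments>
  <experiment><accession>E-MEXP-31</accession></experiment>
  <experiment><accession>E-TEST-2</accession></experiment>
</experiments>"""


def experiment_elements(xml_text):
    """Return all <experiment> elements found anywhere in the document."""
    root = ET.fromstring(xml_text)
    return list(root.iter("experiment"))


experiments = experiment_elements(SAMPLE)
print(len(experiments), "experiment element(s)")
for element in experiments:
    # 'accession' is an assumed child element name.
    print(element.findtext("accession"))
```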


1.11.1 Searches for experiments

Example:

[Image: XML example - experiment]


1.11.2 Searches for files

Example:

[Image: XML example - files]


2. JSON web services format

To retrieve the results of queries in JSON format, the base URL changes from http://www.ebi.ac.uk/arrayexpress/xml/v2/ to http://www.ebi.ac.uk/arrayexpress/json/v2/. All queries described above can be carried out and will produce a JSON-format file which can be downloaded.

E.g. experiment queries:

http://www.ebi.ac.uk/arrayexpress/json/v2/experiments?keywords=cancer+breast
http://www.ebi.ac.uk/arrayexpress/json/v2/experiments/E-MEXP-31

E.g. file queries:

http://www.ebi.ac.uk/arrayexpress/json/v2/files?keywords=cancer+breast
http://www.ebi.ac.uk/arrayexpress/json/v2/files/E-MEXP-31

There is one extra parameter, 'jsonp', which enables JSONP: the JSON output will be prepended with the value of the jsonp parameter and wrapped in parentheses. For example:

http://www.ebi.ac.uk/arrayexpress/json/v2/experiments/E-MEXP-31?jsonp=experiment
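
A short sketch of both ideas in this section: switching an XML query URL to its JSON equivalent, and unwrapping a JSONP response of the form callback(...). The function names are hypothetical and the sample payload is invented for illustration; the real JSON structure is not described here.

```python
import json

XML_BASE = "http://www.ebi.ac.uk/arrayexpress/xml/v2/"
JSON_BASE = "http://www.ebi.ac.uk/arrayexpress/json/v2/"


def to_json_url(xml_url):
    """Swap the XML base URL for the JSON one; the query itself is unchanged."""
    if not xml_url.startswith(XML_BASE):
        raise ValueError("not a v2 XML query URL")
    return JSON_BASE + xml_url[len(XML_BASE):]


def unwrap_jsonp(text, callback):
    """Strip the 'callback(' prefix and ')' suffix, then parse the JSON inside."""
    text = text.strip()
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("response is not wrapped in the expected callback")
    return json.loads(text[len(prefix):-1])


print(to_json_url(XML_BASE + "files/E-MEXP-31"))
# Invented payload, standing in for a response requested with ?jsonp=experiment
print(unwrap_jsonp('experiment({"total": 1})', "experiment"))
```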

 

3. Changes to programmatic access since September 2010

See also section 1.8 above for old-style queries used before September 2010.

  1. URL change: the base URL for queries has changed to http://www.ebi.ac.uk/arrayexpress/xml/v2/. The previous format for queries is still supported.
  2. Additional functionality, including filtering and complex queries (AND, OR, NOT), has been added.
  3. JSON web services have been added.