
Programmatic Access

 

1. Updates to programmatic access
  1.1 Recent changes to programmatic access
  1.2 Subscribe to the announcement list
 
2. Programmatic access of the Experiments Archive
  2.1 REST-style queries to retrieve results in XML format
    2.1.1 Keyword searches for experiments and data files
    2.1.2 Specifying particular fields for searching
    2.1.3 Expanding searches using the Experimental Factor Ontology
    2.1.4 Construction of queries using AND, OR and NOT operators
    2.1.5 Filtering to get ArrayExpress direct submission data
    2.1.6 Filtering by counts
    2.1.7 Sorting
    2.1.8 Old-style queries
      2.1.8.1 Keyword searches
      2.1.8.2 Species searches
      2.1.8.3 Retrieving files
    2.1.9 Accessing private data
    2.1.10 Retrieve information about all public experiments
    2.1.11 Format of XML results
      2.1.11.1 Searches for experiments
      2.1.11.2 Searches for files
  2.2 JSON web services format
 
3. Programmatic access of the Atlas of Gene Expression

 

1. Updates to programmatic access

1.1 Recent changes to programmatic access

September 2010

  1. URL change: the base URL for queries has changed to http://www.ebi.ac.uk/arrayexpress/xml/v2/. The previous format for queries is still supported.
  2. Additional functionality, including filtering and complex queries (AND, OR, NOT), has been added.
  3. JSON web services have been added.
  4. SOAP web services for the Data Warehouse are no longer supported. The Data Warehouse is a legacy database; please use the Gene Expression Atlas instead.

 


1.2 Subscribe to the announcement list

To be notified of any future changes and extensions to programmatic access, please subscribe to the announcement list using the form below.

ArrayExpress announcements and important news


(Closed mailing list so please wait for approval)

 


2. Programmatic access of the Experiments Archive

Experiment search results can be retrieved either in parsable XML format or JSON.

2.1 REST-style queries to retrieve results in XML format

Experiments and files linked to experiments can be searched for using keywords, by searching specific fields (e.g. sample attributes or experiment types), or by selecting experiments that fulfill certain conditions, such as having a given number of assays (hybridizations) or having been released on a particular date.

2.1.1 Keyword searches for experiments and files associated with experiments

Keyword searches of all fields for experiments and files linked to experiments can be made using URLs of the following format:

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate
http://www.ebi.ac.uk/arrayexpress/xml/v2/files?keywords=glioblastoma

 

  • Accession number and keyword searches are case insensitive.
  • Phrases of more than one word must be entered in quotes, e.g. keywords="breast cancer cells".
  • More than one keyword can be searched for using the '+' sign, e.g. keywords=lung+cancer. The search treats these as 'AND' statements. See section 2.1.4 for using OR and NOT.
  • Use an asterisk * as a multiple-character wildcard, e.g. keywords=colo* will match colon, colorectal, color etc.
  • Use a question mark ? as a single-character wildcard, e.g. keywords=te?t will match text and test.
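
The rules above can be sketched in Python as a small URL builder. This is only an illustration: the function name keyword_url is hypothetical, and the choice to leave '*' unescaped is an assumption made for readability, not a documented requirement of the service.

```python
from urllib.parse import urlencode

# Base URL taken from the examples in this section.
BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2/"


def keyword_url(resource, *terms):
    """Build a keyword search URL for 'experiments' or 'files'.

    Multi-word terms are wrapped in double quotes so they are searched as a
    phrase. Terms are joined with spaces, which urlencode turns into '+'
    (treated as AND by the search).
    """
    if resource not in ("experiments", "files"):
        raise ValueError("resource must be 'experiments' or 'files'")
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    # safe="*" keeps the multi-character wildcard readable; other reserved
    # characters (quotes, '?') are percent-encoded.
    query = urlencode({"keywords": " ".join(quoted)}, safe="*")
    return f"{BASE_URL}{resource}?{query}"


print(keyword_url("experiments", "lung", "cancer"))
print(keyword_url("files", "colo*"))
print(keyword_url("experiments", "breast cancer cells"))
```

The quotes around a phrase appear as %22 in the generated URL, which is the percent-encoded form of the double quote character.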

 


2.1.2 Specifying particular fields for searching

The following terms can be used to specify the field in which a term is searched for. Either experiments or files can be searched for in all cases by using the 'experiments' or 'files' term in the URL.

Field name | Searches | Example
accession | Experiment primary or secondary accession | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?accession=E-MEXP-31
array | Array design accession or name | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?array=A-AFFY-33
ef | Experimental factor, the name of the main variables in an experiment | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?ef=CellType
efv | Experimental factor value. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?efv=HeLa
expdesign | Experiment design type | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?expdesign=dose+response
exptype | Experiment type. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?exptype="RNA-seq"
gxa | Presence in the Gene Expression Atlas. Only value is gxa=true. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?gxa=true
pmid | PubMed identifier | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?pmid=16553887
sa | Sample attribute values. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/files?sa=fibroblast
species | Species of the samples. Has EFO expansion. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?species="homo sapiens"

 

To link different search criteria together use the '&' symbol. E.g.

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=glioblastoma&species="homo sapiens"
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?sa=fibroblast&species="mus musculus"
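
As a rough sketch of combining field criteria with '&', the Python snippet below builds such a URL from keyword arguments. The helper name field_search_url and the SEARCH_FIELDS set are hypothetical conveniences; the field names themselves are the ones listed in the table above, plus the general 'keywords' parameter.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ebi.ac.uk/arrayexpress/xml/v2/"

# Field names from the table above, plus the general 'keywords' parameter.
SEARCH_FIELDS = {
    "keywords", "accession", "array", "ef", "efv", "expdesign",
    "exptype", "gxa", "pmid", "sa", "species",
}


def field_search_url(resource, **criteria):
    """Join one or more field=value criteria with '&' into a query URL."""
    unknown = set(criteria) - SEARCH_FIELDS
    if unknown:
        raise ValueError(f"unknown search field(s): {sorted(unknown)}")
    # urlencode joins the pairs with '&' and encodes spaces as '+'.
    return f"{BASE_URL}{resource}?{urlencode(criteria)}"


# Phrases keep their double quotes, which are sent as %22.
print(field_search_url("experiments", sa="fibroblast", species='"mus musculus"'))
```

Rejecting unknown field names locally is a design choice of this sketch: a misspelt field is caught before any request is made.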

 


2.1.3 Expanding searches using the Experimental Factor Ontology

The Experimental Factor Ontology (EFO) is an application-focused ontology modelling the experimental factors in ArrayExpress. If the term expandefo=on is included in the URL, the search covers occurrences of your term as well as its synonyms and child terms in the EFO. E.g. if 'cancer' is entered, the search covers the term cancer, synonyms of cancer, and subtypes of cancer such as lymphoma.

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=heart&expandefo=on

EFO expansion is always on for some specific fields: experimental factor value, experiment type, sample attribute and species.

 


2.1.4 Construction of queries using AND, OR and NOT operators

More complex queries can be constructed using the operators AND, OR or NOT. AND is the default if no operator is specified. Again, either experiments or files can be searched for in all cases by using the 'experiments' or 'files' term in the URL.

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+AND+breast
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+breast (same as above)
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+OR+breast
http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate+NOT+breast
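
A minimal Python sketch of building these operator queries follows. The function names are hypothetical; the only behaviour taken from the text is that terms are joined by AND, OR or NOT and that spaces are sent as '+'.

```python
from urllib.parse import urlencode

OPERATORS = ("AND", "OR", "NOT")


def boolean_keywords(operator, *terms):
    """Join search terms with one of the operators AND, OR or NOT."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {OPERATORS}")
    return f" {operator} ".join(terms)


def boolean_query(operator, *terms):
    """Return the encoded 'keywords=...' query string for the terms."""
    # urlencode turns the spaces around the operator into '+'.
    return urlencode({"keywords": boolean_keywords(operator, *terms)})


print(boolean_query("OR", "prostate", "breast"))
print(boolean_query("NOT", "prostate", "breast"))
```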

 


2.1.5 Filtering to get ArrayExpress direct submission data

Searches can be limited to experiments submitted directly to ArrayExpress (this excludes data imported from the Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/)) or to only the imported data. To do this, use the following syntax:

Field | What is searched | Example
directsub | Only experiments directly submitted to ArrayExpress (directsub=true), or only experiments imported from the GEO database (directsub=false) | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?directsub=true

For more information about how we import data from GEO see the GEO data help page.

 


2.1.6 Filtering by counts

Experiments fulfilling certain count criteria can also be searched for, e.g. those having more than 10 assays (hybridizations). These searches use the following syntax:

 

Filter | What is filtered | Example
assaycount:[x TO y] | Filter on the number of assays, where x <= y and both values are between 0 and 99,999 (inclusive). To exclude the values given, use curly brackets, e.g. assaycount:{1 TO 5} will find experiments with 2-4 assays. Single numbers may also be given, e.g. assaycount:10 will find experiments with 10 assays. | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?assaycount=[1 TO 5]
efcount:[x TO y] | Filter on the number of experimental factors | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?efcount=[1 TO 5]
samplecount:[x TO y] | Filter on the number of samples | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?samplecount=[1 TO 5]
sacount:[x TO y] | Filter on the number of sample attribute categories | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?sacount=[1 TO 5]
rawcount:[x TO y] | Filter on the number of raw files | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?rawcount=[1 TO 5]
fgemcount:[x TO y] | Filter on the number of final gene expression matrix (processed data) files | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?fgemcount=[1 TO 5]
miamescore:[x TO y] | Filter on the MIAME compliance score (maximum score is 5) | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?miamescore=[1 TO 5]
date:yyyy-mm-dd | Filter by release date: date:2009-12-01 finds experiments released on 1 Dec 2009; date:2009* finds experiments released in 2009; date:[2008-01-01 2008-05-31] finds experiments released between 1 Jan and 31 May 2008 | http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?date=[2008-01-01 2008-05-31]
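
The range syntax for count filters can be sketched in Python as below. The helper names are hypothetical, and applying the 0 to 99,999 bound (stated above for assaycount) to every count field is an assumption of this sketch. Spaces and brackets are percent-encoded here, which a browser would do implicitly for the example URLs.

```python
from urllib.parse import urlencode

MAX_COUNT = 99_999  # upper bound quoted above for assaycount


def count_range(low, high, inclusive=True):
    """Format a count range: [x TO y] is inclusive, {x TO y} is exclusive."""
    if not (0 <= low <= high <= MAX_COUNT):
        raise ValueError("expected 0 <= low <= high <= 99999")
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{opening}{low} TO {high}{closing}"


def count_filter_query(field, low, high, inclusive=True):
    """Return an encoded query string such as 'assaycount=%5B1+TO+5%5D'."""
    return urlencode({field: count_range(low, high, inclusive)})


print(count_range(1, 5))                   # inclusive: 1 to 5 assays
print(count_range(1, 5, inclusive=False))  # exclusive: 2 to 4 assays
print(count_filter_query("assaycount", 1, 5))
```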

 


2.1.7 Sorting

The results of a query can be sorted on several fields in ascending or descending order using sortby=xxx and sortorder=ascending/descending. The fields that can be used for sorting are:

  • accession
  • name
  • assays
  • species
  • releasedate
  • fgem
  • raw
  • atlas

Examples

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate&sortby=accession&sortorder=ascending
http://www.ebi.ac.uk/arrayexpress/xml/v2/files?sa=heart&sortby=releasedate&sortorder=descending
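
As an illustration, the sort parameters can be appended to an existing query URL with a helper like the one below. The function name with_sorting is hypothetical; the allowed field names and orders are the ones listed above.

```python
# Sortable fields and orders as listed in this section.
SORT_FIELDS = (
    "accession", "name", "assays", "species",
    "releasedate", "fgem", "raw", "atlas",
)
SORT_ORDERS = ("ascending", "descending")


def with_sorting(url, sortby, sortorder="ascending"):
    """Append sortby and sortorder parameters to a query URL."""
    if sortby not in SORT_FIELDS:
        raise ValueError(f"sortby must be one of {SORT_FIELDS}")
    if sortorder not in SORT_ORDERS:
        raise ValueError(f"sortorder must be one of {SORT_ORDERS}")
    # Use '&' if the URL already has a query string, otherwise start one.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}sortby={sortby}&sortorder={sortorder}"


base = "http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments?keywords=prostate"
print(with_sorting(base, "accession"))
print(with_sorting(base, "releasedate", "descending"))
```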

 


2.1.8 Old-style queries

The URL format for archive searches was changed in September 2010 to allow searching of specific fields, use of AND, OR, NOT, counts of attributes and ordering of results.

The old-style URLs remain fully functional, however. The format of the pre-September 2010 URLs is described below.

 


2.1.8.1 Keyword searches

Specific experiments can be retrieved by accession or searched for by keyword, with the results returned in an XML document, using URLs with the 'experiments' term followed by either an accession or keywords=X. E.g.

http://www.ebi.ac.uk/arrayexpress/xml/experiments/E-MEXP-31
http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer

To narrow the search by using more than one keyword, separate the terms with '+', e.g.

http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer+breast

 


2.1.8.2 Species searches

To narrow the search to experiments with samples of a particular species, include the 'species' term:

http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer&species=Homo+sapiens


 

2.1.8.3 Retrieving files

To retrieve the list of files associated with a set of experiments, use the 'files' term in the URL:

http://www.ebi.ac.uk/arrayexpress/xml/files?keywords=cancer+breast

To retrieve a list of files for a particular experiment, format the query as follows:

http://www.ebi.ac.uk/arrayexpress/xml/files/E-MEXP-31

Also note:

  • Accession number and keyword searches are case insensitive.
  • Parts of words can also be searched, e.g. colo will retrieve results with matches to both colon and colorectal (and also color, colony etc.), unless the parameter wholewords=on is included in the query string, e.g.
http://www.ebi.ac.uk/arrayexpress/xml/experiments?keywords=cancer&species=Homo+sapiens&wholewords=on

 


2.1.9 Accessing private data

 

  1. Use your client to retrieve the following URL (inserting the username and password provided to you by the ArrayExpress curators):
    http://www.ebi.ac.uk/arrayexpress/verify-login.txt?u=username&p=password
    If the login details are correct, this will return a login token that is unique to your username, IP address and client. If the login fails, you will get a blank page.
  2. Set up two cookies for the domain http://www.ebi.ac.uk/:
    AeLoggedUser = username
    AeLoginToken = login token from the URL in step 1
  3. Use these cookies when making all subsequent requests to ArrayExpress. It is important to use the same client that was used in step 1.
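
The three steps above might be scripted in Python roughly as follows. Several details are assumptions rather than documented behaviour: that the token is returned as the plain-text response body, that "the same client" can be satisfied by sending the same User-Agent string on every request, and that the two cookies can be supplied in a single Cookie header. The function names and the USER_AGENT value are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode

LOGIN_URL = "http://www.ebi.ac.uk/arrayexpress/verify-login.txt"
# Hypothetical client identifier, reused for every request (assumption:
# this is what ties the token to "the same client").
USER_AGENT = "my-arrayexpress-client/0.1"


def cookie_header(username, token):
    """Step 2: format the two cookies as a single Cookie header value."""
    return f"AeLoggedUser={username}; AeLoginToken={token}"


def fetch_login_token(username, password):
    """Step 1: request a login token; an empty body means the login failed."""
    url = f"{LOGIN_URL}?{urlencode({'u': username, 'p': password})}"
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        token = response.read().decode("utf-8").strip()
    if not token:
        raise PermissionError("login failed: empty response")
    return token


def private_request(url, username, token):
    """Step 3: build a request that carries both cookies."""
    headers = {
        "User-Agent": USER_AGENT,
        "Cookie": cookie_header(username, token),
    }
    return urllib.request.Request(url, headers=headers)
```

A typical call sequence would be fetch_login_token(...) once, then private_request(...) for each query URL, passing the result to urllib.request.urlopen.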

 


2.1.10 Retrieve information about all public experiments

To retrieve an XML file with information about all public experiments, remove the keyword part of the search:

http://www.ebi.ac.uk/arrayexpress/xml/v2/experiments

 


2.1.11 Format of XML results

The XML document returned from a search states how many experiments were retrieved and then, depending on the search made, gives either information about each experiment or a list of the files associated with each experiment. In both cases, information about each experiment is in an <experiment></experiment> element.
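
A result document of this shape could be read in Python along the following lines. Only the <experiment> element is taken from the description above; the root element name, the <accession> child element and the second accession value in the sample are placeholders invented for this sketch and may not match the real output.

```python
import xml.etree.ElementTree as ET

# Made-up sample: only the <experiment> element name comes from the text;
# the root element, <accession> children and E-XXXX-1 are placeholders.
SAMPLE_XML = """<experiments>
  <experiment><accession>E-MEXP-31</accession></experiment>
  <experiment><accession>E-XXXX-1</accession></experiment>
</experiments>"""


def experiment_accessions(xml_text):
    """Return the text of the assumed <accession> child of each experiment."""
    root = ET.fromstring(xml_text)
    # iter() finds <experiment> elements at any depth below the root.
    return [exp.findtext("accession") for exp in root.iter("experiment")]


accessions = experiment_accessions(SAMPLE_XML)
print(len(accessions), "experiments:", accessions)
```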


2.1.11.1 Searches for experiments

Example

xml example - experiment


2.1.11.2 Searches for files

Example

xml example - files


2.2 JSON web services format

To retrieve the results of queries in JSON format, change the base URL from http://www.ebi.ac.uk/arrayexpress/xml/v2/ to http://www.ebi.ac.uk/arrayexpress/json/v2/. All queries described above can be carried out and will produce a JSON-format file which can be downloaded.

E.g. experiment queries

http://www.ebi.ac.uk/arrayexpress/json/v2/experiments?keywords=cancer+breast
http://www.ebi.ac.uk/arrayexpress/json/v2/experiments/E-MEXP-31

E.g. file queries

http://www.ebi.ac.uk/arrayexpress/json/v2/files?keywords=cancer+breast
http://www.ebi.ac.uk/arrayexpress/json/v2/files/E-MEXP-31

There is one extra parameter, 'jsonp', which enables JSONP: the JSON output is prepended with the value of the jsonp parameter and wrapped in parentheses.

E.g.

http://www.ebi.ac.uk/arrayexpress/json/v2/experiments/E-MEXP-31?jsonp=experiment
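
For a client that receives such a JSONP response outside a browser, the wrapper can be removed before parsing. The sketch below assumes the response has exactly the form callback(...), optionally followed by a semicolon; the function name unwrap_jsonp and the sample payload are hypothetical.

```python
import json


def unwrap_jsonp(text, callback):
    """Strip a 'callback(...)' wrapper and parse the JSON inside it."""
    text = text.strip()
    # Tolerate an optional trailing semicolon (an assumption of this sketch).
    if text.endswith(";"):
        text = text[:-1].rstrip()
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("response is not wrapped in the expected callback")
    return json.loads(text[len(prefix):-1])


# Made-up payload, wrapped as it would be for jsonp=experiment.
sample = 'experiment({"accession": "E-MEXP-31"})'
print(unwrap_jsonp(sample, "experiment"))
```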

 

 

3. Programmatic access of the Atlas of Gene Expression

For information about programmatic access of the Atlas of Gene Expression see the Atlas API help page.


Any further questions, please see our FAQ.
