Accessing ENA data programmatically
Programmatic access using ENA browser REST URLs
ENA browser REST URLs give access to many types of data in several formats. Single or multiple identifiers (including identifier ranges) can be used to retrieve up to 100,000 records at a time, either gzip-compressed or uncompressed. It is also possible to request specific taxonomy records, or to retrieve archived versions of the data.
Here are some examples of what can be achieved through ENA browser REST URLs:
- Retrieve EMBL-Bank records in XML or flat file formats;
- Retrieve EMBL-Bank records using sequence versions;
- Retrieve EMBL-Bank graphical images;
- Retrieve Taxon records in XML or Darwin Core XML formats;
- Retrieve a list of SRA submitted files or FASTQ files;
- Retrieve SRA metadata in XML format;
- Retrieve Trace sequences in FASTA or FASTQ formats;
- Retrieve Trace metadata in XML format.
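As a rough illustration of the retrievals listed above, the following Python sketch builds a REST URL for one or more identifiers in a chosen format. The base URL and the `/<format>/<identifiers>` path layout are assumptions that are not stated in this text, and the function names `ena_record_url` and `fetch` are hypothetical helpers; check the current ENA documentation for the exact URL syntax before relying on them.

```python
from urllib.parse import quote
from urllib.request import urlopen

# ASSUMPTION: base URL and path layout of the ENA browser REST service.
# Neither is given in the text above; verify against the ENA documentation.
ENA_BASE = "https://www.ebi.ac.uk/ena/browser/api"


def ena_record_url(accessions, fmt="xml", base=ENA_BASE):
    """Build a retrieval URL for one or more identifiers (hypothetical helper).

    `accessions` may be a single string or a list of strings. A range such as
    "A00145-A00148" is passed through unchanged, on the assumption that the
    service interprets it.
    """
    if isinstance(accessions, str):
        accessions = [accessions]
    if not accessions:
        raise ValueError("at least one accession is required")
    # Percent-encode each identifier, then join them with commas.
    ids = ",".join(quote(acc) for acc in accessions)
    return f"{base}/{fmt}/{ids}"


def fetch(url, timeout=30):
    """Download the record text from a URL (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Illustrative identifiers only; this prints URLs without contacting the server.
    print(ena_record_url("A00145.1", fmt="embl"))
    print(ena_record_url(["A00145", "A00146"], fmt="xml"))
```

Keeping URL construction separate from the download step makes it possible to inspect or log the request before anything is sent.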
Programmatic access using Dbfetch
Dbfetch provides an easy way to retrieve data from multiple databases, including ENA, in a consistent manner (Figure 50). Dbfetch can be used from any web browser, as well as from scripts that call web-aware tools such as wget or lynx.
Figure 50. EBI Dbfetch tool showing the range of databases and data formats available.
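Because Dbfetch is driven by ordinary URLs, a request can also be assembled in a script. The Python sketch below is a minimal illustration under stated assumptions: the endpoint URL, the query parameters (`db`, `id`, `format`, `style`), the database name `ena_sequence` and the accession `J00231` are not given in this text and are used only as examples, and `dbfetch_url` is a hypothetical helper. Confirm the available database and format names with the Dbfetch tool itself.

```python
from urllib.parse import urlencode

# ASSUMPTION: Dbfetch endpoint; not stated in the text above.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, ids, fmt="fasta", style="raw", base=DBFETCH_BASE):
    """Build a Dbfetch request URL (hypothetical helper).

    `db` is a database name, `ids` a single identifier or a list of them,
    `fmt` the data format and `style` the result style. The parameter names
    are assumptions about the Dbfetch URL syntax.
    """
    if isinstance(ids, str):
        ids = [ids]
    if not ids:
        raise ValueError("at least one identifier is required")
    params = {"db": db, "id": ",".join(ids), "format": fmt, "style": style}
    # Keep commas literal so that identifier lists stay readable in the URL.
    return f"{base}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Illustrative values only; the printed URL can be passed to wget or a browser.
    print(dbfetch_url("ena_sequence", "J00231"))
```

The same helper can be reused for other databases by changing the `db` argument, which reflects the consistent interface described above.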