Taxonomy formats

ENA supports the following taxonomy formats: ENA Taxonomy XML and Darwin Core XML/Archive. For example, the following URLs return the taxonomy information for Eukaryota in ENA Taxonomy and Darwin Core XML formats from the ENA Browser:

The Darwin Core Archive is available for download.

About GBIF and the Darwin Core XML/Archive

The Global Biodiversity Information Facility (GBIF) aims to make the world’s biodiversity data freely and universally available via the Internet, and to provide an essential global informatics infrastructure for biodiversity research and applications worldwide.

The Darwin Core standard has been used to mobilise the vast majority of specimen occurrence and observational records within the GBIF network. It was originally conceived to facilitate the discovery, retrieval, and integration of information about modern biological specimens, their spatio-temporal occurrence, and their supporting evidence housed in collections (physical or digital). The Darwin Core achieved this by defining a set of items in an ordered list, published in an XML document.

The preferred format for publishing data to the GBIF network is the Darwin Core Archive (DwC-A), which is essentially a set of text files with a simple descriptor that tells others how the files are organized. The central idea of this archive is that its data files are logically arranged with one core data file surrounded by any number of "extensions". Each extension record points to a record in the core file; in this way, many extension records can exist for each single core record. Because a Darwin Core Archive shares an entire dataset at once, rather than through pageable web services, data transfer is much simpler and more efficient.
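The core-and-extension arrangement described above can be sketched in a few lines of Python. This is a minimal illustration, not ENA's or GBIF's implementation: the column names (`taxonID`, `scientificName`, `vernacularName`), the sample rows, and the helper functions `read_tsv` and `attach_extensions` are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical core file: one row per taxon, keyed by the first column.
CORE_TSV = "taxonID\tscientificName\n2759\tEukaryota\n9606\tHomo sapiens\n"
# Hypothetical extension file: the first column points back to a core taxonID.
EXT_TSV = "taxonID\tvernacularName\n9606\thuman\n9606\tman\n2759\teucaryotes\n"


def read_tsv(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_extensions(core_rows, ext_rows, key="taxonID"):
    """Group extension rows under the core row they point to."""
    by_core = defaultdict(list)
    for row in ext_rows:
        by_core[row[key]].append(row)
    return {
        row[key]: {"core": row, "extensions": by_core.get(row[key], [])}
        for row in core_rows
    }


records = attach_extensions(read_tsv(CORE_TSV), read_tsv(EXT_TSV))
```

Each core record appears exactly once, while any number of extension rows can hang off it, which is the one-to-many relationship the archive format relies on.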

The ENA has mapped its taxonomy into Darwin Core XML and Archive formats. The Darwin Core XML format is supported in the ENA Browser, and the Darwin Core Archive is made available for download.

The Darwin Core Archive comprises three files: a tab-delimited data file, an XML file listing the descriptors used in the data file, and a second XML file holding metadata about the data itself, the data supplier, and the person who created the archive.
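As a rough sketch of how the three files fit together, the Python below builds a tiny archive in memory and uses the descriptor file to label the columns of the data file. The file names (`meta.xml`, `taxa.txt`, `eml.xml`), the sample row, the simplified metadata content, and the helpers `build_demo_archive` and `read_core` are assumptions for illustration; the actual ENA archive may name and lay out its files differently. The data file is assumed to be tab-delimited, as stated above.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by Darwin Core Archive descriptor files.
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical descriptor: says which file holds the data and what each column means.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon" ignoreHeaderLines="1">
    <files><location>taxa.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </core>
</archive>"""


def build_demo_archive():
    """Create an in-memory zip holding the three (hypothetical) files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr("taxa.txt", "taxonID\tscientificName\n2759\tEukaryota\n")
        # Placeholder metadata; a real metadata file carries far more detail.
        zf.writestr("eml.xml", "<eml><dataset><title>Demo taxonomy</title></dataset></eml>")
    buf.seek(0)
    return buf


def read_core(archive_file):
    """Read the core data file, naming columns from the descriptor."""
    with zipfile.ZipFile(archive_file) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find("dwc:core", NS)
        data_name = core.find("dwc:files/dwc:location", NS).text
        # Map column index -> short name (last path segment of the term URI).
        columns = {int(core.find("dwc:id", NS).get("index")): "id"}
        for field in core.findall("dwc:field", NS):
            columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]
        skip = int(core.get("ignoreHeaderLines", "0"))
        text = zf.read(data_name).decode("utf-8")
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))[skip:]
    return [{columns[i]: v for i, v in enumerate(row) if i in columns} for row in rows]


rows = read_core(build_demo_archive())
```

Reading the column meanings from the descriptor, rather than hard-coding them, is what lets a consumer handle archives from different suppliers with the same code.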

In the future, ENA may also develop a specific molecular extension which could be used by researchers interested in sharing molecular data.
