Taxonomy

NCBI Taxonomy is the classification system for the source organisms of all INSDC records and is available from the ENA browser. ENA curators work alongside taxonomists at the NCBI to ensure that all ENA records display the accepted organism name and classification hierarchy. NCBI Taxonomy is an incomplete classification system in that it only covers taxa that are represented in INSDC records. Users should note that a taxon is displayed only if at least one associated ENA record is available.

Taxon consults

As part of the curation of incoming data, ENA curators check the source organisms. Many organisms will already be classified and need no further action. Curators check spellings and alternative names, including but not limited to acronyms (viruses), synonyms, teleomorphs, anamorphs and misnomers, and revise these to the accepted scientific name from the NCBI Taxonomy. Organisms that are published but unclassified require a simple taxonomy referral: a standard letter is sent to the NCBI Taxonomy team with details of the organism name and source metadata. The taxonomists then carry out the research needed to add the new taxon and apply the correct classification, which may include adding further taxa at higher ranks.
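The name check described above can be pictured as a three-way decision: the submitted name is already accepted, it is a known alternative that can be revised, or it is unknown and needs a referral. The following is a minimal sketch of that decision only, not ENA's actual tooling. The function `resolve_name`, the two lookup tables and their contents are hypothetical and purely illustrative; a real check would consult NCBI Taxonomy itself.

```python
# Illustrative data only: a tiny stand-in for NCBI Taxonomy (assumption).
ACCEPTED_NAMES = {
    "Escherichia coli",
    "Aspergillus nidulans",
    "Human immunodeficiency virus 1",
}

# Hypothetical table of alternative names (acronym, teleomorph name).
ALTERNATIVE_NAMES = {
    "hiv-1": "Human immunodeficiency virus 1",
    "emericella nidulans": "Aspergillus nidulans",
}


def resolve_name(submitted):
    """Return (name, action) for a submitted organism name.

    action is "accepted" if the name is already the accepted one,
    "revised" if a known alternative was mapped to the accepted name,
    and "referral" if the name is unknown and needs a taxonomy referral.
    """
    # Collapse repeated whitespace and compare case-insensitively.
    key = " ".join(submitted.split()).lower()
    for name in ACCEPTED_NAMES:
        if name.lower() == key:
            return name, "accepted"
    if key in ALTERNATIVE_NAMES:
        return ALTERNATIVE_NAMES[key], "revised"
    return None, "referral"


for candidate in ["escherichia  coli", "HIV-1", "Novelbacter exampli"]:
    print(candidate, "->", resolve_name(candidate))
```

Only the third case leads to a letter to the NCBI Taxonomy team; the first two are resolved by the curator directly.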

A novel unpublished organism or one that has not been fully identified requires the creation of an informal name. This contains a placeholder element that is used until a formalised name is accepted by the appropriate nomenclature code. For prokaryotes, this is generally a strain or isolate name but can be any identifier that makes the organism unique in the database. For example, Escherichia sp. ES13 (Taxon:640049) is an informal name for an Escherichia coli strain that has not been identified or published at species level.
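As a rough illustration of the naming pattern in the example above, the sketch below joins a genus, the placeholder "sp." and a unique identifier such as a strain or isolate name. The helper `informal_name` is a hypothetical name introduced here, and the pattern is inferred from the single example given; real informal names are assigned by curators and may take other forms.

```python
def informal_name(genus, identifier):
    """Build an informal name of the form '<Genus> sp. <identifier>'.

    Hypothetical helper: the identifier is typically a strain or isolate
    name, but can be anything that makes the organism unique.
    """
    genus = genus.strip()
    identifier = identifier.strip()
    if not genus or not identifier:
        raise ValueError("both a genus and a unique identifier are required")
    return f"{genus} sp. {identifier}"


print(informal_name("Escherichia", "ES13"))  # prints: Escherichia sp. ES13
```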

ENA curators will communicate at length with the submitter in order to obtain all the information required to correctly classify the source organism of the sequence entry. Curators will also liaise between NCBI Taxonomy and the submitters on nomenclatural changes or updates, such as when an informal name is published. ENA records are only loaded into the archive once the organism data records are up to date in NCBI Taxonomy.
