Reporting standards

Harmonizing data and metadata collection becomes essential in an age when generating data is often easier and more affordable than organizing and storing them.

Compliance of submitted data with the relevant reporting standards promotes:

  • consistent and adequate data description
  • thorough data validation
  • data discoverability
  • data reproducibility
  • data interoperability and usability
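As an illustration of how a reporting standard supports data validation, the sketch below checks a sample metadata record against a small list of mandatory fields. It is a minimal example under stated assumptions: the field names and the `missing_fields` helper are hypothetical and do not reproduce any official ENA or community checklist.

```python
# Hypothetical checklist: these field names are illustrative only and are
# not taken from an official ENA, INSDC, or MIxS definition.
REQUIRED_FIELDS = ("collection_date", "geographic_location", "investigation_type")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for field in required:
        value = record.get(field)
        # Treat a missing key, None, or whitespace-only text as "not reported".
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example record with one blank field and one field left out entirely.
sample = {
    "collection_date": "2015-06-21",
    "geographic_location": "",
}

print(missing_fields(sample))
```

A record that returns an empty list satisfies this toy checklist. A real checklist would also constrain value formats and controlled vocabularies, which this sketch does not attempt.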

ENA/INSDC reporting standards

The European Nucleotide Archive requires, where appropriate, use of the following reporting standards:

  • Feature Table – Description of nucleotide sequence provenance and functional annotation of nucleotide sequence domains
  • Third Party Data – Guidelines for submission of assembly and/or annotation of existing INSDC reads and primary sequences by a third party
  • Genome Assembly – Guidelines for submission of genome assemblies

Community-developed reporting standards

The European Nucleotide Archive supports use of the following community-developed reporting standards:

  • BARCODE – Minimum information about a species BARCODE sequence
  • GMI:MDM – Minimal Data for Mapping in relation to the Global Microbial Identifier pathogen tracking initiative
  • Micro B3 – Minimum information about marine microbial sampling
  • MINSEQE – Minimum Information about a high-throughput Nucleotide SeQuencing Experiment
  • MIxS – Minimum Information about any (x) Sequence
  • Influenza/COMPARE – Minimum Information for reporting of Influenza virus samples

Specialised databases

European Nucleotide Archive submitters may also wish to submit to the following specialised databases after acquiring an INSDC accession number from ENA:

  • IPD-IMGT/HLA – For Human Leukocyte Antigen sequences, overseen by the WHO HLA Nomenclature Committee
  • IPD-MHC – For non-Human Major Histocompatibility Complex sequences, overseen by the Comparative MHC Nomenclature Committee
  • IPD-KIR – For Human Killer-cell Immunoglobulin-like Receptor sequences
