Submitting environmental sequences

This section covers the submission of data derived from sequencing DNA or RNA from organisms that have not been isolated. Such studies typically focus on entire microbial communities, size-fractionated organisms, enrichment cultures and flow-sorted microbes. Both amplicon-based (metabarcoding, diversity survey) and shotgun-based (metagenomics/metatranscriptomics) methods are included here.

At a minimum, ENA aims to capture and present both raw data and identifications (taxonomic or functional) from environmental sequencing approaches. Intermediate information, such as cluster information and mappings between raw reads and identifications, can also be captured.

Amplicon-based

Two different Webin submission systems are currently needed to submit this type of data. The first is used for registering samples and submitting raw sequencing reads; the second is used to submit the assembled and annotated sequences.

Submitting reads and samples

The read domain Webin submission tool (see here) should be used to submit both raw sequencing reads and samples. Only a minimal sample description is needed at this step, because more detailed MIMARKS-compliant information is provided in the next step, when the annotated sequences are submitted. Take note of each sample accession, as it will be needed during the sequence submission.

Submitting annotated sequences

Annotated marker sequences should be submitted using the sequence Webin submission tool:

  1. Register samples (see above)
  2. Log in or register
  3. Create a new sequence submission
  4. Select MIMARKS-Survey 16S rRNA sequences
  5. Provide release date
  6. Provide citation details
  7. Choose appropriate fields
  8. Fill in the selected fields for each sequence being submitted, either manually or by uploading a pre-prepared spreadsheet or a FASTA file
  9. Provide the sample accessions in the Sample Accession field
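
The spreadsheet route in step 8 lends itself to a quick local check before upload. The sketch below is not part of Webin. It assumes a tab-separated file with the hypothetical column names `sample_accession` and `sequence`, and it assumes that sample accessions look like `ERS` followed by digits or `SAMEA` followed by digits. It reports rows where the accession is missing or malformed, or where the sequence contains non-nucleotide characters.

```python
import csv
import io
import re

# Assumed accession shapes (e.g. ERS000001 or SAMEA1234567); adjust to match
# the accessions that were actually returned when the samples were registered.
SAMPLE_ACCESSION = re.compile(r"^(ERS\d{6,}|SAMEA\d+)$")
# IUPAC nucleotide codes; an assumption made for this sketch.
NUCLEOTIDES = re.compile(r"^[ACGTUNRYSWKMBDHV]+$", re.IGNORECASE)


def check_rows(handle, accession_col="sample_accession", sequence_col="sequence"):
    """Return a list of (row_number, problem) tuples for a tab-separated sheet."""
    problems = []
    reader = csv.DictReader(handle, delimiter="\t")
    # Data rows start at 2 because row 1 holds the column headers.
    for row_number, row in enumerate(reader, start=2):
        accession = (row.get(accession_col) or "").strip()
        sequence = (row.get(sequence_col) or "").strip()
        if not SAMPLE_ACCESSION.match(accession):
            problems.append(
                (row_number, f"bad or missing sample accession: {accession!r}")
            )
        if not NUCLEOTIDES.match(sequence):
            problems.append((row_number, "empty or non-nucleotide sequence"))
    return problems


if __name__ == "__main__":
    # Small in-memory example standing in for a real spreadsheet export.
    example = io.StringIO(
        "sample_accession\tsequence\n"
        "ERS000001\tACGTACGT\n"
        "\tACGTNNNN\n"
        "ERS000002\tACGT-XX\n"
    )
    for row_number, problem in check_rows(example):
        print(f"row {row_number}: {problem}")
```

For a real file, the same function can be called with `open("sequences.tsv", newline="")` in place of the in-memory example; the file name is a placeholder.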

After submission, please wait to be contacted by curators, who will send either queries or a list of assigned sequence accession numbers with instructions on how to access and cite the submitted data.

Shotgun-based

The ENA archives next-generation sequencing data from metagenomics and metatranscriptomics studies in the read domain. Comprehensive instructions on read data submission can be found here. These guidelines cover both programmatic and interactive submissions.

For all studies, but especially metagenomics and metatranscriptomics studies, it is essential that the read data files and the study and sequencing experiment descriptions are accompanied by minimal sample metadata, so that the data originating from the study are reproducible, meaningful and can be adequately analysed.

Minimal reporting requirements for samples from metagenomics and metatranscriptomics studies are addressed in the MIxS data standard described here.
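
One way to catch incomplete sample descriptions before submission is a local completeness check. The sketch below is illustrative only: the field names in `REQUIRED_FIELDS` are a hypothetical subset chosen for the example, not the authoritative MIxS list, and the `missing_fields` helper and the sample aliases are not part of any ENA tool.

```python
# Hypothetical subset of required sample attributes; consult the MIxS
# checklist selected for the study for the real list.
REQUIRED_FIELDS = (
    "collection date",
    "geographic location (country and/or sea)",
    "environment (biome)",
)


def missing_fields(sample, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a sample dict."""
    gaps = []
    for field in required:
        value = sample.get(field)
        if value is None or not str(value).strip():
            gaps.append(field)
    return gaps


# Made-up samples keyed by a local alias, used only to exercise the check.
samples = {
    "soil_core_1": {
        "collection date": "2015-06-20",
        "geographic location (country and/or sea)": "United Kingdom",
        "environment (biome)": "temperate grassland",
    },
    "soil_core_2": {
        "collection date": "2015-06-21",
        "environment (biome)": " ",
    },
}

for alias, attributes in samples.items():
    gaps = missing_fields(attributes)
    status = "complete" if not gaps else "missing: " + ", ".join(gaps)
    print(f"{alias}: {status}")
```

Whitespace-only values are treated as missing here, which is a design choice for the sketch rather than a rule taken from the standard.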

Metagenomics and metatranscriptomics studies containing adequate sample descriptions are prioritised for analysis by the EBI Metagenomics service.
