Submitting environmental sequences

This section covers the submission of data derived from sequencing DNA or RNA from organisms that have not been isolated. Such studies typically focus on entire microbial communities, size-fractionated organisms, enrichment cultures and flow-sorted microbes. Both amplicon-based (metabarcoding, diversity survey) and shotgun-based (metagenomics/metatranscriptomics) methods are included here.

At a minimum, ENA aims to capture and present both raw data and identifications (taxonomic or functional) from environmental sequencing approaches. Intermediate information, such as cluster information and mappings between raw reads and identifications, can also be captured.

Amplicon-based

Two different Webin submission systems are currently needed to submit this type of data. The first is used for registering samples and submitting raw sequencing reads; the second is used to submit the assembled and annotated sequences.

Submitting reads and samples

The read domain Webin submission tool (see here) should be used to submit both raw sequencing reads and samples. Only a minimal sample description is needed at this step, because more detailed MIMARKS-compliant information is provided in the next step, when the annotated sequences are submitted. Take note of the sample accessions, as these will be needed during the sequence submission.

Submitting annotated sequences

Annotated marker sequences should be submitted using the sequence Webin submission tool:

  1. Register samples (see above)
  2. Log in or register
  3. Create a new sequence submission
  4. Select MIMARKS-Survey 16S rRNA sequences
  5. Provide the release date
  6. Provide citation details
  7. Choose the appropriate fields
  8. Fill in the selected fields for each sequence being submitted, either manually or by uploading a pre-prepared spreadsheet or a FASTA file
  9. Provide the sample accessions in the Sample Accession field
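
Preparing the spreadsheet mentioned in step 8 can be scripted. The following is a minimal sketch, not the format Webin requires: it reads sequences from FASTA text, pairs each one with a sample accession and writes tab-separated rows. The column names (`sequence_id`, `sample_accession`, `sequence`), the function names and the accession values are all assumptions made for illustration; the actual columns depend on the fields chosen in step 7.

```python
import csv
import io

# Hypothetical column names; the real ones depend on the fields chosen in Webin.
COLUMNS = ["sequence_id", "sample_accession", "sequence"]


def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the identifier.
            parts = line[1:].split()
            header, chunks = (parts[0] if parts else ""), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_rows(fasta_text, sample_by_id):
    """Pair every sequence with its sample accession, failing on gaps."""
    rows = []
    for seq_id, sequence in parse_fasta(fasta_text):
        if seq_id not in sample_by_id:
            raise ValueError(f"no sample accession for sequence {seq_id!r}")
        rows.append({
            "sequence_id": seq_id,
            "sample_accession": sample_by_id[seq_id],
            "sequence": sequence,
        })
    return rows


def rows_to_tsv(rows):
    """Serialise the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    fasta = ">otu1 example\nACGT\nTTGA\n>otu2\nGGCC\n"
    # Placeholder values, not real accessions.
    accessions = {"otu1": "SAMPLE_ACCESSION_1", "otu2": "SAMPLE_ACCESSION_2"}
    print(rows_to_tsv(build_rows(fasta, accessions)), end="")
```

Raising an error when a sequence has no sample accession catches the omission before upload rather than after a curator query.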

After submission, please wait for curators to contact you, either with queries or with a list of assigned sequence accession numbers and instructions on how to access and cite the submitted data.

Shotgun-based

The ENA archives next-generation sequencing data from metagenomics and metatranscriptomics studies in the read domain. Comprehensive instructions on read data submission can be found here. These guidelines cover both programmatic and interactive submissions.

For all studies, but especially metagenomics and metatranscriptomics studies, it is essential that the read data files and the study and sequencing experiment descriptions are accompanied by minimal sample metadata. This ensures that the data originating from the study are reproducible and meaningful and can be adequately analysed.

Minimal reporting requirements for samples from metagenomics and metatranscriptomics studies are addressed in the MIxS data standard described here.

Metagenomics and metatranscriptomics studies containing adequate sample descriptions are prioritised for analysis by the EBI Metagenomics service.
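
Before submitting, it can help to check locally that every sample record carries the metadata you intend to provide. The sketch below is an assumption-laden illustration, not an implementation of the MIxS standard: the field names in `REQUIRED_FIELDS` and the helper names are chosen for the example only, and the actual mandatory fields should be taken from the standard and the checklist selected in Webin.

```python
# Illustrative field names only; take the real list from the MIxS standard
# and the checklist selected for the submission.
REQUIRED_FIELDS = ("collection_date", "geo_loc_name", "lat_lon")


def missing_fields(sample, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in one record."""
    missing = []
    for name in required:
        value = sample.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def incomplete_samples(samples):
    """Map each sample alias to its missing fields, skipping complete ones."""
    problems = {}
    for alias, record in samples.items():
        missing = missing_fields(record)
        if missing:
            problems[alias] = missing
    return problems


if __name__ == "__main__":
    # Hypothetical records keyed by a submitter-chosen alias.
    samples = {
        "soil_core_1": {"collection_date": "2014-05-20",
                        "geo_loc_name": "Example Country",
                        "lat_lon": "0.0 N 0.0 E"},
        "soil_core_2": {"collection_date": "2014-05-21", "lat_lon": " "},
    }
    for alias, missing in incomplete_samples(samples).items():
        print(f"{alias}: missing {', '.join(missing)}")
```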
