|1 Jul 2014||
ENA release 120
Release 120 of assembled/annotated sequences from the European Nucleotide Archive (ENA) is now available on the EBI public ftp server at ftp://ftp.ebi.ac.uk/pub/databases/embl/release/ and also at ftp://ftp.ebi.ac.uk/pub/databases/ena/sequence/release/.
It contains 441,582,575 sequences comprising 883,726,016,221 nucleotides. You can see the full release notes at: http://bit.ly/LKFtrE.
See http://www.ebi.ac.uk/ena/about/news/change-ena-release-ftp-location-september-2014 for full details of the change to the release FTP location.
|23 May 2014||
Change to date format for advanced search
From Monday 16th June 2014, the date format supported by ENA's advanced search will change to ISO format, as shown below. Note that ISO date ranges and times will not be supported for searching, but date ranges will be given where available in the tabulated reports.
Single date format (for search and report): YYYY-MM-DD
Example: collection_date < 2014-04-01
Date range format (for report only): YYYY-MM-DD/YYYY-MM-DD
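The two formats above can be checked before a query is sent. The following is a minimal sketch only: the function names are hypothetical and are not part of any ENA client, and the condition string simply mirrors the example shown above.

```python
import re
from datetime import date, datetime
from typing import Tuple

# Strict YYYY-MM-DD shape; strptime alone would also accept "2014-4-1".
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_single_date(text: str) -> date:
    """Parse a single ISO date (YYYY-MM-DD); raise ValueError otherwise."""
    if not ISO_DATE.fullmatch(text):
        raise ValueError(f"not a YYYY-MM-DD date: {text!r}")
    return datetime.strptime(text, "%Y-%m-%d").date()


def parse_report_date(text: str) -> Tuple[date, date]:
    """Parse a report value: a single date or a YYYY-MM-DD/YYYY-MM-DD range."""
    start, sep, end = text.partition("/")
    first = parse_single_date(start)
    last = parse_single_date(end) if sep else first
    if last < first:
        raise ValueError(f"range ends before it starts: {text!r}")
    return first, last


def date_condition(field: str, operator: str, value: str) -> str:
    """Build a search condition; ranges are rejected because search accepts single dates only."""
    parse_single_date(value)
    return f"{field} {operator} {value}"


print(date_condition("collection_date", "<", "2014-04-01"))
```

A range such as 2014-04-01/2014-04-30 is accepted by parse_report_date but rejected by date_condition, matching the note that ranges appear in reports only.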
|20 May 2014||
Update to the ENA SAMPLE checklist
From 10th of June 2014 the ENA SAMPLE checklist XML will be updated and the older version will be deprecated. The update will:
(a) support the version 4.0 of the Genomic Standards Consortium standard on Minimum Information about any (x) Sequence (GSC MIxS),
(b) activate the 'GSC MIxS miscellaneous natural or artificial environment' checklist designed for description of molecular samples where
(c) introduce the 'ENA Tara Oceans' checklist designed for description of molecular samples acquired during the Tara Oceans Expedition.
(d) update the 'ENA Micro B3' checklist designed for description of molecular samples acquired during the Ocean Sampling Day (OSD) Campaign.
(e) incorporate recent upgrade to the 'ENA GMI report' checklist designed for description of pathogen samples for the
|13 May 2014||
MD5 sequence checksums in Coding flat files
MD5 sequence checksums are planned to be included in Coding flat files on 11th of June.
The MD5 checksum will be available on the DR line, for example:
DR MD5; 5830b6060dc4ec4602fc6e5629505072.
Please contact firstname.lastname@example.org if you have any questions or concerns about this change.
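As an illustration of how the DR line might be consumed, here is a minimal sketch that extracts the checksum and compares it with one computed locally. The helper names are hypothetical, and the normalisation applied before hashing (whitespace removed, upper case) is an assumption: the announcement does not say how the sequence is prepared for the checksum.

```python
import hashlib
import re
from typing import Optional

# Matches lines such as "DR MD5; 5830b6060dc4ec4602fc6e5629505072."
DR_MD5 = re.compile(r"^DR\s+MD5;\s*([0-9a-f]{32})\.?\s*$")


def md5_from_dr_line(line: str) -> Optional[str]:
    """Return the checksum from a DR MD5 line, or None for any other line."""
    match = DR_MD5.match(line)
    return match.group(1) if match else None


def sequence_md5(sequence: str) -> str:
    """MD5 of a sequence string.

    ASSUMPTION: whitespace is stripped and the sequence is upper-cased
    before hashing; check this against the official documentation.
    """
    normalised = "".join(sequence.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()


def checksum_matches(dr_line: str, sequence: str) -> bool:
    """True if the DR line carries a checksum equal to the local one."""
    expected = md5_from_dr_line(dr_line)
    return expected is not None and expected == sequence_md5(sequence)


print(md5_from_dr_line("DR MD5; 5830b6060dc4ec4602fc6e5629505072."))
```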
|27 Mar 2014||
Retirement of legacy file report URL in July 2014
From 1st July 2014, we will be retiring the following report URLs
In their place, you should use the new URL outlined here.
This new method of fetching file information provides flexibility as to which data are contained within the report (you decide which data you want to be returned). The new file report URL will generate reports faster than the older URLs and is also expected to be more reliable. The format of the report output, described here, will also undergo some changes at this point.
|18 Mar 2014||
Change of ENA release FTP location in September 2014
Starting from March 2014 (release 119) onwards the release of assembled/annotated sequences from the European Nucleotide Archive (ENA) is available both in a new FTP location:
as well as in the old FTP location:
The directory structure in the new FTP location is slightly different from the old FTP location.
Starting from September 2014 (release 121) onwards the old FTP location will be symlinked to the new FTP location. Consequently, in September 2014 the directory structure in the old FTP location will change to match the one in the new FTP location.
In the new FTP location, documents and auxiliary files are made available in a separate sub-directory:
The CON (constructed) and STD (standard) sequences will also be in separate sub-directories:
If you have any questions or concerns about the change please do not hesitate to contact email@example.com.
|17 Mar 2014||
ENA release 119
Release 119 of assembled/annotated sequences from the European Nucleotide Archive (ENA) is now available on the EBI public ftp server at
It contains 393,460,058 sequences comprising 783,467,257,469 nucleotides. You can see the full release notes at: http://bit.ly/LKFtrE.
ENA captures, preserves and presents the world's nucleotide sequence data. New content is included in ENA on a continuous basis and is distributed daily from our browser and RESTful service. The ENA assembled/annotated sequence release provides a quarterly snapshot of content in this important subset of ENA content.
|19 Feb 2014||
NGS data CRAMming, Archiving & Exploring course on April 2nd 2014
NGS data CRAMming, Archiving & Exploring course at EMBL-EBI, Cambridge on 2nd of April 2014. http://www.ebi.ac.uk/training/course/ENA_April2014 Learn about data standards, the ENA data model, compression of NGS data, large-scale data management, submission tools, data retrieval, APIs and more... Please pass this invitation on if appropriate. To avoid disappointment, please book your place as early as possible.
|14 Feb 2014||
The CRAM specification has been updated to version 2.1 to allow for an "end of file" marker. The change is expected to be compatible with existing tools.
|4 Feb 2014||
Change to coding results in ENA Browser
On Thursday 6th February (between 10am and 11am), the advanced search is being updated to replace the current single coding result with two new results: one for release and one for update. This will impact searches performed on the coding and marker domains, as well as the results listed in the Taxonomy and Project portals.
For programmatic users, note that the current coding result ID (sequence_coding) will no longer be supported and in its place you should use the coding_release and/or coding_update result IDs.
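Scripts that store result IDs can be updated mechanically. The sketch below is illustrative only: the function name is hypothetical, and it replaces the retired ID with both new IDs, whereas a given script may need only one of them.

```python
from typing import Iterable, List

DEPRECATED_RESULT = "sequence_coding"
REPLACEMENT_RESULTS = ("coding_release", "coding_update")


def migrate_result_ids(result_ids: Iterable[str]) -> List[str]:
    """Swap the retired coding result ID for the two new ones, keeping order and dropping duplicates."""
    migrated: List[str] = []
    for result_id in result_ids:
        if result_id == DEPRECATED_RESULT:
            candidates = REPLACEMENT_RESULTS
        else:
            candidates = (result_id,)
        for candidate in candidates:
            if candidate not in migrated:
                migrated.append(candidate)
    return migrated


print(migrate_result_ids(["sequence_coding"]))
```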
|29 Jan 2014||
From 29 January 2pm onwards, submitters may log in to all of ENA's submission systems (Webin) using the same user name and password.
We recommend using the Webin submission account name (Webin-<number>) as the user name. However, an e-mail address or an era-drop-<number> FTP account name is also supported.
All new submitters will be instructed to use our new FTP server at webin.ebi.ac.uk (port 8021) to upload files.
Users of the Webin Data Upload application will be transparently directed to this new FTP service.
We will also continue to support file uploads using the existing era-drop-<number> accounts at ftp.sra.ebi.ac.uk.
Programmatic submitters may choose between the existing authentication method:
auth=ERA era-drop-<number> <password digest>
and a new method:
The latter will allow authentication using the Webin-<number> account name or an e-mail address associated with the submission account.
Please note that password reset requests made through the interactive Webin application will not affect the password required to authenticate as the era-drop-<number> FTP user through the programmatic interface (auth=ERA) or through ftp.sra.ebi.ac.uk.
If you have any questions or experience any difficulties after the maintenance please contact us at firstname.lastname@example.org.
|10 Jan 2014||
Change in CDS FTP products
CDS flat file and fasta FTP products are being replaced by new products. The old products will be deprecated by the end of March 2014.
We have introduced new coding flat file and fasta products in:
|17 Dec 2013||
ENA release 118
Release 118 of assembled/annotated sequences from the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) is now available on the EBI public ftp server at ftp://ftp.ebi.ac.uk/pub/databases/embl/release/. It contains 332,944,272 sequences comprising 714,448,074,322 nucleotides. You can see the full release notes at: http://bit.ly/LKFtrE.
ENA captures, preserves and presents the world's nucleotide sequence data. New content is included in ENA on a continuous basis and is distributed daily from our browser and RESTful service. The ENA assembled/annotated sequence release provides a quarterly snapshot of content in this important subset of ENA content.
|29 Oct 2013||
First bulk CRAM submission to ENA
The first large-scale read data set in the CRAM compressed format has been submitted, undergone processing and been made public at ENA. Using CRAM in lossless mode, the submission represents a pre-publication data release from the Wellcome Trust Sanger Institute and comprises around 4,000 run records covering a number of pathogen species. Data are available for download in both CRAM and FASTQ formats. The data set in CRAM format consumes 80% of the disk space or network bandwidth for download required for its gzipped FASTQ equivalent.
An example of a study in this data set can be seen here.
|12 Sep 2013||
ENA release 117
Release 117 of assembled/annotated sequences from the European Nucleotide Archive (ENA) is now available on the EBI public ftp server at ftp://ftp.ebi.ac.uk/pub/databases/embl/release/.
ENA captures, preserves and presents the world's nucleotide sequence data. New content is included in ENA on a continuous basis and is distributed daily from our browser and RESTful service.
|2 Jul 2013||
Change to EMBL-Bank flat file AC lines
Use of AC lines to denote 'WGS/TSA set membership' in addition to 'record replacement' has been a frequent cause of confusion among our users. To solve this we have moved WGS/TSA master accession numbers from AC lines to DR lines.
After the change:
DR ENA; <WGS/TSA master accession number with set version>; SET
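For parsers that previously read set membership from AC lines, the new DR line can be picked out as follows. This is a minimal sketch: the function name and the accession in the usage line are made up for illustration, and the pattern assumes the line layout shown above, with an optional trailing full stop.

```python
import re
from typing import Iterable, List

# Matches "DR ENA; <accession>; SET", with flexible spacing after the line code.
DR_SET = re.compile(r"^DR\s+ENA;\s*(\S+?);\s*SET\.?\s*$")


def set_master_accessions(flat_file_lines: Iterable[str]) -> List[str]:
    """Collect WGS/TSA master accessions from DR lines marked as SET."""
    accessions: List[str] = []
    for line in flat_file_lines:
        match = DR_SET.match(line)
        if match:
            accessions.append(match.group(1))
    return accessions


# Hypothetical record fragment; the accession is a placeholder, not a real entry.
print(set_master_accessions(["AC   XXXX01000001;", "DR   ENA; XXXX01000000; SET"]))
```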
|27 Jun 2013||
ENA release 116
Release 116 of assembled/annotated sequences from the European Nucleotide Archive (ENA) is now available on the EBI public ftp server at ftp://ftp.ebi.ac.uk/pub/databases/embl/release/.
|13 Jun 2013||
CRAM 2.0 launch
|26 Mar 2013||
CRAM 2.0 pre-launch
CRAM 2.0 release candidate made available and two month review period announced.
Today sees the pre-launch of CRAM 2.0 and an announcement of the expected full launch date as the 1st of June, 2013.
CRAM 2.0 contains numerous improvements over CRAM 1.0 made possible by active community participation. Minor modifications to the format are possible during the two month review period that now begins.
CRAM 1.0 will be superseded by CRAM 2.0 at time of full launch.
More information about CRAM can be found here.
|13 Feb 2013||
Changes to EMBL-CDS FTP products
The introduction of ENA Advanced Search and other recent service improvements have made a number of EMBL-CDS FTP products redundant. Any remaining users are asked to contact email@example.com to discuss alternative ways to retrieve this information.
Changes to EMBL-CDS FTP reports
|20 Nov 2012||
Today sees the launch of the CRAM compression software and format. CRAM provides not only powerful compression through its lossless and lossy models, but also supports full computational access to data in compressed form.
|9 Nov 2012||
RESTful and Query builder interfaces to ENA launched in beta
New tools to discover and retrieve the world’s nucleotide sequence data have been launched today, in early beta, based on a new custom data warehouse that brings together the diverse content of ENA.
The Query builder function, a web interface for the construction of powerful queries, provides interactive access, while the RESTful interface supports programmatic calls to the warehouse. The Query builder is available from the ‘Advanced search’ tab on any search results page and is documented, along with the RESTful interface here.
These two interfaces will in due course be complemented with a third warehouse-based search interface that offers more intuitive access with much of the search power.
We encourage users to send feedback on this and other ENA services to firstname.lastname@example.org.
|3 Aug 2012||
Approaching full production for CRAM
Since our proof-of-principle publication of reference-based sequence read data compression in 2011 (http://genome.cshlp.org/content/21/5/734), the EBI has been moving towards a production release of CRAM, a framework technology comprising file format and toolkit in which we combine highly efficient and tunable compression with a data format that is directly available for computational use (http://www.ebi.ac.uk/ena/about/cram_toolkit). Details of our plans are provided here.
|13 Jul 2012||
Just published: 'The future of DNA sequence archiving'
Our latest paper has been published in the inaugural issue of Gigascience. In this commentary, our starting point is the existence of a viable compression method, CRAM - currently in late beta and soon to be released in full production - in which compression can be applied with varying levels of intensity to different data sets. Our goal with the paper is to stimulate the broadest possible community discussion about exactly where the most aggressive forms of compression should be applied and where the approach should be far more cautious.
|25 Apr 2012||
CRAM 0.8 released
CRAM toolkit 0.8 has been released. More information, with download, installation and usage instructions, is available here.
|25 Apr 2012||
ENA policy relating to compression of submitted data
ENA details its policy on use of CRAM raw sequence data compression during the archiving process.
|7 Mar 2012||
CRAM toolkit 0.7 released
CRAM toolkit 0.7 has been released. More information, with download, installation and usage instructions, is available here.
|2 Mar 2012||
The future of sequence archiving
The future of sequence archiving and the role of data compression are explored in a new paper from EBI to be published in Gigascience. Planned for the first issue of the journal, the paper is available pre-publication here. In the paper, we propose a way forward through the use of a graded system in which the ease of reproduction of a sequencing-based experiment and the relative availability of a sample for resequencing are used to define the level of lossy compression to apply to the stored data.
|13 Feb 2012||
CRAM toolkit 0.6 released
CRAM toolkit version 0.6 has been released. More information, with download, installation and usage instructions, is available here.
|20 Jan 2012||
ENA training videos now available on the EMBL-EBI YouTube Channel
Train yourself on Sequence Read Archive submissions theory and practice with ENA training videos now uploaded to the EMBL-EBI YouTube Channel for simpler access. We believe that such videos are useful additions to our support services and plan to cover more areas of ENA as time goes by. Contact us at email@example.com with suggestions of ENA services that you would like to see covered.
|6 Jan 2012||
ENA data go Galactic
Next generation sequence data from ENA's Sequence Read Archive (SRA) are now available as a data source in the Galaxy analysis system.
SRA data can be browsed and selected for upload using the Galaxy 'Get Data' tool. The 'EBI SRA' data source forwards users to ENA's pages to upload data into Galaxy.
In ENA, links are provided to Galaxy that upload data into the system and launch a session. For example, the following URL provides a list of all fastq files associated with SRA study ERP000591: http://www.ebi.ac.uk/ena/data/view/ERP000591. The Galaxy file upload links are available in the column 'Galaxy'.
|4 Nov 2011||
CRAM toolkit 0.5 released
CRAM toolkit version 0.5 has been released with support for several new quality budget models, a random access index and the Picard API. More information, with download, installation and usage instructions, is available here.
|3 Aug 2011||
CRAM toolkit 0.3 released
CRAM toolkit version 0.3 has been released with support for read quality masking (selective preservation of base quality scores). More information, with download, installation and usage instructions, is available here.
|16 Mar 2011||
ArchiveBAM 1.0 specification
The ArchiveBAM 1.0 specification has been published. SRA submitters are advised to submit their data using the BAM format.
|11 Mar 2011||
ENA User Survey 2011 now available
Have your say and help us to improve ENA in our brief survey at http://www.surveymonkey.com/s/ENA_User_Survey_2011.
|7 Mar 2011||
Next Generation Sequencing Workshop at EBI
Spaces still available on the SRA user course: EBI Affiliated Course - Next Generation Sequencing Workshop 2011. The course will take place at EBI on the 4th-6th April 2011.
Further details are available here.
|26 Jan 2011||
EMBL-EBI and Sequence Read Archive
EMBL-EBI will continue to support the Sequence Read Archive for raw data.
See here for further details.
|6 Dec 2010||
The MIENS specification for describing marker genes was published in Nature Precedings.
The full article is available here.
|9 Nov 2010||
An article describing the Sequence Read Archive (SRA) services was published in the Nucleic Acids Research 2011 Database Issue.
The full article is available here.
|23 Oct 2010||