First bulk CRAM submission to ENA

European Nucleotide Archive (ENA)

The first large-scale read data set in the CRAM compressed format has been submitted, undergone processing and been made public in the European Nucleotide Archive (ENA).

Using CRAM in lossless mode, the submission represents a pre-publication data release from the Wellcome Trust Sanger Institute and comprises around 4000 run records covering a number of pathogen species. Data are available for download in both CRAM and FASTQ formats. In CRAM format, the dataset takes up 80% of the disk space, and of the network bandwidth for download, required by its gzipped FASTQ equivalent.

You can find out more about CRAM sequence data compression technology, and view an example of a study in this new dataset, on the ENA website. Our documentation also provides a starting point for data submissions to ENA, including those in CRAM format.