CRAM

CRAM is a framework technology, comprising a file format and a toolkit, that combines highly efficient and tunable reference-based compression of sequence data with a data format that is directly available for computational use. In support of CRAM, we also provide the CRAM reference registry.

Building on an early proof of principle for reference-based compression (Hsi-Yang Fritz et al. (2011). Genome Res. 21:734-740), our approach has been to balance usability with compression efficiency. CRAM supports production pipelines for the European Nucleotide Archive*. Current work includes improvements to functionality and broader integration with third-party tools. We remain involved in the community discussion around the application of CRAM to different types of data (Cochrane G. et al. (2012). GigaScience 1:2).
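The core idea of reference-based compression can be illustrated with a minimal sketch. This is not the CRAM encoding itself: the function names `encode_read` and `decode_read`, the toy reference, and the tuple layout are all hypothetical, and only substitutions are handled here. The sketch assumes a read has already been aligned to a known position, and stores just that position, the read length, and the bases that differ from the reference.

```python
from typing import List, Tuple

# (offset within the read, base observed in the read); hypothetical layout
Diff = Tuple[int, str]


def encode_read(reference: str, read: str, pos: int) -> Tuple[int, int, List[Diff]]:
    """Represent an aligned read as (position, length, mismatches vs. reference)."""
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the bases that differ from the reference at the same offset.
    diffs = [(i, b) for i, (r, b) in enumerate(zip(window, read)) if r != b]
    return pos, len(read), diffs


def decode_read(reference: str, pos: int, length: int, diffs: List[Diff]) -> str:
    """Rebuild the read by copying the reference window and applying the mismatches."""
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"  # toy reference sequence
    read = "CGTTAGACGAT"                  # aligned at position 5, one mismatch
    encoded = encode_read(reference, read, 5)
    print(encoded)
    print(decode_read(reference, *encoded) == read)
```

A read that matches the reference closely reduces to a position, a length, and a short list of differences, which is where the space saving comes from. Decoding requires access to the same reference sequence that was used for encoding.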

The latest CRAM version is CRAM 3.0.

*Note the ENA policy on data compression.

CRAM 3.0

samtools/htslib C API 1.2.1 and later

samtools/htsjdk Java API 1.133 and later

CRAMTools version 3.0 and later

SAMtools version 1.2 and later

Picard version 1.133 and later

Format specification

CRAM 3.0 specification 

CRAM 2.1 specification

CRAM 1.0 specification

Mailing list

We encourage membership of the samtools developers mailing list.
