CRAM is a sequencing read file format that achieves high space efficiency through reference-based compression of sequence data, and it offers both lossless and lossy compression modes. Building on an early proof of principle for reference-based compression (Hsi-Yang Fritz et al. (2011). Genome Res. 21:734-740), the CRAM format balances usability with compression efficiency.
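To make the idea of reference-based compression concrete, the following is a minimal sketch in Python. It is not the CRAM encoding itself: it assumes reads that align without insertions or deletions, and the function names `encode_read` and `decode_read` are hypothetical, chosen only for illustration. A read is stored as its alignment position, its length, and the bases that differ from the reference, so a read that matches the reference exactly costs almost nothing to store.

```python
def encode_read(reference, read, pos):
    """Encode a read as (position, length, substitutions) against a reference.

    Illustrative only: handles substitutions, not insertions or deletions.
    """
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the offsets where the read disagrees with the reference.
    diffs = [(i, base) for i, (ref_base, base) in enumerate(zip(window, read))
             if ref_base != base]
    return pos, len(read), diffs


def decode_read(reference, record):
    """Rebuild the original read from the reference and the stored record."""
    pos, length, diffs = record
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTACGT"
record = encode_read(reference, "GTATGT", 2)   # one mismatch against "GTACGT"
restored = decode_read(reference, record)      # round-trips to the original read
```

Decoding requires the same reference that was used for encoding, which is why access to reference sequences matters for users of the format.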
The format specification is maintained by the Global Alliance for Genomics and Health (GA4GH) Large Scale Genomics workstream, whose members provide multiple implementations and coordinate future specification changes. In support of CRAM, the ENA provides the CRAM reference registry for serving reference sequences to users of the CRAM format.
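As a rough illustration of how a reference sequence might be looked up in such a registry, the sketch below computes an MD5 checksum of a sequence and builds a lookup URL from it. Two things here are assumptions not stated in the text above: that sequences are identified by the MD5 of their uppercase bases with whitespace removed, and that the registry is reachable at the URL template shown. The helper names `sequence_md5` and `registry_url` are hypothetical. No network request is made.

```python
import hashlib

# Assumed URL pattern for the registry; verify against current ENA documentation.
REGISTRY_URL_TEMPLATE = "https://www.ebi.ac.uk/ena/cram/md5/{md5}"


def sequence_md5(sequence):
    """Return the MD5 hex digest of a sequence, uppercased with whitespace removed."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def registry_url(sequence):
    """Build the (assumed) registry lookup URL for a reference sequence."""
    return REGISTRY_URL_TEMPLATE.format(md5=sequence_md5(sequence))


# Line breaks and letter case do not change the checksum under this normalization.
url = registry_url("acgt\nACGT")
```

Normalizing before hashing means that the same sequence yields the same key regardless of how the FASTA file wraps its lines.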
The latest CRAM version is CRAM 3.0.
Note: see the ENA policy on data compression.
CRAM 3.0 Implementations
CRAM 1.0 specification