What is ENA?
The European Nucleotide Archive (ENA) is a comprehensive resource of primary nucleotide sequence information. ENA provides access to both assembled sequences and unassembled (raw) sequence reads, but stores them in separate databases to optimise accessibility and analysis (2). Figure 2 provides a schematic representation of how the data is stored in ENA.
ENA consists of three databases:
(1) EMBL-Bank consists of:
- Assembled sequence data, where the submitter has assembled the reads into a single contiguous sequence.
- Submitter-provided annotation describing the biological function of specific regions of the sequence (such as protein-coding regions, exons and introns).
(2) Sequence Read Archive (SRA) consists of (3):
- Raw reads: typically short, unassembled sequence fragments generated using Next Generation Sequencing (NGS) technology.
(3) Trace Archive consists of:
- Raw reads: unassembled sequence fragments generated using capillary sequencing technology.
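
The three-way split described above can be summarised as a small lookup. The Python sketch below is illustrative only: `Database`, `target_database`, and the technology labels "ngs" and "capillary" are hypothetical names chosen for this example and are not part of any ENA interface. It simply encodes the rule that assembled data belongs in EMBL-Bank, while raw reads are separated by the sequencing technology that produced them.

```python
from enum import Enum


class Database(Enum):
    """The three ENA databases named in the text."""
    EMBL_BANK = "EMBL-Bank"
    SRA = "Sequence Read Archive"
    TRACE_ARCHIVE = "Trace Archive"


def target_database(assembled: bool, technology: str = "") -> Database:
    """Return the database that would hold a given kind of data.

    `assembled` and `technology` are hypothetical inputs for this sketch;
    the labels "ngs" and "capillary" are assumptions, not ENA vocabulary.
    """
    if assembled:
        # Assembled (and annotated) sequence goes to EMBL-Bank,
        # whatever technology produced the underlying reads.
        return Database.EMBL_BANK

    tech = technology.strip().lower()
    if tech == "ngs":
        # Raw reads from Next Generation Sequencing.
        return Database.SRA
    if tech == "capillary":
        # Raw reads from capillary sequencing.
        return Database.TRACE_ARCHIVE

    raise ValueError(f"unknown technology for raw reads: {technology!r}")


if __name__ == "__main__":
    print(target_database(assembled=True).value)
    print(target_database(assembled=False, technology="NGS").value)
    print(target_database(assembled=False, technology="capillary").value)
```

Raising an error for an unrecognised technology, rather than guessing a default, keeps the sketch faithful to the text, which names only these two sources of raw reads.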