How is ENA structured?

All ENA data are structured around a robust, intuitive metadata model, visualised in Figure 1. In addition to submitting sequence data, users must register a Study/Project to contain the data and describe its purpose. Most sequence data also require the registration of Samples, which describe the origins of the biomaterial that was sequenced.

**Figure 1** The ENA metadata model provides a framework that ensures data are always accompanied by enough metadata to contextualise them.
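
The relationships described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class and field names (`Study`, `Sample`, `SequenceData`, `alias`, `purpose` and so on) are assumptions made for this example and do not reflect ENA's actual schema or submission interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sample:
    """Describes the origin of the biomaterial that was sequenced."""
    alias: str          # hypothetical identifier, for illustration only
    description: str


@dataclass
class SequenceData:
    """One item of submitted sequence data."""
    name: str
    sample: Optional[Sample] = None  # most, but not all, data need a Sample


@dataclass
class Study:
    """Container for the data, with a description of its purpose."""
    title: str
    purpose: str
    data: List[SequenceData] = field(default_factory=list)

    def add(self, item: SequenceData) -> None:
        """Attach an item of sequence data to this study."""
        self.data.append(item)

    def samples(self) -> List[Sample]:
        """Return the distinct Samples referenced by the study's data."""
        seen: List[Sample] = []
        for item in self.data:
            if item.sample is not None and item.sample not in seen:
                seen.append(item.sample)
        return seen


# Example usage with made-up values
study = Study(title="Example study", purpose="Illustration only")
soil = Sample(alias="sample-1", description="Hypothetical soil sample")
study.add(SequenceData(name="reads-1", sample=soil))
study.add(SequenceData(name="reads-2", sample=soil))
```

In this sketch the Study owns the data and each data item points to at most one Sample, so the same Sample can be shared by several items of sequence data.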

The majority of sequence data can be divided into three tiers, which build upon one another (Figure 2):

- Reads: the raw output of sequencing machines
- Assembly: reconstructions of replicons (or fragments thereof) made from raw reads
- Annotation: functional information projected onto assemblies at coordinate-defined locations
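
One way to picture how the three tiers build upon one another is the minimal Python sketch below. It assumes hypothetical class and field names (`Reads`, `Assembly`, `Annotation`, `built_from`, `start`, `end`) and a 1-based, inclusive coordinate convention chosen purely for illustration; none of these are taken from ENA's own data formats.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reads:
    """Raw output of a sequencing machine."""
    run_name: str  # hypothetical label for one sequencing run


@dataclass
class Assembly:
    """A reconstruction made from raw reads."""
    name: str
    length: int               # length of the reconstructed sequence, in bases
    built_from: List[Reads]   # the reads this assembly was made from


@dataclass
class Annotation:
    """Functional information placed at coordinates on an assembly."""
    assembly: Assembly
    start: int                # 1-based, inclusive (assumed convention)
    end: int                  # 1-based, inclusive (assumed convention)
    function: str

    def __post_init__(self) -> None:
        # An annotation only makes sense inside the assembly it refers to.
        if not (1 <= self.start <= self.end <= self.assembly.length):
            raise ValueError("annotation lies outside the assembly")


# Example usage with made-up values
reads = Reads(run_name="run-1")
assembly = Assembly(name="contig-1", length=5000, built_from=[reads])
gene = Annotation(assembly=assembly, start=100, end=900, function="example gene")
```

Each tier holds a reference to the tier beneath it, so an annotation cannot exist without an assembly, and an assembly records the reads it was built from.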
