The relational model

Relational databases have two distinct sets of components (Figure 5):

  1. The schema defines the structure of the database - the metaphorical equivalent of a filing cabinet and hanging files.
  2. The data is the information (the set of records) stored in the database - the metaphorical equivalent of the documents stored inside hanging files.


Figure 5 Part of the schema for the Ensembl database.

Designing a database schema takes more up-front planning than designing a flat-file database but, once established, a schema typically changes little. By contrast, the data may be in constant flux. The data in many bioinformatics databases is growing supra-exponentially and is continually enriched through both automated and manual curation.
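The split between a stable schema and changing data can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module; the table name, column names and records are made up and are not taken from Ensembl.

```python
import sqlite3

# In-memory database, used purely for illustration.
conn = sqlite3.connect(":memory:")

# Schema: defined once, up front (hypothetical table and column names).
conn.execute("CREATE TABLE gene (gene_id TEXT, symbol TEXT, description TEXT)")


def schema_of(connection):
    """Return the stored CREATE TABLE statements, i.e. the schema."""
    rows = connection.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


schema_before = schema_of(conn)

# Data: records are added and curated over time (made-up values).
conn.execute("INSERT INTO gene VALUES ('G0001', 'ABC1', NULL)")
conn.execute("INSERT INTO gene VALUES ('G0002', 'XYZ2', NULL)")
conn.execute(
    "UPDATE gene SET description = 'example annotation' WHERE gene_id = 'G0001'"
)

schema_after = schema_of(conn)
row_count = conn.execute("SELECT COUNT(*) FROM gene").fetchone()[0]
```

After the inserts and the update, the stored table definition is the same as before; only the records have changed.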

The schema comprises a set of tables (or ‘relations’). Each row in a table has its own ‘key’ - an attribute whose value identifies that row uniquely within that particular table. For a database of genes and gene products, for example, the key might be an identifier number. A key can also be built from a combination of attributes: the attributes need not be unique individually, as long as the combination is. In a database of enzyme activities, for example, the substrate and product together might be an appropriate key.
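Both kinds of key can be sketched as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module; the tables, columns and values are hypothetical placeholders chosen to mirror the examples in the text, not a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute key: each gene row is identified by its identifier.
conn.execute("CREATE TABLE gene (gene_id TEXT PRIMARY KEY, symbol TEXT)")

# Composite key: substrate and product together identify a row,
# even though neither column has to be unique on its own.
conn.execute(
    "CREATE TABLE enzyme_activity ("
    " substrate TEXT NOT NULL,"
    " product TEXT NOT NULL,"
    " enzyme TEXT,"
    " PRIMARY KEY (substrate, product))"
)

# Made-up placeholder records.
conn.execute("INSERT INTO gene VALUES ('G0001', 'ABC1')")
conn.execute("INSERT INTO enzyme_activity VALUES ('substrate_A', 'product_B', 'enzyme_1')")
# Same substrate with a different product is a different key, so it is accepted.
conn.execute("INSERT INTO enzyme_activity VALUES ('substrate_A', 'product_C', 'enzyme_2')")


def is_rejected(connection, statement, params):
    """Return True if the database refuses the statement as a key violation."""
    try:
        connection.execute(statement, params)
    except sqlite3.IntegrityError:
        return True
    return False


# Reusing an existing key value is refused in both tables.
duplicate_gene_rejected = is_rejected(
    conn, "INSERT INTO gene VALUES (?, ?)", ("G0001", "DUP1")
)
duplicate_pair_rejected = is_rejected(
    conn,
    "INSERT INTO enzyme_activity VALUES (?, ?, ?)",
    ("substrate_A", "product_B", "enzyme_3"),
)
```

Declaring the key in the schema means the database itself enforces uniqueness, rather than leaving it to whoever loads the data.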

You can learn more about the relational model by watching Jennifer Widom’s video.