Relational databases

Many bioinformatics databases - and indeed, many large, complex databases in all walks of life - are relational databases. Put simply, a relational database comprises two or more separate tables, with explicitly defined relationships linking the tables together via key fields.
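
To make the idea of tables linked via key fields concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (organism, protein), column names and example rows are hypothetical and chosen purely for illustration. SQLite is used only because it needs no setup; it is not a system this text refers to.

```python
import sqlite3


def build_example_db():
    """Create a tiny in-memory database with two related tables (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Each organism is stored once and identified by its key field.
    conn.execute(
        "CREATE TABLE organism ("
        " organism_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    # Each protein points at its organism through the organism_id key field.
    conn.execute(
        "CREATE TABLE protein ("
        " protein_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " organism_id INTEGER NOT NULL REFERENCES organism(organism_id))"
    )
    conn.execute("INSERT INTO organism VALUES (1, 'Homo sapiens')")
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [(1, "Insulin", 1), (2, "Haemoglobin subunit alpha", 1)],
    )
    conn.commit()
    return conn


def proteins_with_organism(conn):
    """Follow the key field to combine rows from both tables."""
    return conn.execute(
        "SELECT p.name, o.name"
        " FROM protein AS p"
        " JOIN organism AS o ON p.organism_id = o.organism_id"
        " ORDER BY p.protein_id"
    ).fetchall()


if __name__ == "__main__":
    for protein_name, organism_name in proteins_with_organism(build_example_db()):
        print(protein_name, "|", organism_name)
```

The organism name is stored in one row only; the join follows the organism_id key to attach it to each protein when the data are queried.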

The advantages of relational databases over flat files include the following:

  1. Data integrity: data in a relational database are harder to corrupt. Owing to their structure, flat files frequently contain redundant information, and redundant copies can easily fall out of step when one copy is changed and the others are not.
  2. Data consistency: each ‘entry’ in a relational database has a unique ID; this reduces the chances of having inconsistent data and multiple entries for the same item.
  3. Smaller file sizes for the same data: because each piece of information is stored once rather than repeated, relational databases are more compact than their flat file equivalents.
  4. Data availability: databases can be shared over a network and updated from many points.
  5. Speed: retrieval of information is typically faster from relational databases than it is from flat files containing the same data.
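
The integrity and consistency points in the list above can be sketched in a few lines, again using Python's built-in sqlite3 module as an assumed stand-in for a relational database. The tables, columns and values are hypothetical. The sketch shows a duplicate ID being rejected, a row that refers to a non-existent entry being rejected, and a single update being seen by every related row.

```python
import sqlite3


def make_db():
    """Build a small in-memory database with a key relationship (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.execute(
        "CREATE TABLE organism (organism_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE protein ("
        " protein_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " organism_id INTEGER NOT NULL REFERENCES organism(organism_id))"
    )
    conn.execute("INSERT INTO organism VALUES (1, 'Homo sapiens')")
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [(1, "Insulin", 1), (2, "Haemoglobin subunit alpha", 1)],
    )
    conn.commit()
    return conn


def try_insert(conn, sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


def organism_names_seen_by_proteins(conn):
    """List the distinct organism names reachable from the protein table."""
    return conn.execute(
        "SELECT DISTINCT o.name FROM protein AS p"
        " JOIN organism AS o ON p.organism_id = o.organism_id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    # A second entry with an existing unique ID is refused.
    print(try_insert(conn, "INSERT INTO organism VALUES (?, ?)", (1, "Duplicate")))
    # A protein pointing at an organism that does not exist is refused.
    print(try_insert(conn, "INSERT INTO protein VALUES (?, ?, ?)", (3, "Orphan", 99)))
    # One update in one place is reflected for every related protein.
    conn.execute("UPDATE organism SET name = ? WHERE organism_id = ?", ("H. sapiens", 1))
    print(organism_names_seen_by_proteins(conn))
```

In a flat file the organism name would be repeated on every protein line, so the same correction would have to be made on each line separately, with the risk of missing some.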

In a nutshell, relational databases provide efficient, reliable, convenient and safe multi-user storage of, and access to, massive amounts of persistent data. For a lengthier discussion of this topic, see Jennifer Widom’s introductory video on relational databases, below.