Using ontologies to provide controlled vocabularies

What is an ontology?

In informatics and computer science, an ontology is a representation of the shared background knowledge of a community (7). It describes the categories of objects that appear in a body of data, the relationships between those objects, and the relationships between those categories. In doing so, an ontology describes the objects themselves and sometimes defines the criteria needed to recognise one of them. The labels attached to the objects can serve as a controlled vocabulary, but an ontology is much more than a controlled vocabulary.
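To make the distinction concrete, here is a minimal sketch in Python of a toy ontology. The identifiers (EX:0001 and so on) and the single "is_a" chain are invented for illustration and do not reproduce the structure of any real ontology. The labels alone give a controlled vocabulary; the relationships are what make it an ontology.

```python
# Toy ontology: the identifiers and is_a links are hypothetical,
# chosen only to illustrate the idea of categories and relationships.
TERMS = {
    "EX:0001": {"label": "catalytic activity", "is_a": []},
    "EX:0002": {"label": "kinase activity", "is_a": ["EX:0001"]},
    "EX:0003": {"label": "protein kinase activity", "is_a": ["EX:0002"]},
    "EX:0004": {"label": "protein tyrosine kinase activity", "is_a": ["EX:0003"]},
}


def ancestors(term_id, terms=TERMS):
    """Return the set of all terms reachable by following is_a links upwards."""
    found = set()
    stack = list(terms[term_id]["is_a"])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(terms[parent]["is_a"])
    return found


def controlled_vocabulary(terms=TERMS):
    """Return just the labels: the controlled vocabulary the ontology delivers."""
    return sorted(term["label"] for term in terms.values())
```

Calling controlled_vocabulary() returns only the permitted labels, whereas ancestors("EX:0004") uses the relationships to infer that anything annotated as a protein tyrosine kinase activity is also a kinase activity and a catalytic activity.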

The Gene Ontology

The archetypal example of an ontology in the molecular life sciences is the Gene Ontology (GO), created and maintained by the Gene Ontology Consortium. GO describes the molecular functions of gene products, the biological processes they take part in and their cellular locations, across all species (Figure 11). You can find out more about it in our quick tour of GO.

GO is used to describe genes and their products in major public databases, including UniProt, Ensembl and many model organism databases such as FlyBase, SGD and MGI. It also provides a powerful means of analysing data sets. For example, given a set of genes found to be over-expressed in a functional genomics experiment, you can determine whether they share related functions or similar locations (GO enrichment analysis).
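One common way to score enrichment for a single GO term is a one-sided hypergeometric test (equivalent to Fisher's exact test). The sketch below is an illustration under that assumption, not the method of any particular tool; the function name and all numbers in the usage example are hypothetical. It needs Python 3.8 or later for math.comb.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in the sample.

    population: number of genes in the background set
    annotated:  number of background genes annotated with the GO term
    sample:     number of genes in the list of interest
    hits:       number of genes in that list annotated with the GO term
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical example: 20,000 background genes, 200 annotated with the term,
# 100 over-expressed genes, 8 of which carry the annotation.
p = enrichment_p_value(population=20000, annotated=200, sample=100, hits=8)
```

A small p-value suggests the term is over-represented in the gene list. In practice many terms are tested at once, so the p-values would also need a multiple-testing correction.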

Figure 11 Snapshot of the protein tyrosine kinase entry in GO, showing the richness of an ontology and highlighting the terms that are used to populate controlled vocabularies.

Since GO was first developed in the late 1990s, many other ontologies have been developed along the same principles, which include open access, collaborative development and interoperability. These ontologies are collected in the OBO Foundry and are also listed at BioSharing.org (since renamed FAIRsharing.org).

If you have a data set and want to annotate it using an ontology, it can be quite a challenge to identify the most appropriate one. The Ontology Lookup Service is a good place to start, requiring no prior knowledge of what ontologies exist.