Annotation enrichment analysis


Many different approaches can be used to understand the biological context of protein-protein interaction networks, and annotation enrichment analysis is one of the most popular. Although it is not, strictly speaking, a network analysis tool, it is often used in combination with topological network analysis.

This type of analysis comes in several varieties, but in its most basic form annotation enrichment analysis uses gene/protein annotations provided by knowledge bases such as Gene Ontology (GO) or Reactome to infer which annotations are over-represented in a list of genes/proteins, which can be taken from a network (Figure 32). Essentially, annotation tools perform a statistical test (usually a hypergeometric test, although a binomial test is also common) that tries to answer the following question:

"When sampling X proteins (test set) out of N proteins (reference set; graph or annotation), what is the probability that x, or more, of these proteins belong to a functional category C shared by n of the N proteins in the reference set." (21).
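For the hypergeometric case, the probability asked for in this question can be written out with the same symbols (X, N, x, n). This is a sketch of the standard upper-tail formula, assuming proteins are sampled without replacement and each protein is counted once; individual tools may differ in how they define the reference set.

$$
P(K \ge x) \;=\; \sum_{k=x}^{\min(X,\,n)} \frac{\binom{n}{k}\,\binom{N-n}{X-k}}{\binom{N}{X}}
$$

Here K is the number of proteins in the test set that belong to category C. A small value means that seeing x or more annotated proteins would be unlikely if the test set had been drawn at random from the reference set.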

Repeating this test for every annotation category yields a list of over-represented terms that describe the protein list or network, or a selected part of it, as a whole.
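As a rough illustration of how such a list of terms can be produced, the sketch below applies a hypergeometric test to each annotation category in turn and sorts the categories by p-value. It is a minimal example and not the implementation used by any particular tool: the protein identifiers, the term names and the function names `hypergeom_tail` and `enrich` are all hypothetical.

```python
from math import comb

def hypergeom_tail(x, X, n, N):
    """Probability that at least x of X sampled proteins carry an annotation
    held by n of the N proteins in the reference set."""
    upper = min(X, n)
    favourable = sum(comb(n, k) * comb(N - n, X - k) for k in range(x, upper + 1))
    return favourable / comb(N, X)

def enrich(test_set, reference, annotations):
    """Test every term and return (term, x, n, p_value) sorted by p-value."""
    reference = set(reference)
    test_set = set(test_set) & reference      # ignore proteins outside the reference
    N, X = len(reference), len(test_set)
    results = []
    for term, members in annotations.items():
        members = set(members) & reference
        n = len(members)                      # annotated proteins in the reference set
        x = len(members & test_set)           # annotated proteins in the test set
        if x == 0:
            continue                          # term is absent from the test set
        results.append((term, x, n, hypergeom_tail(x, X, n, N)))
    return sorted(results, key=lambda r: r[3])

# Hypothetical toy data: 20 reference proteins, two made-up annotation terms.
reference = [f"P{i}" for i in range(1, 21)]
annotations = {
    "term_A": ["P1", "P2", "P3", "P4"],
    "term_B": ["P3", "P10", "P11", "P12", "P13", "P14", "P15", "P16"],
}
test_set = ["P1", "P2", "P3", "P5", "P6"]   # e.g. proteins taken from a network

ranked = enrich(test_set, reference, annotations)
for term, x, n, p in ranked:
    print(f"{term}: {x} of {n} annotated proteins in test set, p = {p:.4f}")
```

In this toy example term_A ranks first because three of its four members fall within the five-protein test set. The sketch omits the multiple-testing correction that enrichment tools normally apply when many terms are tested at once.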

Figure 32 Annotation enrichment analysis using GO and Reactome in a network.

This type of analysis is most frequently performed using GO annotation as a reference, but tools such as the Cytoscape apps BiNGO and ClueGO can also use other annotation databases such as Reactome and KEGG. It is a widely used technique that helps characterise the network as a whole or subsets of it, such as inter-connected communities found through topological clustering analysis.

More complex versions of this technique can factor in continuous variables such as expression fold change. The GSEA tool is a good example of a more advanced technique that builds on similar basic concepts. A somewhat dated but very thorough overview of the different tools in this family, and of the advantages and limitations of their approaches, can be found in Huang et al. 2009 (22).
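To give a feel for how a continuous variable can enter the analysis, the sketch below ranks genes by fold change and computes a simple running-sum score for a gene set. It is a simplified, unweighted variant of the idea behind GSEA, not the GSEA tool's actual algorithm, which weights each step by the ranking metric and estimates significance by permutation. The gene names, fold-change values and the function name `running_sum_score` are hypothetical.

```python
def running_sum_score(ranked_genes, gene_set):
    """Walk down a ranked gene list, stepping up at gene-set members and down
    at all other genes; return the largest deviation from zero."""
    gene_set = set(gene_set)
    hits = sum(1 for g in ranked_genes if g in gene_set)
    misses = len(ranked_genes) - hits
    if hits == 0 or misses == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    up, down = 1.0 / hits, 1.0 / misses   # step sizes chosen so the walk ends at zero
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += up if gene in gene_set else -down
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical expression fold changes (log scale) for eight genes.
fold_change = {
    "G1": 3.2, "G2": 2.5, "G3": 1.9, "G4": 0.4,
    "G5": -0.3, "G6": -1.1, "G7": -2.0, "G8": -2.8,
}
ranked_genes = sorted(fold_change, key=fold_change.get, reverse=True)

score_top = running_sum_score(ranked_genes, {"G1", "G2", "G3"})   # set at the top of the ranking
score_bottom = running_sum_score(ranked_genes, {"G7", "G8"})      # set at the bottom of the ranking
print(score_top, score_bottom)
```

A score close to +1 indicates that the gene set is concentrated at the top of the ranking, a score close to -1 that it is concentrated at the bottom. Unlike the over-representation test, this approach needs no cut-off to define the test set, because every gene contributes through its rank.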

To learn more about annotation and Reactome pathways, have a look at these Train online courses:

UniProt-GOA: Quick tour

GO: Quick tour

Reactome: Exploring and analysing biological pathways