Thornton Group

Computational biology of proteins and ageing

Our research uses protein 3D structural information to understand molecular evolution and how variants and small molecules can cause or modulate diseases and ageing. We have a strong focus on enzymes, the transformations they perform, their mechanisms, flexibility and how they evolve novel functions.


We explore the structure, function and evolution of proteins. These basic studies help us to understand how proteins work and to interpret coding variations in humans and their impact on healthy ageing and disease. Our research focuses on three distinct but related areas:

  • We use structural data to understand how enzymes work and how they evolve to perform new functions. We have shown that most enzyme functions evolved from other functions, which opens a path to the rational design of novel enzymes with new functions and mechanisms. We also develop computational tools, based on our analyses, to improve enzyme design. Our analysis of enzyme active sites reveals a high degree of structural conservation of the catalytic residues (grey outlines in the figure). This allows us to derive representative structural templates (black outlines, derived from PDB entry 1n4o) for each enzyme type. These templates can be used to search for related enzymes or pseudo-enzymes.
  • Our study of human coding variations aims to use protein structural knowledge to interpret their effects. Using the CATH domain database, we study variations occurring in related domains and explore how their genomic context influences the resultant phenotype.
  • Our work on enzymes and variants is ultimately related to human health, ageing and disease. Our goal is to trace the steps from the protein variant or ligand at the molecular level, through its effect on the protein’s function, to the organismal ‘disease and ageing’ sub-phenotype. In our ageing studies, we aim to combine data across multiple organisms and data types to define sub-phenotypes of ageing and so improve our understanding of its molecular basis. We will explore why ageing makes us more susceptible to some diseases and use computational methods to identify small molecules that may have an impact on ageing.

Catalytic Templates for Class A Beta-lactamases. This shows a superposition of the active site residues taken from 244 related but ‘unique’ structures. Most residues superpose well, but the Ser/Thr residues (which occur in different members of the family) can be seen to adopt two alternative conformations in these structures.
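
To make the idea of searching with a structural template concrete, here is a minimal sketch in Python. It is not the group's method: it swaps superposition for a simpler comparison of inter-atomic distances, which needs no alignment step. The function names, the coordinates and the 1.0 Å cutoff are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return the distances between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(template, candidate):
    """Root-mean-square difference between corresponding pairwise distances.

    Both inputs are equal-length lists of (x, y, z) tuples listed in the
    same atom order. The score does not change if the candidate is rotated
    or translated, so no superposition is required.
    """
    if len(template) != len(candidate):
        raise ValueError("template and candidate need the same number of atoms")
    if len(template) < 2:
        raise ValueError("at least two atoms are needed to compare distances")
    t = pairwise_distances(template)
    c = pairwise_distances(candidate)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(t, c)) / len(t))


def matches_template(template, candidate, cutoff=1.0):
    """Report whether the candidate lies within a (hypothetical) cutoff in angstroms."""
    return distance_rmsd(template, candidate) <= cutoff


# Hypothetical coordinates in angstroms, not taken from any PDB entry.
template = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
# The same geometry, rotated 90 degrees about z and translated.
rotated = [(10.0, 10.0, 5.0), (10.0, 13.0, 5.0), (6.0, 10.0, 5.0)]
# One atom displaced by 3 angstroms.
distorted = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

print(matches_template(template, rotated))
print(matches_template(template, distorted))
```

A distance-based score of this kind cannot tell a site from its mirror image, which is one reason a real template search would typically superpose the atoms instead.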

Future projects

For our enzyme work, our central question is whether we can predict the evolution of enzyme function – both in terms of adapting to operate on new substrates and evolving new mechanisms. Can we relate changes in function to changes in the structure of the enzyme and changes in the environment? Can we automatically predict or validate enzyme catalytic mechanisms in silico from structural data? We will further develop our data resources (M-CSA) and websites (PDBsum) and develop novel methods to predict transformations and mechanisms using knowledge-based and deep-learning approaches.

For coding variants, we will enhance our web tool (VarSite), which relates variants to 3D structure and function, to help non-experts understand the impact of coding variants and how they generate disease phenotypes. To this end, we plan to:

  • develop new methods to analyse the effects of mutations in ligand binding sites
  • explore variants in co-factor binding sites and their impact on function
  • apply our methods to specific genes of interest in collaboration with ‘domain’ experts
  • explore how the same mutations can cause many diseases, and how one disease can have many causes.

For ageing, we will develop tools to combine transcriptome data sets and analyse a small number of common diseases and the impact of ageing on their occurrence.

Selected publications

Laskowski RA, Stephenson JD, Sillitoe I, Orengo CA, Thornton JM. VarSite: Disease variants and protein structure (2020). Protein Science 29, 111-119

Ribeiro AJ, Tyzack JD, Borkakoti N, Thornton JM. Identifying pseudoenzymes using functional annotation: pitfalls of common practice (2020). The FEBS Journal 287, 4128-4140

Dönertaş HM, Fabian DK, Fuentealba Valenzuela MF, Partridge L, Thornton JM. Common genetic associations between age-related diseases (2020). Nature Aging ref.

Laskowski RA, Thornton JM. PDBsum extras: SARS-CoV-2 and AlphaFold models (2022). Protein Science 31, 283-289

Thornton JM, Laskowski RA, Borkakoti N. AlphaFold heralds a data-driven revolution in biology and medicine (2021). Nature Medicine 27, 1666-1669