Thornton group

Computational biology of proteins: structure, function and evolution

Our research builds on the accumulating knowledge of the three-dimensional structures of proteins and their complexes to understand the evolution of life in 3D and how variants and small molecules can cause or modulate diseases and ageing. This understanding will ultimately improve our ability to treat diseases and to facilitate healthy ageing.

Our research focuses on three distinct but related areas:

  • We explore the structure, function and evolution of enzymes and their mechanisms. These basic studies help us to interpret coding variations in humans and their impact on disease. We have strong, ongoing collaborations with experimental groups working on model organisms to better understand the molecular basis of ageing.
  • Using structural data, we seek to understand how enzymes work and how they evolve to perform new functions. We have shown that most enzyme functions have evolved from other functions; this opens the path to the rational design of novel enzymes with new functions and mechanisms. We also develop computational tools based on our analyses to improve enzyme design.
  • Our study of human coding variations associated with developmental disorders and other diseases aims to use protein structural knowledge to interpret their effects. Using the CATH domain database, we study variations occurring in related domains and explore how their genomic context influences the resultant disease. Our goal is to begin to trace the steps from the molecular protein variant, through its effect on the protein’s function, to the organismal ‘disease and ageing’ sub-phenotype.
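As a rough illustration of studying variations in related domains, the sketch below pools variants from different proteins onto shared alignment columns of a domain family, so that recurrently affected positions stand out. It is a minimal sketch under stated assumptions, not the group's actual method: the function name, the protein identifiers, the residue numbers and the alignment mapping are all hypothetical, and a real analysis would derive the mapping from a domain alignment such as one built from CATH.

```python
from collections import Counter


def pool_variants_by_column(variants, residue_to_column):
    """Count variants per alignment column across related domains.

    variants: iterable of (protein_id, residue_number) pairs.
    residue_to_column: dict mapping (protein_id, residue_number) to a
        column index in a multiple alignment of the domain family.
    Variants that fall outside the aligned domain are ignored.
    """
    counts = Counter()
    for protein_id, residue_number in variants:
        column = residue_to_column.get((protein_id, residue_number))
        if column is not None:
            counts[column] += 1
    return counts


# Toy, made-up data: two hypothetical proteins sharing one domain family.
residue_to_column = {
    ("protA", 120): 7, ("protA", 121): 8,
    ("protB", 45): 7, ("protB", 46): 8,
}
variants = [("protA", 120), ("protB", 45), ("protB", 46), ("protB", 300)]

column_counts = pool_variants_by_column(variants, residue_to_column)
print(column_counts.most_common())  # columns ordered by pooled variant count
```

Pooling by alignment column rather than by residue number is what makes variants in different proteins comparable: equivalent structural positions share a column even when their sequence numbering differs.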

Disease-associated mutations affecting the DHSW tetrad in the WD40 motif of different proteins implicated in rare diseases

a) The β-propeller structure of the WD40 domain. b) The hydrogen-bonding network (dashed yellow lines) between the DHSW residues of a typical propeller blade which, when disrupted by mutation, can lead to different diseases involving the different proteins that contain this motif.

Ageing can be described as the most ‘integrative or generic phenotype’, in that all organisms age and this ageing is affected by both the genome and the environment. We aim to combine data across multiple organisms and multiple data types to define sub-phenotypes of ageing and so improve our understanding of its molecular basis. As part of our studies on the effects of variants, we will explore why ageing makes us more susceptible to some diseases. We also aim to use computational methods to identify small molecules that may have an impact on ageing.

Future projects

For our enzyme work, our central question will be whether we can predict enzyme-function evolution – both of new substrates and new mechanisms. Can we relate changes in function to changes in the structure of the enzyme and changes in the environment? Can we automatically predict or validate enzyme catalytic mechanisms in silico from structural data? We will further develop our data resources, CSA and MACiE; develop metrics to measure promiscuity and tools to help predict it; and develop methods to predict transformations and mechanisms using deep-learning approaches.

For coding variants, we will provide web tools that relate variant, 3D structure and function, to help non-experts understand the impact of coding variants and how they generate disease phenotypes. To this end, we plan to:

  • visualise variants in structures and their links to diseases
  • develop new methods to analyse the effects of mutations in ligand binding sites
  • use our tools to analyse variants discovered in rare diseases and apply these methods to other specific example genes in collaboration with ‘domain’ experts
  • explore how the same mutations can cause many diseases, and how one disease can have many causes.

For ageing, we will explore:

  • whether it is possible to identify sub-phenotypes to better understand the ageing process, providing a better link between the molecular and whole-organism data
  • whether model organisms can be used to explore the impact of small molecules (potential drugs) on ageing
  • whether we can incorporate clinical data in our studies to bridge molecular and clinical data
  • the role of epigenetics in ageing and the epigenetic clock.

Selected publications

Rahman SA, et al. (2014) EC-BLAST: a tool to automatically search and compare enzyme reactions. Nature Methods 11:171-174

Laskowski RA, et al. (2016) Integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the WD40 domain. Human Molecular Genetics 25:927-935

Martínez Cuesta S, Rahman SA, and Thornton JM (2016) Exploring the chemistry and evolution of the isomerases. Proceedings of the National Academy of Sciences of the United States of America 113:1796-1801

Furnham N, et al. (2016) Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies. Journal of Molecular Biology 428:253-267

Ziehm M, et al. (2017) Drug repurposing for ageing research using model organisms. Aging Cell; doi: 10.1111/acel.12626