Some examples of what motivates our work:
- Bacteria are the most abundant and diverse cellular life form, with an incredible ability to adapt. Part of that ability is due to their flexible genomes. If you study a bacterial species, you find that most genes are either very common or very rare: the frequency distribution is U-shaped. As a result, the standard bioinformatics approach of using “one reference genome to represent them all” will not work. A large component of our work involves developing pan-genome graph approaches to model and study genetic variation in bacterial pan-genomes (Colquhoun et al., Genome Biology, https://pubmed.ncbi.nlm.nih.gov/34521456/).
- There are several surface antigen genes in P. falciparum, the parasite that causes the deadliest form of malaria, that have dimorphisms – two main deeply diverged allelic forms. We have used these as drivers for method development in graph genome algorithms, and are now applying the methods to study global genetic variation in those genes.
- We study the genome of M. tuberculosis, the bacterium that causes tuberculosis, from many angles. Working in the CRyPTIC consortium, we have analysed the genomes of 80,000 isolates, of which around 20,000 have been phenotyped for resistance to 14 drugs. We have developed Mykrobe, a rapid, lightweight app for predicting antibiotic resistance from sequence data from a sample of M. tuberculosis (it also supports S. aureus and S. sonnei). There are various related projects on sequencing TB without culture (directly from patient sputum), nanopore analysis, and global real-time surveillance.
- Recently we have been working on combining genomic analysis of M. tuberculosis with image data of bacterial growth.
- We have developed new methods for a ‘Google search’ of all bacterial and viral genomic data (Bradley et al., Nat Biotechnol 2019; Bingmann et al., SPIRE 2019), and have been combining them with the global archives to study transposon and plasmid evolution.
The back and forth between methods and applications remains at the heart of our work. We will continue to develop our graph genome approaches for bacteria and parasites: the methods are still in their infancy, but we are now beginning to be able to realise their benefit in studying real data. Different applications of graph genomes, to infectious disease epidemiology in hospitals and to mobile element evolution, will lead to new insights and requirements. We will also continue to study the impact of genotype on M. tuberculosis growth and drug resistance.