International PhD Programme research topics

When you apply for the EMBL International PhD Programme, you are asked to select two EMBL research groups and to indicate up to four research areas that interest you. A variety of backgrounds - such as biology, chemistry, computational science, mathematics and statistics - are relevant to PhD projects at EMBL-EBI. As well as purely computational projects, there may also be possibilities to incorporate some experimental biology in collaborating laboratories.

Here, we show a provisional list of PhD projects at EMBL-EBI that are available during the Spring Recruitment 2016 round.

You can find other EMBL research units on the EMBL website, and browse all EMBL research groups in our Research at a Glance brochure.


Spring Recruitment 2016 is now closed. Our next call will open in February 2016.

Birney research group

Sequence algorithms and intra-species variation

Dr Ewan Birney is Director of EMBL-EBI and has a small research group that focuses on sequence algorithms and using intra-species variation to explore elements of basic biology. The Birney group has a long-standing interest in developing sequencing algorithms, with considerable focus on theoretical and practical implementations of data compression techniques. "Blue skies" research includes collaborating with Dr Nick Goldman on a method to store digital data in DNA molecules. The group continues to be involved in this area as new opportunities arise - including the application of new sequencing technologies. We are also interested in the interplay of natural DNA sequence variation with cellular assays and basic biology.

Over the past five years there has been a tremendous increase in the use of genome-wide association to study human diseases. However, this approach is very general and need not be restricted to the human disease arena. Association analysis can be applied to nearly any measurable phenotype in a cellular or organismal system where an accessible, outbred population is available. We are pursuing association analysis for both molecular traits (e.g. RNA expression levels and chromatin levels) and basic biology traits in a number of species where favourable populations are available, including human and Drosophila. In the future we hope to expand this to a variety of other basic biological phenotypes in other species, including establishing the first vertebrate near-isogenic wild panel in Japanese Rice Paddy fish (Medaka, Oryzias latipes).

Contact Birney research group

Brazma research group

Functional genomics research

Dr Alvis Brazma's research group complements the Functional Genomics service team, and focuses on developing new methods and algorithms and integrating new types of data across multiple platforms. The group is particularly interested in cancer genomics and transcript isoform usage, and collaborates closely with the Marioni group and others throughout EMBL.

Contact Brazma research group

Flicek research group

Evolution of transcriptional regulation

Dr Paul Flicek's research group focuses on computational approaches to genome annotation and evolution, based on models incorporating DNA-protein interactions, epigenetic modifications, and the DNA sequence itself. The group is also interested in the large-scale infrastructure required for modern bioinformatics, including storage and access methods for high-throughput sequencing data.

Contact Flicek research group

Gerstung group

Developing statistical models and bioinformatics tools for understanding cause and consequence of cancer genomes.

Cancer is a genetic disease caused by mutations to the genome. When such mutations hit critical genetic elements, they perturb cellular signalling, resulting in overly proliferative cells. The availability of cheap sequencing technologies has led to large international efforts such as the International Cancer Genome Consortium for charting the genomic lesions leading to cancer. A revelation of these projects was an even greater genomic complexity of cancer genomes than previously anticipated: despite having the same disease, each patient harbours a unique constellation of mutations.

This complexity is both a challenge and an opportunity: a challenge to understand the underlying mechanisms of cancer development, and an opportunity to explain differences in therapy success and outcome. We have developed statistical models that relate different layers of genomic, molecular and clinical data in order to extract the precise connections among variables and so understand the link between genotype and phenotype. Moreover, we have been working on biostatistical models and informatics tools for predicting outcome based on comprehensive high-dimensional data sets.

Another area of research is the evolutionary dynamics of cancer. The process of developing cancer is driven by mutation and selection; hence the language to quantify that process is that of evolutionary dynamics. Deep sequencing unmasks the clonal composition of a cancer, which sheds some light on its evolutionary history. However, detecting subclonal mutations and reconstructing phylogenies reliably require accurate bioinformatics tools, which we are actively developing.

Contact Gerstung group

Goldman research group

Evolutionary tools for genomic analysis

Dr Nick Goldman's research group centres on three main research activities: developing new evolutionary models and methods; providing these methods to other scientists via stand-alone software and web services; and applying such techniques to tackle biological questions of interest. We participate in comparative genomic studies, both independently and in collaboration with others, including the analysis of next-generation sequencing (NGS) data. This vast source of new data promises great gains in understanding genomes and brings with it many new challenges.

Contact Goldman research group

Marioni research group

Computational and evolutionary genomics

Dr John Marioni's research group develops effective statistical and computational methods for analysing the vast amounts of data generated in high-throughput experiments. To gain a deeper understanding of complex biological processes such as gene regulation, the group develops computational methods for interrogating high-throughput genomics data. Their work focuses primarily on modelling variation in gene expression levels in different contexts: between individual cells from the same tissue; across different samples taken from the same tumour; and at the population level where a single, large sample of cells is taken from the organism and tissue of interest. Working with experimental colleagues within and beyond EMBL, the group applies their methods to biological questions ranging from the regulation of mammalian gene expression levels to brain development in a marine annelid.

Contact Marioni research group

Stegle research group

Statistical genomics and systems genetics

Dr Oliver Stegle's research group uses computational approaches to unravel the genotype-phenotype map on a genome-wide scale. Their work focuses on the development and use of statistical methodology to dissect the causes of molecular variation. The group has shown how comprehensive modelling can greatly improve the statistical power to find genetic associations with gene expression levels and provide for an enhanced interpretation of the interplay between genetic variation, transcriptional regulation and molecular traits. They address these methodological questions in the context of close collaborations with experimental groups, where they apply novel statistical tools to study molecular traits in model organisms, plant systems and biomedical applications.

Contact Stegle research group

Steinbeck research group

Small molecule metabolism in biological systems

Dr Christoph Steinbeck leads the Cheminformatics and metabolism service team, which runs a number of key services and develops algorithms to: process chemical information; predict metabolomes based on genomic and other information; determine the structure of metabolites by stochastic screening of large candidate spaces; and enable the identification of molecules with desired properties. This requires algorithms based on machine learning and other statistical methods for the prediction of spectroscopic and other physicochemical properties represented in chemical graphs. Dr Steinbeck also has a research group, which focuses on the understanding of the small-molecule metabolism of living organisms. The group is interested in the analysis of metabolomics experiments, including methods for computer-assisted structure elucidation of biological metabolites and metabolic pathways. They develop and maintain chemistry-related databases of biological interest, and develop machine-learning methods for the prediction of mass (MS) and nuclear magnetic resonance (NMR) spectra for use in dereplication and structure elucidation. The methods and algorithms developed in the group are available through an open-source library for structural chemo- and bioinformatics.

Contact Steinbeck research group

Thornton research group

Proteins: structure, function and evolution

Prof Dame Janet Thornton's research group seeks to understand more about how biology works at the molecular level, with a particular focus on proteins and their 3D structure and evolution. They explore how enzymes perform catalysis by gathering relevant data from the literature and developing novel software tools, which allow for the characterisation of enzyme mechanisms. In parallel, they investigate the evolution of these enzymes to discover how they can evolve new mechanisms and specificities. In close collaboration with colleagues at University College London (UCL), the group investigates ways to improve the prediction of function from sequence and structure, to enable the design of new proteins or small molecules with novel functions, and to understand more about the molecular basis of ageing in different organisms.

Contact Thornton research group