Course at EMBL-EBI

Summer school in bioinformatics

This course provides an introduction to the use of bioinformatics in biological research, giving participants guidance for using bioinformatics in their work whilst also providing hands-on training in tools and resources appropriate to their research.

Participants will initially be introduced to bioinformatics theory and practice, including best practices for undertaking bioinformatics analysis, data management and reproducibility. To enable specific exploration of resources in their particular field of interest, participants will be divided into focused groups to work on a small project set by EMBL-EBI resource and research staff, ending in a presentation from each group on the final day of the course.

The course includes training and mentoring provided by experts from EMBL-EBI and external institutes.

Group projects

A major element of this course is a group project, where participants will be placed in small groups to work together on a challenge set by trainers from EMBL-EBI data resource and research teams.  This allows people to explore the bioinformatics tools and resources available in their area of interest and to apply these to a set problem, providing them with hands-on experience of relevance to their own research. The group work will culminate in a presentation session involving all participants on the final day of the course, giving an opportunity for wider discussion on the benefits and challenges of working with biological data.

Groups are mentored by the trainers who set the initial challenge, but active participation from all group members is expected. Groups are organised before the course, and all group members will be sent some short “homework” to prepare for their project work before the course starts.

The basic outlines of the projects on offer this year are given below. In your application you should indicate your first and second choice of project, based on your judgement of which would benefit your research most. Not all projects may be offered; final decisions on which projects will run during the course will be based on the number of applicants per project.

This year’s projects are as follows:

Networks and pathways

The project will make use of gene expression data (RNA-seq) to build protein-protein interaction networks which can be used to explore functional relationships between the (potentially) expressed protein products. You will use Cytoscape to visualise protein networks, identify key regulators of biological pathways and explore biological function through network analysis, integration and co-visualisation of additional data, and ontology/functional enrichment analysis, helping to build a better view of the wider biological context.

Metabolic network engineering using a systems model based approach

You will work with an example model of a set of metabolic pathways, taken from the BioModels database, and you will learn how to carry out computational analyses to find common patterns (i.e. sets of reactions) in the network. These might include computing feasible pathways through the network and minimal sets of reactions whose knock-out disables specific metabolic functions. Results will be visualised with an interactive graphical tool available as a web service.
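To make the two analyses named above more concrete, here is a minimal sketch in Python. It is not the method or tool used in the project: the five-reaction network, the reaction names and both helper functions are hypothetical, and each reaction is simplified to a single substrate and a single product, whereas real metabolic models are stoichiometric and far larger.

```python
from itertools import combinations

# Hypothetical toy network: each reaction turns one metabolite into another.
REACTIONS = {
    "R1": ("A", "B"),
    "R2": ("B", "D"),
    "R3": ("A", "C"),
    "R4": ("C", "D"),
    "R5": ("B", "C"),
}


def find_pathways(reactions, source, target):
    """Return every cycle-free route from source to target as a list of reaction names."""
    routes = []

    def walk(metabolite, used, visited):
        if metabolite == target:
            routes.append(list(used))
            return
        for name, (substrate, product) in sorted(reactions.items()):
            if substrate == metabolite and product not in visited:
                walk(product, used + [name], visited | {product})

    walk(source, [], {source})
    return routes


def minimal_knockouts(reactions, source, target):
    """Return the smallest sets of reactions whose removal blocks every route (brute force)."""
    names = sorted(reactions)
    for size in range(1, len(names) + 1):
        hits = []
        for removed in combinations(names, size):
            remaining = {n: r for n, r in reactions.items() if n not in removed}
            if not find_pathways(remaining, source, target):
                hits.append(set(removed))
        if hits:
            return hits
    return []


if __name__ == "__main__":
    print(find_pathways(REACTIONS, "A", "D"))
    print(minimal_knockouts(REACTIONS, "A", "D"))
```

In this toy network no single reaction lies on every route from A to D, so the smallest knock-out sets contain two reactions. The brute-force search is only practical for very small networks.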

Modelling cell signalling pathways

Curating models of biological processes is effective training in computational systems biology, through which curators gain integrated knowledge of biological systems, modelling and bioinformatics. You will learn to encode models of signalling pathways from a recent publication using COPASI and reproduce the simulation results. You will also learn to annotate models and to re-use pre-existing models from open repositories such as BioModels.

Proteomics (data analysis and functional annotation)

In this project, you will obtain real-life proteomics data from clinical tumour samples. Your task will be to process the raw data, analyse the results, and eventually interpret them in a wider context using the Open Targets Platform.

An introduction to deep learning through functional annotation of proteins

Automatically annotating protein sequences with functional information is vital in a world where sequences are produced so fast that humans can't keep up. In this project you will explore how deep learning can be used to enrich sequences automatically.

Single cell characterization of cell types and cell development

This project will make use of single cell RNA sequencing data (scRNA-Seq) to show how to: 1) quality control the sequencing data; 2) understand the variance in the data; 3) cluster the cell types; 4) understand the cell development; 5) find differentially expressed genes that determine the cell types or cell development. You will use data from the Human Cell Atlas and Tabula Muris to understand human and mouse cell types respectively.

Finding and extracting meaningful structural data from PDBe

This project will introduce you to the wealth of data available at PDBe and how this can be extracted to analyse macromolecular structures. You will firstly explore the search and entry pages at PDBe to identify the type of data available for analysis. Using this knowledge, you will then use and adapt template scripts in order to access this data programmatically and analyse a subset of your results. This project should give you the foundation of knowledge about how to access data through the PDBe API, and how you can analyse subsets of PDB data related to your field of expertise.
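As a rough illustration of what accessing this data programmatically can look like, here is a minimal Python sketch. It is not one of the project's template scripts. The endpoint URL and the response layout (a JSON object keyed by lower-case PDB ID, holding a list of records with `title` and `experimental_method` fields) are assumptions that should be checked against the PDBe API documentation, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed URL pattern for the PDBe REST API entry summary call.
PDBE_SUMMARY_URL = "https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/{pdb_id}"


def summary_url(pdb_id):
    """Build the request URL for one PDB entry (IDs are lower-cased)."""
    return PDBE_SUMMARY_URL.format(pdb_id=pdb_id.strip().lower())


def extract_summary(payload, pdb_id):
    """Pick a few fields out of a decoded summary response.

    Assumes the payload is a dict keyed by lower-case PDB ID whose value
    is a list of records; returns None if the entry is absent.
    """
    key = pdb_id.strip().lower()
    records = payload.get(key, [])
    if not records:
        return None
    record = records[0]
    return {
        "pdb_id": key,
        "title": record.get("title"),
        "methods": record.get("experimental_method", []),
    }


def fetch_summary(pdb_id, timeout=30):
    """Download and summarise one entry (requires network access)."""
    with urlopen(summary_url(pdb_id), timeout=timeout) as response:
        payload = json.load(response)
    return extract_summary(payload, pdb_id)
```

Keeping the parsing step (`extract_summary`) separate from the download step (`fetch_summary`) means the parsing can be tried out on a saved response without a network connection.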

Exploring variation data across human populations

Natural variation is required to generate the broad range of traits and phenotypes that exist between single individuals and between different populations. In this project you will explore the results of SNP-calling using web-based resources such as Ensembl Variant Effect Predictor. You will predict the functional consequences of variants between separate human populations and identify the variant(s) within your samples that have been associated with several interesting phenotypes.

Who is this course for?

This course is aimed at individuals working across biological sciences who have little or no experience in bioinformatics. Applicants are expected to be at an early stage of using bioinformatics in their research with the need to develop their skills and knowledge further. No previous knowledge of programming / coding is required for this course.

What will I learn?

Learning outcomes

After this course you should be able to:

  • Discuss applications of bioinformatics in biological research

  • Browse, search, and retrieve biological data from public repositories

  • Use appropriate bioinformatics tools to explore biological data

  • Comprehend some ways biological data can be stored, organised and interconverted

Course content

During this course you will learn about:

  • Bioinformatics as a science

  • Designing bioinformatics studies

  • Data management and reproducibility

  • Basic tools and resources for bioinformatics

The exact range of resources and tools covered will vary by project; there will be no opportunity for you to analyse your own data during this course.


Alex Bateman
Melissa Burke
Paolo Di Tommaso, Centre for Genomic Regulation, Spain
Evan Floden, Centre for Genomic Regulation, Spain
Alexandra Holinski
Nikiforos Karamanis
Lee Larcombe
Fabio Madeira
Sarah Morgan
Cedric Notredame, Centre for Genomic Regulation, Spain
Virginie Uhlmann
Mohamed Alibi
Peter McQuilton, Oxford e-Research Centre, UK
Anna Swan


Day 1 - Monday 24 June 2019
10:00 – 10:30 Registration and tea/coffee  
10:30 – 11:30 Welcome and introduction Alexandra Holinski
11:30 – 12:30 The science of bioinformatics Alex Bateman
12:30 – 14:00 Lunch + poster session  
14:00 – 15:30 Data visualisation 101: A practical introduction to designing scientific figures Niki Karamanis
15:30 – 16:00 Tea/coffee break  
16:00 – 17:30 Good data management: Making your data FAIR + Activity Peter McQuilton
17:30 – 18:00 Bedroom check-in
18:00 – 19:00 Networking drinks
19:00 Evening meal @ Hinxton Hall  
Day 2 - Tuesday 25 June 2019
09:00 – 10:00 An introduction to EMBL-EBI data resources Melissa Burke, Sarah Morgan
10:00 – 10:45 Introductory computational skills Mohamed Alibi
10:45 – 11:15 Tea/coffee break  
11:15 – 12:00 An introduction to EMBL-EBI webservices Fabio Madeira
12:00 – 13:30 Lunch + poster session  
13:30 – 15:30 Managing reproducible in silico analysis with Nextflow Evan Floden / Paolo Di Tommaso
15:30 – 16:00 Tea/coffee break  
16:00 - 18:00 Introduction to group projects and meet your mentors Sarah Morgan; all mentors
19:00 Evening meal @ Hinxton Hall  
Day 3 - Wednesday 26 June 2019
09:00 - 10:30 Group work  
10:30 - 11:00 Tea/coffee break  
11:00 - 12:30 Group work  
12:30 - 13:30 Lunch
13:30 - 14:15 Keynote lecture: A guided tour of parametric representations for the analysis of objects in bioimages Virginie Uhlmann
14:15 - 15:30 Group work  
15:30 - 16:00 Tea/coffee break  
16:00 - 17:30 Group work  
17:30 - 18:00 Mini tutorial: Machine Learning - a short introduction Anna Swan
19:00 Evening meal @ Hinxton Hall  
Day 4 - Thursday 27 June 2019
09:00 - 09:30 Group projects: 2-minute interim reports
09:00 - 10:30 Group work  
10:30 - 11:00 Tea/coffee break  
11:00 - 12:00 Group work  
12:00 - 12:30 Mini tutorial: Interpreting Integrated Data Lee Larcombe
12:30 - 13:30 Lunch  
13:30 - 14:30 Group work  
14:30 - 15:00 Tea/coffee break  
15:00 - 17:45 Group work  
17:45 - 18:30 Keynote lecture: Big Alignment, Big Phylogenies, Big Mess? Cedric Notredame
18:30 - 19:30 Pre Dinner Drinks  
19:30 prompt Silver Service Dinner  
19:30 Cash Bar  
Day 5 - Friday 28 June 2019
09:00 – 10:30 Preparation of group presentations  
10:30 – 11:00 Tea/coffee break  
11:00 – 12:00 Group presentation  
12:00 – 13:00 Lunch  
13:00 – 14:00 Group presentation  
14:00 – 15:00 Award ceremony, course feedback and wrap up  
15:30 Bus to train station  

Registration is handled through the Wellcome Trust Advanced Courses website.

Accommodation will be provided in the Wellcome Genome Campus Conference Centre Monday-Friday inclusive. Please contact the Conference Centre directly if you wish to arrange to stay additional nights. The course fee includes breakfast and evening meals at Hinxton Hall, as well as breaks and lunches outside the EMBL-EBI training rooms.

This course has ended

24 - 28 June 2019
European Bioinformatics Institute United Kingdom
Scientific conferences

  • Alexandra Holinski
  • Sarah Morgan
  • Cedric Notredame
    Centre for Genomic Regulation, Spain
