Virtual course

Systems biology: From large datasets to biological insight

This course, run with Wellcome Connecting Science, covers the use of multi-omics data and methodologies in systems biology. The content will explore a range of approaches, from network inference to machine learning, that can be used to extract biological insights from varied data types. Together these techniques will provide participants with a useful toolkit for designing new strategies to extract relevant information and understanding from large-scale biological data.

This course is motivated by advances in computer science and high-performance computing that have led to groundbreaking developments in systems biology model inference. With a comparable increase in publicly available, large-scale biological data, the challenge now lies in interpreting these data in a biologically valuable manner. Likewise, machine learning approaches are making a significant impact on the analysis of large omics datasets and the extraction of useful biological knowledge.

Virtual course

Participants will learn via a mix of pre-recorded lectures, live presentations, and trainer Q&A sessions. Practical experience will be developed through group activities and trainer-led computational exercises. Live sessions will be delivered using Zoom, with additional support and communication via Slack.

Pre-recorded material will be made available to registered participants prior to the start of the course, and there will be a brief induction session the week before the course. Computational practicals will run on EMBL-EBI's virtual training infrastructure, meaning participants will not require access to a powerful computer or need to install complex software on their own machines; however, a small app will need to be installed to access the compute infrastructure.

Participants will need to be available between 09:30 and 17:30 BST each day of the course. Trainers will be available to assist, answer questions, and further explain the analysis during these times.

Who is this course for?

This course is aimed at advanced PhD students and post-doctoral researchers who are currently working with large-scale omics datasets with the aim of discerning biological functions and processes. Applicants should already have some experience (ideally 1-2 years) working with systems biology or related large-scale multi-omics data analyses.

Applicants are expected to have a working knowledge of the Linux operating system and the ability to use the command line. Experience of using a programming language (e.g. Python) is highly desirable. While the course will make use of simple coding or streamlined approaches such as Python notebooks, higher levels of competency will allow participants to focus on the scientific methodologies and how they can be applied in their own research, rather than on the practical aspects of coding.

We recommend these free tutorials:

Regardless of your current knowledge, we encourage successful participants to use these and other materials to prepare for attending the course and for future work in this area.

What will I learn?

Learning outcomes

After the course you should be able to:

  • Discuss and apply a range of data integration and reduction approaches for large-scale omics data
  • Apply different approaches to explore omics data at the network level
  • Describe principles behind different machine learning methods and apply them to omics datasets to extract biological knowledge
  • Infer biological models using statistical methods
  • Identify strengths and weaknesses of different inference approaches
  • Compare signal propagation through logic modelling vs diffusion-based approaches

Course content

The course will include lectures, discussions, and practical computational exercises covering the following topics:

  • Data reduction and data integration methods – including comparisons of major approaches through lectures and practical exercises
  • Machine and deep learning – practical exercises on supervised machine learning, including classification and regression, and deep learning
  • Functional inference from omics data – approaches to extract signatures of cell state from omics data, including transcription factor activation and kinase activity states, and to extract upstream signaling pathways from transcriptomics datasets
  • Network inference and signal propagation – network inference approaches from omics data
  • Introduction to executable modeling – including how to fit omics data to executable and predictive logic models


Girolamo Giudice
Nataša Pržulj
University College London
Leo Parts
Wellcome Trust Sanger Institute
Dmytro Fishman
University of Tartu
Kalpana Paneerselvam
Birgit Meldal
Manik Garg
Evangelia Petsalaki
Javier De Las Rivas
University of Salamanca
Julio Saez Rodriguez
Heidelberg University
Federica Eduati
Eindhoven University of Technology
Konrad Förstner
TH Köln University of Applied Sciences & ZB MED Information Centre for Life Sciences
Anne-Laure Boulesteix
Ludwig-Maximilians University Munich
Aurelien Dugourd
Heidelberg University


Day 1 - Monday 21 June
Data reduction and batch effects
09:30-09:45 Arrival and networking hangout  
09:45-10:00 Welcome and introduction to virtual training  
10:00-10:30 Introductions and networking All
10:30-11:00 Data reduction practical Evangelia Petsalaki & Girolamo Giudice
11:00-11:15 Break  
11:15-12:30 Batch effects recap, Q&A, practical Ioannis Kamzolas
12:30-13:30 Break  
13:30-14:30 Keynote talk, Q&A Nataša Pržulj
14:30-15:00 Flash talks All
15:00-15:30 Open discussion All
15:30-16:00 Networking All
16:00 End of day  
Day 2 - Tuesday 22 June
Machine learning
09:30-09:45 Arrival and hangout  
09:45-10:00 Group activity All
10:00-10:30 Machine learning Q&A Konrad Förstner
10:30-10:45 Break  
10:45-12:30 Machine learning practical Konrad Förstner
12:30-13:30 Break  
13:30-14:00 Introduction to deep learning Q&A Leo Parts & Dmytro Fishman
14:00-15:45 Deep learning practical Leo Parts & Dmytro Fishman
15:45-16:00 Break  
16:00-16:30 Flash talks  
16:30 End of day  
Day 3 - Wednesday 23 June
Data integration
09:30-09:45 Arrival and hangout  
09:45-10:00 Group activity All
10:00-10:30 Integration using Cytoscape Q&A Kalpana Paneerselvam & Birgit Meldal
10:30-10:45 Break  
10:45-12:30 Integration using Cytoscape practical Kalpana Paneerselvam & Birgit Meldal
12:30-13:30 Break  
13:30-15:30 Integration using MOFA & JIVE practical Manik Garg
15:30-15:45 Break  
15:45-16:30 Integration using MOFA & JIVE practical Manik Garg
16:30-17:00 Flash talks  
17:00 End of day  
Day 4 - Thursday 24 June
Network inference and signal propagation
09:30-09:45 Arrival and hangout  
09:45-10:00 Group activity All
10:00-10:30 Network inference Q&A Federica Eduati & Javier De Las Rivas
10:30-10:45 Break  
10:45-12:30 Network inference practical Javier De Las Rivas
12:30-13:30 Break  
13:30-14:30 Basics of logic modeling practical Federica Eduati
14:30-15:15 Flash talks  
15:15-15:30 Break  
15:30-16:30 Keynote talk, Q&A Julio Saez Rodriguez
16:30 End of day  
Day 5 - Friday 25 June
Signal propagation
09:30-09:45 Arrival and hangout  
09:45-10:00 Group activity All
10:00-12:00 Data analysis to logic modeling practical Aurelien Dugourd
12:00-13:00 Break  
13:00-14:00 A replication crisis in methodological computational research Anne-Laure Boulesteix
14:00-14:30 Final discussion session Scientific Organisers
14:30-15:00 Wrap-up & feedback Dayane Araujo

Registration for this course will be handled through the Wellcome Trust Genome Campus Courses & Conferences website or by clicking on the "Apply now" button.

Participant flash talks

All participants will be asked to give a short presentation about their research work as part of the course. These talks give participants an opportunity to share their research with one another and provide a forum for discussion. Further details will be provided following registration.

This course has ended

21 - 25 June 2021
Advanced Courses

  • Dayane Araujo
  • Evangelia Petsalaki
  • Konrad Förstner
    TH Köln University of Applied Sciences & ZB MED Information Centre for Life Sciences, Germany
  • Federica Eduati
    Eindhoven University of Technology, Netherlands

