Virtual course
Systems biology: From large datasets to biological insight
This course, run with Wellcome Connecting Science, covers the use of multi-omics data and methodologies in systems biology. The content will explore a range of approaches, from network inference to machine learning, that can be used to extract biological insights from varied data types. Together these techniques will provide participants with a useful toolkit for designing new strategies to extract relevant information and understanding from large-scale biological data.
This course is motivated by advances in computer science and high-performance computing that have led to groundbreaking developments in systems biology model inference. With a comparable increase in publicly available, large-scale biological data, the challenge now lies in interpreting these data in a biologically valuable manner. Machine learning approaches are likewise making a significant impact on the analysis of large omics datasets and the extraction of useful biological knowledge.
Virtual course
Participants will learn via a mix of pre-recorded lectures, live presentations, and trainer Q&A sessions. Practical experience will be developed through group activities and trainer-led computational exercises. Live sessions will be delivered using Zoom with additional support and communication via Slack.
Pre-recorded material will be made available to registered participants before the start of the course, and there will be a brief induction session the week before the course. Computational practicals will run on EMBL-EBI's virtual training infrastructure, meaning participants will not need access to a powerful computer or have to install complex software on their own machines; however, a small app will need to be installed to access the compute infrastructure.
Participants will need to be available between 09:30 and 17:30 BST on each day of the course. Trainers will be available to assist, answer questions, and further explain the analysis during these times.
Who is this course for?
This course is aimed at advanced PhD students and post-doctoral researchers who are currently working with large-scale omics datasets with the aim of discerning biological function and processes. Ideal applicants should already have some experience (ideally 1-2 years) working with systems biology or related large-scale multi-omics data analyses.
Applicants are expected to have a working knowledge of the Linux operating system and the ability to use the command line. Experience of using a programming language (e.g. Python) is highly desirable. While the course will make use of simple coding or streamlined approaches such as Python notebooks, higher levels of competency will allow participants to focus on the scientific methodologies and how they can be applied in their own research, rather than on the practical aspects of coding.
We recommend these free tutorials:
- Basic introduction to the Unix environment:
- Introduction and exercises for Linux:
- Python tutorial:
- R tutorial:
Regardless of your current knowledge, we encourage successful applicants to use these and other materials to prepare for the course and for future work in this area.
What will I learn?
Learning outcomes
After the course you should be able to:
- Discuss and apply a range of data integration and reduction approaches for large-scale omics data
- Apply different approaches to explore omics data at the network level
- Describe the principles behind different machine learning methods and apply them to omics datasets to extract biological knowledge
- Infer biological models using statistical methods
- Identify strengths and weaknesses of different inference approaches
- Compare signal propagation through logic modelling vs diffusion-based approaches
Course content
The course will include lectures, discussions, and practical computational exercises covering the following topics:
- Data reduction and data integration methods – including comparisons of major approaches through lectures and practical exercises
- Machine and deep learning – practical exercises on supervised machine learning, including classification and regression, and deep learning
- Functional inference from omics data – approaches to extract signatures of cell state from omics data, including transcription factor activation and kinase activity states, and to extract upstream signaling pathways from transcriptomics datasets
- Network inference and signal propagation – network inference approaches from omics data
- Introduction to executable modeling – including how to fit omics data to executable and predictive logic models
Trainers
EMBL-EBI
University College London
Wellcome Trust Sanger Institute
University of Tartu
EMBL-EBI
EMBL-EBI
EMBL-EBI
EMBL-EBI
University of Salamanca
Heidelberg University
Eindhoven University of Technology
TH Köln University of Applied Sciences & ZB MED Information Centre for Life Sciences
Ludwig-Maximilians University Munich
Heidelberg University
Programme
Day 1 - Monday 21 June
Data reduction and batch effects
09:30-09:45 | Arrival and networking hangout | |
09:45-10:00 | Welcome and introduction to virtual training | |
10:00-10:30 | Introductions and networking | All |
10:30-11:00 | Data reduction practical | Evangelia Petsalaki & Girolamo Giudice |
11:00-11:15 | Break | |
11:15-12:30 | Batch effects recap, Q&A, practical | Ioannis Kamzolas |
12:30-13:30 | Break | |
13:30-14:30 | Keynote talk, Q&A | Nataša Pržulj |
14:30-15:00 | Flash talks | All |
15:00-15:30 | Open discussion | All |
15:30-16:00 | Networking | All |
16:00 | End of day | |
Day 2 - Tuesday 22 June
Machine learning
09:30-09:45 | Arrival and hangout | |
09:45-10:00 | Group activity | All |
10:00-10:30 | Machine learning Q&A | Konrad Foerstner |
10:30-10:45 | Break | |
10:45-12:30 | Machine learning practical | Konrad Foerstner |
12:30-13:30 | Break | |
13:30-14:00 | Introduction to deep learning Q&A | Leo Parts & Dmytro Fishman |
14:00-15:45 | Deep learning practical | Leo Parts & Dmytro Fishman |
15:45-16:00 | Break | |
16:00-16:30 | Flash talks | |
16:30 | End of day | |
Day 3 - Wednesday 23 June
Data integration
09:30-09:45 | Arrival and hangout | |
09:45-10:00 | Group activity | All |
10:00-10:30 | Integration using Cytoscape Q&A | Kalpana Paneerselvam & Birgit Meldal |
10:30-10:45 | Break | |
10:45-12:30 | Integration using Cytoscape practical | Kalpana Paneerselvam & Birgit Meldal |
12:30-13:30 | Break | |
13:30-15:30 | Integration using MOFA & JIVE practical | Manik Garg |
15:30-15:45 | Break | |
15:45-16:30 | Integration using MOFA & JIVE practical | Manik Garg |
16:30-17:00 | Flash talks | |
17:00 | End of day | |
Day 4 - Thursday 24 June
Network inference and signal propagation
09:30-09:45 | Arrival and hangout | |
09:45-10:00 | Group activity | All |
10:00-10:30 | Network inference Q&A | Federica Eduati & Javier De Las Rivas |
10:30-10:45 | Break | |
10:45-12:30 | Network inference practical | Javier De Las Rivas |
12:30-13:30 | Break | |
13:30-14:30 | Basics of logic modeling practical | Federica Eduati |
14:30-15:15 | Flash talks | |
15:15-15:30 | Break | |
15:30-16:30 | Keynote talk, Q&A | Julio Saez Rodriguez |
16:30 | End of day | |
Day 5 - Friday 25 June
Signal propagation
09:30-09:45 | Arrival and hangout | |
09:45-10:00 | Group activity | All |
10:00-12:00 | Data analysis to logic modeling practical | Aurelien Dugourd |
12:00-13:00 | Break | |
13:00-14:00 | A replication crisis in methodological computational research | Anne-Laure Boulesteix |
14:00-14:30 | Final discussion session | Scientific Organisers |
14:30-15:00 | Wrap-up & feedback | Dayane Araujo |
Registration for this course will be handled through the Wellcome Trust Genome Campus Courses & Conferences website or by clicking on the "Apply now" button.
Participant flash talks
All participants will be asked to give a short presentation about their research work as part of the course. These talks give participants an opportunity to share their research with one another and provide a forum for discussion. Further details will be provided following registration.