Course at EMBL-EBI

Genome bioinformatics: from short- to long-read sequencing

A guide to the technology, analysis workflows, tools, and resources for next generation sequencing data analysis.

This course will provide insight into, and training in, how biological knowledge can be derived from genomics experiments, and will explain different approaches to analysing such data. The main focus will be on introducing sequence informatics, re-sequencing, the differences between short- and long-read sequencing, and variant calling in the analysis of higher eukaryotes, with an emphasis on human genetic research. Throughout the week, more advanced topics will introduce the creation of pipelines, automation, and the scaling-up of analyses.

Practical sessions will be run on datasets prepared by the trainers, not on personal research data. Participants will learn how to process these training datasets and to apply appropriate statistical methods in their analyses.

Who is this course for?

The course is aimed at PhD students and post-doctoral researchers who are starting to use high-throughput sequencing technologies and bioinformatics methods in their research. The content is most applicable to those working with eukaryotic genomes, especially in the area of human genomics.

Participants will require a basic knowledge of the Unix command line in order to adequately complete the practical sessions. A short pre-course session will be offered. Additionally, we recommend this free tutorial or other similar ones: 

Please note that participants without basic knowledge of the Unix command line will have difficulty completing the practical sessions.

 

What will I learn?

Learning outcomes

After the course participants will be able to:

  • State the advantages and limitations of short- and long-read sequencing technologies
  • Apply appropriate quality control (QC) methods and aligners to unassembled short and long reads
  • Perform variant calling analysis and annotation
  • Scale up and automate simple genomics pipelines
  • Access genomic datasets from online public resources

Course content

During this course you will learn about: 

  • Quality control methods for cleaning raw sequencing data
  • Alignment of reads to a reference genome
  • File format conversion and processing
  • Tools for variant calling (both single nucleotide and copy number analysis)
  • Approaches for scaling up analyses and for reproducible research
  • Methodologies for variant annotation
  • Resources for genomic data:

Trainers

  • Chiara Batini, University of Leicester
  • Kayesha Coley, University of Leicester
  • Erik Garrison, University of Tennessee Health Science Center
  • Mohab Helmy Abdelfattah Mostafa Elbishbishy, EMBL-EBI
  • Maira Ihsan, EMBL-EBI
  • Sean Laidlaw, Wellcome Sanger Institute
  • Raheleh Rahbari, Wellcome Sanger Institute
  • Dona Shaju, EMBL-EBI
  • Malvika Sharan, The Alan Turing Institute
  • Charles Solomon, University of Leicester
  • Maxime Tarabichi, Institut de Recherche Interdisciplinaire en Biologie Humaine et Moléculaire (IRIBHM)

This course has ended

20 – 24 November 2023
European Bioinformatics Institute
United Kingdom
£825.00, inclusive of four nights' accommodation and catering, including dinner
Contact
Sophie Spencer

Organisers
