Raw read pre-processing, quality control, and read mapping

All data and files for this section can be found in the EBI training FTP.

Overview of NGS technology

Trainers: Chiara Batini

Overview: This lecture gives an overview of NGS technologies, offering both a theoretical and a historical perspective on sequencing technology and data.

Learning outcomes:

By the end of this session you will be able to:

  • List the main technologies used to sequence genes and genomes

Materials:


Introduction to UNIX

Trainers: Kayesha Coley

Overview: This lecture introduces UNIX, Linux and the command line, covering the basic commands required for working in a terminal window.

Learning outcomes:

By the end of this session you will be able to:

  • Navigate UNIX subsystems using the command line in a terminal window

Materials:


Quality control

Trainers: Charles Solomon

Overview: This lecture and practical give an overview of sequencing file formats and how to interpret sequencing quality before proceeding to downstream processing.

Learning outcomes:

By the end of this session you will be able to:

  • List the file formats common within sequencing experiments and pipelines
  • Interpret the quality of sequencing experiments’ initial results
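
As a small illustration of what interpreting sequencing quality involves, the sketch below decodes the quality line of a FASTQ record into Phred scores. It assumes the common Phred+33 encoding; the function names and the example record are hypothetical and are not taken from the course materials.

```python
# Minimal sketch: decode FASTQ quality characters into Phred scores.
# Assumption: qualities use the Phred+33 (Sanger / Illumina 1.8+) encoding.

def phred_scores(quality_line, offset=33):
    """Return the Phred score of each base in a FASTQ quality string."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Return the mean Phred score across a read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores)


def error_probability(q):
    """Convert a Phred score Q to a base-call error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Hypothetical four-line FASTQ record: header, sequence, separator, qualities.
record = ["@read1", "ACGT", "+", "II5!"]

print(phred_scores(record[3]))   # per-base Phred scores
print(mean_quality(record[3]))   # mean quality of the read
print(error_probability(20))     # error probability at Q20
```

A Phred score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and 30 to 1 in 1000, which is why quality reports are often read against those thresholds.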

Materials:


Read mapping

Trainers: Charles Solomon and Chiara Batini

Overview: This lecture covers the use of fastq/fastqc files in alignment to a reference genome, the SAM/BAM output files, and the downstream processes that follow. The practical provides training in how to use samtools, bwa, picard and other tools to align reads to a reference genome.

Learning outcomes:

By the end of this session you will be able to:

  • Describe the usage of sequencing output files in alignment to a reference genome
  • Use various tools to align reads to a reference genome
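
To give a feel for the SAM output that alignment produces, the sketch below reads the first mandatory fields of one alignment line and decodes its FLAG value. The bit values follow the SAM format specification; the short label names, the function names and the example line are hypothetical and are not taken from the course materials.

```python
# Minimal sketch: parse one SAM alignment line and decode its FLAG field.
# Bit values come from the SAM specification; the labels are illustrative.

SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of every bit set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


def parse_sam_line(line):
    """Extract the first six mandatory fields of a SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),    # 1-based leftmost mapping position
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }


# Hypothetical alignment line with the 11 mandatory tab-separated fields.
example = "read1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII"
record = parse_sam_line(example)
print(record["rname"], record["pos"], decode_flag(record["flag"]))
```

In practice samtools handles this decoding; the sketch only shows what the FLAG column encodes.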

Materials: