Online tutorial

Metagenomics bioinformatics

A practical introduction

Time to complete:

> 3 hours

This course includes:

  • Activities
  • Videos

Written by:

Last reviewed:

July 2021


Creative Commons

All materials are free cultural works licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, except where further licensing details are provided.


Metagenomics, the genomic analysis of microbial communities from samples like water and soil, involves high-throughput sequencing of microbial DNA and the collection, archiving and re-sharing of the resulting genomic data for taxonomic and functional analysis.

Who is this course for?

This online course ran virtually from 2 to 6 November 2020. The lecture sessions were recorded and are available here, along with the practical exercise guides. Unfortunately, we cannot provide compute resources or support for installing the software packages used during the course, but the Docker containers and GitHub repositories should help you set up your own environment to try out the computational parts of the course.

A full overview of the live course, including the programme, can be found here.

This course will cover the metagenomics data analysis workflow, starting from newly generated sequence data. Participants will explore the use of publicly available resources and tools to manage, share, analyse and interpret metagenomics data. The content will include issues of data quality control and how to submit to public repositories. While sessions will detail marker-gene and whole-genome shotgun (WGS) approaches, the primary focus will be on assembly-based approaches. Discussions will also explore considerations when assembling metagenomic data, the analysis that can be carried out by MGnify on such datasets, and what downstream analysis options and tools are available.

This course is aimed at life scientists who are working in the field of metagenomics and are currently in the early stages of data analysis. Participants should have some prior experience of using bioinformatics in their research.

The practical sessions in the course require a basic understanding of the Unix command line and the R statistics package. Participants might want to work through these free tutorials before attending the course:

What will I achieve?

By the end of the course you will be able to:

  • Conduct appropriate quality control and decontamination of metagenomic data and run simple assembly pipelines on short-read data
  • Utilise public datasets and resources to identify relevant data for analysis
  • Apply appropriate tools in the analysis of metagenomic data
  • Submit metagenomics data to online repositories for sharing and future analysis
  • Apply relevant knowledge in strain resolution and comparative metagenomic analysis to your own research

What resources do I need?

To run the MGnify practicals from the course, you will need to use several Docker containers. These containers include all the tools and data you need for the MGnify practicals to work. All you need to do is download and run the relevant container on your local machine, so you do not need to install the individual software programmes yourself. The course materials and software page has more information.
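
As a rough illustration of what "download and run the relevant container" can look like, the sketch below builds the usual `docker pull` and `docker run` commands in Python. The image name, the host folder and the `/data` mount point are hypothetical placeholders, not the course's actual values; the real container names are listed on the course materials and software page.

```python
import shlex

# Hypothetical placeholder: replace with the image name given on the
# course materials and software page.
IMAGE = "example/mgnify-practical:latest"


def docker_commands(image, host_dir, mount_point="/data"):
    """Return the `docker pull` and `docker run` commands as argument lists."""
    pull = ["docker", "pull", image]
    run = [
        "docker", "run",
        "--rm",                             # remove the container when it exits
        "-it",                              # interactive terminal session
        "-v", f"{host_dir}:{mount_point}",  # share a host folder with the container
        image,
    ]
    return pull, run


if __name__ == "__main__":
    # Print the commands rather than executing them, so nothing is
    # downloaded until you have checked the image name.
    for cmd in docker_commands(IMAGE, "/path/to/course_data"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them before running them in a terminal; passing each list to `subprocess.run` would execute them directly, assuming Docker is installed.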

DOI: 10.6019/TOL.Metagenomics-t.2019.00001.1

Course contents