Parallel processing, version control and open, reproducible science
Scaling things up
Trainers: Sean Laidlaw and Raheleh Rahbari
Overview: This lecture gives an overview of methods for processing multiple biological datasets, including sequential and parallel computing. The practical provides training in parallelizing processes for genomic data processing and analysis.
Learning outcomes:
By the end of this session you will be able to:
- List the steps required to conduct sequential and parallel computing.
- Know how to apply sequential and parallel computing for the processing and analysis of genomic data.
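As a rough illustration of the difference between the two approaches, the sketch below computes GC content for a handful of DNA sequences, first one after another and then across a pool of workers. It is not taken from the course practical: the function names, the toy sequences, and the choice of Python's standard `concurrent.futures` module are all assumptions made for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def gc_sequential(sequences):
    """Process each sequence one after another in a single process."""
    return [gc_content(seq) for seq in sequences]


def gc_parallel(sequences, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Distribute sequences across a pool of workers.

    executor.map() returns results in the same order as the inputs,
    so the output lines up with gc_sequential().
    """
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(gc_content, sequences))


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = ["ATGC", "GGCC", "ATAT", "gcgcat"]
    print(gc_sequential(toy))
    print(gc_parallel(toy))
```

For inputs this small the parallel version is likely to be slower, because starting worker processes has a cost; splitting work across workers tends to pay off only when each task is substantial, such as processing one sample or one chromosome per worker.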
Materials:
Introduction to GitHub
Trainers: Sean Laidlaw and Raheleh Rahbari
Overview: This lecture provides a theoretical and historical overview of version control using Git, and gives technical guidance on Git, GitHub and the open-source GitLab. The practical provides training on making Git commits on the command line.
Learning outcomes:
By the end of this session you will be able to:
- Define the concepts within version control using Git.
- Apply version control using Git on the command line.
Materials:
The Turing Way and reproducible research aspects of data science
Trainers: Malvika Sharan
Overview: This lecture provides an introduction to reproducible, ethically-led open science and the Turing Way.
Learning outcomes:
By the end of this session you will be able to:
- List steps scientists can take to make their research more reproducible and ethically-led in open science frameworks.
Materials: