Course materials

Federated data analysis

From finding data to data analysis following FAIR principles

These materials include:

  • Videos
  • Practicals
Published: 23 February 2023
Language: English

In association with: CINECA

Creative Commons

All materials are free cultural works licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, except where further licensing details are provided.

The aim of this learning pathway is to explain and demonstrate how to perform federated analysis of human cohort data.


Using these materials

This learning pathway is intended for two groups of end users: 

  • Researchers in biomedical or life sciences who want to access human research data and carry out federated analysis on it
  • Bioinformatics tool and software developers who want synthetic datasets for demonstrating and testing a tool

Before starting the learning pathway, we recommend that you have some knowledge of or experience with Galaxy and with performing bioinformatic analyses. If you are not familiar with Galaxy, you can check this short Introduction to Galaxy tutorial.

These materials provide a mixture of explanations, videos and tutorials to help advance your knowledge and skills in the federated analysis of human cohort data. Go to the Introduction page to get started on the learning pathway, or select a topic in the left-hand navigation once you have started.

The average time to read through the main body of this learning pathway is 1 to 3 hours. Allow additional time for the exercises and external links. The time may vary depending on your prior knowledge and how you choose to work through the course. You do not have to complete the full pathway in one session.

If you would like to provide feedback on this set of course materials, please use the form on the Your feedback page.

This learning pathway has been developed as part of the CINECA project. The CINECA project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 825775 and from the Canadian Institutes of Health Research under grant agreement No. 404896.


Material collection authors

  • Kaur Alasoo (Institute of Computer Science, University of Tartu)
  • Ana Alonso Ayuso (Centre for Genomic Regulation)
  • Mallory Freeberg (EMBL-EBI)
  • Lauren Fromont (Centre for Genomic Regulation)
  • Teresa García Lezana (Centre for Genomic Regulation)
  • Melanie Goisauf (BBMRI-ERIC)
  • Saskia Hiltemann (Erasmus MC)
  • Thomas Keane (EMBL-EBI)
  • Mikael Linden (CSC – IT Center for Science)
  • Marta Lloret Llinares (EMBL-EBI)
  • Gemma Milla Martín (Centre for Genomic Regulation)
  • Emmanuelle Rial-Sebbag (BBMRI-ERIC)
  • Daniel Thomas Lopez (EMBL-EBI)

Learning outcomes

By the end of this learning pathway you should be able to:

  • Cite different resources for finding cohort data
  • Describe the process of obtaining authorisation for using access-controlled human cohort datasets
  • Describe the process of importing datasets for analysis using GA4GH passports after authorisation, on the command line or using Galaxy
  • Implement the workflow that you will use for your federated data analysis 
  • Apply the main ethical, legal and social implications (ELSI) aspects you should consider when working with human cohort data
  • Explain the characteristics and advantages of working with synthetic cohort data

Material collection editors

  • Daniel Thomas Lopez, EMBL-EBI
  • Marta Lloret Llinares, EMBL-EBI

DOI: 10.6019/TOL.FederatedData-t.2023.00001.1