Performing end-to-end molecular QTL analysis with the eQTL Catalogue workflows
In this section, we perform an end-to-end molecular QTL analysis with the eQTL Catalogue workflows. The aim of molecular quantitative trait locus (molQTL) analysis is to identify genetic variants associated with molecular traits such as gene expression levels, transcript usage or splice-junction usage.
The analysis is run with Nextflow. If you want to familiarise yourself with the platform, you can follow the ‘Introduction to Nextflow’ tutorial.
This analysis involves multiple complex steps that can be logically separated from each other.
- First, raw genotype data from genotyping microarrays needs to be quality controlled and imputed against the latest reference panel. This can be done with the eQTL-Catalogue/genimpute workflow.
- Second, molecular traits need to be quantified from the raw sequencing data. For RNA-sequencing-based traits such as gene expression levels or splice-junction usage, we present the eQTL-Catalogue/rnaseq workflow in this tutorial. For other molecular traits such as chromatin accessibility or histone modifications, we recommend the excellent workflows developed by the nf-core community, such as nf-core/atacseq or nf-core/chipseq.
- Once the molecular traits have been quantified, they need to be normalised and standardised for association testing with linear regression. This typically involves adjusting for the sequencing read coverage of each sample and other covariates, log transformation of the data and further inverse normal transformation to reduce the impact of outliers. For the RNA-seq-based traits, these steps have been implemented in the eQTL-Catalogue/qcnorm workflow presented in step 3 of the tutorial.
- Finally, we need to test for associations between normalised molecular traits and imputed genetic variants using the eQTL-Catalogue/qtlmap workflow presented in step 4 of the tutorial.
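To make the last two steps more concrete, here is a minimal Python sketch of the general idea: adjust read counts for sequencing coverage, log-transform, apply a rank-based inverse normal transformation, and then regress the normalised trait on genotype dosage. It is only an illustration and not the implementation used by eQTL-Catalogue/qcnorm or eQTL-Catalogue/qtlmap. The function names, the use of counts per million as the coverage adjustment, the pseudocount and the toy data are all assumptions made for this example.

```python
import math
from statistics import NormalDist, mean


def log_cpm(count, library_size, pseudocount=1.0):
    """Coverage-adjusted log2 expression: log2(counts per million + pseudocount).

    Counts per million is an assumed stand-in for the coverage adjustment.
    """
    return math.log2(count / library_size * 1e6 + pseudocount)


def inverse_normal_transform(values):
    """Rank-based inverse normal transform; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # (rank - 0.5) / n keeps every quantile strictly between 0 and 1.
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


def regression_slope(x, y):
    """Ordinary least squares slope of y on x (effect per alternative allele)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical toy data: one gene and one variant measured in six samples.
raw_counts = [120, 340, 95, 800, 410, 150]
library_sizes = [2.0e7, 3.1e7, 1.8e7, 4.0e7, 2.9e7, 2.2e7]
dosages = [0, 1, 0, 2, 1, 0]  # copies of the alternative allele

log_expression = [log_cpm(c, n) for c, n in zip(raw_counts, library_sizes)]
normalised = inverse_normal_transform(log_expression)
beta = regression_slope(dosages, normalised)
print(f"estimated effect size (beta): {beta:.3f}")
```

Because the inverse normal transformation depends only on ranks, a single sample with extreme expression cannot dominate the regression. A real analysis would also include covariates and compute standard errors and p-values, which this sketch omits.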
To finish, watch the webinar below, ‘Modular and reproducible workflows for federated molecular QTL analysis’, to learn more.