Statistical analysis offers powerful approaches to mine complex, high-dimensional datasets, find significant features, and build prediction models with high predictive performance. Because of the large number of methods available, their complex mathematical background, and the potential pitfalls of bias and overfitting, it is essential to have a good understanding of each step and an environment in which the whole workflow can be managed efficiently.
In this webinar, we will see how to build a statistical workflow within the user-friendly Galaxy environment, including: normalization (correction of signal drift and batch effects), quality control, univariate hypothesis testing, multivariate modelling with (Orthogonal) Partial Least Squares, and feature selection (with Partial Least Squares, Random Forest and Support Vector Machines).
Our example dataset, the Sacurine study (MTBLS404), can be downloaded from the MetaboLights repository. The study aims to discover physiological variations of the human urine metabolome with age, body mass index, and gender.
We will be using the PhenoMeNal platform. We will also see how data from the MetaboLights repository can be uploaded directly into Galaxy workflows. Additional statistical modules and public analyses are available on the Workflow4Metabolomics platform.
This webinar took place on 18th April 2018 and was presented by Etienne Thévenot. You can watch the recording and download the slides in Train online.
This webinar is aimed at metabolomics researchers and bioinformaticians. An undergraduate-level understanding of biology and metabolomics would be an advantage.