
How IMPC data is generated: data processing workflow

The IMPC generates phenotypic data by analysing mice using a standardised set of protocols designed to learn as much as possible about different aspects of biology in a high-throughput fashion (Figure 2). After validation and quality control (QC) checks, ETL (Extract, Transform, Load), statistical, and annotation pipelines are applied to identify mutant phenotypes and associate those phenotypes with specific genes. Human disease associations are then generated from these results. Finally, the data are prepared for release and made available on the IMPC website.

Mouse clinics across the world generate data that then go to the data coordination center for validation and QC checks, drawing on the GenTaR and IMPReSS resources. The preliminary data then go to the core data archive for statistical analysis. The output is used to generate human disease associations with the PhenoDigm algorithm. Finally, the data release is displayed on mousephenotype.org, where all data are freely available to users.

Figure 2. Processing workflow.

For more details on data generation, explore the relevant section of the tutorial "The International Mouse Phenotyping Consortium: Finding phenotypes for your gene of interest".

In this tutorial you will learn how to access the following programmatically:

Raw experimental data
Statistical results
Hits with significant p-values and assigned mp_terms
Images
Human disease association data
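
As a rough orientation, each of the data types listed above can be thought of as a separate Solr "core" that is queried over HTTP. The sketch below only builds a query URL and does not contact any server. The base URL, the core names, and the `marker_symbol` field are assumptions made for illustration, as is the helper `build_query_url`; check them against the IMPC API documentation before use. The parameters `q`, `rows`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumed base URL of the IMPC Solr service (verify against the IMPC documentation).
BASE_URL = "https://www.ebi.ac.uk/mi/impc/solr"

# Assumed mapping from the data types in this tutorial to Solr core names.
CORES = {
    "raw experimental data": "experiment",
    "statistical results": "statistical-result",
    "significant hits": "genotype-phenotype",
    "images": "impc_images",
    "human disease associations": "phenodigm",
}


def build_query_url(core, query="*:*", rows=10):
    """Build a Solr select URL for one core (hypothetical helper)."""
    params = {
        "q": query,    # Solr query string, e.g. field:value
        "rows": rows,  # maximum number of documents to return
        "wt": "json",  # ask for a JSON response
    }
    return f"{BASE_URL}/{core}/select?{urlencode(params)}"


# Example: significant phenotype hits for one gene symbol (field name assumed).
url = build_query_url(CORES["significant hits"], query="marker_symbol:Car4", rows=5)
print(url)
```

The resulting URL could be passed to any HTTP client. Keeping URL construction separate from the request makes it easy to inspect a query before sending it.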

Accessing Mouse Phenotypes and Disease Associations with the IMPC Solr API