Samples, Phenotypes and Ontologies

The Samples, Phenotypes and Ontologies team, led by Helen Parkinson, is organised into two themes: BioSamples and Semantic Data Integration, and Mouse Informatics. The team is part of both the Molecular Archives Cluster and the Genes, Genomes and Variation Cluster. It provides ontologies, ontology tooling, and resources that give EMBL-EBI resources and external users access to samples and ontologies.

Mouse Informatics

Mouse Informatics, led by Coordinator Terry Meehan and Project Leads Nathalie Conte and Jeremy Mason, develops international infrastructure for mouse data archiving, integration and dissemination. The team is a partner on several international projects, including the International Mouse Phenotyping Consortium (IMPC), KOMP2, INFRAFRONTIER-I3PDXIntegrator and IPAD-MD. These externally funded projects share a common theme: integrating rich phenotypic data derived from model species to better understand gene function and how genetic variation contributes to disease.

BioSamples and Semantic Data Integration

The BioSamples and Semantic Data Integration activities of the team are led by Coordinator Tony Burdett. The group develops metadata- and ontology-rich resources, including the BioSamples database and the GWAS Catalog, as well as ontology and semantic integration tooling to support EMBL-EBI resources and external collaborators. Tony's team is also involved in building the infrastructure behind the Human Cell Atlas Data Coordination Platform ingestion services and collaborates with the EMBL-EBI Unified Submissions Interface project.


Melanie Courtot leads the BioSamples database (BioSD) team. BioSD is the centralised samples resource at EMBL-EBI. It accepts direct sample submissions from the community and aggregates samples from other resources. The bulk of BioSD’s data derives from EMBL-EBI reference resources such as the European Nucleotide Archive, which integrates sample-related data across data archives. This offers an opportunity to standardise metadata, detect related samples, and act as the authority for samples in downstream use while providing a consolidated view over them.

Semantic Data Integration

Simon Jupp leads the semantic data integration team, which delivers ontologies and tools such as the Ontology Lookup Service (OLS).


Members of the team work on ~15 projects with external collaborators, funded by the Chan Zuckerberg Initiative (CZI), the European Commission, the BBSRC, the Wellcome Trust, the National Institutes of Health, and Open Targets in collaboration with GSK, Biogen, Takeda and the Wellcome Trust Sanger Institute. These range from data analysis and generation projects such as IMPC to mainly infrastructural projects such as CORBEL and EXCELERATE. We work closely with ELIXIR as members of the Interoperability Platform.

In collaboration with partners in the KOMP2 project and the International Mouse Phenotyping Consortium, the team manages, analyses, and distributes complex phenotypic data from 20,000 knockout mouse lines and promotes mouse data integration internationally via the IMPC. The team also develops open-source software tools for managing data and for developing and integrating ontologies and data, and works with semantic web technologies.

The team actively develops ontologies, including the Gene Ontology and the Experimental Factor Ontology, and delivers the Ontology Lookup Service, an ontology search and cross-referencing tool.

We collaborate closely with Melissa Haendel and Chris Mungall of the Monarch Initiative, an integration project for genotype/phenotype data, by supplying human and mouse phenotypic data and using their tools in the delivery of the IMPC portal. We also consume the Uberon Ontology, a critical integrative resource for vertebrate anatomy used across EMBL-EBI's sample representation.

Annual reports

  • Parkinson team, 2012
  • Parkinson team, 2011
  • Parkinson team, 2010