Driving genomics: EMBL-EBI and GA4GH
- The Global Alliance for Genomics and Health (GA4GH) is developing and testing the data standards needed to bring genomics into the clinic and enable precision medicine. This work is being spearheaded by 15 real-world ‘Driver Projects’.
- EMBL-EBI is involved in three Driver Projects through its molecular archives (ENA, EGA and EVA), membership in ELIXIR and collaboration in the Human Cell Atlas data-coordination platform.
- GA4GH has delivered its first specification: a data-retrieval interface called ‘htsget’, developed with significant input from EMBL-EBI.
EMBL-EBI’s genomic data resources are among 15 formal collaborations launched in 2017 by the Global Alliance for Genomics and Health (GA4GH), the standards-setting body for genomics in healthcare. Other ‘Driver Projects’ include Genomics England, the Human Cell Atlas and the All of Us Research Program in the US. Chaired by EMBL-EBI Director Ewan Birney, GA4GH seeks to enable the responsible sharing of clinical-grade genomic data by 2022.
First standard delivered
‘Htsget’, an application programming interface (API) for genomic data retrieval developed with significant input from EMBL-EBI, is the first genomics standard under ‘GA4GH Connect’, the five-year strategic plan announced in October 2017.
Htsget enables researchers to download sequencing data in bulk, selecting only the most relevant sections of their genome of interest. Htsget builds on existing standards and is a practical interface for anyone working with genomic data. It saves significant storage and compute power by avoiding the need to download whole files.
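To make the idea of retrieving only a region of interest more concrete, here is a minimal sketch based on our reading of the published htsget specification, not an official client. It assumes a request takes the form of a URL naming a dataset and a genomic region, and that the server answers with a JSON 'ticket' listing the data blocks to download and join together. The server address, the dataset identifier and the example ticket are hypothetical, and the sketch makes no network calls.

```python
import json
from urllib.parse import urlencode


def build_reads_url(base_url, read_id, reference_name, start, end, fmt="BAM"):
    """Build an htsget-style request URL for one genomic region.

    The parameter names (format, referenceName, start, end) follow our
    reading of the htsget specification; base_url and read_id are
    hypothetical values supplied by the caller.
    """
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    return f"{base_url.rstrip('/')}/reads/{read_id}?{query}"


def ticket_blocks(ticket_json):
    """Return the (url, headers) pairs listed in a ticket, in order.

    A client would fetch each URL with the given headers and concatenate
    the results to obtain a valid file covering only the requested region.
    """
    ticket = json.loads(ticket_json)["htsget"]
    return [(block["url"], block.get("headers", {})) for block in ticket["urls"]]


# Hypothetical ticket, shaped like the JSON response described in the spec.
EXAMPLE_TICKET = json.dumps({
    "htsget": {
        "format": "BAM",
        "urls": [
            {"url": "https://data.example.org/sample1/header"},
            {"url": "https://data.example.org/sample1/body",
             "headers": {"Range": "bytes=100-200"}},
        ],
    }
})

request_url = build_reads_url(
    "https://htsget.example.org/", "sample1", "chr1", 1000000, 2000000
)
blocks = ticket_blocks(EXAMPLE_TICKET)
```

In this sketch the client asks for a single one-megabase region and receives two small blocks to download, rather than the whole file, which is where the savings in storage and compute come from.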
Until now, most human-genome datasets have been generated in research, but in future the vast majority are expected to be generated in healthcare. That is a paradigm shift for human genetics research.
GA4GH Driver Projects are established international genomic data initiatives. They were selected to identify, develop and pilot data-sharing frameworks and standards in real-world settings. This ensures GA4GH efforts are rooted in the immediate, practical needs of the scientific and clinical communities.
One Driver Project is enabling a ‘federated’ approach to data sharing in healthcare. It focuses on interoperability between databases through open, standardised formats and APIs. It combines the efforts of three public data resources at EMBL-EBI:
- The European Nucleotide Archive (ENA): an open data resource that provides experimental workflows for nucleotide sequencing.
- The European Genome–phenome Archive (EGA): a resource for secure, controlled-access, permanent archiving and sharing of human data – both genetic and phenotypic – produced in biomedical research projects (co-managed by EMBL-EBI and the Centre for Genomic Regulation, Barcelona).
- The European Variation Archive (EVA): an open-access database of all types of genetic variation data from all species.
These services already provide the infrastructure for sharing genetic data in research. Now, they are helping GA4GH create practical data-sharing standards for clinical research and healthcare around the world.
“Now, genomic data for healthcare is warehoused in many locations, in many countries. Our goal is to connect them, kind of like a postal service between data infrastructures – a secure, consistent delivery system that gets the right data to the right recipient,” says Thomas Keane, Team Leader for EGA infrastructure at EMBL-EBI and co-lead of the GA4GH Large Scale Genomics Work Stream.
Responsible data sharing
EMBL-EBI delivers essential services to the scientific community in the form of open data and tools for analysis. Its involvement in GA4GH will help it address the needs of new end-users and adapt its service delivery as the needs of clinical research communities change.
“Healthcare is harnessing the power of genomics to make better diagnoses and treatment decisions in rare disease and cancer across the world,” says Birney. “We have a responsibility to enable this future for everyone, and to harness the resulting data for further research on human health and fundamental biology.”
Read the GA4GH Roadmap here.
Birney E, Vamathevan J, Goodhand P (2017) Genomics in healthcare: GA4GH looks to 2022. bioRxiv (preprint). Published online 15 October; doi: 10.1101/203554
Videos and presentations
- GA4GH Connect: International genomics and health leaders speak on the benefit of data sharing
- Justin's Odyssey: A journey in rare disease data sharing
- GA4GH 5th Plenary video and slides: See what GA4GH leaders had to say at the launch of GA4GH Connect in October 2017.
GA4GH Driver Projects in 2017
- NIH All of Us Research Program
- Australian Genomics
- BRCA Challenge
- ELIXIR Beacon
- EMBL-EBI archives: ENA / EVA / EGA
- Genomics England
- Matchmaker Exchange
- Monarch Initiative
- NCI Genomic Data Commons
- Variant Interpretation for Cancer Consortium
- Human Cell Atlas
- TOPMed: Trans-Omics for Precision Medicine Program
If you are interested in getting involved, start by looking through the GA4GH Work Streams, which provide guidance to GA4GH projects on regulation and ethics, data security and standards development in genomics.