Pathogen data sharing – key to pandemic preparedness

The COVID-19 pandemic was the first major stress test for using genome sequencing as a new tool to understand and monitor a pathogen that can spread globally at lightning speed.
Illustration: a globe made of human figures, surrounded by floating viruses and bacteria, representing the idea of sharing pathogen data.
Credit: Adobe Stock image edited by Karen Arnott

Despite the significant efforts of scientists and public health officials, the global response was imperfect for several reasons, including friction in global data sharing.

A group of scientists at EMBL’s European Bioinformatics Institute (EMBL-EBI) and ELIXIR have now published an opinion piece that aims to inform the ongoing discussion around pathogen genomics data sharing. The article, published on the open-access repository Zenodo, outlines a set of principles for pathogen genomic data sharing that would allow researchers and public health authorities to make full use of new genome sequencing technologies.

Read the opinion piece on Zenodo.

Public health and research applications

Genomic sequencing is a versatile technology that can be applied in different ways to explore a range of questions about pathogens. Sequencing can resolve variation in pathogen genomes, allowing public health scientists to track outbreaks, cluster cases and investigate sources. The same datasets can also give researchers mechanistic details of the pathogen's biology, drug resistance and infection processes, and yield insights that inform the design of therapies and vaccines.

An optimal public health response to a disease outbreak requires the latest outputs from research; at the same time, driving research forward requires data from public health sequencing activities. The authors argue that, taken together, these two applications of the data will greatly strengthen our ability to manage infectious disease in the future.

Foundations for better data sharing

According to the authors, genomic sequencing is not yet fully used as a shared platform by public health and research. They see this as a missed opportunity and say that much is lost by not integrating the two workflows better.

While they acknowledge the barriers and sensitivities around the issue, they make the point that data sharing for public health, especially in the case of a pandemic, is most effective at a global level and with full openness. Where urgency is lower, data sharing can happen later to protect the research opportunities of data producers in lower-resource settings.

To address these issues in the longer term, the scientific community must seek better ways to protect data providers and those involved in curation and processing prior to the publication of research results in the literature. Appropriate recognition and credit to all who contribute to the chain of events from raw data sharing to the release of processed data will help to redress the balance.

Other elements that should be taken into account are the urgency of data sharing, the types of data, the added value of bioinformatics for data analysis, and finally, the need for appropriate governance to be in place.

The authors propose the following principles for data sharing:

  • Pathogen genomics data sharing is a shared interest and responsibility for public health and research communities
  • Pathogen genomics data have the fullest value as a comprehensive set rather than as fragmented collections
  • Open pathogen genomics data linking is required to provide context to pathogen genomics data
  • All pathogen genomics data should be as open as possible, as early as possible, with the urgency of data sharing for different applications reflected in best practice
  • Data standards should be developed and maintained by the scientific communities that provide and consume pathogen genomics data
  • Open pathogen genomics data sharing is a foundation for a more equitable global approach to infectious disease
  • Data generation and provision into the shared data set is a scientific and technical contribution invaluable to downstream outcomes and impacts
  • Pathogen data sharing must take place under a governance framework of transparency, accountability and mutual benefit

Pandemic preparedness is complex and goes far beyond data sharing. Looking back at the start of the COVID-19 pandemic, however, it is easy to see how crucial pathogen genomic data were for the research community, and where the gaps lay in the global data sharing landscape. Bridging these gaps is essential if we want the global response to the next pandemic to be faster, more coordinated and more equitable.

Find out more

COCHRANE, G., et al. (2022). Pathogen genomics data sharing: public health meets research. Zenodo. Published online 18 March. DOI: 10.5281/zenodo.6368839.

Tags: data sharing, genomics, pathogen, public health