Fuelling discovery together: 2024 user survey learnings

Credit: Karen Arnott/EMBL-EBI

Blog post by Eleni Tzampatzopoulou, EMBL-EBI Impact Manager

In summer 2024, EMBL-EBI ran a user survey, inviting our community to let us know how they use the open data resources we jointly manage with our collaborators. With over 2,300 responses from 126 countries, we were delighted by the depth and breadth of the feedback provided, and the richness of discoveries empowered by the resources managed at EMBL-EBI. 

We are extremely grateful to you, our user community, for sharing your insights. The survey helps us understand the usage and impact of the data resources managed by EMBL-EBI and collaborators, allowing us to continuously improve and to meet our users’ needs.

89% of respondents reported that EMBL-EBI data resources empowered them to undertake work that would otherwise not have been possible.

In return for users taking the time to fill in our survey, and in the spirit of openness, we would like to share some of the key findings and trends, offering a snapshot of how our data resources are making a difference across a diverse and ever-growing user base.

“A key point is that the data resources we co-manage with our collaborators are used worldwide by an increasing number and breadth of users,” said Jo McEntyre, Deputy Director of EMBL’s European Bioinformatics Institute (EMBL-EBI). “Although we’re aware of the limitations of surveys, it’s encouraging to consistently see the many ways in which open data resources support research and discovery, from everyday computational analyses to training the next generation AI models and beyond.”

Geographic distribution of the respondents of the 2024 EMBL-EBI user survey, illustrating EMBL-EBI’s global user base.

A broad spectrum of resources supporting a broad spectrum of needs

In 2023, EMBL-EBI data resources received 33 billion web requests from 36 million unique IP hosts; that is 101 million web requests on an average day. One standout finding from the survey is the variety of ways in which EMBL-EBI-managed resources are being used. It’s not only the large genomics resources that play a central role in scientific research, but also those for structural biology, electron microscopy (EM) and ontologies.

73% of survey respondents said recreating or regenerating the data they currently access through EMBL-EBI would not be practical.

For 73% of survey respondents (N = 1,843), recreating or regenerating the data they currently access through EMBL-EBI would simply not be practical. This highlights the breadth of the data resources managed by EMBL-EBI and the unique value they bring in lowering operational, financial and technical barriers to using biodata.

Beyond the bench: The evolving landscape of EMBL-EBI users

This year’s survey suggests that the EMBL-EBI resources have an increasing number of users from industry, academia and not-for-profit organisations across the healthcare, pharmaceutical, agritech and biotechnology sectors. This points to the increasing relevance of open data resources in applied and translational contexts.

Between 2015 (N = 4,185) and 2024 (N = 2,331), the share of respondents working in the government, hospital and not-for-profit sectors rose from 8% to 16%. The share of respondents in the commercial sector increased from 9% to 12% over the same period.

We also observe a rise in usage among clinical practitioners and experts in science-adjacent roles, such as policy advisors, research managers, and learning and development specialists. For instance, genomic resources play an important role in understanding the genetic underpinnings of disease. Teaching assistants and training officers highlighted the importance of public datasets for teaching machine learning techniques.

Trends in sector representation of the respondents of the EMBL-EBI user surveys in 2015, 2020 and 2024. The graph highlights consistently broad use of EMBL-EBI resources across academia, with a gradual rise in engagement from industry and other sectors such as hospitals, governmental services and not-for-profit organisations.

Where biodata meets AI innovation

Over half of our respondents (58%, N = 1,822) indicated that datasets and tools managed by EMBL-EBI and collaborators formed the foundation for scientific advancements, from training machine learning (ML) models to driving innovative commercial products. In recent years we have seen a steady rise in the use of EMBL-EBI data resources for artificial intelligence and machine learning applications. For example:

  • ChEMBL and UniProt are considered core training datasets for ML applications in drug discovery and protein function prediction.
  • EMPIAR and EMDB datasets are leveraged to refine AI-driven image recognition models for cryo-EM analysis. 
  • Resources such as the Ontology Lookup Service (OLS) enable programmatic access to data, facilitating automated workflows that integrate ontologies into ML pipelines.

This highlights that EMBL-EBI is not only supporting traditional research but is also driving the evolution of modern science by contributing to the training of AI algorithms. Our resources contribute to faster discoveries and more efficient processes across a variety of scientific and industrial applications. Many of these advancements would be nearly impossible without open and FAIR data resources.

Trends in the nature of work of the respondents across the EMBL-EBI surveys in 2015, 2020 and 2024. The chart shows a steady representation of wet lab and dry lab users, alongside an increase in the share of respondents involved in clinical practice or science-adjacent roles, from 10% in 2015 (N = 4,172) to 23% in 2024 (N = 1,862).

Modern science runs on biodata

The overarching takeaway is that open data resources have become indispensable for everyday scientific practice. Of the respondents, 77% (N = 1,882) said that losing access to EMBL-EBI-managed resources would disrupt their work, delay critical research and, in some cases, render their projects infeasible altogether.

Many users also highlighted that EMBL-EBI-managed resources saved them time and effort, enabling them to focus on higher-value tasks. Some 89% (N = 1,887) of respondents reported that EMBL-EBI data resources empowered them to undertake work that would otherwise not have been possible.

The increasing reliance on public data resources, such as those managed by EMBL-EBI and its trusted collaborators, underlines the critical need for sustainable support of research infrastructures. As supporters of the Global Biodata Coalition (GBC) Open Letter Campaign, we join the global scientific community in advocating for long-term investments in open resources that benefit humanity by advancing scientific discovery through biodata.

Thank you for being part of our journey

The feedback of our users reaffirms the vital role biodata plays in modern science across all settings, from academic labs and teaching rooms to hospitals and start-ups. It also reminds us of the responsibility we carry to ensure these resources remain accessible, reliable and relevant.

We thank you for your continued support and for sharing your insights. Your input helps us not only demonstrate our impact but also shape the future of EMBL-EBI’s data resources and tools to better support you.

Tags: bioinformatics, embl-ebi, FAIR data, open data, survey