This article was originally published on the EMBL news site on 15 June 2015.
EMBL scientists are funded by the public to produce top-notch research, and the organisation’s open-access policy makes the resulting publications freely available for everyone to read. Furthermore, a Creative Commons Attribution (CC-BY) open-access licence enables innovation in text analytics and text mining, which will be vital for helping researchers discover relevant content efficiently in an increasingly interdisciplinary environment.
“Recognition for influential work is part of the social fabric of science,” explains Jo McEntyre, Head of Literature Services at EMBL-EBI. “Scientists, institutes and funders all want to be acknowledged when they have played a role, so having research outputs accessible without barrier – and properly attributed – is essential. But the open-access policy is about much more than that – it also encourages people to make their research reusable, so it can be explored and re-analysed in different contexts as new methods and technologies come online.”
EMBL-EBI runs Europe PMC, a publicly accessible literature resource that holds over three million full-text articles from peer-reviewed journals and 30 million abstracts from PubMed, PubMed Central, Agricola, patent offices and many other sources. All the articles in Europe PMC can be searched in full and read freely by anyone in the world. They are also discoverable via major search engines.
“Why should we remain stuck in this mindset, thinking that we’ll all get around to reading every paper?” posits McEntyre. “We all need to discover things from the literature – and that might be something big and obvious, but equally it could be just an aside in someone’s paper. We need to make the whole process of exploring and consuming the scientific literature much more efficient, and text mining and analytics are a big part of that.”
‘CC BY’ and the right to reuse
All of the articles in Europe PMC can be read by anyone, anywhere, any time. But open-access papers with a Creative Commons Attribution licence (CC-BY) or similar are much more useful, because they can be reused.
“We make full-text articles with the appropriate licence available for download, which means that text-analytics researchers can experiment with it,” says McEntyre. “Having a CC-BY licence makes it possible to do computation on swathes of articles, which is fundamentally important if we’re going to scour the whole of the literature for relevance and serendipitous discovery.”
This becomes particularly pertinent in scientific curation work, an activity at the heart of EMBL-EBI services that involves highly skilled scientists extracting biological facts from the literature to be incorporated into various data resources. Curation adds significant value, making data easier to use, distilling knowledge about a given concept into accessible interfaces.
Text analytics that operate on large collections of articles like Europe PMC could support curators, giving them the time and flexibility they need to focus on the trickier aspects of curation that only humans can do.
How do people use text mining with Europe PMC? One example is the Protein Data Bank in Europe (PDBe), a public data resource that helps people study protein structures. Recently, PDBe started to include figures and captions from Europe PMC articles, providing much-needed context for structural data records. This gives researchers a rich, multi-dimensional view of complex information, saving time and improving understanding.
“From the Europe PMC perspective, open access is not only about articles being free to read, but also about making the outputs of science more discoverable by linking the article narrative to related information,” explains McEntyre. “For example, data presented in figures can be linked to other core public data resources, or to other sources like spreadsheets and raw images. Things like this really add value and make the literature more useful over the longer term.”
How EMBL's Open Access policy works
A healthy percentage of Europe’s life-science researchers cycle through EMBL at some point in their career, whether it’s to run a research group or simply to work on a project as a visiting scientist. EMBL scientists conduct their research independently, and are free to choose which journals are best for publishing their work.
The open-access policy does not restrict publication decisions, but it does require that, at minimum and wherever the work is published, the results be made available in Europe PMC within six months of publication.
“Most journals have open-access tracks and will also deposit the ‘final author version’ on behalf of the researcher on PubMed Central or Europe PMC,” says Tobias Sack, EMBL alumnus and former head librarian in the Szilárd Library. “We’ve made arrangements with publishers, including Elsevier, Wiley and Macmillan, to make sure that EMBL-supported work is published open-access. They’re happy to support it and in principle they support pushing the content directly to Europe PMC on EMBL’s behalf.”
The Szilárd Library at EMBL Heidelberg is on hand to help researchers with any question concerning the policy or depositing articles on Europe PMC in the event that the service is not available through the publisher. Detailed information about the open access scheme can be found on the Library’s webpage.
Open Access: a commitment to the public good
EMBL’s open access policy is a commitment to the public good. It makes science more open, scalable and sustainable, pooling an extremely diverse mix of results from thousands of journals covering every life-science speciality. EMBL was one of the first signatories of the Hague Declaration on Knowledge Discovery in the Digital Age, which promotes ethical research practice, legislative reform and the development of open access policies and infrastructure.
Open access has been embraced by many of the world’s major science funders, some of which make funding conditional on the publication of results under open-access licences. To make this requirement widely visible, many of Europe PMC’s 28 funders – including the European Research Council, the World Health Organization and the Biotechnology and Biological Sciences Research Council – link grant information to articles to make it easier for anyone to see the links between funding and results. By running Europe PMC, together with the University of Manchester and the British Library, EMBL bolsters the open-access movement in ways that go beyond a single institutional policy.
“Many of the Intellectual Property laws around scientific publishing were established well before the web and certainly before bioinformatics came on the scene,” explains Iain Mattaj, Director General of EMBL. “In many countries these laws create a barrier to analysis, although this has changed in the UK and is beginning to change across the EU as a whole. It really is a moral issue when a huge proportion of the world’s scientists can simply not access new research at all. We have an obligation to address both of these barriers – not just EMBL, but the global scientific community.”