The rise of open proteomics
The ProteomeXchange consortium has launched a public portal for exchanging proteomics data and information. The integrating service highlights a growing trend in the sharing of data generated in mass-spectrometry experiments.
Proteomics data resources such as PRIDE and PeptideAtlas have accepted submissions for many years, but until recently they acted independently, with limited global coordination. This placed a burden on researchers, who had to work out which repository to choose and which type of data – raw or processed – to share. Data consumers, in turn, found it difficult to find supporting data, or to untangle whether a dataset from one resource had been integrated into another.
ProteomeXchange, an EU-funded international consortium of proteomics researchers, data providers and publishers, aims to improve the integration of public proteomics repositories. By implementing common guidelines, the consortium is greatly facilitating the assessment, reuse, comparative analyses and extraction of new findings from published data.
The portal provides a welcome boost to the long-standing efforts of scientific journals and funding agencies. Users can now easily view data and information at different levels of processing, all linked by a universal, shared identifier. In addition, authors are now given a traceable ProteomeXchange accession number for datasets reported in their publications.
How ProteomeXchange works
Existing proteomics repositories like PRIDE and PeptideAtlas capture and annotate mass spectra (i.e. raw data), peptide identifications and quantification values, protein identifications and protein ratios. Knowledge bases (e.g. UniProt, neXtProt) abstract the resulting biological information and present it in a wider context. ProteomeXchange integrates all these independent resources within a single framework, maximising reusability of proteomics data.
Researchers who submit to participating repositories can share data with journal editors and reviewers during peer review. After the paper is accepted, the data are publicly released and an announcement is made through a public RSS feed.
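For readers who want to follow such announcements programmatically, the sketch below shows one way to read items from an RSS 2.0 feed using only the Python standard library. It is an illustration under stated assumptions, not the consortium's own tooling: the sample feed, its titles, links and dates are invented placeholders, and the function name parse_announcements is hypothetical. A real feed may carry additional or differently named fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 sample; every title, link and date here is an
# invented placeholder and does not describe a real dataset.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example dataset announcements</title>
    <item>
      <title>Placeholder dataset A</title>
      <link>http://example.org/dataset/A</link>
      <pubDate>Wed, 01 Jan 2014 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Placeholder dataset B</title>
      <link>http://example.org/dataset/B</link>
      <pubDate>Thu, 02 Jan 2014 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_announcements(feed_xml):
    """Return one dict per <item> in an RSS 2.0 document.

    Missing child elements are returned as empty strings.
    """
    root = ET.fromstring(feed_xml.strip())
    announcements = []
    # In RSS 2.0, items sit under rss/channel/item.
    for item in root.findall("./channel/item"):
        announcements.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return announcements


if __name__ == "__main__":
    for entry in parse_announcements(SAMPLE_FEED):
        print(entry["published"], "|", entry["title"], "|", entry["link"])
```

To use this against a live feed, the XML text would first be downloaded (for example with urllib.request) and then passed to the same function; the feed address is deliberately left out here because it is not given in this article.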
Comments from consortium members
Juan Antonio Vizcaíno, ProteomeXchange project manager at EMBL-EBI, says: “Our goal is to heighten the visibility of proteomics data and promote their uptake by different facets of the research community. Simply having a common identifier space for proteomics data vastly improves their citability and traceability, which are important over the longer term.”
Eric Deutsch, head of PeptideAtlas at the Institute for Systems Biology in the U.S., says: “I’m certain that the ProteomeXchange system is already leading to greater awareness and reuse of publicly available datasets. I’m regularly contacted by people who want to explore new data they’ve heard about through our alert system, and am pleased to see so much growth in this area of science.”
Henning Hermjakob, head of Proteomics Services at EMBL-EBI and a founding member of the ProteomeXchange consortium, says: “ProteomeXchange is a key step in overcoming the fragmentation of proteomics data. I am confident that as more resources join ProteomeXchange, these valuable, public datasets will return even more benefits to research and lead to important insights in biology.”
How to join
Individual resources can join ProteomeXchange by implementing the ProteomeXchange data submission and dissemination guidelines and metadata requirements. These are available at www.proteomexchange.org/concept.
Since it began accepting submissions in June 2012, ProteomeXchange has received more than 700 datasets. These can be centrally accessed at http://proteomecentral.proteomexchange.org
Vizcaíno, J.A., Deutsch, E.W., Wang, R., et al. (2014) ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nature Biotechnology 32, 223-226. DOI: 10.1038/nbt.2839.