100,000 structures

14 May 2014 - 09:59

Hinxton, Osaka, Madison, New Brunswick and San Diego, 14 May 2014 – The Worldwide Protein Data Bank (wwPDB) has released its 100,000th molecular structure to the scientific community. Established in 1971, this central, public archive of experimentally determined protein and nucleic acid structures has reached this important milestone thanks to the efforts of structural biologists throughout the world. 

Function follows form

In the 1950s, scientists had their first real look at the structures of proteins and DNA at the atomic level. The determination of these early three-dimensional structures by X-ray crystallography inspired a new era in biology. The value of archiving and sharing these data was indisputable, and in 1971 the Protein Data Bank (PDB) was established as an international collaboration with sites in the US and the UK.

One of the first structures deposited in the PDB was that of myoglobin, an oxygen-binding molecule whose structure was elucidated by Nobel laureates John Kendrew and Max Perutz in 1958. This week the PDB welcomes 219 new structures into the archive. These structures join others vital to pharmacology, bioinformatics and education, and take the total number in the archive to 100,147.

The PDB is growing swiftly: it has doubled in size since 2008 and releases around 200 new structures to the scientific community every week. The resource is accessed hundreds of millions of times every year by researchers, students and educators wishing to explore how different proteins are related to one another, to clarify biological mechanisms and to develop new medicines.

“The PDB is a critical resource for the international community of working scientists, which includes everyone from geneticists to pharmaceutical companies interested in drug targets,” says Nobel laureate Venki Ramakrishnan of the MRC Laboratory of Molecular Biology in Cambridge, UK.

A growing community

Since its inception, the PDB has been a community-driven archive, evolving into a critical international resource for biological research. Since 2003 the Worldwide PDB (wwPDB), a collaboration between the US, UK and Japan, has ensured that these valuable data continue to be stored, managed and kept freely available for the benefit of scientists worldwide. The wwPDB partner sites work closely with community experts to define deposition and annotation policies, data representation issues and validation standards. In addition, the wwPDB works to raise the profile of structural biology with increasingly broad audiences.

Each structure submitted to the archive is carefully curated by wwPDB staff before release. New depositions are checked and enhanced with value-added annotations and cross-linked with other important biological data to ensure that PDB structures are discoverable and interpretable by users with a wide range of backgrounds and interests.

Future challenges

The scientific community eagerly awaits the next 100,000 structures and the knowledge these will undoubtedly bring. However, the increasing number, size and complexity of biological data being deposited in the PDB and the emergence of hybrid methods, which use a variety of biophysical, biochemical and modelling techniques to determine the shapes of biologically relevant molecules, all present major challenges for the management and presentation of structural data. The wwPDB will continue to work with the community to meet these challenges and to ensure that the archive maintains the highest possible standards of quality, integrity and consistency.

Notes for editors

About the wwPDB

The wwPDB is the international partnership that manages the PDB archive. Its mission is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community. It comprises: the Protein Data Bank in Europe (PDBe) at EMBL-EBI; the Protein Data Bank Japan (PDBj) at Osaka University; and the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) in the USA at Rutgers State University of New Jersey, the San Diego Supercomputer Center (SDSC), Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego, and BioMagResBank (BMRB) at the University of Wisconsin. Funding: PDBe receives funding from EMBL, the Wellcome Trust, NIH, EU, BBSRC and MRC. PDBj is funded by the Japan Science and Technology Agency. The RCSB PDB receives funds from the NSF, NIH and DOE. BioMagResBank is funded by NLM.


Europe: Gary Battle, Protein Data Bank in Europe, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK; +44 1223 49-4654, battle@ebi.ac.uk

United States: Christine Zardecki, RCSB Protein Data Bank, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA; +1 848 445 0103, info@rcsb.org

Japan: Nahoko Haruki, Protein Data Bank Japan, Institute for Protein Research, Osaka University, Japan; +81 6 6879 4311, nahokoh@protein.osaka-u.ac.jp

