PDB passes 100,000 structure milestone

The Worldwide Protein Data Bank (wwPDB) today announces that it has released to the community its 100,000th molecular structure. Established in 1971, this central, public archive of experimentally determined protein and nucleic acid structures has reached this important milestone thanks to the efforts of structural biologists throughout the world.

Function follows form
In the 1950s, scientists had their first real look at the structures of proteins and DNA at the atomic level. The determination of these early three-dimensional structures by X-ray crystallography inspired a new era in biology. The value of archiving and sharing these data was indisputable, and in 1971 the Protein Data Bank (PDB) was established as an international collaboration with sites in the US and the UK.

One of the first structures deposited in the PDB was that of myoglobin, an oxygen-binding molecule whose structure was elucidated by Nobel laureates John Kendrew and Max Perutz in 1958. This week the PDB welcomes 219 new structures into the archive. These structures join others vital to pharmacology, bioinformatics and education, and take the total number in the archive to 100,147. 

The PDB is growing swiftly: it has doubled in size since 2008 and releases around 200 new structures to the scientific community every week. The resource is accessed hundreds of millions of times every year by researchers, students, and educators wishing to explore how different proteins are related to one another, to clarify biological mechanisms and to develop new medicines.

“The PDB is a critical resource for the international community of working scientists which includes everyone from geneticists to pharmaceutical companies interested in drug targets,” said Nobel laureate Venki Ramakrishnan of the MRC Laboratory of Molecular Biology in Cambridge, UK. 

Figure 1 - Number of structures available in the PDB per year through May 14, 2014, with selected examples. Early structures included myoglobin (1; PDB ID 1mbn), the first structure solved by X-ray crystallography, and small enzymes (2; top: 4pti, bottom right: 2cha, bottom left: 3cpa). As technologies developed, the archive grew to host examples of tRNA (3; 6tna), viruses (4; 4rhv), antibodies (5; 1igt), protein-DNA complexes (6; top to bottom, 1j59, 1tro, 2bop, 1aoi), ribosomes (7; 1fjg, 1fka, 1ffk), and chaperones (8; 1aon). 

 

A growing community
Since its inception, the PDB has been a community-driven archive, evolving into a critical international resource for biological research. Since 2003 the Worldwide PDB (wwPDB), a collaboration between the US, UK and Japan, has ensured that these valuable data continue to be stored, managed and kept freely available for the benefit of scientists worldwide. The wwPDB partner sites work closely with community experts to define deposition and annotation policies and validation standards, and to resolve data representation issues. In addition, the wwPDB works to raise the profile of structural biology with increasingly broad audiences.

Each structure submitted to the archive is carefully curated by wwPDB staff before release. New depositions are checked and enhanced with value-added annotations and cross-linked with other important biological data to ensure that PDB structures are discoverable and interpretable by users with a wide range of backgrounds and interests. 

Future challenges
The scientific community eagerly awaits the next 100,000 structures and the knowledge these will undoubtedly bring. However, the increasing number, size and complexity of biological data being deposited in the PDB, together with the emergence of hybrid methods, which use a variety of biophysical, biochemical, and modelling techniques to determine the shapes of biologically relevant molecules, present major challenges for the management and presentation of structural data. The wwPDB will continue to work with the community to meet these challenges and to ensure that the archive maintains the highest possible standards of quality, integrity and consistency.

About wwPDB
The wwPDB (http://wwpdb.org) is the international partnership that manages the PDB archive. Its mission is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community. It consists of the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB; http://rcsb.org), based at Rutgers, The State University of New Jersey, and at the San Diego Supercomputer Center (SDSC) and Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego; BioMagResBank (BMRB; http://bmrb.wisc.edu) at the University of Wisconsin, also in the USA; the Protein Data Bank in Europe (PDBe; http://pdbe.org) at the EMBL European Bioinformatics Institute; and the Protein Data Bank Japan (PDBj; http://pdbj.org) at Osaka University.

The RCSB PDB receives funds from the NSF, NIH and DOE. The PDBe receives funding from EMBL, the Wellcome Trust, NIH, EU, BBSRC and MRC. PDBj is funded by the Japan Science and Technology Agency, and BioMagResBank by NLM.