Protein Data Bank in Europe - Knowledge Base

PDBe-KB (Protein Data Bank in Europe - Knowledge Base) is a community-driven resource managed by the PDBe team, collating functional annotations and predictions for structure data in the PDB archive. PDBe-KB is a collaborative effort between PDBe and a diverse group of bioinformatics resources and research teams.

PDBe-KB currently includes projects such as SIFTS and FunPDBe, which aim to place structures from the PDB in their biological context.

PDBe-KB is funded by EMBL and BBSRC.

NEW Protein Pages 21 Mar 2019

PDBe entry pages traditionally focus on single PDB entries, with limited information on other entries related to the same macromolecule. The PDBe-KB team is proud to present a new type of entry page that focuses on full-length proteins instead of single PDB entries. Currently, you can use the PDB or UniProt identifier of a protein to display all the related PDB data, from the list of available PDB entries to functional annotations of ligand binding sites, macromolecular interaction interfaces, and related publications.


Here are some examples from the beta version of the protein pages:

Calpain and Calpastatin proteins (PDB 3BOW): https://pdbekb.org/proteins/3bow
CREB-binding protein (UniProt Q92793): https://pdbekb.org/proteins/Q92793
Histone-lysine N-methyltransferase 2A (UniProt Q03164): https://pdbekb.org/proteins/Q03164
Mediator of DNA damage checkpoint protein 1 (UniProt Q14676): https://pdbekb.org/proteins/Q14676
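
The example links above share one pattern: the base address followed by either a lower-case PDB identifier or a UniProt accession. A minimal Python sketch of building such a link is given here. The function name `protein_page_url` and the validation rules are illustrative assumptions, not part of PDBe-KB: the PDB pattern assumes the classic four-character identifier, and the UniProt pattern follows the accession format published by UniProt.

```python
import re

# Base address copied from the example links above.
PROTEIN_PAGE_BASE = "https://pdbekb.org/proteins/"

# Assumption: classic four-character PDB identifier (a digit, then three alphanumerics).
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

# Assumption: UniProt accession format as published by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def protein_page_url(identifier: str) -> str:
    """Return a protein page link for a PDB identifier or UniProt accession.

    Hypothetical helper: it only formats the link and does not check
    that the page actually exists.
    """
    ident = identifier.strip()
    if PDB_ID.match(ident):
        # The PDB example above uses a lower-case identifier.
        return PROTEIN_PAGE_BASE + ident.lower()
    if UNIPROT_ACCESSION.match(ident.upper()):
        # The UniProt examples above use upper-case accessions.
        return PROTEIN_PAGE_BASE + ident.upper()
    raise ValueError(f"Not a recognised PDB or UniProt identifier: {identifier!r}")


if __name__ == "__main__":
    for example in ("3BOW", "Q92793", "Q03164", "Q14676"):
        print(protein_page_url(example))
```

Checking the identifier shape first means a mistyped identifier fails early with a clear error instead of producing a link to a missing page.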


Projects

SIFTS

Structure Integration with Function, Taxonomy and Sequence (SIFTS) is a project in the PDBe-KB resource that provides residue-level mapping between UniProt and PDB entries. SIFTS also provides residue-level annotation from the IntEnz, GO, Pfam, InterPro, SCOP, CATH, PubMed, Ensembl and Homologene resources. The information is updated and released every week, at the same time as the release of new PDB entries, and is widely used by resources such as RCSB, PDBsum, Pfam, SCOP and InterPro.
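
To illustrate what a residue-level mapping means in practice, here is a minimal Python sketch that converts a PDB residue number into a UniProt sequence position. It is not the SIFTS implementation or data format: the `Segment` class, the `pdb_to_uniprot` function and the numbers in the example are all hypothetical. The sketch also assumes each aligned segment is gap-free with sequential integer residue numbers, whereas real mappings must handle unobserved residues, insertion codes and similar complications.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Segment:
    """One contiguous aligned stretch of a PDB chain (hypothetical layout)."""
    chain_id: str
    pdb_start: int   # first PDB residue number in the segment
    pdb_end: int     # last PDB residue number in the segment (inclusive)
    unp_start: int   # UniProt position aligned to pdb_start


def pdb_to_uniprot(
    segments: Sequence[Segment], chain_id: str, pdb_residue: int
) -> Optional[int]:
    """Map a PDB residue number to a UniProt position, or None if unmapped.

    Assumes gap-free segments, so the offset from the segment start is
    the same in both numbering schemes.
    """
    for seg in segments:
        if seg.chain_id == chain_id and seg.pdb_start <= pdb_residue <= seg.pdb_end:
            return seg.unp_start + (pdb_residue - seg.pdb_start)
    return None


if __name__ == "__main__":
    # Made-up segments for illustration only.
    example = [
        Segment(chain_id="A", pdb_start=1, pdb_end=100, unp_start=27),
        Segment(chain_id="A", pdb_start=105, pdb_end=150, unp_start=140),
    ]
    print(pdb_to_uniprot(example, "A", 10))   # inside the first segment
    print(pdb_to_uniprot(example, "A", 102))  # falls between segments
```

Returning `None` for unmapped residues keeps the difference between "no mapping" and "position 0" explicit for the caller.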

FunPDBe

The FunPDBe project integrates and makes available structural and functional annotations for macromolecular structure data in the Protein Data Bank (PDB). It is a collaboration between the Protein Data Bank in Europe (PDBe) and world-leading providers of structural bioinformatics data.

Co-factors

Cofactors are essential for many enzyme reactions. The Protein Data Bank (PDB) contains many enzyme structures, a large number of them bound to cofactors or cofactor-like molecules. In collaboration with the Thornton team, we have implemented a semi-automated annotation process that identifies such molecules in the PDB. The information is updated weekly with each PDB release and is stored in the PDBe database. The up-to-date information is made available via the PDBe REST API and query system, and can be visualised on the PDBe entry pages.

Expanding Genome3D

The goal of the Genome3D project is to provide predicted macromolecular structures for structurally uncharacterised protein sequences, thereby helping biologists exploit structural data to understand protein functions. One of the project's current objectives is to increase the volume of predicted structural data (i.e. mapping CATH and SCOP protein superfamilies and determining evolutionarily conserved residues) and to integrate it into the major central data repositories InterPro and PDBe.

Rfam mapping

Finding RNA molecules in the PDB has previously been difficult due to the lack of standard naming and classification, compared to proteins. Many RNA molecules are classified by Rfam, a database of non-coding RNA families based at the European Bioinformatics Institute (EMBL-EBI). We’ve worked together with our colleagues at Rfam to make use of their mappings of RNA molecules present in the PDB. As part of PDBe-KB, we are continuing to work on other features that will improve representation of RNA molecules at PDBe.

Integrating M-CSA data

M-CSA is a database of enzyme reaction mechanisms, maintained by the Thornton team, that is being integrated with PDBe-KB. M-CSA provides annotation on the protein, catalytic residues, cofactors, and the reaction mechanisms of hundreds of enzymes. There are two kinds of entries in M-CSA. 'Detailed mechanism' entries are more complete and show the individual chemical steps of the mechanism as schemes with electron flow arrows. 'Catalytic Site' entries annotate the catalytic residues necessary for the reaction, but do not show the mechanism.

Ligand component

Protein-ligand interactions are recorded and interactively displayed for all the bound molecules found in a structural assembly. All structures are pre-processed with ChimeraX to add hydrogens at chemically favourable positions, so that the interactions can be identified using the software Arpeggio. Arpeggio follows the nomenclature established by CREDO to identify various molecular interactions between pairs of atoms, including steric clashes and van der Waals, hydrogen-bond, polar, ionic and hydrophobic interactions, in addition to atom-plane and plane-plane interactions.

PepVEP

The PepVEP project aims to use existing services from UniProt, the EBI Variation team, the Thornton team and PDBe to implement an integrated platform for interpreting the functional effects of variants. The project concentrates mainly on developing a user interface that depicts variation data on protein sequence and structure. PDBe is adding new calls to its existing API to provide structural residue information for the platform, and aims to extend the sequence feature viewer to provide these data on PDBe pages.