PROCOGNATE



Help

About PROCOGNATE

The database contains an assignment of PDB ligands to the domains of enzyme structures classified by CATH, SCOP and Pfam. Cognate ligands have been identified using data from the ENZYME and KEGG databases and compared to the PDB ligands using graph matching to assess chemical similarity. Cognate ligands from the known reactions in ENZYME and KEGG for a particular enzyme are then assigned to enzyme structures that have EC numbers.

This procedure involves two steps: first, we assign the binding of particular ligands to particular domains; second, we compare the chemical similarity of the PDB ligands to ligands in KEGG in order to assign cognate ligands.


Domain - Ligand Assignment

To produce the cognate ligand mapping, we first assign the binding of the PDB ligands to specific domains in protein structures. Binding sites may be located on different chains or even on discontinuous segments of sequence. Some ligands may be bound by more than one domain, either proportionally in a shared manner, or disproportionately, with the vast majority of contacts coming from one domain only.

From the Macromolecular Structure Database (MSD) at the EBI, we retrieve the total number of contacts made to each ligand by the whole structural assembly, and by each CATH, SCOP and Pfam domain in each chain.

If any one domain has greater than, or equal to, 75% of the total contacts to a particular ligand, then the binding of that ligand is assigned to that domain, and the mode of binding is recorded as 'non-shared'. If no one domain has 75% or more of the contacts then all contacting domains are recorded as binding the ligand and the mode of binding is recorded as 'shared'.
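The 75% rule above can be sketched in a few lines of Python. This is an illustration only, not the code used to build the database: the function name, the domain identifiers and the contact counts are all hypothetical, and the total is passed in separately on the assumption that it is the count for the whole structural assembly rather than the sum over domains.

```python
from typing import Dict, List, Tuple

NON_SHARED_FRACTION = 0.75  # cut-off stated in the text above


def assign_binding(domain_contacts: Dict[str, int],
                   total_contacts: int) -> Tuple[str, List[str]]:
    """Classify how one ligand is bound, given contact counts per domain.

    domain_contacts: hypothetical mapping of domain identifier -> contacts.
    total_contacts: contacts made to the ligand by the whole assembly.
    """
    if total_contacts <= 0:
        raise ValueError("total_contacts must be positive")
    # A domain with at least 75% of the contacts takes the ligand alone.
    for domain, count in domain_contacts.items():
        if count / total_contacts >= NON_SHARED_FRACTION:
            return "non-shared", [domain]
    # Otherwise every domain that contacts the ligand is recorded.
    contacting = sorted(d for d, c in domain_contacts.items() if c > 0)
    return "shared", contacting


# Invented example counts, for illustration only.
print(assign_binding({"domain_1": 18, "domain_2": 2}, 20))
print(assign_binding({"domain_1": 11, "domain_2": 9}, 20))
```

In the first call one domain makes 90% of the contacts, so the binding is 'non-shared'; in the second neither domain reaches 75%, so both are recorded and the binding is 'shared'.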


Cognate Ligand Assignment

All ligands in a PDB entry for a structure are compared using two-dimensional graph-matching to all compounds known to be substrates, products, or cofactors for that enzyme, using data from the ENZYME and KEGG databases, and the most appropriate (i.e. chemically similar) cognate ligands are then matched up with the PDB ligands present in the PDB structure. We used SMSD (Small Molecule Subgraph Detector) [1] to perform 2D graph matching to compare the chemical structures of the PDB ligands and those from KEGG. The Tanimoto score:


S=Nsub/(NA+NB-Nsub)


was used to assess the similarity of the ligands, where Nsub is the number of atoms in the maximum common substructure, NA is the number of atoms in molecule A, and NB is the number of atoms in molecule B.
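As a minimal sketch, the score can be computed directly from the three atom counts. The function below is illustrative and is not part of SMSD; it assumes the size of the maximum common substructure has already been obtained from a graph-matching tool, and the atom counts in the example are invented.

```python
def tanimoto(n_sub: int, n_a: int, n_b: int) -> float:
    """Tanimoto score S = Nsub / (NA + NB - Nsub).

    n_sub: atoms in the maximum common substructure (assumed precomputed).
    n_a, n_b: atoms in molecule A and molecule B.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both molecules must contain at least one atom")
    if not 0 <= n_sub <= min(n_a, n_b):
        raise ValueError("n_sub cannot exceed the smaller molecule")
    return n_sub / (n_a + n_b - n_sub)


# Identical molecules share every atom, so the score is 1.
print(tanimoto(10, 10, 10))  # 1.0
# Hypothetical pair: 6 shared atoms between a 10-atom and an 8-atom ligand.
print(tanimoto(6, 10, 8))    # 0.5
```

The score ranges from 0 (no common substructure) to 1 (the two molecules match completely).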

To qualify as 'cognate-like', a PDB ligand needs to have a Tanimoto score greater than 0.5. We chose this cut-off because approximately 99% of all random graph-matching scores are equal to or less than 0.5, so values above it can be considered significant.
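The cut-off can be illustrated with a short filter. This is a sketch under stated assumptions: the function name and the candidate names are hypothetical, the scores are invented, and sorting the surviving candidates by score is a presentational choice made here rather than a documented step of the pipeline.

```python
from typing import Dict, List, Tuple

COGNATE_CUTOFF = 0.5  # scores must be strictly greater than this


def cognate_like(scores: Dict[str, float],
                 cutoff: float = COGNATE_CUTOFF) -> List[Tuple[str, float]]:
    """Keep candidates scoring above the cut-off, most similar first.

    scores: hypothetical mapping of candidate cognate ligand -> Tanimoto
    score against one PDB ligand.
    """
    hits = [(name, s) for name, s in scores.items() if s > cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Invented scores for illustration only.
example = {"candidate_a": 0.5, "candidate_b": 0.9, "candidate_c": 0.51}
print(cognate_like(example))
```

Note that a score of exactly 0.5 does not qualify, because the comparison is strict.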

Finally, the domain-ligand mapping is cross-referenced with the cognate-ligand mapping to give a cognate-ligand-domain mapping, in which each domain that binds a ligand has an assigned potential cognate ligand taken from the various reactions catalysed by the enzyme.
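Conceptually, this cross-referencing is a join on the PDB ligand. The sketch below assumes two simple in-memory mappings; the data structures, the function name and all identifiers are hypothetical and do not reflect the actual database schema.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, str, float]  # (domain, PDB ligand, cognate, score)


def cross_reference(domain_ligands: Dict[str, List[str]],
                    ligand_cognates: Dict[str, List[Tuple[str, float]]]
                    ) -> List[Row]:
    """Join domain -> PDB ligands with PDB ligand -> (cognate, score)."""
    rows: List[Row] = []
    for domain, ligands in domain_ligands.items():
        for ligand in ligands:
            # A ligand with no assigned cognate contributes no rows.
            for cognate, score in ligand_cognates.get(ligand, []):
                rows.append((domain, ligand, cognate, score))
    return rows


# Invented placeholder identifiers, for illustration only.
domain_ligands = {"domain_1": ["LIG1"], "domain_2": ["LIG1", "LIG2"]}
ligand_cognates = {"LIG1": [("cognate_x", 0.8)]}
for row in cross_reference(domain_ligands, ligand_cognates):
    print(row)
```

Each resulting row links one domain to one bound PDB ligand and one potential cognate ligand with its similarity score.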


Navigation

The database is searchable via text strings from the main page. The simplest query is via PDB code. You are required to pick either the CATH, SCOP or Pfam domain definitions when initiating a search, although it is possible to switch domain definitions using links in the results pages.


Per PDB entry view

This is the basic entry of the database, which a search at a higher level (e.g. superfamily) will eventually reach. This page is designed to give an overview of the different ligands bound by the different domains and the different cognate ligands assigned by our procedure. An assigned cognate ligand with a score of 1 here indicates that the PDB and cognate KEGG ligands are identical. Otherwise, the cognate ligands for the various reactions catalysed by the enzyme are listed in order of similarity, with the most similar first.

Several links on this page offer you various options. The chain of the particular structure can be changed; this is particularly useful for hetero-oligomers. The symbol [S] allows you to search through the PROCOGNATE database with the particular entity adjacent to it, e.g. superfamily, PDB ligand, EC number, cognate ligand. Clicking on [C] will show you a list of contacts made by the structure to a particular PDB ligand. [R] will give you the details of the reaction a particular cognate ligand is involved in. [L] will give you a list of links to other databases for a particular ligand. Clicking on the name of a PDB or cognate ligand will show a 2D representation of its structure.


References

A paper detailing the PROCOGNATE database and findings is available here. Reference: Bashton, M. et al. (2006). Cognate Ligand Domain Mapping for Enzymes. Journal of Molecular Biology 364(4): 836-852.

A paper describing more recent developments and the website is also available here. Reference: Bashton, M. et al. (2008). PROCOGNATE: a cognate ligand domain mapping for enzymes. Nucleic Acids Research 36: D618-D622.

1. The SMSD paper is available here. Reference: Rahman, S.A. et al. (2009). Small Molecule Subgraph Detector (SMSD) toolkit. Journal of Cheminformatics 1: 12.