PDB Chemical Components
The chemical component dictionary service provides web access to the wwPDB "Chemical Component Dictionary" as it is loaded in the PDBe database at EBI.
This dictionary is part of the core "reference" information of the PDBe relational database and is consistently referenced by all macromolecular structures for all bound molecules as well as standard and modified amino acids.
Since every residue and every atom in the PDBe database references a ligand and an atom in this dictionary, this is the repository that defines the link between proteins and chemistry.
Chemical component in PDBeChem
The term chemical component (or ligand) refers to a distinct chemical entity: one specific stereoisomer of a small molecule or monomer.
This means that structural isomers, geometric isomers, and enantiomers (but not conformational isomers) are distinct chemical components (ligands) in PDBeChem. The properties that define the chemical identity of a ligand are:
- atoms (including hydrogens) and atom elements,
- bonds and bond orders, as well as
- atom and bond stereo descriptors
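The identity rule above can be illustrated with a small data-structure sketch. This is not the PDBe schema: the class names, field names and the 'R'/'S'/'N' stereo codes are assumptions made for illustration only, and hydrogens are left out of the toy molecule to keep it short.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Atom:
    name: str           # atom name within the component, e.g. "CA"
    element: str        # element symbol
    stereo: str = "N"   # hypothetical descriptor: "R", "S" or "N" (none)


@dataclass(frozen=True)
class Bond:
    atom_1: str
    atom_2: str
    order: int          # 1 = single, 2 = double, ...
    stereo: str = "N"   # hypothetical descriptor: "E", "Z" or "N" (none)


@dataclass(frozen=True)
class ChemicalComponent:
    atoms: FrozenSet[Atom]
    bonds: FrozenSet[Bond]


def alanine(ca_stereo: str) -> ChemicalComponent:
    """Toy alanine (hydrogens omitted) with a given descriptor on CA."""
    atoms = frozenset({
        Atom("N", "N"), Atom("CA", "C", ca_stereo), Atom("C", "C"),
        Atom("O", "O"), Atom("OXT", "O"), Atom("CB", "C"),
    })
    bonds = frozenset({
        Bond("N", "CA", 1), Bond("CA", "C", 1), Bond("C", "O", 2),
        Bond("C", "OXT", 1), Bond("CA", "CB", 1),
    })
    return ChemicalComponent(atoms, bonds)


def without_stereo(component: ChemicalComponent) -> ChemicalComponent:
    """Copy of the component with all stereo descriptors cleared."""
    return ChemicalComponent(
        frozenset(replace(a, stereo="N") for a in component.atoms),
        frozenset(replace(b, stereo="N") for b in component.bonds),
    )


l_ala = alanine("S")  # L-alanine is the (S) enantiomer
d_ala = alanine("R")

# Different stereo descriptors mean different chemical components.
print(l_ala == d_ala)
# With stereo descriptors cleared, the two enantiomers compare equal.
print(without_stereo(l_ala) == without_stereo(d_ala))
```

Because equality is computed over atoms, bonds and stereo descriptors together, two enantiomers are different components, while two conformers of the same molecule (which differ only in coordinates, not modelled here) would not be.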
Ligands in wwPDB
The ligand dictionary is not an isolated effort of the PDBe group. The fundamental parts of the dictionary are exchanged on a weekly basis with the collaborators of the international wwPDB (RCSB, PDBj) in the form of mmCIF chem_comp_group files and are in sync with the PDB archive. During this process new ligands are processed manually and semi-automatically by the wwPDB members before they receive official 3-letter code identifiers of the PDB.
How to search with PDBeChem
There is a wide range of possibilities for searching and exploring the dictionary.
- Code: This is the PDB 3-letter code for the ligand (e.g. ATP).
You may also select the "like" operator while searching for ligand codes. This search field is auto-completed with codes as you type.
- Molecule name: An expression or word that is part of any known name of the molecule (standard name, common name, systematic name).
You may use the '=' operator for an exact match, or the 'like' operator to search for your input as a substring. This field is auto-completed with molecule names once at least two characters have been entered.
- Formula: An expression that sets range constraints on the number of atoms of each element. The value has the form [<E><n>-<m> ]*, where <E> is an element, <n> is the minimum number and <m> is the maximum number of times the element may appear in the formula. The order in which the elements are given is not important. For example, to find ligands that have between 10 and 15 carbons, 3 nitrogens and one oxygen, give 'C10-15 N3 O1'. Other examples:
- - 'CL3 N0' finds molecules with exactly 3 chlorines and no nitrogens,
- - 'C40-50 N5-10 O5-15 S1-10' finds molecules with 40-50 carbons, 5-10 nitrogens, 5-15 oxygens and 1-10 sulfurs.
By clicking on the button next to the item, you may use the formula range editor to build your formula expression interactively.
You may also use the '=' operator for an exact formula match.
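To make the formula expression syntax concrete, here is a minimal Python sketch of how such an expression could be parsed and checked against a molecule's element counts. It is not the PDBeChem implementation. The function names are hypothetical, ranges are assumed to be inclusive, and elements that the expression does not mention are assumed to be unconstrained.

```python
import re
from typing import Dict, Tuple

# One term: element symbol, minimum count, optional "-maximum".
_TERM = re.compile(r"^([A-Za-z]{1,2})(\d+)(?:-(\d+))?$")


def parse_formula_ranges(expression: str) -> Dict[str, Tuple[int, int]]:
    """Turn 'C10-15 N3 O1' into {element: (minimum, maximum)}."""
    ranges: Dict[str, Tuple[int, int]] = {}
    for term in expression.split():
        match = _TERM.match(term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        element, low_text, high_text = match.groups()
        low = int(low_text)
        # A single number such as 'N3' means exactly that many atoms.
        high = int(high_text) if high_text is not None else low
        if high < low:
            raise ValueError(f"maximum below minimum in term: {term!r}")
        # Symbols are upper-cased, as in the 'CL3' example.
        ranges[element.upper()] = (low, high)
    return ranges


def formula_matches(counts: Dict[str, int],
                    ranges: Dict[str, Tuple[int, int]]) -> bool:
    """True if every constrained element count falls inside its range."""
    return all(low <= counts.get(element, 0) <= high
               for element, (low, high) in ranges.items())


query = parse_formula_ranges("C10-15 N3 O1")
# Element counts of a made-up molecule, keyed by upper-case symbol.
candidate = {"C": 12, "H": 20, "N": 3, "O": 1}
print(formula_matches(candidate, query))
```

Since the order of terms does not matter, the parsed form is a plain dictionary keyed by element symbol.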
- Non stereo smile: For structure based searches. By clicking on the edit button, a form appears that allows you to specify a molecule or a molecule segment using one of three options:
- - Draw the molecule using the JME Molecular editor.
- - Upload a standard chemical file such as Mol2, SDF, PDB etc. in the JME editor. You may specify any file types and formats accepted by ChemAxon MolConvert.
- - Give the standard code (e.g. ATP) of a ligand that already exists in the database, in order to load it in the JME editor. This field is auto-completed as you type.
After you load a ligand you may also modify it. For example, if you are looking for ligands similar to ATP, you may load ATP in the JME editor and then remove some atoms and bonds, keeping just the substructure you are interested in.
As soon as a molecule or molecule segment is specified, you may use it to search the dictionary.
All these search operations ignore stereochemistry. This means that a molecule will also match its stereoisomers.
Additionally, aromatic bonds are treated as alternating single and double bonds. This means that for aromatic rings and similar systems there may also be some false positives.
Finally, you can choose to discard bond orders, as well as to treat your input as ring-strict, by using the check-boxes on the editor page.
- Fragments: Similar to the formula search, but the search items are chemical fragments rather than chemical elements. An expression sets range constraints on the number of occurrences of each fragment. The value has the form [<F>:<n>-<m> ]*, where <F> is a fragment, <n> is the minimum number and <m> is the maximum number of times the fragment may be contained in the molecule. The order in which the fragments are given is not important. For example, to find ligands that have one or two adenine groups and one furan ring, give 'adenine:1-2 furan:1'.
The library of chemical fragments is predefined and includes about 84 fragments. The fragment expression search is quite fast.
By clicking on the button next to the item, you may use the fragment pattern editor, which is in practice the easiest way to build a fragment expression.
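A fragment expression such as 'adenine:1-2 furan:1' can be read in much the same way as a formula expression. The sketch below is an illustration only, not the service's parser. The function name is hypothetical, the colon separator is taken from the example above, and fragment names are assumed to be case-insensitive; the actual names in the predefined fragment library are not listed here.

```python
from typing import Dict, Tuple


def parse_fragment_ranges(expression: str) -> Dict[str, Tuple[int, int]]:
    """Turn 'adenine:1-2 furan:1' into {fragment: (minimum, maximum)}."""
    ranges: Dict[str, Tuple[int, int]] = {}
    for term in expression.split():
        # Split on the last colon, in case a fragment name contains one.
        name, colon, span = term.rpartition(":")
        if not colon or not name:
            raise ValueError(f"expected '<fragment>:<n>-<m>', got {term!r}")
        low_text, dash, high_text = span.partition("-")
        low = int(low_text)  # raises ValueError on non-numeric input
        # 'furan:1' means exactly one occurrence.
        high = int(high_text) if dash else low
        if high < low:
            raise ValueError(f"maximum below minimum in term: {term!r}")
        ranges[name.lower()] = (low, high)
    return ranges


print(parse_fragment_ranges("adenine:1-2 furan:1"))
```

A molecule would then satisfy the expression if its count for each named fragment lies within the corresponding range, with fragments that are not named left unconstrained (again an assumption).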
Functionality of PDBeChem
The PDBeChem service offers a generic browsing interface to all areas of the ligand dictionary. You may follow the links that are available from every record in order to navigate through the relationships of the dictionary.
For example, you may follow a relationship link to view the atoms of a ligand and then, for a particular atom, its bonds and energy types, and so on.
The "complete" link provides a single page with all the information available (including coordinates and energy types).
There is additional functionality provided for ligands. From a ligand page you may also:
- 3-D View: Choose the set of coordinates you want to use (i.e. idealised or PDB) to view in Jmol
- File Export: Choose the set of coordinates you want to use (i.e. idealised or PDB) and the export format (PDB, SDF, mmCIF or CML)
- PDB entries: Follow links to the atlas pages of the entries that include this ligand
- Stereoisomers: This will provide all the stereoisomers of the current molecule, if any are available.
The database that is accessible through the service is the PDBe database, which is based on the wwPDB archive.
Additionally, the dictionary contains a classification of the atoms of the ligands into energy types, and associates them with the energy types reference dictionary for different sets of libraries (different classification sets).
Please contact the PDBe group with suggestions, comments or problem reports. Your input is very helpful.
Several external programs (CACTVS, VEGA) are also used for the ligand dictionary in order to provide derived information such as
- - GIF images
- - Atom energy types
The derivation of this information is performed by the PDBe group.
- UNIT 14.3: Using PDBeChem to Search the PDB Ligand Dictionary
Dimitropoulos, D., Ionides, J. and Henrick K. (2006) In Current Protocols in Bioinformatics
(A.D. Baxevanis, R.D.M. Page, G.A. Petsko, L.D. Stein, and G.D. Stormo, eds.) pp 14.3.1-14.3.3 John Wiley & Sons, Hoboken, N. J. ISBN: 978-0-471-25093-7
- MSDsite: behind the scene: The technology used in database searching and retrieval for the analysis and viewing of bound ligands and active sites.
Golovin, A., Dimitropoulos, D., Oldfield, T. and Henrick, K. (2004)
The eCheminfo 2004 Conference "Applications of Cheminformatics and Modelling to Drug Discovery", 8-19 November.
- MSD database and MSD database services
A. Golovin, T. J. Oldfield, J. G. Tate, S. Velankar, G. J. Barton,
H. Boutselakis, D. Dimitropoulos, J. Fillon, A. Hussain,
J. M. C. Ionides, M. John, P. A. Keller, E. Krissinel, P. McNeil,
A. Naim, R. Newman, A. Pajon, J. Pineda, A. Rachedi, J. Copeland,
A. Sitnov, S. Sobhany, A. Suarez-Uruena, J. Swaminathan, M. Tagari,
S. Tromm, W. Vranken and K. Henrick (2004) E-MSD: an integrated data
Nucleic Acids Research, 32 (Database issue), D211-D216. 2004
The following methods and packages have also been used for PDBeChem:
- CACTVS (http://www2.chemie.uni-erlangen.de/software/cactvs/index.html)
CACTVS: A Chemistry Algorithm Development Environment
W. D. Ihlenfeldt, Y. Takahasi, H. Abe, S. Sasaki,
in: Daijuukagakutouronkai Dainijuukai Kouzoukasseisoukan Shinpojiumu Kouenyoushishuu,
Machida, K., Nishioka, T. (Eds),
Kyoto University Press (1992), 102-105
- CORINA (http://www2.chemie.uni-erlangen.de/software/corina/index.html)
Gasteiger, J.; Rudolph, C.; Sadowski, J.
Automatic Generation of 3D-Atomic Coordinates for Organic Molecules.
Tetrahedron Comp. Method. 1990, 3, 537-547.
- JME (http://www.molinspiration.com/jme/)
- JMOL (http://jmol.sourceforge.net/)
Christoph Steinbeck, Yongquan Han, Stefan Kuhn, Oliver Horlacher, Edgar Luttmann, and Egon Willighagen, The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics, J.Chem.Inf.Comp.Sci., 2003
- ACDLABS (http://www.acdlabs.com/)
- VEGA (http://users.unimi.it/~ddl/vega/index_noanim.htm)
A. Pedretti, L. Villa, G. Vistoli
"Vega - an open platform to develop chemo-bio-informatics applications, using plug-in architecture and script programming"
J.C.A.M.D., Vol. 18, 167-173 (2004)
- InChI: The IUPAC International Chemical Identifier (InChI TM)
Copyright © The International Union of Pure and Applied Chemistry 2005: IUPAC International Chemical Identifier (InChI) (contact: email@example.com)