Introduction

The "Ligand Chemistry" service provides web access to the "ligands and small molecule dictionary" of the PDBe database. This database contains a self-consistent set of "residues" found in the experimental data deposited to the Protein Data Bank (PDB) and referenced within any macromolecular structure of a PDB entry. It therefore contains all the amino acids, nucleic acids and any associated ligands submitted to the PDB.

The ligand dictionary contains all the properties of each molecule, such as the bonds between atoms, bond orders, stereochemistry, energy types and chemical graph. The dictionary can be searched in a number of ways, and the following tutorial shows possible uses of this expert system designed for chemists.


How to use it

You can start the msdChem service here if you have not done so already. You should see the web page as shown in figure 1. This contains a number of search fields that can be used to identify small molecules of interest.

Figure 1

We will work down each of the query fields and try out different search questions.


Looking at the details of a single molecule

We will start by picking out an important biological molecule and looking at the detail held for it in this database. The easiest way to extract a single molecule is to use its 3 letter code. We will study the adenosine triphosphate (ATP) molecule. Type the letters ATP into the text field for "3 letter code" and hit [search]; you should see the single molecule as shown in figure 2.

Figure 2

This page gives a summary of every hit found by a search. To find the full details about the ATP molecule, click the database entry mark as shown in figure 2; you will get a page similar to that in figure 3. On this page there are many links to data pages in the grey sidebar on the left. We will look at some of this detail.

Figure 3

You can see that this gives all the details about the ATP molecule, including many derived pieces of information and a schematic 2D picture of the molecule. This is the information we store for every entry within our database. You will now view a 3D set of idealized coordinates of the molecule within an applet viewer. If you cannot see the view button as shown in figure 3, scroll the window down. Click the view button; an applet will be downloaded to your computer and a 3D molecule shown. This is the JMOL applet. If you have RASMOL installed on your computer, you can display the coordinates within that program instead by selecting a different viewer from the list button (figure 4).

Figure 4: A RASMOL viewer window showing the molecule of ATP.

The carbon atoms are grey, nitrogen blue, phosphorus yellow and oxygen red. You can rotate the molecule using a virtual track ball action. We can see from the 3D view of the molecule that it is not flat; the 3D shape and the chiral centres are critical to its biological action.

Virtual track ball: While holding the <LEFT><MOUSE-BUTTON> down move the mouse left/right and up/down.

How many chiral centres does this molecule have? (Hint: pick the Atoms link at the top of the grey sidebar of the ATP information page)

This molecule is the primary energy store in all cells; the reaction to form ADP is exothermic and provides energy to the many thousands of biological reactions that take place. What do you think would be the outcome if a different stereoisomer were present in a cell?

To remove the graphical window, click the [X] button at its top right. You can also save the ideal coordinates to a file: select the output button list and pick HTML. You should have Output = HTML, Format = PDB, Library = ideal; then click the save button. The browser should ask whether you want to open or save these coordinates. Just open them for now, and you will get a new window containing the ATP coordinates in PDB format. You can see that we store a vast array of information regarding all these molecules.
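
PDB format is a fixed-column text format, so the saved coordinates can be read back with a few string slices. The following is a minimal sketch, assuming the file contains standard ATOM/HETATM records; the function names are hypothetical, and the sample coordinates are placeholders rather than the real ideal coordinates of ATP.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    x: float
    y: float
    z: float
    element: str


def parse_pdb_atoms(text: str) -> List[Atom]:
    """Extract atoms from ATOM/HETATM records using fixed PDB columns."""
    atoms = []
    for line in text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        line = line.ljust(80)  # pad short lines so the slices below are safe
        atoms.append(
            Atom(
                name=line[12:16].strip(),     # columns 13-16
                residue=line[17:20].strip(),  # columns 18-20
                x=float(line[30:38]),         # columns 31-38
                y=float(line[38:46]),         # columns 39-46
                z=float(line[46:54]),         # columns 47-54
                element=line[76:78].strip(),  # columns 77-78
            )
        )
    return atoms


def format_hetatm(serial: int, name: str, residue: str,
                  x: float, y: float, z: float, element: str) -> str:
    """Hypothetical helper: build one HETATM line (chain A, residue 1)."""
    return (
        f"HETATM{serial:>5} {name:<4} {residue:>3} A{1:>4}{'':4}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}{'':10}{element:>2}"
    )


# Placeholder coordinates, NOT the real ideal coordinates of ATP.
sample = "\n".join([
    format_hetatm(1, "PG", "ATP", 1.0, 2.0, 3.0, "P"),
    format_hetatm(2, "O1G", "ATP", -1.5, 0.25, 4.125, "O"),
    "END",
])

for atom in parse_pdb_atoms(sample):
    print(atom.name, atom.element, atom.x, atom.y, atom.z)
```

Slicing by column rather than splitting on whitespace is deliberate: adjacent PDB fields can run together without a separating space.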

Spend a few minutes browsing the links in the left hand sidebar. In what formats can the information on each page be saved?

The code field is identical to the short code in most cases, but allows for specifications such as "D" or "L" stereo-chemistry.

Satisfy yourself that you can find the ATP molecule using the "code" field and "short-code" search fields.

The molecule name field allows a search using the standard chemical name of a molecule. It also accepts wildcards: a "." character stands for any single character, and "*" stands for a range of characters.

Think of a few ways to search for adenosine triphosphate using the molecule name.
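
To see how such wildcards behave, they can be mimicked with regular expressions. This is only a sketch under stated assumptions: that "." matches exactly one character, that "*" matches any run of characters (including none), and that matching is case-insensitive over the whole name. The function names and the sample name list are hypothetical and are not taken from the service.

```python
import re
from typing import Iterable, List


def name_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a name pattern with '.' and '*' wildcards into a regex."""
    parts = []
    for ch in pattern:
        if ch == ".":
            parts.append(".")    # assumed: any single character
        elif ch == "*":
            parts.append(".*")   # assumed: any run of characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE)


def search_names(pattern: str, names: Iterable[str]) -> List[str]:
    """Return the names that match the whole pattern."""
    regex = name_pattern_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Illustrative names only, not an extract of the database.
names = ["2-CHLOROPHENOL", "3-CHLOROPHENOL", "4-CHLOROPHENOL", "PHENOL"]

print(search_names(".-chlorophenol", names))  # the three chlorophenols
print(search_names("*phenol", names))         # all four names
print(search_names("3*", names))              # only 3-CHLOROPHENOL
```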


Searches using the molecular formula

The search interface allows you to specify a chemical formula, or a formula with different ranges of element content. In general you can define a search using the following format:
Element(number)
Element(start-finish)
For example, to specify the formula for 3-chlorophenol, use C6 O1 CL1, where the element specifications are separated by space characters.

Search the database with a "search-range" of C6 O1 Cl1
You will get 2 (or more) answers, why?

We can tell the search to exclude an element by giving its name followed by a zero. For example, "N0" returns a hit list of molecules that contain no nitrogen atoms.
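
The formula query format can be sketched in a few lines of code. This is an illustration, not the service's implementation, and it rests on several assumptions: a range is written as "C4-6", element symbols are compared case-insensitively, and any element not mentioned in the query is left unconstrained. The function names and the two sample compositions are hypothetical.

```python
import re
from typing import Dict, Tuple

# One query term: element symbol, a count, and an optional "-finish" part.
TERM = re.compile(r"([A-Za-z]{1,2})(\d+)(?:-(\d+))?")


def parse_query(query: str) -> Dict[str, Tuple[int, int]]:
    """Turn 'C6 O1 CL1' or 'C4-6 N0' into {element: (min, max)}."""
    constraints = {}
    for term in query.split():
        match = TERM.fullmatch(term)
        if match is None:
            raise ValueError(f"cannot parse formula term: {term!r}")
        element = match.group(1).upper()
        low = int(match.group(2))
        high = int(match.group(3)) if match.group(3) else low
        constraints[element] = (low, high)
    return constraints


def matches(constraints: Dict[str, Tuple[int, int]],
            composition: Dict[str, int]) -> bool:
    """True if every constrained element count falls inside its range."""
    return all(
        low <= composition.get(element, 0) <= high
        for element, (low, high) in constraints.items()
    )


# Hypothetical compositions used only to exercise the functions.
plain = {"C": 6, "H": 5, "CL": 1, "O": 1}
with_nitrogen = {"C": 6, "H": 6, "CL": 1, "N": 1, "O": 1}

query = parse_query("C6 O1 Cl1")
print(matches(query, plain), matches(query, with_nitrogen))

query = parse_query("C6 O1 Cl1 N0")
print(matches(query, plain), matches(query, with_nitrogen))
```

Under these assumptions the first query accepts both compositions, while adding "N0" rejects the one that contains nitrogen.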

Repeat the search so you only return 3-chlorophenol


Sub and super structure searching the chemistry database

We are now going to carry out some searches of the database using various search targets. The first is to find all the structures that contain the chemical substructure 3-chlorophenol. The easiest way to define a chemical substructure is to draw the molecule using the JME molecular editor. Figure 5 shows you the button to click: the [edit] button starts the JME molecular editor and lets you create the non-stereo SMILES string, which is then filled into the target field. Although it is probably quicker to type the non-stereo string directly into the target field, it is usually easier and certainly less error prone to draw the molecule.

Figure 5


Drawing 3-chlorophenol in the JME editor.

Figure 6 shows the JME editor with a sketch of 3-chlorophenol. We are going to go through each step necessary to create this molecule within the editor.

To draw the molecule in the sketcher, select the aromatic ring button at the top of the sketcher (<LEFT-MOUSE> click), then click once within the drawing panel using a <LEFT-MOUSE> click. An aromatic ring will appear. Notice that as you move the mouse cursor over the aromatic ring the bonds and atoms are highlighted. These selection markers on atoms/bonds define where any new features (atoms/bonds) will be added when you do a <LEFT-MOUSE> button click.

First, though, you need to define the new feature to be added by the program. Since you have selected the aromatic 6 membered ring, any addition to the drawing window now would add further 6 atom aromatic rings. To draw the single bonds to the hydroxyl and chlorine atoms, select the single bond button using a single <LEFT-MOUSE> click (second from the left in the row of bond objects). Now move the mouse over the atom in the aromatic ring where you want to add this bond, make sure the atom is highlighted, then click the <LEFT-MOUSE> button. The single bond is added.

You will notice that the default element type added is carbon, and that carbon atoms are not marked with the element type. This is because almost all bio-active molecules contain many carbon atoms. You will also notice that hydrogen atoms are not shown when attached to carbon atoms; these positions are assumed so as to maintain the valency of the other atoms.

We want to change the atom at the end of the bond you have added into the hydroxyl oxygen. Select the oxygen element button (the red O on the left vertical bar of atoms), make sure it is active, move the cursor over the end of the single bond for the hydroxyl, and click the mouse button. Notice that the JME editor will always try to satisfy the valency of an added atom using hydrogen atoms, so this oxygen is added as "OH".

Note that [DEL] means delete: while this button is active, clicking will delete the selected bond or atom!

How would you make this a negatively charged oxygen with no hydrogen?

How would you make the single bond to the oxygen a double bond, and what happens to the hydrogen atom if you do?

Now add the chlorine atom.

Figure 6

When you are done and you see 3-chlorophenol, click the [OK] button. Notice that [LOAD] means load a molecule from a file, but unfortunately it will also delete what you have done! Once the sketcher window has gone, you will see that the SMILES string for this molecule fills the "Non-Stereo Smile" search text field. Make sure that no other fields contain search targets, and then click [Search].
A Please Wait pop-up will appear for about 20 seconds, and then 9 results will fill the browser window (figure 7).

Figure 7

What is the meaning of the green/blue and red squares in chemical structures?

Select the second result in the list "EEA" using the link for the 3-letter-code. You will be taken to the page for this chemical.

Where is the substructure 3-chlorophenol within this molecule? (View answer)

Repeat the query with the "is sub-structure" qualifier active and the target still 3-chlorophenol.


Similarity searching the chemistry database

The chemistry database can also be searched for molecules that "look like" the query target chemically. (If you really want to know, this search is based on the bit string definition of the chemical structure rather than the mathematical graph for the structure.) The search compares whole molecules only, so it will not find similar fragments of structure, since fragment matching tends to return the vast majority of the database for most search queries. This is known as the Fingerprint search on the msdChem main search page (see figure 8). You can use the JME editor to construct a search target again: refer to figure 6 to build the 3-chlorophenol molecule again, or copy and paste the SMILES string into the search field.

Figure 8
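
To give a feel for what a bit string comparison involves, here is a minimal sketch using the Tanimoto coefficient, a common way of scoring fingerprint similarity. The tutorial does not say which measure or fingerprint the service actually uses, so treat both as assumptions; the function names, three-letter codes, bit patterns and threshold are all made up for illustration.

```python
from typing import Dict, List, Tuple


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Shared 'on' bits divided by the bits set in either fingerprint."""
    shared = bin(fp_a & fp_b).count("1")
    total = bin(fp_a | fp_b).count("1")
    return shared / total if total else 1.0


def rank_by_similarity(query_fp: int, library: Dict[str, int],
                       threshold: float) -> List[Tuple[str, float]]:
    """Return (code, score) pairs at or above the threshold, best first."""
    scored = [(code, tanimoto(query_fp, fp)) for code, fp in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up 8-bit fingerprints; real fingerprints are far longer.
library = {
    "AAA": 0b11110000,
    "BBB": 0b11100000,
    "CCC": 0b00001111,
}

query_fp = 0b11110000
for code, score in rank_by_similarity(query_fp, library, threshold=0.5):
    print(code, round(score, 2))
```

Because the score is computed over whole fingerprints, a molecule that merely contains the query as a small fragment scores low, which is consistent with the whole-molecule behaviour described above.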

Why do you only get one structure?

OK, so it may not be immediately obvious why this is the case, but consider that the database contains only about 4000 molecules, and that these are the chemical molecules found in, or associated with, macromolecular structures solved by crystallography and NMR. Let's use a search query with a more "biologically relevant" molecule, which will therefore have many similar compounds.

Open the JME editor for the Fingerprint search (figure 8), but this time we will use a code to load the molecule into the editor. At the base of the web page containing the JME editor you will see the option:

Or give Code of Existing Molecule [             ] (ie ATP).

Type ATP into this field, and then click the [Load] button. The structure of ATP will be loaded into the JME editor, and you can now hit OK to place it into the search target field. You should be back at the main msdChem page with the SMILES string for ATP entered into the target field. Now hit the [search] button and you should get about 50 results.

Why are there many results for this search?

Now pick some molecules of interest and try a number of searches based on them. Remember that the database is populated by molecules that are biologically relevant.