Structural genomics tutorial
Sameer Velankar
First, let's find all entries in the PDBe deposited by Structural Genomics Projects, using a crude search in MSDlite.
Put "Structural Genomics" in the Text Search field and click Start search. The search returns many pages of results, some 5000+ entries in total.
Let us select one entry from the result list for further analysis. Go to Page 7 of the results by clicking on the page number tab of the results page. Find the result line for the entry 1J6P and click on the ID code link on the left of the line. This will take you to the MSD atlas page for this entry.
Once the search is complete, you will be taken automatically to the results page:
The MSDfold query for PDB entry 1J6P returns only 4 structures that are likely to have a similar arrangement of secondary structure elements (SSEs). The most meaningful value in the results table (for our purposes) is the Query % sse, which shows the percentage similarity at the secondary structure level between the query structure and each result. The first two results in the list refer to the same protein (PDB entries 1J6P and 1P1M), which has no functional annotation in the PDB or the SwissProt databank. Interestingly, the third and fourth hits both correspond to cytosine deaminase (EC 3.5.4.1) from E. coli. Both of these results have only 15% sequence identity with 1J6P (the query molecule), but appear to be over 70% similar in secondary structure.
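To make the "15% sequence identity" figure concrete, here is a minimal sketch of how percentage identity can be computed from a pairwise alignment. It assumes identity is counted over aligned (non-gap) columns only; MSDfold may use a different denominator. The function name and the toy sequences are invented for illustration and are not taken from 1J6P or 1K70.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned (non-gap) columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where neither sequence has a gap character.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment, invented for illustration only.
seq_a = "MKT-HLAGDE"
seq_b = "MRSWHL-GNE"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity over aligned positions")
```

Two proteins can score very low on a measure like this and still share most of their secondary structure elements, which is exactly the situation seen for 1J6P and the cytosine deaminase hits.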
Look down the results and click on the link in the result line for PDB entry 1K70, the structure of E. coli cytosine deaminase. If you have RasMol installed, you can look at the superposition of this result with the search structure (the hypothetical protein, 1J6P). Alternatively, you can look down the table of results for this match and see the match schematically:
The superposition clearly shows that the fold of our hypothetical protein (1J6P) is very similar to that of E. coli cytosine deaminase.
Similarity of fold does not necessarily translate into analogous function (the classic example is the TIM barrel fold, which is exploited for a variety of functions). To understand more about our hypothetical protein we need to investigate it in more detail at the residue level. As the MSDfold results show, however, the sequence similarity is rather low. Taking a cue from the proteases subtilisin and chymotrypsin, where the folds are very different but the functional residues have the same three-dimensional disposition, we will now investigate whether the structural similarity of the hypothetical protein and cytosine deaminase translates into analogous function.
Let us find out more about the PDB entry 1K70. Go to the PDBe home page, enter 1K70 in the Get PDB by ID field and click go. This will take you to the atlas page for the entry 1K70.
In the Ligand page of the 1K70 atlas pages, there are schematic diagrams of the various heterogens that are bound to the molecule. You can see the environment in which each ligand is found by clicking the link marked View the interactions of HPY with 1K70. This takes you to the MSDsite search tool. You can also look at the binding environment for Iron (Fe) by clicking on the link marked View the interactions of Fe with 1K70.
The MSDsite database catalogues the interactions between ligands in the PDB and the proteins to which they are bound. You can see all of the bonds, non-bonding interactions, etc. that the ligand makes with the protein. You can also upload your own structure and use the system to compare it against the whole PDB.
AstexViewer@EBI-MSD - this is a simple Java applet that should work on any system which has Java installed.
RasMol - this requires your machine to have RasMol already installed (may not work on all machines).
You can also look at the statistics for each type of site throughout the PDBe database by clicking on the small chart icons next to each hit. The red chart shows the statistics in the database for each kind of ligand, whilst the blue chart links to the statistics for environments similar to the one that surrounds this particular instance of the ligand. The bonds link shows you the details of the interactions between the ligand and the protein.
It is now important to see whether any of the residues in the binding environment for HPY in PDB entry 1K70 are conserved in the hypothetical protein (PDB entry 1J6P). These residues, if conserved, may not show up in a sequence alignment, as the sequence identity between the two proteins is very low. Moreover, having the same geometrical arrangement of functional residues is more important for protein function than conserving their relative positions in the one-dimensional polypeptide sequence.
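One way to compare the geometrical arrangement of two sets of residues without reference to their sequence positions is to compare the distances between corresponding residues within each site. The sketch below is an assumed, simplified approach (one representative point per residue, correspondence already known); it is not the method MSDfold uses, and the coordinates are invented.

```python
import math
from itertools import combinations


def max_distance_deviation(site_a, site_b):
    """Largest difference between corresponding inter-residue distances.

    site_a, site_b: equal-length lists of (x, y, z) points, one per residue,
    listed in corresponding order. Needs at least two residues per site.
    The result does not depend on how either structure is oriented in space.
    """
    if len(site_a) != len(site_b):
        raise ValueError("sites must contain the same number of residues")
    return max(
        abs(math.dist(site_a[i], site_a[j]) - math.dist(site_b[i], site_b[j]))
        for i, j in combinations(range(len(site_a)), 2)
    )


# Toy sites, invented for illustration: site_b is site_a shifted in space,
# with its third residue displaced by 1 Angstrom along z.
site_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
site_b = [(10.0, -5.0, 2.0), (13.0, -5.0, 2.0), (10.0, -1.0, 3.0)]

deviation = max_distance_deviation(site_a, site_b)
print(f"largest distance deviation: {deviation:.3f}")
```

A small deviation means the residues are arranged in a similar way in three dimensions, even if they sit at quite different positions along the two polypeptide chains.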
The residues in the binding environment for HPY in PDB entry 1K70 are:
GLN156, GLU217, HIS246, ASP313, HIS63, LEU81, PHE154, HIS214, TRP319
Scrolling through the results from MSDfold we get the following:
| PDB entry 1K70 (E. coli cytosine deaminase) | RMSD | PDB entry 1J6P (Hypothetical protein) |
|---|---|---|
| His 63 | 0.73 | His 57 |
| Leu 81 | 4.52 | Glu 73 |
| Phe 154 | 1.09 | Gly 137 |
| Gln 156 | xx | No equivalent residue |
| His 214 | 1.22 | His 200 |
| Glu 217 | 1.54 | Glu 203 |
| His 246 | 1.21 | His 228 |
| Asp 313 | 0.29 | Asp 279 |
| Trp 319 | 3.26 | Ser 283 |
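The residue equivalences above can be summarised with a few lines of code. The sketch below transcribes the table and picks out the pairs in which the residue type is identical in both structures; the variable and function names are chosen for this example only.

```python
# (1K70 residue, RMSD or None, 1J6P residue or None), transcribed from the table.
rows = [
    ("His 63", 0.73, "His 57"),
    ("Leu 81", 4.52, "Glu 73"),
    ("Phe 154", 1.09, "Gly 137"),
    ("Gln 156", None, None),  # no equivalent residue in 1J6P
    ("His 214", 1.22, "His 200"),
    ("Glu 217", 1.54, "Glu 203"),
    ("His 246", 1.21, "His 228"),
    ("Asp 313", 0.29, "Asp 279"),
    ("Trp 319", 3.26, "Ser 283"),
]


def conserved_pairs(table):
    """Return (1K70 residue, 1J6P residue, RMSD) where the residue type matches."""
    return [
        (res_1k70, res_1j6p, rmsd)
        for res_1k70, rmsd, res_1j6p in table
        if res_1j6p is not None and res_1k70.split()[0] == res_1j6p.split()[0]
    ]


conserved = conserved_pairs(rows)
for res_1k70, res_1j6p, rmsd in conserved:
    print(f"{res_1k70:8s} <-> {res_1j6p:8s} RMSD {rmsd:.2f}")
```

Five of the nine residues (four His/Glu pairs plus Asp) keep the same residue type, and those five pairs also have the lower RMSD values in the table, while the non-conserved pairs Leu 81/Glu 73 and Trp 319/Ser 283 have the highest.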
The residues involved in the binding environment for Iron (Fe) in the E. coli cytosine deaminase are: