Now it's your turn to do some model validation! Below are a number of exercises (some requiring knowledge of protein crystallography) from which you can choose a couple.
Q. 17. Of all the X-ray crystal structures of proteins released in the period 1990-1995 that are still in the PDB, and with resolution between 2.5 and 3.0 Å, which one is the best and which one is the worst? (Hint: use the PDBe search facility to select all X-ray protein crystal structures that satisfy the resolution and release date criteria, and sort by quality.)
Q. 18. What is your opinion of the quality of PDB entry 2GN5?
Q. 19. PDB entry 1RIP is an NMR structure. What is your impression of its quality?
Q. 20. 2HHB, 3HHB and 4HHB are all crystal structures of human hemoglobin. In fact, all three structures were derived from the same crystallographic data. Nevertheless, the quality of the three entries differs rather dramatically. Compare and contrast the three entries and discuss their quality. How would you rank the three models?
Q. 21. In 1993, the 1.74 Å structure of a complex of a mutant of intestinal fatty-acid binding protein (IFABP) with oleic acid was reported (reference). The density for the carboxylate group was ambiguous and the model as deposited in the PDB (1ICN) contains three alternative conformations for this moiety. In a later study, this structure was used by Klebe and co-workers (reference) to validate their docking program and scoring function. The docking calculations indicated that the "observed" binding mode of the oleic acid was not particularly favourable. Instead, their method suggested that a different orientation of the entire ligand (in essence, swapping the head and the tail) would be much more favourable. Inspect the density for the oleic acid ligand in the structure of 1ICN. Is the model with three alternative conformations of the carboxylate group credible in terms of (a) density, and (b) stabilising interactions? Is there support in the density for the alternative orientation, with the oleate's head and tail reversed, and with hydrogen bonds between the carboxylate oxygen atoms and an amide group in the protein? What is your conclusion?
Q. 22. PDB entries 1KEL (solved at 1.9 Å) and 1FL6 (solved at 2.8 Å) both contain a ligand with an excruciatingly long name that we shall refer to as simply AAH. For both these structures, assess how much you trust the (a) presence, (b) orientation, (c) conformation, and (d) coordinate precision of the AAH ligand.
Q. 23. Quite a few structures contain one or a few D-amino acids. These may be either genuine D-amino acids or artefacts due to model-building or refinement errors. PDB entry 1A7S is a 1.1 Å structure, in which valine 50 is a D-amino acid. Is this a genuine D-amino acid or an artefact? And how about residue E115 in the 2 Å structure with PDB code 1AN1?
Q. 24. Read this short paper (2 pages) if you have access to it. Describe in your own words what the authors are trying to say. Confirm your suspicions by inspecting the electron density in the binding site of PDB entry 2GWX and by comparing it to that in entry 2BAW.
Q. 25. There are hundreds of structures for hen egg-white lysozyme in the PDB. Which one has the highest quality according to the PDBe search facility? Is this also the one with the highest resolution data?
Q. 26. There are modified amino acids in which the hydrogen atom that is normally attached to the alpha carbon atom has been replaced by a methyl group. An example is shown in the figure above. Please answer the following questions: (a) what is the common amino acid from which this 2-methyl-variant is derived? (b) is it D or L? (c) what is its three-letter code? (d) how many PDB entries contain this modified amino acid? (e) can you think of another name for 2-methyl-glycine? (f) can you find another type of 2-methyl-variant of a regular amino acid that occurs in the PDB? (g) do you expect 2-methylated amino acids to have larger, smaller or roughly the same favourable regions in the Ramachandran plot?
An additional fun exercise is to look up all the structures of your instructor, professor, best friend, worst enemy, neighbour, colleague or yourself. Identify which of their structures is listed first when you sort the search results by descending quality, then look at the model and the density in detail and see if you can find aspects that could be modelled better. Finally, discuss your findings with the author of the structure...
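If you would rather script the kind of selection described in Q. 17 than click through a search page, the logic (filter on method, resolution and release year, then sort by quality) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` record, the `select_and_rank` helper and the `quality` score are hypothetical names invented here, not part of any PDBe interface, and the toy records are made up and do not correspond to real PDB entries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical record for one entry (not a real PDBe data structure)."""
    code: str                    # placeholder identifier, not a real PDB code
    method: str                  # e.g. "X-ray" or "NMR"
    resolution: Optional[float]  # in Angstrom; None if not applicable
    release_year: int
    quality: float               # assumed score: higher means better


def select_and_rank(entries: List[Entry],
                    min_res: float = 2.5, max_res: float = 3.0,
                    first_year: int = 1990, last_year: int = 1995) -> List[Entry]:
    """Keep X-ray entries inside the resolution and release-year windows,
    then sort them from best to worst by the assumed quality score."""
    selected = [
        e for e in entries
        if e.method == "X-ray"
        and e.resolution is not None
        and min_res <= e.resolution <= max_res
        and first_year <= e.release_year <= last_year
    ]
    return sorted(selected, key=lambda e: e.quality, reverse=True)


# Toy records, invented purely for illustration.
toy = [
    Entry("toy1", "X-ray", 2.8, 1992, 0.35),
    Entry("toy2", "X-ray", 2.6, 1994, 0.80),
    Entry("toy3", "NMR", None, 1993, 0.90),   # excluded: not X-ray
    Entry("toy4", "X-ray", 1.9, 1991, 0.95),  # excluded: resolution out of range
    Entry("toy5", "X-ray", 3.0, 1996, 0.60),  # excluded: released too late
    Entry("toy6", "X-ray", 2.5, 1990, 0.10),
]

ranked = select_and_rank(toy)
best, worst = ranked[0], ranked[-1]
print("best:", best.code, "worst:", worst.code)
```

Whatever quality measure you end up using, remember that a single number is only a starting point: the best and worst entries in such a ranking still deserve a look at the model and the density.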
Well, that’s all! Hopefully you found this practical informative.
During this practical you may have gotten the impression that most protein models are wrong or at least seriously flawed. This is of course not the case: the large majority of models are quite acceptable, although small errors may remain even in high-resolution models, and subjective choices are involved in model building and refinement even at high resolution. However, for teaching purposes poor models are more interesting than essentially correct ones!
If you have any feedback please use the Feedback button near the top of this page or email me directly (gerard@ebi.ac.uk). Thanks!