Metrics based on stereochemistry

Beyond fitting the experimental data, a good model must also adhere to basic chemical and physical laws governing molecular geometry and interactions. These metrics assess the stereochemistry and geometry of the model.

Regardless of the method of structure determination, the shapes and interactions of biomolecules are defined by (a) the chemical properties of atoms in the molecule and (b) how these atoms are positioned in 3D. The 3D model geometry is used to determine the nature of covalent and non-covalent interactions between atoms. Distortions in any of the following metrics are worth noting, since they may indicate limitations in the model.

Ramachandran plot

The Ramachandran plot is a fundamental tool for assessing the stereochemical quality of a protein backbone. It visualises the distribution of the two main dihedral angles of each amino acid residue’s backbone: phi (φ) and psi (ψ). These angles define the conformation of the polypeptide backbone around the alpha-carbon atom.

By plotting the φ and ψ angles for every residue (except the first and last) in the protein chain, the plot shows which combinations of these angles are sterically and energetically favoured or disallowed based on calculations from well-defined, high-resolution small molecule structures.
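As an illustration of how the plotted angles can be obtained from atomic coordinates, the following is a minimal sketch and not the procedure used by any particular validation tool. It assumes NumPy is available and that the backbone atom coordinates have already been extracted; the function names `dihedral` and `phi_psi` are hypothetical. φ is taken as the torsion C(i-1), N(i), CA(i), C(i) and ψ as the torsion N(i), CA(i), C(i), N(i+1).

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1              # vector from the second point back to the first
    b1 = p2 - p1              # central bond
    b2 = p3 - p2              # vector from the third point to the fourth
    b1 = b1 / np.linalg.norm(b1)
    # Remove the components parallel to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def phi_psi(prev_c, n, ca, c, next_n):
    """Return (phi, psi) for one residue from five backbone atom positions."""
    phi = dihedral(prev_c, n, ca, c)   # needs C of the preceding residue
    psi = dihedral(n, ca, c, next_n)   # needs N of the following residue
    return phi, psi
```

Because φ needs the preceding residue's C atom and ψ needs the following residue's N atom, the first and last residues of a chain cannot be given a complete (φ, ψ) pair, which is why they are left off the plot.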

Residues falling in the “favoured” or “allowed” regions of the plot generally indicate a stereochemically plausible backbone conformation. Outliers, which are residues falling in “disallowed” regions, suggest potential errors in the model’s backbone trace or significant strain. While some residues in functional sites or intrinsically flexible regions can genuinely have unusual backbone conformations, each outlier should ideally be supported by experimental data. To verify support, one would examine the electron density map in the region of the outlier to see if the density clearly supports the unusual conformation, or if the model appears forced into that position without sufficient experimental evidence. Strained and outlier residues can, in fact, sometimes highlight functionally important regions, but this must be experimentally verified.

General case Ramachandran plot (PDB ID: 1cbs, generated using MolProbity).

Note: Glycine is unique among amino acids because its side chain (R-group) consists solely of a single hydrogen atom. The absence of a bulky side chain gives it much greater conformational freedom than other residues, allowing it to occupy regions of the Ramachandran plot that would be sterically unfavourable (disallowed) for others. Proline is another special case: its side chain is cyclic and is bonded back to the backbone nitrogen, which restricts its phi angle (φ) to a limited range, often placing it in a specific corner of the plot. Separate statistics and plots are often shown for glycine and proline residues.

Glycine Ramachandran plot (PDB ID: 1cbs, generated using MolProbity). The MolProbity analysis states that 97.8% (132/135) of all residues were in favoured (98%) regions, and 100.0% (135/135) of all residues were in allowed (>99.8%) regions, with no outliers. Visually, the plot does not trigger any major alarm.

The Ramachandran outliers metric assesses the protein backbone conformation. The PDB slider (to learn more about this, see section: Validation slider) reports the percentage of amino acid residues whose backbone angles (φ and ψ) fall into “disallowed” regions of the Ramachandran plot (combinations of angles that are physically unfavourable due to atoms clashing).
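The percentages themselves are simple counts over the residues. The sketch below is illustrative only: it assumes each residue has already been assigned a category by a validation tool, and the function name `ramachandran_summary` and the label strings are hypothetical. The input reproduces the 1cbs counts quoted above (132 of 135 residues favoured, the remaining 3 allowed, none disallowed).

```python
from collections import Counter

def ramachandran_summary(labels):
    """Percentage of residues in each category, rounded to one decimal place."""
    total = len(labels)
    if total == 0:
        raise ValueError("no residues to summarise")
    counts = Counter(labels)
    return {category: round(100.0 * n / total, 1) for category, n in counts.items()}

# Counts taken from the MolProbity analysis of 1cbs quoted in the text
labels = ["favoured"] * 132 + ["allowed"] * 3
summary = ramachandran_summary(labels)
outlier_percentage = summary.get("outlier", 0.0)  # no residue carries this label
```

Note that MolProbity reports the “allowed” figure cumulatively (favoured plus allowed), whereas this sketch counts each category separately.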

A lower percentage of Ramachandran outliers is better. While some outliers can be genuine features of the structure, they often highlight potential errors, especially if not supported by strong experimental data. This information can also be visualised directly on the 3D structure itself, often via the “Model quality” tab within a PDB entry viewer. To see how to do this, go to the tutorial Hands-On: Visualising validation metrics on structure.

Sidechain outliers (rotamers)

Similar to the protein backbone adopting preferred Ramachandran angles, the amino acid side chains tend to adopt specific 3D arrangements of their atoms, called rotamers, that are energetically favoured.

This metric reports the percentage of side chains that are in unusual conformations compared to libraries (based on high-quality PDB entries) of favoured rotamers. This can indicate potential modelling errors, particularly at lower resolutions where side chain positions are less constrained by the experimental data. 

A lower percentage is better. Like Ramachandran outliers, genuinely unusual rotamers in functional regions should be supported by experimental data.

Clashscore

The clashscore measures how many atoms in the model are packed too closely together, violating the basic physical principle that two atoms cannot occupy the same space at the same time. It is reported as the number of clashes per 1000 atoms, which gives a single number summarising the overall density of clashes in the model.

A lower clashscore is better, indicating fewer steric clashes and a more physically plausible packing of atoms throughout the structure. A high clashscore suggests significant geometric problems throughout the model.
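The normalisation per 1000 atoms can be written out as a short calculation. This is a sketch of the arithmetic only; it assumes the clashes have already been identified by validation software, and the function name `clashscore` and the example numbers are hypothetical.

```python
def clashscore(n_clashes, n_atoms):
    """Number of clashes per 1000 atoms in the model."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return 1000.0 * n_clashes / n_atoms

# Hypothetical example: 12 clashes found in a model containing 2400 atoms
example_score = clashscore(12, 2400)  # 5.0
```

Normalising by the number of atoms is what makes the score comparable between small and large models.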

RNA Backbone (for RNA structures)

Just as the protein backbone is described by the Ramachandran angles, the backbone of an RNA chain is described by a series of torsion angles (α, β, γ, δ, ε, ζ), together with the χ angle describing the orientation of the base. These angles also adopt preferred conformations.

Validation tools assess the torsion angles of individual RNA residues against statistically favoured values derived from known RNA structures (Richardson et al., 2008). A global metric, often reported as an average score or a percentage of outliers for the entire RNA chain(s), then summarises the overall plausibility of the RNA backbone conformation.