
What to look for in the validation report

Each validation report is tailored to the experimental method used (X-ray, cryo-EM, or NMR), but generally includes the following sections:

Overall quality at a glance

This section provides a succinct “executive summary” of key global quality indicators, similar to the “Validation Slider” discussed previously. It presents a graphical overview of important metrics (like R-free, Clashscore, Ramachandran outliers, Sidechain outliers, and RSRZ outliers for X-ray, or their equivalents for cryo-EM/NMR), comparing the current structure to all other entries in the PDB, and to those determined by the same technique or at a similar resolution. This is your first quick check for serious overall issues.
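The comparison against other PDB entries can be pictured as a percentile rank. The snippet below is a minimal sketch, not the wwPDB implementation: it assumes the rank is the percentage of reference entries that score equal to or worse than the current structure, and the function name `percentile_rank` and the reference values are hypothetical, made up purely for illustration.

```python
def percentile_rank(value, reference, lower_is_better=True):
    """Percentage of reference entries that are equal to or worse than `value`.

    Assumption of this sketch: "worse" means a higher value for metrics such as
    Clashscore or R-free, where lower numbers indicate better quality.
    """
    if not reference:
        raise ValueError("reference set must not be empty")
    if lower_is_better:
        worse_or_equal = sum(1 for r in reference if r >= value)
    else:
        worse_or_equal = sum(1 for r in reference if r <= value)
    return 100.0 * worse_or_equal / len(reference)


# Toy reference set of clashscores (illustrative values only, not real PDB data).
toy_clashscores = [2.0, 4.0, 5.0, 8.0, 12.0, 20.0, 35.0, 50.0]

# A structure with clashscore 8.0 is equal to or better than 5 of the 8 toy entries.
rank = percentile_rank(8.0, toy_clashscores)
print(f"Percentile rank: {rank:.1f}")
```

A high rank means the structure compares well with the reference set. Running the same calculation against a subset (for example, entries at a similar resolution) gives the second, technique-specific comparison mentioned above.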

Entry composition

This section summarises the unique molecules in the entry (proteins, nucleic acids, ligands, water) and how they are modelled, including the number of atoms and residues. For X-ray reports, it also notes the number of atoms modelled with zero occupancy and the number of residues with alternative conformations (both discussed in Section 3). For NMR, it details the number of unique molecules and instances, and whether any chains are modelled with a reduced set of atoms (“Trace”).

Residue-property plots

These plots provide a sequence-centric graphical summary of quality information on a per-residue basis for all polymeric chains (protein, DNA, RNA). They are crucial for quickly identifying problematic regions in the linear sequence. Each chain has two graphics. The first summarises the proportions of the various outlier classes displayed in the second. The second shows the sequence itself, annotated with geometry issues and with the fit of each residue to the experimental data (such as all-atom inclusion for cryo-EM).

Residues in the sequence view are colour-coded as follows:

  • Green: Residues with no geometric quality outliers.
  • Yellow, orange, red: Indicate residues that are outliers on 1, 2, or 3 or more types of geometric quality criteria, respectively (e.g., unusual bond lengths, too-close contacts).
  • Red dot (X-ray) / Red diamond (cryo-EM): Marks residues with a poor fit to the experimental data (an RSRZ outlier against the electron density for X-ray, or low all-atom inclusion in the map for cryo-EM).
  • Cyan segment (NMR): Indicates residues classified as “ill-defined” within the NMR ensemble (e.g., highly flexible regions that do not adopt a single well-defined conformation).
  • Grey segment: Represents residues present in the sample but not modelled in the final structure.

Example of a residue-property plot for a cryo-EM structure (PDB ID: 6VJA, Chain C). The coloured segments show the geometric quality of each residue: green for no outliers, and yellow, orange, or red for an increasing number of outlier types. The red diamonds above certain residues mark a poor fit to the EM map (all-atom inclusion < 40%). Stretches of 2 or more consecutive residues without any outlier are shown as a green connector. Residues present in the sample, but not in the model, are shown in grey.

Example of a residue-property plot for an X-ray structure with issues (PDB ID: 1FCC, Chain B). The presence of yellow, orange, and red segments, along with a red bar above, indicates residues with geometric problems and/or poor fit to electron density (RSRZ outliers).

Example of a residue-property plot for an NMR structure (PDB ID: 6QWR, Chain A). This plot includes green segments (no geometric outliers), yellow segments (geometric outliers), and, importantly, cyan segments indicating “ill-defined” regions within the NMR ensemble.

In general, the less red, orange, yellow, and grey these plots contain, the better. It is important to realise that a residue flagged as an outlier on one or more model-validation criteria may be an error in the model, or it may reflect a genuine feature of the structure. Careful analysis of the experimental data is typically required to tell the two apart. Outlier residues that are important for structure or function (e.g., catalytic residues, interface residues, ligand-binding residues) should be inspected especially carefully.
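The colour scheme and the advice on functionally important residues can be expressed as a small lookup. This is a minimal sketch under stated assumptions: the function names, the input dictionary, and the residue numbers are hypothetical and are not taken from any validation report or wwPDB tool. Treating counts above 3 as red, and giving grey and cyan priority over the geometry colours, are also assumptions of the sketch.

```python
GEOMETRY_COLOURS = ["green", "yellow", "orange", "red"]


def residue_colour(n_outlier_types, modelled=True, ill_defined=False):
    """Map one residue to the colour used in a residue-property plot."""
    if not modelled:
        return "grey"   # present in the sample but not in the model
    if ill_defined:
        return "cyan"   # NMR only: ill-defined within the ensemble
    if n_outlier_types < 0:
        raise ValueError("number of outlier types cannot be negative")
    # Assumption: anything beyond 3 outlier types is still drawn in red.
    return GEOMETRY_COLOURS[min(n_outlier_types, 3)]


def residues_to_inspect(outlier_counts, functional_residues):
    """Return functionally important residues that carry at least one outlier."""
    return sorted(
        res for res in functional_residues if outlier_counts.get(res, 0) > 0
    )


# Hypothetical per-residue outlier counts, keyed by residue number.
toy_counts = {10: 0, 11: 1, 12: 3, 57: 2, 102: 0}
# Hypothetical set of catalytic or ligand-binding residues.
toy_functional = {57, 102, 195}

for number, count in toy_counts.items():
    print(number, residue_colour(count))
print("Inspect first:", residues_to_inspect(toy_counts, toy_functional))
```

Such a filter only prioritises where to look. Whether a flagged residue is a modelling error or a genuine feature still has to be judged against the experimental data.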

Model quality

This section details the geometric quality of the model, breaking down the global metrics into specific issues.

  • Standard geometry: provides tables summarising bond lengths, bond angles, chirality, and planarity deviations. These tables give the RMSZ (root-mean-square Z-score) values and the number of outliers (deviations outside a statistically expected range). Individual outliers (e.g., a specific bond that is too long) are listed in sub-tables.
  • Too-close contacts (Clashscore details): this section lists every identified atomic clash (a pair of atoms modelled closer together than is physically plausible). The “clashscore” is a global summary, but here you see the individual problematic contacts. It’s worth noting that MolProbity’s clashscore calculation adds hydrogen atoms to the structure before analysis, which helps identify potential improvements even if hydrogen atoms weren’t explicitly modelled in the original structure.
  • Torsion angles (Protein backbone, protein sidechains, RNA): provides detailed summaries and lists of outliers for Ramachandran angles, rotamers (protein sidechains), and RNA backbone torsion angles, supporting the global outlier percentages.
  • Ligand geometry: this is particularly important for small molecules. It assesses bond lengths, angles, chirality, torsions, and rings of bound ligands against the Cambridge Structural Database (CSD) of small molecule organic structures.
    • If you are using information about a small molecule ligand in a PDB entry to make functional conclusions about its interactions, then it is important to check the quality of that specific ligand in the validation report. The report will also highlight if a ligand is a “Ligand Of Interest (LOI),” meaning it was specifically flagged by the authors during deposition as central to their research.
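To make the two summary numbers in this section more concrete, the sketch below computes an RMSZ value from individual Z-scores and a clashscore from a clash count. It is an illustration under stated assumptions rather than the code behind the report: the function names and toy values are hypothetical, the clashscore is taken to be the number of clashes per 1000 atoms, and the |Z| > 5 outlier cut-off is assumed here for the example.

```python
import math


def z_score(observed, ideal, sigma):
    """How many standard deviations an observed value lies from its ideal value."""
    return (observed - ideal) / sigma


def rmsz(z_scores):
    """Root-mean-square of a list of Z-scores."""
    if not z_scores:
        raise ValueError("need at least one Z-score")
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))


def clashscore(n_clashes, n_atoms):
    """Clashes per 1000 atoms (assumed definition for this sketch)."""
    return 1000.0 * n_clashes / n_atoms


# Toy bond lengths as (observed, ideal, sigma) in angstroms; illustrative only.
toy_bonds = [(1.53, 1.52, 0.02), (1.33, 1.34, 0.01), (1.65, 1.52, 0.02)]
zs = [z_score(obs, ideal, sigma) for obs, ideal, sigma in toy_bonds]

OUTLIER_CUTOFF = 5.0  # assumed threshold on |Z| for flagging an outlier
outliers = [z for z in zs if abs(z) > OUTLIER_CUTOFF]

print(f"RMSZ: {rmsz(zs):.2f}, outliers: {len(outliers)}")
print(f"Clashscore: {clashscore(12, 3000):.1f}")
```

In this toy set only the third bond is flagged, yet it dominates the RMSZ. This is why the report gives both the global value and the list of individual outliers.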