Superimposition and batch downloads added to the PDBe-KB aggregated views

27 August 2020

We have introduced some major changes to the PDBe-KB aggregated views of proteins to allow easier in-depth analysis and comparison of protein structures. These changes include comprehensive batch file download options, superimposition and clustering of protein structures, and a raft of other features previously only available through the PDBe-KB COVID-19 portal.

We have introduced a process that superimposes structures for a given protein in the PDB based on their structural similarity. This enables users of our PDBe-KB protein aggregated views to easily identify unique structural conformations for each discrete section (“segment”) of the sequence for a given UniProt entry. It also allows easy visualisation of all ligands that interact with this segment of the protein. These structural clusters can be easily displayed in your browser using the Mol* visualisation software.

The superimposition process is run weekly on each discrete sequence range (“segment”) in the PDB for a specific UniProt entry. Each PDB chain in this segment is structurally aligned using the GESAMT software, and the aligned structures are then clustered based on their structural similarity. Each cluster therefore represents a distinct conformation of that region of the protein. A single, best representative structure is selected for each cluster. Any bound ligands are also aligned within the clusters, enabling highlighting of common ligand binding sites for the segment. For more information about this clustering process please visit the documentation page.
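To make the idea of grouping chains into conformational clusters more concrete, here is a minimal Python sketch. It is not the PDBe-KB pipeline: the clustering rule (single linkage with a fixed cutoff), the way the representative is picked (the member closest to all others), the chain labels and the dissimilarity values are all assumptions made for illustration. In practice the pairwise scores would come from a structural aligner such as GESAMT; here they are simply hard-coded.

```python
from itertools import combinations


def cluster_chains(chains, distance, cutoff):
    """Single-linkage clustering: chains within `cutoff` of each other share a cluster."""
    parent = {c: c for c in chains}

    def find(c):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(chains, 2):
        if distance(a, b) <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for c in chains:
        clusters.setdefault(find(c), []).append(c)
    return list(clusters.values())


def representative(cluster, distance):
    """Pick the member with the smallest total distance to the rest of its cluster."""
    return min(cluster, key=lambda c: sum(distance(c, other) for other in cluster))


# Hypothetical pairwise dissimilarities between four chains of one segment.
scores = {
    frozenset(("chain_1", "chain_2")): 0.4,
    frozenset(("chain_1", "chain_3")): 0.6,
    frozenset(("chain_2", "chain_3")): 0.5,
    frozenset(("chain_1", "chain_4")): 3.1,
    frozenset(("chain_2", "chain_4")): 3.0,
    frozenset(("chain_3", "chain_4")): 2.9,
}


def distance(a, b):
    return 0.0 if a == b else scores[frozenset((a, b))]


chains = ["chain_1", "chain_2", "chain_3", "chain_4"]
clusters = cluster_chains(chains, distance, cutoff=1.0)
representatives = [representative(cluster, distance) for cluster in clusters]
```

With these made-up values, the first three chains fall into one cluster and the fourth forms a cluster of its own, which mirrors the idea that each cluster stands for one distinct conformation of the segment.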



This image shows the ligand superimposition results for the COVID-19 main protease. Each ligand found bound to the COVID-19 main protease is superimposed and displayed in pink.


These superimposed views are accessible from the PDBe-KB aggregated views of proteins using three separate buttons in the summary section. Superimposed structures can be displayed for each individual segment by clicking the blue ‘view structure clusters for segment N’ (where N represents the segment number) button under the interactive component at the top-right of the page. Additionally, the button under the ‘structures’ icon also opens the superimposition, allowing you to browse the different segments. Furthermore, to display the position of all the bound ligands in the Mol* 3D viewer, you can select the blue ‘view all ligands’ button under the ‘ligands’ icon. This displays all the bound ligands, along with the representative structures for each cluster, highlighting the position of these ligands across all conformations of that protein region in the PDB.



This image shows the summary section for the COVID-19 main protease. The blue buttons enable users to access superimposition views for each segment and to display ligand binding positions. The green buttons enable download of files for each section of the page.


This update to the PDBe-KB aggregated views also introduces a new batch download service, allowing users to download all files for a specific subset of entries. Users accessing the PDBe-KB aggregated views page for a specific protein can select the green ‘download’ buttons to access files for structures, sequences or validation data. The download options include all structures for a UniProt entry, all structures containing ligands, all macromolecular complexes, and all structures for proteins with at least 90% sequence similarity. Furthermore, in the Ligands and Interactions section it is possible to download all files for structures that have a specific ligand bound or that interact with a certain macromolecule.

In addition to these new features, we have also integrated a number of features into all PDBe-KB aggregated views that were previously only available in the PDBe-KB COVID-19 portal. This includes better handling of viral polyproteins, with individual sub-pages for the mature proteins. The ‘similar proteins’ section has been expanded beyond UniRef90 clusters to include all PDB chains that have at least 90% similarity. There is also additional highlighting of molecules classified as antibodies or containing PRD (Peptide Reference Dictionary) annotations, providing clearer information about interactions between these and the protein of interest.

Finally, we have now introduced a changelog to the PDBe-KB aggregated views. As we continue to update these pages to make them more useful for our users, this will help to highlight recent changes and give more clarity about new features that have been introduced. You can view the changelog by hovering over the ‘What’s new?’ text at the top of any PDBe-KB aggregated views page or by visiting the PDBe-KB aggregated views of proteins landing page.

We would love to hear your feedback on these pages so that we can continue to improve our services. Just press the ‘feedback’ button on any PDBe or PDBe-KB page and let us know what you think - we look forward to hearing from you!