Volume matching software
A complete procedure for volume matching requires several steps:
- Pre-processing of the volume data, including bandpass filtering, dust removal and determination of the optimum contour level
- Alignment of volumes against each other, using a search over relative rotations and translations
- Re-scoring of alignments
- Presentation and download of results, including postprocessing steps such as generation of difference maps
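As an illustration of the pre-processing step, the sketch below applies a bandpass filter in Fourier space and then zeroes voxels below a contour level. It is a minimal NumPy example, not the SMaSB implementation: the function names `bandpass_filter` and `mask_below_contour`, their parameters and the hard-edged frequency mask are all assumptions made for illustration.

```python
import numpy as np


def bandpass_filter(volume, voxel_size, low_res_cutoff, high_res_cutoff):
    """Keep only features between two resolution limits.

    Hypothetical helper: the cutoffs use the same length unit as voxel_size.
    low_res_cutoff is the larger number (coarse features), high_res_cutoff
    the smaller one (fine features).
    """
    # Spatial frequency (1 / length) along each axis of the Fourier transform.
    axes = [np.fft.fftfreq(n, d=voxel_size) for n in volume.shape]
    fx, fy, fz = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(fx**2 + fy**2 + fz**2)

    # Hard-edged pass band (assumption: a real pipeline may soften the edges).
    passband = (radius >= 1.0 / low_res_cutoff) & (radius <= 1.0 / high_res_cutoff)

    filtered = np.fft.ifftn(np.fft.fftn(volume) * passband)
    return filtered.real


def mask_below_contour(volume, contour_level):
    """Set every voxel below the chosen contour level to zero."""
    return np.where(volume >= contour_level, volume, 0.0)
```

A call such as `bandpass_filter(volume, voxel_size=1.0, low_res_cutoff=20.0, high_res_cutoff=10.0)` would keep only features between 20 and 10 length units. Dust removal, which discards small disconnected blobs, would additionally need a connected-component analysis and is not shown here.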
We have written such a pipeline, consisting of the underlying software SMaSB and the web service PDBeShape. The pipeline uses third-party programs for individual steps and is written so that it can be updated as new algorithms and software are developed. Programs currently used include:
TEMPy
TEMPy (Template and Electron Microscopy comparison using Python) is a toolkit designed for assessing density fits in intermediate-to-low resolution maps, both globally and locally (Farabella et al. 2015).
Chimera
Chimera is a molecular graphics program that includes many analytical tools (Pettersen et al. 2004). We use its map fit method, available as a command-line Python module, which compares two maps directly. The 6D search over rotations and translations starts from random initial placements of the search map, each followed by a steepest ascent search that maximizes the alignment score. We typically use 200 trial placements.
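The search strategy described here, random starting placements each refined by steepest ascent, can be sketched independently of Chimera. The example below does not use Chimera's API. It works on a toy problem in which the "maps" are two point sets and the score is the negative mean squared distance between them. All function names, the finite-difference gradient, the step-halving rule and the sampling of rotation vectors from a cube (which is not uniform over rotations) are assumptions made for illustration.

```python
import numpy as np


def rotation_matrix(rotvec):
    """Rodrigues formula: rotate about axis rotvec by angle |rotvec| (radians)."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    kx, ky, kz = rotvec / angle
    K = np.array([[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def apply_pose(points, pose):
    """Rotate, then translate. pose = 3 rotation-vector + 3 translation values."""
    return points @ rotation_matrix(pose[:3]).T + pose[3:]


def make_point_score(moving, reference):
    """Toy alignment score (assumption): negative mean squared distance."""
    def score(pose):
        diff = apply_pose(moving, pose) - reference
        return -np.mean(np.sum(diff**2, axis=1))
    return score


def numerical_gradient(score, pose, eps=1e-5):
    """Central finite-difference gradient of the score over the 6 pose values."""
    grad = np.zeros(pose.size)
    for i in range(pose.size):
        delta = np.zeros(pose.size)
        delta[i] = eps
        grad[i] = (score(pose + delta) - score(pose - delta)) / (2.0 * eps)
    return grad


def steepest_ascent(score, start, step=0.5, min_step=1e-6, max_iter=500):
    """Follow the gradient uphill, halving the step whenever it stops helping."""
    pose = np.asarray(start, dtype=float)
    best = score(pose)
    for _ in range(max_iter):
        grad = numerical_gradient(score, pose)
        norm = np.linalg.norm(grad)
        if norm == 0.0:
            break
        direction = grad / norm
        while step >= min_step:
            candidate = pose + step * direction
            candidate_score = score(candidate)
            if candidate_score > best:
                pose, best = candidate, candidate_score
                break
            step *= 0.5
        else:
            break  # no improving step left: treat as converged
    return pose, best


def random_restart_search(score, n_trials, max_angle, max_shift, seed=0):
    """Run steepest ascent from n_trials random poses and keep the best result."""
    rng = np.random.default_rng(seed)
    best_pose, best_score = None, -np.inf
    for _ in range(n_trials):
        start = np.concatenate([
            rng.uniform(-max_angle, max_angle, 3),  # rotation vector
            rng.uniform(-max_shift, max_shift, 3),  # translation
        ])
        pose, value = steepest_ascent(score, start)
        if value > best_score:
            best_pose, best_score = pose, value
    return best_pose, best_score
```

With `score = make_point_score(moving, reference)`, a call such as `random_restart_search(score, n_trials=200, max_angle=np.pi, max_shift=5.0)` mirrors the 200 trial placements mentioned above. Multiple starts matter because a single local search can become trapped in a poor local maximum of the score.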
gmfit
Gmfit relies upon a reduced representation of the map density in terms of a Gaussian Mixture Model (GMM), which is a linear combination of several Gaussian Distribution Functions (GDFs). The GMM enables a quick fit of the density map and the subunit models, based on the overlap of two GMMs. The number of GDFs controls the description of the map; a larger number generates a more detailed density function at the expense of computational time. Overlap between two GMMs is quantified by the fitness energy. The 6D search to align two GMMs is carried out by random sampling, followed by a steepest descent search using the fitness energy gradient (Kawabata 2008).
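To make the idea of GMM overlap concrete, the sketch below computes the overlap integral of two Gaussian mixtures in closed form, using the identity that the integral of the product of two Gaussians with means μa, μb and covariances Σa, Σb equals a Gaussian with covariance Σa + Σb evaluated at μa − μb. This is not gmfit code, and the exact definition of gmfit's fitness energy is not given here; the function names and the representation of a GMM as a list of `(weight, mean, covariance)` tuples are assumptions.

```python
import numpy as np


def gaussian_overlap(mean_a, cov_a, mean_b, cov_b):
    """Integral over all space of the product of two Gaussian densities."""
    cov = cov_a + cov_b
    diff = mean_a - mean_b
    dim = diff.size
    norm = np.sqrt((2.0 * np.pi) ** dim * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm


def gmm_overlap(gmm_a, gmm_b):
    """Overlap of two mixtures: weighted sum over all pairs of components.

    Each GMM is a list of (weight, mean, covariance) tuples (hypothetical layout).
    """
    total = 0.0
    for weight_a, mean_a, cov_a in gmm_a:
        for weight_b, mean_b, cov_b in gmm_b:
            total += weight_a * weight_b * gaussian_overlap(mean_a, cov_a, mean_b, cov_b)
    return total
```

The cost of one evaluation scales with the product of the numbers of components in the two mixtures, not with the number of voxels, which is why a reduced GMM representation allows a quick fit and why adding more GDFs increases computation time.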
Limitations
Volume matching is still computationally intensive, especially when searching over a large set of volumes (e.g. all those contained in the EMDB). The PDBeShape server provides access to a set of pre-calculated volume alignments that can be browsed or searched. The first release contains high-quality volume data for prokaryotic and eukaryotic ribosomes (823 volumes) and for class I and II chaperonins (233 volumes), taken from EMDB and PDB.