What are HMMs?


Hidden Markov models (HMMs) are used by many protein classification databases. Like profiles, they convert multiple sequence alignments into position-specific scoring systems. HMMs are adept at representing amino acid insertions and deletions, which means they can model entire alignments, including divergent regions. They are powerful statistical models that are very well suited to searching databases for homologous sequences.


Figure 17. A hidden Markov model representation of a multiple sequence alignment. As in profiles, the presence of a given amino acid in each position is given a scoring value (M), but in HMMs insertions (I) and deletions (D) are also considered.

Figure 14 Representation of a hidden Markov model based on a multiple sequence alignment. Amino acids are given a score at each position in the sequence alignment according to the frequency with which they occur. Transition probabilities (i.e., the likelihood of moving from one state to the next along the alignment, for example from a match state to an insertion or deletion state) are also modelled, together with the insertion and deletion states themselves.
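
To make the idea of position-specific scoring more concrete, the sketch below estimates match-state scores from a small alignment. It is a simplified illustration rather than the method used by any particular database: the toy alignment, the function names, the 50% gap threshold, the pseudocount of 1 and the uniform background frequency are all assumptions chosen for the example. Transition probabilities between match, insertion and deletion states are not estimated here.

```python
import math
from collections import Counter

# Toy alignment (hypothetical data); '-' marks a gap.
ALIGNMENT = [
    "AC-DE",
    "AC-DE",
    "ACGDE",
    "SC-DE",
]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def match_columns(alignment, max_gap_fraction=0.5):
    """Return the columns treated as match states (M).

    Columns that are mostly gaps are treated as insertions (I) instead.
    The 0.5 threshold is an assumption made for this example.
    """
    n_seqs = len(alignment)
    columns = []
    for i in range(len(alignment[0])):
        gaps = sum(seq[i] == "-" for seq in alignment)
        if gaps / n_seqs < max_gap_fraction:
            columns.append(i)
    return columns


def emission_probabilities(alignment, column, pseudocount=1.0):
    """Estimate amino acid probabilities for one match column.

    A pseudocount is added so that unobserved residues keep a small,
    non-zero probability.
    """
    counts = Counter(seq[column] for seq in alignment if seq[column] != "-")
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log_odds(probability, background=1 / len(AMINO_ACIDS)):
    """Score a residue relative to a uniform background (an assumption)."""
    return math.log2(probability / background)


cols = match_columns(ALIGNMENT)
profile = [emission_probabilities(ALIGNMENT, c) for c in cols]

# Print the best-scoring residue for each match position.
for position, probs in zip(cols, profile):
    best = max(probs, key=probs.get)
    print(position, best, round(log_odds(probs[best]), 2))
```

In this toy alignment the third column is mostly gaps, so it is treated as an insertion rather than a match position, which is the kind of situation the I states in the figure are meant to capture.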

HMMs have wide utility, as is clear from the numerous databases that use this method for protein classification, including Pfam, SMART, TIGRFAM, PIRSF, PANTHER, SFLD, Superfamily and Gene3D.



For more information on HMMs, see Profile hidden Markov models (Eddy SR, 1998).