The Different Versions of the Dayhoff Rate Matrix
Phylogenetic inference methods require Markov models of sequence
evolution expressed in terms of instantaneous rate matrices (Q),
but some models, most notably the PAM model of Dayhoff et al.
(1978), were originally published only in terms of probability matrices
(P). Previous methods for deriving Q from
P, based on inverting the relationship P(t) =
e^{tQ}, have used eigen-decomposition of Dayhoff
et al.'s PAM1 matrix. We have found that PAM1 is not a
close-enough approximation to the required matrix P to ensure
convergence of the estimates of the elements of Q.
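
To make the relationship P(t) = e^{tQ} concrete, the following is a minimal sketch of the generic eigen-decomposition inversion on a toy example. It is not the method introduced in the paper, and it does not use Dayhoff et al.'s data: the 3-state rate matrix, the value of t, and the function names `rate_to_prob` and `prob_to_rate` are all hypothetical choices made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def rate_to_prob(Q, t):
    """Return P(t) = exp(tQ), computed by eigen-decomposition of Q."""
    eigvals, V = np.linalg.eig(Q)
    P = V @ np.diag(np.exp(t * eigvals)) @ np.linalg.inv(V)
    return P.real

def prob_to_rate(P, t):
    """Invert P(t) = exp(tQ): Q = log(P) / t, by eigen-decomposition of P."""
    eigvals, V = np.linalg.eig(P)
    # Cast to complex so the principal logarithm is defined for every eigenvalue.
    log_eigvals = np.log(eigvals.astype(complex))
    Q = V @ np.diag(log_eigvals / t) @ np.linalg.inv(V)
    return Q.real

# Hypothetical 3-state rate matrix (illustration only, not Dayhoff data):
# off-diagonal entries are non-negative and each row sums to zero.
Q_true = np.array([[-0.3,  0.2,  0.1],
                   [ 0.1, -0.4,  0.3],
                   [ 0.2,  0.2, -0.4]])
t = 0.01  # assumed small evolutionary distance

P = rate_to_prob(Q_true, t)   # exact probability matrix for this Q and t
Q_est = prob_to_rate(P, t)    # recovered rate matrix
```

In this sketch P is exactly e^{tQ}, so the inversion recovers Q up to floating-point error. The difficulty described above arises when the input matrix only approximates the required P.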
We have written a paper (available below) which describes the above
findings, and introduces two simple methods to derive Q from the
information published by Dayhoff et al. which require neither
eigen-decomposition nor consideration of the limit t → 0. We identify the methods used to
derive various existing implementations of the Dayhoff matrix in current
software, and analyze 200 protein sequence alignments to test these
against the two new methods. We conclude with the recommendation that
one of the new methods, denoted DCMut, be used as a 'standard'
implementation of the Dayhoff et al. (1978) model, to facilitate
agreement amongst scientists using supposedly identical methods. Files
described in the paper giving our implementation of this model (and
others) are available below.
Our paper has been accepted for publication in Molecular Biology and
Evolution and is available for download in PDF format.
Files described in the report (all in a format suitable for inclusion in
the PAML program Codeml):