GOLDMAN GROUP
 
  


The Different Versions of the Dayhoff Rate Matrix

Phylogenetic inference methods require Markov models of sequence evolution expressed in terms of instantaneous rate matrices (Q), but some models, most notably the PAM model of Dayhoff et al. (1978), were originally published only in terms of probability matrices (P). Previous methods for deriving Q from P, based on inverting the relationship P(t) = e^{tQ}, have used eigen-decomposition of Dayhoff et al.'s PAM1 matrix. We have found that PAM1 is not a close-enough approximation to the required matrix P to ensure convergence of the estimates of the elements of Q.
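
As a minimal illustration of the relationship P(t) = e^{tQ} and of its inversion by eigen-decomposition, the following Python sketch uses a made-up 3-state reversible rate matrix (not Dayhoff et al.'s data). The function names transition_matrix and rate_matrix_from_p, the toy exchangeabilities and the toy frequencies are all assumptions chosen for illustration; the sketch does not reproduce any published implementation.

```python
import numpy as np


def transition_matrix(Q, t):
    """Compute P(t) = exp(tQ) by eigen-decomposition of Q."""
    evals, V = np.linalg.eig(Q)
    return (V @ np.diag(np.exp(t * evals)) @ np.linalg.inv(V)).real


def rate_matrix_from_p(P, t):
    """Invert P(t) = exp(tQ), i.e. Q = log(P) / t, by eigen-decomposition of P."""
    evals, V = np.linalg.eig(P)
    # Cast to complex so the logarithm is defined even if rounding
    # produces a tiny negative or complex eigenvalue.
    log_evals = np.log(evals.astype(complex))
    return (V @ np.diag(log_evals / t) @ np.linalg.inv(V)).real


# Toy reversible model (illustrative values only): Q_ij = S_ij * pi_j
S = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])   # symmetric exchangeabilities
pi = np.array([0.5, 0.3, 0.2])    # equilibrium frequencies
Q = S * pi                        # off-diagonal rates
np.fill_diagonal(Q, -Q.sum(axis=1))  # each row of Q sums to zero

t = 0.01
P = transition_matrix(Q, t)
Q_recovered = rate_matrix_from_p(P, t)
print("largest absolute error in recovered Q:", np.abs(Q_recovered - Q).max())
```

The round trip works here because P is computed exactly from a known Q. When the input is only an approximation to P(t), as described above for PAM1, the recovered Q inherits that approximation error.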

We have written a paper (available below) which describes the above findings, and introduces two simple methods to derive Q from the information published by Dayhoff et al. which require neither eigen-decomposition nor consideration of the limit t → 0. We identify the methods used to derive various existing implementations of the Dayhoff matrix in current software, and analyze 200 protein sequence alignments to test these against the two new methods. We conclude with the recommendation that one of the new methods, denoted DCMut, be used as a 'standard' implementation of the Dayhoff et al. (1978) model, to facilitate agreement amongst scientists using supposedly identical methods. Files described in the paper giving our implementation of this model (and others) are available below.

Our paper has been accepted for publication in Molecular Biology and Evolution and is available for download in PDF format.

Files described in the paper (all in a format suitable for inclusion in the PAML program Codeml):

Implementations of the Dayhoff (1978) matrix:
  Recommended:
    dayhoff-dcmut.dat: new implementation based on Dayhoff et al.'s raw data and amino acid mutabilities
  Others:
    dayhoff-dcfreq.dat: new implementation based on Dayhoff et al.'s raw data and amino acid frequencies
    dayhoff-paml.dat: implementation used in the PAML software
    dayhoff-kmh.dat: uses the Kishino et al. (1990) method, but without rounding
    dayhoff-molphy.dat: implementation used in the MOLPHY and TREE-PUZZLE software
    dayhoff-proml.dat: implementation used in the PHYLIP software

Implementation of the JTT (1992) matrix:
    jtt-dcmut.dat: our new, recommended implementation
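
For readers who want to use such a file outside PAML, the following Python sketch shows one way to turn its contents into a normalized rate matrix. It assumes the usual layout of PAML amino acid .dat files: 190 symmetric exchangeabilities given as a lower triangle, followed by 20 equilibrium frequencies, with any free-text notes after the numbers. It also assumes the convention Q_ij = S_ij * pi_j. The function names parse_paml_dat and build_rate_matrix are hypothetical, and the layout should be checked against the actual files before relying on this.

```python
import numpy as np


def parse_paml_dat(text, n=20):
    """Read n*(n-1)/2 lower-triangle exchangeabilities, then n frequencies.

    Assumes the numbers come first in the file; parsing stops at the first
    non-numeric token or once enough numbers have been read.
    """
    n_exch = n * (n - 1) // 2
    needed = n_exch + n
    values = []
    for token in text.split():
        try:
            values.append(float(token))
        except ValueError:
            break  # reached trailing free text
        if len(values) == needed:
            break
    if len(values) < needed:
        raise ValueError(f"expected {needed} numbers, found {len(values)}")

    S = np.zeros((n, n))
    idx = 0
    for i in range(1, n):          # rows 2..n of the lower triangle
        for j in range(i):
            S[i, j] = S[j, i] = values[idx]
            idx += 1
    pi = np.array(values[n_exch:])
    return S, pi / pi.sum()        # renormalize against rounding in the file


def build_rate_matrix(S, pi):
    """Build Q with Q_ij = S_ij * pi_j, zero row sums, and mean rate 1."""
    Q = S * pi                     # scales column j by pi_j
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    mean_rate = -np.dot(pi, np.diag(Q))
    return Q / mean_rate


# Toy 3-state input in the same layout (illustrative values only)
toy_text = "1\n2 3\n\n0.5 0.3 0.2\n\nfree-text notes may follow the numbers"
S_toy, pi_toy = parse_paml_dat(toy_text, n=3)
Q_toy = build_rate_matrix(S_toy, pi_toy)
```

With a downloaded file, the same functions could be applied as build_rate_matrix(*parse_paml_dat(open("dayhoff-dcmut.dat").read())), using the default n=20.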



 

Page maintained by goldman@ebi.ac.uk.  Last updated: 9 December 2003