SLR: Sitewise Likelihood Ratio estimation of selection

About

SLR [1] is a program to detect sites in coding DNA that are unusually conserved and/or unusually variable (that is, evolving under purifying or positive selection) by analysing the pattern of changes for an alignment of sequences on an evolutionary tree. The strength of selection at each site is determined by comparing the rate of nonsynonymous (amino acid changing) substitutions to that of synonymous (silent) substitutions, the latter assumed to be invisible to selection and so evolving in a strictly neutral fashion.
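
The comparison described above is conventionally summarised as a single ratio. The symbols below (omega, d_N, d_S) are the usual notation from the codon-model literature rather than names defined on this page, so treat them as assumed labels:

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying selection (unusually conserved site)} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection (unusually variable site)}
\end{cases}
```

Here d_N is the nonsynonymous rate and d_S the synonymous rate.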

SLR performs an explicit likelihood-ratio test for selection at each site in the alignment, making few assumptions about the distribution of selection and potentially allowing every site to be under a different level of evolutionary constraint. SLR is a direct test of whether a particular site is evolving in a non-neutral fashion; the many sitewise tests are then corrected for multiple comparisons to indicate which sites have strong evidence of purifying or positive selection and so whether there is any reliable evidence for the presence of selection in the alignment. Alternatively, SLR can be restricted to detect only unusually variable sites, indicating such sites and providing evidence for the presence of positive selection in the alignment.
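
The general shape of a sitewise test followed by a multiple-comparison correction can be sketched in a few lines of Python. This is an illustration only, not SLR's code: the function names, the input layout, the use of a chi-square reference distribution with one degree of freedom, and the choice of a Bonferroni correction as a simple stand-in are all assumptions made for the example.

```python
import math

def lrt_pvalue(lnl_alt, lnl_null):
    """P-value for a likelihood-ratio test with 1 degree of freedom.

    lnl_alt  : log-likelihood of the site with its selection parameter free
    lnl_null : log-likelihood of the site under neutrality
    (Hypothetical inputs; the 1-df chi-square reference is an assumption.)
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))

def bonferroni(pvalues):
    """Bonferroni adjustment, used here only as a simple stand-in."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]

def classify_sites(sites, alpha=0.05):
    """Label each site from (omega_hat, lnl_alt, lnl_null) triples."""
    raw = [lrt_pvalue(alt, null) for _, alt, null in sites]
    adjusted = bonferroni(raw)
    labels = []
    for (omega_hat, _, _), p_adj in zip(sites, adjusted):
        if p_adj >= alpha:
            labels.append("not significant")
        elif omega_hat > 1.0:
            labels.append("positive")
        else:
            labels.append("purifying")
    return labels
```

Calling `classify_sites` with one triple per alignment column returns one label per site. Restricting the analysis to unusually variable sites would amount to reporting only those labelled "positive".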

In contrast to sitewise testing, random sites models for positive selection (e.g. the "M" models in PAML [2,3]) make more assumptions about how the strength of selection is distributed, assuming that the level of selection at each site is chosen from some specified distribution, then testing explicitly for the presence but not location of positive selection. A post hoc procedure is then used to determine which sites were under constraint. Due to the different questions being asked of the data, the two methods should be considered complementary.

SLR and PAML calibrate the amount of nonsynonymous change at each site to the average background synonymous rate of evolution, implicitly assuming that synonymous substitutions are selectively neutral and that the rate of synonymous substitution is constant across sites. One consequence of this model is that sites that violate these assumptions by having more synonymous changes than would be expected by chance (i.e. increased synonymous rate or strong selection for synonymous changes) are detected as being under positive selection despite no nonsynonymous changes being observed.

News

02 May. 2013 - Version 1.4.3, bug fix; Branch lengths could occasionally go slightly negative, resulting in SLR hanging. Fixed.
30 Nov. 2011 - Version 1.4.2 released (source only); Better search directions when many parameters are on bounds.
30 Nov. 2011 - Version 1.4.1 released (source only); Improvement to handling of boundaries during optimisation. Minor increase in speed when many parameters are placed on boundaries.
29 Nov. 2011 - Version 1.4 released (source only); Improvements to likelihood scaling to deal with large trees (~2000 taxa). With thanks to Maxim Kapralov.

Download

Software implementing the SLR test for amino acid coding sites under positive and purifying selection is freely available under the GNU General Public Licence version 3 (see www.gnu.org for further information). A copy of the licence is provided with the software.

Source code and pre-compiled binaries for Mac OS X, ix86 Linux and Microsoft Windows can be obtained via the following links:

References

[1] T. Massingham and N. Goldman (2005) Detecting amino acid sites under positive selection and purifying selection. Genetics 169: 1753-1762.
[2] Z. Yang, R. Nielsen, N. Goldman and A.-M. K. Pedersen (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155: 431-449.
[3] Z. Yang (2007) PAML 4: a program package for phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24: 1586-1591.