Single-cell latent variable model

The single-cell latent variable model (scLVM) is an approach for reconstructing and accounting for hidden sources of variation in single-cell RNA-seq studies.

scLVM is available as a Python module with interfaces to R. For downloads and further information, please see our GitHub page.


  1. Buettner, F. et al. (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol 33, 155-160.

PEER & PANAMA: estimating hidden confounders in gene expression

PEER is a collection of Bayesian approaches to infer hidden determinants and their effects from gene expression profiles using factor analysis methods. Applications of PEER have

  • detected batch effects and experimental confounders
  • increased the number of expression QTL findings threefold
  • allowed inference of intermediate cellular traits, such as transcription factor or pathway activations

PEER is available as a command-line tool and through Python and R interfaces, which can be downloaded here.
PANAMA is a recent alternative to PEER, which provides similar functionality but can improve the results in certain settings (see [4]). PANAMA can be downloaded here.
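
The idea of inferring a hidden determinant from expression profiles and removing its effect can be illustrated with a minimal sketch. This is not the PEER or PANAMA implementation and does not use their APIs: it swaps in a single principal component (found by power iteration) for the Bayesian factor analysis, and all function names, the simulated batch effect, and the noise levels are assumptions made for illustration only.

```python
import math
import random


def center_rows(matrix):
    """Subtract each gene's mean across samples (matrix is genes x samples)."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


def leading_factor(expr, n_iter=200):
    """Estimate one hidden factor (one value per sample) by power iteration.

    This is the first principal component, used here as a simple stand-in
    for a Bayesian factor model. expr must be centered per gene.
    """
    n_genes, n_samples = len(expr), len(expr[0])
    rng = random.Random(0)
    x = [rng.gauss(0, 1) for _ in range(n_samples)]
    for _ in range(n_iter):
        # Per-gene weights: projection of each gene onto the current factor.
        w = [sum(g * xi for g, xi in zip(gene, x)) for gene in expr]
        # Updated factor: weighted sum of gene profiles, then normalized.
        x = [sum(w[g] * expr[g][s] for g in range(n_genes))
             for s in range(n_samples)]
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
    return x


def regress_out(expr, factor):
    """Remove the factor's linear effect from every gene (least squares)."""
    ff = sum(f * f for f in factor)
    residuals = []
    for gene in expr:
        beta = sum(g * f for g, f in zip(gene, factor)) / ff
        residuals.append([g - beta * f for g, f in zip(gene, factor)])
    return residuals


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Simulated data (hypothetical): 20 genes, 8 samples, two hidden batches.
rng = random.Random(1)
batch = [1.0] * 4 + [-1.0] * 4
expr = []
for _ in range(20):
    loading = rng.gauss(0, 2)  # how strongly this gene responds to the batch
    expr.append([loading * b + rng.gauss(0, 0.3) for b in batch])

centered = center_rows(expr)
factor = leading_factor(centered)
residuals = regress_out(centered, factor)
print("abs. correlation of estimated factor with true batch:",
      abs(pearson(factor, batch)))
```

In an eQTL analysis, the residuals (or the estimated factors used as covariates) would then replace the raw expression values in the association test; the sign of the estimated factor is arbitrary, so only its absolute correlation with the simulated batch is reported.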


  1. Stegle, O. et al. (2012) Using Probabilistic Estimation of Expression Residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat Protoc 7, 500-507.
  2. Parts, L., Stegle, O., Winn, J., Durbin, R. (2011) Joint genetic analysis of gene expression data with inferred cellular phenotypes. PLoS Genet 7, 1 p.e1001276.
  3. Stegle, O., Parts, L., Durbin, R., Winn, J. (2010) A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLoS Comput Biol 6, 5 p.e1000770.
  4. Fusi, N., Stegle, O., Lawrence, N.D. (2012) Joint modelling of confounding factors and prominent genetic regulators provides increased accuracy in genetical genomics studies. PLoS Comput Biol 8, 1 p.e1002330.

Gaussian Process Two Sample test

GPTwoSample is a Gaussian-process-based two-sample test for time-series datasets.
The code release is Python-based and can be downloaded online.
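
The principle behind a Gaussian process two-sample test can be sketched as a model comparison: either both conditions share one latent time course, or each condition has its own. The example below is not the GPTwoSample code or its API. It assumes a squared-exponential kernel with fixed, hand-picked hyperparameters (a full implementation would typically optimize them), scores a whole time series rather than detecting intervals of differential expression, and all function names and toy data are hypothetical.

```python
import math


def rbf_kernel(t1, t2, amplitude=1.0, lengthscale=1.0):
    """Squared-exponential covariance between two lists of time points."""
    return [[amplitude * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)
             for b in t2] for a in t1]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_marginal_likelihood(t, y, noise=0.1, amplitude=1.0, lengthscale=1.0):
    """Log evidence of a zero-mean GP with Gaussian noise variance `noise`."""
    n = len(t)
    cov = rbf_kernel(t, t, amplitude, lengthscale)
    for i in range(n):
        cov[i][i] += noise
    low = cholesky(cov)
    # Forward substitution: solve low * z = y, so that y^T cov^-1 y = z^T z.
    z = []
    for i in range(n):
        z.append((y[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (quad + logdet + n * math.log(2.0 * math.pi))


def log_bayes_factor(t_a, y_a, t_b, y_b, **hyper):
    """Log evidence ratio: separate time courses vs. one shared time course.

    Positive values favor differential expression between the conditions.
    """
    separate = (log_marginal_likelihood(t_a, y_a, **hyper)
                + log_marginal_likelihood(t_b, y_b, **hyper))
    shared = log_marginal_likelihood(t_a + t_b, y_a + y_b, **hyper)
    return separate - shared


# Toy data (hypothetical): six time points per condition.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
control = [math.sin(t) for t in times]
wiggle = [0.05, -0.05, 0.05, -0.05, 0.05, -0.05]
replicate_like = [c + w for c, w in zip(control, wiggle)]  # same trend
opposite = [-c for c in control]                           # different trend

score_same = log_bayes_factor(times, control, times, replicate_like)
score_diff = log_bayes_factor(times, control, times, opposite)
print("log Bayes factor, similar time courses:  ", score_same)
print("log Bayes factor, diverging time courses:", score_diff)
```

A score above zero favors the model with separate time courses, which is the evidence for differential expression; a score below zero favors a single shared time course.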


  1. Stegle, O., Denby, K.J., Cooke, E.J., Wild, D.L., Ghahramani, Z., Borgwardt, K.M. (2010) A robust Bayesian two-sample test for detecting intervals of differential gene expression in microarray time series. J Comput Biol 17, 3 p.355–367.