FREGENE is a C++ program that simulates sequence-like data over large genomic regions in large diploid populations. Unlike coalescent-based simulation tools such as MS (Hudson, 2002), FREGENE works forwards in time, which allows a wide range of demographic and selection scenarios to be implemented. Many such models are already incorporated into FREGENE and, since it is open source, users can modify or extend them. Coalescent methods have difficulty incorporating large amounts of gene conversion or crossover (Hoggart et al., 2007), whereas these pose no particular problem for FREGENE. FREGENE offers a flexible model for recombination hotspots, and can readily simulate regions of up to tens of Mb on a standard desktop computer.

The principal limitation of forward-in-time algorithms is computational, since the entire population must be tracked through time, not only the chromosomes that are ancestral to the observed sample. FREGENE implements many features to enhance computational efficiency, and includes a rescaling option that greatly reduces computation time at the cost of some approximation.

The program SAMPLE, which comes with the FREGENE package, generates samples of individuals from a FREGENE output population, together with a phenotype that depends on the genotype at one or more SNPs and may be binary (case/control) or Gaussian. SAMPLE can also summarize the SNP minor allele frequency (MAF) spectrum and calculate $r^2$ values for SNP pairs.
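
As an illustration of the two summaries mentioned above, the following C++ sketch computes the MAF of a SNP and the $r^2$ value for a pair of SNPs. It is not taken from the SAMPLE source code. It assumes phased haplotypes coded as 0/1 alleles, and the names SnpColumn, minorAlleleFrequency and rSquared are hypothetical. The formula used is the standard one, $r^2 = D^2 / (p_A(1-p_A)\,p_B(1-p_B))$ with $D = p_{AB} - p_A p_B$.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation (an assumption, not SAMPLE's data structure):
// one vector per SNP, holding one 0/1 allele per sampled haplotype.
using SnpColumn = std::vector<int>;

// Minor allele frequency: the smaller of the two allele frequencies at a SNP.
double minorAlleleFrequency(const SnpColumn& snp) {
    assert(!snp.empty());
    double ones = 0.0;
    for (int allele : snp) ones += allele;
    const double p = ones / static_cast<double>(snp.size());
    return std::min(p, 1.0 - p);
}

// Squared correlation between the alleles at two SNPs:
// r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), where D = pAB - pA * pB.
// Returns 0 when either SNP is monomorphic, since r^2 is then undefined.
double rSquared(const SnpColumn& a, const SnpColumn& b) {
    assert(!a.empty() && a.size() == b.size());
    double nA = 0.0, nB = 0.0, nAB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        nA += a[i];
        nB += b[i];
        nAB += a[i] * b[i];  // counts haplotypes carrying allele 1 at both SNPs
    }
    const double n = static_cast<double>(a.size());
    const double pA = nA / n, pB = nB / n, pAB = nAB / n;
    const double denom = pA * (1.0 - pA) * pB * (1.0 - pB);
    if (denom == 0.0) return 0.0;
    const double d = pAB - pA * pB;
    return d * d / denom;
}
```

Calling rSquared on two identical polymorphic columns gives perfect linkage disequilibrium, while two columns whose alleles are statistically independent give a value of zero.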

For further details about FREGENE, including the rescaling, see:

Balding DJ (2008). FREGENE: Simulation of realistic sequence-level data in populations and ascertained samples. BMC Bioinformatics, in press.

Hoggart CJ, Chadeau-Hyam M, Clark TG, Lampariello R, Whittaker JC, De Iorio M, Balding DJ (2007). Sequence-level population simulations over large genomic regions. Genetics 177: 1725-1731. doi: 10.1534/genetics.106.069088

Please cite these articles in any publication that uses FREGENE.

FREGENE is free to use, distribute and modify under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License or any later version. In particular, FREGENE comes WITHOUT ANY WARRANTY.

Please report any problems or bugs to d.balding@ic.ac.uk

Summary of FREGENE's modelling assumptions

FREGENE simulates a (possibly subdivided) population of monoecious, diploid individuals whose genomes consist of a single linear chromosome. The population evolves over non-overlapping generations according to a Wright-Fisher model, with or without selection. Many of these assumptions are easy to relax by small changes to the source code, but typically at a cost in computational efficiency.
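
To make these assumptions concrete, the following C++ sketch shows one neutral Wright-Fisher generation for a single, undivided population of monoecious diploids. It is a simplified illustration and not FREGENE's implementation: each chromosome is reduced to a single integer label, and recombination, mutation, subdivision and selection are all omitted. The names Individual and nextGeneration are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical minimal individual (an assumption for this sketch): two
// chromosome copies, each reduced to one integer label.
using Individual = std::array<int, 2>;

// One neutral Wright-Fisher generation with constant population size.
// Generations do not overlap: the returned vector replaces `parents` entirely.
std::vector<Individual> nextGeneration(const std::vector<Individual>& parents,
                                       std::mt19937& rng) {
    assert(!parents.empty());
    std::uniform_int_distribution<std::size_t> pickParent(0, parents.size() - 1);
    std::uniform_int_distribution<int> pickCopy(0, 1);

    std::vector<Individual> offspring;
    offspring.reserve(parents.size());
    for (std::size_t i = 0; i < parents.size(); ++i) {
        // Monoecious: any individual can act as either parent, so both are
        // drawn uniformly with replacement (the same parent may be drawn twice).
        const Individual& first = parents[pickParent(rng)];
        const Individual& second = parents[pickParent(rng)];
        // Each parent transmits one of its two chromosome copies at random.
        Individual child{first[pickCopy(rng)], second[pickCopy(rng)]};
        offspring.push_back(child);
    }
    return offspring;
}
```

Repeatedly assigning the result back to the population, for example with a std::mt19937 seeded once at start-up, advances the simulation one generation at a time. Selection could be added under this scheme by drawing parents with probability proportional to fitness rather than uniformly.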

Features of the implementation

See Hoggart et al. (2007) for further details.

Imperial College -- August 2008