FREGENE can readily be used without studying all the options in this document, by modifying the files specified by the -i, -p, and -recomb arguments of Example/fregene_example.sh.
This xml-format input file details the chromosomes in the starting population. Most often the initial population will either be
The following tags are required and do not have a default value:
See Example/data/in_example.xml for an example with an invariant starting population, and Example/data/rin_example.xml for an example in which the starting population has been generated by a previous FREGENE run.
This file specifies mutation and selection parameters, and parameters that control some details of the simulation run. See Example/data/par_example.xml for an example. Table 2 briefly describes the tags. To implement selection, the minimal FREGENE command is
When a mutation occurs, it is under selection with probability . The intensity coefficient (identified as *_COEF_* in the parameter file) and the dominance coefficient (referred to as *_DOM_*) are each sampled from a mixture of two Gaussian distributions. For convenience, the first of these distributions is called ``positive'' (labelled *_POS) and the second is called ``negative'' (*_NEG), but their values need not reflect these labels. The user specifies the relative weight (between 0 and 1) of the positive distribution (PROP_POS_SEL*). If this weight is 1, the negative distribution parameters are ignored; if it is 0, the positive distribution parameters are ignored.
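The two-component mixture described above can be sketched in Python as follows. This is an illustration only, not FREGENE's own code: the function name and its arguments are hypothetical, and it assumes each Gaussian is described by a mean and a standard deviation.

```python
import random


def sample_mixture(rng, prop_pos, pos_mean, pos_sd, neg_mean, neg_sd):
    """Draw one value from a mixture of two Gaussian distributions.

    prop_pos is the relative weight of the "positive" component
    (the role played by PROP_POS_SEL* in the parameter file).
    All argument names here are hypothetical.
    """
    # rng.random() lies in [0, 1), so prop_pos = 1 always selects the
    # positive component and prop_pos = 0 always selects the negative one.
    if rng.random() < prop_pos:
        return rng.gauss(pos_mean, pos_sd)
    return rng.gauss(neg_mean, neg_sd)


rng = random.Random(1)
coefficient = sample_mixture(rng, 0.3, 0.01, 0.002, -0.01, 0.002)
```

With a weight of 1 the negative parameters never enter the calculation, and with a weight of 0 the positive parameters never do, which matches the behaviour described in the text.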
When a new selected site arises in a subdivided population, with probability PROP_SEL_LOCAL it is under selection only in the subpopulation where it arose. Otherwise, the site is under selection in all subpopulations.
Finally, each selected site is ``switched off'' (i.e. its selection and dominance coefficients are set to 0) with a probability specified by the -sel_LE option (Table 4). This is intended to let the user avoid the accumulation of large numbers of sites under balancing selection, and it also allows an equilibrium to be reached even when balancing selection is present. At each generation, a selected site is switched off with default probability 1/75,000 (corresponding to a mean time under selection of 75,000 generations if neither allele reaches fixation).
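The link between the per-generation switch-off probability and the mean time under selection follows from the geometric distribution: if a site is switched off with probability p at each generation, the expected waiting time is 1/p generations. The short Python sketch below checks this numerically. The function names are hypothetical, and fixation of either allele is ignored.

```python
import random


def mean_generations_selected(p_off):
    """Expected number of generations until switch-off (geometric mean 1/p)."""
    return 1.0 / p_off


def simulate_switch_off(rng, p_off):
    """Count generations until a site is switched off, one trial per generation."""
    generations = 1
    while rng.random() >= p_off:
        generations += 1
    return generations


# Default value quoted in the text: p = 1/75,000 gives a mean of 75,000 generations.
default_mean = mean_generations_selected(1 / 75000)
```

For a quick empirical check it is cheaper to simulate with a larger probability, such as 0.01, and compare the average of many draws with 1/0.01 = 100.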
The recombination model is hierarchical, and is highly flexible, allowing a uniform recombination rate, or rates that can vary both on a fine scale (hotspots) and on a broad scale.
Chromosomes are divided into N_REGIONS equal-size regions, each of which is subdivided into SUBS_PER_REGION equal-size subregions. The mean per-site recombination rate within a region is initially sampled from a Gamma distribution, with scale and shape parameters REGION_GAMMA_SCALE and REGION_GAMMA_SHAPE. However, the realised values are normalised so that the overall mean recombination rate is equal to RECOM_RATE. Thus, if there is only one region, its recombination rate is equal to RECOM_RATE irrespective of the parameters of the Gamma distribution. (NB in our parameterisation, the Gamma distribution with scale parameter $\theta$ and shape parameter $k$ has mean $k\theta$ and variance $k\theta^2$.)
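The sampling and normalisation of region rates might look like the Python sketch below. It is illustrative only: the function name is hypothetical, and it assumes the normalisation is a simple multiplicative rescaling, which the text does not state explicitly. In Python's random.gammavariate(alpha, beta), alpha is the shape and beta is the scale.

```python
import random


def region_rates(rng, n_regions, gamma_shape, gamma_scale, recom_rate):
    """Sample per-region mean rates and rescale them to average recom_rate."""
    # Raw draws from a Gamma distribution with the given shape and scale.
    raw = [rng.gammavariate(gamma_shape, gamma_scale) for _ in range(n_regions)]
    # Regions are equal in size, so the overall mean is the plain average.
    mean_raw = sum(raw) / n_regions
    # Assumed normalisation: rescale so the average equals recom_rate.
    return [recom_rate * (r / mean_raw) for r in raw]


rng = random.Random(42)
rates = region_rates(rng, n_regions=10, gamma_shape=2.0, gamma_scale=1.0,
                     recom_rate=1e-8)
```

With a single region the rescaling factor is exactly 1, so the returned rate equals recom_rate whatever Gamma parameters are supplied.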
Similarly, in each subregion the recombination rate is sampled from a Gamma distribution, but in this case there is no normalising. The shape parameter is specified by the user (SUB_REGION_GAMMA_SHAPE), but the scale parameter is fixed by FREGENE so that the mean equals the region mean rate. Within each subregion, hotspots of fixed length (HS_LENGTH) are sampled such that the distance between hotspots follows a Gamma distribution with mean HS_SPACING and shape parameter HS_SPACING_GAMMA_SHAPE. From the mean recombination rate for the subregion, and the proportion of recombinations that occur in hotspots (PROP_RECOM_HS), FREGENE computes a background rate that applies to all sites as well as a mean rate within hotspots. The excess rate above background in a particular hotspot is sampled from a Gamma distribution with variance defined by its shape parameter (INTENSITY_GAMMA_SHAPE).
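Fixing the scale so that a Gamma distribution has a required mean amounts to setting scale = mean / shape. The Python sketch below applies this to the hotspot spacing. It is a rough illustration with hypothetical function names, and it assumes that the spacing is measured from the end of one hotspot to the start of the next and that positions are truncated to whole sites. Neither detail is specified in the text.

```python
import random


def gamma_with_mean(rng, mean, shape):
    """Draw from a Gamma distribution with the given mean and shape."""
    # scale = mean / shape, since the mean of a Gamma is shape * scale.
    return rng.gammavariate(shape, mean / shape)


def place_hotspots(rng, sub_length, hs_length, hs_spacing, spacing_shape):
    """Return start positions of fixed-length hotspots within one subregion."""
    starts = []
    pos = gamma_with_mean(rng, hs_spacing, spacing_shape)
    # Keep only hotspots that fit entirely inside the subregion.
    while pos + hs_length <= sub_length:
        starts.append(int(pos))
        # Assumed convention: the gap runs from the end of this hotspot
        # to the start of the next one.
        pos += hs_length + gamma_with_mean(rng, hs_spacing, spacing_shape)
    return starts


rng = random.Random(3)
hotspot_starts = place_hotspots(rng, sub_length=200_000, hs_length=2_000,
                                hs_spacing=8_500, spacing_shape=0.5)
```

The same gamma_with_mean helper could be used to draw a subregion rate whose mean equals the region mean rate, with SUB_REGION_GAMMA_SHAPE as the shape.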
The start sites of gene conversions, which have tract length GC_LENGTH, can be sampled uniformly (HF_COMB=0), or in proportion to crossover rates (HF_COMB=1) but with the overall rate still specified by GC_RATE.
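The two ways of choosing start sites can be sketched in Python as below. This is an illustration only: the function name and the per-site list of crossover rates are hypothetical, and the number of events is taken as given rather than derived from GC_RATE.

```python
import random


def sample_gc_starts(rng, crossover_rates, n_events, hf_comb):
    """Sample gene-conversion start sites along a chromosome.

    crossover_rates holds one (hypothetical) crossover rate per site.
    hf_comb = 0 samples sites uniformly; hf_comb = 1 samples them in
    proportion to the crossover rates.
    """
    n_sites = len(crossover_rates)
    if hf_comb == 0:
        return [rng.randrange(n_sites) for _ in range(n_events)]
    # Weighted sampling with replacement: only relative weights matter,
    # so the overall number of events is controlled separately.
    return rng.choices(range(n_sites), weights=crossover_rates, k=n_events)


rng = random.Random(5)
rates = [1e-9] * 90 + [1e-7] * 10   # last ten sites form a hotspot
uniform_starts = sample_gc_starts(rng, rates, n_events=100, hf_comb=0)
weighted_starts = sample_gc_starts(rng, rates, n_events=100, hf_comb=1)
```

Because only the relative weights are used when hf_comb is 1, the total rate of gene conversion is independent of the crossover rates, consistent with GC_RATE controlling the overall rate.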
Imperial College -- August 2008