6 Using SAMPLE

SAMPLE generates one or more haplotype or genotype samples from a FREGENE output population. It can generate a case-control sample, a random sample with a continuous phenotype, or a random sample with no phenotype. The number of cases must be set to zero if a continuous phenotype or a random sample with no phenotype is required. Otherwise, case/control status is assigned according to a disease model that is multiplicative over genotypes at each causal SNP, and also multiplicative over SNPs if more than one causal SNP is selected. To generate individuals with a continuous phenotype, the -sigma option must be set; this parameter specifies the phenotypic standard deviation. The heritability of each causal SNP must also be specified, from which its regression coefficient is calculated:

$\displaystyle \beta=\sqrt{\frac{\sigma^2h^2}{2f(1-f)(1-h^2)}}$

where $f$ is the minor allele frequency (MAF), $h^2$ is the user-specified heritability, and the genotypes are coded as (-1, 0, 1). Thus, each individual's phenotype is sampled from the following Gaussian distribution

$\displaystyle x\sim \mathrm{N}\left(\sum_i\beta_i x_i, \sigma^2\right)$

where the $x_i$ are the genotypes at the selected causal SNPs. Causal SNPs are selected at random within the allele frequency bands specified by the -f option.
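
The two formulas above can be illustrated with a short Python sketch. This is not the SAMPLE source code, only a minimal reading of the equations as printed: the function names `regression_coefficient` and `sample_phenotype` are hypothetical, and the MAFs, heritabilities and genotypes in the usage lines are made-up illustrative values.

```python
import math
import random


def regression_coefficient(f, h2, sigma):
    """Return beta = sqrt(sigma^2 h^2 / (2 f (1 - f) (1 - h^2))).

    f     : minor allele frequency of the causal SNP
    h2    : user-specified heritability of the SNP
    sigma : standard deviation given by the -sigma option
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    if not 0.0 <= h2 < 1.0:
        raise ValueError("h2 must lie in [0, 1)")
    return math.sqrt(sigma ** 2 * h2 / (2.0 * f * (1.0 - f) * (1.0 - h2)))


def sample_phenotype(genotypes, betas, sigma, rng=random):
    """Draw one phenotype from N(sum_i beta_i * x_i, sigma^2).

    genotypes : causal-SNP genotypes of one individual, coded -1, 0 or 1
    betas     : regression coefficients, one per causal SNP
    """
    mean = sum(b * x for b, x in zip(betas, genotypes))
    return rng.gauss(mean, sigma)


# Illustrative usage with assumed values (two causal SNPs).
rng = random.Random(1)                 # fixed seed so the draw is repeatable
mafs = [0.2, 0.35]                     # assumed MAFs of the causal SNPs
heritabilities = [0.01, 0.02]          # assumed per-SNP heritabilities
sigma = 1.0                            # assumed value of -sigma
betas = [regression_coefficient(f, h2, sigma)
         for f, h2 in zip(mafs, heritabilities)]
genotypes = [1, -1]                    # one individual's coded genotypes
phenotype = sample_phenotype(genotypes, betas, sigma, rng)
```

As a quick check of the first formula, setting $f=0.5$, $h^2=0.5$ and $\sigma=1$ gives $\beta=\sqrt{0.5/0.25}=\sqrt{2}$.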

All options are set on the command line.

To generate a phenotype, either binary or continuous, the following options must be specified
Imperial College -- August 2008