The rlsim package is a collection of tools for simulating RNA-seq library construction, aiming to reproduce the most important factors which are known to introduce significant biases in the currently used protocols: hexamer priming, PCR amplification and size selection. It allows for a systematic exploration of the effects of the individual biasing factors and their interactions on downstream applications by simulating data under a variety of parameter sets.
The implicit simulation model implemented in the main tool (rlsim) is inspired by the actual library preparation protocols. Because it is more general than the models used by bias correction methods, it allows for a fair assessment of their performance.
The package source can be found on GitHub.
Citing the package
- Botond Sipos, Greg Slodkowicz, Tim Massingham, Nick Goldman (2013). Realistic simulations reveal extensive sample-specificity of RNA-seq biases. arXiv:1308.3172
Key features
- Simulation of priming biases loosely based on a nearest-neighbor thermodynamic model.
- Exact simulation of PCR amplification on the level of individual fragments (consistent across expression levels, no approximations).
- Fragment-specific amplification efficiencies determined by GC-content and length.
- Possibility to simulate PCR and sampling pseudo-replicates.
- Simulation of size selection and polyadenylation with flexible target distributions.
- Estimation of GC-dependent amplification efficiencies from real data, relying on assumptions about locality of biases and the mean efficiency of the fragment pool.
- Estimation of relative expression levels.
- Estimation of the empirical fragment size distribution, with model selection between normal and skew normal distributions.
- Able to simulate experiments on the human transcriptome over a wide range of expression levels on a desktop machine.
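
To make the idea of fragment-level PCR simulation more concrete, here is a minimal Python 3 sketch of a branching-process model in which every molecule is duplicated independently in each cycle with a fragment-specific probability. It is an illustration only and is not taken from the rlsim source: the function names (`gc_content`, `toy_efficiency`, `amplify`), the quadratic efficiency curve, and all parameter values are assumptions made for this example, and the toy efficiency ignores fragment length.

```python
import random


def gc_content(seq):
    """Fraction of G/C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GCgc" for base in seq) / len(seq)


def toy_efficiency(seq, max_eff=0.9, optimum_gc=0.5, width=0.25):
    """Hypothetical per-cycle efficiency (an assumption, not rlsim's model).

    Peaks at `optimum_gc` and falls off quadratically, clipped at zero.
    """
    penalty = ((gc_content(seq) - optimum_gc) / width) ** 2
    return max(0.0, max_eff * (1.0 - penalty))


def amplify(copies, efficiency, cycles, rng):
    """Simulate PCR molecule by molecule.

    In each cycle every existing molecule is duplicated independently
    with probability `efficiency`, so the copy number is tracked exactly
    rather than through a deterministic or continuous approximation.
    """
    for _ in range(cycles):
        duplicated = sum(rng.random() < efficiency for _ in range(copies))
        copies += duplicated
    return copies


# Two made-up fragments, each starting from 10 molecules.
rng = random.Random(42)
fragments = {"balanced": "ACGTACGTAC", "at_rich": "ATATATATTA"}
pool = {
    name: amplify(10, toy_efficiency(seq), cycles=8, rng=rng)
    for name, seq in fragments.items()
}
print(pool)
```

In this toy setup the AT-rich fragment has zero efficiency and is never amplified, while the balanced fragment grows stochastically towards the doubling limit. Repeating the run with different seeds gives a rough sense of what PCR pseudo-replicates look like under such a model.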
Usage and examples
More information about the usage and simulation approach can be found in the readme and the package documentation.
Caveats
Please note that the Python tools have not been updated to work with Python 3.0 or later.