RNA sequencing (RNA-seq) is the application of next-generation sequencing technologies to cDNA molecules, which are obtained by reverse transcription from RNA, in order to get information about the RNA content of a sample. Thus, RNA-seq is the set of experimental procedures that generates cDNA molecules derived from RNA molecules, followed by sequencing-library construction and massively parallel deep sequencing.
Applications of RNA-seq
RNA-seq can be applied to a broad range of scientific questions such as:
- gene expression profiling between samples
- study of alternative splicing events (differential inclusion/exclusion of exons in the processed RNA product after splicing of a precursor RNA segment) associated with diseases
- identification of allele-specific expression, disease-associated single nucleotide polymorphisms (SNPs) and gene fusions to understand, for example, disease-causing variants in cancer
Furthermore, single-cell RNA-seq has recently emerged as a way to study complex biological processes, cellular heterogeneity, and diversity, especially in the stem cell biology and neuroscience fields (7,8).
Advantages of RNA-seq over hybridisation-based approaches
RNA-seq provides several advantages over hybridisation-based approaches:
- RNA-seq has higher sensitivity for genes expressed at either low or very high levels, and a wider dynamic range of expression levels over which transcripts can be detected (> 8000-fold range). It also has lower technical variation and higher reproducibility
- RNA-seq is not limited by prior knowledge of the genome of the organism. It can therefore be performed in species for which genomes are not yet available, making RNA-seq particularly attractive for non-model organisms
- RNA-seq gives unprecedented detail (to a single base resolution) about transcriptional features, such as novel transcribed regions, alternative splicing and allele-specific expression
- Finally, whilst it is well known that microarrays are subject to cross-hybridisation bias, RNA-seq is considered unbiased. However, several studies have observed a guanine-cytosine (GC) content bias in RNA-seq data, and RNA-seq can suffer from mapping ambiguity for paralogous sequences
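
To make the guanine-cytosine content bias mentioned above more concrete, the following minimal Python sketch computes the GC fraction of a sequence, which is the quantity one would typically compare against read counts when checking for such a bias. The function name `gc_content` and the toy sequences are hypothetical and invented purely for illustration; they do not come from any particular RNA-seq tool or dataset.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of seq.

    Ambiguous symbols such as N are ignored; U is accepted so that
    RNA sequences can be passed as well as cDNA sequences.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Toy sequences (assumption: invented for illustration, not real transcripts)
toy_transcripts = {
    "tx_low_gc": "ATATATTTAACG",
    "tx_high_gc": "GCGCGGCCATGC",
}

# GC fraction per toy transcript; with real data these values could be
# compared with per-transcript read counts to look for a GC-related trend.
gc_by_transcript = {name: gc_content(s) for name, s in toy_transcripts.items()}
```

With real data, a systematic relationship between these GC fractions and the observed read counts across transcripts would be one indication of the bias described above.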