Large-scale assessment of transcriptome analysis software

Bertone et al. Nat Methods 2013

An international consortium of scientists has published a systematic assessment of gene expression analysis software. The results, which appear in two papers in Nature Methods, may inspire new computing approaches to handle current and future technologies for gene expression analysis.

Scientists use a method called RNA sequencing (RNA-seq) to see how genes are being expressed across an entire genome. But how can they analyse this information, and how good is the software they use to do so?

The RNA-seq Genome Annotation Assessment Project (RGASP), an ENCODE-affiliated initiative, evaluated the performance of a wide range of RNA-seq computer programs. The consortium was able to specify which approaches work well for certain tasks, and which areas can be improved.

“We found a striking degree of variability in how these programs handle different aspects of RNA-seq data. Some methods performed well overall, whereas others have clever design features that excel at solving specific problems. We were also able to highlight areas where many of these computational approaches can improve,” says Paul Bertone of EMBL-EBI, who coordinated the study. “This kind of work provides an important resource for the genomics community, and the consortium model was a unique platform to deliver that in a large-scale, systematic way.”

In both studies, developers of leading software programs were invited to participate in a detailed evaluation of computational methods for processing and interpreting RNA-seq data. The framework was based on the Encyclopedia of DNA Elements (ENCODE) Genome Annotation Assessment Project (EGASP), in which the original program developers contribute their results for evaluation. The methods compared in the two studies perform spliced sequence alignment or transcript reconstruction, both essential steps in the analysis of RNA-seq experiments.

The consortium’s systematic, meticulous approach to the performance assessment resulted in findings that can be used to enhance and expand the range of RNA-seq analysis tools that are available for different kinds of studies. They can also be used to inform developments that meet the demands of emerging sequencing technologies.

Source articles

Steijger, T., et al. (2013) Assessment of transcript reconstruction methods for RNA-seq. Nature Methods (in press); published online 3 November. DOI: 10.1038/nmeth.2714.

Engström, P., et al. (2013) Systematic evaluation of spliced alignment programs for RNA-seq data. Nature Methods (in press); published online 3 November. DOI: 10.1038/nmeth.2722.