De novo transcriptome assembler for very short reads


25/02/2012: Oases paper published

The Oases paper is now published in Bioinformatics.

06/12/2011: Oases 0.2

Oases was significantly improved for greater robustness and bundled with a multi-k assembly pipeline.

29/01/2010: Initial release

Oases is now available under the GPL v3 license.

Frequently asked questions

What is Oases?

Oases is a de novo transcriptome assembler designed to produce transcripts from short-read sequencing technologies, such as Illumina, SOLiD, or 454, in the absence of any genomic assembly. It was developed by Marcel Schulz (Max Planck Institute for Molecular Genetics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz).

Oases loads a preliminary assembly produced by Velvet and clusters the contigs into small groups, called loci. It then uses paired-end read and long read information, when available, to construct transcript isoforms.
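To make the idea of a locus more concrete, here is a minimal Python sketch of grouping contigs into clusters. It is only an illustration and not the Oases implementation: it assumes that contigs are plain identifiers and that "connections" between them (for example, pairs of contigs linked by reads) are already known. The function name cluster_into_loci and the input format are hypothetical.

```python
def cluster_into_loci(contigs, connections):
    """Group contigs into clusters ("loci") of mutually connected contigs.

    Assumption (not taken from Oases): two contigs belong to the same locus
    if a chain of connections links them. This is a plain union-find.
    """
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in connections:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    loci = {}
    for c in contigs:
        loci.setdefault(find(c), []).append(c)
    # Sort for a stable, readable result.
    return sorted(sorted(members) for members in loci.values())


if __name__ == "__main__":
    contigs = ["c1", "c2", "c3", "c4", "c5"]
    connections = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
    print(cluster_into_loci(contigs, connections))
```

In this toy input, c1, c2 and c3 end up in one cluster and c4 and c5 in another. A contig with no connections forms a cluster of its own.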

Can I see the source code?

The Oases source code is freely available under the GPL v3 license.

What do I need to run it?

Oases was designed for a 64-bit Linux-compatible environment with gcc. It requires the Velvet package to be installed.

It should compile and run on a 32-bit machine (albeit with a few secondary compiler warnings), but you might find memory to be a limiting factor.

Where can I get Oases?

Just download this tarball, and follow the instructions within.

Alternatively, you can use Git to keep up with updates. After installing the git package, go to the desired location and clone the project:

git clone git://github.com/dzerbino/oases.git

Each time you want to update Oases, just use the packaged update_oases.sh script.

Where can I get more information on Oases?

First of all, if your question is about a segmentation fault or bus error, please make sure you have downloaded the very latest version of Oases (see the version number at the top of the page).

Please feel free to send all your suggestions, questions, requests, bug reports or complaints to Marcel Schulz and Daniel Zerbino.

What do the confidence scores mean?

The confidence scores assigned by Oases are a heuristic measure of the uniqueness of a transcript in a locus. In loci with only one or two possible transcripts, each transcript is assigned a confidence of 1. In more complex loci, a transcript that contains the majority of the contigs associated with the locus has a high confidence score (close to 1), whereas a confidence score close to 0 indicates that only a few of the contigs of the locus are part of the transcript.
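As a rough illustration of this description, the Python sketch below scores a transcript by the fraction of its locus's contigs that it contains. This is an assumption made for the example: the exact formula Oases uses is not given here, and the function name and arguments are hypothetical.

```python
def transcript_confidence(transcript_contigs, locus_contigs, n_transcripts_in_locus):
    """Toy confidence score for one transcript of a locus.

    Assumption (not the actual Oases formula): the score is the share of
    the locus's contigs that also appear in the transcript.
    """
    # Loci with only one or two possible transcripts get a confidence of 1.
    if n_transcripts_in_locus <= 2:
        return 1.0

    locus = set(locus_contigs)
    if not locus:
        raise ValueError("locus has no contigs")

    shared = locus & set(transcript_contigs)
    return len(shared) / len(locus)


if __name__ == "__main__":
    locus = ["c1", "c2", "c3", "c4", "c5"]
    # A transcript covering most of the locus scores close to 1.
    print(transcript_confidence(["c1", "c2", "c3", "c4"], locus, 3))
    # A transcript using only one contig of the locus scores close to 0.
    print(transcript_confidence(["c1"], locus, 3))
```

With this toy definition, a transcript that uses four of the five contigs in its locus scores 0.8, and one that uses a single contig scores 0.2.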

How should I reference Oases?

M.H. Schulz, D.R. Zerbino, M. Vingron and E. Birney. Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics, 2012. DOI: 10.1093/bioinformatics/bts094.

Last edited May 17, 2012.
Daniel Zerbino