Kraken

Kraken: A set of tools for quality control and analysis of high-throughput sequence data

Matthew P.A. Davis, Stijn van Dongen, Cei Abreu-Goodger, Nenad Bartonicek and Anton J. Enright

Methods (2013) - Volume 63, Issue 1, 1 September 2013, Pages 41–49 

 

11 October 2013

Version 13-274 of Kraken has been released, in the form of seqimp-13-274.tgz and reaper-13-274.tgz (see below). This fixes the following bugs:

  • Versions 2.14 and 3.0 of R contain a bug in the RUtils package that affects how the Kraken tools pass options to R. The R developers are aware of the bug and are fixing it. The previous version of Kraken (13-095) will not work with these versions of R.
  • Tally would hang indefinitely on very small input files, due to a missed check on unrealistically small hash sizes. 

Kraken consists of three tools (Reaper, Tally and Sequence Imp) designed to streamline the analysis of next-generation sequencing data. Although designed with small RNA sequence analysis in mind, the tools can be used to address issues facing next-generation sequencing in general.

Please follow this link for some help with the installation of the tools.

Software:

  • Reaper + Tally + Minion

    Reaper is a program for demultiplexing, trimming and filtering short-read sequencing data. It can handle barcodes, trim adapter sequences, strip low-quality bases and low-complexity sequence, and has many more features. It is fast (written in C) and uses very little memory because it processes one read at a time.

    Tally removes redundancy from sequence files by collapsing identical reads to a single entry while recording the number of instances of each. It can also tally paired-end data or re-pair independently processed paired-end files.

    Minion is a small utility program to infer or test the presence of 3' adapter sequence in sequencing data.

    Documentation:
    Source Code:
    Precompiled Binaries:

  • Sequence Imp

    Sequence Imp is a pipeline incorporating the tools above, designed to streamline RNA sequence analysis for multiple FASTQ files simultaneously.

    Documentation:
    Scripts:
    Worked Examples:
    Annotation:

  • Publication Support - Supplemental material for the Kraken manuscript
    Perl script from the Kraken manuscript, used as a control to benchmark Tally:

 

#!/usr/local/bin/perl

use strict;
use warnings;

open(OUT, ">out.perl") || die "no open output";

my %tally = ();
while (<>) {
    chomp;
    $tally{$_}++ if ! /^>/;
}

my $ord = 1;
for (sort {$tally{$b} <=> $tally{$a}} keys %tally) {
    print OUT ">$ord-$tally{$_}\n$_\n";
    $ord++;
}
close(OUT);
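For readers who prefer Python, the logic of the control script above can be sketched as follows. This is only an illustrative re-expression of that script, not the implementation of Tally itself. The function names `tally_reads` and `format_fasta` are hypothetical, and keeping equally frequent sequences in first-seen order is an assumption, since the Perl script leaves the order of ties unspecified.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tally_reads(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Collapse identical sequences into (sequence, count) pairs.

    Header lines (starting with '>') are skipped; every other line is
    counted as a sequence, mirroring the Perl control script.
    """
    counts = Counter()
    for line in lines:
        seq = line.rstrip("\n")  # equivalent of Perl's chomp
        if seq.startswith(">"):
            continue
        counts[seq] += 1
    # Most frequent first. sorted() is stable, so ties keep first-seen
    # order (an assumption; the Perl script does not define tie order).
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def format_fasta(tallied: Iterable[Tuple[str, int]]) -> str:
    """Render tallied reads as FASTA records named '>rank-count'."""
    records = []
    for rank, (seq, count) in enumerate(tallied, start=1):
        records.append(f">{rank}-{count}\n{seq}\n")
    return "".join(records)
```

A possible use would be to pass an open FASTA file handle to `tally_reads` and write the string returned by `format_fasta` to an output file, as the Perl script does with `out.perl`.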

 

Queries and Comments:

Please send any questions or bug reports to:

 

kraken@ebi.ac.uk