PRANK: Probabilistic Alignment Kit

NEW: PRANK development has moved to Google Code. The new prank-msa site contains the latest version of the program source code and allows entering comments and bug reports. The new version of PRANK available there includes bug fixes and significant speed improvements.

Introduction

PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences. It's based on a novel algorithm that treats insertions correctly and avoids over-estimation of the number of deletion events. In addition, PRANK borrows ideas from maximum likelihood methods used in phylogenetics and correctly takes into account the evolutionary distances between sequences. Lastly, PRANK allows for defining a potential structure for sequences to be aligned and then, simultaneously with the alignment, predicts the locations of structural units in the sequences.

PRANK is a command-line program for Unix-style environments, but the same sequence alignment engine is implemented in the graphical program PRANKSTER. In addition to providing a user-friendly interface for those not familiar with Unix systems, PRANKSTER is an alignment browser for alignments saved in the HSAML format. This novel format stores all the information generated by the aligner, and the alignment browser is a convenient way to analyse and manipulate the data.

The reconstruction of evolutionary homology -- including the correct placement of insertion and deletion events -- is only feasible for rather closely-related sequences. PRANK is not meant for the alignment of very diverged protein sequences. If sequences are very different, the correct homology cannot be reconstructed with confidence and PRANK may simply refuse to match them. Several methods have been developed for structural matching of very distant protein sequences. One should not, however, consider the alignments they produce a proper inference of evolutionary homology.

Often there are ties in the alignment, i.e. positions with many equally good solutions. The common practice among sequence aligners is to always pick the same solution among the many equally good ones and thus produce consistent alignments. We believe that this gives the user a wrong impression and too much confidence in the resulting alignment (for example, 10 iterations of the alignment always produce the same result -> the alignment must be correct). Our decision is to break the ties randomly. This means that different runs of the program with the same data may give different alignments, and a PRANK alignment may not be reproducible. However, the practice we have chosen also tells the user that regions that are not aligned consistently across runs simply cannot be resolved reliably. Be aware that the reproducibility of many other methods does not mean higher confidence -- they also reproduce exactly the same errors!
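The difference between deterministic and random tie-breaking can be illustrated with a minimal Python sketch. This is not PRANK's code: the function names `pick_first` and `pick_random` and the example scores are hypothetical, and the sketch only shows the general idea of choosing among equally scoring alternatives.

```python
import random


def pick_first(scores):
    """Deterministic choice: always return the first index with the best score."""
    return scores.index(max(scores))


def pick_random(scores, rng=random):
    """Random tie-breaking: return one of the best-scoring indices,
    chosen uniformly among all ties."""
    best = max(scores)
    tied = [i for i, s in enumerate(scores) if s == best]
    return rng.choice(tied)


# Hypothetical scores for three alternative solutions at one position;
# the last two are equally good.
scores = [1.0, 3.0, 3.0]

rng = random.Random(0)  # seeded only so that this sketch is repeatable
first_choices = {pick_first(scores) for _ in range(200)}
random_choices = {pick_random(scores, rng) for _ in range(200)}
```

With the deterministic rule, repeated runs always select the same alternative and hide the fact that another one was equally good. With random tie-breaking, both tied alternatives appear across runs, which exposes the ambiguity to the user.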

PRANK and PRANKSTER include an option for DNA translation/back-translation. This allows an automatic translation of a protein-coding DNA data set into protein sequences, alignment of these sequences as proteins, and back-translation of the resulting alignment into DNA such that the gaps are maintained. PRANKSTER can also back-translate protein alignments produced with external alignment software.
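The back-translation step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used by PRANK or PRANKSTER: the function name `back_translate` is hypothetical, the DNA sequence is assumed to be in frame with no stop codon, and each protein gap is assumed to be written as a single `-` character.

```python
def back_translate(aligned_protein, dna, gap="-"):
    """Thread the original codons of `dna` onto an aligned protein sequence.

    Each residue in `aligned_protein` is replaced by the codon it was
    translated from, and each gap character by three gap characters,
    so the gaps of the protein alignment are maintained in the DNA.
    """
    if len(dna) % 3 != 0:
        raise ValueError("DNA length is not a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    n_residues = sum(1 for c in aligned_protein if c != gap)
    if n_residues != len(codons):
        raise ValueError("number of residues and codons differ")
    codon_iter = iter(codons)
    return "".join(gap * 3 if c == gap else next(codon_iter)
                   for c in aligned_protein)


# One aligned protein row (M, gap, K) and its unaligned coding sequence.
aligned_dna = back_translate("M-K", "ATGAAA")
```

Here `aligned_dna` is `ATG---AAA`: the gap in the protein alignment becomes a gap of three nucleotides, so codon boundaries stay intact.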

Note, however, that both PRANK and PRANKSTER can also align codon sequences (i.e., using a codon substitution matrix) and this should (theoretically) produce even better alignments for protein-coding sequences than alignment of protein sequences and back-translation of these alignments into DNA.

Software

The latest version of PRANK is available for download from its new home at Google Code, prank-msa.

The webPRANK alignment server is an easy-to-use web interface to the PRANK alignment algorithm. It supports all the main features of the command-line program PRANK and includes a powerful alignment browser with features similar to those found in the graphical interface PRANKSTER. Note that webPRANK can also be used to upload and display your existing PRANK alignments.

Use webPRANK now or read the paper describing it.

PRANKSTER is a graphical front-end to the multiple sequence alignment program PRANK (version 100701). The program is no longer under active development.