PRANKSTER: Graphical aligner and alignment browser

Introduction
Executable binaries are provided for Linux, MacOSX and Windows. Usage is described under Using PRANKSTER below.

Disclaimer

PRANKSTER has been developed and best tested on Linux. It is known to run on MacOSX and Windows, but its correct behaviour on those platforms has not been confirmed. The author takes no responsibility for any damage that the software may cause to your computer, scientific career, family life or anything else.

Download

PRANKSTER is written in C++ and Qt4. The code is © Ari Loytynoja and distributed under the GPL; the exceptions are the eigen routines and the sequence input/output functions, which come from the PAML and readseq packages and are © Ziheng Yang and Don Gilbert, respectively. A snapshot of the current development source code can be obtained on request from Ari Loytynoja. PRANKSTER binaries can be downloaded from here.

Installation

Linux

Due to licensing issues, PRANKSTER is written using the new Qt 4 graphics libraries. Although the Qt libraries are the basis of any Linux computer running KDE, version 4 is not yet included in any major distribution, so compiling the source code requires installing these libraries. To avoid this time-consuming step, the software is provided as a precompiled binary. Download the file
or open a file browser and click the icon. Different Linux distributions may not include the same versions of the libraries, or may place them in slightly different locations. If the precompiled binary does not work on your system, you may check if the libraries required by PRANKSTER (use the command

MacOSX

PRANKSTER has been compiled on MacOSX Tiger (10.4.4; PPC). Download the file

Windows

PRANKSTER has been compiled on Windows XP. Download the file

Source code

Assuming that Qt 4 is installed to
Using PRANKSTER

PRANKSTER can be used with or without a mouse: all features except the window panel settings (relative width of the tree/name/data panels) can be controlled using shortcut keys or key combinations. In PRANK XML-format, node selection can be done either by mouse (by clicking the node in the tree panel) or using the key combination

Opening and printing large files may take time. The program GUI may freeze until the task is finished.

Different menus

File menu

Open file: Input sequence data in various formats (format detection should be automatic). PRANKSTER uses code from Don Gilbert's program readseq, so data input and output have the same strengths and weaknesses as that software. In addition to standard formats, PRANK HSAML format is supported.

Open URL: Download a file from a URL.

Reload data: Reload the active sequence file. Any modifications are lost.

Load tree: Import a tree in newick format to be used as a guide tree for the alignment. The names in the tree have to match exactly those of the sequences; if the tree contains only a subset of the sequences, the sequence set will be reduced.

Save: Save the alignment in one of the various formats. The format is chosen from the file type menu. In most formats, only the currently selected sites (see below) are exported; in HSAML, the selection is stored.

Print: Print the data as shown on the screen: font size, colouring, panel setting/hiding etc. will be replicated in the printout. Page selection and ordering have not been implemented.

Quit: Quit.

Settings menu

Font: Set font, colouring and space. Some systems may support fewer font types than shown on the list.

Plot: Set parameters for the display of posterior probabilities (HSAML format only). Show/hide, height and width are self-explanatory; for each state (left-most pull-down menu), colour, style, offset and show/hide can be individually selected.
The names of different structure states are defined in the model; postprob refers to the reliability of the alignment solution.

Sites: Select alignment sites (HSAML format only). Alignment sites can be filtered according to their posterior probabilities (e.g., probability of belonging to a certain structure state, or the reliability of the given solution). All sites can be selected or unselected, and then removed from or added back to the selection according to chosen criteria. A selection rule can be limited to a certain range of sites, and either to the currently selected node or to all nodes below that node. Selection criteria are saved in the HSAML output and are recovered when the file is reopened.

Model: Specify the alignment model. For both DNA and protein, the gap opening rate and gap extension probability can be defined, the option '+F' (keep gaps open; see LG05) set, or an external PRANK alignment model imported. For DNA, base frequencies can be defined (by default, empirical frequencies are used), and kappa and rho set; coding sequences (reading frame 1 assumed) can be translated into codons and the empirical codon substitution model used. Unchecking 'use log values' may give a 2-3-fold speed-up in the alignment but may also cause an underflow error and a program crash with large datasets (>>50 sequences).

Input/output: PRANKSTER is foremost an alignment method and thus manipulation of a guide tree (even when inferred by the program) is considered as "input" (or the name of the menu item is wrong, which is also possible). Short names reads each name only up to the first white space; Truncate branches and Fixed branches use the branch lengths value below and do what they say; Scale branches has its own value. Output with Dots and dashes (requires +F) makes the alignment more readable by marking insertions and deletions differently. Unfortunately, few downstream methods understand this sort of output.
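The underflow risk mentioned for 'use log values' under Model is a general property of floating-point arithmetic: a product of many small probabilities can fall below the smallest representable number and become exactly zero, whereas a sum of logarithms stays finite. The following minimal sketch illustrates that general point only. It is not taken from the PRANK source, and the function names and probability values are hypothetical.

```python
import math


def linear_likelihood(probs):
    """Multiply probabilities directly; the product may underflow to 0.0."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def log_likelihood(probs):
    """Sum log-probabilities instead; the result stays finite."""
    return sum(math.log(p) for p in probs)


# Hypothetical per-site probabilities, chosen only for illustration.
site_probs = [1e-5] * 100

linear = linear_likelihood(site_probs)  # true value 1e-500 is below double range
logged = log_likelihood(site_probs)     # 100 * ln(1e-5)

print(linear, logged)
```

Working in log space costs a logarithm per term and makes summing probabilities more expensive, which is consistent with the speed difference described above, but it avoids the zero product that a linear-scale computation can produce on large inputs.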
Translate to protein: Translates a DNA data set to proteins and allows back-translating these sequences to DNA after the alignment, maintaining the gap structure. It can also be considered a "Save DNA sequences into memory" function (see below). Sequences should be ungapped or the back-translation may fail.

Back-translate to DNA: Back-translates protein sequences to DNA if the names in the two data sets (the DNA "saved into memory", see above, and the proteins shown on the screen) match and the DNA sequences produce the given protein sequences when translated. Note that the protein alignment need not have been produced with PRANKSTER, so the program can be used as a general back-translation tool.

Alignment menu

Make guide tree: Generate an alignment guide tree using the Neighbor Joining algorithm. If the data is unaligned, approximate pairwise alignments are generated and distances are estimated from those; if the data is aligned, distance estimates are generally improved if the given alignment is used and gaps are not removed (Remove gaps from input? No.).

If you don't know the correct phylogeny, it is recommended that you run the alignment (at least) twice:

(1) generate a tree from unaligned data (Alignment/Make guide tree),
(2) make a multiple alignment (Alignment/Make alignment),
(3) generate a new guide tree based on the given alignment (Alignment/Make guide tree; Remove existing tree? OK; Remove gaps from input? No), and
(4) make an improved multiple alignment (Alignment/Make alignment).

You may repeat steps 3-4 or, even better, export an alignment, use phylogeny software to infer a tree, import that (rooted) tree into PRANK, and realign the data.

If you know the correct phylogeny, import the tree with branch lengths and use it for the alignment. The PRANK algorithm uses insertion-deletion events as phylogenetic information and the results may be very sensitive to the given topology.

Make alignment: A multiple alignment is generated using PRANK.
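The back-translation described under "Back-translate to DNA" can be sketched as threading each ungapped DNA sequence onto its aligned protein row, one codon per residue and three gap characters per protein gap. The code below is a minimal illustration under stated assumptions, not PRANKSTER's implementation: the function name `back_translate` is hypothetical, the standard genetic code and reading frame 1 are assumed, and the DNA is assumed to contain no trailing stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# other genetic codes would need a different table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def back_translate(aligned_protein, dna):
    """Thread ungapped DNA onto a gapped protein row, keeping the gaps.

    Raises ValueError if the DNA does not translate to the protein.
    """
    residues = [c for c in aligned_protein if c != "-"]
    if len(dna) != 3 * len(residues):
        raise ValueError("DNA length does not match the protein length")

    out = []
    pos = 0
    for c in aligned_protein:
        if c == "-":
            out.append("---")  # one protein gap becomes three DNA gaps
            continue
        codon = dna[pos:pos + 3].upper()
        if CODON_TABLE.get(codon) != c.upper():
            raise ValueError(f"codon {codon} does not encode {c}")
        out.append(codon)
        pos += 3
    return "".join(out)


print(back_translate("M-KV", "ATGAAAGTT"))
```

In this sketch the translation check and the threading happen in the same pass, so a mismatch between the stored DNA and the displayed protein is reported instead of silently producing a wrong alignment.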
Shortcut key combinations

The shortcut keys and key combinations have been tested on a Linux system with a UK keyboard. Correct function on other systems is not guaranteed.
Methods

PRANKSTER is a front-end to the multiple sequence alignment program PRANK. For homogeneous models (default), the method corresponds to that published in (LG05). For DNA alignments, the substitution model is Tamura-Nei (TN93) or a subset of it (Hasegawa-Kishino-Yano (HKY85) by default); for protein-coding DNA data, an empirical codon model can also be used (substitution model kindly provided by Carolin Kosiol); for protein, the default model is WAG (WG01). For any data type, external models given in PRANK HMM format can also be used. Such a model can either be homogeneous (one structure state) or define a structure. You can build some models here.

References

HKY85. Hasegawa M, Kishino H, Yano T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. JME 22:160-174.

Comments? E-mail ari@ebi.ac.uk.