Supplementary Material


These pages contain the data analysed in the paper "An algorithm for progressive multiple alignment of sequences with insertions" (PNAS 102:10557–10562) and the source code for a program implementing the described method. A detailed description of the analyses performed can be found here, and the figures referred to in the text here.

Sequence data

The two test datasets analysed are:

  • D-loop: mitochondrial control region from 15 primates

  • CAV2: genomic region around the CAV2 gene from 14 mammals and chicken

The results are provided as ASCII (fasta format) and PDF files. The fasta files can be browsed with, for example, the program PRANKSTER.
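For readers who prefer to inspect the fasta files programmatically rather than in a viewer, the following is a minimal sketch of a reader. It is not part of the distributed software: the function names `read_fasta` and `alignment_length` and the example records are hypothetical and purely illustrative. It assumes only the standard fasta layout (a `>` header line followed by one or more sequence lines) and that all sequences in an alignment have the same length.

```python
def read_fasta(lines):
    """Parse fasta-formatted lines into a dict mapping name -> sequence."""
    parts = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)  # sequences may span several lines
    return {n: "".join(chunks) for n, chunks in parts.items()}


def alignment_length(records):
    """Return the common sequence length, or raise if the lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


# Hypothetical toy alignment; real input would be the lines of a result file.
example = [">seq_a", "ACG-T", "A", ">seq_b", "ACGGT", "A"]
records = read_fasta(example)
print(len(records), "sequences, alignment length", alignment_length(records))
```

To read one of the result files, the lines of the opened file can be passed in place of `example`.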


The software can be found here.


The first three of the D-loop alignments described in the paper were generated using the following commands:

  • ./prank -d=dloop_unaligned.nuc -t=dloop_tree.tre -m=dloop_jc.hmm -o=dloop_jc_minus -disable

  • ./prank -d=dloop_unaligned.nuc -t=dloop_tree.tre -m=dloop_jc.hmm -o=dloop_jc_plus

  • ./prank -d=dloop_unaligned.nuc -t=dloop_tree.tre -m=dloop_jc.hmm -o=dloop_jc_plus_F -F

where -d, -t, -m and -o give, respectively, the names of the data file (in fasta format), the tree file (rooted, in Newick format), the HMM file and the output file (in fasta format). The meaning of the flags -disable and -F, as well as the modifications needed to use the HKY model or to align other datasets, should be obvious.
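When adapting these commands to other datasets, it can help to assemble them from their parts. The sketch below rebuilds the three command lines above using only the options that appear on this page (-d, -t, -m, -o, -disable, -F). The helper name `build_prank_command` is hypothetical, and the code only prints the commands; running them would require the prank executable and the input files to be present in the working directory.

```python
import shlex


def build_prank_command(data, tree, model, output, flags=(), executable="./prank"):
    """Assemble a prank command line as an argument list.

    Only the options shown on this page are used; `flags` carries
    optional switches such as "-disable" or "-F".
    """
    cmd = [executable, f"-d={data}", f"-t={tree}", f"-m={model}", f"-o={output}"]
    cmd.extend(flags)
    return cmd


# Output name and extra flags for each of the three D-loop runs listed above.
runs = [
    ("dloop_jc_minus", ["-disable"]),
    ("dloop_jc_plus", []),
    ("dloop_jc_plus_F", ["-F"]),
]

commands = [
    build_prank_command("dloop_unaligned.nuc", "dloop_tree.tre", "dloop_jc.hmm", out, extra)
    for out, extra in runs
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each argument list can be passed as is to `subprocess.run` to execute the alignment; for another dataset, only the file names in the call need to change.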
