Goldman Group Projects
Hidden Markov Models (HMMs) for Modelling Protein Secondary Structure and Sequence Evolution
This page contains information regarding a project
which uses HMMs to model secondary structure along protein-coding
amino acid sequences. The models are used in phylogenetic analyses,
and can be used for protein structure prediction.
This project is supported by the
BBSRC.
Six papers have been published on this
work: follow this link to
see the references.
The links below are mentioned in some of these papers:
- software (1): the PASSML and PASSML-TM programs are available here.
PASSML and PASSML-TM can analyse data using the models described
in the above papers. Contact Pietro Liò for further information.
- software (2): Jeff Thorne's C program for analysing
data using the models described in the above papers is available
as a compressed tar file structevol.tar.Z.
It includes basic instructions. Contact Jeff Thorne for further information.
- parameter estimates (1): values of the parameters
rij described in the 1998 Genetics paper are available here
(38 x 38 matrix: the 38 states are ordered Hb1, ..., Hb10, Eb1, ..., Eb6,
Tb1, Tb2, Cb1, He1, ..., He10, Ee1, ..., Ee6, Te1, Te2, Ce1).
[See also this link]
- parameter estimates (2): values of the parameters
akij described in the 1998 Genetics paper are available for
k = Hb, Eb, Tb, Cb, He, Ee, Te and Ce
(each a 20 x 20 matrix:
amino acids are ordered A, R, N, D, C, Q, E, G, H, I, L, K, M,
F, P, S, T, W, Y, V). [See also this link]
- data sets: the 11 sets of aligned sequences analysed in the 1998
Genetics paper are available as follows:
ADPGP, GP120, P17ALL, P17B, SUSY, XYLA, ADHAN, ADHPL, GDH, GPA, PEPCK.
Single-letter amino acid codes
are used, plus `-' for alignment gaps and `?' for residues
deemed untrustworthy (typically due to alignment uncertainty),
which are treated as unknown.
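
The orderings and character codes listed above can be sketched in a few lines of Python, which may help when indexing into the matrices or reading the alignments. This is only an illustration: the names AMINO_ACIDS, state_labels, STATE_INDEX, AA_INDEX and encode_sequence are hypothetical, and nothing here assumes anything about the layout of the downloadable files beyond the orderings and symbols stated on this page.

```python
# Hypothetical helper names; only the orderings and symbols come from this page.

# Row/column order of each 20 x 20 matrix, as listed under parameter estimates (2).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# (category letter, number of states) in the order given for the 38 x 38 matrix.
_SEGMENTS = [("H", 10), ("E", 6), ("T", 2), ("C", 1)]


def state_labels():
    """Return the 38 state labels: Hb1..Hb10, Eb1..Eb6, Tb1, Tb2, Cb1, then He1..Ce1."""
    labels = []
    for suffix in ("b", "e"):
        for letter, count in _SEGMENTS:
            labels.extend(f"{letter}{suffix}{i}" for i in range(1, count + 1))
    return labels


# Zero-based lookup tables for matrix rows and columns.
STATE_INDEX = {label: i for i, label in enumerate(state_labels())}
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(seq):
    """Map each character to an amino acid index, or to 'gap' / 'unknown'."""
    encoded = []
    for ch in seq.upper():
        if ch == "-":
            encoded.append("gap")        # alignment gap
        elif ch == "?":
            encoded.append("unknown")    # untrustworthy residue, treated as unknown
        elif ch in AA_INDEX:
            encoded.append(AA_INDEX[ch])
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return encoded
```

For example, STATE_INDEX["Eb1"] gives the zero-based row of state Eb1 in the 38 x 38 matrix, and encode_sequence("AR-?V") marks the third and fourth positions as a gap and an unknown residue respectively.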