PDBsum entry 1ymz
Unknown function
PDB id: 1ymz
References listed in PDB file
Key reference
Title: Evolutionary information for specifying a protein fold.
Authors: M.Socolich, S.W.Lockless, W.P.Russ, H.Lee, K.H.Gardner, R.Ranganathan.
Ref.: Nature, 2005, 437, 512-518.
[DOI no: ]
PubMed id
Abstract
Classical studies show that for many proteins, the information required for
specifying the tertiary structure is contained in the amino acid sequence. Here,
we attempt to define the sequence rules for specifying a protein fold by
computationally creating artificial protein sequences using only statistical
information encoded in a multiple sequence alignment and no tertiary structure
information. Experimental testing of libraries of artificial WW domain sequences
shows that a simple statistical energy function capturing coevolution between
amino acid residues is necessary and sufficient to specify sequences that fold
into native structures. The artificial proteins show thermodynamic stabilities
similar to natural WW domains, and structure determination of one artificial
protein shows excellent agreement with the WW fold at atomic resolution. The
relative simplicity of the information used for creating sequences suggests a
marked reduction to the potential complexity of the protein-folding problem.
Figure 1: SCA-based protein design. a, Structure of a
representative WW domain (Nedd4.3, Protein Data Bank 1I5H) in
complex with a target peptide (in stick representation). The two
canonical tryptophans are shown as space-filling side chains.
The figure was prepared using PyMol (ref. 51). b, SCA conservation
scores for each position in the WW alignment in arbitrary units
of statistical energy (ref. 12). Position numbers (x axis) and the
secondary structure diagram at the top coincide with matrix
columns in c-e. c, A matrix representation of statistical
coupling values from perturbation analysis of five positions
(rows) in the WW domain MSA. d, The matrix for an alignment of
IC sequences, built by randomly selecting amino acids at each
site from the observed frequency distributions in the natural
alignment. e, The matrix for an alignment of CC sequences,
derived from a design algorithm where both the conservation
pattern and the pattern of statistical couplings in the natural
alignment are preserved. Scale bar shows the SCA coevolution
score, ranging from 0 (blue) to 2 (red).
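
The construction of IC sequences described in panel d (drawing an amino acid at each site, independently, from the frequencies observed in the natural alignment) can be illustrated with a minimal Python sketch. The function names, the toy alignment, and the random seed are assumptions made for illustration; they are not taken from the paper, and the CC design algorithm of panel e, which also preserves statistical couplings, is not reproduced here.

```python
import random
from collections import Counter


def column_frequencies(alignment):
    """Return, for each alignment column, a Counter of the residues observed there."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(seq[i] for seq in alignment) for i in range(length)]


def sample_ic_sequence(freqs, rng):
    """Draw one residue per site, independently, weighted by the observed counts."""
    residues = []
    for counts in freqs:
        symbols = list(counts.keys())
        weights = list(counts.values())
        residues.append(rng.choices(symbols, weights=weights, k=1)[0])
    return "".join(residues)


# Toy alignment invented for illustration only (not real WW domain data).
toy_msa = ["WEAK", "WDAR", "WEGK", "YEAK"]
freqs = column_frequencies(toy_msa)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
ic_library = [sample_ic_sequence(freqs, rng) for _ in range(5)]
```

Because each site is sampled on its own, an alignment of such sequences keeps the per-site conservation pattern but loses any correlation between sites, which is the contrast the caption draws between panels d and e.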
Figure 4: Summary of experiments on all natural and artificial
WW sequences. a, A pie chart showing the outcomes of folding
studies for natural (n = 42), CC (n = 43), IC (n = 43), or
random (n = 19) WW sequences. Red, natively folded; blue,
soluble but unfolded; yellow, insoluble; grey, poor expressing.
b, Melting temperatures (T_m) and van't Hoff enthalpies of
unfolding for all folded WW sequences. Open circles indicate
natural sequences and filled circles indicate the 12 folded CC
sequences. The artificial sequences show thermodynamic
parameters that fall into the same range as that of natural WW
domains.
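
As a reminder of how a melting temperature and a van't Hoff enthalpy together describe a thermal unfolding curve, here is a minimal Python sketch of the standard two-state model. It assumes a temperature-independent enthalpy (no heat-capacity term); the page does not state how the authors fitted their data, so this is not a description of their procedure. The function name and the numerical values in the usage note are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(temp_k, tm_k, dh_vh_kj):
    """Two-state unfolded fraction at temp_k, assuming a constant van't Hoff enthalpy.

    temp_k   : temperature in kelvin
    tm_k     : melting temperature in kelvin (where half the population is unfolded)
    dh_vh_kj : van't Hoff enthalpy of unfolding in kJ/mol
    """
    # Unfolding free energy under the constant-enthalpy assumption.
    dg = dh_vh_kj * (1.0 - temp_k / tm_k)
    # Equilibrium constant for folded <-> unfolded.
    k_unfold = math.exp(-dg / (R_KJ * temp_k))
    return k_unfold / (1.0 + k_unfold)
```

For example, with a hypothetical T_m of 320 K and enthalpy of 100 kJ/mol, `fraction_unfolded(320.0, 320.0, 100.0)` evaluates to one half by construction, and the value falls below one half at lower temperatures and rises above it at higher ones.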
The above figures are reprinted by permission from Macmillan Publishers Ltd: Nature (2005, 437, 512-518) copyright 2005.