
WWW pages related to Quaternary Structure

  • Molecular Surfaces: A Review by Michael L. Connolly
  • Protein-Protein Interactions
  • Protein-Protein Interactions
  • Quaternary Structure Inference of Proteins from their Crystals Pita software
  • ProSAT Functional annotation of protein 3D structures Bioinformatics (2003) 19, 1723-1725
  • QuaternaryStructure Predictor: ExperimentalHomodimer Classifier www server
  • Protein-Protein Interaction Server www server
  • Macromolecular Interactions
  • Domain motion page
  • Structural and functional domains in proteins
  • TUTORIAL ON PEPTIDE AND PROTEIN STRUCTURE INTRODUCTION
  • Identification of Protein Oligomerisation States by Analysis of Interface Conservation www server
  • Single Nucleotide Polymorphisms www server
  • UCSC Genome Bioinformatics Best Links
  • ProSAT - Protein Structure Annotation Tool
  • Problems and Perspectives in Computational Molecular Biology
  • Dockit is a distance geometry based suite of programs for computing the docking geometries of small molecule ligands to protein binding sites.
  • The Inter-Chain Beta-Sheet (ICBS) database
  • Technical report: 3Dee as an example of a derived database

Publications related to Quaternary Structure

    2003

  • From words to literature in structural proteomics. Andrej Sali, Robert Glaeser, Thomas Earnest and Wolfgang Baumeister. Nature, 422, 216- 2003
    Abstract: Technical advances on several frontiers have expanded the applicability of existing methods in structural biology and helped close the resolution gaps between them. As a result, we are now poised to integrate structural information gathered at multiple levels of the biological hierarchy from atoms to cells into a common framework. The goal is a comprehensive description of the multitude of interactions between molecular entities, which in turn is a prerequisite for the discovery of general structural principles that underlie all cellular processes.
  • Computational methods of analysis of protein–protein interactions. Lukasz Salwinski and David Eisenberg. Current Opinion in Structural Biology, 13, 377-382, 2003
    Abstract: Computational methods play an important role at all stages of the process of determining protein–protein interactions. They are used to predict potential interactions, to validate the results of high-throughput interaction screens and to analyze the protein networks inferred from interaction databases.
  • Dissecting subunit interfaces in homodimeric proteins. Bahadur RP, Chakrabarti P, Rodier F, Janin J. Proteins. 53, 708-19, 2003.
    Abstract: The subunit interfaces of 122 homodimers of known three-dimensional structure are analyzed and dissected into sets of surface patches by clustering atoms at the interface; 70 interfaces are single-patch, the others have up to six patches, often contributed by different structural domains. The average interface buries 1,940 Å² of the surface of each monomer, contains one or two patches burying 600-1,600 Å², is 65% nonpolar and includes 18 hydrogen bonds. However, the range of size and of hydrophobicity is wide among the 122 interfaces. Each interface has a core made of residues with atoms buried in the dimer, surrounded by a rim of residues with atoms that remain accessible to solvent. The core, which constitutes 77% of the interface on average, has an amino acid composition that resembles the protein interior except for the presence of arginine residues, whereas the rim is more like the protein surface. These properties of the interfaces in homodimers, which are permanent assemblies, are compared to those of protein-protein complexes where the components associate after they have independently folded. On average, subunit interfaces in homodimers are twice larger than in complexes, and much less polar due to the large fraction belonging to the core, although the amino acid compositions of the cores are similar in the two types of interfaces.
  • Contribution of surface salt bridges to protein stability: guidelines for protein engineering. Makhatadze GI, Loladze VV, Ermolenko DN, Chen X, Thomas ST. J Mol Biol., 327, 1135-48, 2003
    Abstract: The small globular protein, ubiquitin, contains a pair of oppositely charged residues, K11 and E34, that according to the three-dimensional structure are located on the surface of this protein with a spatial orientation characteristic of a salt bridge. We investigated the strength of this salt bridge and its contribution to the global stability of the ubiquitin molecule. Using the "double mutant cycle" analysis, the strength of the pairwise interactions between K11 and E34 was estimated to be favorable by 3.6kJ/mol. Further, the salt bridge of the reverse orientation, i.e. E11/K34, can be formed and is found to have a strength (3.8kJ/mol) similar to that of the K11/E34 pair. However, the global stability of the K11/E34 variant of ubiquitin is 2.2kJ/mol higher than that of the E11/K34 variant. The difference in the contribution of the opposing salt bridge orientations to the overall stability of the ubiquitin molecule is attributed to the difference in the charge-charge interactions between residues forming the salt bridge and the rest of the ionizable groups in this protein. On the basis of these results, we concluded that surface salt bridges are stabilizing, but their contribution to the overall protein stability is strongly context-dependent, with charge-charge interactions being the largest determinant. Analysis of 16 salt bridges from six different proteins, for which detailed experimental data on energetics have been reported, support the conclusions made from the analysis of the salt bridge in ubiquitin. Implications of these findings for engineering proteins with enhanced thermostability are discussed.
  • In silico identification of functional protein interfaces. Rachel E. Bell and Nir Ben-Tal. Comp Funct Genom, 4, 420-423, 2003
    Abstract: Proteins perform many of their biological roles through protein-protein, protein-DNA or protein-ligand interfaces. The identification of the amino acids comprising these interfaces often enhances our understanding of the biological function of the proteins. Many methods for the detection of functional interfaces have been developed, and large-scale analyses have provided assessments of their accuracy. Among them are those that consider the size of the protein interface, its amino acid composition and its physicochemical and geometrical properties. Other methods to this effect use statistical potential functions of pairwise interactions, and evolutionary information. The rationale of the evolutionary approach is that functional and structural constraints impose selective pressure; hence, biologically important interfaces often evolve at a slower pace than do other external regions of the protein. Recently, an algorithm, Rate4Site, and a web-server, ConSurf, for the identification of functional interfaces based on the evolutionary relations among homologous proteins as reflected in phylogenetic trees, were developed in our laboratory. The explicit use of the tree topology and branch lengths makes the method remarkably accurate and sensitive. Here we demonstrate its potency in the identification of the functional interfaces of a hypothetical protein, the structure of which was determined as part of the international structural genomics effort. Finally, we propose to combine complementary procedures, in order to enhance the overall performance of methods for the identification of functional interfaces in proteins.
  • Automatic inference of protein quaternary structure from crystals. H. Ponstingl, T. Kabir and J. M. Thornton. J. Appl. Cryst. 36, 1116-1122, 2003
    Abstract: The arrangement of the subunits in an oligomeric protein often cannot be inferred without ambiguity from crystallographic studies. The annotation of the functional assembly of protein structures in the Protein Data Bank (PDB) is incomplete and frequently inconsistent. Instructions for the reconstruction, by symmetry, of the functional assembly from the deposited coordinates are often absent. An automatic procedure is proposed for the inference of assembly structures that are likely to be physiologically relevant. The method scores crystal contacts by their contact size and chemical complementarity. The subunit assembly is then inferred from these scored contacts by a clustering procedure involving a single adjustable parameter. When predicting the oligomeric state for a non-redundant set of 55 monomeric and 163 oligomeric proteins from dimers up to hexamers, a classification error rate of 16% was observed.
  • Development of Unified Statistical Potentials Describing Protein-Protein Interactions. Hui Lu, Long Lu, and Jeffrey Skolnick. Biophysical Journal, 84, 1895-1901, 2003
    Abstract: A residue-based and a heavy atom-based statistical pair potential are developed for use in assessing the strength of protein-protein interactions. To ensure the quality of the potentials, a nonredundant, high-quality dimer database is constructed. The protein complexes in this dataset are checked by a literature search to confirm that they form multimers, and the pairwise amino acid preference to interact across a protein-protein interface is analyzed and pair potentials constructed. The performance of the residue-based potentials is evaluated by using four jackknife tests and by assessing the potentials' ability to select true protein-protein interfaces from false ones. Compared to potentials developed for monomeric protein structure prediction, the interdomain potential performs much better at distinguishing protein-protein interactions. The potential developed from homodimer interfaces is almost the same as that developed from heterodimer interfaces with a correlation coefficient of 0.92. The residue-based potential is well suited for genomic scale protein interaction prediction and analysis, such as in a recently developed threading-based algorithm, MULTIPROSPECTOR. However, the more time-consuming atom-based potential performs better in identifying near-native structures from docking generated decoys.
  • A Novel Shape Complementarity Scoring Function for Protein-Protein Docking. Rong Chen and Zhiping Weng. PROTEINS: Structure, Function, and Genetics, 51, 397-408, 2003
    Abstract: Shape complementarity is the most basic ingredient of the scoring functions for protein-protein docking. Most grid-based docking algorithms use the total number of grid points at the binding interface to quantify shape complementarity. We have developed a novel Pairwise Shape Complementarity (PSC) function that is conceptually simple and rapid to compute. The favorable component of PSC is the total number of atom pairs between the receptor and the ligand within a distance cutoff. When applied to a benchmark of 49 test cases, PSC consistently ranks near-native structures higher and produces more near-native structures than the traditional grid-based function, and the improvement was seen across all prediction levels and in all categories of the benchmark. Without any post-processing or biological information about the binding site except the complementarity-determining region of antibodies, PSC predicts the complex structure correctly for 6 test cases, and ranks at least one near-native structure in the top 20 predictions for 18 test cases. Our docking program ZDOCK has been parallelized and the average computing time is 4 minutes using sixteen IBM SP3 processors. Both ZDOCK and the benchmark are freely available to academic users.
  • Analysing Six Types of Protein-Protein Interfaces: Yanay Ofran and Burkhard Rost. J. Mol. Biol. 325, 377-387, 2003. http://cubic.bioc.columbia.edu/
    Abstract: Non-covalent residue side-chain interactions occur in many different types of proteins and facilitate many biological functions. Are these differences manifested in the sequence compositions and/or the residue-residue contact preferences of the interfaces? Previous studies analysed small data sets and gave contradictory answers. Here, we introduced a new data-mining method that yielded the largest high-resolution data set of interactions analysed. We introduced an information theory-based analysis method. On the basis of sequence features, we were able to differentiate six types of protein interfaces, each corresponding to a different functional or structural association between residues. Particularly, we found significant differences in amino acid composition and residue-residue preferences between interactions of residues within the same structural domain and between different domains, between permanent and transient interfaces, and between interactions associating homo-oligomers and hetero-oligomers. The differences between the six types were so substantial that, using amino acid composition alone, we could predict statistically to which of the six types of interfaces a pool of 1000 residues belongs at 63-100% accuracy. All interfaces differed significantly from the background of all residues in SWISS-PROT, from the group of surface residues, and from internal residues that were not involved in non-trivial interactions. Overall, our results suggest that the interface type could be predicted from sequence and that interface-type specific mean-field potentials may be adequate for certain applications.
  • SEM (Symmetry Equivalent Molecules): a web-based GUI to generate and visualize the macromolecules: A. S. Z. Hussain, Ch. Kiran Kumar, C. K. Rajesh, S. S. Sheik and K. Sekar. Nuc Acid Res. 31 3356-3358, 2003
    Abstract: SEM, Symmetry Equivalent Molecules, is a web-based graphical user interface to generate and visualize the symmetry equivalent molecules (proteins and nucleic acids). In addition, the program allows the users to save the three-dimensional atomic coordinates of the symmetry equivalent molecules in the local machine. The widely recognized graphics program RasMol has been deployed to visualize the reference (input atomic coordinates) and the symmetry equivalent molecules. This program is written using CGI/Perl scripts and has been interfaced with all the three-dimensional structures (solved using X-ray crystallography) available in the Protein Data Bank. The program, SEM, can be accessed over the World Wide Web interface at http://dicsoft2.physics.iisc.ernet.in/sem/ or http://144.16.71.11/sem/
  • On Hydrophobicity and Conformational Specificity in Proteins. Erik Sandelin, Stockholm Bioinformatics Center, AlbaNova, Stockholms Universitet, 106 91 Stockholm, Sweden
    Preprint, 2003
    Abstract: In this study we examine the distribution of hydrophobic residues in a non-redundant set of monomeric globular single domain proteins. We find that the total fraction of hydrophobic residues is roughly constant and has no discernible dependence on protein size. This results in a decrease of the hydrophobicity of the core as the size of proteins increases. Using a normalized measure, and by comparing with sets of randomly reshuffled sequences, we show that this change in the composition of the core is statistically significant and robust with respect to which amino acids are considered hydrophobic and to how buried residues are defined. Comparison with model sequences optimized for stability, while still required to retain their native state as a unique minimum energy conformation, suggests that the size-independence of the total fraction of hydrophobic residues could be a result of requiring proteins to be conformational specific.
  • Protein-protein interactions: structurally conserved residues distinguish between binding sites and exposed protein surfaces. Ma B, Elkayam T, Wolfson H, Nussinov R. Proc Natl Acad Sci U S A., 100, 5772-7, 2003
    Abstract: Polar residue hot spots have been observed at protein-protein binding sites. Here we show that hot spots occur predominantly at the interfaces of macromolecular complexes, distinguishing binding sites from the remainder of the surface. Consequently, hot spots can be used to define binding epitopes. We further show a correspondence between energy hot spots and structurally conserved residues. The number of structurally conserved residues, particularly of high ranking energy hot spots, increases with the binding site contact size. This finding may suggest that effectively dispersing hot spots within a large contact area, rather than compactly clustering them, may be a strategy to sustain essential key interactions while still allowing certain protein flexibility at the interface. Thus, most conserved polar residues at the binding interfaces confer rigidity to minimize the entropic cost on binding, whereas surrounding residues form a flexible cushion. Furthermore, our finding that similar residue hot spots occur across different protein families suggests that affinity and specificity are not necessarily coupled: higher affinity does not directly imply greater specificity. Conservation of Trp on the protein surface indicates a highly likely binding site. To a lesser extent, conservation of Phe and Met also imply a binding site. For all three residues, there is a significant conservation in binding sites, whereas there is no conservation on the exposed surface. A hybrid strategy, mapping sequence alignment onto a single structure illustrates the possibility of binding site identification around these three residues.
  • Domain fusion analysis by applying relational algebra to protein sequence and domain databases. Kevin Truong and Mitsuhiko Ikura. BMC Bioinformatics, 4, 16, 2003
    http://calcium.uhnres.utoronto.ca/pi
    Abstract: Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
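
    The relational-algebra idea in the Truong and Ikura abstract above can be illustrated with a small self-join in standard SQL. This is only a minimal sketch, not the authors' actual schema or query: the table name `hits`, its columns, the function `find_domain_fusions` and all protein and domain identifiers are hypothetical. It assumes one row per (protein, organism, domain) prediction and reports pairs of separate proteins in a target organism whose domains co-occur in a single composite protein elsewhere.

```python
import sqlite3

# Hypothetical schema: one row per predicted domain hit (protein, organism, domain).
FUSION_QUERY = """
    SELECT DISTINCT p1.protein, p2.protein, c1.protein
    FROM hits AS c1
    JOIN hits AS c2 ON c1.protein = c2.protein AND c1.domain < c2.domain
    JOIN hits AS p1 ON p1.domain = c1.domain
    JOIN hits AS p2 ON p2.domain = c2.domain
    WHERE p1.organism = ? AND p2.organism = ?
      AND p1.protein <> p2.protein
      AND p1.protein <> c1.protein AND p2.protein <> c1.protein
      AND NOT EXISTS (SELECT 1 FROM hits AS x
                      WHERE x.protein = p1.protein AND x.domain = c2.domain)
      AND NOT EXISTS (SELECT 1 FROM hits AS x
                      WHERE x.protein = p2.protein AND x.domain = c1.domain)
    ORDER BY p1.protein, p2.protein, c1.protein
"""

def find_domain_fusions(hits, target_organism):
    """Return (partner_a, partner_b, composite) tuples for the target organism.

    c1/c2 pick two different domains found in the same composite protein;
    p1/p2 pick separate target-organism proteins carrying one domain each.
    The NOT EXISTS clauses drop partners that already carry both domains.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hits (protein TEXT, organism TEXT, domain TEXT)")
    con.executemany("INSERT INTO hits VALUES (?, ?, ?)", hits)
    rows = con.execute(FUSION_QUERY, (target_organism, target_organism)).fetchall()
    con.close()
    return rows

# Made-up example data: compAB fuses two domains that are separate in yeast.
example_hits = [
    ("compAB", "E. coli", "DOM_A"),
    ("compAB", "E. coli", "DOM_B"),
    ("yA", "S. cerevisiae", "DOM_A"),
    ("yB", "S. cerevisiae", "DOM_B"),
    ("yC", "S. cerevisiae", "DOM_C"),
]

if __name__ == "__main__":
    for partner_a, partner_b, composite in find_domain_fusions(example_hits, "S. cerevisiae"):
        print(partner_a, partner_b, "linked via", composite)
```

    Because the query uses only joins and subqueries from standard SQL, a sketch like this could in principle be rerun whenever the underlying sequence and domain tables are updated, which is the property the abstract emphasizes.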

    2002

  • Human non-Synonymous SNPs: server and survey. V. Ramensky, P. Bork and S. Sunyaev. Nuc Acid Res., 30, 3894-3900, 2002
    Abstract: Human single nucleotide polymorphisms (SNPs) represent the most frequent type of human population DNA variation. One of the main goals of SNP research is to understand the genetics of the human phenotype variation and especially the genetic basis of human complex diseases. Non-synonymous coding SNPs (nsSNPs) comprise a group of SNPs that, together with SNPs in regulatory regions, are believed to have the highest impact on phenotype. Here we present a World Wide Web server to predict the effect of an nsSNP on protein structure and function. The prediction method enabled analysis of the publicly available SNP database HGVbase, which gave rise to a dataset of nsSNPs with predicted functionality. The dataset was further used to compare the effect of various structural and functional characteristics of amino acid substitutions responsible for phenotypic display of nsSNPs. We also studied the dependence of selective pressure on the structural and functional properties of proteins. We found that in our dataset the selection pressure against deleterious SNPs depends on the molecular function of the protein. The strongest selective pressure was detected for proteins involved in transcription regulation.
  • Prediction of heterodimerization interfaces of G-protein coupled receptors with a new subtractive correlated mutation method: Marta Filizola, Osvaldo Olmea and Harel Weinstein. Protein Engineering 15, 881-885, 2002
    Abstract: Recent studies employing differential epitope tagging, selective immunoprecipitation of receptor complexes and fluorescence or bioluminescence resonance energy transfer techniques provide direct evidence for heterodimerization between both closely and distantly related members of the G-protein coupled receptor (GPCR) family. Since heterodimerization appears to play a role in modulating agonist affinity, efficacy and/or trafficking properties, the molecular models of GPCRs required to understand receptor function must consider these oligomerization hypotheses. To advance knowledge in this field, we present here a computational approach based on correlated mutation analysis and the structural information contained in three-dimensional molecular models of the transmembrane regions of GPCRs built using the rhodopsin crystal structure as a template. The new subtractive correlated mutation method reveals likely heterodimerization interfaces amongst the different alternatives for the positioning of two tightly packed bundles of seven transmembrane domains next to each other in contact heterodimers of GPCRs. Predictions are applied to GPCRs in the class of opioid receptors. However, in the absence of a known structure of any GPCR dimer, the features of the method and predictions are also illustrated and analyzed for a dimeric complex of known structure.
  • Analysis of Catalytic Residues in Enzyme Active Sites. Gail J. Bartlett, Craig T. Porter, Neera Borkakoti and Janet M. Thornton. J. Mol. Biol. 324, 105-121, 2002.
    Abstract: We present an analysis of the residues directly involved in catalysis in 178 enzyme active sites. Specific criteria were derived to define a catalytic residue, and used to create a catalytic residue dataset, which was then analysed in terms of properties including secondary structure, solvent accessibility, flexibility, conservation, quaternary structure and function. The results indicate the dominance of a small set of amino acid residues in catalysis and give a picture of a general active site environment. It is hoped that this information will provide a better understanding of the molecular mechanisms involved in catalysis and a heuristic basis for predicting catalytic residues in enzymes of unknown function.
  • Novel domain packing in the crystal structure of a thiosulphate-oxidizing enzyme. V. A. Bamford, B. C. Berks and A. M. Hemmings. Biochemical Society Transactions 30, 638-642, 2002.
    Abstract: A key component of the oxidative biogeochemical sulphur cycle involves the utilization by bacteria of reduced inorganic sulphur compounds as electron donors to photosynthetic or respiratory electron transport chains. The SoxAX protein of the photosynthetic bacterium Rhodovulum sulfidophilum is a heterodimeric c-type cytochrome that is involved in the oxidation of thiosulphate and sulphide. The recently solved crystal structure of the SoxAX complex represents the first structurally characterized example of a productive electron transfer complex between haemoproteins where both partners adopt the c-type cytochrome fold. The packing of c-type cytochrome domains both within SoxA and at the interface between the subunits of the complex has been compared with other examples and found to be unique.
  • Protein structure resources: Helge Weissig and Philip E. Bourne, Acta Crystallographica Section D 58, 908-915, 2002
    Abstract: The Protein Data Bank (PDB) is the primary source of macromolecular structure data for a worldwide community of users. A subset of those users then process these data to derive secondary information which is also available on the WWW. This process includes validation, some form of reductionism, via sequence or structure, or visualization. The result, a set of further web-accessible resources on protein structure and functional classification, links to primary genomic information, protein-protein and protein-ligand interactions, protein dynamics and protein-modeling resources. This paper reports on these processes and a subset of the web resources that result.
  • Dissecting protein-protein recognition sites. Chakrabarti, P. and Janin, J. Proteins: Struct. Funct. Genet. 47, 334-343, 2002
    Abstract: The recognition sites in 70 pairwise protein-protein complexes of known three-dimensional structure are dissected in a set of surface patches by clustering atoms at the interface. When the interface buries <2000 Å² of protein surface, the recognition sites usually form a single patch on the surface of each component protein. In contrast, larger interfaces are generally multipatch, with at least one pair of patches that are equivalent in size to a single-patch interface. Each recognition site, or patch within a site, contains a core made of buried interface atoms, surrounded by a rim of atoms that remain accessible to solvent in the complex. A simple geometric model reproduces the number and distribution of atoms within a patch. The rim is similar in composition to the rest of the protein surface, but the core has a distinctive amino acid composition, which may help in identifying potential protein recognition sites on single proteins of known structures.
  • Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. Loladze, V.V., D.N. Ermolenko, and G.I. Makhatadze. J. Mol. Biol. 320, 343-357, 2002.
    Abstract: Effects of amino acid substitutions at four fully buried sites of the ubiquitin molecule on the thermodynamic parameters (enthalpy, Gibbs energy) of unfolding were evaluated experimentally using differential scanning calorimetry. The same set of substitutions has been incorporated at each of four sites. These substitutions have been designed to perturb packing (van der Waals) interactions, hydration, and/or hydrogen bonding. From the analysis of the thermodynamic parameters for these ubiquitin variants we conclude that: (i) packing of non-polar groups in the protein interior is favorable and is largely defined by a favorable enthalpy of van der Waals interactions. The removal of one methylene group from the protein interior will destabilize a protein by approximately 5 kJ/mol, and will decrease the enthalpy of a protein by 12 kJ/mol. (ii) Burial of polar groups in the non-polar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of this group. For example, burial of Thr side-chain in the non-polar interior will be less destabilizing than burial of Asn side-chain. This decrease in stability is defined by a large enthalpy of dehydration of polar groups upon burial. (iii) The destabilizing effect of dehydration of polar groups upon burial can be compensated if these buried polar groups form hydrogen bonding. The enthalpy of this hydrogen bonding will compensate for the unfavorable dehydration energy and as a result the effect will be energetically neutral or even slightly stabilizing.
  • Close Range Electrostatic Interactions in Proteins. Sandeep Kumar and Ruth Nussinov. ChemBioChem, 3, 604-617, 2002
    Abstract: Two types of noncovalent bonding interactions are present in protein structures, specific and nonspecific. Nonspecific interactions are mostly hydrophobic and van der Waals. Specific interactions are largely electrostatic. While the hydrophobic effect is the major driving force in protein folding, electrostatic interactions are important in protein folding, stability, flexibility, and function. Here we review the role of close-range electrostatic interactions (salt bridges) and their networks in proteins. Salt bridges are formed by spatially proximal pairs of oppositely charged residues in native protein structures. Often salt-bridging residues are also close in the protein sequence and fall in the same secondary structural element, building block, autonomous folding unit, domain, or subunit, consistent with the hierarchical model for protein folding. Recent evidence also suggests that charged and polar residues in largely hydrophobic interfaces may act as hot spots for binding. Salt bridges are rarely found across protein parts which are joined by flexible hinges, a fact suggesting that salt bridges constrain flexibility and motion. While conventional chemical intuition expects that salt bridges contribute favorably to protein stability, recent computational and experimental evidence shows that salt bridges can be stabilizing or destabilizing. Due to systemic protein flexibility, reflected in small-scale side-chain and backbone atom motions, salt bridges and their stabilities fluctuate in proteins. At the same time, genome-wide, amino acid sequence composition, structural, and thermodynamic comparisons of thermophilic and mesophilic proteins indicate that specific interactions, such as salt bridges, may contribute significantly towards the thermophilic-mesophilic protein stability differential.
  • Interrogating protein interaction networks through structural biology. Aloy, P. and Russell, R. B. Proc. Natl Acad. Sci. USA, 99, 5896-5901, 2002.
    Abstract: Protein–protein interactions are central to most biological processes. Although much recent effort has been put into methods to identify interacting partners, there has been a limited focus on how these interactions compare with those known from three-dimensional (3D) structures. Because comparison of protein interactions often involves considering homologous, but not identical, proteins, a key issue is whether proteins that are homologous to an interacting pair will interact in the same way, or interact at all. Accordingly, we describe a method to test putative interactions on complexes of known 3D structure. Given a 3D complex and alignments of homologues of the interacting proteins, we assess the fit of any possible interacting pair on the complex by using empirical potentials. For studies of interacting protein families that show different specificities, the method provides a ranking of interacting pairs useful for prioritizing experiments. We evaluate the method on interacting families of proteins with multiple complex structures. We then consider the fibroblast growth factor/receptor system and explore the intersection between complexes of known structure and interactions proposed between yeast proteins by methods such as two-hybrids. We provide confirmation for several interactions, in addition to suggesting molecular details of how they occur.
  • MULTIPROSPECTOR: An Algorithm for the Prediction of Protein-Protein Interactions by Multimeric Threading. Long Lu, Hui Lu, and Jeffrey Skolnick. PROTEINS: Structure, Function, and Genetics, 49, 350-364, 2002
    Abstract: In this postgenomic era, the ability to identify protein-protein interactions on a genomic scale is very important to assist in the assignment of physiological function. Because of the increasing number of solved structures involving protein complexes, the time is ripe to extend threading to the prediction of quaternary structure. In this spirit, a multimeric threading approach has been developed. The approach is comprised of two phases. In the first phase, traditional threading on a single chain is applied to generate a set of potential structures for the query sequences. In particular, we use our recently developed threading algorithm, PROSPECTOR. Then, for those proteins whose template structures are part of a known complex, we rethread on both partners in the complex and now include a protein-protein interfacial energy. To perform this analysis, a database of multimeric protein structures has been constructed, the necessary interfacial pairwise potentials have been derived, and a set of empirical indicators to identify true multimers based on the threading Z-score and the magnitude of the interfacial energy have been established. The algorithm has been tested on a benchmark set comprised of 40 homodimers, 15 heterodimers, and 69 monomers that were scanned against a protein library of 2478 structures that comprise a representative set of structures in the Protein Data Bank. Of these, the method correctly recognized and assigned 36 homodimers, 15 heterodimers, and 65 monomers. This protocol was applied to identify partners and assign quaternary structures of proteins found in the yeast database of interacting proteins. Our multimeric threading algorithm correctly predicts 144 interacting proteins, compared to the 56 (26) cases assigned by PSI-BLAST using a (less) permissive E-value of 1 (0.01). Next, all possible pairs of yeast proteins have been examined. Predictions (n=2865) of protein-protein interactions are made; 1138 of these 2865 interactions have counterparts in the Database of Interacting Proteins. In contrast, PSI-BLAST made 1781 predictions, and 1215 have counterparts in DIP. An estimation of the false-negative rate for yeast-predicted interactions has also been provided. Thus, a promising approach to help assist in the assignment of protein-protein interactions on a genomic scale has been developed.
  • Recombinatoric exploration of novel folded structures: A heteropolymer-based model of protein evolutionary landscapes. Cui, Y., W.H. Wong, E. Bornberg-Bauer, and H.S. Chan. Proc. Natl. Acad. Sci. USA 99, 809-814, 2002.
    Abstract: The role of recombination in evolution is compared with that of point mutations (substitutions) in the context of a simple, polymer physics-based model mapping between sequence (genotype) and conformational (phenotype) spaces. Crossovers and point mutations of lattice chains with a hydrophobic polar code are investigated. Sequences encoding for a single ground-state conformation are considered viable and used as model proteins. Point mutations lead to diffusive walks on the evolutionary landscape, whereas crossovers can “tunnel” through barriers of diminished fitness. The degree to which crossovers allow for more efficient sequence and structural exploration depends on the relative rates of point mutations versus that of crossovers and the dispersion in fitness that characterizes the ruggedness of the evolutionary landscape. The probability that a crossover between a pair of viable sequences results in viable sequences is an order of magnitude higher than random, implying that a sequence's overall propensity to encode uniquely is embodied partially in local signals. Consistent with this observation, certain hydrophobicity patterns are significantly more favored than others among fragments (i.e., subsequences) of sequences that encode uniquely, and examples reminiscent of autonomous folding units in real proteins are found. The number of structures explored by both crossovers and point mutations is always substantially larger than that via point mutations alone, but the corresponding numbers of sequences explored can be comparable when the evolutionary landscape is rugged. Efficient structural exploration requires intermediate nonextreme ratios between point-mutation and crossover rates.
  • Types of inter-atomic interactions at the MHC-peptide interface: Identifying commonality from accumulated data Png Eak Hock Adrian, Ganapathy Rajaseger, Venkatarajan Subramanian Mathura, Meena Kishore Sakharkar and Pandjassarame Kangueane BMC Structural Biology, 2, 2002
    Abstract: Background: Quantitative information on the types of inter-atomic interactions at the MHC-peptide interface will provide insights to backbone/sidechain atom preference during binding. Qualitative descriptions of such interactions in each complex have been documented by protein crystallographers. However, no comprehensive report is available to account for the common types of inter-atomic interactions in a set of MHC-peptide complexes characterized by variation in MHC allele and peptide sequence. The available x-ray crystallography data for these complexes in the Protein Databank (PDB) provides an opportunity to identify the prevalent types of such interactions at the binding interface. Results: We calculated the percentage distributions of four types of interactions at varying interatomic distances. The mean percentage distribution for these interactions and their standard deviation about the mean distribution is presented. The prevalence of SS and SB interactions at the MHC-peptide interface is shown in this study. SB is clearly dominant at an inter-atomic distance of 3Å. Conclusion: The prevalently dominant SB interactions at the interface suggest the importance of peptide backbone conformation during MHC-peptide binding. Currently, available algorithms are developed for protein sidechain prediction upon fixed backbone template. This study shows the preference of backbone atoms in MHC-peptide binding and hence emphasizes the need for accurate peptide backbone prediction in quantitative MHC-peptide binding calculations.
  • Anti-cooperativity and cooperativity in hydrophobic interactions: Three-body free energy landscapes and comparison with implicit-solvent potential functions for proteins. Shimizu, S. and H.S. Chan. Proteins: Struct. Funct. Genet. 48, 15-30, 2002.
    Abstract: Potentials of mean force (PMFs) of three-body hydrophobic association are investigated to gain insight into similar processes in protein folding. Free energy landscapes obtained from explicit simulations of three methanes in water are compared with that predicted by popular implicit-solvent effective potentials for the study of proteins. Explicit-water simulations show that for an extended range of three-methane configurations, hydrophobic association at 25 degrees C under atmospheric pressure is mostly anti-cooperative, that is, less favorable than if the interaction free energies were pairwise additive. Effects of free energy nonadditivity on the kinetic path of association and the temperature dependence of additivity are explored by using a three-methane system and simplified chain models. The prevalence of anti-cooperativity under ambient conditions suggests that driving forces other than hydrophobicity also play critical roles in protein thermodynamic cooperativity. We evaluate the effectiveness of several implicit-solvent potentials in mimicking explicit water simulated three-body PMFs. The favorability of the contact free energy minimum is found to be drastically overestimated by solvent accessible surface area (SASA). Both the SASA and a volume-based Gaussian solvent exclusion model fail to predict the desolvation barrier. However, this barrier is qualitatively captured by the molecular surface area model and a recent "hydrophobic force field." None of the implicit-solvent models tested are accurate for the entire range of three-methane configurations and several other thermodynamic signatures considered.
  • Prediction of Protein-Protein Interaction Sites Using Support Vector Machines Yohei Minakuchi, Kenji Satou, Akihiko Konagaya, Takashi Ito Genome Informatics 13, 322-323, 2002
    Abstract: Protein-protein interactions play an important role in various biological processes. Over the past few years, several studies have been made on protein interface and those results enable us to obtain massive data on various aspects of protein interface. The problem that we have to consider next is predicting the interaction sites on the protein surface. Although several prediction methods are developed, there is still room for improvement. The purpose of this study is to develop a reliable prediction system of protein-protein interaction sites from their three-dimensional structure.
  • Prediction of protein-protein interaction sites in heterocomplexes with neural networks Fariselli, P., Pazos, F., Valencia, A., and Casadio, R. Eur. J. Biochem., 269, 1356-1361, 2002.
    Abstract: In this paper we address the problem of extracting features relevant for predicting protein-protein interaction sites from the three-dimensional structures of protein complexes. Our approach is based on information about evolutionary conservation and surface disposition. We implement a neural network based system, which uses a cross validation procedure and allows the correct detection of 73% of the residues involved in protein interactions in a selected database comprising 226 heterodimers. Our analysis confirms that the chemico-physical properties of interacting surfaces are difficult to distinguish from those of the whole protein surface. However neural networks trained with a reduced representation of the interacting patch and sequence profile are sufficient to generalize over the different features of the contact patches and to predict whether a residue in the protein surface is or is not in contact. By using a blind test, we report the prediction of the surface interacting sites of three structural components of the DnaK molecular chaperone system, and find close agreement with previously published experimental results. We propose that the predictor can significantly complement results from structural and functional proteomics.
  • Inferring Domain-Domain Interactions From Protein-Protein Interactions Minghua Deng, Shipra Mehta, Fengzhu Sun and Ting Chen 12, 1540-1548, 2002
    Abstract: The interaction between proteins is one of the most important features of protein functions. Behind protein-protein interactions there are protein domains interacting physically with one another to perform the necessary functions. Therefore, understanding protein interactions at the domain level gives a global view of the protein interaction network, and possibly of protein functions. Two research groups used yeast two-hybrid assays to generate 5719 interactions between proteins of the yeast Saccharomyces cerevisiae. This allows us to study the large-scale conserved patterns of interactions between protein domains. Using evolutionarily conserved domains defined in a protein-domain database called PFAM (http://PFAM.wustl.edu), we apply a Maximum Likelihood Estimation method to infer interacting domains that are consistent with the observed protein-protein interactions. We estimate the probabilities of interactions between every pair of domains and measure the accuracies of our predictions at the protein level. Using the inferred domain-domain interactions, we predict interactions between proteins. Our predicted protein-protein interactions have a significant overlap with the protein-protein interactions (MIPS: http://mips.gfs.de) obtained by methods other than the two-hybrid assays. The mean correlation coefficient of the gene expression profiles for our predicted interaction pairs is significantly higher than that for random pairs. Our method has shown robustness in analyzing incomplete data sets and dealing with various experimental errors. We found several novel protein-protein interactions such as RPS0A interacting with APG17 and TAF40 interacting with SPT3, which are consistent with the functions of the proteins.
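    The step from inferred domain-domain probabilities to protein-level predictions, as described in the Deng et al. abstract above, can be illustrated with a minimal sketch. It assumes that each domain pair acts independently, so two proteins interact if at least one of their domain pairs does; this combination rule, the function name and the example probabilities are illustrative assumptions, not values or code taken from the paper.

```python
from itertools import product


def protein_interaction_probability(domains_a, domains_b, domain_pair_prob):
    """Probability that two proteins interact, assuming independent domain pairs.

    domains_a, domains_b: iterables of domain identifiers for each protein.
    domain_pair_prob: dict mapping a sorted (domain, domain) tuple to the
        estimated probability that the two domains interact (hypothetical values).
    """
    p_no_pair_interacts = 1.0
    for m, n in product(domains_a, domains_b):
        key = tuple(sorted((m, n)))
        # Domain pairs without an estimate are treated as non-interacting.
        p_no_pair_interacts *= 1.0 - domain_pair_prob.get(key, 0.0)
    return 1.0 - p_no_pair_interacts


# Hypothetical domain-level estimates, for illustration only.
example_probs = {("D1", "D3"): 0.5, ("D2", "D3"): 0.2}
p = protein_interaction_probability(["D1", "D2"], ["D3"], example_probs)
print(f"P(interaction) = {p:.2f}")
```

    In a maximum likelihood setting, the domain-pair probabilities would be the parameters fitted to the observed protein-protein interactions; here they are simply supplied by hand.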

    2001

  • Standard atomic volumes in double-stranded DNA and packing in protein-DNA interfaces. Nadassy K, Tomas-Oliveira I, Alberts I, Janin J, Wodak SJ. Nucleic Acids Res., 29, 3362-76, 2001
    Abstract: Standard volumes for atoms in double-stranded B-DNA are derived using high resolution crystal structures from the Nucleic Acid Database (NDB) and compared with corresponding values derived from crystal structures of small organic compounds in the Cambridge Structural Database (CSD). Two different methods are used to compute these volumes: the classical Voronoi method, which does not depend on the size of atoms, and the related Radical Planes method which does. Results show that atomic groups buried in the interior of double-stranded DNA are, on average, more tightly packed than in related small molecules in the CSD. The packing efficiency of DNA atoms at the interfaces of 25 high resolution protein-DNA complexes is determined by computing the ratios between the volumes of interfacial DNA atoms and the corresponding standard volumes. These ratios are found to be close to unity, indicating that the DNA atoms at protein-DNA interfaces are as closely packed as in crystals of B-DNA. Analogous volume ratios, computed for buried protein atoms, are also near unity, confirming our earlier conclusions that the packing efficiency of these atoms is similar to that in the protein interior. In addition, we examine the number, volume and solvent occupation of cavities located at the protein-DNA interfaces and compare them with those in the protein interior. Cavities are found to be ubiquitous in the interfaces as well as inside the protein moieties. The frequency of solvent occupation of cavities is however higher in the interfaces, indicating that those are more hydrated than protein interiors. Lastly, we compare our results with those obtained using two different measures of shape complementarity of the analysed interfaces, and find that the correlation between our volume ratios and these measures, as well as between the measures themselves, is weak. Our results indicate that a tightly packed environment made up of DNA, protein and solvent atoms plays a significant role in protein-DNA recognition.
  • Three-dimensional structure of the lithostathine protofibril, a protein involved in Alzheimer's disease. C. Grégoire, S. Marco, J. Thimonier, L. Duplan, E. Laurine, J.-P. Chauvin, B. Michel, V. Peyrot and J.-M. Verdier. The EMBO Journal, 20, 3313-3321, 2001.
    Abstract: Neurodegenerative diseases are characterised by the presence of filamentous aggregates of protein. We previously established that lithostathine is a protein overexpressed in the pre-clinical stages of Alzheimer's disease. Furthermore, it is present in the pathognomonic lesions associated with Alzheimer's disease. After self-proteolysis, the N-terminally truncated form of lithostathine leads to the formation of fibrillar aggregates. Here we observed using atomic force microscopy that these aggregates consisted of a network of protofibrils, each of which had a twisted appearance. Electron microscopy and image analysis showed that this twisted protofibril has a quadruple helical structure. Three dimensional X-ray structural data and the results of biochemical experiments showed that when forming a protofibril, lithostathine was first assembled into a tetramer as the result of longitudinal electrostatic interactions. All these results were used to build a structural model for the lithostathine protofibril called the quadruple-helical filament (QHF-litho). In conclusion, lithostathine strongly resembles the prion protein in its dramatic proteolysis and amyloid proteins in its ability to form fibrils.
  • Identification of protein oligomerization states by analysis of interface conservation. Adrian H. Elcock and J. Andrew McCammon. Proc. Natl. Acad. Sci. USA 98, 2990-2994, 2001
    Abstract: The discrimination of true oligomeric protein-protein contacts from nonspecific crystal contacts remains problematic. Criteria that have been used previously base the assignment of oligomeric state on consideration of the area of the interface and/or the results of scoring functions based on statistical potentials. Both techniques have a high success rate but fail in more than 10% of cases. More importantly, the oligomeric states of several proteins are incorrectly assigned by both methods. Here we test the hypothesis that true oligomeric contacts should be identifiable on the basis of an increased degree of conservation of the residues involved in the interface. By quantifying the degree of conservation of the interface and comparing it with that of the remainder of the protein surface, we develop a new criterion that provides a highly effective complement to existing methods.
  • Mapping Protein Family Interactions: Intramolecular and Intermolecular Protein Family Interaction Repertoires in the PDB and Yeast: Jong Park, Michael Lappe and Sarah A. Teichmann. J. Mol. Biol. 307, 929-938, 2001.
    Abstract: In the postgenomic era, one of the most interesting and important challenges is to understand protein interactions on a large scale. The physical interactions between protein domains are fundamental to the workings of a cell: in multi-domain polypeptide chains, in multi-subunit proteins and in transient complexes between proteins that also exist independently. To study the large-scale patterns and evolution of interactions between protein domains, we view interactions between protein domains in terms of the interactions between structural families of evolutionarily related domains. This allows us to classify 8151 interactions between individual domains in the Protein Data Bank and the yeast Saccharomyces cerevisiae in terms of 664 types of interactions, between protein families. At least 51 interactions do not occur in the Protein Data Bank and can only be derived from the yeast data. The map of interactions between protein families has the form of a scale-free network, meaning that most protein families only interact with one or two other families, while a few families are extremely versatile in their interactions and are connected to many families. We observe that almost half of all known families engage in interactions with domains from their own family. We also see that the repertoires of interactions of domains within and between polypeptide chains overlap mostly for two specific types of protein families: enzymes and same-family interactions. This suggests that different types of protein interaction repertoires exist for structural, functional and regulatory reasons.
  • Conservation Helps to Identify Biologically Relevant Crystal Contacts. William S. J. Valdar and Janet M. Thornton. J. Mol. Biol. 313, 399-416, 2001
    Abstract: Some crystal contacts are biologically relevant, most are not. We assess the utility of combining measures of size and conservation to discriminate between biological and non-biological contacts. Conservation and size information is calculated for crystal contacts in 53 families of homodimers and 65 families of monomers. Biological contacts are shown to be usually conserved and typically the largest contact in the crystal. A range of neural networks accepting different combinations and encodings of this information is used to answer the following questions: (1) is a given crystal contact biological, and (2) given all crystal contacts in a homodimer, which is the biological one? Predictions for (1) are performed on both homodimer and monomer datasets. The best performing neural network combined size and conservation inputs. For the homodimers, it correctly classified 48 out of 53 biological contacts and 364 out of 366 nonbiological contacts, giving a combined accuracy of 98.3%. A more robust performance statistic, the phi-coefficient, which accounts for imbalances in the dataset, gave a value of 0.92. Taking all 535 non-biological contacts from the 65 monomers, this predictor made erroneous classifications only 4.3% of the time. Predictions for (2) were performed on homodimers only. The best performing network achieved a prediction accuracy of 98.1% using size information alone. We conclude that in answering question (1) size and conservation combined discriminate biological from non-biological contacts better than either measure alone. For answering question (2), we conclude that in our dataset size is so powerful a discriminant that conservation adds little predictive benefit.
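    The accuracy and phi-coefficient quoted in the Valdar and Thornton abstract above can be reproduced from the four counts it gives for the homodimer set. The sketch below is only a worked check using the standard definition of the phi (Matthews) coefficient for a 2x2 confusion matrix; the function names are illustrative and the code is not taken from the paper.

```python
import math


def accuracy(tp, fn, tn, fp):
    """Fraction of all contacts that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def phi_coefficient(tp, fn, tn, fp):
    """Phi (Matthews) correlation coefficient of a 2x2 confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator


# Counts from the abstract: 48 of 53 biological contacts and
# 364 of 366 non-biological contacts were classified correctly.
tp, fn = 48, 53 - 48
tn, fp = 364, 366 - 364

print(f"accuracy = {accuracy(tp, fn, tn, fp):.3f}")
print(f"phi      = {phi_coefficient(tp, fn, tn, fp):.2f}")
```

    Because the set contains roughly seven non-biological contacts for every biological one, accuracy alone is dominated by the larger class, which is why the abstract also reports phi.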
  • Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. N.M. Luscombe, R.A. Laskowski and J.M. Thornton. Nucleic Acids Res. 29, 2860-2874, 2001. http://www.biochem.ucl.ac.uk/~nick/aa-base/
    Abstract: To assess whether there are universal rules that govern amino acid-base recognition, we investigate hydrogen bonds, van der Waals contacts and water-mediated bonds in 129 protein-DNA complex structures. DNA-backbone interactions are the most numerous, providing stability rather than specificity. For base interactions, there are significant base-amino acid type correlations, which can be rationalized by considering the stereochemistry of protein side chains and the base edges exposed in the DNA structure. Nearly two-thirds of the direct read-out of DNA sequences involves complex networks of hydrogen bonds which enhance specificity. Two-thirds of all protein-DNA interactions comprise van der Waals contacts, compared to about one-sixth each of hydrogen and water mediated bonds. This highlights the central importance of these contacts for complex formation, which have previously been relegated to a secondary role. Although common, water-mediated bonds are usually non-specific, acting as space fillers at the protein-DNA interface. In conclusion, the majority of amino acid-base interactions observed follow general principles that apply across all protein-DNA complexes, although there are individual exceptions. Therefore, we distinguish between interactions whose specificities are 'universal' and 'context-dependent'. An interactive Web-based atlas of side chain-base contacts provides access to the collected data, including analysis and visualization of the three-dimensional geometry of the interactions.
  • On the molecular discrimination between adenine and guanine by proteins. I. Nobeli, R.A. Laskowski, W.S.J. Valdar and J.M. Thornton. Nucleic Acids Res. 29, 4294-4309, 2001
    Abstract: The molecular recognition and discrimination of adenine and guanine ligand moieties in complexes with proteins have been studied using empirical observations on carefully selected crystal structures. The distribution of protein folds that bind these purines has been found to differ significantly from that across the whole PDB, but the most populated architectures and folds are also the most common in three genomes from the three different domains of life. The protein environments around the two nucleic acid bases were significantly different, in terms of the propensities of amino acid residues to be in the binding site, as well as their propensities to form hydrogen bonds to the bases. Plots of the distribution of protein atoms around the two purines clearly show different clustering of hydrogen bond donors and acceptors opposite complementary acceptors and donors in the rings, with hydrophobic areas below and above the rings. However, the clustering pattern is fuzzy, reflecting the variety of ways that proteins have evolved to recognize the same molecular moiety. Furthermore, an analysis of the conservation of residues in the protein chains binding guanine shows that residues in contact with the base are in general better conserved than the rest of the chain.
  • Structure of the C-terminal sterile α-motif (SAM) domain of human p73α. Wooi Koon Wang, Mark Bycroft, Nicholas W. Foster, Ashley M. Buckle, Alan R. Fersht and Yu Wai Chen. Acta Crystallographica Section D D57, 545-551, 2001
    Abstract: p73 is a homologue of the tumour suppressor p53 and contains all three functional domains of p53. The α-splice variant of p73 (p73α) contains near its C-terminus an additional structural domain known as the sterile α-motif (SAM) that is probably responsible for regulating p53-like functions of p73. Here, the 2.54 Å resolution crystal structure of this protein domain is reported. The crystal structure and the published solution structure have the same five-helix bundle fold that is characteristic of all SAM-domain structures, with an overall r.m.s.d. of 1.5 Å for main-chain atoms. The hydrophobic core residues are well conserved, yet some large local differences are observed. The crystal structure reveals a dimeric organization, with the interface residues forming a mini four-helix bundle. However, analysis of solvation free energies and the surface area buried upon dimer formation indicated that this arrangement is more likely to be an effect of crystal packing rather than reflecting a physiological state. This is consistent with the solution structure being a monomer. The p73α SAM domain also contains several interesting structural features: a Cys-X-X-Cys motif, a 3(10)-helix and a loop that have elevated B factors, and short tight inter-helical loops including two β-turns; these elements are probably important in the normal function of this domain.
  • Method development of combination of atomic force and electron microscopy data to obtain better 3D reconstructions. Velázquez Muriel, J.A., Sorzano, C.O.S., Carazo, J.M. Proceedings of the Spanish, Portuguese and French Microscopy Societies meeting at Barcelona. 2001.
  • Contribution of polar groups in the interior of a protein to the conformational stability. Takano, K., Y. Yamagata, and K. Yutani. Biochemistry 40, 4853-4858, 2001.
    Abstract: It has been generally believed that polar residues are usually located on the surface of protein structures. However, there are many polar groups in the interior of the structures in reality. To evaluate the contribution of such buried polar groups to the conformational stability of a protein, nonpolar to polar mutations (L8T, A9S, A32S, I56T, I59T, I59S, A92S, V93T, A96S, V99T, and V100T) in the interior of a human lysozyme were examined. The thermodynamic parameters for denaturation were determined using a differential scanning calorimeter, and the crystal structures were analyzed by X-ray crystallography. If a polar group had a heavy energy cost to be buried, a mutant protein would be remarkably destabilized. However, the stability (Delta G) of the Ala to Ser and Val to Thr mutant human lysozymes was comparable to that of the wild-type protein, suggesting a low-energy penalty of buried polar groups. The structural analysis showed that all polar side chains introduced in the mutant proteins were able to find their hydrogen bond partners, which are ubiquitous in protein structures. The empirical structure-based calculation of stability change (Delta Delta G) [Takano et al. (1999) Biochemistry 38, 12698-12708] revealed that the mutant proteins decreased the hydrophobic effect contributing to the stability (Delta G(HP)), but this destabilization was recovered by the hydrogen bonds newly introduced. The present study shows the favorable contribution of polar groups with hydrogen bonds in the interior of protein molecules to the conformational stability.
  • Achieving stability and conformational specificity in designed proteins via binary patterning. Marshall, S.A. and S.L. Mayo. J. Mol. Biol. 305, 619-631, 2001.
    Abstract: We have developed a method to determine the optimal binary pattern (arrangement of hydrophobic and polar amino acids) of a target protein fold prior to amino acid sequence selection in protein design studies. A solvent accessible surface is generated for a target fold using its backbone coordinates and "generic" side-chains, which are constructs whose size and shape are similar to an average amino acid. Each position is classified as hydrophobic or polar according to the solvent exposure of its generic side-chain. The method was tested by analyzing a set of proteins in the Protein Data Bank and by experimentally constructing and analyzing a set of engrailed homeodomain variants whose binary patterns were systematically varied. Selection of the optimal binary pattern results in a designed protein that is monomeric, well-folded, and hyperthermophilic. Homeodomain variants with fewer hydrophobic residues are destabilized, while additional hydrophobic residues induce aggregation. Binary patterning, in conjunction with a force field that models folded state energies, appears sufficient to satisfy two basic goals of protein design: stability and conformational specificity.
  • Genomic-scale comparison of sequence- and structure-based methods of function prediction: does structure provide additional insight? Fetrow, J. S., Siew, N., Di Gennaro, J. A., Martinez-Yamout, M., Dyson, H. J. and Skolnick, J. Protein Sci. 10, 1005-1014, 2001.
    Abstract: A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone.
  • Prediction of protein interaction sites from sequence profile and residue neighbor list. Zhou, H. X. and Shan, Y. Proteins: Struct. Funct. Genet. 44, 336-343, 2001.
    Abstract: Protein-protein interaction sites are predicted from a neural network with sequence profiles of neighboring residues and solvent exposure as input. The network was trained on 615 pairs of nonhomologous complex-forming proteins. Tested on a different set of 129 pairs of nonhomologous complex-forming proteins, 70% of the 11,004 predicted interface residues are actually located in the interfaces. These 7732 correctly predicted residues account for 65% of the 11,805 residues making up the 129 interfaces. The main strength of the network predictor lies in the fact that neighbor lists and solvent exposure are relatively insensitive to structural changes accompanying complex formation. As such, it performs equally well with bound or unbound structures of the proteins. For a set of 35 test proteins, when the input was calculated from the bound and unbound structures, the correct fractions of the predicted interface residues were 69 and 70%, respectively.
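    The two percentages in the Zhou and Shan abstract above are the precision and recall of the interface predictor, computed from the same 7732 correctly predicted residues. The sketch below is a worked check of that arithmetic; the function and variable names are illustrative and nothing here comes from the authors' code.

```python
def precision_recall(correct, predicted, actual):
    """Return (precision, recall) for a set of interface-residue predictions.

    correct:   residues predicted as interface that truly are interface
    predicted: all residues predicted as interface
    actual:    all residues that truly are interface
    """
    return correct / predicted, correct / actual


# Numbers from the abstract for the 129-pair test set.
precision, recall = precision_recall(correct=7732, predicted=11004, actual=11805)

print(f"precision = {precision:.1%}")  # share of predictions that are real interface residues
print(f"recall    = {recall:.1%}")     # share of real interface residues that were found
```

    Reporting both figures matters here: a predictor could reach high precision by flagging only a handful of residues, or high recall by flagging most of the surface.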
  • Residue frequencies and pairing preferences at protein-protein interfaces. Glaser, F., Steinberg, D. M., Vakser, I. A. and Ben-Tal, N. Proteins: Struct. Funct. Genet. 43, 89-102, 2001.
    Abstract: We used a nonredundant set of 621 protein-protein interfaces of known high-resolution structure to derive residue composition and residue-residue contact preferences. The residue composition at the interfaces, in entire proteins and in whole genomes correlates well, indicating the statistical strength of the data set. Differences between amino acid distributions were observed for interfaces with buried surface area of less than 1,000 Å² versus interfaces with area of more than 5,000 Å². Hydrophobic residues were abundant in large interfaces while polar residues were more abundant in small interfaces. The largest residue-residue preferences at the interface were recorded for interactions between pairs of large hydrophobic residues, such as Trp and Leu, and the smallest preferences for pairs of small residues, such as Gly and Ala. On average, contacts between pairs of hydrophobic and polar residues were unfavorable, and the charged residues tended to pair subject to charge complementarity, in agreement with previous reports. A bootstrap procedure, lacking from previous studies, was used for error estimation. It showed that the statistical errors in the set of pairing preferences are generally small; the average standard error is approximately 0.2, i.e., about 8% of the average value of the pairwise index (2.9). However, for a few pairs (e.g., Ser-Ser and Glu-Asp) the standard error is larger in magnitude than the pairing index, which makes it impossible to tell whether contact formation is favorable or unfavorable. The results are interpreted using physicochemical factors and their implications for the energetics of complex formation and for protein docking are discussed.
  • Polar residues in the core of Escherichia coli thioredoxin are important for fold specificity. Bolon, D.N., and S.L. Mayo. Biochemistry 40, 10047-10053, 2001.
    Abstract: Most globular proteins contain a core of hydrophobic residues that are inaccessible to solvent in the folded state. In general, polar residues in the core are thermodynamically unfavorable except when they are able to form intramolecular hydrogen bonds. Compared to hydrophobic interactions, polar interactions are more directional in character and may aid in fold specificity. In a survey of 263 globular protein structures, we found a strong positive correlation between the number of polar residues at core positions and protein size. To probe the importance of buried polar residues, we experimentally tested the effects of hydrophobic mutations at the five polar core residues in Escherichia coli thioredoxin. Proteins with single hydrophobic mutations (D26I, C32A, C35A, T66L, and T77V) all have cooperative unfolding transitions like the wild type (wt), as determined by chemical denaturation. Relative to wt, D26I is more stable while the other point mutants are less stable. The combined 5-fold mutant protein (IAALV) is less stable than wt and has an unfolding transition that is substantially less cooperative than that of wt. NMR spectra as well as amide deuterium exchange indicate that IAALV is likely sampling a number of low-energy structures in the folded state, suggesting that polar residues in the core are important for specifying a well-folded native structure.
  • Helix-Helix Packing and Interfacial Pairwise Interactions of Residues in Membrane Proteins Larisa Adamian and Jie Liang J. Mol. Biol., 311, 891-907, 2001
    Abstract: Helix-helix packing plays a critical role in maintaining the tertiary structures of helical membrane proteins. By examining the overall distribution of voids and pockets in the transmembrane (TM) regions of helical membrane proteins, we found that bacteriorhodopsin and halorhodopsin are the most tightly packed, whereas mechanosensitive channel is the least tightly packed. Large residues F, W, and H have the highest propensity to be in a TM void or a pocket, whereas small residues such as S, G, A, and T are least likely to be found in a void or a pocket. The coordination number for non-bonded interactions for each of the residue types is found to correlate with the size of the residue. To assess specific interhelical interactions between residues, we have developed a new computational method to characterize nearest neighboring atoms that are in physical contact. Using an atom-based probabilistic model, we estimate the membrane helical interfacial pairwise (MHIP) propensity. We found that there are many residue pairs that have high propensity for interhelical interactions, but disulfide bonds are rarely found in the TM regions. The high propensity pairs include residue pairs between an aromatic residue and a basic residue (W-R, W-H, and Y-K). In addition, many residue pairs have high propensity to form interhelical polar-polar atomic contacts, for example, residue pairs between two ionizable residues, between one ionizable residue and one N or Q. Soluble proteins do not share this pattern of diverse polar-polar interhelical interaction. Exploratory analysis by clustering of the MHIP values suggests that residues similar in side-chain branchness, cyclic structures, and size tend to have correlated behavior in participating interhelical interactions. A chi-square test rejects the null hypothesis that membrane protein and soluble protein have the same distribution of interhelical pairwise propensity. This observation may help us to understand the folding mechanism of membrane proteins.
  • Electrostatic contributions to protein-protein interactions: fast energetic filters for docking and their physical basis. Norel R, Sheinerman F, Petrey D, Honig B. Protein Sci., 10, 2147-61, 2001
    Abstract: The methods of continuum electrostatics are used to calculate the binding free energies of a set of protein-protein complexes including experimentally determined structures as well as other orientations generated by a fast docking algorithm. In the native structures, charged groups that are deeply buried were often found to favor complex formation (relative to isosteric nonpolar groups), whereas in nonnative complexes generated by a geometric docking algorithm, they were equally likely to be stabilizing as destabilizing. These observations were used to design a new filter for screening docked conformations that was applied, in conjunction with a number of geometric filters that assess shape complementarity, to 15 antibody-antigen complexes and 14 enzyme-inhibitor complexes. For the bound docking problem, which is the major focus of this paper, native and near-native solutions were ranked first or second in all but two enzyme-inhibitor complexes. Less success was encountered for antibody-antigen complexes, but in all cases studied, the more complete free energy evaluation was able to identify native and near-native structures. A filter based on the enrichment of tyrosines and tryptophans in antibody binding sites was applied to the antibody-antigen complexes and resulted in a native and near-native solution being ranked first and second in all cases. A clear improvement over previously reported results was obtained for the unbound antibody-antigen examples as well. The algorithm and various filters used in this work are quite efficient and are able to reduce the number of plausible docking orientations to a size small enough so that a final more complete free energy evaluation on the reduced set becomes computationally feasible.

  • 2000

  • Discriminating between homodimeric and monomeric proteins in the crystalline state. Ponstingl, H., K. Henrick, and J. M. Thornton. Proteins 41, 47-57, 2000.
    Abstract: Scores calculated from intermolecular contacts of proteins in the crystalline state are used to differentiate monomeric and homodimeric proteins, by classification into two categories separated by a cut-off score value. The generalized classification error is estimated by using bootstrap re-sampling on a nonredundant set of 172 water-soluble proteins whose prevalent quaternary state in solution is known to be either monomeric or homodimeric. A statistical potential, based on atom-pair frequencies across interfaces observed with homodimers, is found to yield an error rate of 12.5%. This indicates a small but significant improvement over the measure of solvent accessible surface area buried in the contact interface, which achieves an error rate of 15.4%. A further modification of the latter parameter relating the two most extensive contacts of the crystal results in an even lower error rate of 11.1%.
  • A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Uetz, P., Giot, L., Cagney, G., Mansfield, T. A., Judson, R. S. and Knight, J. R., et al. Nature 403, 623-627, 2000.
    Abstract: Two large-scale yeast two-hybrid screens were undertaken to identify protein-protein interactions between full-length open reading frames predicted from the Saccharomyces cerevisiae genome sequence. In one approach, we constructed a protein array of about 6,000 yeast transformants, with each transformant expressing one of the open reading frames as a fusion to an activation domain. This array was screened by a simple and automated procedure for 192 yeast proteins, with positive responses identified by their positions in the array. In a second approach, we pooled cells expressing one of about 6,000 activation domain fusions to generate a library. We used a high-throughput screening procedure to screen nearly all of the 6,000 predicted yeast proteins, expressed as Gal4 DNA-binding domain fusion proteins, against the library, and characterized positives by sequence analysis. These approaches resulted in the detection of 957 putative interactions involving 1,004 S. cerevisiae proteins. These data reveal interactions that place functionally unclassified proteins in a biological context, interactions between proteins involved in the same biological function, and interactions that link biological functions together into larger cellular processes. The results of these screens are shown here.
  • Protein domain interfaces: characterization and comparison with oligomeric protein interfaces. Jones, S., Marin, A. and Thornton, J. M. Protein Eng. 13, 77-82, 2000.
    Abstract: The physical and chemical properties of domain-domain interactions have been analysed in two-domain proteins selected from the protein classification, CATH. The two-domain structures were divided into those derived from (i) monomeric proteins, or (ii) oligomeric or complexed proteins. The size, polarity, hydrogen bonding and packing of the intra-chain domain interface were calculated for both sets of two-domain structures. The results were compared with inter-chain interface parameters from permanent and non-obligate protein-protein complexes. In general, the intra-chain domain and inter-chain interfaces were remarkably similar. Many of the intra-chain interface properties are intermediate between those calculated for permanent and non-obligate inter-chain complexes. Residue interface propensities were also found to be very similar, with hydrophobic residues playing a major role, together with positively charged arginine residues. In addition, the residue composition of the domain interfaces were found to be more comparable with domain surfaces than domain cores. The implications of these results for domain swapping and protein folding are discussed.
  • Conservation of polar residues as hot spots at protein interfaces. Hu, Z., Ma, B., Wolfson, H. and Nussinov, R. Proteins Struct. Funct. Genet. 39, 331-342, 2000.
    Abstract: A number of studies have addressed the question of which are the critical residues at protein-binding sites. These studies examined either a single or a few protein-protein interfaces. The most extensive study to date has been an analysis of alanine-scanning mutagenesis. However, although the total number of mutations was large, the number of protein interfaces was small, with some of the interfaces closely related. Here we show that although overall binding sites are hydrophobic, they are studded with specific, conserved polar residues at specific locations, possibly serving as energy "hot spots." Our results confirm and generalize the alanine-scanning data analysis, despite its limited size. Previously Trp, Arg, and Tyr were shown to constitute energetic hot spots. These were rationalized by their polar interactions and by their surrounding rings of hydrophobic residues. However, there was no compelling reason as to why specifically these residues were conserved. Here we show that other polar residues are similarly conserved. These conserved residues have been detected consistently in all interface families that we have examined. Our results are based on an extensive examination of residues which are in contact across protein interfaces. We utilize all clustered interface families with at least five members and with sequence similarity between the members in the range of 20-90%. There are 11 such clustered interface families, comprising a total of 97 crystal structures. Our three-dimensional superpositioning analysis of the occurrences of matched residues in each of the families identifies conserved residues at spatially similar environments. Additionally, in enzyme inhibitors, we observe that residues are more conserved at the interfaces than at other locations. On the other hand, antibody-protein interfaces have similar surface conservation as compared to their corresponding linear sequence alignment, consistent with the suggestion that evolution has optimized protein interfaces for function.
  • Convergent solutions to binding at a protein-protein interface. DeLano, W. L., Ultsch, M. H., de Vos, A. M. and Wells, J. A. Science 287, 1279-1283, 2000.
    Abstract: The hinge region on the Fc fragment of human immunoglobulin G interacts with at least four different natural protein scaffolds that bind at a common site between the C(H2) and C(H3) domains. This "consensus" site was also dominant for binding of random peptides selected in vitro for high affinity (dissociation constant, about 25 nanomolar) by bacteriophage display. Thus, this site appears to be preferred owing to its intrinsic physiochemical properties, and not for biological function alone. A 2.7 angstrom crystal structure of a selected 13-amino acid peptide in complex with Fc demonstrated that the peptide adopts a compact structure radically different from that of the other Fc binding proteins. Nevertheless, the specific Fc binding interactions of the peptide strongly mimic those of the other proteins. Juxtaposition of the available Fc-complex crystal structures showed that the convergent binding surface is highly accessible, adaptive, and hydrophobic and contains relatively few sites for polar interactions. These are all properties that may promote cross-reactive binding, which is common to protein-protein interactions and especially hormone-receptor complexes.
  • DIP: the database of interacting proteins. Xenarios, I., Rice, D. W., Salwinski, L., Baron, M. K., Marcotte, E. M. and Eisenberg, D. Nucl. Acids Res. 28, 289-291, 2000.
    Abstract: The Database of Interacting Proteins is a database that documents experimentally determined protein-protein interactions. Since January 2000 the number of protein-protein interactions in DIP has nearly tripled to 3472 and the number of proteins to 2659. New interactive tools have been developed to aid in the visualization, navigation and study of networks of protein interactions.
  • Buried polar interactions and conformational stability in the simian immunodeficiency virus (SIV) gp41 core. Ji, H., C. Bracken, and M. Lu. Biochemistry 39, 676-685, 2000.
    Abstract: For human (HIV) and simian (SIV) immunodeficiency viruses, the gp41 envelope protein undergoes a receptor-activated conformational change from a labile native structure to an energetically more stable fusogenic conformation, which then mediates viral-cell membrane fusion. The core structure of fusion-active gp41 is a six-helix bundle in which three antiparallel carboxyl-terminal helices are packed against an amino-terminal trimeric coiled coil. Here we show that a recombinant model of the SIV gp41 core, designated N36(L6)C34, forms an alpha-helical trimer that exhibits a cooperative two-state folding-unfolding transition. We investigate the importance of buried polar interactions in determining the overall fold of the gp41 core. We have replaced each of four polar amino acids at the heptad a and d positions of the coiled coil in N36(L6)C34 with a representative hydrophobic amino acid, isoleucine. The Q565I, T582I, and T586I variants form six-helix bundle structures that are significantly more stable than that of the wild-type peptide, whereas the Q575I variant misfolds into an insoluble aggregate under physiological conditions. Thus, the buried polar residues within the amino-terminal heptad repeat are important determinants of the structural specificity and stability of the gp41 core. We suggest that these conserved buried polar interactions play a role in governing the conformational state of the gp41 molecule.
  • Buried charged surface in proteins. Kajander, T., P.C. Kahn, S.H. Passila, D.C. Cohen, L. Lehtiö, W. Adolfsen, J. Warwicker, U. Schell and A. Goldman. Structure with Folding and Design 8, 1203-1214, 2000.
    Abstract: BACKGROUND: The traditional picture of charged amino acids in globular proteins is that they are almost exclusively on the outside exposed to the solvent. Buried charges, when they do occur, are assumed to play an essential role in catalysis and ligand binding, or in stabilizing structure as, for instance, helix caps. RESULTS: By analyzing the amount and distribution of buried charged surface and charges in proteins over a broad range of protein sizes, we show that buried charge is much more common than is generally believed. We also show that the amount of buried charge rises with protein size in a manner which differs from other types of surfaces, especially aromatic and polar uncharged surfaces. In large proteins such as hemocyanin, 35% of all charges are greater than 75% buried. Furthermore, at all sizes few charged groups are fully exposed. As an experimental test, we show that replacement of the buried D178 of muconate lactonizing enzyme by N stabilizes the enzyme by 4.2 degrees C without any change in crystallographic structure. In addition, free energy calculations of stability support the experimental results. CONCLUSIONS: Nature may use charge burial to reduce protein stability; not all buried charges are fully stabilized by a prearranged protein environment. Consistent with this view, thermophilic proteins often have less buried charge. Modifying the amount of buried charge at carefully chosen sites may thus provide a general route for changing the thermophilicity or psychrophilicity of proteins.

  • 1999

  • Decamers observed in the crystals of bovine pancreatic trypsin inhibitor. Jacek Lubkowski and Alexander Wlodawer. Acta Crystallographica Section D D55, 335-337, 1999
    Abstract: The structure of bovine pancreatic trypsin inhibitor (BPTI) has been solved at 2.1 Å resolution in a new crystal form (space group P6422 with unit-cell dimensions a = b = 95.0, c = 158.1 Å). The asymmetric unit is a pentamer, but a decamer is created by application of crystallographic symmetry. The decamer of BPTI is only the fourth such assembly reported to date in the Protein Data Bank.
  • Wet and dry interfaces: the role of solvent in protein-protein and protein-DNA recognition. Janin J. Structure Fold Des. 7, R277-279, 1999
    Abstract: Water molecules are found in abundance in protein-protein and protein-DNA interfaces. Although interface solvent molecules exchange quickly with the bulk solvent, structural and biochemical data suggest that water-mediated interactions are as important as direct hydrogen bonds in the stability and specificity of recognition.
  • Structural features of protein-nucleic acid recognition sites. Nadassy K, Wodak SJ, Janin J. Biochemistry, 38, 1999-2017, 1999
    Abstract: We analyzed the atomic models of 75 X-ray structures of protein-nucleic acid complexes with the aim of uncovering common properties. The interface area measured the extent of contact between the protein and nucleic acid. It was found to vary between 1120 and 5800 Å². Despite this wide variation, the interfaces in complexes of transcription factors with double-stranded DNA could be broken up into recognition modules where 12 +/- 3 nucleotides on the DNA side contact 24 +/- 6 amino acids on the protein side, with interface areas in the range 1600 +/- 400 Å². For enzymes acting on DNA, the recognition module is on average 600 Å² larger, due to the requirement of making an active site. As judged by its chemical and amino acid composition, the average protein surface in contact with the DNA is more polar than the solvent accessible surface or the typical protein-protein interface. The protein side is rich in positively charged groups from lysine and arginine side chains; on the DNA side the negative charges from phosphate groups dominate. Hydrogen bonding patterns were also analyzed, and we found one intermolecular hydrogen bond per 125 Å² of interface area in high-resolution structures. An equivalent number of polar interactions involved water molecules, which are generally abundant at protein-DNA interfaces. Calculations of Voronoi atomic volumes, performed in the presence and absence of water molecules, showed that protein atoms buried at the interface with DNA are on average as closely packed as in the protein interior. Water molecules contribute to the close packing, thereby mediating shape complementarity. Finally, conformational changes accompanying association were analyzed in 24 of the complexes for which the structure of the free protein was also available. On the DNA side the extent of deformation showed some correlation with the size of the interface area. On the protein side the type and size of the structural changes spanned a wide spectrum. Disorder-to-order transitions, domain movements, quaternary and tertiary changes were observed, and the largest changes occurred in complexes with large interfaces.
  • Dimer dissociation of the pore-forming toxin aerolysin precedes receptor binding. Fivaz, M., Velluz, M. C. and van der Goot, F. G. J. Biol. Chem. 274, 37705-37708, 1999.
    Abstract: The pore-forming toxin aerolysin is secreted by Aeromonas hydrophila as an inactive precursor. Based on chemical cross-linking and gel filtration, we show here that proaerolysin exists as a monomer at low concentrations but is dimeric above 0.1 mg/ml. At intermediate concentrations, monomers and dimers appeared to be in rapid equilibrium. All together our data indicate that, at low concentrations, the toxin is a monomer and that this species is competent for receptor binding. In contrast, a mutant toxin that forms a covalent dimer was unable to bind to target cells.
  • The atomic structure of protein-protein recognition sites. Lo Conte, L., Chothia, C. and Janin, J. J. Mol. Biol. 285, 2177-2198, 1999.
    Abstract: The non-covalent assembly of proteins that fold separately is central to many biological processes, and differs from the permanent macromolecular assembly of protein subunits in oligomeric proteins. We performed an analysis of the atomic structure of the recognition sites seen in 75 protein-protein complexes of known three-dimensional structure: 24 protease-inhibitor, 19 antibody-antigen and 32 other complexes, including nine enzyme-inhibitor and 11 that are involved in signal transduction. The size of the recognition site is related to the conformational changes that occur upon association. Of the 75 complexes, 52 have "standard-size" interfaces in which the total area buried by the components in the recognition site is 1600 (+/-400) Å². In these complexes, association involves only small changes of conformation. Twenty complexes have "large" interfaces burying 2000 to 4660 Å², and large conformational changes are seen to occur in those cases where we can compare the structure of complexed and free components. The average interface has approximately the same non-polar character as the protein surface as a whole, and carries somewhat fewer charged groups. However, some interfaces are significantly more polar and others more non-polar than the average. Of the atoms that lose accessibility upon association, half make contacts across the interface and one-third become fully inaccessible to the solvent. In the latter case, the Voronoi volume was calculated and compared with that of atoms buried inside proteins. The ratio of the two volumes was 1.01 (+/-0.03) in all but 11 complexes, which shows that atoms buried at protein-protein interfaces are close-packed like the protein interior. This conclusion could be extended to the majority of interface atoms by including solvent positions determined in high-resolution X-ray structures in the calculation of Voronoi volumes. Thus, water molecules contribute to the close-packing of atoms that insure complementarity between the two protein surfaces, as well as providing polar interactions between the two proteins.
  • Use of pair potentials across protein interfaces in screening predicted docked complexes. Moont, G., Gabb, H. A. and Sternberg, M. J. Proteins: Struct. Funct. Genet. 35, 364-373, 1999.
    Abstract: Empirical residue-residue pair potentials are used to screen possible complexes for protein-protein dockings. A correct docking is defined as a complex with not more than 2.5 Å root-mean-square distance from the known experimental structure. The complexes were generated by "ftdock" (Gabb et al. J Mol Biol 1997;272:106-120) that ranks using shape complementarity. The complexes studied were 5 enzyme-inhibitors and 2 antibody-antigens, starting from the unbound crystallographic coordinates, with a further 2 antibody-antigens where the antibody was from the bound crystallographic complex. The pair potential functions tested were derived both from observed intramolecular pairings in a database of nonhomologous protein domains, and from observed intermolecular pairings across the interfaces in sets of nonhomologous heterodimers and homodimers. Out of various alternate strategies, we found the optimal method used a mole-fraction calculated random model from the intramolecular pairings. For all the systems, a correct docking was placed within the top 12% of the pair potential score ranked complexes. A combined strategy was developed that incorporated "multidock," a side-chain refinement algorithm (Jackson et al. J Mol Biol 1998;276:265-285). This placed a correct docking within the top 5 complexes for enzyme-inhibitor systems, and within the top 40 complexes for antibody-antigen systems.
  • Energy functions for protein design. Gordon, D.B., S.A. Marshall, and S.L. Mayo. Curr. Opin. Struct. Biol. 9, 509-513, 1999.
    Abstract: Recent successes in protein design have illustrated the promise of computational approaches. These methods rely on energy expressions to evaluate the quality of different amino acid sequences for target protein structures. The force fields optimized for design differ from those typically used in molecular mechanics and molecular dynamics calculations.

  • 1998

  • Domain assignment for protein structures using a consensus approach: Characterization and analysis. Jones, S., Stewart, M., Michie, A., Swindells, M. B., Orengo, C. and Thornton, J. M. Protein Science, 7, 233-242, 1998
    Abstract: A consensus approach for the assignment of structural domains in proteins is presented. The approach combines a number of previously published algorithms, and takes advantage of the elevated accuracy obtained when assignments from the individual algorithms are in agreement. The consensus approach is tested on a data set of 55 protein chains, for which domain assignments from four automated methods were known, and for which crystallographers assignments had been reported in the literature. Accuracy was found to increase in this test from 72% using individual algorithms to 100% when all four methods were in agreement. However a consensus prediction using all four methods was only possible for 52% of the dataset. The consensus approach (using three publicly available domain assignment algorithms (PUU, DETECTIVE, DOMAK)) was then used to make domain assignments for a data set of 787 protein chains from the Protein Data Bank. Analysis of the assignments showed 55.7% of assignments could be made automatically, and of these, 13.5% were multi-domain proteins. Of the remaining 44.3% that could not be assigned by the consensus procedure 90.4% had their domain boundaries assigned correctly by at least one of the algorithms. Once identified, these domains were analyzed for trends in their size and secondary structure class. In addition, the discontinuity of each domain along the protein chain was considered.
  • Crystal structure of p14TCL1, an oncogene product involved in T-cell prolymphocytic leukemia, reveals a novel beta-barrel topology. Hoh, F., Yang, Y. S., Guignard, L., Padilla, A., Stern, M. H., Lhoste, J. M. and van Tilbeurgh, H. Structure (London), 6, 147-155, 1998.
    Abstract: Chromosome rearrangements are frequently involved in the generation of hematopoietic tumors. One type of T-cell leukemia, T-cell prolymphocytic leukemia, is consistently associated with chromosome rearrangements characterized by the juxtaposition of the TCRA locus on chromosome 14q11 and either the TCL1 gene on 14q32.1 or the MTCP1 gene on Xq28. The TCL1 gene is preferentially expressed in cells of early lymphoid lineage; its product is a 14 kDa protein (p14TCL1), expressed in the cytoplasm. p14TCL1 has strong sequence similarity with one product of the MTCP1 gene, p13MTCP1 (41% identical and 61% similar). The functions of the TCL1 and MTCP1 genes are not known yet. They have no sequence similarity to any other published sequence, including those of well-documented oncogene families responsible for leukemia. In order to gain a more fundamental insight into the role of this particular class of oncogenes, we have determined the three-dimensional structure of p14TCL1. RESULTS: The crystal structure of p14TCL1 has been determined at 2.5 Å resolution. The structure was solved by molecular replacement using the solution structure of p13MTCP1, revealing p14TCL1 to be an all-beta protein consisting of an eight-stranded antiparallel beta barrel with a novel topology. The barrel consists of two four-stranded beta-meander motifs, related by a twofold axis and connected by a long loop. This internal pseudo-twofold symmetry was not expected on basis of the sequence alone, but structure-based sequence analysis of the two motifs shows that they are related. The structures of p13MTCP1 and p14TCL1 are very similar, diverging only in regions that are either flexible and/or involved in crystal packing. p14TCL1 forms a tight crystallographic dimer, probably corresponding to the 28 kDa species identified in solution by gel filtration experiments. CONCLUSIONS: Structural similarities between p14TCL1 and p13MTCP1 suggest that their (unknown) function may be analogous. This is confirmed by the fact that these proteins are implicated in analogous diseases. Their structure does not show similarity to other oncoproteins of known structure, confirming their classification as a novel class of oncoproteins.
  • IL-8 derivatives with a reduced potential to form homodimers are fully active in vitro and in vivo. Horcher, M., Rot, A., Aschauer, H. and Besemer, J. Cytokine 10, 1-12, 1998.
    Abstract: Interleukin 8 (IL-8) is a member of the CXC subfamily of chemokines which attracts and activates preferentially neutrophilic granulocytes. At nanomolar concentrations monomeric and dimeric forms of the molecule are in equilibrium, with the monomer being the prevalent form. Five amino acids from position 23 to 29 of the 72-amino acid IL-8 sequence form the dimer interface, with Leu25 and Val27 being highly conserved among the CXC chemokines. To investigate the contribution of these amino acids to the dimerization of IL-8, we produced in Escherichia coli IL-8 derivatives with phenylalanine substitutions at position 25 or 27, or both 25 and 27. All three recombinant proteins were characterized by a significantly impaired potential to form dimers in solution as seen in chemical crosslinking experiments. IL-8 Val27 also could not be crosslinked as a dimer on its receptors. Receptor affinities and in vitro chemotactic activities, however, were not significantly different between wild-type and IL-8 with single mutations. The dimerization deficient IL-8 analogue had also full inflammatory activity in vivo. Thus, the monomer is the biologically active form of IL-8.
  • Dictionary of Interfaces in Proteins (DIP). Data Bank of Complementary Molecular Surface Patches. Robert Preissner, Andrean Goede and Cornelius Frömmel. J. Mol. Biol., 280, 535-550, 1998
    Abstract: Molecular surface areas of proteins are responsible for selective binding of ligands and protein-protein recognition, and are considered the basis for specific interactions between different parts of a protein. This basic principle leads us to study the interfaces within proteins as a learning set for intermolecular recognition processes of ligands like substrates, coenzymes, etc., and for prediction of contacts occurring during protein folding and association. For this purpose, we defined interfaces as pairs of matching molecular surface patches between neighboring secondary structural elements. All such interfaces from known protein structures were collected in a comprehensive data bank of interfaces in proteins (DIP). The up-to-date DIP contains interface files for 351 selected Brookhaven Protein Data Bank entries with a total of about 160,000 surface elements formed by 12,475 secondary structures. For special purposes, the inclusion of additional structures or selection of subgroups of proteins can be performed in an easy and straightforward manner. Atomic coordinates of the constituents of molecular surface patches are directly accessible as well as the corresponding contact distances from given atoms to their neighboring secondary structural elements. As a rule, independent of the type of secondary structure, the molecular surface patches of the secondary structural elements can be described as quite flat bodies with a length to width to depth ratio of about 3:2:1 for patches consisting of more than ten atoms. The relative orientation between two docking patches is strongly restricted, due to the narrow distribution of the distances between their centers of mass and of the angles between their normal lines, respectively. The existing retrieval system for the DIP allows selection (out of the set of molecular patches) according to different criteria, such as geometric features, atomic composition, type of secondary structure, contacts, etc. A fast, sequence-independent 3-D superposition procedure was developed for automatic searches for geometrically similar surface areas. Using this procedure, we found a large number of structurally similar interfaces of up to 30 atoms in completely unrelated protein structures.
  • Empirical solvent-mediated potentials hold for both intra-molecular and intermolecular inter-residue interactions. Keskin, O., Bahar, I., Badretdinov, A. Y., Ptitsyn, O. B. and Jernigan, R. L. Protein Sci. 7, 2578-2586, 1998.
    Abstract: Whether knowledge-based intra-molecular inter-residue potentials are valid to represent inter-molecular interactions taking place at protein-protein interfaces has been questioned in several studies. Differences in the chain connectivity effect and in residue packing geometry between interfaces and single chain monomers have been pointed out as possible sources of distinct energetics for the two cases. In the present study, the interfacial regions of protein-protein complexes are examined to extract inter-molecular inter-residue potentials, using the same statistical methods as those previously adopted for intra-molecular residue pairs. Two sets of energy parameters are derived, corresponding to solvent-mediation and "average residue" mediation. The former set is shown to be highly correlated (correlation coefficient 0.89) with that previously obtained for inter-residue interactions within single chain monomers, while the latter exhibits a weaker correlation (0.69) with its intra-molecular counterpart. In addition to the close similarity of intra- and inter-molecular solvent-mediated potentials, they are shown to be significantly more residue-specific and thereby discriminative compared to the residue-mediated ones, indicating that solvent-mediation plays a major role in controlling the effective inter-residue interactions, either at interfaces, or within single monomers. Based on this observation, a reduced set of energy parameters comprising 20 one-body and 3 two-body terms is proposed (as opposed to the 20 x 20 tables of inter-residue potentials), which reproduces the conventional 20 x 20 tables with a correlation coefficient of 0.99.
  • Effects of salt bridges on protein structure and design. Sindelar CV, Hendsch ZS, Tidor B. Protein Sci., 7, 1898-1914, 1998
    Abstract: Theoretical calculations (Hendsch ZS and Tidor B, 1994, Protein Sci 3:211-226) and experiments (Waldburger CD et al., 1995, Nat Struct Biol 2:122-128; Wimley WC et al., 1996, Proc Natl Acad Sci USA 93:2985-2990) suggest that hydrophobic interactions are more stabilizing than salt bridges in protein folding. The lack of apparent stability benefit for many salt bridges requires an alternative explanation for their occurrence within proteins. To examine the effect of salt bridges on protein structure and stability in more detail, we have developed an energy function for simple cubic lattice polymers based on continuum electrostatic calculations of a representative selection of salt bridges found in known protein crystal structures. There are only three types of residues in the model, with charges of -1, 0, or + 1. We have exhaustively enumerated conformational space and significant regions of sequence space for three-dimensional cubic lattice polymers of length 16. The results demonstrate that, while the more highly charged sequences are less stable, the loss of stability is accompanied by a substantial reduction in the degeneracy of the lowest-energy state. Moreover, the reduction in degeneracy is greater due to charges that pair than for lone charges that remain relatively exposed to solvent. We have also explored and illustrated the use of ion-pairing strategies for rational structural design using model lattice studies.
  • Anatomy of hot spots in protein interfaces. Bogan, A. A. and Thorn, K. S. J. Mol. Biol. 280, 1-9, 1998.
    Abstract: Binding of one protein to another is involved in nearly all biological functions, yet the principles governing the interaction of proteins are not fully understood. To analyze the contributions of individual amino acid residues in protein-protein binding we have compiled a database of 2325 alanine mutants for which the change in free energy of binding upon mutation to alanine has been measured (available at http://motorhead.ucsf.edu/thorn/hotspot). Our analysis shows that at the level of side-chains there is little correlation between buried surface area and free energy of binding. We find that the free energy of binding is not evenly distributed across interfaces; instead, there are hot spots of binding energy made up of a small subset of residues in the dimer interface. These hot spots are enriched in tryptophan, tyrosine and arginine, and are surrounded by energetically less important residues that most likely serve to occlude bulk solvent from the hot spot. Occlusion of solvent is found to be a necessary condition for highly energetic interactions.
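
    Note: the hot-spot analysis summarized in the abstract above can be illustrated with a minimal sketch. It assumes each alanine mutant is a record holding a residue type and a measured change in binding free energy (ddG, kcal/mol). The 2.0 kcal/mol cutoff, the record layout, and the function names are assumptions made for illustration; none of them is given in the abstract, and the example records are invented rather than taken from the database.

```python
from collections import Counter

# Assumed cutoff (kcal/mol) for calling a residue a hot spot; not stated in the abstract.
HOT_SPOT_DDG = 2.0


def hot_spots(mutants, threshold=HOT_SPOT_DDG):
    """Return the alanine mutants whose ddG of binding meets the threshold."""
    return [m for m in mutants if m["ddg"] >= threshold]


def enrichment(mutants, threshold=HOT_SPOT_DDG):
    """For each residue type, the fraction of its alanine mutants that are hot spots."""
    total = Counter(m["residue"] for m in mutants)
    hot = Counter(m["residue"] for m in hot_spots(mutants, threshold))
    # Counter returns 0 for residue types with no hot spots.
    return {res: hot[res] / n for res, n in total.items()}


# Invented illustrative records, not values from the published database.
example = [
    {"residue": "TRP", "ddg": 3.1},
    {"residue": "TRP", "ddg": 0.4},
    {"residue": "ARG", "ddg": 2.5},
    {"residue": "SER", "ddg": 0.2},
]

if __name__ == "__main__":
    for residue, fraction in sorted(enrichment(example).items()):
        print(f"{residue}: {fraction:.2f} of alanine mutants are hot spots")
```

    Reporting a per-residue-type fraction, rather than a raw count, keeps the comparison fair when some residue types have been mutated far more often than others.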

  • 1997

  • Monomeric variants of IL-8: effects of side chain substitutions and solution conditions upon dimer formation. Lowman, H. B., Fairbrother, W. J., Slagle, P. H., Kabakoff, R., Liu, J., Shire, S. and Hebert, C. A. Protein Sci. 6, 598-608, 1997.
    Abstract: IL-8 dimers have been observed in NMR and X-ray structures of the protein. We have engineered IL-8 monomers by mutations of residues throughout the dimer interface, which introduce hindrance determinants to dimerization. These IL-8 variants are shown by NMR to have wild-type monomer folding, but by ultracentrifugation to have a range of dimerization constants from microM to mM, as compared with a dimerization constant of about 10 microM for wild-type IL-8, under physiological salt and temperature conditions. The monomeric variants of IL-8 bind the erythrocyte chemokine receptor DARC, as well as the neutrophil IL-8 receptors CXCR1 and CXCR2 with affinities similar to that of wild-type IL-8. In addition, the monomeric variants were shown to have agonist activity, with similar potency to wild-type, in both Ca(2+)-flux assays on CXCR1 and CXCR2 transfected cells, and in chemotaxis assays on neutrophils. Thus, these variants confirm that monomeric IL-8 is functionally equivalent to wild-type in vitro assays. We have also investigated the effects of various solution conditions upon IL-8 dimer formation using analytical ultracentrifugation. At salt concentrations, temperatures, and pH conditions lower than physiological, the dimerization affinity of IL-8 is greatly enhanced. This suggests that, under some conditions, IL-8 dimer formation may occur at concentrations of IL-8 considerably lower than 10 microM, with consequences in vivo that are yet to be determined.
  • Analysis of protein-protein interaction sites using surface patches. Jones, S. and Thornton, J. M. J. Mol. Biol. 272, 121-132, 1997.
    Abstract: Protein-protein interaction sites in complexes of known structure are characterised using a series of parameters to evaluate what differentiates them from other sites on the protein surface. Surface patches are defined in protomers from a data set of 28 homo-dimers, 20 different hetero-complexes (segregated into large and small protomers), and antigens from six antibody-antigen complexes. Six parameters (solvation potential, residue interface propensity, hydrophobicity, planarity, protrusion and accessible surface area) are calculated for the observed interface patch and all other surface patches defined on each protein. A ranking of the observed interface, relative to all other possible patches, is calculated. With this approach it becomes possible to analyse the distribution of the rankings of all the observed patches, relative to all other surface patches, for each data set. For each type of complex, none of the parameters were definitive, but the majority showed trends for the observed interface to be distinguished from other surface patches.
  • Prediction of protein-protein interaction sites using patch analysis. Jones, S. and Thornton, J. M. J. Mol. Biol. 272, 133-143, 1997.
    Abstract: A method for defining and analysing a series of residue patches on the surface of protein structures is used to predict the location of protein-protein interaction sites. Each residue patch is analysed for six parameters: solvation potential, residue interface propensity, hydrophobicity, planarity, protrusion and accessible surface area. The method involves the calculation of a relative combined score that gives the probability of a surface patch forming protein-protein interactions. Predictions are made for the known structures of protomers from 28 homo-dimers, large protomers from 11 hetero-complexes, small protomers from 14 hetero-complexes, and antigens from six antibody-antigen complexes. The predictions are successful for 66% (39/59) of the structures and the remainder can usually be rationalized in terms of additional interaction sites.
  • Protein-protein crystal-packing contacts. Carugo, O. and Argos, P. Protein Sci. 6, 2261-2263, 1997.
    Abstract: Protein-protein contacts in monomeric protein crystal structures have been analyzed and compared to the physiological protein-protein contacts in oligomerization. A number of features differentiate the crystal-packing contacts from the natural contacts occurring in multimeric proteins. The area of the protein surface patches involved in packing contacts is generally smaller and its amino acid composition is indistinguishable from that of the protein surface accessible to the solvent. The fraction of protein surface in crystal contacts is very variable and independent of the number of packing contacts. The thermal motion at the crystal packing interface is intermediate between that of the solvent-accessible surface and that of the protein core, even for large packing interfaces, though the tendency is to be closer to that of the core. These results suggest that protein crystallization depends on random protein-protein interactions, which have little in common with physiological protein-protein recognition processes, and that the possibility of engineering macromolecular crystallization to improve crystal quality could be widened.
  • Electrostatic complementarity at protein/protein interfaces. McCoy, A. J., Chandana Epa, V. and Colman, P. M. J. Mol. Biol. 268, 570-584, 1997.
    Abstract: Calculation of the electrostatic potential of protein-protein complexes has led to the general assertion that protein-protein interfaces display "charge complementarity" and "electrostatic complementarity". In this study, quantitative measures for these two terms are developed and used to investigate protein-protein interfaces in a rigorous manner. Charge complementarity (CC) was defined using the correlation of charges on nearest neighbour atoms at the interface. All 12 protein-protein interfaces studied had insignificantly small CC values. Therefore, the term charge complementarity is not appropriate for the description of protein-protein interfaces when used in the sense measured by CC. Electrostatic complementarity (EC) was defined using the correlation of surface electrostatic potential at protein-protein interfaces. All twelve protein-protein interfaces studied had significant EC values, and thus the assertion that protein-protein association involves surfaces with complementary electrostatic potential was substantially confirmed. The term electrostatic complementarity can therefore be used to describe protein-protein interfaces when used in the sense measured by EC. Taken together, the results for CC and EC demonstrate the relevance of the long-range effects of charges, as described by the electrostatic potential at the binding interface. The EC value did not partition the complexes by type such as antigen-antibody and proteinase-inhibitor, as measures of the geometrical complementarity at protein-protein interfaces have done. The EC value was also not directly related to the number of salt bridges in the interface, and neutralisation of these salt bridges showed that other charges also contributed significantly to electrostatic complementarity and electrostatic interactions between the proteins. 
Electrostatic complementarity as defined by EC was extended to investigate the electrostatic similarity at the surface of influenza virus neuraminidase where the epitopes of two monoclonal antibodies, NC10 and NC41, overlap. Although NC10 and NC41 both have quite high values of EC for their interaction with neuraminidase, the similarity in electrostatic potential generated by the two on the overlapping region of the epitopes is insignificant. Thus, it is possible for two antibodies to recognise the electrostatic surface of a protein in dissimilar ways.
  • Inter-residue potentials in globular proteins and the dominance of highly specific hydrophilic interactions at close separation. Bahar, I. and Jernigan, R. L. J. Mol. Biol. 266, 195-214, 1997.
    Abstract: Residue-specific potentials between pairs of side-chains and pairs of side-chain-backbone interaction sites have been generated by collecting radial distribution data for 302 protein structures. Multiple atomic interactions have been utilized to enhance the specificity and smooth the distance-dependence of the potentials. The potentials are demonstrated to successfully discriminate correct sequences in inverse folding experiments. Many specific effects are observable in the non-bonded potentials; grouping of residue types is inappropriate, since each residue type manifests some unique behavior. Only a weak dependence is seen on protein size and composition. Effective contact potentials operating in three different environments (self, solvent-exposed and residue-exposed) and over any distance range are presented. The effective contact potentials obtained from the integration of radial distributions over the distance interval r ≤ 6.4 Å are in excellent agreement with published values. The hydrophobic interactions are verified to be dominantly strong in this range. Comparison of these with a newly derived set of effective contact potentials for closer inter-residue separations (r ≤ 4.0 Å) demonstrates drastic changes in the most favorable interactions. In the closer approach case, where the number of pairs with a given residue is approximately one, the highly specific interactions between charged and polar side-chains predominate. These closer approach values could be utilized to select successively the relative positions and directions of residue side-chains in protein simulations, following a hierarchical algorithm optimizing side-chain-side-chain interactions over the two successively closer distance ranges. The homogeneous contribution to stability is stronger than the specific contribution by about a factor of 5. 
Overall, the total non-bonded interaction energy calculated for individual proteins follows a dependence on the number of residues of the form of n^1.28, indicating an enhanced stability for larger proteins.
  • Hydrogen bonds and salt bridges across protein-protein interfaces. Xu D, Tsai CJ, Nussinov R. Protein Eng., 10, 999-1012, 1997
    Abstract: To understand further, and to utilize, the interactions across protein-protein interfaces, we carried out an analysis of the hydrogen bonds and of the salt bridges in a collection of 319 non-redundant protein-protein interfaces derived from high-quality X-ray structures. We found that the geometry of the hydrogen bonds across protein interfaces is generally less optimal and has a wider distribution than typically observed within the chains. This difference originates from the more hydrophilic side chains buried in the binding interface than in the folded monomer interior. Protein folding differs from protein binding. Whereas in folding practically all degrees of freedom are available to the chain to attain its optimal configuration, this is not the case for rigid binding, where the protein molecules are already folded, with only six degrees of translational and rotational freedom available to the chains to achieve their most favorable bound configuration. These constraints enforce many polar/charged residues buried in the interface to form weak hydrogen bonds with protein atoms, rather than strongly hydrogen bonding to the solvent. Since interfacial hydrogen bonds are weaker than the intra-chain ones to compete with the binding of water, more water molecules are involved in bridging hydrogen bond networks across the protein interface than in the protein interior. Interfacial water molecules both mediate non-complementary donor-donor or acceptor-acceptor pairs, and connect non-optimally oriented donor-acceptor pairs. These differences between the interfacial hydrogen bonding patterns and the intra-chain ones further substantiate the notion that protein complexes formed by rigid binding may be far away from the global minimum conformations. Moreover, we summarize the pattern of charge complementarity and of the conservation of hydrogen bond network across binding interfaces. 
We further illustrate the utility of this study in understanding the specificity of protein-protein associations, and hence in docking prediction and molecular (inhibitor) design.
  • Protein binding versus protein folding: the role of hydrophilic bridges in protein associations. Xu, D., Lin, S. L. and Nussinov, R. J. Mol. Biol. 265, 68-84, 1997.
    Abstract: The role of hydrophilic bridges between charged, or polar, atoms in protein associations has been examined from two perspectives. First, statistical analysis has been carried out on 21 data sets to determine the relationship between the binding free energy and the structure of the protein complexes. We find that the number of hydrophilic bridges across the binding interface shows a strong positive correlation with the free energy; second, the electrostatic contribution of salt bridges to binding has been assessed by a continuum electrostatics calculation. In contrast to protein folding, we find that salt bridges across the binding interface can significantly stabilize complexes in some cases. The different contributions of hydrophilic bridges to folding and to binding arise from the different environments to which the involved hydrophilic groups are exposed before and after the bridges are formed. These groups are more solvated in a denatured protein before folding than on the surface of the combining proteins before binding. After binding, they are buried in an environment whose residual composition can be much more hydrophilic than the one after folding. As a result, the desolvation cost of a hydrophilic pair is lower, and the favorable interactions between the hydrophilic pair and its surrounding residues are generally stronger in binding than in folding. These results complement our recent finding that while hydrophobic effect in protein-protein interfaces is significant, it is not as strong as that observed in the interior of monomers. Taken together, these studies suggest that while the types of forces in protein-protein interaction and in protein folding are similar, their relative contributions differ. Hence, association of protein monomers which do not undergo significant conformational change upon binding differs from protein folding, implying that conclusions (e.g. statistics, energetics) drawn from investigating folding may not apply directly to binding, and vice versa.
  • Extent and nature of contacts between protein molecules in crystal lattices and between subunits of protein oligomers. Dasgupta, S., G. H. Iyer, S. H. Bryant, C. E. Lawrence, and J. A. Bell. Proteins 28, 494-514, 1997
    Abstract: A survey was compiled of several characteristics of the intersubunit contacts in 58 oligomeric proteins, and of the intermolecular contacts in the lattice for 223 protein crystal structures. The total number of atoms in contact and the secondary structure elements involved are similar in the two types of interfaces. Crystal contact patches are frequently smaller than patches involved in oligomer interfaces. Crystal contacts result from more numerous interactions by polar residues, compared with a tendency toward nonpolar amino acids at oligomer interfaces. Arginine is the only amino acid prominent in both types of interfaces. Potentials of mean force for residue-residue contacts at both crystal and oligomer interfaces were derived from comparison of the number of observed residue-residue interactions with the number expected by mass action. They show that hydrophobic interactions at oligomer interfaces favor aromatic amino acids and methionine over aliphatic amino acids; and that crystal contacts form in such a way as to avoid inclusion of hydrophobic interactions. They also suggest that complex salt bridges with certain amino acid compositions might be important in oligomer formation. For a protein that is recalcitrant to crystallization, substitution of lysine residues with arginine or glutamine is a recommended strategy.
  • Specific versus non-specific contacts in protein crystals Janin, J. Nat. Struct. Biol. 4, 973-974, 1997.

  • 1996

  • An evolutionary trace method defines binding surfaces common to protein families. Lichtarge, O., Bourne, H. R. and Cohen, F. E. J. Mol. Biol. 257, 342-358, 1996.
    Abstract: X-ray or NMR structures of proteins are often derived without their ligands, and even when the structure of a full complex is available, the area of contact that is functionally and energetically significant may be a specialized subset of the geometric interface deduced from the spatial proximity between ligands. Thus, even after a structure is solved, it remains a major theoretical and experimental goal to localize protein functional interfaces and understand the role of their constituent residues. The evolutionary trace method is a systematic, transparent and novel predictive technique that identifies active sites and functional interfaces in proteins with known structure. It is based on the extraction of functionally important residues from sequence conservation patterns in homologous proteins, and on their mapping onto the protein surface to generate clusters identifying functional interfaces. The SH2 and SH3 modular signaling domains and the DNA binding domain of the nuclear hormone receptors provide tests for the accuracy and validity of our method. In each case, the evolutionary trace delineates the functional epitope and identifies residues critical to binding specificity. Based on mutational evolutionary analysis and on the structural homology of protein families, this simple and versatile approach should help focus site-directed mutagenesis studies of structure-function relationships in macromolecules, as well as studies of specificity in molecular recognition. More generally, it provides an evolutionary perspective for judging the functional or structural role of each residue in protein structure.
  • Principles of protein-protein interactions. Jones, S. and Thornton, J. M. Proc. Natl Acad. Sci. USA, 93, 13-20, 1996.
    Abstract: This review examines protein complexes in the Brookhaven Protein Databank to gain a better understanding of the principles governing the interactions involved in protein-protein recognition. The factors that influence the formation of protein-protein complexes are explored in four different types of protein-protein complexes--homodimeric proteins, heterodimeric proteins, enzyme-inhibitor complexes, and antibody-protein complexes. The comparison between the complexes highlights differences that reflect their biological roles.
  • Thermal stability of hexameric and tetrameric nucleoside diphosphate kinases. Effect of subunit interaction. Giartosio, A., Erent, M., Cervoni, L., Morera, S., Janin, J., Konrad, M. and Lascu, I. J. Biol. Chem. 271, 17845-17851, 1996.
    Abstract: The eukaryotic nucleoside diphosphate (NDP) kinases are hexamers, while the bacterial NDP kinases are tetramers made of small, single domain subunits. These enzymes represent an ideal model for studying the effect of subunit interaction on protein stability. The thermostability of NDP kinases of each class was studied by differential scanning calorimetry and biochemical methods. The hexameric NDP kinase from Dictyostelium discoideum displays one single, irreversible differential scanning calorimetry peak (Tm 62 degrees C) over a broad protein concentration, indicating a single step denaturation. The thermal stability of the protein was increased by ADP. The P105G substitution, which affects a loop implicated in subunit contacts, yields a protein that reversibly dissociates to folded monomers at 38 degrees C before the irreversible denaturation occurs (Tm 47 degrees C). ADP delays the dissociation, but does not change the Tm. These data indicate a "coupling" of the quaternary structure with the tertiary structure in the wild-type, but not in the mutated protein. We describe the x-ray structure of the P105G mutant at 2.2-Å resolution. It is very similar to that of the wild-type protein. Therefore, a minimal change in the structure leads to a dramatic change of protein thermostability. The NDP kinase from Escherichia coli behaves like the P105G mutant of the Dictyostelium NDP kinase. The detailed study of their thermostability is important, since biological effects of thermolabile NDP kinases have been described in several organisms.
  • Forces contributing to the conformational stability of proteins. Pace, C.N., B.A. Shirley, M. McNutt, and K. Gajiwala. FASEB J. 10, 75-83, 1996.

  • 1995

  • Are buried salt bridges important for protein stability and conformational specificity? Waldburger, C.D., J.F. Schildbach., and R.T. Sauer. Nat. Struct. Biol. 2, 122-128, 1995.
    Abstract: The side chains of Arg 31, Glu 36 and Arg 40 in Arc repressor form a buried salt-bridge triad. The entire salt-bridge network can be replaced by hydrophobic residues in combinatorial randomization experiments resulting in active mutants that are significantly more stable than wild type. The crystal structure of one mutant reveals that the mutant side chains pack against each other in an otherwise wild-type fold. Thus, simple hydrophobic interactions provide more stabilizing energy than the buried salt bridge and confer comparable conformational specificity.
  • Conservation of salt bridges in protein families. Schueler, O., and H. Margalit. J. Mol. Biol. 248, 125-135, 1995.
    Abstract: A detailed computational analysis is presented that focuses on the relationship between structural attributes and the degree and mode of salt bridge conservation. A data set of conserved and non-conserved salt bridges was constructed from eight protein families, based on the structural alignment of family members. Salt bridges were defined at the secondary structure level rather than at the residue level, implying different possible modes of conservation: preservation (same charges at the same residue positions), compensation (reversal of charges), and complementation (maintenance of a salt bridge between two segments of secondary structures, not involving the same residue positions). Structural attributes such as the surface accessibility, distance from the active site, or type of secondary structures involved, were studied. No significant differences were found between conserved and non-conserved salt bridges, except for the surface accessibility. Conserved salt bridges were shown to be less exposed than non-conserved ones. Moreover, within the set of conserved salt bridges, the degree of conservation was shown to negatively correlate with surface exposure; however, not to an extent that could indicate a general role for electrostatic interactions in the protein interior. Examination of the most conserved salt bridge in each family showed a variety of typical features: Some involved the terminal segments of the protein, some were buried and one involved the catalytic site of the protein. Hence, the role of salt bridges is more specific, probably in fine tuning of a specific structure through the folding process or in determining the functional site. As for the conservation mode, preservations were found to predominate in the conserved interactions, while complementations were of secondary importance. 
Compensations occurred only rarely and mostly in exposed salt bridges, suggesting that this mechanism is not utilized frequently and especially not in important interactions.
  • Comprehensive analysis of hydrogen bonds in regulatory protein DNA-complexes: in search of common principles. Mandel-Gutfreund Y, Schueler O, Margalit H. J Mol Biol., 253, 370-82, 1995
    Abstract: A systematic analysis of hydrogen bonds between regulatory proteins and their DNA targets is presented, based on 28 crystallographically solved complexes. All possible hydrogen bonds were screened and classified into different types: those that involve the amino acid side-chains and DNA base edges and those that involve the backbone atoms of the molecules. For each interaction type, all bonds were characterized and a statistical analysis was performed to reveal significant amino acid-base interdependence. The interactions between the amino acid side-chains and DNA backbone constitute about half of the interactions, but did not show any amino acid-base correlation. Interactions via the protein backbone were also observed, predominantly with the DNA backbone. As expected, the most significant pairing preference was demonstrated for interactions between the amino acid side-chains and the DNA base edges. The statistically significant relationships could mostly be explained by the chemical nature of the participants. However, correlations that could not be trivially predicted from the hydrogen bonding potential of the residues were also identified, like the preference of lysine for guanine over adenine, or the preference of glutamic acid for cytosine over adenine. While Lys x G interactions were very frequent and spread over various families, the Glu x C interactions were found mainly in the basic helix-loop-helix family. Further examination of the side-chain-base edge contacts at the atomic level revealed a trend of the amino acids to contact the DNA by their donor atoms, preferably at position W2 in the major groove. In most cases it seems that the interactions are not guided simply by the presence of a required atom in a specific position in the groove, but that the identity of the base possessing this atom is crucial. This may have important implications in molecular design experiments.
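  • Electrostatics and diffusion of molecules in solution: simulations with the University of Houston Brownian Dynamics program. Madura, J. D., Briggs, J. M., Wade, R. C., Davis, M. E., Luty, B. A., Ilin, A., Antosiewicz, J., Gilson, M. K., Bagheri, B., Scott, L. R. and McCammon, J. A. Comp. Phys. Commun. 91, 57-95, 1995.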
  • 3D domain swapping: a mechanism for oligomer assembly. Bennett, M. J., Schlunegger, M. P. and Eisenberg, D. Protein Sci. 4, 2455-2468, 1995.
    Abstract: 3D domain swapping is a mechanism for forming oligomeric proteins from their monomers. In 3D domain swapping, one domain of a monomeric protein is replaced by the same domain from an identical protein chain. The result is an intertwined dimer or higher oligomer, with one domain of each subunit replaced by the identical domain from another subunit. The swapped "domain" can be as large as an entire tertiary globular domain, or as small as an alpha-helix or a strand of a beta-sheet. Examples of 3D domain swapping are reviewed that suggest domain swapping can serve as a mechanism for functional interconversion between monomers and oligomers, and that domain swapping may serve as a mechanism for evolution of some oligomeric proteins. Domain-swapped proteins present examples of a single protein chain folding into two distinct structures.

  • 1994

  • Domain swapping: entangling alliances between proteins. Bennett, M. J., Choe, S. and Eisenberg, D. Proc. Natl. Acad. Sci. USA, 91, 3127-3131, 1994.
    Abstract: The comparison of monomeric and dimeric diphtheria toxin (DT) reveals a mode for protein association which we call domain swapping. The structure of dimeric DT has been extensively refined against data to 2.0-Å resolution and a three-residue loop has been corrected as compared with our published 2.5-Å-resolution structure. The monomeric DT structure has also been determined, at 2.3-Å resolution. Monomeric DT is a Y-shaped molecule with three domains: catalytic (C), transmembrane (T), and receptor binding (R). Upon freezing in phosphate buffer, DT forms a long-lived, metastable dimer. The protein chain tracing discloses that upon dimerization an unprecedented conformational rearrangement occurs: the entire R domain from each molecule of the dimer is exchanged for the R domain from the other. This involves breaking the noncovalent interactions between the R domain and the C and T domains, rotating the R domain by 180 degrees with atomic movements up to 65 Å, and re-forming the same noncovalent interactions between the R domain and the C and T domains of the other chain of the dimer. This conformational transition explains the long life and metastability of the DT dimer. Several other intertwined, dimeric protein structures satisfy our definition of domain swapping and suggest that domain swapping may be the molecular mechanism for evolution of these oligomers and possibly of oligomeric proteins in general.
  • Do salt bridges stabilize proteins - A continuum electrostatic analysis. Hendsch, Z.S., and B. Tidor. Protein Sci. 3, 211-226, 1994.
    Abstract: The electrostatic contribution to the free energy of folding was calculated for 21 salt bridges in 9 protein X-ray crystal structures using a continuum electrostatic approach with the DELPHI computer-program package. The majority (17) were found to be electrostatically destabilizing; the average free energy change, which is analogous to mutation of salt bridging side chains to hydrophobic isosteres, was calculated to be 3.5 kcal/mol. This is fundamentally different from stability measurements using pKa shifts, which effectively measure the strength of a salt bridge relative to 1 or more charged hydrogen bonds. The calculated effect was due to a large, unfavorable desolvation contribution that was not fully compensated by favorable interactions within the salt bridge and between salt-bridge partners and other polar and charged groups in the folded protein. Some of the salt bridges were studied in further detail to determine the effect of the choice of values for atomic radii, internal protein dielectric constant, and ionic strength used in the calculations. Increased ionic strength resulted in little or no change in calculated stability for 3 of 4 salt bridges over a range of 0.1-0.9 M. The results suggest that mutation of salt bridges, particularly those that are buried, to "hydrophobic bridges" (that pack at least as well as wild type) can result in proteins with increased stability. Due to the large penalty for burying uncompensated ionizable groups, salt bridges could help to limit the number of low free energy conformations of a molecule or complex and thus play a role in determining specificity (i.e., the uniqueness of a protein fold or protein-ligand binding geometry).
  • Structural features can be unconserved in proteins with similar folds - an analysis of side-chain to side-chain contacts secondary structure and accessibility. Russell, R.B. and G.J. Barton. J. Mol. Biol. 244, 332-350, 1994.
    Abstract: Side-chain to side-chain contacts, accessibility, secondary structure and RMS deviation were compared within 607 pairs of proteins having similar three-dimensional (3D) structures. Three types of protein 3D structural similarities were defined: type A having sequence and usually functional similarity; type B having functional, but no sequence similarity; and type C having only 3D structural similarity. Within proteins having little or no sequence similarity (types B and C), structural features frequently had a degree of conservation comparable to dissimilar 3D structures. Despite similar protein folds, as few as 30% of residues within similar protein 3D structures can form a common core. RMS deviations on core C alpha atoms can be as high as 3.2 A. Similar protein structures can have secondary structure identities as low as 41%, which is equivalent to that expected by chance. By defining three categories of amino acid accessibility (buried, half buried and exposed), some similar protein 3D structures have as few as 30% of positions in the same category, making them indistinguishable from pairs of dissimilar protein structures. Similar structures can also have as few as 12% of common side-chain to side-chain contacts, and virtually no similar energetically favourable side-chain to side-chain interactions. Complementary changes are defined as structurally equivalent pairs of interacting residues in two structures with energetically favourable but different side-chain interactions. For many proteins with similar three-dimensional structures, the proportion of complementary changes is near to that expected by chance, suggesting that many similar structures have fundamentally different stabilising interactions. All of the results suggest that proteins having similar 3D structures can have little in common apart from a scaffold of core secondary structures. 
    This has profound implications for methods of protein fold detection, since many of the properties assumed to be conserved across similar protein 3D structures (e.g. accessibility, side-chain to side-chain contacts, etc.) are often unconserved within weakly similar (i.e. type B and C) protein 3D structures. Little difference was found between type B and C similarities suggesting that the structure of similar proteins can evolve beyond recognition even when function is conserved. Our findings suggest that it is more general features of protein structure, such as the requirements for burial of hydrophobic residues and exposure of polar residues, rather than specific residue-residue interactions that determine how well a particular sequence adopts a particular fold.

  • 1993

  • Rapid approximation to molecular-surface area via the use of boolean logic and look-up tables. Legrand, S. M. and K.M. Merz. J. Comput. Chem. 14, 349-352, 1993.

  • 1992

  • Mechanism of specificity in the Fos-Jun oncoprotein heterodimer. O'Shea, E.K., R. Rutkowski, and P.S. Kim. Cell 68, 699-708, 1992.
    Abstract: Fos and Jun, the protein products of the nuclear proto-oncogenes c-fos and c-jun, associate preferentially to form a heterodimer that binds to DNA and modulates transcription of a wide variety of genes in response to mitogenic stimuli. Both Fos and Jun contain a single leucine zipper region. Previous studies have shown that the leucine zippers of Fos and Jun are necessary and sufficient to mediate preferential heterodimer formation. The leucine zipper regions from Fos and Jun are also known to fold autonomously, most likely as two-stranded, parallel coiled coils. We show here that 8 amino acids from Fos and from Jun are sufficient to mediate preferential heterodimer formation in a background of the GCN4 leucine zipper sequence. Using pH titration and amino acid replacements, we also show that destabilization of the Fos homodimer by acidic residues provides a major thermodynamic driving force for preferential heterodimer formation.

  • pre-1990

  • Surface, subunit interfaces and interior of oligomeric proteins. Janin, J., Miller, S. and Chothia, C. J. Mol. Biol. 204, 155-164, 1988.
    Abstract: The solvent-accessible surface area (As) of 23 oligomeric proteins is calculated using atomic co-ordinates from high-resolution and well-refined crystal structures. As is correlated with the protein molecular weight, and a power law predicts its value to within 5% on average. The accessible surface of the average oligomer is similar to that of monomeric proteins in its hydropathy and amino acid composition. The distribution of the 20 amino acid types between the protein surface and its interior is also the same as in monomers. Interfaces, i.e. surfaces involved in subunit contacts, differ from the rest of the subunit surface. They are enriched in hydrophobic side-chains, yet they contain a number of charged groups, especially from Arg residues, which are the most abundant residues at interfaces except for Leu. Buried Arg residues are involved in H-bonds between subunits. We counted H-bonds at interfaces and found that several have none, others have one H-bond per 200 Å² of interface area on average (1 Å = 0.1 nm). A majority of interface H-bonds involve charged donor or acceptor groups, which should make their contribution to the free energy of dissociation significant, even when they are few. The smaller interfaces cover about 700 Å² of the subunit surface. The larger ones cover 3000 to 10,000 Å², up to 40% of the subunit surface area in catalase. The lower value corresponds to an estimate of the accessible surface area loss required for stabilizing subunit association through the hydrophobic effect alone. Oligomers with small interfaces have globular subunits with accessible surface areas similar to those of monomeric proteins. We suggest that these oligomers assemble from preformed monomers with little change in conformation. In oligomers with large interfaces, isolated subunits should be unstable given their excessively large accessible surface, and assembly is expected to require major structural changes.
  • Interior and surface of monomeric proteins. Miller, S., J. Janin, A.M. Lesk and C. Chothia. J. Mol. Biol. 196, 641-656, 1987.
    Abstract: The solvent-accessible surface area (As) of 46 monomeric proteins is calculated using atomic co-ordinates from high-resolution and well-refined crystal structures. The As of these proteins can be determined to within 1 to 2% and that of their individual residues to within 10 to 20%. The As values of proteins are correlated with their molecular weight (Mr) in the range 4000 to 35,000: the power law As = 6.3 M^0.73 predicts protein As values to within 4% on average. The average water-accessible surface is found to be 57% non-polar, 24% polar and 19% charged, with 5% root-mean-square variations. The molecular surface buried inside the protein is 58% non-polar, 39% polar and 4% charged. The buried surface contains more uncharged polar groups (mostly peptides) than the surface that remains accessible, but many fewer charged groups. On average, 15% of residues in small proteins and 32% in larger ones may be classed as "buried residues", having less than 5% of their surface accessible to the solvent. The accessibilities of most other residues are evenly distributed in the range 5 to 50%. Although the fraction of buried residues increases with molecular weight, the amino acid compositions of the protein interior and surface show no systematic variation with molecular weight, except for small proteins that are often very rich in buried cysteines. From amino acid compositions of protein surfaces and interiors we calculate an effective coefficient of partition for each type of residue, and derive an implied set of transfer free energy values. This is compared with other sets of partition coefficients derived directly from experimental data. The extent to which groups of residues (charged, polar and non-polar) are buried within proteins correlates well with their hydrophobicity derived from amino acid transfer experiments. Within these three groups, the correlation is low.
  • Surface and inside volumes in globular proteins. Janin, J. Nature 277, 491-492, 1979.
  • The nature of the accessible and buried surfaces in proteins. Chothia, C. J. Mol. Biol. 105, 1-12, 1976.
  • Surface area of globular proteins. Janin, J. J. Mol. Biol. 105, 13-14, 1976.
  • Structural invariants in protein folding. Chothia, C. Nature 254, 304-308, 1975.
  • Accessible area, packing volumes and interaction surfaces of globular proteins. Teller, D.C. Nature 260, 729-731, 1976.
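
    Worked example: the power law quoted in the Miller et al. (1987) abstract above, As = 6.3 M^0.73, can be tried out with a few lines of Python. This is only a minimal sketch of that one formula. The function name, the units (molecular weight in daltons, area in square ångströms) and the range check are assumptions made for illustration and are not taken from the paper's own software.

```python
# Range of molecular weights over which the abstract reports the fit.
FIT_RANGE = (4000.0, 35000.0)


def accessible_surface_area(molecular_weight: float) -> float:
    """Estimate the solvent-accessible surface area of a monomeric protein.

    Uses the power law As = 6.3 * M**0.73 quoted in the abstract.
    Units are assumed here: M in daltons, As in square angstroms.
    """
    if molecular_weight <= 0:
        raise ValueError("molecular weight must be positive")
    return 6.3 * molecular_weight ** 0.73


def within_fit_range(molecular_weight: float) -> bool:
    """Report whether M lies inside the range quoted for the fit."""
    low, high = FIT_RANGE
    return low <= molecular_weight <= high


if __name__ == "__main__":
    for mass in (4000.0, 10000.0, 35000.0, 60000.0):
        area = accessible_surface_area(mass)
        note = "" if within_fit_range(mass) else " (outside fitted range)"
        print(f"M = {mass:8.0f}  As ~ {area:8.0f}{note}")
```

    Because the exponent is below 1, the estimated area grows more slowly than the molecular weight. According to the abstract the fit holds to within about 4% on average, and only for monomeric proteins in the quoted weight range, so values outside that range are extrapolations.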