CAPRI: Critical Assessment of PRediction of Interactions
 
  

First community wide experiment on the comparative evaluation of protein-protein docking for structure prediction

Hosted By EMBL/EBI-MSD Group


CAPRI target 01 evaluation results

Raúl Méndez, Leonardo De Maria and Shoshana J. Wodak.
SCMBB Université Libre de Bruxelles, Cp 263, Brussels, Belgium.
Re-accessed on Monday December 16, 2002.
e-mail: raul@ucmb.ulb.ac.be
shosh@ucmb.ulb.ac.be

The evaluation results of the CAPRI TARGET 01 predictions are stored in different directories, according to the criteria used. The directories and their contents are briefly described below.



Information

Directory Information contains information about Target 01, which was used in the evaluation and scoring. It contains the following files (file names are given in bold):

  • capri.1.pdb: the crystal structure of the target (target 01) in PDB format.
  • Contres1: list of residue contacts in the target between subunits H (Hpr) and A, C (Kinase).
  • Contres2: list of residue contacts in the target between subunits I (Hpr) and A, B (Kinase).
  • Contres3: list of residue contacts in the target between subunits J (Hpr) and B, C (Kinase).
  • As an example, here are the first three lines of Contres1:

    H11 ASP - C309 GLU
    H12 SER - C305 ILE
    H13 GLY - C305 ILE

    The first column lists the Hpr residue and the second column the Kinase residue with which it is in contact. Two residues are considered to be in contact if at least one atom of one residue is within 5 Å of an atom of the other. Each residue is identified by its chain name and residue number along the chain, followed by the three-letter amino acid code.

    Three different lists were calculated because the contacts with the H, I, and J Hpr copies were slightly different. This was also done for the lists of interface residues (below).

  • capri.1.H.intres Hpr-Kinase interface residues in the target for the H-A,C interface.
  • capri.1.I.intres Hpr-Kinase interface residues in the target for the I-A,B interface.
  • capri.1.J.intres Hpr-Kinase interface residues in the target for the J-B,C interface.
  • Interface residues were identified by computing the differences in residue accessible surface areas (ASA) between the complex and the individual components. Residues in the Hpr and kinase subunits for which Delta ASA is not zero were included in the list.

    For example, the capri.1.H.intres file is structured as follows:

    INTERFACE RESIDUES IN Hpr capri.1H.intres

    H11 ASP DELTA_ASA = 9.576227
    H12 SER DELTA_ASA = 22.17662
    H13 GLY DELTA_ASA = 0.6229653E-01
     .
     .

    (continue)

     .
     .

    INTERFACE RESIDUES IN KINASE capri.1H.intres

    A136 ARG DELTA_ASA = 23.78261
    A137 ARG DELTA_ASA = 1.377046
    A138 SER DELTA_ASA = 51.34740
     .
     .
    ( end of file)

    Each file thus contains two lists. The first list contains the Hpr residues of the target interface, and the second list contains the Kinase residues. The Hpr subunit used for the lists is given in the file name (capri.1.H.intres stands for Hpr subunit H). The Delta ASA value of each residue is given in Å². The ASA values were computed using a probe radius of 1.4 Å.

  • cc.capri1.H.d list of clashes in the target H-A,C interface.
  • cc.capri1.I.d list of clashes in the target I-A,B interface.
  • cc.capri1.J.d list of clashes in the target J-B,C interface.


  • Example of the cc.capri1.H.d content:

    B. SUBTILIS Hpr Atom    LACTOBACILLUS Hpr KINASE Atom   Distance
    
    -- (no Clashes between 0-1Å)
    
    -- (no Clashes between 1-2Å)
     
    H  46 .SER.OG		A 179 .ASP.OD2			 2.48
    H  40 .LYS.NZ		A 136 .ARG.O			 2.66
    H  52 .SER.OG		A 180 .ARG.NH1			 2.76
    H  46 .SER.OG		A 179 .ASP.OD1			 2.85
    H  48 .MET.CB		A 179 .ASP.OD2			 2.86
    H  46 .SER.OG		A 179 .ASP.CG			 2.96
    H  51 .MET.O		C 301 .LEU.CD1			 2.97
    H  54 .GLY.O		C 308 .ASN.ND2			 2.99
    

    These files list only distances of 3 Å or less between any Hpr atom of the target (in this example, from the H subunit) and any Kinase atom of the target (in this example, in the A or C subunits). The chain name, residue number, residue name and atom name are also given.


    Final Summary

    The file Target 01 Final Summary contains information of the following form:

    
    
    PREDS        fnat   fnon-nat  fIR           INTERFACE RES.(OP)  THETA ANGLE  DISTANCE  Nclash  L_rmsd  I_rmsd
                                  Hpr    K      Hpr    K

    T01_P22.1.A  0.096  0.931     0.500  0.640  0.464  0.457        133.1        4.033     42      17.979  8.639
    T01_P22.4.A  0.096  0.946     0.462  0.640  0.400  0.410        110.6        5.314     78      16.377  7.566
    T01_P22.2.A  0.077  0.949     0.462  0.640  0.400  0.421        147.0        5.484     53      19.051  9.243
    T01_P22.3.A  0.077  0.938     0.500  0.640  0.500  0.571        168.3        2.526     35      19.267  9.797
    .
    .

    Column 1 lists the prediction identifier (e.g. T01_P22.1.A denotes participant 22, prediction 1 for target 1, using Hpr chain A of the unbound coordinates).

    Column 2 gives fnat, the fraction of native contacts reproduced in the prediction. It is computed as the number of contacts in the prediction that match contacts in the target, divided by the number of contacts in the target. We in fact compute three ratios, one for each of the three interfaces in the target, and retain the largest for the summary. The contact lists for the target are Contres1, Contres2 and Contres3. As mentioned above, two residues are considered to be in contact if at least one atom of one residue is within 5 Å of an atom of the other.
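The 5 Å contact criterion can be sketched as follows. This is a minimal illustration and not the evaluators' program: the function name `residue_contacts`, the data layout (a mapping from a residue label to a list of atom coordinates) and the inclusive comparison at exactly 5 Å are all assumptions.

```python
import math

def residue_contacts(hpr, kinase, cutoff=5.0):
    """Return the set of (hpr_residue, kinase_residue) pairs in contact.

    hpr and kinase map a residue label (e.g. "H11 ASP") to a list of
    (x, y, z) atom coordinates in Angstroms. Two residues are in contact
    if any atom of one lies within `cutoff` of any atom of the other.
    Using <= rather than < at the cutoff is an assumption.
    """
    contacts = set()
    for res_h, atoms_h in hpr.items():
        for res_k, atoms_k in kinase.items():
            if any(math.dist(a, b) <= cutoff
                   for a in atoms_h for b in atoms_k):
                contacts.add((res_h, res_k))
    return contacts

# Toy coordinates (hypothetical, not taken from capri.1.pdb).
hpr = {"H11 ASP": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]}
kinase = {
    "C309 GLU": [(5.0, 0.0, 0.0)],   # 3.5 A from the second Hpr atom
    "C305 ILE": [(20.0, 0.0, 0.0)],  # too far from every Hpr atom
}
print(residue_contacts(hpr, kinase))
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a single Hpr-kinase interface; a spatial grid would be the usual refinement for larger systems.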

    Column 3 gives fnon-nat, the fraction of false-positive contacts. It is computed as the number of contacts in the prediction that do not match contacts in the target, divided by the number of contacts in the prediction. This number reflects the real efficiency of the prediction in terms of contacts: the larger the predicted interface, the higher the probability of predicting native contacts. The value given in this table is computed against the target contact list that gives the highest ratio of matched contacts over predicted contacts.
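The two contact fractions can be illustrated with a short sketch. The function name `contact_fractions` and the use of Python sets are assumptions; the counts in the example (72 predicted contacts, 5 matches, target lists of 59, 56 and 52 contacts) are those of the T01_P22.1.A contact list shown further down this page, reproduced here with synthetic stand-in data.

```python
def contact_fractions(predicted, native_lists):
    """Return (fnat, fnon_nat) for one predicted interface.

    predicted    : set of residue pairs in contact in the prediction
    native_lists : the target contact sets (Contres1, Contres2, Contres3)
    Pairs are assumed to be already expressed in a common residue
    labelling, so that set intersection finds the matches.
    """
    n_pred = len(predicted)
    matches = [len(predicted & native) for native in native_lists]
    # fnat: matched contacts over native contacts, largest of the ratios.
    fnat = max(m / len(native) for m, native in zip(matches, native_lists))
    # fnon-nat: unmatched contacts over predicted contacts, taken for the
    # target list with the highest matched/predicted ratio.
    fnon_nat = (n_pred - max(matches)) / n_pred
    return fnat, fnon_nat

# Synthetic stand-in data: integers play the role of residue pairs.
predicted = set(range(72))

def fake_native(size):
    """Hypothetical target list sharing exactly 5 contacts with `predicted`."""
    return set(range(5)) | set(range(1000, 1000 + size - 5))

native_lists = [fake_native(59), fake_native(56), fake_native(52)]

fnat, fnon_nat = contact_fractions(predicted, native_lists)
print(round(fnat, 3), round(fnon_nat, 3))  # 0.096 0.931
```

The printed values agree with columns 2 and 3 of the T01_P22.1.A row in the summary table (5/52 and 67/72).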

    Columns 4 and 5 list fIR, the interface residue ratios over native. Column 4 gives the number of Hpr residues that are part of the interface both in the prediction and in the target, divided by the number of Hpr residues that are part of the interface in the target. Column 5 gives the same information for the kinase moiety. These ratios are computed for each of the three Hpr/Kinase interfaces in the target, but again the summary lists the results for the target interface that gives the highest contact ratio over native. All the interface residue lists are generated using the BRUGEL package.

    Columns 6 and 7 list the interface residue ratios over prediction. They are analogous to columns 4 and 5, but the number of predicted interface residues also found in the target interface is now divided by the total number of residues in the predicted interface.
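As a worked example of columns 4 to 7, the counts reported for T01_P22.1.A in the InterfaceResidues example further down this page (28 Hpr and 35 kinase residues predicted; 13/26 Hpr and 16/25 kinase residues matched for the J interface) can be turned into the four ratios. The helper name `interface_ratios` is an assumption made for illustration.

```python
def interface_ratios(n_match, n_native, n_predicted):
    """Return (ratio over native, ratio over prediction) for one molecule."""
    return n_match / n_native, n_match / n_predicted

# Counts for T01_P22.1.A against the J interface of the target.
hpr = interface_ratios(n_match=13, n_native=26, n_predicted=28)
kinase = interface_ratios(n_match=16, n_native=25, n_predicted=35)

print([round(x, 3) for x in hpr + kinase])  # [0.5, 0.464, 0.64, 0.457]
```

These values agree with the entries 0.500, 0.640 (over native) and 0.464, 0.457 (over prediction) in the T01_P22.1.A row of the summary table.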

    Column 8 lists the rotation angle (Theta angle) necessary to fit the Hpr molecule in the predicted complex to that in the target, as per capri.1.pdb. To compute this angle, we first perform a rigid-body fit (Kabsch, 1978, Acta. Cryst. A. 34, 827-828) of the Kinase subunit in the predicted complex to the kinase subunit in the target. The particular subunits superimposed are listed in the detailed information (see the FITTING_SUMMARY below). In performing the fit between the kinase subunits, we leave out the C-terminal helix (end fragment starting at residue 287), which changes conformation between the bound and unbound kinase moieties.

    After this first fit, a second fit is performed to superimpose the predicted Hpr molecule onto its closest counterpart in the target structure (capri.1.pdb). The rotation angle corresponding to this second fit is the listed Theta angle.

    Column 9 lists the distance (in Angstroms) between geometric centers of predicted and target Hpr molecules before the second fit. The distance between the geometric centers together with the Theta angle give an idea of the global position of the Hpr in the prediction relative to the position in the target.
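The two global measures can be sketched as below. The text does not say how the Theta angle is extracted from the second fit; the trace formula used here is the standard way of obtaining the rotation angle of a 3x3 rotation matrix and is an assumption about the procedure, as are all function names.

```python
import math

def rotation_angle_deg(rot):
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    # Clamp to [-1, 1] to guard against rounding in the matrix elements.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def geometric_center(coords):
    """Unweighted mean position of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def center_distance(coords_a, coords_b):
    """Distance between the geometric centers of two coordinate sets."""
    return math.dist(geometric_center(coords_a), geometric_center(coords_b))

# A quarter turn about the z axis, used as a check on the angle formula.
rot_z90 = [[0.0, -1.0, 0.0],
           [1.0,  0.0, 0.0],
           [0.0,  0.0, 1.0]]
print(rotation_angle_deg(rot_z90))
print(center_distance([(0, 0, 0), (2, 0, 0)], [(1, 3, 4)]))
```

A Theta angle near 0 together with a small center distance indicates an Hpr placed and oriented close to the target; a large angle with a small distance indicates the right location but the wrong orientation.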

    Column 10 lists the number of clashes, Nclash, between the Hpr and the Kinases for each predicted complex. Clashes are counted between heavy atoms within 3 Å of each other. In the detailed information the clash pairs are classified into three categories: from 0 to 1, from 1 to 2 and from 2 to 3 Å.

    Columns 11 and 12 list RMSD (Root Mean Square Deviation) values in Å. Column 11 lists L_rmsd, the RMSD between the Hpr backbones once the kinases are superimposed. Column 12 lists I_rmsd, the RMSD obtained when superimposing the backbones of the interface residues (Hpr + Kinase) of the prediction onto their counterparts in the target. Interface residues are redefined here as residues in the target having at least one atom within 10 Å of an atom of the other molecule. The equivalent residues in the prediction are taken as the interface for the superposition. For all the RMSD calculations we consider the same molecular fragments as for the fits.
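For column 11 the RMSD is taken over coordinates that are already in a common frame (the kinases having been superimposed), so no further fit is needed. A minimal sketch, assuming the two backbone atom lists are given in the same order; the function name `rmsd` is hypothetical. I_rmsd would additionally require an optimal superposition of the interface backbones, which is not shown here.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    No superposition is done here: the coordinates are compared as given.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Two atoms displaced by 3 A and 4 A respectively (toy values).
target = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
predicted = [(3.0, 0.0, 0.0), (10.0, 4.0, 0.0)]
print(rmsd(target, predicted))  # sqrt((9 + 16) / 2)
```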


    Contact List

    Directory ContactList contains one file per predicted interface, with information on the residue-residue contacts in the prediction and how well they match those in the target. The information contained in each file is illustrated by an example for the prediction T01_P22.1.A.highlighted:

    HIGHLIGHTED CONTACT LIST FOR T01_P22.1.resA
    Number of contacts = 72 Matching List1 = 5/59 Matching List2 = 5/56 Matching List3 = 5/52


    A20 THR - C180 ARG
    A23 VAL - C179 ASP
    .
    .
    A46 SER - C140 HIS	1	2	3
    A46 SER - C141 GLY
    A46 SER - C157 SER	1	2	3
    .
    .
    

    The first line in each file is a header giving the ID of the analyzed interface (here, T01_P22.1.resA). The second line summarizes the results. It lists the total number of residue contacts in the predicted interface, and three ratios. Each ratio is the number of predicted contacts that match those in the target, over the number of contacts in the target, evaluated for one of the three contact lists of the target (Contres1, Contres2 and Contres3).

    The rest of the file contains the list of residue pairs (Hpr residue - kinase residue) in contact in the predicted interface. The residues are identified by the chain identifier (here chain A for the Hpr and chain C for the kinase), the residue number along the chain and the three-letter amino acid code. When a residue pair matches a pair in one of the three contact lists of the target (Contres1, Contres2, Contres3), this is indicated in the following three columns. A match with a contact in Contres1 is marked with a 1, a match with a contact in Contres2 with a 2, and a match with a contact in Contres3 with a 3.
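A contact line of this form can be read with a few lines of code. The sketch below assumes that fields are separated by whitespace (spaces or tabs) and that the chain identifier is fused to the residue number, as in the example; the function name `parse_contact_line` is hypothetical.

```python
def parse_contact_line(line):
    """Parse one contact line, e.g. 'A46 SER - C140 HIS\t1\t2\t3'.

    Returns (hpr_residue, kinase_residue, matched_lists), where
    matched_lists is the set of target list numbers (1, 2, 3) flagged
    on that line; it is empty when the contact matches no target list.
    """
    fields = line.split()
    if len(fields) < 5 or fields[2] != "-":
        raise ValueError(f"not a contact line: {line!r}")
    hpr = (fields[0], fields[1])
    kinase = (fields[3], fields[4])
    matched = {int(f) for f in fields[5:]}
    return hpr, kinase, matched

# Three lines from the example file above.
lines = [
    "A46 SER - C140 HIS\t1\t2\t3",
    "A46 SER - C141 GLY",
    "A46 SER - C157 SER\t1\t2\t3",
]
counts = {1: 0, 2: 0, 3: 0}
for line in lines:
    for n in parse_contact_line(line)[2]:
        counts[n] += 1
print(counts)  # {1: 2, 2: 2, 3: 2}
```

Summing the flags over a whole file in this way should give the numerators of the three "Matching List" ratios in the summary line.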


    INTERFACE_RESIDUES_HIGHLIGHTED

    Directory InterfaceResidues contains one file per predicted interface, with information on the residues forming the Hpr-kinase interface in the prediction and how well they match those in the target interfaces. The information contained in each file is illustrated by an example for the prediction T01_P22.1.A.highlighted:

    HIGHLIGHTED RESIDUE LIST FOR T01_P22.1.A

    N_res_Hpr = 28 N_res_K = 35 Match H_Hpr = 14/27 Match H_K = 17/29 Match I_Hpr = 15/28 Match I_K = 16/30 Match J_Hpr = 13/26 Match J_K = 16/25


                
    A20  THR  39.24771	2
    A23  VAL  5.924515
    A24  GLN  89.13425	2
    A27  SER  49.59916	1	2	3
    A28  LYS  91.05328	1	2
    A29  TYR  6.877995
        .
        .
        .
                
    KINASE LIST
                
    C138  SER  6.184550	1	2	3
    C139  MET  3.325624	1
    C140  HIS  69.19147	1	2	3
    C141  GLY  0.2478379
    .
    .
    .
    

    The first line in each file is a header giving the ID of the analyzed interface (here, T01_P22.1.A). The second line summarizes the results. It lists, respectively, the number of Hpr residues in the predicted interface, the number of kinase residues in the predicted interface, and three pairs of ratios of correctly predicted over observed residues, for the Hpr and the kinase respectively, one pair for each of the three interfaces of the target. This summary line is followed by two lists. The first contains the Hpr residues in the predicted interface and the second the kinase residues in the predicted interface. For each listed residue the Delta ASA value is also given, in Å². A match with a residue in the list of one of the target interfaces is indicated on the right-hand side: a 1 indicates a match with the interface residue list capri.1.H.intres, a 2 a match with capri.1.I.intres, and a 3 a match with capri.1.J.intres.
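The Delta ASA criterion behind these lists (a residue belongs to the interface when its accessible surface area differs between the isolated component and the complex) can be sketched as follows. The per-residue ASA values are assumed to come from an external surface-area program; the function name `interface_residues`, the `tol` parameter and the numbers in the example are hypothetical.

```python
def interface_residues(asa_free, asa_complex, tol=0.0):
    """Return {residue: delta_asa} for residues whose ASA changes on binding.

    asa_free and asa_complex map a residue label to its accessible
    surface area (square Angstroms) in the isolated component and in
    the complex. A residue is kept when |delta| exceeds `tol`; tol=0
    corresponds to "Delta ASA is not zero".
    """
    result = {}
    for res, free in asa_free.items():
        delta = free - asa_complex[res]
        if abs(delta) > tol:
            result[res] = delta
    return result

# Hypothetical ASA values, for illustration only.
asa_free = {"A27 SER": 80.0, "A29 TYR": 12.5, "A30 ALA": 40.0}
asa_complex = {"A27 SER": 30.0, "A29 TYR": 12.5, "A30 ALA": 40.0}
print(interface_residues(asa_free, asa_complex))  # {'A27 SER': 50.0}
```

With a zero tolerance even marginally buried residues are kept, which is consistent with very small Delta ASA values (for example 0.2478379 for C141 GLY) appearing in the lists.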

    Note that the interface residue list files and the contact list files have the same names (e.g. T01_P22.1.A.highlighted), but they are in different directories and their contents are completely different.


    FITTING_SUMMARY

    Directory FittingSummary contains one file per predicted interface, with information on the results of fitting the predicted complex over the target complex. The information contained in each file is illustrated by an example for the prediction T01_P22.1.A.

    T01_P22.1.A.fitting.summary contains the following information:
    (First fitting information)

    Fitting on C prediction Kinase onto K capri Kinase subunit

    Rotation Matrix:
    0.99823 -0.03700 0.04659
    0.04238 0.99185 -0.12020
    -0.04177 0.12196 0.99166
    Translation vector -12.738 -6.203 -15.003
    (Second fitting information)

    Fitting Hpr's, A onto R
    Theta angle = 133.05
    Distance between geometrical centers = 4.033024

    Two rigid-body coordinate superpositions were computed. The first fitting information describes the first superposition, in which the kinase subunits of the predicted and target complexes are superimposed. The second fitting information describes the second superposition, in which the Hpr subunits of the predicted and target complexes were superimposed after the first superposition had been performed. The first fitting information indicates which kinase subunits were used in the superposition, and gives the rotation matrix and translation vector computed from it.

    We use the optimal fit in the following sense: we scan all possible pairings of the predicted kinase subunit having the highest number of contacts with each of the kinase subunits in the target, and select as the first fit the transformation that brings the predicted and target Hpr closest together.

    In this case, the above rotation and translation can be applied to subunit C of the prediction to optimally superimpose it onto subunit K of the target. The coordinate superpositions were computed using the program of Kabsch (1978). In performing the fit between the kinase subunits, we leave out the C-terminal helix (residue 287 to the end), which changes conformation between the bound and unbound kinase moieties. After this first fit, a second fit is performed to superimpose the predicted Hpr molecule onto its closest counterpart in the target structure. The chain IDs of the fitted Hpr molecules are given, and for this second fit we list the rotation, in the form of the Theta angle in degrees, and the distance between the geometrical centers (computed before the second fit is performed).
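Applying the listed rotation and translation to a set of coordinates can be sketched as below. The summary file does not state the convention, so two things are assumed here: that the nine matrix elements are read row by row, and that the transformation is x' = R·x + t (rotate first, then translate). The function name `apply_fit` is hypothetical.

```python
def apply_fit(coords, rot, trans):
    """Apply x' = R.x + t to each (x, y, z) coordinate.

    The convention (rotation followed by translation, column-vector
    form) is an assumption; the summary file does not state it.
    """
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rot[i][0] * x + rot[i][1] * y + rot[i][2] * z + trans[i]
            for i in range(3)
        ))
    return moved

# Values from the T01_P22.1.A.fitting.summary example; the row-by-row
# reading of the matrix elements is an assumption.
rot = [[0.99823, -0.03700, 0.04659],
       [0.04238, 0.99185, -0.12020],
       [-0.04177, 0.12196, 0.99166]]
trans = (-12.738, -6.203, -15.003)

# Under this convention the origin maps onto the translation vector.
print(apply_fit([(0.0, 0.0, 0.0)], rot, trans))
```

If the superimposed coordinates obtained this way do not reproduce the files in FittedPDB, the transposed matrix or the opposite order of operations should be tried.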

    Note that, to avoid confusing chain IDs between the target and predicted coordinate sets, the chain IDs in the target (capri.1.pdb) were renamed as follows:
    Kinase subunits: A to K, B to L, C to M
    Hpr subunits: H to R, I to S, J to T


    FITTED PDB

    Directory FittedPDB contains files in PDB format with the coordinates of the predicted and target complexes superimposed following the first fit, in which the kinase subunits have been superimposed (using the listed rotation matrix and translation vector). Displaying such a file with Rasmol, and coloring the predicted and target coordinates differently, clearly shows the differences between the predicted and observed positions of the Hpr molecules relative to the kinase subunits.

    These files have now been regenerated according to the new fits.

    CLOSE_CONTACTS

    Directory CloseContacts contains one file per predicted interface with information on the clashes in each predicted interface. We noticed that many predicted complexes seemed to have an unduly large number of atomic clashes, probably due to the use of simplified models.

    For example, part of the file cc.T01_P27.1.A.d looks as follows:

           
    B. SUBTILIS Hpr Atom    LACTOBACILLUS Hpr KINASE Atom   Distance
    
    A 47   .ILE.O           E 237  .TRP.CH2                 0.34
    A 47   .ILE.CD1         E 237  .TRP.NE1                 0.56
    A 15   .HIS.CD2         E 235  .GLU.C                   0.63
    A 42   .VAL.CB          E 239  .PRO.CB                  0.65
    .
    .
    
    --
    A 15   .HIS.CB          E 236  .ASN.OD1                 1.01
    A 52   .SER.OG          E 240  .ASP.CB                  1.09
    A 47   .ILE.C           E 237  .TRP.CH2                 1.12
    A 47   .ILE.O           E 237  .TRP.CZ2                 1.17
    .
    .
    --
    A 54   .GLY.O           E 235  .GLU.OE1                 2.01
    A 54   .GLY.C           E 235  .GLU.CD                  2.01
    A 42   .VAL.CG1         E 239  .PRO.O                   2.01
    A 15   .HIS.CD2         E 236  .ASN.CA                  2.02
    A 49   .GLY.O           E 239  .PRO.C                   2.03
    
    

    The atoms of the Hpr and Kinase subunits that form each clash are listed. The list of clashes is divided into the distance ranges 0-1, 1-2 and 2-3 Å.
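Reading such a file and counting the clashes per distance range can be sketched as follows. The field layout (seven whitespace-separated fields per clash line, with a non-blank chain identifier) is inferred from the example above, and the treatment of distances falling exactly on a range boundary is an assumption; the function names are hypothetical.

```python
def parse_clash_line(line):
    """Parse 'A 47   .ILE.O   E 237  .TRP.CH2   0.34'.

    Returns (hpr_atom, kinase_atom, distance), each atom being a tuple
    (chain, residue_number, 'RES.ATOM').
    """
    f = line.split()
    if len(f) != 7:
        raise ValueError(f"not a clash line: {line!r}")
    hpr_atom = (f[0], int(f[1]), f[2].lstrip("."))
    kinase_atom = (f[3], int(f[4]), f[5].lstrip("."))
    return hpr_atom, kinase_atom, float(f[6])

def bin_clashes(distances):
    """Count clashes in the ranges 0-1, 1-2 and 2-3 Angstroms.

    Half-open ranges [0,1), [1,2) and [2,3] are an assumption.
    """
    counts = {"0-1": 0, "1-2": 0, "2-3": 0}
    for d in distances:
        if d < 1.0:
            counts["0-1"] += 1
        elif d < 2.0:
            counts["1-2"] += 1
        elif d <= 3.0:
            counts["2-3"] += 1
    return counts

# One line from each range of the cc.T01_P27.1.A.d example.
lines = [
    "A 47   .ILE.O           E 237  .TRP.CH2                 0.34",
    "A 15   .HIS.CB          E 236  .ASN.OD1                 1.01",
    "A 54   .GLY.O           E 235  .GLU.OE1                 2.01",
]
print(bin_clashes(parse_clash_line(l)[2] for l in lines))
```

Header lines, the `--` separators and the `.` continuation markers do not have seven fields and would need to be skipped before calling `parse_clash_line`.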


    Email Problems or Queries to Kim Henrick, John Tate