CAPRI:
Critical Assessment of PRediction of Interactions
First community-wide experiment on the comparative
evaluation of protein-protein docking for structure
prediction
Hosted By EMBL/EBI-MSD Group
CAPRI target 01 evaluation results
Raúl Méndez, Leonardo De Maria and Shoshana J. Wodak.
SCMBB Université Libre de Bruxelles, Cp 263, Brussels, Belgium.
Re-accessed on Monday December 16, 2002.
e-mail: raul@ucmb.ulb.ac.be
shosh@ucmb.ulb.ac.be
The evaluation results of the CAPRI Target 01 predictions
are stored in different directories, depending on the criteria
that were used. The directories and their contents are
briefly described below.
Information
Directory Information
contains information about Target 01, which was used in the
evaluation and scoring. It contains the following files (file
names are given in bold):
capri.1.pdb:
the crystal structure of the target (target 01) in
PDB format.
Contres1:
list of residue contacts in the target between subunits
H (Hpr) and A, C (Kinase).
Contres2:
list of residue contacts in the target between subunits
I (Hpr) and A, B (Kinase).
Contres3:
list of residue contacts in the target between subunits
J (Hpr) and B, C (Kinase).
As an example, here are the first three lines of Contres1:
H11 ASP - C309 GLU
H12 SER - C305 ILE
H13 GLY - C305 ILE
The first column lists the Hpr residue and the second
column the Kinase residue with which it is in contact. Two
residues are considered to be in contact if at least one atom
of one residue is within 5 Å of an atom of the other.
Each residue is identified by its chain name and residue number
along the chain, followed by the three-letter amino acid
code.
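The 5 Å contact criterion can be sketched as follows. This is only an illustration, not the code used for the evaluation: the function name `residues_in_contact`, the input format (lists of atom coordinates) and the choice of treating "within 5 Å" as "less than or equal to 5 Å" are assumptions.

```python
import math

CONTACT_CUTOFF = 5.0  # Angstroms, the threshold quoted in the text

def residues_in_contact(atoms_1, atoms_2, cutoff=CONTACT_CUTOFF):
    """True if any atom of residue 1 is within `cutoff` of any atom of residue 2.

    Each residue is a list of (x, y, z) atom coordinates (hypothetical format).
    Treating "within 5 A" as "<= 5 A" is an assumption.
    """
    return any(math.dist(a, b) <= cutoff for a in atoms_1 for b in atoms_2)

# Made-up coordinates, for illustration only.
res_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
res_b = [(6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]   # nearest atom pair: 4.5 A apart
res_c = [(12.0, 0.0, 0.0)]                   # nearest atom pair: 10.5 A apart

print(residues_in_contact(res_a, res_b))
print(residues_in_contact(res_a, res_c))
```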
Three different lists were calculated because the contacts
made by the H, I, and J Hpr copies were slightly different.
The same was done for the lists of interface residues (below).
capri.1.H.intres
Hpr-Kinase interface residues in the target for the H-A,C
interface.
capri.1.I.intres
Hpr-Kinase interface residues in the target for the I-A,B
interface.
capri.1.J.intres
Hpr-Kinase interface residues in the target for the J-B,C
interface.
Interface residues were identified by computing the difference
in residue accessible surface area (ASA) between the complex
and the individual components. Residues in the Hpr and
kinase subunits for which Delta ASA is not zero were included
in the list.
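The Delta ASA selection can be sketched as below. The ASA values are made up, the function name `interface_residues` is hypothetical, and the small tolerance `eps` used to decide that a difference "is not zero" is an assumption; the ASA calculation itself is not shown.

```python
def interface_residues(asa_free, asa_complex, eps=1e-6):
    """Return {residue: delta_asa} for residues whose ASA changes on complex formation.

    asa_free    -- ASA of each residue in the isolated component (A^2)
    asa_complex -- ASA of the same residues in the complex (A^2)
    eps         -- tolerance for "not zero" (an assumption, not from the source)
    """
    selected = {}
    for residue, free_value in asa_free.items():
        delta = free_value - asa_complex[residue]
        if abs(delta) > eps:
            selected[residue] = delta
    return selected

# Made-up ASA values, for illustration only.
asa_free = {"H11 ASP": 60.0, "H12 SER": 45.5, "H20 ALA": 30.0}
asa_complex = {"H11 ASP": 50.0, "H12 SER": 23.5, "H20 ALA": 30.0}

print(interface_residues(asa_free, asa_complex))
```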
For example, the capri.1.H.intres file is structured as
follows:
INTERFACE RESIDUES IN Hpr capri.1H.intres
H11 ASP DELTA_ASA = 9.576227
H12 SER DELTA_ASA = 22.17662
H13 GLY DELTA_ASA = 0.6229653E-01
.
.
(continue)
.
.
INTERFACE RESIDUES IN KINASE capri.1H.intres
A136 ARG DELTA_ASA = 23.78261
A137 ARG DELTA_ASA = 1.377046
A138 SER DELTA_ASA = 51.34740
.
.
( end of file)
Each file thus contains two lists. The first list contains
the Hpr residues of the target interface and the second list
contains the Kinase residues, each with its Delta ASA value
in Å². The name of the Hpr subunit used for the lists is
given in the file name (capri.1.H.intres stands for the Hpr
H subunit). The ASA values were computed using a probe
radius of 1.4 Å.
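A file of this kind could be read with a few lines of Python. This is a sketch that assumes the layout is exactly as in the example above (whitespace-separated fields, one residue per line); the function name `parse_intres` is hypothetical.

```python
import re

# One residue line, e.g. "H11 ASP DELTA_ASA = 9.576227"
RESIDUE_RE = re.compile(r"^(\S+)\s+(\S+)\s+DELTA_ASA\s*=\s*(\S+)$")

def parse_intres(text):
    """Split an .intres file into {"HPR": [...], "KINASE": [...]} lists.

    Each entry is (residue id, residue name, delta ASA in A^2).
    """
    lists = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.upper().startswith("INTERFACE RESIDUES IN"):
            current = line.split()[3].upper()   # "Hpr" or "KINASE"
            lists[current] = []
            continue
        match = RESIDUE_RE.match(line)
        if match and current is not None:
            lists[current].append(
                (match.group(1), match.group(2), float(match.group(3)))
            )
    return lists

sample = """\
INTERFACE RESIDUES IN Hpr capri.1H.intres
H11 ASP DELTA_ASA = 9.576227
H12 SER DELTA_ASA = 22.17662
H13 GLY DELTA_ASA = 0.6229653E-01
INTERFACE RESIDUES IN KINASE capri.1H.intres
A136 ARG DELTA_ASA = 23.78261
A137 ARG DELTA_ASA = 1.377046
A138 SER DELTA_ASA = 51.34740
"""

for name, residues in parse_intres(sample).items():
    print(name, residues)
```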
cc.capri1.H.d
list of clashes in the target H-A,C interface.
cc.capri1.I.d
list of clashes in the target I-A,B interface.
cc.capri1.J.d
list of clashes in the target J-B,C interface.
Example of the cc.capri1.H.d content:
B. SUBTILIS Hpr Atom LACTOBACILLUS Hpr KINASE Atom Distance
-- (no Clashes between 0-1Å)
-- (no Clashes between 1-2Å)
H 46 .SER.OG A 179 .ASP.OD2 2.48
H 40 .LYS.NZ A 136 .ARG.O 2.66
H 52 .SER.OG A 180 .ARG.NH1 2.76
H 46 .SER.OG A 179 .ASP.OD1 2.85
H 48 .MET.CB A 179 .ASP.OD2 2.86
H 46 .SER.OG A 179 .ASP.CG 2.96
H 51 .MET.O C 301 .LEU.CD1 2.97
H 54 .GLY.O C 308 .ASN.ND2 2.99
These files list only the distances less than or equal to
3 Å between any Hpr atom of the target (in this example,
from the H subunit) and any Kinase atom of the target (in this
example, from the A or C subunits). Chain name, residue number,
residue name and atom name are also indicated.
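The lines of such a file could be parsed as sketched below. We assume here that the fields are separated by whitespace exactly as shown in the example; the `Clash` record and the function `parse_clash_line` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional

class Clash(NamedTuple):
    hpr_chain: str
    hpr_resnum: int
    hpr_atom: str        # residue and atom name, e.g. "SER.OG"
    kinase_chain: str
    kinase_resnum: int
    kinase_atom: str     # e.g. "ASP.OD2"
    distance: float      # Angstroms

def parse_clash_line(line: str) -> Optional[Clash]:
    """Parse one atom-pair line; return None for headers and separator lines."""
    fields = line.split()
    if len(fields) != 7:
        return None
    try:
        return Clash(
            fields[0], int(fields[1]), fields[2].lstrip("."),
            fields[3], int(fields[4]), fields[5].lstrip("."),
            float(fields[6]),
        )
    except ValueError:
        return None

print(parse_clash_line("H 46 .SER.OG A 179 .ASP.OD2 2.48"))
print(parse_clash_line("-- (no Clashes between 0-1Å)"))
```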
Final Summary
The file Target 01 Final Summary contains information
that looks like this:
PREDS        fnat   fnon-nat  fIR           INTERFACE RES.(OP)  THETA ANGLE  DISTANCE  Nclash  L_rmsd  I_rmsd
                              Hpr    K      Hpr       K
T01_P22.1.A  0.096  0.931     0.500  0.640  0.464     0.457     133.1        4.033     42      17.979  8.639
T01_P22.4.A  0.096  0.946     0.462  0.640  0.400     0.410     110.6        5.314     78      16.377  7.566
T01_P22.2.A  0.077  0.949     0.462  0.640  0.400     0.421     147.0        5.484     53      19.051  9.243
T01_P22.3.A  0.077  0.938     0.500  0.640  0.500     0.571     168.3        2.526     35      19.267  9.797
.
.
Column 1 lists the prediction identifier (e.g. T01_P22.1.A
means participant 22, prediction 1 for target 1, using
Hpr chain A in the unbound coordinates).
Column 2 gives the fraction of native contacts that are predicted, fnat.
This fraction is computed as the number of contacts in the
prediction that match the contacts in the target, divided by
the number of contacts in the target. We actually compute 3
different ratios, one for each of the 3 interfaces in the
target, and retain the largest ratio for the summary. The
contact lists for the target are Contres1, Contres2 and
Contres3. As mentioned above, 2 residues are considered to be
in contact if at least one atom of one residue is within 5 Å
of an atom of the other.
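As a sketch of this calculation, contacts can be held as sets and compared directly. The data below are made up, the function names are hypothetical, and we assume the contacts have already been reduced to keys that are comparable between prediction and target (here, pairs of residue numbers).

```python
def fnat(predicted, native):
    """Fraction of the native contacts that also occur in the prediction."""
    return len(predicted & native) / len(native)

def best_fnat(predicted, native_lists):
    """Largest fnat over the target contact lists (e.g. Contres1-3)."""
    return max(fnat(predicted, native) for native in native_lists)

# Toy contacts as (Hpr residue number, kinase residue number); made-up data.
predicted = {(46, 140), (46, 141), (46, 157), (20, 180)}
contres = [
    {(46, 140), (46, 157), (11, 309), (12, 305)},               # 2 of 4 matched
    {(46, 140), (12, 305), (13, 305)},                          # 1 of 3 matched
    {(46, 140), (46, 141), (46, 157), (11, 309), (13, 305)},    # 3 of 5 matched
]

print(best_fnat(predicted, contres))
```

With the counts reported for T01_P22.1.A in its contact list file (5/59, 5/56 and 5/52), the largest ratio is 5/52, which rounds to the 0.096 shown in the summary.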
Column 3 gives the fraction of non-native (false positive) contacts, fnon-nat.
This fraction is computed as the number of contacts in the prediction that do not match the contacts in the target, divided by the number of
contacts in the prediction. This number reflects the
real efficiency of the prediction in terms of contacts: the larger
the predicted interface, the higher the probability of predicting
native contacts. The number given in this table refers
to the target contact list that gives the highest ratio of matching
contacts over predicted contacts.
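A sketch of this ratio, under our reading that the reference list is the one sharing the most contacts with the prediction. The function name `fnon_nat` and the toy data are hypothetical.

```python
def fnon_nat(predicted, native_lists):
    """Fraction of predicted contacts absent from the best-matching target list.

    "Best-matching" is taken to mean the target contact list that shares the
    largest number of contacts with the prediction (our interpretation).
    """
    best_matches = max(len(predicted & native) for native in native_lists)
    return (len(predicted) - best_matches) / len(predicted)

# Toy contacts as (Hpr residue number, kinase residue number); made-up data.
predicted = {(46, 140), (46, 141), (46, 157), (20, 180)}
native_lists = [
    {(46, 140), (46, 157), (11, 309)},              # shares 2 contacts
    {(46, 140), (46, 141), (46, 157), (13, 305)},   # shares 3 contacts
]

print(fnon_nat(predicted, native_lists))
```

For T01_P22.1.A, 72 predicted contacts of which 5 match give 67/72, which rounds to the 0.931 listed in the summary.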
Columns 4 and 5 list the interface residue ratios over native, fIR.
Column 4 gives the ratio of the correctly predicted Hpr
interface residues over the Hpr residues that are part of
the interface in the target.
Column 5 gives the same information for the kinase
moiety. These ratios are computed for each of the 3
different Hpr/Kinase interfaces in the target, but again the
summary lists the results for the target interface that gives
the highest contact ratio over native (ON). All the interface
residue lists are generated using the BRUGEL package.
Columns 6 and 7 list the interface residue ratios over prediction (OP).
They are analogous to columns 4 and 5, but the number of
predicted interface residues also found in the target interface is now
divided by the total number of residues in the predicted interface.
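The two families of interface residue ratios differ only in the denominator, as the following sketch shows. The residue labels are made up and the function name `interface_ratios` is hypothetical.

```python
def interface_ratios(predicted, native):
    """Return (ratio over native, ratio over prediction) for one molecule.

    predicted -- set of residues at the predicted interface
    native    -- set of residues at the target interface
    """
    common = predicted & native
    return len(common) / len(native), len(common) / len(predicted)

# Made-up residue labels, for illustration only.
predicted_hpr = {"20 THR", "27 SER", "28 LYS", "29 TYR"}
native_hpr = {"27 SER", "28 LYS", "46 SER"}

over_native, over_prediction = interface_ratios(predicted_hpr, native_hpr)
print(over_native, over_prediction)
```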
Column 8 lists the rotation angle (Theta angle) necessary
to fit the Hpr molecule in the predicted complex to that
in the target, as per capri.1.pdb. To compute this angle,
we first perform a rigid-body fit (Kabsch, 1978, Acta
Cryst. A 34, 827-828) of the kinase subunit in the
predicted complex to the kinase subunit in the target.
The particular subunits superimposed are listed in the detailed
information (see FITTING_SUMMARY below). In performing
the fit between the kinase subunits, we leave out the C-terminal
helix (end fragment starting at residue 287), which changes
conformation between the bound and unbound kinase moieties.
After this first fit, a second fit is performed so as to
superimpose the predicted Hpr molecule onto its closest
counterpart in the target structure (capri.1.pdb).
The rotation angle corresponding to this second fit
is the listed Theta angle.
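A rotation angle can be recovered from a 3x3 rotation matrix through its trace, cos(theta) = (trace - 1) / 2. The sketch below uses this standard relation; whether the evaluation program derives Theta in exactly this way is an assumption, and the function name is hypothetical.

```python
import math

def rotation_angle_degrees(rot):
    """Angle (degrees) of the rotation described by a 3x3 rotation matrix."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    # Clamp to [-1, 1] to guard against rounding errors in the matrix.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

# Illustrative matrices (not taken from the evaluation).
quarter_turn_about_z = [
    [0.0, -1.0, 0.0],
    [1.0,  0.0, 0.0],
    [0.0,  0.0, 1.0],
]
identity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(rotation_angle_degrees(quarter_turn_about_z))
print(rotation_angle_degrees(identity))
```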
Column 9 lists the distance (in Angstroms) between geometric
centers of predicted and target Hpr molecules before the
second fit. The distance between the geometric centers together
with the Theta angle give an idea of the global position
of the Hpr in the prediction relative to the position in
the target.
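The distance between geometric centers is the distance between the unweighted means of the atom coordinates, as in this sketch. The coordinates are made up and the function names are hypothetical.

```python
import math

def geometric_center(coords):
    """Unweighted mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def center_distance(coords_a, coords_b):
    """Distance between the geometric centers of two coordinate sets."""
    return math.dist(geometric_center(coords_a), geometric_center(coords_b))

# Made-up coordinates, for illustration only.
predicted_hpr = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]   # center at (1, 0, 0)
target_hpr = [(1.0, 2.0, 4.0), (1.0, 4.0, 4.0)]      # center at (1, 3, 4)

print(center_distance(predicted_hpr, target_hpr))
```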
Column 10 lists the number of clashes, Nclash, between the
Hpr and the Kinases for each predicted complex. Clashes
are counted between heavy atoms within 3 Å of each other. In the
detailed information the clash pairs are
classified into three categories: from 0 to 1, from 1 to
2 and from 2 to 3 Å.
Columns 11 and 12 list the RMSD (Root Mean Square Deviation) values in Å. Column 11
lists the RMSD values calculated between the Hpr backbones
once the kinases are superimposed (L_rmsd). Column 12 contains the RMSD values obtained
when superimposing the backbones of the interface residues
(Hpr + Kinase) of the prediction onto their counterparts in the target (I_rmsd).
Interface residues are redefined here as residues
in the target having at least one atom within 10 Å of an
atom of the other molecule. The equivalent residues in
the predictions are taken as the interface for the superposition. For all the RMSD calculations we consider the same molecular fragments as for the fits.
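Once two sets of equivalent backbone atoms have been superimposed, the RMSD is the square root of the mean squared distance between paired atoms. A minimal sketch, with made-up coordinates and a hypothetical function name; the superposition step itself is not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Made-up paired coordinates; each pair is 5 A apart.
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
target = [(3.0, 4.0, 0.0), (1.0, 1.0, 6.0)]

print(rmsd(predicted, target))
```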
Contact List
Directory: ContactList
contains one file per predicted interface, with information
on the residue-residue contacts in the prediction and how
well they match those in the target. The information contained
in each file is illustrated by the example of the file
T01_P22.1.A.highlighted:
HIGHLIGHTED CONTACT LIST FOR T01_P22.1.resA
Number of contacts = 72 Matching List1 = 5/59 Matching List2 = 5/56 Matching List3 = 5/52
A20 THR - C180 ARG
A23 VAL - C179 ASP
.
.
A46 SER - C140 HIS 1 2 3
A46 SER - C141 GLY
A46 SER - C157 SER 1 2 3
.
.
The first line in each file is a header giving the ID
of the analyzed interface (here, T01_P22.1.resA). The second
line summarizes the results. It lists the total number of
residue contacts in the predicted interface, and 3 ratios.
Each ratio is the number of predicted contacts
that match those in the target, over the number of contacts
in the target, evaluated for each of the 3 contact lists
of the target (Contres1, Contres2 and Contres3).
The rest of the file contains the list of residue pairs
(Hpr residue - kinase residue) in contact in the predicted
interface. The residues are identified by the chain identifier
(here chain A for the Hpr and chain C for the kinase), the
residue number along the chain and the three-letter amino acid code.
When a residue pair matches a pair in one of the 3 contact
lists (Contres1, Contres2, Contres3) of the target, this
is indicated in the following 3 columns. A match with a
contact in Contres1 is marked with a 1, a match with a contact
in Contres2 with a 2, and a match with a contact in Contres3
with a 3.
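Both the summary line and the contact lines lend themselves to simple pattern matching. The sketch below assumes the summary occupies a single line and that fields are separated by whitespace as in the example; the function names are hypothetical.

```python
import re

RATIO_RE = re.compile(r"Matching List(\d)\s*=\s*(\d+)/(\d+)")
CONTACT_RE = re.compile(r"^(\S+)\s+(\S+)\s+-\s+(\S+)\s+(\S+)((?:\s+[123])*)$")

def parse_summary(line):
    """Return (number of predicted contacts, {list number: (matched, native)})."""
    total = int(re.search(r"Number of contacts\s*=\s*(\d+)", line).group(1))
    ratios = {int(n): (int(hit), int(size)) for n, hit, size in RATIO_RE.findall(line)}
    return total, ratios

def parse_contact(line):
    """Return ((Hpr id, name), (kinase id, name), matched list numbers) or None."""
    match = CONTACT_RE.match(line.strip())
    if match is None:
        return None
    flags = {int(token) for token in match.group(5).split()}
    return (match.group(1), match.group(2)), (match.group(3), match.group(4)), flags

summary = ("Number of contacts = 72 Matching List1 = 5/59 "
           "Matching List2 = 5/56 Matching List3 = 5/52")
print(parse_summary(summary))
print(parse_contact("A46 SER - C140 HIS 1 2 3"))
print(parse_contact("A46 SER - C141 GLY"))
```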
INTERFACE_RESIDUES_HIGHLIGHTED
Directory InterfaceResidues
contains one file per predicted interface, with information
on the residues forming the Hpr-kinase interface in the prediction
and how well they match those in the target interfaces.
The information contained in each file is illustrated by
the example of the file
T01_P22.1.A.highlighted:
HIGHLIGHTED RESIDUE LIST FOR T01_P22.1.A
N_res_Hpr = 28 N_res_K = 35 Match H_Hpr = 14/27 Match H_K = 17/29 Match I_Hpr = 15/28 Match I_K = 16/30 Match J_Hpr = 13/26 Match J_K = 16/25
A20 THR 39.24771 2
A23 VAL 5.924515
A24 GLN 89.13425 2
A27 SER 49.59916 1 2 3
A28 LYS 91.05328 1 2
A29 TYR 6.877995
.
.
.
KINASE LIST
C138 SER 6.184550 1 2 3
C139 MET 3.325624 1
C140 HIS 69.19147 1 2 3
C141 GLY 0.2478379
.
.
.
The first line in each file is a header giving the ID
of the analyzed interface (here, T01_P22.1.A). The second
line summarizes the results. It lists, respectively, the
number of Hpr residues forming the predicted interface,
the number of kinase residues forming the predicted interface,
and three pairs of ratios of correctly predicted over observed
residues, for the Hpr and the kinase respectively, one pair for
each of the 3 interfaces of the target. This summary line is
followed by 2 lists.
The first list is that of the Hpr residues in the predicted
interface and the second is that of the kinase residues
in the predicted interface. For each listed residue the
Delta ASA value is also given in Å². A
match with a residue in the list of one of the target interfaces
is indicated on the right-hand side: a 1 indicates a match
with the interface residues list capri.1.H.intres,
a 2 indicates a match with the interface residues list
capri.1.I.intres, and a 3 indicates a match with the
interface residues list capri.1.J.intres.
Note that the interface residues list files and the contact list
files have the same names (e.g. T01_P22.1.A.highlighted),
but they are in different directories and their contents
are completely different.
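The summary line of this example can be related to the interface columns of the Final Summary, as sketched below. We assume the summary occupies a single line; the function names are hypothetical, and the choice of interface J is our reading of which target interface gives the highest contact ratio for this prediction.

```python
import re

def parse_interface_summary(line):
    """Return ({molecule: interface size}, {(interface, molecule): (matched, native)})."""
    sizes = {mol: int(n) for mol, n in re.findall(r"N_res_(Hpr|K)\s*=\s*(\d+)", line)}
    matches = {}
    pattern = r"Match ([HIJ])_(Hpr|K)\s*=\s*(\d+)/(\d+)"
    for iface, mol, hit, native in re.findall(pattern, line):
        matches[(iface, mol)] = (int(hit), int(native))
    return sizes, matches

def ratios_for(sizes, matches, iface):
    """Per molecule: (ratio over native, ratio over prediction) for one interface."""
    result = {}
    for mol in ("Hpr", "K"):
        hit, native = matches[(iface, mol)]
        result[mol] = (hit / native, hit / sizes[mol])
    return result

line = ("N_res_Hpr = 28 N_res_K = 35 Match H_Hpr = 14/27 Match H_K = 17/29 "
        "Match I_Hpr = 15/28 Match I_K = 16/30 Match J_Hpr = 13/26 Match J_K = 16/25")
sizes, matches = parse_interface_summary(line)
for mol, (over_native, over_prediction) in ratios_for(sizes, matches, "J").items():
    print(mol, round(over_native, 3), round(over_prediction, 3))
```

For interface J these ratios round to 0.500 and 0.640 over native and 0.464 and 0.457 over prediction, the values listed for T01_P22.1.A in the Final Summary.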
FITTING_SUMMARY
Directory FittingSummary
contains one file per predicted interface, with information
on the results of fitting the predicted complex over the
target complex. The information contained in each file is
illustrated by an example for the prediction T01_P22.1.A.
T01_P22.1.A.fitting.summary contains the following
information:
(First fitting information)
Fitting on C prediction Kinase onto K capri Kinase subunit
Rotation Matrix:
0.99823 -0.03700 0.04659
0.04238 0.99185 -0.12020
-0.04177 0.12196 0.99166
Translation vector -12.738 -6.203 -15.003
(Second fitting information)
Fitting Hpr's, A onto R
Theta angle = 133.05
Distance between geometrical centers = 4.033024
Two rigid-body coordinate superpositions were computed.
The first fitting information describes the first
superposition, in which the kinase subunits of the predicted
and target complexes are superimposed. The second
fitting information describes the second superposition,
in which the Hpr subunits of the predicted and target complexes
are superimposed after the first superposition has been performed.
The first fitting information indicates which kinase
subunits were used in the superposition, and gives the rotation
matrix and translation vector computed from the superposition.
The fit is optimal in the following sense: we scan the superpositions
of the predicted kinase subunit with the highest number of contacts
onto each of the kinase subunits in the target, and select the
transformation that leaves the predicted and target Hpr closest
to each other after this first fit.
In this case, the above rotation + translation can be applied to subunit
C of the prediction to optimally superimpose it onto subunit
K of the target. The coordinate superpositions were computed
using the program of Kabsch (1978). In performing the fit
between the kinase subunits, we leave out the C-terminal
helix (residue 287 to the end), which changes conformation between
the bound and unbound kinase moieties. After this first
fit, a second fit is performed so as to superimpose the
predicted Hpr molecule onto its closest counterpart in
the target structure. The chain IDs of the fitted Hpr
molecules are given, and for this second fit we list the
rotation, in the form of the Theta angle in degrees,
and the distance between the geometrical centers (computed
before the second fit is performed).
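Applying the first fit to a coordinate can be sketched as below, using the rotation matrix and translation vector of the example. The convention x' = R·x + t (rotation first, then translation, with the matrix rows as printed) is an assumption, since the file does not state it; the function name `apply_fit` is hypothetical.

```python
def apply_fit(rotation, translation, point):
    """Return R.x + t for a 3x3 matrix R, a vector t and a point x (assumed convention)."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Values copied from the example fitting summary of T01_P22.1.A.
rotation = [
    [0.99823, -0.03700, 0.04659],
    [0.04238, 0.99185, -0.12020],
    [-0.04177, 0.12196, 0.99166],
]
translation = [-12.738, -6.203, -15.003]

print(apply_fit(rotation, translation, (0.0, 0.0, 0.0)))
print(apply_fit(rotation, translation, (1.0, 0.0, 0.0)))
```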
Note that, to avoid confusing chain IDs between the target
and predicted coordinate sets, the chain IDs in the target
(capri.1.pdb) were renamed as follows:
A to K
B to L
C to M
for the kinase subunits
H to R
I to S
J to T
for the Hpr subunits.
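The renaming amounts to a small lookup table applied to the chain identifier, which occupies column 22 of ATOM and HETATM records in the PDB format. A sketch follows; the function name `rename_chain` and the example record are made up.

```python
TARGET_CHAIN_RENAMING = {
    "A": "K", "B": "L", "C": "M",   # kinase subunits
    "H": "R", "I": "S", "J": "T",   # Hpr subunits
}

def rename_chain(pdb_line):
    """Rename the chain ID (column 22) of an ATOM/HETATM record; leave other lines alone."""
    if pdb_line.startswith(("ATOM", "HETATM")) and len(pdb_line) > 21:
        new_id = TARGET_CHAIN_RENAMING.get(pdb_line[21], pdb_line[21])
        return pdb_line[:21] + new_id + pdb_line[22:]
    return pdb_line

# A made-up ATOM record, assembled field by field to keep the columns exact.
example = (
    "ATOM  "   # columns 1-6: record name
    "    1"    # columns 7-11: atom serial number
    " "        # column 12
    " N  "     # columns 13-16: atom name
    " "        # column 17: alternate location indicator
    "SER"      # columns 18-20: residue name
    " "        # column 21
    "H"        # column 22: chain ID
    "  46"     # columns 23-26: residue number
)

print(rename_chain(example))
```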
FITTED PDB
Directory
FittedPDB contains the files with the coordinates
of the predicted and target complexes superimposed, following
the first fit, in which the kinase subunits have been superimposed
(using the listed rotation matrix and translation vector).
Displaying such a file with Rasmol, for each prediction,
and coloring the predicted and target coordinates differently
shows clearly the differences between the predicted and
observed positions of the Hpr molecules relative to the
kinase subunits. The files are in PDB format and have
now been regenerated according to the new fits.
CLOSE_CONTACTS
Directory
CloseContacts contains one file per predicted
interface with information on the clashes
in each predicted interface. We noticed that many predicted
complexes seemed to have an unduly large number of atomic clashes,
probably due to the use of simplified models.
For example, part of the file
cc.T01_P27.1.A.d looks
like this:
B. SUBTILIS Hpr Atom LACTOBACILLUS Hpr KINASE Atom Distance
A 47 .ILE.O E 237 .TRP.CH2 0.34
A 47 .ILE.CD1 E 237 .TRP.NE1 0.56
A 15 .HIS.CD2 E 235 .GLU.C 0.63
A 42 .VAL.CB E 239 .PRO.CB 0.65
.
.
--
A 15 .HIS.CB E 236 .ASN.OD1 1.01
A 52 .SER.OG E 240 .ASP.CB 1.09
A 47 .ILE.C E 237 .TRP.CH2 1.12
A 47 .ILE.O E 237 .TRP.CZ2 1.17
.
.
--
A 54 .GLY.O E 235 .GLU.OE1 2.01
A 54 .GLY.C E 235 .GLU.CD 2.01
A 42 .VAL.CG1 E 239 .PRO.O 2.01
A 15 .HIS.CD2 E 236 .ASN.CA 2.02
A 49 .GLY.O E 239 .PRO.C 2.03
The atoms of the Hpr and Kinase subunits, respectively,
forming the clashes are listed. The list of clashes is divided into distance ranges of 0-1, 1-2 and
2-3 Å.
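The grouping into distance ranges can be sketched as follows, using the distances shown in the example above. How the program treats distances falling exactly on 1 or 2 Å is not stated, so the half-open ranges used here are an assumption; the function name `bin_clashes` is hypothetical.

```python
def bin_clashes(distances):
    """Group clash distances (Angstroms) into the 0-1, 1-2 and 2-3 ranges.

    Ranges are taken as [0, 1), [1, 2) and [2, 3]; the boundary handling
    is an assumption. Distances above 3 A are ignored.
    """
    bins = {"0-1": [], "1-2": [], "2-3": []}
    for d in distances:
        if d < 1.0:
            bins["0-1"].append(d)
        elif d < 2.0:
            bins["1-2"].append(d)
        elif d <= 3.0:
            bins["2-3"].append(d)
    return bins

# Distances from the lines shown in the cc.T01_P27.1.A.d excerpt.
excerpt = [0.34, 0.56, 0.63, 0.65,
           1.01, 1.09, 1.12, 1.17,
           2.01, 2.01, 2.01, 2.02, 2.03]

for label, values in bin_clashes(excerpt).items():
    print(label, len(values))
```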
Email Problems or Queries to Kim
Henrick, John Tate