CAPRI: Critical Assessment of PRediction of Interactions

Community-wide experiment on the comparative evaluation of protein-protein docking for structure prediction

Hosted By EMBL/EBI-MSD Group


The First Protein-Protein Interaction Structure Prediction Experiment: CAPRI

Version 1 Dated 2 July 2001
From Mike Sternberg (


In June 2001, a meeting organised by Ilya Vakser and Sandor Vajda was held in Charleston, SC. Several groups involved in the development of protein-protein docking algorithms were present. In addition, John Moult, who is involved in organising the CASP blind evaluation of protein structure prediction, attended. It was agreed
1) To hold a blind protein-protein docking evaluation (CAPRI) during 2002.
2) To hold a comparative evaluation of protein-protein docking algorithms on a benchmark set of known targets.
This document outlines the agreed protocols.


2.1) Organising committee
The organising committee consists of
Joel Janin
Mike Sternberg
Lynn Ten Eyck
who will consult with
John Moult

The committee will aim to establish links with the experimental community to obtain targets.

2.2) Targets

The major problem facing the docking evaluation is the difficulty of obtaining targets, as far fewer complexes are solved than individual components. In addition, for the evaluation, the experimentalists may have to provide coordinates of the components (see below), which would effectively make their coordinates public prior to publication.

The following targets would be used, in decreasing order of preference:
i)   The ideal target starts from the coordinates of the two unbound molecules, from which the structure of the complex is predicted.
ii)  The next best alternative is to have one bound and one unbound set of coordinates.
iii) If only bound coordinates are available for both components, then backbone coordinates only will be supplied for docking.

2.3 Evaluation Strategy

Unlike protein structure prediction, the prediction season will be an extended session, starting as soon as possible and ending roughly 1 Sept 2002. An assessor will need to be selected who will present the evaluation at a date to be announced. The longer-term nature of the prediction season raises two issues.
i)   It is accepted that groups will analyse their results as the coordinates of the targets become available. We need to decide whether groups who sign up to CAPRI should be prevented from publishing their results prior to the CAPRI meeting and from claiming success on one or a few targets. We welcome comments on this issue.
ii)  At CASP, the assessor is not aware of the identity of the group submitting an entry. With the longer prediction season, this will be hard to maintain. The evaluation criteria must be agreed in advance (or after the first submission). The assessor could still provide a general evaluation of the status of protein-protein docking, including comments on the quality of any side-chain modelling.

2.4 Evaluation criteria

We need to agree the rules for evaluation.

As the broad community of potential users is not familiar with the RMS deviation of superposed coordinates, the number of correct protein-protein contacts will be used.

Actually, most docking work published so far is evaluated using the RMS of superposed coordinates, and that makes structural sense. I am not sure it is a good idea to switch to residue contacts.

Our suggestion, for discussion, is that

A residue-residue contact occurs if any two atoms, one from each component, are closer than 4 A. A good prediction has at least 60% of the true residue-residue contacts in the model. A helpful prediction has at least 25% of the true residue-residue contacts in the model.
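As a sketch for discussion, the contact criterion above could be computed as follows (Python; the coordinate layout and all function names are illustrative assumptions, not part of the proposal):

```python
import math

CONTACT_CUTOFF = 4.0  # Angstroms, the proposed contact threshold

def residue_contacts(receptor, ligand, cutoff=CONTACT_CUTOFF):
    """Set of residue-residue contacts between two components.

    Each component is assumed to be a dict mapping a residue label to a
    list of atom (x, y, z) tuples. A contact exists if any atom of a
    receptor residue lies within `cutoff` of any atom of a ligand residue.
    """
    contacts = set()
    for res_r, atoms_r in receptor.items():
        for res_l, atoms_l in ligand.items():
            if any(math.dist(a, b) < cutoff
                   for a in atoms_r for b in atoms_l):
                contacts.add((res_r, res_l))
    return contacts

def classify(model_contacts, true_contacts):
    """Good: >= 60% of true contacts recovered; helpful: >= 25%."""
    frac = len(model_contacts & true_contacts) / len(true_contacts)
    if frac >= 0.60:
        return "good"
    if frac >= 0.25:
        return "helpful"
    return "incorrect"
```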

In addition, as a measure (but not the evaluation criterion), the RMS deviation of the C-alpha and of all atoms at the interface will be reported. First, the model of the larger component will be superposed on the bound larger component using C-alpha atoms. The interface is defined as any residue in one component having at least one atom within 10 A of an atom in the other component. The RMS quoted is for the interface residues of the smaller component.
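The interface definition and the reported RMS could be sketched as below (Python; the superposition step itself is not shown, and the data layout is the same illustrative assumption as above):

```python
import math

def interface_residues(comp_a, comp_b, cutoff=10.0):
    """Residues of comp_a with any atom within `cutoff` of any atom of comp_b.

    Components are dicts mapping residue labels to lists of (x, y, z) tuples.
    """
    iface = set()
    for res, atoms in comp_a.items():
        if any(math.dist(a, b) < cutoff
               for a in atoms
               for other in comp_b.values() for b in other):
            iface.add(res)
    return iface

def interface_rmsd(model, reference, residues):
    """RMS deviation over the listed interface residues.

    Assumes the model has already been superposed on the bound reference
    via the larger component's C-alpha atoms, and that atoms are listed
    in the same order in both structures.
    """
    sq = [math.dist(a, b) ** 2
          for res in residues
          for a, b in zip(model[res], reference[res])]
    return math.sqrt(sum(sq) / len(sq))
```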

Given the present status of the algorithms, submission of just one model will rarely provide a good prediction. We need to discuss the scoring of several submitted models. One suggestion is that each group submits 10 models ranked in order of preference. The rank of the first good prediction is evaluated, as is the rank of the first helpful model. The best score from the following table is the result for a group's prediction.

   Rank		Score for good model		Score for helpful model
   1		30				15
   2		20				10
   3		18				 9
   4		16				 8
   5		14				 7
   6		12				 6
   7		10				 5
   8		 8				 4
   9		 6				 3
   10		 4				 2

The final score for all targets is the sum of scores.
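The table above can be expressed compactly, since the helpful column is exactly half of the good column at every rank. A sketch (Python; function and variable names are illustrative):

```python
# Good-model scores by rank, taken directly from the proposed table.
GOOD_SCORE = {1: 30, 2: 20, 3: 18, 4: 16, 5: 14,
              6: 12, 7: 10, 8: 8, 9: 6, 10: 4}
# The helpful column is half the good column throughout.
HELPFUL_SCORE = {rank: s // 2 for rank, s in GOOD_SCORE.items()}

def target_score(model_classes):
    """Score one target from a ranked list of up to 10 model classes.

    model_classes: list of "good" / "helpful" / "incorrect", in rank order.
    Returns the best score over all (rank, class) pairs, which coincides
    with scoring the first good and first helpful model and taking the max,
    because scores decrease with rank.
    """
    best = 0
    for rank, cls in enumerate(model_classes, start=1):
        if cls == "good":
            best = max(best, GOOD_SCORE[rank])
        elif cls == "helpful":
            best = max(best, HELPFUL_SCORE[rank])
    return best
```

The final result for a group would then be the sum of `target_score` over all targets.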

The above distinction between a "good" model and a "helpful" model may be too abrupt. How about:

final score = sum over ranks of (coefficient for the rank * percent of correctly predicted residue contacts, counted only if it is larger than 20%)
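One reading of this smoother alternative, as a sketch (Python; only the 20% floor comes from the text above, and the rank coefficients are left as an input because the proposal does not fix them):

```python
def continuous_score(percent_by_rank, coeff):
    """Smoother alternative to the good/helpful table.

    percent_by_rank: percent of true residue contacts recovered by each
    ranked model, in rank order.
    coeff: dict mapping rank -> coefficient (values to be agreed; not
    specified in the proposal).
    Contributions at or below the 20% floor are dropped.
    """
    return sum(coeff[rank] * pct
               for rank, pct in enumerate(percent_by_rank, start=1)
               if pct > 20.0)
```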

2.5 Manual / Fully-automated Predictions

In keeping with CASP rules, manual intervention including use of the literature, can be used in producing the entries.

However we would like to move towards evaluation of fully automated server predictions. At present such servers are not available. By 2 Jan 2002, we should decide if a server evaluation is practicable.

2.6 CASP Team

We require the CASP team at Lawrence Livermore to set up the appropriate e-mail / web software to handle target entries and submissions.


3.1) Committee

A set of people for the benchmark needs to be established.

3.2) Objectives

The aim is for all groups with docking algorithms to deposit their program(s) with Zhiping Weng at Boston University. She will run these algorithms on a benchmark set. The results of this docking will be used to generate decoy sets for further development. We need a volunteer to run Zhiping's programs. Sandor Vajda, who will not take part in the initial generation, will generate the decoy set. An evaluator will be identified.

3.3) Algorithms

The algorithm must be implemented under Linux or on an SGI. The submission must be in the form of a script that can be run as: run myprogram protein1 protein2 outputfile number_of_output_complexes. The output is a set of PDB files for the complexes. We will need to work out exact format details. For one protein-protein complex, the algorithm must take no longer than 2 weeks on a 1GHz Linux box or ? on the ? MHz SGI (Zhiping to provide details). The algorithms need to be sent to and implemented at Boston by 2 Jan 2002.
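The agreed command-line contract could be checked with a small helper along these lines (Python; `build_command` and the program name "mydock" are hypothetical, only the argument order comes from the text):

```python
def build_command(program, protein1, protein2, outputfile, n_complexes):
    """Assemble the agreed benchmark invocation:
    run myprogram protein1 protein2 outputfile number_of_output_complexes

    Only builds the argument list; it does not execute anything.
    """
    if n_complexes < 1:
        raise ValueError("must request at least one output complex")
    return ["run", program, protein1, protein2, outputfile, str(n_complexes)]
```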

I thought we agreed that the CPU limit was 1 week on a Linux box?

I would say: the algorithm must take no longer than 1 week on a 1GHz, 1GB-RAM Linux processor (CPU time). We find an R10000 SGI to be typically slower than a 1GHz Linux box, so I would say the algorithm must not take longer than 2 weeks on an R10000 SGI processor.

Each algorithm may use only information from the two sets of coordinates. No functional site prediction, multiple sequence alignment data or protein-type-specific information can be used.

For antibodies, the coordinates will be cut down to the Fv region.

The algorithms must produce a ranked list of model coordinates. In addition, Zhiping will implement and apply to all entries a biological filter generated to reflect some knowledge of the binding site on one of the proteins.

This mainly concerns antibodies. My group can define the CDR regions using sequence information, if that is a good idea.
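A biological filter of the kind described could look like the sketch below (Python; the minimum-hit threshold and all names are illustrative assumptions, with a known binding site such as the antibody CDRs as input):

```python
def biological_filter(models, binding_site, min_hits=3):
    """Keep only ranked models consistent with a known binding site.

    models: list of (rank, contacts) pairs, where contacts is a set of
    (receptor_residue, ligand_residue) tuples.
    binding_site: set of receptor residues known (or predicted) to bind,
    e.g. antibody CDR residues.
    A model passes if at least `min_hits` distinct binding-site residues
    appear in its interface; the threshold is an assumption for this sketch.
    """
    kept = []
    for rank, contacts in models:
        hits = {r for r, _ in contacts if r in binding_site}
        if len(hits) >= min_hits:
            kept.append((rank, contacts))
    return kept
```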

3.4) The targets

The targets will be a set of non-homologous complexes (or homologous complexes if the binding modes are totally different). Unbound coordinates for at least one of the starting proteins (with side chains) must be available. The docking of monomers into multimers will NOT be included. Joel Janin will vet the list for biological validity. A list of unbound/unbound docking systems is available at the ICRF ( Other groups have their own test systems. However, alternative coordinates or homologues should be used to prevent bias in favour of any group's data set. Alternatively, targets could be selected from different groups.

3.5) Evaluation

The fraction of correct contacts will be scored. I suggest using the same table to score the results both after the initial docking and after the biological filter.

3.6) End of benchmark

We would aim to finish the benchmark and evaluation in time for discussion in Dec 2002.


i)   Discuss this document
ii)  Identify other groups interested in docking
iii) Obtain volunteers for:

- committee for benchmark

I would be happy to be on this

- evaluation of CAPRI
- evaluation of benchmark
- group to identify targets for benchmark (best a few people)

I would be happy to be on this. Actually, we have generated a list with 46 targets. I sent the list to Joel and he added a few more. I will forward the email to everyone in just a minute.

- group to run Zhiping's program
- CASP team to agree to process coordinates by e-mail
iv)  finalise procedures
v)   get target for CAPRI (Thanks to Joel we already have one target).