Where do the data come from?
Methods for molecular interaction identification
There are two approaches to gaining information about molecular interactions: experimental detection and computational prediction.
In this course we will concentrate on the experimental methods, but there is an increasing variety of computational methods that can predict protein-protein interactions. Because only a small proportion of all the molecular interactions in an organism are currently covered by experimental data, these methods provide a useful resource that can help us to analyse under-represented regions of the interactome. You can find out more about the different methods used to predict protein-protein interactions in a comprehensive Wikipedia entry entitled protein-protein interaction prediction.
A wide variety of experimental methods can be used to detect protein-protein interactions. It is important to realise that there is no perfect approach. Each method has its limitations and is, to some extent, prone to artefacts. Therefore, it is advisable to check interactions using more than one approach: interactions detected by more than one method are more likely to be "real"1,2,3.
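The idea of checking interactions with more than one approach can be sketched in a few lines of Python. This is only an illustration: the protein names, the observation list and the helper functions `methods_per_interaction` and `supported_by_multiple_methods` are hypothetical and do not come from any particular database or tool.

```python
from collections import defaultdict


def methods_per_interaction(observations):
    """Map each unordered protein pair to the set of methods that detected it."""
    support = defaultdict(set)
    for protein_a, protein_b, method in observations:
        # A-B and B-A describe the same interaction, so use an unordered key.
        pair = frozenset((protein_a, protein_b))
        support[pair].add(method)
    return dict(support)


def supported_by_multiple_methods(observations, min_methods=2):
    """Return the pairs detected by at least `min_methods` distinct methods."""
    support = methods_per_interaction(observations)
    return {pair for pair, methods in support.items() if len(methods) >= min_methods}


# Hypothetical observations: (protein A, protein B, detection method).
observations = [
    ("P1", "P2", "Y2H"),
    ("P2", "P1", "AP-MS"),
    ("P1", "P3", "Y2H"),
    ("P1", "P3", "Y2H"),  # same method reported twice: still one method
]

confirmed = supported_by_multiple_methods(observations)
```

Here only the P1/P2 pair is kept, because it is the only one seen by two different methods. Counting distinct methods rather than the number of reports is a deliberate choice: repeated reports from the same assay can share the same artefacts.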
Next, we will have a look at some ways to experimentally identify protein-protein interactions.
High-Throughput: Yeast two hybrid
The most frequently used laboratory method for experimentally determining molecular interactions is yeast two-hybrid (Y2H) screening4.
Y2H is a complementation assay. The readout is based on a transcription factor that is split into two separate parts, the DNA-binding domain (BD) and the activation domain (AD). The BD is fused to one protein of interest, the bait (X), and the AD is fused to the other, the prey (Y). Because neither part can drive transcription on its own, the readout can only take place when the two halves are brought into close proximity. If the bait and prey proteins bind to each other when expressed in a yeast cell, the transcription machinery becomes activated and a reporter gene is turned on (Figure 3).
Figure 3. The yeast two-hybrid (Y2H) concept and a typical readout. The BD fused to the bait protein (X) and the AD fused to the prey protein (Y) are expressed in yeast cells. If proteins X and Y interact, the AD is recruited to the DNA-bound BD and activates RNA polymerase.
An example readout of a Y2H assay with two bait proteins (Bait 1 and Bait 2) and five prey proteins (1 to 5). In this example, positive interactions are shown by colony growth. Readouts for a Y2H assay can also be detected by DNA sequencing or colorimetric methods such as the beta-galactosidase assay (Figure from Koh et al.5).
Advantages:
- It is an in vivo system in which binding sites can be accurately mapped.
Disadvantages:
- False positives occur when a yeast protein acts as a bridge for the interaction.
- False positives also occur when the interacting proteins would not normally be present in the same cellular compartment, in the same cell type, or at the same time.
- The bait or prey protein may fail to be expressed or may be toxic to the yeast cell.
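The colony-growth readout described for Figure 3 is essentially a bait-by-prey grid of yes/no results. A minimal sketch of how such a grid could be turned into a list of candidate interactions is shown below; the growth values are invented for illustration and do not reproduce the data in the figure, and `positive_pairs` is a hypothetical helper.

```python
def positive_pairs(growth):
    """List the (bait, prey) pairs for which a colony grew."""
    return [
        (bait, prey)
        for bait, preys in growth.items()
        for prey, grew in preys.items()
        if grew
    ]


# Hypothetical readout: True means a colony grew, i.e. the reporter was turned on.
growth = {
    "Bait 1": {"Prey 1": True, "Prey 2": False, "Prey 3": False, "Prey 4": True, "Prey 5": False},
    "Bait 2": {"Prey 1": False, "Prey 2": False, "Prey 3": True, "Prey 4": False, "Prey 5": False},
}

hits = positive_pairs(growth)
```

Each hit is only a candidate: given the caveats listed above, a positive colony does not by itself show that the two proteins interact in their native context.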
High-Throughput: Affinity Purification Mass Spectrometry
The second high-throughput method that has contributed significantly to the growth in published protein-protein interaction data is affinity purification mass spectrometry (AP-MS, Figure 4).
In AP-MS, a single protein or molecule of interest is immobilised on a matrix as a bait. A protein mixture is then passed through the matrix and interacting partners (prey) are captured by the bait protein. A mass spectrometry technique (for example MALDI or LC-MS/MS) is then used to identify the captured proteins.
Figure 4. Affinity purification and mass spectrometry (AP-MS). The bait protein (yellow) is immobilised on a matrix. A protein mixture is passed through and only the interacting partners (prey) are retained. In the following step the prey proteins are removed, digested with a protease and the resulting peptides are analysed by MS 5.
Advantages:
- Depending on the sensitivity of the MS approach and the affinity of the interacting partners, this method can examine interactions among multiple proteins at subpicomole concentrations.
- The prey proteins are present in their native state and at their native concentration (so long as they are not affected by the sample lysis process).
Disadvantages:
- Prey proteins without a peptide signature recognisable by MS (owing to obscure post-translational modifications) or present in very low amounts will not be identified.
- Biologically relevant transient interactions and weak interactions may be missed.
- Mixing of compartments during cell lysis or purification is a potential source of false positives. For example, interactions between proteins that would not normally be in the same cellular compartment may confound your results.
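One simple way to act on the compartment-mixing caveat is to flag detected interactions whose partners share no annotated cellular compartment. The sketch below assumes you already have a compartment annotation for each protein; the annotation dictionary, the protein names and both helper functions are hypothetical.

```python
def shares_compartment(protein_a, protein_b, compartments):
    """True if both proteins are annotated to at least one common compartment."""
    return bool(compartments.get(protein_a, set()) & compartments.get(protein_b, set()))


def flag_compartment_mismatches(interactions, compartments):
    """Return interactions whose partners are both annotated but share no compartment."""
    flagged = []
    for protein_a, protein_b in interactions:
        both_annotated = protein_a in compartments and protein_b in compartments
        if both_annotated and not shares_compartment(protein_a, protein_b, compartments):
            flagged.append((protein_a, protein_b))
    return flagged


# Hypothetical annotations and detected interactions.
compartments = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytosol"},
    "P3": {"mitochondrion"},
}
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]

suspicious = flag_compartment_mismatches(interactions, compartments)
```

Pairs involving an unannotated protein (P4 here) are deliberately not flagged, since missing annotation is not evidence of a mismatch. A flagged pair is a prompt for closer inspection rather than proof of a false positive, because annotations are themselves incomplete.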
There is an overwhelming variety of techniques that can be used to detect protein-protein interactions using low- or medium-throughput setups. Next we summarise some methods that are often used to improve the confidence of an interaction detected by high-throughput methods or on their own merits in small-scale experiments.
Co-immunoprecipitation (Co-IP) is the immunoprecipitation of intact protein complexes (i.e. antigen along with any proteins or ligands that are bound to it); see Figure 5. Co-IP works by selecting an antibody that targets a known protein that is believed to be a member of a larger complex of proteins. By targeting this known member of a complex with an antibody, you might be able to pull the entire protein complex out of solution and thereby identify unknown members of the complex.
This technique works when the proteins involved in the complex bind to each other tightly, making it possible to pull multiple members of the complex out of solution by latching onto one member with an antibody.
The concept of pulling protein complexes out of solution is sometimes referred to as a "pull-down".
Co-IP has traditionally been considered the "gold standard" assay for protein-protein interactions, but its caveats are very similar to those of AP-MS, as it is also an affinity purification method.
Figure 5. Protein complex immunoprecipitation (Co-IP) method. Addition of antibody to protein extract. Target proteins are immunoprecipitated with the antibody. Coupling of antibody to beads. Isolation of protein complexes.
X-ray crystallography is a method of determining the arrangement of atoms within a crystal, in which a beam of X-rays strikes a crystal and is diffracted into many specific directions. From the angles and intensities of these diffracted beams, a crystallographer can produce a three-dimensional picture of the density of electrons within the crystal (Figure 6). From this electron density, the mean positions of the atoms in the crystal can be determined, as well as their chemical bonds and other types of information.
This method is considered to be another "gold standard" because it provides high-quality data and an extremely deep level of detail about interacting surfaces and residues (at the level of atoms and chemical bonds).
However, it is technically very challenging, is very low-throughput and is not free from false negatives or false positives: not every protein is amenable to co-crystallisation, and some proteins that co-crystallise in vitro do not interact in a physiological context.
Figure 6. X-ray crystallography is used to obtain detailed structural and chemical insights for selected interactions. The figure shows a model of the cullin complex 13.
Sometimes it is necessary to use methods that can be performed in mammalian cell lines, providing a more physiological environment for studies using mammalian proteins. The following techniques can be applied in medium- or high-throughput setups and have become widely used in the past few years:
MAPPIT: mammalian protein-protein interaction trap
For more information on these techniques, see Reference 9.
Finally, one of the few ways of identifying transient interactions missed by other methods is the enzyme assay. These assays take an enzyme-catalysed reaction as evidence that, for example, an enzyme interacts with its substrate. However, they can only be performed in vitro with purified proteins, as there are too many unknowns when they are performed with whole-cell lysates. Moreover, many enzymes, most prominently kinases, are promiscuous in vitro. This can lead to a large number of false positives.