Representing molecular interaction data

Representing molecular interactions can be challenging. A great deal of detail can be inferred from experimental datasets, such as the interacting surfaces in a protein–protein interaction (Figure 9). IntAct supports the representation of interacting domains down to the residue level, including required post-translational modifications (PTMs) and sequence mutations that have an impact on the interaction. The domains are represented as specific ranges of the underlying amino acid sequence. These ranges are re-mapped whenever the relevant UniProtKB protein sequence is updated.

Figure 9. Representation of interacting domains in a protein–protein interaction.
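To make the idea of a sequence range concrete, here is a minimal Python sketch of a binding region stored as residue positions, together with a deliberately naive way of re-mapping it after a sequence update. The names `FeatureRange` and `remap` and the toy sequences are hypothetical and chosen only for illustration. This is not IntAct's actual re-mapping procedure, which is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureRange:
    """A binding region as 1-based, inclusive residue positions (hypothetical type)."""
    start: int
    end: int

    def residues(self, sequence: str) -> str:
        # Convert 1-based inclusive positions to a Python slice.
        return sequence[self.start - 1:self.end]


def remap(feature: FeatureRange, old_seq: str, new_seq: str) -> Optional[FeatureRange]:
    """Naive re-mapping (an assumption, not IntAct's method).

    Looks for the feature's residues in the updated sequence and returns
    None if they are missing or occur more than once, because the range
    cannot then be placed unambiguously.
    """
    fragment = feature.residues(old_seq)
    if not fragment:
        return None
    first = new_seq.find(fragment)
    if first == -1 or new_seq.find(fragment, first + 1) != -1:
        return None
    return FeatureRange(first + 1, first + len(fragment))


# Toy sequences: the update adds two residues at the N-terminus.
old_sequence = "MKTAYIAKQR"
new_sequence = "MS" + old_sequence
binding_region = FeatureRange(4, 7)
remapped = remap(binding_region, old_sequence, new_sequence)
print(binding_region.residues(old_sequence), remapped)
```

In this toy case the region simply shifts by two positions. A real update can also change residues inside the region, which is why the sketch returns `None` instead of guessing.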

How to deal with complexes

Some experimental protocols, such as tandem affinity purification (TAP), generate complex data sets, depicting interactions in which more than two proteins are involved at the same time (n-ary interactions).

However, interaction data are often stored in tabular formats designed for quick, comprehensive searches. To simplify and speed up searches, it can be desirable to convert these complexes into sets of binary interactions. Two algorithms perform this conversion: the matrix model and the spoke model [2], as depicted in Figure 10. The spoke model pairs the bait with each co-purified protein, whereas the matrix model pairs every protein with every other protein. In this hypothetical example, take the bottom right protein complex (marked “reality”). A tandem affinity experiment (far left) might tell you that each of the other five proteins interacts with the red bait protein in the middle. In reality, the red protein has only one interactor, which is the yellow protein.

As you can see, both algorithms are somewhat misleading, but because the spoke model generates up to three times fewer false positives [3,4], IntAct uses the spoke model when data are exported in tabular format. Many people do not find spoke-expanded data useful and prefer to exclude them from their analyses, so IntAct gives you the option to filter your results to remove spoke-expanded interactions.
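The two expansion models can be sketched in a few lines of Python. This is an illustrative sketch only, not IntAct's export code. The function names and the prey labels other than "yellow" are hypothetical stand-ins for the five co-purified proteins in the example.

```python
from itertools import combinations


def spoke_expand(bait, preys):
    """Spoke model: pair the bait with each co-purified protein."""
    return [(bait, prey) for prey in preys]


def matrix_expand(bait, preys):
    """Matrix model: pair every participant with every other participant."""
    return list(combinations([bait, *preys], 2))


# One bait and five co-purified proteins, as in the hypothetical example.
# Only "red" and "yellow" come from the text; the other labels are placeholders.
bait = "red"
preys = ["yellow", "prey_2", "prey_3", "prey_4", "prey_5"]

spoke_pairs = spoke_expand(bait, preys)
matrix_pairs = matrix_expand(bait, preys)

print(len(spoke_pairs), "binary interactions from spoke expansion")
print(len(matrix_pairs), "binary interactions from matrix expansion")
```

For a purification with n participants, spoke expansion produces n - 1 pairs and matrix expansion produces n(n - 1)/2. That is 5 versus 15 pairs for the six proteins here. Both sets contain the one true red–yellow interaction, so every additional pair is a potential false positive.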

Figure 10. The matrix expansion algorithm and the spoke expansion algorithm converting complex interactions into binary ones.

Next, we will show how the data are displayed on the IntAct website.