Assessing reliability and measuring confidence
An important concern in network analysis is whether each edge in the interaction network can be trusted to represent a “real” biological interaction. Given the noise inherent in current interactome data, we need to be stringent when evaluating the protein-protein interaction data we use in our analysis. At the same time, interactome coverage is incomplete and patchy, so we will not always have the luxury of filtering out less reliable evidence.
There are many different methods for ascertaining reliability and giving a measure of confidence. Some strategies make use of:
- Contextual biological information about the proteins or molecules involved in the interaction, for example overlapping co-expression patterns (8, 9).
- Counts of how many times a given interaction has been reported in the literature, as a measure of orthogonal experimental validation. This is a popular and straightforward approach, and there are more elaborate variations of it, such as MIscore (see boxed text).
- Aggregated methods that use a number of different strategies and integrate them into a single score, such as INTscore (10).
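The literature-count strategy in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular database: the evidence records, the protein and publication identifiers, the function name `count_publications` and the threshold of two publications are all assumptions made for the example.

```python
from collections import defaultdict


def count_publications(evidence):
    """Return the number of distinct publications reporting each protein pair.

    `evidence` is an iterable of (protein_a, protein_b, publication_id) tuples.
    Pairs are treated as unordered, so (A, B) and (B, A) count as the same pair.
    """
    publications = defaultdict(set)
    for protein_a, protein_b, publication_id in evidence:
        pair = frozenset((protein_a, protein_b))
        publications[pair].add(publication_id)
    return {pair: len(ids) for pair, ids in publications.items()}


# Hypothetical evidence records, for illustration only.
evidence = [
    ("P1", "P2", "pub_A"),
    ("P2", "P1", "pub_B"),  # same pair, reported in a second publication
    ("P1", "P2", "pub_A"),  # same pair and publication: not counted twice
    ("P3", "P4", "pub_C"),
]

counts = count_publications(evidence)

# Keep only the pairs reported in at least two publications (arbitrary threshold).
min_publications = 2
supported = {pair for pair, n in counts.items() if n >= min_publications}
```

Because coverage is patchy, a strict threshold like this one can discard genuine interactions that have simply been studied less, so the cut-off should be chosen with the data set in mind.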
MIscore is a method for assessing the reliability of protein-protein interaction data that builds on community data standards (11). It estimates a confidence score by weighting all the available evidence for an interacting pair of proteins. Evidence from different sources can be weighted together, provided the data are represented following the standards created by the IMEx consortium.
As shown in Figure 25, the method weights the:
- number of publications;
- detection method;
- interaction evidence type.
Different interaction detection methods and interaction types have different weights, assigned by a group of expert curators. These parameters are aggregated for each interacting pair and then normalised, giving a quantitative measure of how much experimental evidence there is behind a given interaction.
Figure 25 MIscore calculates a normalised composite score for an interaction based on the number of publications reporting the interaction, the reported interaction detection methods and the interaction types. Reprinted from Villaveces et al. Merging and scoring molecular interactions utilising existing community standards: tools, use-cases and a case study. Database (Oxford), 2015 (11). By permission of Oxford University Press.
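To make the idea of aggregating and normalising weighted evidence more concrete, here is a simplified Python sketch in the spirit of the approach described above. It is not the published MIscore formula: the weight values, the logarithmic saturation, the caps and the equal averaging of the three components are all assumptions chosen for illustration, and the function and field names are hypothetical.

```python
import math

# Placeholder weights, NOT the curator-assigned values used by MIscore.
METHOD_WEIGHTS = {"biochemical": 1.0, "two hybrid": 0.6, "imaging": 0.3}
TYPE_WEIGHTS = {"direct interaction": 1.0, "physical association": 0.6,
                "colocalization": 0.3}
DEFAULT_WEIGHT = 0.05  # assumed weight for terms missing from the tables


def saturating(value, cap):
    """Scale a non-negative value into [0, 1] on a log scale.

    Values at or above `cap` score 1, so extra evidence has diminishing returns.
    """
    return min(1.0, math.log(value + 1) / math.log(cap + 1))


def composite_score(evidence, publication_cap=7, weight_cap=7.0):
    """Combine publication count, method weights and type weights into one score.

    `evidence` is a list of dicts with the keys "publication", "method", "type".
    """
    n_publications = len({e["publication"] for e in evidence})
    method_sum = sum(METHOD_WEIGHTS.get(e["method"], DEFAULT_WEIGHT) for e in evidence)
    type_sum = sum(TYPE_WEIGHTS.get(e["type"], DEFAULT_WEIGHT) for e in evidence)
    components = [
        saturating(n_publications, publication_cap),
        saturating(method_sum, weight_cap),
        saturating(type_sum, weight_cap),
    ]
    # Equal weighting of the three components is an assumption of this sketch.
    return sum(components) / len(components)


# Hypothetical evidence for two interacting pairs.
well_supported = [
    {"publication": "pub_A", "method": "biochemical", "type": "direct interaction"},
    {"publication": "pub_B", "method": "biochemical", "type": "direct interaction"},
    {"publication": "pub_C", "method": "two hybrid", "type": "physical association"},
]
weakly_supported = [
    {"publication": "pub_D", "method": "imaging", "type": "colocalization"},
]

strong_score = composite_score(well_supported)
weak_score = composite_score(weakly_supported)
```

In this sketch an interaction backed by several publications and by highly weighted methods and types scores closer to 1, while a single low-weight observation scores closer to 0.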
Now let’s have a look at some strategies we can use to extract information from a network.