A guided example
This guided example walks through the calculations used to identify overlapping entries.
Let’s consider the following sets:
- IPR1 = 19539 proteins
- IPR2 = 37052 proteins
Number of unique proteins (Union):
- IPR1 ∪ IPR2 = 38081
Intersecting proteins:
- IPR1 ∩ IPR2 = 17534
- IPR2 ∩ IPR1 = 16950

Jaccard index score calculation
Using the Jaccard index formula we can calculate:
- Jaccard index (IPR1 ∩ IPR2) = 17534/38081 = 0.46
- Jaccard index (IPR2 ∩ IPR1) = 16950/38081 = 0.45
For consistency, the average of the two Jaccard indices is calculated: JI avg = (0.46 + 0.45) / 2 = 0.455
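The Jaccard arithmetic above can be reproduced with a minimal Python sketch. The function name `jaccard_index` and the constant names are hypothetical, chosen only for illustration; the counts are copied from this example, and rounding to two decimals before averaging mirrors the steps shown above rather than any documented implementation.

```python
# Counts copied from the worked example above.
UNION = 38081              # IPR1 ∪ IPR2
INTERSECTION_1_2 = 17534   # IPR1 ∩ IPR2
INTERSECTION_2_1 = 16950   # IPR2 ∩ IPR1


def jaccard_index(intersection: int, union: int) -> float:
    """Return intersection / union (hypothetical helper for this example)."""
    if union <= 0:
        raise ValueError("union must be a positive count")
    return intersection / union


# Round each score to two decimals, as in the example, then average them.
ji_1_2 = round(jaccard_index(INTERSECTION_1_2, UNION), 2)
ji_2_1 = round(jaccard_index(INTERSECTION_2_1, UNION), 2)
ji_avg = (ji_1_2 + ji_2_1) / 2

print(ji_1_2, ji_2_1, ji_avg)
```

Both directional scores and their average stay well below 0.75, the threshold used in the conclusion of this example.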
Containment score calculation
Applying the containment index formula gives the following:
- Containment (IPR1, IPR2) = 17534/19539 = 0.90
- Containment (IPR2, IPR1) = 16950/37052 = 0.46
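The two containment scores above can be checked in the same way. This is a sketch under the assumption, read off the numbers in this example, that containment divides the intersection count by the size of the first-named entry; the helper name `containment` is hypothetical.

```python
# Counts copied from the worked example above.
IPR1_SIZE = 19539
IPR2_SIZE = 37052
INTERSECTION_1_2 = 17534   # IPR1 ∩ IPR2
INTERSECTION_2_1 = 16950   # IPR2 ∩ IPR1


def containment(intersection: int, set_size: int) -> float:
    """Return the fraction of a set's proteins that fall in the intersection."""
    if set_size <= 0:
        raise ValueError("set_size must be a positive count")
    return intersection / set_size


# Round to two decimals, as in the example.
c_1_2 = round(containment(INTERSECTION_1_2, IPR1_SIZE), 2)
c_2_1 = round(containment(INTERSECTION_2_1, IPR2_SIZE), 2)

print(c_1_2, c_2_1)
```

Unlike the Jaccard index, containment is asymmetric by design: the denominator is the size of one entry rather than the union, so a small entry largely covered by a larger one scores high in one direction and low in the other.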
Conclusion
In summary, the calculations show:
- Jaccard index < 0.75
- Containment (IPR1, IPR2) > 0.75
- Containment (IPR2, IPR1) < 0.75
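The three comparisons above can be sketched as a simple threshold check. The 0.75 threshold and the scores come from this example; the function name `exceeds_threshold` is hypothetical, and treating the pair as overlapping when at least one score exceeds the threshold is an assumption inferred from this single example, not a documented rule.

```python
THRESHOLD = 0.75  # threshold used in this example

# Scores taken from the calculations in this example.
scores = {
    "jaccard_avg": (0.46 + 0.45) / 2,
    "containment(IPR1, IPR2)": 0.90,
    "containment(IPR2, IPR1)": 0.46,
}


def exceeds_threshold(values: dict, threshold: float = THRESHOLD) -> dict:
    """Map each score name to True if the score is above the threshold."""
    return {name: value > threshold for name, value in values.items()}


flags = exceeds_threshold(scores)

# Assumption: one score above the threshold is enough to flag an overlap.
is_overlapping = any(flags.values())

for name, flag in flags.items():
    print(f"{name}: {'>' if flag else '<='} {THRESHOLD}")
print("overlapping:", is_overlapping)
```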
From these results, IPR1 and IPR2 are considered overlapping entries, more precisely IPR1