
Visualising InterPro data

InterPro member database signature matches to proteins are displayed graphically (Figure 3). The protein family membership section provides information about the InterPro family entry to which the protein is predicted to belong.

First, the protein sequence is shown along a grey bar. Below it, up to six sections are displayed in the ‘Summary’ view, depending on the information available:

  • AlphaFold Confidence – a bar showing the coloured confidence score of the corresponding AlphaFold prediction (pLDDT) is displayed on top if available
  • Families – the representative member database family signature matching the protein is shown on one line
  • Domains – the different domains found in the protein are represented on one line, with coloured bars indicating the location of representative domain signature matches on the protein sequence. A second line shows the domains predicted by The Encyclopedia of Domains (TED)
  • Intrinsically Disordered Regions – data from MobiDB and DisProt are shown
  • Conserved Residues – residue annotations are provided by the CDD, SFLD and PIRSR databases
  • Pathogenic And Likely Pathogenic Variants – this information is provided by UniProt

By default, a shortened version of these sections is shown, because ‘Summary’ is selected under ‘Feature Display Mode’ at the top of the protein sequence viewer. If ‘Full’ is selected instead, all member database signatures matching the protein are displayed in the corresponding section, with coloured bars indicating the location of every match on the protein sequence. In addition to the sections available in the ‘Summary’ view, extra annotations (e.g. Binding Sites, Active Sites, Pfam-N, FunFams …) are displayed when available. The selected option is remembered by the browser and applied throughout the website.
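The difference between the two display modes can be pictured as a simple filter over the list of matches. The sketch below is only an illustration and not the code used by the InterPro website: the `visible_matches` function, the `representative` flag and the record layout are hypothetical, and the accessions are placeholders.

```python
# Hypothetical sketch: the "representative" flag and the record layout are
# assumptions made for this example, not the InterPro website's data model.

def visible_matches(matches, mode="summary"):
    """Return the matches a viewer would draw in the given display mode."""
    if mode == "full":
        # 'Full' shows every member database signature match.
        return list(matches)
    if mode == "summary":
        # 'Summary' keeps only the matches flagged as representative.
        return [m for m in matches if m.get("representative", False)]
    raise ValueError(f"unknown display mode: {mode!r}")


# Placeholder records; the accessions are made up for illustration.
example_matches = [
    {"accession": "SIG_A", "start": 10, "end": 120, "representative": True},
    {"accession": "SIG_B", "start": 15, "end": 118, "representative": False},
]

print([m["accession"] for m in visible_matches(example_matches)])
print([m["accession"] for m in visible_matches(example_matches, "full")])
```

Under these assumptions, the first call lists only the representative match, while the second lists both.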

You can also expand or collapse each individual section by clicking on the black triangle on the left-hand side of its title.

Each matching member database signature is displayed on a separate line. They can be coloured by accession, member database or domain relationship.

Member database signatures of type Domain, Repeat or Homologous Superfamily, as well as RepeatsDB annotations, are shown in the expanded Domains category.
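The grouping rule for the Domains category can be written down as a small lookup. This is a minimal sketch for illustration only, not the InterPro website's implementation: the `section_for` function, the `DOMAIN_SECTION_TYPES` set, the field names and the ‘Other’ fallback are all hypothetical.

```python
# Hypothetical sketch: names, field layout and the "Other" fallback are
# assumptions made for this example.

# Signature types that are listed under the Domains category.
DOMAIN_SECTION_TYPES = {"domain", "repeat", "homologous_superfamily"}


def section_for(signature):
    """Return the viewer section a signature record would be listed under."""
    # RepeatsDB annotations go to Domains regardless of their type.
    if signature.get("source_database", "").lower() == "repeatsdb":
        return "Domains"
    sig_type = signature.get("type", "").lower().replace(" ", "_")
    if sig_type in DOMAIN_SECTION_TYPES:
        return "Domains"
    if sig_type == "family":
        return "Families"
    # Anything else is left to other sections (assumed fallback).
    return "Other"


print(section_for({"type": "Homologous Superfamily", "source_database": "cathgene3d"}))
print(section_for({"type": "Family", "source_database": "pfam"}))
```

Normalising the type string (lower case, underscores) lets the same lookup accept both ‘Homologous Superfamily’ and ‘homologous_superfamily’.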

On the right-hand side, the accession, short name or name of each signature is shown. If the signature is integrated in an InterPro entry, the relevant InterPro entry information will be shown immediately above. You can access further information about an InterPro entry or a member database signature by clicking on the accession/name on the right side, which will take you to the corresponding InterPro page.

The tooltip that appears when hovering over a signature bar also shows whether it is integrated in an InterPro entry or not.

Above the sequence viewer, icons let you display the viewer in full screen and zoom in and out of the protein sequence. The Options button lets you customise the display by changing the colouring, disabling tooltip information, or choosing whether accessions, full names and/or short names are displayed for signatures and for the InterPro entries in which they are integrated.

InterPro-N predicted matches are distinguished by a leading sparkles icon on the right-hand label in the protein sequence viewer and by a top-right superscript on the InterPro or member database accession number in the tooltip.

Click on the icons in Figure 3 to get information about each section of the page.

Figure 3 The InterPro web interface showing protein matches (A8KBH6).

The protein sequence viewer is slightly different on the InterProScan results page. The representative family is not yet implemented in InterProScan; instead, all the family matches are displayed in the ‘Summary’ view.