InterPro entry types



InterPro entries are classified into one of four categories, depending on the biological entity they represent: protein family, domain, repeat or site.

The entry type is indicated by a specific icon (Figure 5), shown at the top left-hand side of each InterPro entry page.


Figure 5. Icons denoting the different types of entries (family, domain, repeat or site) that can be found in the InterPro database.

InterPro entry types: Family

An InterPro protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure. Protein families are often arranged into hierarchies, with proteins that share a common ancestor subdivided into smaller, more closely related groups. For example, steroid hormone receptors constitute a family of nuclear receptors responsible for signal transduction mediated by steroid hormones, and can be subclassified into different groups, including the liver X receptor subfamily (Figure 6). This subfamily consists of nuclear receptors that regulate the metabolism of several important lipids, including oxysterols.


Figure 6. Example of a protein family hierarchy. The steroid hormone receptor family can be subdivided into a number of smaller, closely related subfamilies.
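A family hierarchy of this kind can be thought of as a tree in which each subfamily points back to its parent family. The following is a minimal sketch of that idea, not InterPro's own data model: the class name FamilyEntry and its methods are hypothetical, and only the two families named in the text above are included.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FamilyEntry:
    """Hypothetical node in a protein family hierarchy."""
    name: str
    # Exclude the parent link from repr/eq to avoid walking the tree in circles.
    parent: Optional["FamilyEntry"] = field(default=None, repr=False, compare=False)
    children: List["FamilyEntry"] = field(default_factory=list)

    def add_child(self, name: str) -> "FamilyEntry":
        """Create a subfamily under this family and return it."""
        child = FamilyEntry(name=name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Return the names from the top-level family down to this entry."""
        names = []
        node: Optional["FamilyEntry"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# The two levels described in the text above.
steroid = FamilyEntry("Steroid hormone receptor")
lxr = steroid.add_child("Liver X receptor")

print(" > ".join(lxr.lineage()))
```

Walking up the parent links reflects the idea that every member of a subfamily is also a member of the broader family above it.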




InterPro entry types: Domain

Domains are distinct functional and/or structural units in a protein. Usually they are responsible for a particular function or interaction, contributing to the overall role of a protein. Domains can occur in a variety of biological contexts: similar domains are found in proteins with different functions.

For example, the pleckstrin homology (PH) domain is a small modular domain that occurs in a large variety of proteins and is involved in phospholipid binding. One group of proteins containing a PH domain is the beta-adrenergic receptor kinases (Figure 7). Four domains have been identified in these proteins: an RGS (regulator of G protein signalling) domain, a protein kinase (PK) domain, an AGCK domain, involved in regulation by phosphorylation, and a C-terminal PH domain.


Figure 7. Graphical representation of the domain architecture of beta-adrenergic receptor kinases.
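A domain architecture can be summarised as the list of domains ordered by their position along the protein sequence. The sketch below illustrates this with the four domains named above. The DomainHit type and the architecture function are hypothetical, the start and end coordinates are made-up placeholders rather than real annotations, and the order (RGS, PK, AGCK, then the C-terminal PH domain) is assumed to follow the order in which the text lists them.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record of one domain match on a protein sequence."""
    name: str
    start: int  # first residue of the match (placeholder value)
    end: int    # last residue of the match (placeholder value)


def architecture(hits: List[DomainHit]) -> str:
    """Join domain names in N- to C-terminal order, sorted by start position."""
    ordered = sorted(hits, key=lambda hit: hit.start)
    return "-".join(hit.name for hit in ordered)


# Placeholder coordinates, deliberately listed out of order.
hits = [
    DomainHit("PH", 560, 650),
    DomainHit("RGS", 50, 175),
    DomainHit("AGCK", 455, 520),
    DomainHit("PK", 190, 450),
]

print(architecture(hits))  # RGS-PK-AGCK-PH
```

Two proteins with the same architecture string contain the same domains in the same order, which is one simple way to compare domain composition across proteins.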

InterPro entry types: Sites and Repeats


Sites are groups of amino acids that confer certain characteristics upon a protein, and may be important for its overall function. Sites are usually quite small (often only a few amino acids long). The types of site covered by InterPro are:

  • active sites, which contain amino acids involved in catalytic activity
  • binding sites, containing amino acids that are directly involved in binding molecules or ions
  • post-translational modification (PTM) sites, which contain residues known to be chemically modified (phosphorylated, palmitoylated, acetylated, etc.) after the process of protein translation
  • conserved sites, which are found in specific types of proteins, but whose function is unknown.
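The four site types listed above form a small fixed vocabulary, which could be represented as an enumeration when handling entries in code. This is an illustrative sketch only: the names SiteKind and SITE_DESCRIPTIONS are hypothetical and are not taken from any InterPro API.

```python
from enum import Enum


class SiteKind(Enum):
    """Hypothetical enumeration of the site types described in the text."""
    ACTIVE = "active site"
    BINDING = "binding site"
    PTM = "post-translational modification site"
    CONSERVED = "conserved site"


# Short descriptions paraphrased from the list above.
SITE_DESCRIPTIONS = {
    SiteKind.ACTIVE: "amino acids involved in catalytic activity",
    SiteKind.BINDING: "amino acids directly involved in binding molecules or ions",
    SiteKind.PTM: "residues chemically modified after protein translation",
    SiteKind.CONSERVED: "found in specific types of proteins, function unknown",
}

for kind in SiteKind:
    print(f"{kind.value}: {SITE_DESCRIPTIONS[kind]}")
```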


For example, the stretch of residues involved in the catalytic function of the S1B subfamily of serine peptidases is specific to this type of peptidase and constitutes the active site of these proteins.


Repeats are typically short amino acid sequences that are repeated within a protein, and may confer binding or structural properties upon it. 

For example, pentapeptide repeats are sequence motifs of five amino acids that occur in multiple tandem copies. They were first identified in cyanobacterial proteins, and their function is currently unknown.