What are protein families?

A protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure.

Protein families are often arranged into hierarchies, with proteins that share a common ancestor subdivided into smaller, more closely related groups. The terms superfamily (describing a large group of distantly related proteins) and subfamily (describing a small group of closely related proteins) are sometimes used in this context. A hypothetical protein family hierarchy is illustrated in Figure 2.

Figure 2 A hypothetical protein family hierarchy showing the relationships between superfamily, family and subfamily members. Directional arrows indicate that one group is a subgroup of another.

One example of a superfamily is the G protein-coupled receptors (GPCRs), a large and diverse group of proteins involved in many biological processes, including photoreception, regulation of the immune system, and nervous system transmission. At the superfamily level, GPCRs share two common properties: they have seven transmembrane domains, and they interact with specialised proteins (called G proteins) to influence intracellular pathways after binding extracellular signals (you can visit this GPCR webpage for more information).

Figure 3 The GPCR superfamily hierarchy. Families and subfamilies to which the short-wave-sensitive opsin 1 protein belongs are highlighted in blue.

As we group the GPCRs into smaller families, the individual groups have more properties in common. For example, the protein short-wave-sensitive opsin 1 belongs to a specialised family, known as the rhodopsin-like GPCRs. The rhodopsin-like GPCRs themselves can be further broken down into smaller families that respond to different signals. Short-wave-sensitive opsin 1 proteins belong to the opsin family (opsins being the photoreceptors of animal retinas), but more specifically, they are members of the blue-sensitive opsin subfamily, all of which are activated by a particular wavelength of light. This protein family hierarchy is illustrated in Figure 3.

As this example shows, when classifying proteins into hierarchical families, the level at which we can place a protein in the hierarchy is vital: the more specific the group, the more detailed the functional information we can infer.
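For readers who find it easier to think in code, the GPCR example can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the names `HIERARCHY`, `lineage` and `inferred_properties` are hypothetical, the property strings are paraphrased from the text above, and the rhodopsin-like level is left without properties because none are stated here. It is not how any protein family database stores its data.

```python
# Illustrative sketch only: each group maps to (parent group, properties
# that the text attributes to that level). Names and structure are assumptions.
HIERARCHY = {
    "GPCR superfamily": (
        None,
        ["seven transmembrane domains", "interacts with G proteins"],
    ),
    # The text gives no level-specific property for this family.
    "rhodopsin-like GPCRs": ("GPCR superfamily", []),
    "opsin family": (
        "rhodopsin-like GPCRs",
        ["photoreceptor of animal retinas"],
    ),
    "blue-sensitive opsin subfamily": (
        "opsin family",
        ["activated by a particular wavelength of light"],
    ),
}


def lineage(group):
    """Return the groups from the broadest ancestor down to `group`."""
    path = []
    while group is not None:
        path.append(group)
        group = HIERARCHY[group][0]  # step up to the parent group
    return path[::-1]


def inferred_properties(group):
    """Collect the properties of `group` and of every group above it."""
    return [prop for g in lineage(group) for prop in HIERARCHY[g][1]]


# A protein placed only at the superfamily level inherits two properties;
# one placed in the blue-sensitive opsin subfamily inherits all four.
for level in ("GPCR superfamily", "blue-sensitive opsin subfamily"):
    print(level, "->", inferred_properties(level))
```

Placing short-wave-sensitive opsin 1 at the subfamily level collects the properties of every group above it, whereas a protein that can only be assigned to the superfamily receives the two general GPCR properties and nothing more.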