0%

What is hierarchical classification?

In CATH hierarchical classification, high-quality protein structures are split into their consecutive polypeptide chains, which are further subdivided into one or more domains, using a mixture of automatic methods and manual curation. The domains are then classified within the CATH structural hierarchy:

  • Class refers to the secondary structure content (e.g. mainly-alpha, mainly-beta, mixed alpha/beta or ‘few secondary structures’).
  • Architecture refers to the general arrangement of the secondary structures irrespective of connectivity between them (e.g. alpha/beta sandwich).
  • Topology, also known as the ‘fold’ level, takes into account the connectivity of secondary structures in the chain.
  • Homologous Superfamily refers to a set of domains that are related through a common ancestor.

Each classification level has a CATH code associated with it. Have a look at the following example about HIGH-signature proteins, UspA superfamily and the PP-ATPase superfamily (HUPs, Figure 2):

Figure 2 The HUPS superfamily example illustrating four levels of classification in CATH.

In this example, the CATH code is 3.40.50.620. The 3 refers to the class to which the domain belongs (mixed alpha-beta), the 3.40 refers to the architecture (3-Layer(aba) Sandwich), the 3.40.50 refers to the actual fold or topology the domain adopts (Rossmann fold) and 3.40.50.620 is the homologous superfamily code (HUPs).

 To browse CATH Hierarchy, please see link: http://www.cathdb.info/browse/tree