Where does the data come from?

To classify proteins into families and to predict the presence of important domains or sequence features, we require computational tools. One such set of tools is the predictive models known as protein signatures.

Signatures are built by the member databases in the InterPro consortium (Figure 2). The member databases use different methods to construct their signatures, and each has its own focus of interest: structural and/or functional domains, protein families, or protein features such as active sites or binding sites.

To learn more about the different methods that can be used to classify proteins and the different types of models used by InterPro's member databases, see our Introduction to Protein Classification tutorial.

Figure 2. InterPro member databases grouped by signature construction method and focus of interest.