
How are Genome Properties composed?

Genome Properties are composed of a number of reactions or steps. Each step is represented by an InterPro entry containing one or more protein signature models that describe the specific protein required to carry out the step. Matching these protein signature models against a species’ proteome determines the result for that Genome Property in that species. Because the steps are defined by InterPro entries, proteins from extremely diverse species can be assigned as matches for a step, so the presence of a Genome Property can be asserted even in proteomes that are relatively unstudied.

If all steps within a property match a species’ proteome, the result for that property in that species is YES. A step may fail to match for several reasons; for example, there may be no sufficiently good protein signature model for a particular protein. To account for such cases, we define a threshold value for each property: the number of matched steps that must be exceeded to infer that the property is likely present in that species (Figure 2).

As a consequence, each Genome Property result takes one of three values: YES (present), PARTIAL (likely present) or NO (absent).
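The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Genome Properties implementation: the function name `evaluate_property`, its parameters, and the `IPR_STEP_*` identifiers are hypothetical placeholders rather than real InterPro accessions, and every step is assumed to be required.

```python
def evaluate_property(step_entries, matched_entries, threshold):
    """Return 'YES', 'PARTIAL' or 'NO' for one property in one proteome.

    step_entries    -- InterPro entries, one per required step (hypothetical IDs)
    matched_entries -- set of InterPro entries with a match in the proteome
    threshold       -- number of matched steps that must be exceeded for PARTIAL
    """
    # Count how many of the property's steps have a match in the proteome.
    matched = sum(1 for entry in step_entries if entry in matched_entries)

    if matched == len(step_entries):
        return "YES"       # every step was found
    if matched > threshold:
        return "PARTIAL"   # some steps missing, but threshold exceeded
    return "NO"            # too few steps found


# Placeholder identifiers for a three-step pathway (not real accessions).
steps = ["IPR_STEP_1", "IPR_STEP_2", "IPR_STEP_3"]

print(evaluate_property(steps, {"IPR_STEP_1", "IPR_STEP_2", "IPR_STEP_3"}, 1))  # YES
print(evaluate_property(steps, {"IPR_STEP_1", "IPR_STEP_3"}, 1))                # PARTIAL
print(evaluate_property(steps, {"IPR_STEP_2"}, 1))                              # NO
```

With a threshold of 1, two matched steps out of three exceed the threshold and give PARTIAL, whereas a single matched step does not exceed it and gives NO.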

Figure showing an example pathway, how the individual enzymatic steps can be described by InterPro models, and how the result is calculated based on the defined threshold.
Figure 2 The example pathway shown has three required steps, represented by three InterPro entries. The threshold value is the number of matched steps above which a PARTIAL result can be reported.