InterPro2GO mapping

InterPro is an integrated resource of protein families, domains and sites, combined from a number of different protein signature databases, including Gene3D, PANTHER, PIRSF, Pfam, PRINTS, PROSITE, ProDom, SMART, SUPERFAMILY and TIGRFAMs.

Signatures describing the same protein family or domain are grouped into unique InterPro entries. This InterPro resource is then applied across the UniProt KnowledgeBase, and all UniProtKB protein sequences that have matches to a particular InterPro entry are cross-referenced.

Where an InterPro entry hits a set of functionally similar proteins, GO terms describing the conserved function or location are associated with the InterPro entry. The ‘mapping’ between InterPro domains and GO terms is available at:

http://www.geneontology.org/external2go/interpro2go
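The mapping file is plain text and can be read with a few lines of code. The following is a minimal sketch only, assuming the usual external2go line layout (`InterPro:<accession> <entry name> > GO:<term name> ; GO:<id>`, with comment lines starting with `!`). The sample text, the entry name in it and the function name `parse_interpro2go` are illustrative and are not copied from the real file.

```python
from collections import defaultdict

# Illustrative sample in the assumed external2go layout; not copied from the real file.
SAMPLE = """\
!Hypothetical header comment
InterPro:IPR001095 Acetyl-CoA carboxylase, alpha subunit > GO:acetyl-CoA carboxylase activity ; GO:0003989
"""


def parse_interpro2go(text):
    """Return {InterPro accession: [(GO id, GO term name), ...]}."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '!' comment lines.
        if not line or line.startswith("!"):
            continue
        left, sep, right = line.partition(" > ")
        if not sep:
            continue  # not a mapping line in the assumed layout
        # Left side: "InterPro:IPR001095 <entry name>"
        accession = left.split(None, 1)[0].split(":", 1)[1]
        # Right side: "GO:<term name> ; GO:<id>"
        name_part, _, go_id = right.rpartition(" ; ")
        if name_part.startswith("GO:"):
            name_part = name_part[len("GO:"):]
        mapping[accession].append((go_id.strip(), name_part.strip()))
    return dict(mapping)


mapping = parse_interpro2go(SAMPLE)
print(mapping["IPR001095"])
```

One InterPro entry may appear on several lines if it maps to more than one GO term, which is why each accession maps to a list.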

This InterPro2GO file is generated manually by the InterPro team at the EBI. To generate this table, curators compare InterPro and protein entries and, for matching entries, they:

  • Look at the statistics on DE lines, keywords and comments
  • Check how conserved the common annotation is
  • Look for the most specific GO term that is still relevant to all proteins in that family

The mapping file is then used to assign annotations to UniProtKB proteins at each GOA release. GO annotations using this technique receive the evidence code Inferred from Electronic Annotation (IEA).

This method has been evaluated as 91-100% accurate (Camon et al. 2005).

Example.

The InterPro domain IPR001095 is the alpha subunit of the acetyl coenzyme A carboxylase complex. Proteins with this domain have been shown to have acetyl-CoA carboxylase activity, so this domain has been mapped to the GO term ‘acetyl-CoA carboxylase activity’ (GO:0003989). Any protein that contains this domain will automatically be assigned the GO term ‘acetyl-CoA carboxylase activity’.
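The automatic assignment described in this example can be sketched as a simple lookup. This is an illustration under stated assumptions, not the GOA production pipeline: the protein accessions `EXAMPLE1` and `EXAMPLE2`, the entry `IPR999999` and the function name `assign_iea` are all hypothetical placeholders. Only the IPR001095 to GO:0003989 mapping comes from the text.

```python
# Mapping taken from the example in the text: IPR001095 -> GO:0003989.
INTERPRO2GO = {"IPR001095": ["GO:0003989"]}

# Hypothetical InterPro matches per protein; all accessions are placeholders.
PROTEIN_MATCHES = {
    "EXAMPLE1": ["IPR001095"],
    "EXAMPLE2": ["IPR999999"],  # placeholder entry with no GO mapping
}


def assign_iea(protein_matches, interpro2go):
    """Return sorted (protein, GO id, evidence code, source) tuples."""
    annotations = set()
    for protein, entries in protein_matches.items():
        for ipr in entries:
            # Entries without a mapping contribute no annotation.
            for go_id in interpro2go.get(ipr, []):
                annotations.add((protein, go_id, "IEA", "InterPro:" + ipr))
    return sorted(annotations)


annotations = assign_iea(PROTEIN_MATCHES, INTERPRO2GO)
for row in annotations:
    print(row)
```

Collecting the results in a set keeps a protein from receiving the same term twice from the same entry.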

The annotations created by InterPro2GO mapping are displayed in the GOA gene association files (Fig. 1). The InterPro identifier that is the source of the annotation is given in column 8 ('With'), and column 6 (DB:Reference) carries the GO reference for this method, GO_REF:0000002.
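The column positions above are enough to pick InterPro2GO annotations out of a gene association file. The sketch below assumes the tab-delimited GAF layout, with comment lines starting with `!`. The sample rows, the accessions `EXAMPLE1` and `EXAMPLE2`, the reference `EXAMPLE_REF:1` and the function name `interpro2go_rows` are hypothetical and serve only to show the filter.

```python
def interpro2go_rows(gaf_lines):
    """Yield (object id, GO id, evidence, with) for InterPro2GO annotations."""
    for line in gaf_lines:
        if line.startswith("!"):
            continue  # header or comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # too short to hold the columns used here
        # Columns are 1-based in the text, 0-based here:
        # column 6 = DB:Reference, column 7 = evidence code, column 8 = With.
        reference, evidence, with_from = cols[5], cols[6], cols[7]
        if "GO_REF:0000002" in reference.split("|") and with_from.startswith("InterPro:"):
            yield cols[1], cols[4], evidence, with_from


# Hypothetical rows, truncated to the first nine columns for brevity.
SAMPLE_GAF = [
    "!gaf-version: 2.0\n",
    "\t".join(["UniProtKB", "EXAMPLE1", "exampleA", "", "GO:0003989",
               "GO_REF:0000002", "IEA", "InterPro:IPR001095", "F"]) + "\n",
    "\t".join(["UniProtKB", "EXAMPLE2", "exampleB", "", "GO:0003989",
               "EXAMPLE_REF:1", "IDA", "", "F"]) + "\n",
]

rows = list(interpro2go_rows(SAMPLE_GAF))
print(rows)
```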

InterPro2GO annotations can also be viewed in QuickGO.

Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17.
