InterPro2GO mapping

InterPro is an integrated resource of protein families, domains and sites, combined from a number of different protein signature databases including Gene3D, PANTHER, PIRSF, Pfam, PRINTS, PROSITE, ProDom, SMART, SUPERFAMILY and TIGRFAMs.

Signatures describing the same protein family or domain are grouped into unique InterPro entries. This InterPro resource is then applied across the UniProt KnowledgeBase, and all UniProtKB protein sequences that have matches to a particular InterPro entry are cross-referenced.

Where an InterPro entry hits a set of functionally similar proteins, GO terms describing the conserved function or location are associated with the InterPro entry. The ‘mapping’ between InterPro domains and GO terms is available at:

http://www.geneontology.org/external2go/interpro2go .
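As a rough illustration, the mapping file could be read into a lookup table in Python. This is a minimal sketch that assumes the usual external2go line layout (`InterPro:<accession> <entry name> > GO:<term name> ; <GO ID>`, with comment lines starting with `!`); that layout is not described on this page, so check it against the file itself. The function name `parse_interpro2go` and the sample lines are illustrative only.

```python
from collections import defaultdict


def parse_interpro2go(lines):
    """Return {InterPro accession: [(GO term name, GO ID), ...]}."""
    mapping = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '!' header/comment lines (assumed convention).
        if not line or line.startswith("!"):
            continue
        left, sep, right = line.partition(" > ")
        if not sep or ":" not in left:
            continue  # not a mapping line in the assumed layout
        # Left side: "InterPro:IPR001095 <entry name>"
        accession = left.split()[0].split(":", 1)[1]
        # Right side: "GO:<term name> ; GO:0003989"
        name_part, _, go_id = right.rpartition(" ; ")
        if name_part.startswith("GO:"):
            name_part = name_part[len("GO:"):]
        mapping[accession].append((name_part, go_id.strip()))
    return dict(mapping)


# Sample lines written by hand in the assumed layout, not copied from the file.
sample = [
    "! hypothetical header line",
    "InterPro:IPR001095 Acetyl-CoA carboxylase, alpha subunit"
    " > GO:acetyl-CoA carboxylase activity ; GO:0003989",
]
interpro2go = parse_interpro2go(sample)
print(interpro2go["IPR001095"])
```

To use it on the real file, pass an open file handle instead of the `sample` list.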

This InterPro2GO file is generated manually by the InterPro team at the EBI. To generate this table, curators compare InterPro and protein entries, and for matching entries they:

  • Look at the statistics on DE lines, keywords and comments

  • Check how conserved the common annotation is
  • Look for an appropriate GO term at the most specific level to be relevant to all proteins in that family
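The curation itself is manual. Purely to illustrate the kind of statistic involved in the first two steps, the sketch below counts how often each keyword occurs among the proteins matching one InterPro entry. The function name `keyword_conservation`, the protein labels and the keyword lists are all hypothetical; this is not the InterPro team's tooling.

```python
from collections import Counter


def keyword_conservation(keywords_by_protein):
    """Return {keyword: fraction of proteins annotated with it}."""
    total = len(keywords_by_protein)
    if total == 0:
        return {}
    counts = Counter()
    for keywords in keywords_by_protein.values():
        # Count each keyword at most once per protein.
        counts.update(set(keywords))
    return {keyword: n / total for keyword, n in counts.items()}


# Hypothetical proteins matching a single InterPro entry.
matched = {
    "protein_1": ["Ligase", "ATP-binding"],
    "protein_2": ["Ligase", "ATP-binding"],
    "protein_3": ["Ligase", "ATP-binding", "Membrane"],
    "protein_4": ["Ligase"],
}
fractions = keyword_conservation(matched)
# Keywords shared by every matching protein are candidates for a GO term.
fully_conserved = sorted(k for k, f in fractions.items() if f == 1.0)
print(fractions)
print(fully_conserved)
```

Only an annotation shared by all proteins in the family would be a safe basis for a mapping, which is why the last step asks for a GO term relevant to every member.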

The mapping file is then used to assign annotations to UniProtKB proteins at each UniProt-GOA release. GO annotations made using this technique receive the evidence code Inferred from Electronic Annotation (IEA).

This method has been evaluated as 91-100% accurate (Camon et al. 2005).

Example.

The InterPro domain IPR001095 is the alpha subunit of the acetyl coenzyme A carboxylase complex. Proteins with this domain have been shown to have acetyl-CoA carboxylase activity, so the domain has been mapped to the GO term ‘acetyl-CoA carboxylase activity’ (GO:0003989). Any protein which contains this domain will automatically be assigned this GO term.
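The assignment step in this example can be sketched in Python as a simple lookup. The sketch assumes the mapping and the per-protein InterPro matches are already available as dictionaries; the function name `annotate`, the accession `X00001` and the entry `IPR999999` are hypothetical, and this is not the UniProt-GOA production pipeline.

```python
def annotate(protein_matches, interpro2go):
    """Yield (protein, GO ID, evidence code, source) for every mapped match."""
    for protein, entries in protein_matches.items():
        seen = set()
        for entry in entries:
            for go_id in interpro2go.get(entry, []):
                # Emit each (GO term, source entry) pair once per protein.
                if (go_id, entry) not in seen:
                    seen.add((go_id, entry))
                    yield (protein, go_id, "IEA", "InterPro:" + entry)


# Mapping from the example above.
interpro2go = {"IPR001095": ["GO:0003989"]}

protein_matches = {
    "B0CF67": ["IPR001095"],  # protein shown in Figure 1
    "X00001": ["IPR999999"],  # hypothetical protein whose entry has no mapping
}

annotations = list(annotate(protein_matches, interpro2go))
print(annotations)
```

Only B0CF67 receives an annotation here; a protein whose InterPro entries have no mapped GO term gets nothing from this method.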

The annotations created by InterPro2GO mapping are displayed in the UniProt-GOA gene association files (Fig. 1). The InterPro domain identifier of the annotation source is given in column 8 ('With'), and column 6 (DB:Reference) carries the GO reference for this method, GO_REF:0000002.

InterPro2GO annotations can also be viewed in QuickGO.

Figure 1. Representation of an InterPro2GO annotation in the gene association file.

Column            Value
Database          UniProt
Object ID         B0CF67
Object Symbol     B0CF67_ACAM1
Qualifier         (empty)
GO ID             GO:0003989
Reference         GOA:interpro|GO_REF:0000002
Evidence          IEA
'With' Column     InterPro:IPR001095
Aspect            F
Object Name       accA, AM1_5633: Acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha
Object Synonym    (empty)
Object Type       protein
Taxon ID          taxon:329726
Date              20080501
Source DB         UniProtKB
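To pick InterPro2GO annotations out of a gene association file, a script can test the Reference and Evidence columns. The sketch below assumes a tab-separated file whose first 15 columns follow the order shown in Figure 1, with `!` comment lines. The `COLUMNS` names and the function `interpro2go_rows` are illustrative, and the sample line is built from the Figure 1 values.

```python
# Column names follow the order of Figure 1; the names themselves are illustrative.
COLUMNS = [
    "db", "object_id", "object_symbol", "qualifier", "go_id",
    "reference", "evidence", "with", "aspect", "object_name",
    "object_synonym", "object_type", "taxon", "date", "source_db",
]


def interpro2go_rows(lines):
    """Yield annotation rows that come from the InterPro2GO method."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("!"):
            continue  # skip blank and comment lines (assumed convention)
        fields = line.split("\t")
        if len(fields) < len(COLUMNS):
            continue  # not a full annotation line
        row = dict(zip(COLUMNS, fields))
        references = row["reference"].split("|")
        # Column 6 carries GO_REF:0000002 and column 7 the IEA evidence code.
        if row["evidence"] == "IEA" and "GO_REF:0000002" in references:
            yield row


figure_1_line = "\t".join([
    "UniProt", "B0CF67", "B0CF67_ACAM1", "", "GO:0003989",
    "GOA:interpro|GO_REF:0000002", "IEA", "InterPro:IPR001095", "F",
    "accA, AM1_5633: Acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha",
    "", "protein", "taxon:329726", "20080501", "UniProtKB",
])

rows = list(interpro2go_rows(["! hypothetical comment line", figure_1_line]))
for row in rows:
    # Column 8 ('With') names the InterPro entry the annotation came from.
    print(row["object_id"], row["go_id"], row["with"])
```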


Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17.
