Keyword2GO

UniProtKB/Swiss-Prot entries are assigned keywords manually by curators, based on the literature and on sequence analysis checks. In addition, UniProtKB/TrEMBL entries are assigned keywords automatically from two sources:

  • initially, as the TrEMBL entry is first created, based on keywords in the nucleotide sequence entry

  • subsequently, during automatic annotation, by two programs:

    RuleBase – which uses manually curated rules
    Spearmint – which is an automatic system based on decision trees

Keywords are mapped to corresponding GO terms in the UniProtKB-KW2GO file, which was originally constructed manually by MGI curators and is now maintained by the UniProt-GOA team at EBI. At each UniProt-GOA release, the mappings are applied transitively: every entry that carries a mapped keyword receives the corresponding GO term. GO annotations made using this technique receive the evidence code Inferred from Electronic Annotation (IEA).

This method has been evaluated as 91-98% accurate (Camon et al., 2005).

The UniProtKB-KW2GO mapping file is available at:
http://www.geneontology.org/external2go/uniprotkb_kw2go .

Example. The UniProtKB keyword ‘Cell junction’ (KW-0965) has been assigned to the Angiomotin protein (UniProtKB accession Q4VCS5). A mapping was manually created between this keyword and the GO term ‘cell junction’ (GO:0030054). Therefore, Angiomotin, and any other protein associated with KW-0965, will automatically be assigned the GO term ‘cell junction’.
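The example above can be sketched in a few lines of Python. This is only an illustration, not the UniProt-GOA pipeline: the line layout matched by LINE_RE is an assumption about how the mapping file is written, and the names parse_kw2go, assign_go_terms and sample_lines are hypothetical. The sample line is built from the keyword and GO term named in the example, not copied from the real file.

```python
import re

# Assumed line layout of the mapping file (an assumption, not confirmed here):
#   UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054
LINE_RE = re.compile(
    r"^UniProtKB-KW:(?P<kw_id>KW-\d{4})\s+(?P<kw_name>.+?)\s+>\s+"
    r"GO:(?P<go_name>.+?)\s+;\s+(?P<go_id>GO:\d{7})\s*$"
)


def parse_kw2go(lines):
    """Return {keyword ID: [(GO ID, GO term name), ...]}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines assumed to be '!' comments.
        if not line or line.startswith("!"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines that do not fit the assumed layout
        mapping.setdefault(match["kw_id"], []).append(
            (match["go_id"], match["go_name"])
        )
    return mapping


def assign_go_terms(entry_keywords, mapping):
    """Collect the GO terms implied by the keywords attached to one entry."""
    terms = []
    for kw_id in entry_keywords:
        for term in mapping.get(kw_id, []):
            if term not in terms:  # avoid duplicates from overlapping keywords
                terms.append(term)
    return terms


# Illustrative input only; the real file should be downloaded from the URL above.
sample_lines = [
    "! illustrative excerpt, not the real file",
    "UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054",
]
kw2go = parse_kw2go(sample_lines)

# Angiomotin (Q4VCS5) carries KW-0965, so it picks up GO:0030054.
angiomotin_terms = assign_go_terms(["KW-0965"], kw2go)
print(angiomotin_terms)
```

Any other entry whose keyword list contains KW-0965 would receive the same term, while keywords without a mapping contribute nothing.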

The annotations created by the UniProtKB-KW2GO mapping are displayed in the UniProt-GOA gene association files (Fig. 1). The keyword is indicated in column 8 ('With'), and column 6 (DB:Reference) carries the GO reference for this method: GO_REF:0000037 when the keyword is applied to a curator-reviewed (UniProtKB/Swiss-Prot) entry, or GO_REF:0000038 when it is applied to an unreviewed (UniProtKB/TrEMBL) entry. UniProtKB-KW2GO annotations can also be viewed in QuickGO.

Figure 1. Representation of a UniProtKB-KW2GO annotation in the gene association file.

Database | Object ID | Object Symbol | Qualifier | GO ID      | Reference      | Evidence | 'With' Column
UniProt  | Q4VCS5    | AMOT_HUMAN    |           | GO:0030054 | GO_REF:0000037 | IEA      | UniProtKB-KW:KW-0965

Aspect | Object Name                | Object Synonym | Object Type | Taxon ID   | Date     | Source DB
C      | AMOT, KIAA1071: Angiomotin | IPI00163085    | protein     | taxon:9606 | 20080501 | UniProtKB
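The fields shown in Figure 1 can be assembled as follows. This is a minimal sketch, assuming a tab-separated line with the fifteen fields in the order of the figure; the function kw2go_gaf_row and the entry dictionary are hypothetical, and the split between Object Name and Object Synonym is one reading of the figure. The choice of GO reference follows the order given in the text (GO_REF:0000037 for reviewed entries, GO_REF:0000038 for unreviewed ones).

```python
GO_REF_REVIEWED = "GO_REF:0000037"    # UniProtKB/Swiss-Prot entries
GO_REF_UNREVIEWED = "GO_REF:0000038"  # UniProtKB/TrEMBL entries


def kw2go_gaf_row(entry, go_id, aspect, keyword_id, date):
    """Build one tab-separated annotation line for a keyword-derived GO term."""
    reference = GO_REF_REVIEWED if entry["reviewed"] else GO_REF_UNREVIEWED
    fields = [
        "UniProt",                     # 1  Database
        entry["accession"],            # 2  Object ID
        entry["symbol"],               # 3  Object Symbol
        "",                            # 4  Qualifier (empty in Figure 1)
        go_id,                         # 5  GO ID
        reference,                     # 6  DB:Reference
        "IEA",                         # 7  Evidence
        f"UniProtKB-KW:{keyword_id}",  # 8  'With'
        aspect,                        # 9  Aspect
        entry["name"],                 # 10 Object Name
        entry["synonym"],              # 11 Object Synonym
        entry["type"],                 # 12 Object Type
        entry["taxon"],                # 13 Taxon ID
        date,                          # 14 Date
        "UniProtKB",                   # 15 Source DB
    ]
    return "\t".join(fields)


# Values taken from Figure 1; the name/synonym split is an assumption.
angiomotin = {
    "accession": "Q4VCS5",
    "symbol": "AMOT_HUMAN",
    "name": "AMOT, KIAA1071: Angiomotin",
    "synonym": "IPI00163085",
    "type": "protein",
    "taxon": "taxon:9606",
    "reviewed": True,
}
row = kw2go_gaf_row(angiomotin, "GO:0030054", "C", "KW-0965", "20080501")
print(row)
```

Setting "reviewed" to False for a UniProtKB/TrEMBL entry switches column 6 to GO_REF:0000038 and leaves the other fields unchanged.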

 

Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17
