UniProtKB/Swiss-Prot entries are assigned keywords manually by curators, based on the literature and on sequence analysis checks. In addition, UniProtKB/TrEMBL entries are assigned keywords automatically from two sources:

  • initially, as the TrEMBL entry is first created, based on keywords in the nucleotide sequence entry

  • subsequently, during automatic annotation, by two programs:

    RuleBase – which uses manually curated rules
    Spearmint – which is an automatic system based on decision trees

Keywords are mapped to corresponding GO terms in the UniProtKB-KW2GO file, which was originally constructed manually by MGI curators and is now maintained by the GOA team at EBI. The mappings are then applied transitively at each GOA release: every protein that carries a mapped keyword receives the corresponding GO term. GO annotations made with this technique receive the evidence code Inferred from Electronic Annotation (IEA).

This method has been evaluated as 91-98% accurate (Camon et al., 2005).

The UniProtKB-KW2GO mapping file is available at:
http://www.geneontology.org/external2go/uniprotkb_kw2go

Example. The UniProtKB keyword ‘Cell junction’ (KW-0965) has been assigned to the Angiomotin protein (UniProtKB accession Q4VCS5). A mapping was manually created between this keyword and the GO term ‘cell junction’ (GO:0030054). Therefore, Angiomotin, and any other protein associated with KW-0965, will automatically be assigned the GO term ‘cell junction’.
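The transitive assignment in this example can be sketched in a few lines of Python. This is a minimal illustration, not the GOA pipeline itself. It assumes that each line of the mapping file has the layout shown in the comment and that comment lines start with "!". Neither detail is stated above. The function names parse_kw2go and annotate are hypothetical, and the sample line is written by hand rather than copied from the real file.

```python
import re

# Assumed line layout of the mapping file (not confirmed by the text above):
#   UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054
LINE_RE = re.compile(
    r"^UniProtKB-KW:(?P<kw_id>KW-\d{4})\s+(?P<kw_name>.+?)\s+>\s+"
    r"GO:(?P<go_name>.+?)\s+;\s+(?P<go_id>GO:\d{7})$"
)


def parse_kw2go(lines):
    """Return {keyword ID: set of GO IDs}, skipping comments and blank lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines that do not fit the assumed layout
        mapping.setdefault(match["kw_id"], set()).add(match["go_id"])
    return mapping


def annotate(protein_keywords, mapping):
    """Return (accession, GO ID, evidence code, keyword ID) tuples.

    Every protein carrying a mapped keyword receives the mapped GO term
    with the evidence code IEA.
    """
    result = []
    for accession, kw_ids in protein_keywords.items():
        for kw_id in kw_ids:
            for go_id in sorted(mapping.get(kw_id, ())):
                result.append((accession, go_id, "IEA", kw_id))
    return result


# Hand-written sample line, not taken from the real file.
sample_lines = [
    "! illustrative sample only",
    "UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054",
]
kw2go = parse_kw2go(sample_lines)
annotations = annotate({"Q4VCS5": ["KW-0965"]}, kw2go)
print(annotations)
```

Keeping the keyword ID in each result tuple preserves the source of the inference, so every electronic annotation can be traced back to the keyword that produced it.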

The annotations created by the UniProtKB-KW2GO mapping are displayed in the GOA gene association files (Fig. 1). The keyword is indicated in column 8 ('With'), and column 6 (DB:Reference) carries the GO reference for this method: either GO_REF:0000037 or GO_REF:0000038, depending on whether the keyword is applied to a curator-reviewed (UniProtKB/Swiss-Prot) or unreviewed (UniProtKB/TrEMBL) entry, respectively. UniProtKB-KW2GO annotations can also be viewed in QuickGO.
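As a rough sketch of how these annotations could be picked out of a gene association file, the Python snippet below filters rows by the two GO references and reads the keyword from column 8. It assumes the file is tab-separated, that comment lines start with "!", and that column 6 may hold several references separated by "|". The function name keyword_annotations is hypothetical. The sample row is constructed by hand for illustration, truncated to the first nine columns, and is not copied from a real GOA release.

```python
import csv
import io

# GO references named in the text, keyed to the kind of entry they indicate.
KW2GO_REFS = {
    "GO_REF:0000037": "reviewed (UniProtKB/Swiss-Prot)",
    "GO_REF:0000038": "unreviewed (UniProtKB/TrEMBL)",
}


def keyword_annotations(gaf_handle):
    """Return (accession, GO ID, With value, entry status) for KW2GO rows."""
    rows = []
    reader = csv.reader(gaf_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines, comment lines, and rows too short to have column 8.
        if not fields or fields[0].startswith("!") or len(fields) < 8:
            continue
        references = fields[5].split("|")  # column 6: DB:Reference
        with_value = fields[7]             # column 8: With
        for reference in references:
            if reference in KW2GO_REFS:
                rows.append(
                    (fields[1], fields[4], with_value, KW2GO_REFS[reference])
                )
    return rows


# Hand-built illustrative row (first nine columns only), not real GOA output.
sample_row = "\t".join([
    "UniProtKB", "Q4VCS5", "AMOT", "", "GO:0030054",
    "GO_REF:0000037", "IEA", "UniProtKB-KW:KW-0965", "C",
])
sample_gaf = io.StringIO("! illustrative sample only\n" + sample_row + "\n")
kw_rows = keyword_annotations(sample_gaf)
print(kw_rows)
```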

Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17