Keyword2GO
UniProtKB/Swiss-Prot entries are assigned keywords manually by curators, based on the literature and on sequence analysis checks. In addition, UniProtKB/TrEMBL entries are assigned keywords automatically from two sources:
- initially, when the TrEMBL entry is first created, based on keywords in the nucleotide sequence entry
- subsequently, during automatic annotation, by two programs:
  - RuleBase – which uses manually curated rules
  - Spearmint – which is an automatic system based on decision trees
Keywords are mapped to corresponding GO terms in the UniProtKB-KW2GO file, which was originally constructed manually by MGI curators and is now maintained by the UniProt-GOA team at EBI. At each UniProt-GOA release, the mappings are applied transitively: every entry carrying a mapped keyword is assigned the corresponding GO term. GO annotations made using this technique receive the evidence code Inferred from Electronic Annotation (IEA).
This method has been evaluated as 91-98% accurate (Camon et al., 2005).
The UniProtKB-KW2GO mapping file is available at:
http://www.geneontology.org/external2go/uniprotkb_kw2go
Example. The UniProtKB keyword ‘Cell junction’ (KW-0965) has been assigned to the Angiomotin protein (UniProtKB accession Q4VCS5). A mapping was manually created between this keyword and the GO term ‘cell junction’ (GO:0030054). Therefore, Angiomotin, and any other protein associated with KW-0965, will automatically be assigned the GO term ‘cell junction’.
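The keyword-to-GO lookup in the example above can be sketched in a few lines of Python. This is only an illustration, not the UniProt-GOA pipeline: the function names `parse_kw2go` and `go_terms_for` are hypothetical, the sample line is hand-written rather than copied from the mapping file, and the line layout (`UniProtKB-KW:<id> <name> > GO:<term name> ; <GO id>`, with `!` marking comment lines) is an assumption about the file format that should be checked against the file itself.

```python
import re

# Assumed layout of one mapping line (verify against the real file):
#   UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054
LINE_RE = re.compile(
    r"^UniProtKB-KW:(?P<kw_id>KW-\d{4})\s+(?P<kw_name>.+?)\s+>\s+"
    r"GO:(?P<go_name>.+?)\s+;\s+(?P<go_id>GO:\d{7})\s*$"
)


def parse_kw2go(lines):
    """Return {keyword ID: [(GO ID, GO term name), ...]} from mapping lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines assumed to be comments.
        if not line or line.startswith("!"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines that do not fit the assumed layout
        mapping.setdefault(match["kw_id"], []).append(
            (match["go_id"], match["go_name"])
        )
    return mapping


def go_terms_for(keyword_ids, mapping):
    """Collect the GO terms implied by an entry's keywords, without duplicates."""
    terms = []
    for kw_id in keyword_ids:
        for term in mapping.get(kw_id, []):
            if term not in terms:
                terms.append(term)
    return terms


# Hand-written sample line, for illustration only.
sample = [
    "! illustrative sample, not the real file contents",
    "UniProtKB-KW:KW-0965 Cell junction > GO:cell junction ; GO:0030054",
]
kw2go = parse_kw2go(sample)

# Angiomotin (Q4VCS5) carries the keyword KW-0965.
amot_terms = go_terms_for(["KW-0965"], kw2go)
print(amot_terms)
```

Any other entry carrying KW-0965 would receive the same term from the same lookup, which is what makes the assignment automatic.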
The annotations created by UniProtKB-KW2GO mapping are displayed in the UniProt-GOA gene association files (Fig. 1). The keyword is indicated in column 8 ('With'), and column 6 (DB:Reference) contains the GO reference for this method: GO_REF:0000037 or GO_REF:0000038, depending on whether the keyword is applied to a curator-reviewed (UniProtKB/Swiss-Prot) or unreviewed (UniProtKB/TrEMBL) entry. UniProtKB-KW2GO annotations can also be viewed in QuickGO.
Figure 1. Representation of a UniProtKB-KW2GO annotation in the gene association file.
| Database | Object ID | Object Symbol | Qualifier | GO ID | Reference | Evidence | 'With' Column |
|---|---|---|---|---|---|---|---|
| UniProt | Q4VCS5 | AMOT_HUMAN | | GO:0030054 | GO_REF:0000037 | IEA | UniProtKB-KW:KW-0965 |
| Aspect | Object Name | Object Synonym | Object Type | Taxon ID | Date | Source DB |
|---|---|---|---|---|---|---|
| C | AMOT, KIAA1071: Angiomotin | IPI00163085 | protein | taxon:9606 | 20080501 | UniProtKB |
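As a rough sketch of how the first eight columns in Figure 1 fit together, the Python below assembles such a row and tests whether a row looks like a UniProtKB-KW2GO annotation. The helper names `make_kw2go_row` and `is_kw2go_row` are hypothetical, only the eight columns from the first half of the figure are modelled, and the reviewed/unreviewed pairing of the two GO references follows the order given in the text above.

```python
# GO references for this method, keyed by entry status as described in the text.
KW2GO_REFS = {"reviewed": "GO_REF:0000037", "unreviewed": "GO_REF:0000038"}


def make_kw2go_row(object_id, symbol, go_id, keyword_id, reviewed):
    """Build columns 1-8 of a keyword-derived annotation as a list of strings."""
    status = "reviewed" if reviewed else "unreviewed"
    return [
        "UniProt",                      # 1 Database
        object_id,                      # 2 Object ID
        symbol,                         # 3 Object Symbol
        "",                             # 4 Qualifier (empty in Figure 1)
        go_id,                          # 5 GO ID
        KW2GO_REFS[status],             # 6 Reference
        "IEA",                          # 7 Evidence
        f"UniProtKB-KW:{keyword_id}",   # 8 'With' column
    ]


def is_kw2go_row(row):
    """True if columns 6-8 match the pattern described for this method."""
    return (
        row[5] in KW2GO_REFS.values()
        and row[6] == "IEA"
        and row[7].startswith("UniProtKB-KW:")
    )


row = make_kw2go_row("Q4VCS5", "AMOT_HUMAN", "GO:0030054", "KW-0965", reviewed=True)
print("\t".join(row))
```

A filter like `is_kw2go_row` is one way to separate keyword-derived annotations from other electronic annotations when reading a gene association file.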
Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17

