Since December 2006, the UniProt-GOA and Ensembl Compara teams have used gene orthology data from Ensembl Compara to project GO terms from a source species onto one or more target species. More information about the Compara homology method can be found here.
Originally, this method supplied annotations to proteins from vertebrate species. For this pipeline, only one-to-one and apparent one-to-one orthologies are used, and only manually annotated GO terms with an experimental evidence code of IDA, IEP, IGI, IMP or IPI are projected. Annotations with a 'NOT' qualifier and annotations to the protein binding term (GO:0005515) are not projected. GO annotations created using this technique receive the evidence code 'IEA' (Inferred from Electronic Annotation).
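The vertebrate rules above can be sketched as a small filter. This is only an illustration under stated assumptions, not the UniProt-GOA implementation: the `Annotation` record, its field names, the function names and the orthology-type labels (`one2one`, `apparent_one2one`) are all hypothetical.

```python
from dataclasses import dataclass

# Experimental evidence codes that the text lists as eligible for projection.
PROJECTABLE_EVIDENCE = {"IDA", "IEP", "IGI", "IMP", "IPI"}
# Annotations to the protein binding term are never projected.
EXCLUDED_TERMS = {"GO:0005515"}
# Hypothetical labels for the two orthology types the vertebrate pipeline uses.
VERTEBRATE_ORTHOLOGY_TYPES = {"one2one", "apparent_one2one"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical minimal annotation record (not a real UniProt-GOA type)."""
    protein_id: str
    go_id: str
    evidence: str
    qualifier: str = ""  # e.g. "", "NOT", or "NOT|contributes_to"


def is_projectable(ann: Annotation) -> bool:
    """Apply the annotation-level rules: evidence code, qualifier, GO term."""
    if ann.evidence not in PROJECTABLE_EVIDENCE:
        return False
    if "NOT" in ann.qualifier.split("|"):
        return False
    return ann.go_id not in EXCLUDED_TERMS


def project_vertebrate(ann: Annotation, target_id: str, orthology_type: str):
    """Return the projected IEA annotation, or None if any rule rejects it."""
    if orthology_type not in VERTEBRATE_ORTHOLOGY_TYPES:
        return None
    if not is_projectable(ann):
        return None
    # The projected copy is attached to the target protein with evidence IEA.
    return Annotation(target_id, ann.go_id, "IEA", ann.qualifier)
```

Keeping the annotation-level check separate from the orthology-type check mirrors the two kinds of condition in the description: one concerns the ortholog pair, the other the source annotation.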
Since August 2011, a second pipeline, created by the Ensembl Plants and Gramene groups, has used Compara orthologies to project GO terms between plant species. One-to-one, one-to-many and many-to-many orthologies are used, but annotations are only projected between orthologs that share at least 40% peptide identity. Only GO annotations with an evidence code of IDA, IEP, IGI, IMP or IPI are projected. Annotations with a 'NOT' qualifier and annotations to the protein binding term (GO:0005515) are not projected. GO annotations projected using this technique also receive the evidence code IEA.
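As a rough sketch of the plant rules, the example below adds the identity threshold to the same kind of filter. The `OrthologPair` record, the orthology-type labels and the tuple layout of the annotations are hypothetical. It also assumes that "at least 40% peptide identity to each other" means that both directional identity values must reach 40%, which is one plausible reading of the text rather than a confirmed detail.

```python
from typing import NamedTuple

MIN_PERCENT_IDENTITY = 40.0
# Hypothetical labels for the orthology types the plant pipeline accepts.
PLANT_ORTHOLOGY_TYPES = {"one2one", "one2many", "many2many"}
PROJECTABLE_EVIDENCE = {"IDA", "IEP", "IGI", "IMP", "IPI"}
EXCLUDED_TERMS = {"GO:0005515"}  # protein binding


class OrthologPair(NamedTuple):
    """Hypothetical record describing one source/target ortholog pair."""
    source_id: str
    target_id: str
    orthology_type: str
    source_identity: float  # % identity measured over the source peptide
    target_identity: float  # % identity measured over the target peptide


def pair_is_eligible(pair: OrthologPair) -> bool:
    """Check the orthology type and the (assumed two-way) identity threshold."""
    if pair.orthology_type not in PLANT_ORTHOLOGY_TYPES:
        return False
    return min(pair.source_identity, pair.target_identity) >= MIN_PERCENT_IDENTITY


def project_plant(pair: OrthologPair, source_annotations):
    """Project (go_id, evidence, qualifier) tuples from source onto target.

    Returns (target_id, go_id, "IEA", source_id) tuples; the last element
    records which protein the annotation was projected from.
    """
    if not pair_is_eligible(pair):
        return []
    projected = []
    for go_id, evidence, qualifier in source_annotations:
        if evidence not in PROJECTABLE_EVIDENCE:
            continue
        if "NOT" in qualifier.split("|"):
            continue
        if go_id in EXCLUDED_TERMS:
            continue
        projected.append((pair.target_id, go_id, "IEA", pair.source_id))
    return projected
```

Because one-to-many and many-to-many orthologies are allowed here, a caller would typically invoke `project_plant` once per pair, so a single target protein can receive annotations from several source proteins.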
All projections and the resulting annotations are updated monthly.
In the UniProt-GOA gene association files, column 8 ('With') gives either the Ensembl protein identifier or the model organism database identifier of the annotation source, and column 15 ('Source DB') contains the value 'Ensembl' (for the vertebrate pipeline) or 'EnsemblPlants/Gramene' (for the plant pipeline). Column 6 ('DB:Reference') contains the GO reference for the method: GO_REF:0000019 for the vertebrate pipeline and GO_REF:0000035 for the plant pipeline.
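To make the column layout concrete, here is a minimal sketch that recognises projected annotations in a tab-separated gene association line using columns 6, 8 and 15 as described above. The function name and return shape are illustrative assumptions. The check on column 7 relies on the standard GAF layout, in which that column holds the evidence code.

```python
# Expected (pipeline name, GO reference) for each column 15 value.
PIPELINE_BY_SOURCE = {
    "Ensembl": ("vertebrate", "GO_REF:0000019"),
    "EnsemblPlants/Gramene": ("plant", "GO_REF:0000035"),
}


def classify_projection(gaf_line: str):
    """Return (pipeline, with_value) for a projected annotation, else None."""
    if gaf_line.startswith("!"):  # header/comment lines in GAF files
        return None
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:
        return None
    reference = cols[5]   # column 6: DB:Reference
    evidence = cols[6]    # column 7: evidence code (standard GAF layout)
    with_value = cols[7]  # column 8: With
    source_db = cols[14]  # column 15: source database

    entry = PIPELINE_BY_SOURCE.get(source_db)
    if entry is None:
        return None
    pipeline, expected_ref = entry
    # Projected annotations carry evidence IEA and the pipeline's GO_REF.
    if evidence != "IEA" or expected_ref not in reference.split("|"):
        return None
    return pipeline, with_value
```

Checking the GO reference as well as the source database guards against other electronic annotations that might share the same column 15 value.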
Figure 1 shows examples of projected GO annotations from both pipelines.
Figure 1. Representation of the projected GO annotations in QuickGO. a) An annotation produced by the vertebrate pipeline: the GO term has been projected from a mouse protein to a human protein. b) Annotations produced by the plant pipeline: the GO terms have been projected from Arabidopsis proteins to a grape protein.