Ensembl Compara

Since December 2006, the UniProt-GOA and Ensembl Compara teams have used gene orthology data from Ensembl Compara to project GO terms from a source species onto one or more target species. More information about the Compara homology method is available in the Ensembl Compara documentation.

Originally, this method supplied annotations to proteins from vertebrate species. For this pipeline, only one-to-one and apparent one-to-one orthologies are used, and only manually annotated GO terms with an experimental evidence code of IDA, IEP, IGI, IMP or IPI are projected. Annotations with a 'NOT' qualifier are not projected, nor are annotations to the term GO:0005515 (protein binding). GO annotations created using this technique receive the evidence code 'IEA' (Inferred from Electronic Annotation).
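The vertebrate projection rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline's actual code: the `Annotation` class, the function names, the protein identifiers and the orthology type strings (`ortholog_one2one`, `apparent_ortholog_one2one`) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Evidence codes that the text lists as eligible for projection.
EXPERIMENTAL_CODES = {"IDA", "IEP", "IGI", "IMP", "IPI"}
# GO:0005515 (protein binding) is never projected.
EXCLUDED_TERMS = {"GO:0005515"}
# Assumed labels for one-to-one and apparent one-to-one orthologies.
VERTEBRATE_ORTHOLOGY_TYPES = {"ortholog_one2one", "apparent_ortholog_one2one"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical minimal GO annotation record."""
    protein_id: str
    go_id: str
    evidence: str
    qualifier: str = ""
    with_from: str = ""


def is_projectable(ann: Annotation) -> bool:
    """Apply the evidence, qualifier and term filters described above."""
    if ann.evidence not in EXPERIMENTAL_CODES:
        return False
    if "NOT" in ann.qualifier.split("|"):
        return False
    return ann.go_id not in EXCLUDED_TERMS


def project_vertebrate(ann: Annotation, target_id: str,
                       orthology_type: str) -> Optional[Annotation]:
    """Return the projected IEA annotation, or None if the rules reject it."""
    if orthology_type not in VERTEBRATE_ORTHOLOGY_TYPES:
        return None
    if not is_projectable(ann):
        return None
    # The projected annotation records its source protein in 'With'.
    return Annotation(target_id, ann.go_id, "IEA",
                      ann.qualifier, with_from=ann.protein_id)
```

With these definitions, an IDA annotation on a source protein is projected onto its one-to-one ortholog as an IEA annotation, while annotations with other evidence codes, a 'NOT' qualifier, or the protein binding term yield `None`.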

In August 2011, the EnsemblPlants and Gramene groups created a new pipeline that uses Compara orthologies to project GO terms between plant species, and in June 2012 the EnsemblFungi group began projecting GO terms between fungal species.

For both EnsemblPlants/Gramene and EnsemblFungi, one-to-one, one-to-many and many-to-many orthologies are used, but annotations are only projected between orthologs that have at least 40% peptide identity to each other. Only GO annotations with an evidence code of IDA, IEP, IGI, IMP or IPI are projected. Annotations with a 'NOT' qualifier are not projected, nor are annotations to the term GO:0005515 (protein binding). GO annotations projected using this technique also receive the evidence code IEA.
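The ortholog-pair filter for the plant and fungal pipelines could be expressed as follows. This is a sketch under stated assumptions: the orthology type strings and the function name are invented for illustration, and "at least 40% peptide identity to each other" is interpreted here as requiring both directions of the pairwise identity to reach the threshold, which the text does not spell out.

```python
# Assumed labels for the three orthology classes named in the text.
PLANT_FUNGAL_ORTHOLOGY_TYPES = {
    "ortholog_one2one",
    "ortholog_one2many",
    "ortholog_many2many",
}
# Minimum percent peptide identity required between the two orthologs.
MIN_PERCENT_IDENTITY = 40.0


def pair_is_eligible(orthology_type: str,
                     source_identity: float,
                     target_identity: float) -> bool:
    """Check whether an ortholog pair qualifies for GO term projection.

    source_identity and target_identity are percent identities (0-100)
    measured relative to the source and target peptide respectively.
    """
    if orthology_type not in PLANT_FUNGAL_ORTHOLOGY_TYPES:
        return False
    # Assumption: the weaker of the two identities must meet the threshold.
    return min(source_identity, target_identity) >= MIN_PERCENT_IDENTITY
```

A pair that passes this check would still need each annotation to pass the evidence code, 'NOT' qualifier and GO:0005515 filters before being projected.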

All projections and the resulting annotations are updated monthly.

In the UniProt-GOA gene association files, column 8 ('With') gives either the Ensembl protein identifier or the model organism database identifier of the annotation source, and column 15 ('Source DB') contains the value 'Ensembl' (for the vertebrate pipeline) or 'EnsemblPlants/Gramene' (for the plant pipeline). Column 6 ('DB:Reference') gives the GO reference for the method: GO_REF:0000019 for the vertebrate pipeline and GO_REF:0000035 for the plant pipeline.
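As a rough illustration of how these columns might be used, the sketch below reads one tab-separated line of a gene association file and reports which projection pipeline produced it. The function name and the returned dictionary keys are assumptions for this example; only the column positions and GO_REF values come from the description above.

```python
from typing import Dict, Optional

# GO references that identify each projection pipeline (column 6).
PIPELINE_BY_GO_REF = {
    "GO_REF:0000019": "vertebrate",
    "GO_REF:0000035": "plant",
}


def classify_projection(gaf_line: str) -> Optional[Dict[str, str]]:
    """Return pipeline details for a projected annotation, else None."""
    if gaf_line.startswith("!"):
        return None  # header or comment line
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:
        return None  # not a complete annotation row
    # Column 6 may hold several references separated by '|'.
    for reference in cols[5].split("|"):
        pipeline = PIPELINE_BY_GO_REF.get(reference)
        if pipeline is not None:
            return {
                "pipeline": pipeline,
                "with": cols[7],        # column 8: source identifier
                "source_db": cols[14],  # column 15: Source DB
            }
    return None
```

Lines that carry neither GO reference, such as manually curated annotations, return `None` and can be skipped by the caller.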

The annotations from these projections can also be viewed in QuickGO:
vertebrate GO annotation projections
plant GO annotation projections

Figure 1 shows examples of projected GO annotations from the vertebrate and plant pipelines.

Figure 1. Representation of the projected GO annotations in QuickGO. a) An annotation produced by the vertebrate pipeline; the GO term has been projected from a mouse protein to a human protein. b) Annotations produced by the plant pipeline; the GO terms have been projected from Arabidopsis proteins to a grape protein.


