About GOA

Manual Annotation Efforts

Manual annotation is the direct assignment of GO terms to proteins, ncRNAs and protein complexes by curators, based on evidence extracted during the review of published scientific literature. Each annotation is given an appropriate evidence code, which provides an assessment of the strength of the evidence.

Manual curation by the GOA group follows these rules:

  1. The GOA project provides GO annotation to the UniProt Knowledgebase. UniProtKB accessions are the primary sequence identifiers used for proteins. We also annotate protein isoforms (e.g. Q4VCS5-2), post-processed chains (e.g. PRO_0000030311), protein complexes using Complex Portal identifiers (e.g. CPX-593) and ncRNAs using RNAcentral identifiers (e.g. URS00000064B1_559292).
  2. Papers are read in full and data may be extracted from any section, including the Supplementary Materials.
  3. The curator always assigns the most specific term that describes the biology demonstrated in the paper, after reading the evidence the paper presents.
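The identifier types listed in rule 1 can be told apart by their shape. The following is a minimal sketch in Python, not part of the GOA tooling: the function name classify_identifier is hypothetical, the UniProtKB accession pattern follows the format published by UniProt, and the remaining patterns are assumptions inferred only from the example identifiers given above.

```python
import re

# UniProtKB accession format as published by UniProt.
UNIPROT_ACC = (
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)

# The patterns below (other than the plain accession) are assumptions
# inferred from the examples in the text, not official specifications.
PATTERNS = [
    ("protein isoform", re.compile(rf"^{UNIPROT_ACC}-\d+$")),
    ("protein", re.compile(rf"^{UNIPROT_ACC}$")),
    ("post-processed chain", re.compile(r"^PRO_\d{10}$")),
    ("protein complex", re.compile(r"^CPX-\d+$")),
    ("ncRNA", re.compile(r"^URS[0-9A-F]{10}_\d+$")),
]


def classify_identifier(identifier: str) -> str:
    """Return the kind of annotated object an identifier appears to refer to."""
    for kind, pattern in PATTERNS:
        if pattern.match(identifier):
            return kind
    return "unknown"


if __name__ == "__main__":
    for example in ("Q4VCS5-2", "PRO_0000030311", "CPX-593", "URS00000064B1_559292"):
        print(example, "->", classify_identifier(example))
```

A check of this kind is only a syntactic filter; it does not confirm that an identifier actually exists in UniProtKB, Complex Portal or RNAcentral.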

If you are interested in any gene products which have not been manually annotated recently, please e-mail goa@ebi.ac.uk and we will endeavour to add them to our priority lists.

For additional information, please see the tutorial and webinar on QuickGO.

Electronic Annotation Methods

The majority (more than 99%) of GO annotations in the GOA database are made using electronic annotation methods.

Several different techniques are used to associate GO terms with gene products, including:

    - projection of annotations from one species to another based on orthology
    - prediction of GO terms based on manually-curated rules
    - prediction of GO terms based on sequence features
    - mapping of corresponding concepts in other controlled vocabularies to GO terms

Each GO annotation created by an electronic method has the following attributes:

    - an ECO (evidence) code indicating that an automated assertion method was used
    - a GO_REF reference that gives an overview of the prediction / association methodology
    - a with/from attribute that gives additional, context-dependent information about the source of the association
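As an illustration of the three attributes listed above, an electronically inferred annotation can be modelled as a small record. This is a sketch under stated assumptions, not the GOA data model: the class name, field names and the missing_attributes helper are hypothetical, and the values in the usage example are placeholders chosen only to show the expected shape of each attribute.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ElectronicAnnotation:
    """Hypothetical record for one electronically inferred GO annotation."""
    gene_product_id: str  # e.g. a UniProtKB accession
    go_term: str          # GO term identifier
    eco_code: str         # ECO code for an automated assertion method
    reference: str        # GO_REF describing the methodology
    with_from: str        # context-dependent source of the association


def missing_attributes(annotation: ElectronicAnnotation) -> List[str]:
    """List the attributes that do not have the expected prefix or are empty."""
    problems = []
    if not annotation.eco_code.startswith("ECO:"):
        problems.append("eco_code")
    if not annotation.reference.startswith("GO_REF:"):
        problems.append("reference")
    if not annotation.with_from:
        problems.append("with_from")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only; not a real annotation.
    example = ElectronicAnnotation(
        gene_product_id="P12345",
        go_term="GO:0005515",
        eco_code="ECO:0000501",
        reference="GO_REF:0000002",
        with_from="InterPro:IPR000001",
    )
    print(missing_attributes(example))
```

The check is deliberately shallow: it looks only at prefixes and emptiness, and does not validate that the ECO code denotes an automated method or that the GO_REF exists.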

We currently integrate annotations from the following nine electronic annotation pipelines:

    Source                              Description/Reference
    InterPro2GO                         GO_REF:0000002
    UniProt Keywords2GO                 GO_REF:0000043
    UniProt Subcellular Location2GO     GO_REF:0000044
    EC2GO                               GO_REF:0000003
    UniRule2GO                          GO_REF:0000104
    Ensembl & EnsemblGenomes            GO_REF:0000107
    UniPathway2GO                       GO_REF:0000041
    Gene Ontology Consortium            GO_REF:0000108
    RNAcentral                          GO_REF:0000115
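
Since each pipeline in the table above is identified by its GO_REF, the GO_REF on an annotation can be used to work out which pipeline produced it. The following is a minimal sketch of that lookup; the dictionary simply transcribes the table, and the names PIPELINE_GO_REFS and source_for_go_ref are hypothetical.

```python
from typing import Optional

# Transcribed from the table of electronic annotation pipelines.
PIPELINE_GO_REFS = {
    "InterPro2GO": "GO_REF:0000002",
    "UniProt Keywords2GO": "GO_REF:0000043",
    "UniProt Subcellular Location2GO": "GO_REF:0000044",
    "EC2GO": "GO_REF:0000003",
    "UniRule2GO": "GO_REF:0000104",
    "Ensembl & EnsemblGenomes": "GO_REF:0000107",
    "UniPathway2GO": "GO_REF:0000041",
    "Gene Ontology Consortium": "GO_REF:0000108",
    "RNAcentral": "GO_REF:0000115",
}

# Inverse mapping: GO_REF -> pipeline name.
GO_REF_TO_PIPELINE = {go_ref: name for name, go_ref in PIPELINE_GO_REFS.items()}


def source_for_go_ref(go_ref: str) -> Optional[str]:
    """Return the pipeline name for a GO_REF, or None if it is not in the table."""
    return GO_REF_TO_PIPELINE.get(go_ref)


if __name__ == "__main__":
    print(source_for_go_ref("GO_REF:0000043"))
```

A GO_REF that is absent from the table returns None, which is the expected result for any reference outside these nine pipelines, for example one attached to a manual annotation.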