Manual Annotation Efforts
Manual annotation is the direct
assignment of GO terms to proteins, ncRNAs and protein complexes by curators, based on evidence
extracted during the review of published scientific literature. Each annotation is given an
appropriate evidence code that indicates the strength of the
evidence.
Manual curation by the GOA group follows these rules:
- The GOA project provides GO annotation to the UniProt Knowledgebase. UniProtKB accessions are the primary sequence identifiers used for proteins. We also annotate to protein isoforms (e.g. Q4VCS5-2), to post-processed chains (PRO_0000030311), to protein complexes using Complex Portal identifiers (CPX-593) and to ncRNAs using RNAcentral identifiers (URS00000064B1_559292).
- Papers are read in full and data may be extracted from any section, including the Supplementary Materials.
- The curator always assigns the most specific term that describes the biology demonstrated in the paper, and does so only after reading the evidence the paper presents.
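The identifier types listed above can be told apart by their shape. The following is a minimal sketch, assuming patterns inferred only from the example identifiers in this section; the regular expressions and the `classify_identifier` helper are illustrative and are not official format specifications from UniProt, Complex Portal or RNAcentral.

```python
import re

# Hypothetical patterns, inferred from the example identifiers in the text.
# They are deliberately loose and are NOT authoritative format definitions.
PATTERNS = [
    ("protein isoform", re.compile(r"[A-Z0-9]{6,10}-\d+")),   # e.g. Q4VCS5-2
    ("post-processed chain", re.compile(r"PRO_\d+")),          # e.g. PRO_0000030311
    ("protein complex", re.compile(r"CPX-\d+")),               # e.g. CPX-593
    ("ncRNA", re.compile(r"URS[0-9A-F]+_\d+")),                # e.g. URS00000064B1_559292
    ("protein", re.compile(r"[A-Z0-9]{6,10}")),                # plain accession
]


def classify_identifier(identifier: str) -> str:
    """Return a label for the kind of annotated object, or 'unknown'."""
    for label, pattern in PATTERNS:
        # fullmatch requires the whole identifier to fit the pattern.
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


if __name__ == "__main__":
    for example in ["Q4VCS5-2", "PRO_0000030311", "CPX-593", "URS00000064B1_559292"]:
        print(example, "->", classify_identifier(example))
```

A check like this is only useful for routing identifiers to the right resource; it does not confirm that an identifier actually exists.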
If you are interested in any gene
products which have not been manually annotated recently, please e-mail goa@ebi.ac.uk
and we will endeavour to add them to our priority lists.
For additional information, please see the tutorial and webinar on QuickGO.
Electronic Annotation Methods
The majority - more than 99% - of GO annotations in the GOA database are made using electronic annotation methods.
There are a number of different techniques used to associate GO terms with gene products, including:
- projection of annotations from one species to another based on
orthology
- prediction of GO terms based on manually-curated rules
- prediction of GO terms based on sequence features
- mapping of corresponding concepts in other controlled vocabularies to GO
terms
Each GO annotation created by an electronic method has the following attributes:
- an ECO (evidence) code indicating that an automated assertion method was
used
- a GO_REF reference that gives an overview of the prediction / association
methodology
- a with/from attribute that gives additional, context-dependent information
about the source of the association
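The three attributes above can be pictured as fields on a record. This is a minimal sketch, assuming a hypothetical `ElectronicAnnotation` class and a hypothetical `missing_attributes` check; the field names and the prefix tests (`ECO:`, `GO_REF:`) are assumptions made for illustration and do not reproduce the GOA file formats or validation rules. The values in the usage example are placeholders that show the shape of the data only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ElectronicAnnotation:
    """Hypothetical record for one electronically inferred GO annotation."""
    gene_product: str  # identifier of the annotated object
    go_term: str       # GO term identifier
    eco_code: str      # ECO code for the automated assertion method
    reference: str     # GO_REF describing the prediction methodology
    with_from: str     # context-dependent source of the association


def missing_attributes(annotation: ElectronicAnnotation) -> List[str]:
    """List the expected attributes that look absent or malformed."""
    problems = []
    if not annotation.eco_code.startswith("ECO:"):
        problems.append("eco_code")
    if not annotation.reference.startswith("GO_REF:"):
        problems.append("reference")
    if not annotation.with_from:
        problems.append("with_from")
    return problems


if __name__ == "__main__":
    # Placeholder values, chosen only to illustrate the structure.
    example = ElectronicAnnotation(
        gene_product="Q4VCS5",
        go_term="GO:0000000",
        eco_code="ECO:0000501",
        reference="GO_REF:0000002",
        with_from="InterPro:IPR000000",
    )
    print(missing_attributes(example))
```

Keeping the record immutable (`frozen=True`) is a design choice for the sketch: an annotation's provenance should not change after it has been loaded.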
We currently integrate annotations from the following nine electronic annotation pipelines: