Manual Annotation Efforts
Manual annotation is the direct assignment of GO terms to proteins, ncRNAs and protein complexes by curators, based on evidence extracted from published scientific literature. Each annotation is assigned an appropriate evidence code, which gives an assessment of the strength of that evidence.
Manual curation by the GOA group follows these rules:
- The GOA project provides GO annotation to the UniProt Knowledgebase (UniProtKB). UniProtKB accessions are the primary sequence identifiers used for proteins. We also annotate to protein isoforms (e.g. Q4VCS5-2), to post-processed chains (e.g. PRO_0000030311), to protein complexes using Complex Portal identifiers (e.g. CPX-593) and to ncRNAs using RNAcentral identifiers (e.g. URS00000064B1_559292).
- Papers are read in full, and data may be extracted from any section, including the Supplementary Materials.
- Having read the evidence presented in a paper, the curator always assigns the most specific term that describes the biology demonstrated in that paper.
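The identifier types listed above can be told apart by their shape. The following is a minimal sketch, not a GOA tool: the UniProtKB accession pattern follows the format published by UniProt, while the patterns for post-processed chains, Complex Portal and RNAcentral identifiers are assumptions inferred only from the examples given here. The names `PATTERNS` and `classify_identifier` are hypothetical.

```python
import re

# UniProtKB accession format as documented by UniProt.
UNIPROT_ACC = (
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)

# Order matters only for readability: fullmatch keeps the patterns exclusive.
# The last three patterns are assumptions based on the example identifiers.
PATTERNS = [
    ("protein isoform", re.compile(rf"{UNIPROT_ACC}-\d+")),
    ("protein", re.compile(UNIPROT_ACC)),
    ("post-processed chain", re.compile(r"PRO_\d{10}")),
    ("protein complex", re.compile(r"CPX-\d+")),
    ("ncRNA", re.compile(r"URS[0-9A-F]{10}_\d+")),
]


def classify_identifier(identifier: str) -> str:
    """Return the kind of annotated object an identifier appears to denote."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


if __name__ == "__main__":
    for example in ("Q4VCS5-2", "PRO_0000030311", "CPX-593", "URS00000064B1_559292"):
        print(example, "->", classify_identifier(example))
```

A shape check of this kind only suggests which resource an identifier belongs to; it does not confirm that the identifier exists in that resource.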
If you are interested in any gene products which have not been manually annotated recently, please e-mail email@example.com and we will endeavour to add them to our priority lists.
For additional information, please see the tutorial and webinar on QuickGO.
Electronic Annotation Methods
More than 99% of GO annotations in the GOA database are made using electronic annotation methods.
Several different techniques are used to associate GO terms with gene products, including:
- projection of annotations from one species to another based on orthology
- prediction of GO terms based on manually-curated rules
- prediction of GO terms based on sequence features
- mapping of corresponding concepts in other controlled vocabularies to GO terms
Each GO annotation created by an electronic method has the following attributes:
- an ECO (evidence) code indicating that an automated assertion method was used
- a GO_REF reference that gives an overview of the prediction / association methodology
- a with/from attribute that gives additional, context-dependent information about the source of the association
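To show where these three attributes sit in practice, here is a minimal sketch that reads them from one line of a GAF (Gene Association File) export. It assumes the GAF column order, where column 6 is the reference, column 7 the evidence code and column 8 the with/from field, and it assumes that electronic annotations appear there with the GO evidence code IEA rather than an ECO identifier. The `Annotation` class, `parse_gaf_line` function and the sample line are hypothetical; the sample values are placeholders, not a real annotation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Annotation:
    object_id: str
    go_id: str
    reference: str
    evidence_code: str
    with_from: Tuple[str, ...]

    @property
    def is_electronic(self) -> bool:
        # Assumption: GAF files mark electronic annotations with code IEA.
        return self.evidence_code == "IEA"


def parse_gaf_line(line: str) -> Annotation:
    """Extract the reference, evidence code and with/from from a GAF line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    # Multiple with/from values are split on '|' only in this sketch.
    with_from = tuple(value for value in cols[7].split("|") if value)
    return Annotation(
        object_id=cols[1],
        go_id=cols[4],
        reference=cols[5],
        evidence_code=cols[6],
        with_from=with_from,
    )


# Placeholder line for illustration only; not taken from the GOA database.
SAMPLE_LINE = "\t".join([
    "UniProtKB", "P12345", "EXAMPLE", "located_in", "GO:0005737",
    "GO_REF:0000107", "IEA", "Ensembl:ENSEXAMPLE", "C", "", "",
    "protein", "taxon:9606", "20240101", "Ensembl", "", "",
])

if __name__ == "__main__":
    annotation = parse_gaf_line(SAMPLE_LINE)
    print(annotation.reference, annotation.evidence_code, annotation.with_from)
```

Filtering a file on `is_electronic` and grouping by `reference` would be one way to see which pipeline each electronic annotation came from.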
We currently integrate annotations from the following nine electronic annotation pipelines:
|UniProt Subcellular Location2GO||GO_REF:0000039|
|Ensembl & EnsemblGenomes||GO_REF:0000107|
|Gene Ontology Consortium||GO_REF:0000108|