
UniProt-GOA README

1.  Contents
------------

1.  Contents
2.  Introduction
3.  Differences in the UniProtKB gene association file from GO and GOA ftp sites
4.  List of files and file formats
5.  Assignment of GO terms to UniProtKB data
6.  Additional information on Manual Annotation in UniProt-GOA
7.  Addition of GO assignments from other data sources
8.  Further information on the PDB association file
9.  Contacts
10. Copyright Notice


2.  Introduction
----------------

UniProt-GOA (GO Annotation@EBI) is a project run by the European
Bioinformatics Institute that aims to provide assignments
of proteins to the Gene Ontology (GO) resource.  The
goal of the Gene Ontology Consortium is to produce a dynamic
controlled vocabulary that can be applied to all organisms,
even while the knowledge of the gene product roles in cells is
still accumulating and changing.

In the UniProt-GOA project, this vocabulary is applied to all proteins
described in the UniProt (Swiss-Prot and TrEMBL) Knowledgebase.

UniProt-GOA also provides species-specific annotation sets
using the UniProtKB Complete Proteome sets that have undergone an
additional electronic annotation filtering to remove redundancy.
Currently Arabidopsis thaliana, Gallus gallus, Bos taurus, Dictyostelium discoideum,
Drosophila melanogaster, Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio,
Canis familiaris, Caenorhabditis elegans, Saccharomyces cerevisiae and Sus scrofa
datasets are available.

Additional, non-filtered species-specific sets are available from the proteomes
sets, which include separate annotation files for all species whose
genome has been fully sequenced, where the sequence is publicly
available, and where the proteome contains >25% GO annotation.

UniProtKB manual GO annotations are created by UniProtKB curators from the EBI
and the Swiss Institute of Bioinformatics.
The dataset is supplemented with manual GO annotation from external
model organism databases: AgBase, BHF-UCL, CGD, DictyBase, Ensembl, FlyBase, GDB,
GeneDB (S.pombe, P.falciparum), Gramene, HGNC, MGI, MTBbase, PAMGO, Reactome,
RefGenome, RGD, Roslin, SGD, TAIR, TIGR, WormBase, ZFIN, the IntAct
protein-protein interaction database, LIFEdb and the Proteome Inc dataset
(see section 7). The original source of an annotation is always indicated
in column 15 ('assigned by') of an association file.

For manual annotation, curators aim to capture the most recent data
from curated papers that provide experimental evidence for the unique
features of a given protein. Our approach is protein-centric rather
than paper-centric, as we don't read all papers that might be used
to assign the same GO term. However, when a curator reads further
experimental evidence that verifies a function, additional
annotations to the same term are created using the different
references, as this can provide greater confidence in a GO annotation.

For further information please refer to our web site at:
http://www.ebi.ac.uk/GOA

External Contributors to the UniProt-GOA Gene Association Files:

AgBase                                              http://www.agbase.msstate.edu
BHF-UCL                                             http://www.cardiovasculargeneontology.com
CGD                                                 http://www.candidagenome.org
DictyBase                                           http://dictybase.org
EcoCyc                                              http://www.ecocyc.org
EcoWiki                                             http://ecowiki.org
Ensembl                                             http://www.ensembl.org
FlyBase                                             http://www.flybase.org
GDB  (Human Genome Database)                        http://www.gdb.org
GeneDB (S.pombe)                                    http://www.genedb.org/genedb/pombe
GeneDB (P. falciparum)                              http://www.genedb.org/genedb/malaria
GOC (inferred annotations from GO OBO v1.2)         http://www.geneontology.org
Gramene                                             http://www.gramene.org
HGNC (HUGO Gene Nomenclature Committee)             http://www.gene.ucl.ac.uk/nomenclature
Human Protein Atlas                                 http://www.proteinatlas.org
IntAct                                              http://www.ebi.ac.uk/intact (see also section 7.)
InterPro                                            http://www.ebi.ac.uk/interpro
LifeDB                                              http://www.lifedb.de
MGI (Mouse Genome Informatics)                      http://www.informatics.jax.org
MTBbase                                             http://www.ark.in-berlin.de/Site/MTBbase.html
PAMGO project; Agrobacterium Genome Consortium      http://www.agrobacterium.org
Proteome Inc. (see section 7.)
Reactome                                            http://www.reactome.org
RefGenome (GO Consortium Reference Genomes project) http://www.geneontology.org/GO.refgenome.shtml
RGD (Rat Genome Database)                           http://rgd.mcw.edu
Roslin Institute                                    http://www.ri.bbsrc.ac.uk
SGD (Saccharomyces Genome Database)                 http://www.yeastgenome.org
TAIR (The Arabidopsis Information Resource)         http://www.arabidopsis.org
TIGR (The Institute for Genomic Research)           http://www.tigr.org
WormBase                                            http://www.wormbase.org
ZFIN (Zebrafish Information Network)                http://zfin.org


3. Differences in the UniProtKB gene association file from GO and GOA ftp sites.
------------------------------------------------------------------------------
Please note that both the filtered and unfiltered versions of the GOA UniProtKB
gene association file are available from the GO Consortium ftp site
(ftp.geneontology.org). The filtered version does not contain annotations for
those species where a different Consortium group is primarily responsible
for providing GO annotations.

If you would like to download an unfiltered GOA UniProtKB gene association
file, please use either the GOA ftp site:
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gene_association.goa_uniprot.gz

Or the submissions folder in the GO Consortium ftp site:
ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_uniprot.gz

Species which are not present in the filtered version of the gene_association.goa_uniprot.gz
file on the GO Consortium site include:

Danio rerio, Drosophila melanogaster, Mus musculus, Rattus norvegicus,
Arabidopsis thaliana, all rice species,
Bacillus anthracis str. Ames, Campylobacter jejuni RM1221, Candida albicans,
Caenorhabditis elegans, Coxiella burnetii RSA 493, Dehalococcoides ethenogenes 195,
Dictyostelium sp., Dictyostelium discoideum, Geobacter sulfurreducens PCA,
Glossina morsitans morsitans, Leishmania major, Listeria monocytogenes str. 4b F2365,
Methylococcus capsulatus str. Bath, Pseudomonas syringae pv. tomato str. DC3000,
Plasmodium falciparum, Saccharomyces cerevisiae, Schizosaccharomyces pombe,
Shewanella oneidensis MR-1, Silicibacter pomeroyi DSS-3, Trypanosoma brucei and
Vibrio cholerae O1 biovar eltor.

Further information on this filtering script can be found at:
http://www.geneontology.org/GO.annotation.shtml#taxon
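
The effect of taxon-based filtering can be illustrated with a minimal Python
sketch. This is not the GO Consortium's filtering script: the function name
drop_excluded_taxa and the EXCLUDED_TAXA set are hypothetical, the NCBI taxonomy
identifiers are not taken from this README, and only a few of the species listed
above are covered. The sketch assumes tab-separated lines with the taxon in
column 13 (Taxon_ID), as described in section 4.

```python
from typing import Iterable, Iterator, Set

# NCBI taxonomy IDs for a few of the species listed above. These IDs are not
# given in this README; the set is illustrative and deliberately incomplete.
EXCLUDED_TAXA: Set[str] = {
    "taxon:7955",   # Danio rerio
    "taxon:7227",   # Drosophila melanogaster
    "taxon:10090",  # Mus musculus
    "taxon:10116",  # Rattus norvegicus
    "taxon:3702",   # Arabidopsis thaliana
}


def drop_excluded_taxa(lines: Iterable[str],
                       excluded: Set[str] = EXCLUDED_TAXA) -> Iterator[str]:
    """Yield only the lines whose first taxon (column 13) is not excluded.

    Comment lines starting with '!' and lines with fewer than 13 columns
    are passed through unchanged.
    """
    for line in lines:
        if line.startswith("!"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        taxon = fields[12].split("|")[0] if len(fields) > 12 else ""
        if taxon not in excluded:
            yield line
```

Applied to the lines of an unfiltered file, the generator keeps header comments
and drops every annotation whose taxon is in the excluded set.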

4.  List of files and file formats
----------------------------------

The UniProt-GOA project produces the following gene association files:

i) gene_association.goa_uniprot

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_uniprot.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gene_association.goa_uniprot.gz
This file contains all GO annotations for proteins in the UniProt KnowledgeBase (UniProtKB).

ii) gene_association.goa_human

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.goa_human.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/gene_association.goa_human.gz
* Please note that the human file on the ftp.geneontology.org site may contain newer annotations (by up to two
weeks) than the human file on the ftp.ebi.ac.uk site listed above, because the human and chicken files are
released more frequently for the GO Consortium's Reference Genomes Project.
This file contains the GO assignments for the proteins of the human UniProtKB Complete Proteome.

iii) gene_association.goa_mouse

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_mouse.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/MOUSE/gene_association.goa_mouse.gz
This file contains the GO assignments for the proteins of the mouse UniProtKB Complete Proteome.

iv) gene_association.goa_rat

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_rat.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/RAT/gene_association.goa_rat.gz
This file contains the GO assignments for the proteins of the rat UniProtKB Complete Proteome.

v) gene_association.goa_arabidopsis

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_arabidopsis.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/ARABIDOPSIS/gene_association.goa_arabidopsis.gz
This file contains the GO assignments for the proteins of the Arabidopsis UniProtKB Complete Proteome.

vi) gene_association.goa_chicken

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.goa_chicken.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/CHICKEN/gene_association.goa_chicken.gz
* Please note that the chicken file on the ftp.geneontology.org site may contain newer annotations (by up to two
weeks) than the chicken file on the ftp.ebi.ac.uk site listed above, because the human and chicken files are
released more frequently for the GO Consortium's Reference Genomes Project.
This file contains the GO assignments for the proteins of the chicken UniProtKB Complete Proteome.

vii) gene_association.goa_cow

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.goa_cow.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/COW/gene_association.goa_cow.gz
This file contains the GO assignments for the proteins of the cow UniProtKB Complete Proteome.

viii) gene_association.goa_zebrafish

Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_zebrafish.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/ZEBRAFISH/gene_association.goa_zebrafish.gz
This file contains the GO assignments for the proteins of the zebrafish UniProtKB Complete Proteome.

ix) gene_association.goa_dicty.gz
Location: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/DICTY/gene_association.goa_dicty.gz
This file contains the GO assignments for the proteins of the Dictyostelium discoideum UniProtKB Complete Proteome.

x) gene_association.goa_dog.gz
Locations: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/DOG/gene_association.goa_dog.gz
ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_dog.gz
This file contains the GO assignments for the proteins of the Canis familiaris UniProtKB Complete Proteome.

xi) gene_association.goa_fly.gz
Location: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/FLY/gene_association.goa_fly.gz
This file contains the GO assignments for the proteins of the Drosophila melanogaster UniProtKB Complete Proteome.

xii) gene_association.goa_pig.gz
Locations: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/PIG/gene_association.goa_pig.gz
ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_pig.gz
This file contains the GO assignments for the proteins of the Sus scrofa UniProtKB Complete Proteome.

xiii) gene_association.goa_worm.gz
Location: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/WORM/gene_association.goa_worm.gz
This file contains the GO assignments for the proteins of the Caenorhabditis elegans UniProtKB Complete Proteome.

xiv) gene_association.goa_yeast.gz
Location: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/YEAST/gene_association.goa_yeast.gz
This file contains the GO assignments for the proteins of the Saccharomyces cerevisiae UniProtKB Complete Proteome.

xv) gene_association.goa_pdb
Locations: ftp://ftp.geneontology.org/pub/go/gene-associations/submission/gene_association.goa_pdb.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/PDB/gene_association.goa_pdb.gz
This file contains the GO assignments for the PDB entries in the PDB database.

We comply with the file format described by the Gene Ontology
Consortium for annotation files.
GO Consortium documentation: http://www.geneontology.org/GO.format.annotation.shtml

N.B. This readme describes the GAF 2.0 file format.

Since we deal with proteins rather than genes, the semantics of some
fields in our files may be slightly different to other gene association files.

1.  DB
Database from which annotated entry has been taken.
For the UniProtKB and UniProtKB Complete Proteomes gene association files: UniProtKB
For the PDB association file:  PDB
Example: UniProtKB

2.  DB_Object_ID
A unique identifier in the database for the item being annotated.
Here: an accession number or identifier of the annotated protein
(or PDB entry for the gene_association.goa_pdb file)
For the UniProtKB and UniProtKB Complete Proteomes gene association files: a UniProtKB Accession.
Example: O00165

3.  DB_Object_Symbol
A (unique and valid) symbol (gene name) that corresponds to the DB_Object_ID.
An officially approved gene symbol will be used in this field when available.
Alternatively, other gene symbols or locus names are applied.
If no symbols are available, the identifier applied in column 2 will be used.
Examples: G6PC
CYB561
MGCQ309F3

4.  Qualifier
This column is used for flags that modify the interpretation of an
annotation.
If not null, then values in this field can equal: NOT, colocalizes_with, contributes_to,
NOT | contributes_to, NOT | colocalizes_with
Example: NOT

5.  GO ID
The GO identifier for the term attributed to the DB_Object_ID.
Example: GO:0005634

6.  DB:Reference
A single reference cited to support an annotation.
Where an annotation cannot reference a paper, this field will contain
a GO_REF identifier. See section 5 and
http://www.geneontology.org/doc/GO.references
for an explanation of the reference types used.
Examples: PMID:9058808
DOI:10.1046/j.1469-8137.2001.00150.x
GO_REF:0000002
GO_REF:0000020
GO_REF:0000004
GO_REF:0000003
GO_REF:0000019
GO_REF:0000023
GO_REF:0000024
GO_REF:0000033

7.  Evidence
One of either EXP, IMP, IC, IGI, IPI, ISS, IDA, IEP, IEA, TAS, NAS,
NR, ND or RCA.
Example: TAS

8.  With
An additional identifier to support annotations using certain
evidence codes (including IEA, IPI, IGI, IMP, IC and ISS evidences).
Examples: UniProtKB:O00341
InterPro:IPR001878
RGD:123456
CHEBI:12345
Ensembl:ENSG00000136141
GO:0000001
EC:3.1.22.1

9.  Aspect
One of the three ontologies, corresponding to the GO identifier applied.
P (biological process), F (molecular function) or C (cellular component).
Example: P

10. DB_Object_Name
Name of protein
The full UniProt protein name will be present here,
if available from UniProtKB. If a name cannot be added, this field
will be left empty.
Examples: Glucose-6-phosphatase
Cellular tumor antigen p53
Coatomer subunit beta

11. Synonym
Gene_symbol [or other text]
Alternative gene symbol(s), IPI identifier(s) and UniProtKB/Swiss-Prot identifiers are
provided pipe-separated, if available from UniProtKB. If none of these identifiers
have been supplied, the field will be left empty.
Example:  RNF20|BRE1A|IPI00690596|BRE1A_BOVIN
IPI00706050
MMP-16|IPI00689864

12. DB_Object_Type
What kind of entity is being annotated.
Here: protein (or protein_structure for the
gene_association.goa_pdb file).
Example: protein

13. Taxon_ID
Identifier for the species being annotated.
Example: taxon:9606

14. Date
The date of last annotation update in the format 'YYYYMMDD'
Example: 20050101

15. Assigned_By
Attribute describing the source of the annotation. One of
either UniProtKB, AgBase, BHF-UCL, CGD, DictyBase, EcoCyc, EcoWiki, Ensembl,
FlyBase, GDB, GeneDB_Spombe, GeneDB_Pfal, GOC, GR (Gramene), HGNC, Human Protein Atlas,
JCVI, IntAct, InterPro, LIFEdb, PAMGO_GAT, MGI, Reactome, RGD,
Roslin Institute, SGD, TAIR, TIGR, ZFIN, PINC (Proteome Inc.) or WormBase.
Example: UniProtKB

16. Annotation_Extension 
Contains cross references to other ontologies/databases that can be used to qualify or 
enhance the GO term applied in the annotation.
The cross-reference is prefaced by an appropriate GO relationship; references to multiple ontologies
can be entered.
Example: part_of(CL:0000084)
occurs_in(GO:0009536)
has_input(CHEBI:15422)
has_output(CHEBI:16761)
has_participant(UniProtKB:Q08722)
part_of(CL:0000017)|part_of(MA:0000415)

17. Gene_Product_Form_ID
The unique identifier of a specific splice form of the protein described in column 2 (DB_Object_ID).
Example: O43526-2
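
The 17 columns described above can be read with a short Python sketch. It
assumes one annotation per line, tab-separated columns, and header or comment
lines beginning with '!'. The names GAF_COLUMNS, MULTI_VALUED, parse_gaf_line
and read_gaf are hypothetical, and treating the Qualifier, With, Synonym and
Annotation_Extension columns as pipe-separated lists is an assumption based on
the examples in this section. SAMPLE_LINE is assembled from example values
given above and is not a real annotation.

```python
from typing import Dict, Iterable, Iterator

# Column names as listed in this README (GAF 2.0, 17 tab-separated columns).
GAF_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence", "With", "Aspect", "DB_Object_Name",
    "Synonym", "DB_Object_Type", "Taxon_ID", "Date", "Assigned_By",
    "Annotation_Extension", "Gene_Product_Form_ID",
]

# Columns assumed here to hold pipe-separated lists of values.
MULTI_VALUED = {"Qualifier", "With", "Synonym", "Annotation_Extension"}


def parse_gaf_line(line: str) -> Dict[str, object]:
    """Split one GAF 2.0 line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(f"expected 17 columns, got {len(fields)}")
    record: Dict[str, object] = {}
    for name, value in zip(GAF_COLUMNS, fields):
        if name in MULTI_VALUED:
            # Empty columns become empty lists; spaces around '|' are dropped.
            record[name] = [v.strip() for v in value.split("|") if v.strip()]
        else:
            record[name] = value
    return record


def read_gaf(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield parsed records, skipping blank lines and '!' comment lines."""
    for line in lines:
        if line.strip() and not line.startswith("!"):
            yield parse_gaf_line(line)


# Illustrative line built from example values in this README (not real data).
SAMPLE_LINE = "\t".join([
    "UniProtKB", "O00165", "G6PC", "NOT | contributes_to", "GO:0005634",
    "PMID:9058808", "TAS", "", "C", "Glucose-6-phosphatase", "RNF20|BRE1A",
    "protein", "taxon:9606", "20050101", "UniProtKB",
    "part_of(CL:0000017)|part_of(MA:0000415)", "",
])
```

For instance, parse_gaf_line(SAMPLE_LINE)["Qualifier"] gives the two flags as
separate list items, and read_gaf can be fed the lines of a decompressed
association file.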

5.  Assignment of GO terms to UniProtKB data
------------------------------------------------------------

In this release, we have used twelve data source types to assign GO terms to
proteins.

A) PMID:nnnnnnnn
All such annotations are manually curated and can contain any of the
evidence codes available, except 'IEA' (see section 4). Curators have
read the abstract or full paper with the PubMed identifier nnnnnnnn
and assigned the GO terms manually.

B) Digital Object Identifiers (DOI:10.nnnn/*)
All such annotations are manually curated and can contain any of the
evidence codes available, except 'IEA' (see section 4). Curators have
read the abstract or full paper with the DOI identifier
and assigned the GO terms manually.

C) Reactome:REACT_nnnn
All such annotations are manually curated by the Reactome team and apply the
TAS evidence code. Reactome entries are curated (from published papers and
expert knowledge), then peer reviewed by domain
experts.

D) GO_REF:0000002
Transitive assignment of GO terms based on InterPro classification.
For any protein that has been annotated with one or more InterPro
domains, the corresponding GO terms are obtained from a translation
table of InterPro entries to GO terms (interpro2go) generated
manually by the InterPro team at EBI. The mapping file is available at:
http://www.geneontology.org/external2go/interpro2go
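
As an illustration of this kind of transitive assignment, the Python sketch
below loads a mapping and applies it to a protein's InterPro matches. The
assumed line layout of the mapping file is
'InterPro:IPRnnnnnn name > GO:term name ; GO:nnnnnnn', with '!' marking comment
lines; please check this against the file itself. The function names are
hypothetical and this is not the UniProt-GOA pipeline. Using the 'IEA' evidence
code and placing the InterPro identifier in the 'with' column are assumptions
in line with the column descriptions in section 4.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def load_external2go(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each external ID (e.g. 'InterPro:IPR...') to a set of GO IDs.

    Assumed layout: 'InterPro:IPRnnnnnn name > GO:term name ; GO:nnnnnnn'.
    Lines starting with '!' are treated as comments.
    """
    mapping: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!") or " > " not in line:
            continue
        left, right = line.split(" > ", 1)
        external_id = left.split(None, 1)[0]          # first token on the left
        go_id = right.rsplit(";", 1)[-1].strip()      # text after the last ';'
        if go_id.startswith("GO:"):
            mapping[external_id].add(go_id)
    return dict(mapping)


def interpro_annotations(accession: str,
                         interpro_ids: Iterable[str],
                         mapping: Dict[str, Set[str]]
                         ) -> List[Tuple[str, str, str, str, str]]:
    """Return (accession, GO ID, reference, evidence, with) tuples."""
    rows = []
    for ipr in interpro_ids:
        for go_id in sorted(mapping.get(ipr, ())):
            # Evidence code and 'with' value are assumptions (see lead-in).
            rows.append((accession, go_id, "GO_REF:0000002", "IEA", ipr))
    return rows
```

The same loader could in principle be reused for the other external2go files
mentioned below, provided they follow the same layout.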

E) GO_REF:0000020
GO terms are manually assigned to each HAMAP family rule. HAMAP family
rules are a collection of orthologous microbial protein families,
from bacteria, archaea and plastids, generated manually by expert
curators. The assigned GO terms are then transferred to all the
proteins that belong to each HAMAP family. Only GO terms from the
molecular function and biological process ontologies are assigned.
GO annotations using this technique will receive the evidence code
Inferred from Electronic Annotation (IEA). These annotations are
updated monthly by HAMAP and are available for download on both
GO and GOA EBI ftp sites. HAMAP (High-quality Automated and
Manual Annotation of Microbial proteins) is a project based at
the Swiss Institute of Bioinformatics (Gattiker et al. 2003,
Comp. Biol and Chem. 27: 49-58).
For further information, please see: http://www.expasy.org/sprot/hamap

F) GO_REF:0000004
Transitive assignment using Swiss-Prot keywords. This method is used
for any database record that has one or more Swiss-Prot keywords assigned.
Each keyword is mapped to the corresponding GO term in the spkw2go file,
which was originally constructed manually by MGI curators and is now
maintained by the GOA team at EBI. The mapping file is available at:
http://www.geneontology.org/external2go/spkw2go

G) GO_REF:0000003
Transitive assignment using Enzyme Commission identifiers.
This method is used for any database entry, such as a protein record
in Swiss-Prot or TrEMBL, that has had an Enzyme Commission number
assigned. The corresponding GO term is determined using the EC
cross-references in the GO molecular function ontology.
Also see Hill et al., Genomics (2001) 74:121-128.
The mapping file is available at:
http://www.geneontology.org/external2go/ec2go

H) GO_REF:0000019
GO terms from a source species are projected onto one or more target
species based on gene orthology obtained from the Ensembl Compara system.
Only one to one and apparent one to one orthologies are used, and only GO
annotations with an evidence type of IDA, IEP, IGI, IMP or IPI are
projected. Projected GO annotations using this technique will receive the
evidence code 'IEA' (Inferred from Electronic Annotation). The UniProtKB
protein accession of the annotation source will be indicated in the 'With'
column of the GOA association file.
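
The projection rule can be sketched in Python as follows. The function name
project_annotations and the input shapes are hypothetical: source annotations
are given as (accession, GO ID, evidence code) tuples, and the orthology table
is assumed to have already been reduced to one-to-one pairs. Obtaining the
orthologies from Ensembl Compara is outside the scope of the sketch.

```python
from typing import Dict, Iterable, List, Tuple

# Evidence codes that may be projected, as stated above.
PROJECTABLE_CODES = frozenset({"IDA", "IEP", "IGI", "IMP", "IPI"})


def project_annotations(
    source_annotations: Iterable[Tuple[str, str, str]],
    one_to_one_orthologs: Dict[str, str],
) -> List[Dict[str, str]]:
    """Project experimental annotations from source proteins to orthologs.

    source_annotations: (source accession, GO ID, evidence code) tuples.
    one_to_one_orthologs: source accession -> target accession.
    """
    projected = []
    for accession, go_id, evidence in source_annotations:
        target = one_to_one_orthologs.get(accession)
        if target is None or evidence not in PROJECTABLE_CODES:
            continue
        projected.append({
            "DB_Object_ID": target,
            "GO_ID": go_id,
            "DB_Reference": "GO_REF:0000019",
            "Evidence": "IEA",
            # The source protein is recorded in the 'With' column.
            "With": "UniProtKB:" + accession,
        })
    return projected
```

Annotations with non-experimental evidence, and proteins without a one-to-one
ortholog in the table, are simply skipped.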

I) GO_REF:0000023
Transitive assignment of GO terms based on Swiss-Prot Subcellular Location
vocabulary annotation. The UniProt Consortium has developed a Subcellular
Location vocabulary (SPSL) to annotate UniProt Knowledgebase entries (in
CC_SUBC LOCATION lines). The UniProt-GOA curators at EBI have manually mapped this
vocabulary to the GO cellular component ontology. This mapping file, spsl2go,
is used to obtain corresponding GO terms for any UniProtKB entry that has
SPSL annotation; the mapping file is available from:
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/external2go/spsl2go

J) GO_REF:0000024
Method for transferring manual annotations to an entry based on a curator's
judgment of its similarity to a putative ortholog which has annotations with
experimental evidence. Annotations are created when a curator judges that the
sequence of a protein shows high similarity to another protein that has
annotation(s) supported by experimental evidence (IDA, IGI, IMP, IPI or IEP).
Annotations resulting from the transfer of GO terms display the 'ISS' evidence
code and include an accession for the protein from which the annotation was
projected in the 'with' field (column 8). This field can contain either a
UniProtKB Accession or an IPI (International Protein Index) identifier.
Further information on this method can be found at:
http://www.ebi.ac.uk/GOA/ISS_method.html

K) GO_REF:0000033
GO terms based on experimental data from the scientific literature are used to
annotate ancestral genes in phylogenetic trees from the PANTHER database by
sequence similarity (ISS), and unannotated descendants of these ancestral genes
are inferred to have inherited these same GO annotations by descent. The annotations
are done using a tool called PAINT (Phylogenetic Annotation and INference Tool).
Further information on this method can be found at:
http://gocwiki.geneontology.org/index.php/PAINT

L) Source='GOC'
Annotations automatically generated using the Molecular Function->Biological Process
inter-ontology relationships present in the GO OBO v1.2 format. As many GO users
do not currently reason over these relationships, a set of inferred annotations is
being generated. Such GO annotations are produced when an annotation has been made
(either manually or electronically) to a Molecular Function term that, either
directly or via one of its parent terms, has a relationship to a Biological
Process term, and where this Process term (or one of its children) has not
already been used in the annotation set for the same gene product identifier. Each
inferred annotation applies the same gene product identifier, reference and
evidence code as the asserted function annotation. Inferred annotations are generated
from all sources of GO annotations, with only 'NOT'-qualified annotations being excluded.
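
The inference rule described above can be sketched in Python on a toy ontology.
This is an illustration only, not the script used to build the 'GOC' set. All
names are hypothetical: mf_parents and bp_parents map a term to its parent
terms, mf_to_bp maps a Molecular Function term to related Biological Process
terms, and each annotation is a dict with the keys id, go_id, reference,
evidence and qualifier (a list). As a simplification, terms from
'NOT'-qualified annotations still count as already used.

```python
from typing import Dict, Iterable, List, Set


def ancestors_inclusive(term: str, parents: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via the parents map."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def infer_process_annotations(annotations: List[dict],
                              mf_parents: Dict[str, Iterable[str]],
                              mf_to_bp: Dict[str, Iterable[str]],
                              bp_parents: Dict[str, Iterable[str]]) -> List[dict]:
    """Derive Process annotations from Function annotations (toy sketch)."""
    # Terms already used for each gene product identifier.
    used: Dict[str, Set[str]] = {}
    for ann in annotations:
        used.setdefault(ann["id"], set()).add(ann["go_id"])

    inferred, emitted = [], set()
    for ann in annotations:
        if "NOT" in ann["qualifier"]:
            continue  # 'NOT'-qualified annotations are excluded
        for mf_term in sorted(ancestors_inclusive(ann["go_id"], mf_parents)):
            for bp_term in mf_to_bp.get(mf_term, ()):
                # Skip if the Process term or one of its children is in use.
                covered = any(bp_term in ancestors_inclusive(term, bp_parents)
                              for term in used[ann["id"]])
                key = (ann["id"], bp_term, ann["reference"], ann["evidence"])
                if covered or key in emitted:
                    continue
                emitted.add(key)
                inferred.append({"id": ann["id"], "go_id": bp_term,
                                 "reference": ann["reference"],
                                 "evidence": ann["evidence"],
                                 "assigned_by": "GOC"})
    return inferred
```

Each inferred record keeps the identifier, reference and evidence code of the
asserted function annotation, and exact duplicates are emitted only once.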

6. Additional information on Manual Annotation in UniProt-GOA
-----------------------------------------------------

For information on manual annotation guidelines and the usage of
manual evidence codes please see:

http://www.geneontology.org/GO.annotation.html
http://www.geneontology.org/GO.evidence.html

Usage of the ISS code within UniProt-GOA

There are three ways in which a curator can use the ISS evidence code:

1. If a curator reads a paper that provides functional information
for a protein and also states an orthology between it and another
protein, then manual annotation can be transferred to the ortholog.
The ortholog's annotation will contain the evidence code 'ISS' and
the original literature identifier is displayed in the DB:reference
field (column 6). Any information previously in the 'with' column
of the original protein's annotation is replaced, in the ortholog's
annotation, by the sequence identifier (UniProt accession) of the
original protein. This allows the source of the 'ISS'
annotation to be traced.

2. If a curator is confident that a protein shows high similarity
to another protein (e.g. from using BLAST) and it
seems reasonable to infer that the two proteins have a common
function, then manual annotation can be transferred to an ortholog.
The ortholog's annotation will contain the evidence code 'ISS', an
accession for the protein from which the annotation was projected
will be present in the 'with' field (column 8) and
the reference field (column 6) will contain the GO_REF:0000024.
Further information on this method can be found at:
http://www.ebi.ac.uk/GOA/ISS_method.html

3. If sequence similarity and functional information are reported in
two different papers, then the primary annotation can be transferred
to an ortholog. The ortholog's annotation will contain the evidence
code 'ISS', the identifier of the paper which describes the sequence
similarity is displayed in the DB:reference field (column 6) and any
information that was previously contained in the 'with' column of
the original entry is changed in that of the ortholog to contain the
original entry's accession number. This allows the source of the
annotation to be traced.

N.B. For all of the methods described above, only annotations that
have an experimental evidence code (either: IDA, IEP, IGI, IMP or IPI)
can be further transferred to other proteins. In addition, annotations
having the 'NOT' qualifier cannot be transferred by ISS.
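
The eligibility rule in the note above, combined with method 2, can be
expressed as a small Python sketch. The function names and the dict keys are
hypothetical, and the 'UniProtKB:' prefix on the 'with' value is an assumption
based on the column 8 examples in section 4.

```python
from typing import List, Optional

# Experimental evidence codes that allow transfer by ISS, as listed above.
EXPERIMENTAL_CODES = frozenset({"IDA", "IEP", "IGI", "IMP", "IPI"})


def can_transfer_by_iss(evidence: str, qualifiers: List[str]) -> bool:
    """True if an annotation may be transferred to another protein by ISS."""
    return evidence in EXPERIMENTAL_CODES and "NOT" not in qualifiers


def transfer_by_similarity(source: dict, target_accession: str) -> Optional[dict]:
    """Build the ISS annotation of method 2, or return None if not eligible.

    source needs the keys DB_Object_ID, GO_ID, Evidence and Qualifier (a list).
    """
    if not can_transfer_by_iss(source["Evidence"], source["Qualifier"]):
        return None
    return {
        "DB_Object_ID": target_accession,
        "GO_ID": source["GO_ID"],
        "DB_Reference": "GO_REF:0000024",
        "Evidence": "ISS",
        # Accession of the protein the annotation was projected from.
        "With": "UniProtKB:" + source["DB_Object_ID"],
    }
```

Methods 1 and 3 differ only in the reference placed in column 6, which would be
the relevant literature identifier instead of GO_REF:0000024.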


7.  Addition of GO assignments from other data sources
-------------------------------------------------------

The UniProt-GOA dataset has also been supplemented with the last (2001) public
release of manual annotation from Proteome Incorporated. The replacement
of this subset with more up-to-date and detailed GO annotation is one
of UniProt-GOA's priorities.

UniProt-GOA has integrated annotations from the EBI's IntAct protein-protein
interaction database. Only those interactions which are of high
enough quality to be integrated into the UniProt database have been
included (this is decided on experimental method type). All GO terms
in these annotations are children of the protein binding term
(GO:0005515). The annotations use the 'IPI' evidence code, with the sequence
identifier of the protein's binding partner in column 8 ('with').

8. Further information on the PDB association file
----------------------------------------------------

The 'gene_association.goa_pdb' gene association file provided by the UniProt-GOA
group contains GO assignments to PDB entries. In this file
PDB entries are only assigned GO terms based on matching InterPro domains.

9. Contacts
-----------

Please direct any questions to goa@ebi.ac.uk. We welcome any
feedback.

10. Copyright Notice
--------------------

UniProt-GOA - GO Annotation@EBI
Copyright 2011 (C) The European Bioinformatics Institute.
This README and the accompanying databases may be copied and
redistributed freely, without advance permission, provided that this
copyright statement is reproduced with each copy.

$Date: 2011/07/28  $
