Team Leader - Orchard team: Protein Function Content
Sandra Orchard is Team Leader for Protein Function Content, with a strong background in protein annotation. Her team is primarily responsible for the curation of UniProt at EMBL-EBI, and annotation of the Gene Ontology. Sandra and her team support professional training in data resources for exploring protein function, sequence analysis, interactions and pathways. She is a member and Treasurer of the Executive Committee of the International Society for Biocuration.
Sandra's team applies advanced methods for ensuring the accuracy and quality of the universal protein resource, UniProt, carrying out manual annotation and producing the rules for automated annotation. They also create and maintain value-added interfaces for exploring protein data, for example the Enzyme Portal and Complex Portal.
orchard [at] ebi.ac.uk
ORCID iD: 0000-0002-8878-3972
Tel: +44 (0)1223 494 675
The central activity of the Protein Function Content team is the biocuration of our UniProt databases. Biocuration involves the interpretation and integration of information relevant to biology into a database or resource that enables integration of the scientific literature as well as large data sets. Accurate and comprehensive representation of biological knowledge, as well as easy access to this data for working scientists and a basis for computational analysis, are primary goals of biocuration.
Manual curation involves a critical review of experimental and predicted data for each protein, as well as manual verification of each protein sequence. The curation methods we apply to UniProtKB/Swiss-Prot include manual extraction and structuring of experimental information from the literature, manual verification of results from computational analyses, quality assessment, mining and integration of large-scale data sets, and continuous updating as new information becomes available.
UniProt has developed two complementary approaches to annotate protein sequences automatically with a high degree of accuracy. UniRule is a collection of manually curated annotation rules, which define annotations that can be propagated based on specific conditions. The Statistical Automatic Annotation System (SAAS) is an automatic decision-tree-based rule-generating system. The central components of both approaches are rules based on InterPro classification and on the manually curated data in UniProtKB/Swiss-Prot drawn from the experimental literature.
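As a rough illustration of the rule-based idea, the Python sketch below applies a hand-written rule to a protein record: when the record meets the rule's conditions (here, a set of InterPro signatures and a taxonomic scope), the rule's annotations are propagated to it. The `Protein` and `Rule` classes, their fields and all identifiers are hypothetical and chosen for illustration only; this is not the UniRule format or any UniProt API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Protein:
    """Hypothetical, simplified protein record."""
    accession: str
    interpro_ids: set[str]
    taxonomy: set[str]
    annotations: list[str] = field(default_factory=list)


@dataclass
class Rule:
    """Hypothetical annotation rule: conditions plus annotations to propagate."""
    rule_id: str
    required_interpro: set[str]
    required_taxon: str
    annotations: list[str]

    def matches(self, protein: Protein) -> bool:
        # All required signatures must be present, and the protein must
        # fall within the rule's taxonomic scope.
        return (
            self.required_interpro <= protein.interpro_ids
            and self.required_taxon in protein.taxonomy
        )


def apply_rules(protein: Protein, rules: list[Rule]) -> list[str]:
    """Propagate annotations from every matching rule; return the IDs applied."""
    applied = []
    for rule in rules:
        if rule.matches(protein):
            for annotation in rule.annotations:
                if annotation not in protein.annotations:
                    protein.annotations.append(annotation)
            applied.append(rule.rule_id)
    return applied


# Made-up identifiers, for illustration only.
example_rule = Rule(
    rule_id="RULE_EXAMPLE_1",
    required_interpro={"IPR_EXAMPLE_A"},
    required_taxon="Bacteria",
    annotations=["Function: example catalytic activity"],
)
example_protein = Protein(
    accession="ACC_EXAMPLE_1",
    interpro_ids={"IPR_EXAMPLE_A", "IPR_EXAMPLE_B"},
    taxonomy={"Bacteria"},
)
applied_ids = apply_rules(example_protein, [example_rule])
```

A decision-tree-based system such as SAAS would generate comparable conditions automatically from existing curated entries rather than having a curator write them by hand; that learning step is not shown here.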
Gene Ontology annotation (GOA)
The UniProt GO annotation (GOA) program aims to add high-quality GO annotations to proteins in the UniProt Knowledgebase (UniProtKB). The assignment of GO terms to UniProt records is an integral part of UniProt biocuration. UniProt manual and electronic GO annotations are supplemented with manual annotations supplied by external collaborating GO Consortium groups. This ensures that users have a comprehensive GO annotation dataset. UniProt-GOA is a member of the GO Consortium.
The Enzyme Portal integrates enzyme-related data from all EMBL-EBI enzyme resources, as well as the underlying functional and genomic data.
Our team works in a fully complementary fashion with Maria-Jesus Martin's UniProt Development team to provide essential resources to the biological community, as the databases have become an integral part of the tools researchers use on a daily basis for their work. The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and functional annotation data. UniProt comprises four major components, each optimised for different uses:
- the UniProt Knowledgebase (UniProtKB) is an expertly curated database, a central access point for integrated protein information with cross-references to multiple sources
- the UniProt Archive (UniParc) is a comprehensive sequence repository, reflecting the history of all protein sequences
- UniProt Reference Clusters (UniRef) merge closely related sequences based on sequence identity to speed up searches