Claire O’Donovan

Team Leader, UniProt content
BSc (Hons) in Biochemistry, 1992, University College Cork, Ireland. Diploma in Computer Science, 1993, University College Cork, Ireland. At EMBL since 1993, at EMBL-EBI since 1994. Team Leader since 2009.
odonovan [at] ebi.ac.uk
Tel:+44 (0)1223 494 460 / Fax:+44 (0)1223 494 468
O'Donovan team
The central activity of Claire O'Donovan's team is the biocuration of our UniProt databases.
Biocuration involves the interpretation and integration of information relevant to biology into a database or resource that enables integration of the scientific literature as well as large data sets. Accurate and comprehensive representation of biological knowledge, as well as easy access to this data for working scientists and a basis for computational analysis, are primary goals of biocuration.
UniProt manual curation: Manual curation involves a critical review of experimental and predicted data for each protein, as well as manual verification of each protein sequence. The curation methods we apply to UniProtKB/Swiss-Prot include manual extraction and structuring of experimental information from the literature, manual verification of results from computational analyses, quality assessment, mining and integration of large-scale data sets, and continuous updating as new information becomes available.
UniProt automatic annotation: UniProt has developed two complementary approaches to automatically annotate protein sequences with a high degree of accuracy. UniRule is a collection of manually curated annotation rules, which define annotations that can be propagated based on specific conditions. The Statistical Automatic Annotation System (SAAS) is an automatic decision-tree-based rule-generating system. Both approaches rely on rules built from InterPro classification and from the manually curated data in UniProtKB/Swiss-Prot that is derived from the experimental literature.
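The idea of propagating an annotation when specific conditions hold can be sketched in a few lines of Python. This is an illustrative sketch only, not the UniRule or SAAS implementation: the `Rule` and `Protein` classes, the `apply_rules` function, the signature identifiers (`SIG_A`, `SIG_B`), and the choice of conditions (required signatures plus a taxonomic scope) are all hypothetical names and assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical annotation rule: conditions plus the annotation to propagate."""
    required_signatures: frozenset  # signature IDs that must all be present
    taxon: str                      # taxonomic scope in which the rule applies
    annotation: str                 # annotation propagated when the rule fires


@dataclass
class Protein:
    """Hypothetical, minimal protein record."""
    accession: str
    signatures: set                 # signature IDs matched by the sequence
    lineage: list                   # taxonomic lineage, broadest taxon first
    annotations: list = field(default_factory=list)


def apply_rules(protein, rules):
    """Append the annotation of every rule whose conditions the protein meets."""
    for rule in rules:
        has_signatures = rule.required_signatures <= protein.signatures
        in_scope = rule.taxon in protein.lineage
        if has_signatures and in_scope:
            protein.annotations.append(rule.annotation)
    return protein.annotations


# Placeholder data, invented purely for illustration.
rules = [Rule(frozenset({"SIG_A"}), "Bacteria", "FUNCTION: example annotation")]
entry = Protein("EXAMPLE_1", {"SIG_A", "SIG_B"}, ["Bacteria", "Proteobacteria"])
print(apply_rules(entry, rules))
```

In this sketch a rule fires only when every one of its conditions is satisfied, so a protein lacking the required signature, or falling outside the rule's taxonomic scope, receives no annotation from it.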
UniProt GO annotation (GOA): The UniProt GO annotation (GOA) program aims to add high-quality GO annotations to proteins in the UniProt Knowledgebase (UniProtKB). The assignment of GO terms to UniProt records is an integral part of UniProt biocuration. UniProt manual and electronic GO annotations are supplemented with manual annotations supplied by external collaborating GO Consortium groups. This ensures that users have a comprehensive GO annotation dataset. UniProt-GOA is a member of the GO Consortium.
Claire's team works in a fully complementary fashion with Maria-Jesus Martin's UniProt development group to provide essential resources to the biological community. The databases have become an integral part of the tools researchers use on a daily basis for their work.
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and functional annotation data. UniProt comprises four major components, each optimized for different uses. The UniProt Knowledgebase (UniProtKB) is an expertly curated database, a central access point for integrated protein information with cross-references to multiple sources. The UniProt Archive (UniParc) is a comprehensive sequence repository, reflecting the history of all protein sequences. UniProt Reference Clusters (UniRef) merge closely related sequences based on sequence identity to speed up searches, while the UniProt Metagenomic and Environmental Sequences database (UniMES) was created to respond to the expanding area of metagenomic data.
