Explore the known protein space through UniProt Archive and Clusters



UniProt Archive (UniParc) is the most comprehensive non-redundant protein sequence database available. Its protein sequences are drawn from the major publicly accessible sequence resources. To avoid redundancy, each unique sequence is stored only once and assigned a stable protein identifier. As a result, a sequence search against UniParc is equivalent to the same search run against all of the databases that UniParc cross-references.
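The idea of storing each unique sequence once can be illustrated with a small sketch. This is a toy model only, not UniParc's actual implementation: the class names, the "TOY" identifier format, and the database names and accessions are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    """One unique sequence plus every source record that carries it."""
    identifier: str                      # hypothetical stable identifier
    sequence: str
    cross_references: set = field(default_factory=set)


class ToyArchive:
    """Toy non-redundant archive: one entry per distinct sequence."""

    def __init__(self):
        self._by_sequence = {}

    def add(self, source_db, accession, sequence):
        """Register a source record and return the identifier of its sequence."""
        key = sequence.upper()
        entry = self._by_sequence.get(key)
        if entry is None:
            # First time this sequence is seen: mint a new identifier.
            identifier = f"TOY{len(self._by_sequence) + 1:06d}"
            entry = ArchiveEntry(identifier, key)
            self._by_sequence[key] = entry
        # Known or new, the source record is kept as a cross-reference.
        entry.cross_references.add((source_db, accession))
        return entry.identifier

    def lookup(self, sequence):
        """Exact-match lookup; returns the entry or None."""
        return self._by_sequence.get(sequence.upper())

    def __len__(self):
        return len(self._by_sequence)


archive = ToyArchive()
first = archive.add("DB_A", "A0001", "MKTAYIAKQR")   # hypothetical records
second = archive.add("DB_B", "B7731", "MKTAYIAKQR")  # same sequence, other source
third = archive.add("DB_A", "A0002", "MKTAYIAKQK")   # a different sequence
hit = archive.lookup("MKTAYIAKQR")
```

Here the first two records share one identifier, so a single lookup against the archive returns the cross-references from both source databases.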

UniProt Reference Clusters (UniRef) provides clustered sets of sequences from the UniProt Knowledgebase (UniProtKB) and selected UniParc records. The clusters give complete coverage of sequence space at several resolutions while hiding redundant sequences. The reduced redundancy speeds up similarity searches and improves the detection of distant relationships.
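To make clustering at different resolutions more concrete, here is a minimal sketch of greedy clustering by similarity threshold. It is not the pipeline used to build UniRef: the function names are hypothetical, the example sequences are made up, and Python's `difflib.SequenceMatcher` ratio stands in for a real alignment-based sequence identity.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude stand-in for sequence identity, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold):
    """Assign each sequence to the first cluster whose representative is
    at least `threshold` similar; otherwise start a new cluster."""
    clusters = []
    # Longest sequences are considered first, so they become representatives.
    for seq in sorted(sequences, key=len, reverse=True):
        for cluster in clusters:
            if similarity(cluster["representative"], seq) >= threshold:
                cluster["members"].append(seq)
                break
        else:
            clusters.append({"representative": seq, "members": [seq]})
    return clusters


sequences = ["MKTAYIAKQR", "MKTAYIAKQR", "MKTAYIAKQK", "GGGGGSSSSS"]
strict = greedy_cluster(sequences, 1.0)  # only identical sequences merge
loose = greedy_cluster(sequences, 0.5)   # near-identical sequences merge too
```

Lowering the threshold yields fewer, broader clusters. Searching only the cluster representatives then touches fewer sequences, which is the intuition behind the speed-up described above.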

This webinar was recorded on 17 October 2019. It is best viewed in full-screen mode using Google Chrome. The slides from this webinar can be downloaded below.


This webinar is for individuals who wish to learn more about UniParc. No prior knowledge of bioinformatics is required, but undergraduate-level knowledge of biology would be useful.


See the EMBL-EBI training pages for a list of upcoming webinars.

About this course