UniProtKB (Figure 3) is a database of protein sequences and functional information (1). It is part of the UniProt resource, which also includes databases of clustered sequences (UniRef) and metagenomic data (UniMES), and an archive of sequence data that is in the public domain (UniParc). UniProtKB consists of two sections:

  • UniProtKB/Swiss-Prot – The information in this section of the database is manually annotated and reviewed, so it is of high quality and is non-redundant
  • UniProtKB/TrEMBL – The information in this section is computationally annotated and not reviewed, so it provides high annotation coverage of the proteome
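The two sections can also be told apart in sequence files: in UniProtKB FASTA headers, the prefix `sp` marks a reviewed Swiss-Prot entry and `tr` an unreviewed TrEMBL entry. The sketch below shows one way to read that prefix. The function and class names are hypothetical, the headers are illustrative, and the second one is made up.

```python
import re
from typing import NamedTuple, Optional

# Prefixes used in UniProtKB FASTA headers: "sp" marks a reviewed
# Swiss-Prot entry, "tr" an unreviewed TrEMBL entry.
SECTIONS = {"sp": "UniProtKB/Swiss-Prot", "tr": "UniProtKB/TrEMBL"}

# Matches ">db|accession|entry_name"; the rest of the header is ignored.
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)")


class HeaderInfo(NamedTuple):
    section: str      # which UniProtKB section the entry belongs to
    reviewed: bool    # True only for Swiss-Prot entries
    accession: str
    entry_name: str


def parse_header(line: str) -> Optional[HeaderInfo]:
    """Return section, accession and entry name, or None if the line
    is not a UniProtKB-style FASTA header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    db, accession, entry_name = match.groups()
    return HeaderInfo(SECTIONS[db], db == "sp", accession, entry_name)


# Illustrative headers; the second is invented for this example.
reviewed = parse_header(
    ">sp|P05067|A4_HUMAN Amyloid-beta precursor protein "
    "OS=Homo sapiens OX=9606 GN=APP PE=1 SV=3"
)
unreviewed = parse_header(
    ">tr|A0A000XXX0|A0A000XXX0_9ZZZZ Hypothetical example protein "
    "OS=Example organism OX=0 PE=4 SV=1"
)
```

Here `reviewed.section` is `"UniProtKB/Swiss-Prot"` and `unreviewed.reviewed` is `False`; a line that does not start with `>sp|` or `>tr|` returns `None`.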

The majority of sequences in UniProtKB (~85%) are derived from translations of coding sequences submitted to the public nucleic acid databases (ENA, GenBank and DDBJ). Translated sequences are automatically added to UniProtKB/TrEMBL and are migrated to UniProtKB/Swiss-Prot after manual curation.

You can use UniProtKB to find a wealth of information on a protein of interest. For example, you can find evidence for the structure or function of a protein, summarised from peer-reviewed papers, and evidence for its subcellular location or involvement in disease. UniProtKB also enables you to compare protein sequences to investigate areas of homology.
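Entries can also be retrieved programmatically rather than through the website. The sketch below is a minimal example that assumes the UniProt REST service is available at `https://rest.uniprot.org/uniprotkb` and serves an entry at `<accession>.<format>`. The helper names `entry_url` and `fetch_entry` are hypothetical, and the address should be checked against the current UniProt documentation.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base address of the UniProtKB REST service.
BASE_URL = "https://rest.uniprot.org/uniprotkb"


def entry_url(accession: str, fmt: str = "fasta") -> str:
    """Build the address of a single UniProtKB entry in the given format."""
    return f"{BASE_URL}/{quote(accession.strip())}.{fmt}"


def fetch_entry(accession: str, fmt: str = "fasta", timeout: float = 30.0) -> str:
    """Download one entry as text. Requires network access."""
    with urlopen(entry_url(accession, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `entry_url("P05067")` builds the address of the FASTA record for accession P05067, and `fetch_entry("P05067")` would download it when a network connection is available.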

Explore the UniProt homepage shown in Figure 3 below:

Figure 3 UniProt homepage.

To learn more about using UniProt, try the course UniProt: Exploring protein sequence and functional information.