Pfam Database: Creating Protein Families


Detection of conserved evolutionary units by profile hidden Markov models (HMMs)

This tutorial describes how different types of entries are created in the Pfam database. This is an intermediate course that requires familiarity with the Pfam website. For a more general overview of the different functions available from Pfam, please refer to Pfam:Quick Tour. Pfam is a large collection of protein families, each represented by multiple sequence alignments and profile hidden Markov models (HMMs), which in combination cover the majority of UniProt sequences. In this course we will present the different types of entries in Pfam and describe the process of creating profile HMMs for all entry types except Tandem Repeats (TR). Due to the inherent complexity of generating TR entries, these are discussed in a separate course found here.

An undergraduate-level understanding of biology would be an advantage. Familiarity with the Pfam database is required.

About this course

Learning objectives: 
  • List the six Pfam entry categories
  • Describe the concept of hidden Markov models (HMMs)
  • Outline the process of creating a Pfam model
  • List the features that define a clan of entries