Sequence Family Resources

Providing protein and RNA family resources

The Sequence Family Resources team, led by Alex Bateman, provides a range of important databases for proteins and non-coding RNAs. We provide the InterPro and Pfam resources for proteins and the RNAcentral and Rfam resources for non-coding RNAs.

Data resources

AntiFam

AntiFam is a collection of profile hidden Markov models (HMMs) that help identify protein sequences in sequence databases that are likely to be false predictions.

Chimira

A web-based engine for small RNA sequence analysis, quantitation and variant calling.

InterPro

InterPro is used to classify proteins into families and to predict the presence of domains and functionally important sites. The project integrates signatures from 13 major protein signature databases: CATH-Gene3D, CDD, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, PROSITE (patterns and profiles), SFLD,…

InterProScan

InterProScan is a tool that provides automated functional analysis of protein and nucleic acid sequences, the latter via a full six-frame translation. It offers the ability to identify both structural and functional regions of interest, based upon methods and models that have been generated by a la…

MEROPS

The MEROPS database comprises proteolytic enzymes (also termed proteases, proteinases and peptidases), their substrates and inhibitors. MEROPS uses a hierarchical, structure-based classification of proteolytic enzymes and protein inhibitors. Each peptidase or inhibitor is assigned to a Family on th…

Pfam

Pfam is a database of protein sequence families. Each Pfam family is represented by a statistical model, known as a profile hidden Markov model, which is trained using a curated alignment of representative sequences. These models can be searched against all protein sequences in order to find occurre…

Pfam search

Search a sequence against the Pfam HMM library

Rfam

Rfam is a curated database of non-coding RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models. Our families may be divided into non-coding RNA genes, structured cis-regulatory elements and self-splicing RNAs. Rfam families are created f…

Rfam search

Search Rfam’s covariance model collection

RNAcentral

RNAcentral is a database of non-coding RNA sequences that provides a single entry point for accessing the data from an international consortium of RNA resources. RNAcentral provides a unified view of non-coding RNA sequence data and aims to represent all non-coding RNA types from all organisms.

Latest jobs

Software Engineer – Data Discovery

Technology in EMBL-EBI Hinxton

We are looking for a dynamic Software Engineer to join the EBI Search project [PMID: 40322924], a scalable text search engine providing easy and uniform access to the biological data resources hosted at the European Bioinformatics Institute (EMBL-EBI). EBI Search provides the central discovery infras…

Closes on 28th June. Posted 5th June 2026

Finance Business Partner x2

Administrative and support in EMBL-EBI Hinxton

Help Shape the Future of Life Science Research. At the European Molecular Biology Laboratory (EMBL), we're united by a shared mission: advancing life science research for the benefit of humanity. As Europe's leading laboratory for the life sciences, EMBL brings together more than 1,900 talented people…

Closes on 3rd July. Posted 5th June 2026
