InterPro

 

InterPro is used to classify proteins into families and to predict the presence of domains and functionally important sites. The project integrates signatures from 14 major protein signature databases: CATH-Gene3D, CDD, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE (patterns and profiles), SFLD, SMART, SUPERFAMILY and TIGRFAMs. The diversity of databases helps to ensure that annotations are as comprehensive as possible. Furthermore, the different databases offer complementary levels of protein classification, from broad-level (e.g., a protein is a member of a superfamily) to more fine-grained assignments (e.g., a protein is a member of a specific family, or possesses a particular type of domain). InterPro uses these different levels of granularity to produce a hierarchical classification system: one or more member database signatures are integrated into an InterPro entry and, where appropriate, relationships are highlighted between different entries, identifying those that represent smaller, functionally specific subsets of a broader entry.
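
The hierarchy described above can be pictured as a small tree of entries, each grouping one or more member database signatures. The following is a minimal sketch only, not InterPro's actual data model: the `Entry` class, its fields, and all accessions and signature names are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for an entry that integrates member signatures."""
    accession: str
    name: str
    signatures: List[str] = field(default_factory=list)
    # Parent is excluded from repr/compare to avoid recursing through the tree.
    parent: Optional["Entry"] = field(default=None, repr=False, compare=False)
    children: List["Entry"] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: "Entry") -> None:
        """Register `child` as a more specific subset of this entry."""
        child.parent = self
        self.children.append(child)

    def lineage(self) -> List[str]:
        """Return accessions from the broadest ancestor down to this entry."""
        path = []
        node: Optional[Entry] = self
        while node is not None:
            path.append(node.accession)
            node = node.parent
        return list(reversed(path))


# Placeholder identifiers, not real InterPro accessions or signatures.
broad = Entry("ENTRY_BROAD", "Example superfamily", ["SIG_A1", "SIG_B1"])
specific = Entry("ENTRY_SPECIFIC", "Example family", ["SIG_C1"])
broad.add_child(specific)

print(specific.lineage())
```

Walking from a specific entry up to its broadest ancestor is one simple way to move between the fine-grained and broad-level classifications of the same protein.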

 
Through the use of specific residue-level annotation, InterPro identifies important amino acids, such as those responsible for ligand binding or protein-protein interactions. Using MobiDB-lite, InterPro also predicts intrinsically disordered regions (IDRs). Such polypeptide segments have little or no three-dimensional structure and provide a wide range of potential functions, from acting as flexible linkers between domains to interacting with other proteins and modifying their activity.
 
InterPro adds biological annotation, including Gene Ontology terms, and provides links to external databases such as PDB, SCOP and CATH. It precomputes all matches of its signatures to UniProt Archive (UniParc) proteins using the InterProScan software, making the data available in a variety of machine-readable formats and via web-based graphical interfaces. This data is updated and incorporated into each UniProtKB release.
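
As an illustration of working with one such machine-readable format, the sketch below collects GO terms per protein from tab-separated match output. It assumes a 15-column layout in which the first column is the protein identifier and the fourteenth holds GO terms separated by `|`, with `-` meaning none; this column order is an assumption that should be checked against the InterProScan documentation for the version in use. The sample rows and all identifiers in them are made up, and `go_terms_by_protein` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Assumed column order; verify against the documentation of your version.
COLUMNS = [
    "protein", "md5", "length", "analysis", "signature_acc",
    "signature_desc", "start", "stop", "score", "status", "date",
    "entry_acc", "entry_desc", "go_terms", "pathways",
]

# Made-up rows for illustration only; none of these identifiers are real data.
SAMPLE_ROWS = [
    ["PROT_1", "md5placeholder1", "250", "DB_X", "SIG_A1", "Example domain",
     "10", "120", "1.0E-20", "T", "01-01-2020",
     "ENTRY_1", "Example entry", "GO:0000001|GO:0000002", "-"],
    ["PROT_2", "md5placeholder2", "300", "DB_Y", "SIG_B1", "Another domain",
     "5", "90", "3.0E-10", "T", "01-01-2020",
     "-", "-", "-", "-"],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in SAMPLE_ROWS) + "\n"


def go_terms_by_protein(handle):
    """Map each protein identifier to the set of GO terms on its matches."""
    terms = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        record = dict(zip(COLUMNS, row))
        go_field = record.get("go_terms", "-")
        if not go_field or go_field == "-":
            continue  # this match carries no GO annotation
        for term in go_field.split("|"):
            # Drop any trailing "(source)" suffix, if one is present.
            terms[record["protein"]].add(term.split("(")[0])
    return dict(terms)


terms = go_terms_by_protein(io.StringIO(SAMPLE_TSV))
print(terms)
```

For a real file, the same function could be given an open file handle in place of the `io.StringIO` wrapper.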

InterPro has a number of important applications, including the automatic annotation of proteins for UniProtKB/TrEMBL and genome annotation projects. InterPro is used by Ensembl and in the GOA project to provide large-scale mapping of proteins to GO terms. InterProScan also forms a core component of the EBI Metagenomics analysis pipeline.

Team members

Rob Finn
Alex Mitchell
Simon Potter
Matloob Qureshi
Gustavo Salazar-Orejuela
Matthias Blum
Typhaine Paysan-Lafosse
Aurélien Luciani
Neil Rawlings
Alex Bateman
Sebastien Pesseat
Siew-Yit Yong
Amaia Sangrador
Matthew Fraser
Hsin Yu Chang
Lorna Richardson
Gift Nuka