Release Notes

Release 16.0, Thursday August 9th 2007


R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), P.Bork (8), P.Bucher (3), L.Cerutti (3), R.Copley (13), E.Courcelle (6), U.Das (1), L.Daugherty (1), M.Dibley (7), R.Finn (5), J.Gough (11), D.Haft (9), N.Hulo (2), S.Hunter (1), D.Kahn (6), A.Kanapin (1), A.Kejariwal (14), D.Lonsdale (1), R.Lopez (1), I.Letunic (8), M.Madera (12), J.Maslen (1), C.McAnulla (1), J.McDowall (1), J.Mistry (5), A.Mitchell (1,4), N.Mulder (1), A.N.Nikolskaya (10), S.Orchard (1), C.Orengo (7), M.Pagni (3), R.Petryszak (1), C.Sigrist (2), P.D.Thomas (14), D.Wilson (12), C.H.Wu (10), C.Yates (7).

(1) EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, PRABI, 69622 Villeurbanne Cedex, France;
(7) Biochemistry and Molecular Biology Department, University College London, University of London, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genomic Research, Maryland, USA;
(10) Protein Information Resource, Georgetown University Medical Center, Washington, D.C., USA;
(11) Unite de Bioinformatique Structurale, Institut Pasteur, Paris, France;
(12) MRC Laboratory of Molecular Biology, Cambridge, UK;
(13) Wellcome Trust Centre for Human Genetics, Oxford, UK;
(14) Computational Biology, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA.

Background Information

The databases Swiss-Prot and TrEMBL (now part of UniProtKB), PROSITE, PRINTS, Pfam and ProDom joined forces to launch an Integrated Documentation Resource of Protein Families, Domains and Functional Sites, abbreviated InterPro. Since then, SMART, TIGRFAMs, PIRSF and SUPERFAMILY, and more recently PANTHER and Gene3D, have joined InterPro. A detailed description of the project can be found in the user manual.

Current Release

Changes since last release

  • InterPro 16.0 has increased coverage of UniProtKB with new methods from PANTHER, Gene3D, SMART and SUPERFAMILY.
  • Unintegrated signature pages are now available, and unintegrated signatures are displayed in the detailed match views.
  • Sequences in FASTA format can now be downloaded using links from the taxonomy pop-up box and via the match view selection.
  • UniProtKB accession number match lists are now available from the entry page and via the match view selection.
  • Curated SCOP, CATH and PDB matches to UniProtKB sequences in InterPro are provided in feature.xml, available from the ftp site.
  • Retrieval and analysis of InterPro data is now available via Web Services.
  • Parent/Child relationships are listed in ParentChildTreeFile.txt, available from the ftp site.
  • UniProtKB (match_complete.xml), UniParc (uniparc_match.tar.gz) and UniMES (unimes_match.tar.gz) matches to InterPro methods have been updated and are available from the ftp site in XML format. Because UniParc and UniMES are large, their data has been divided into chunks, and the latest updates are provided in these files at each InterPro release.
  • Member database version information has been added to match_complete.xml.
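Because the match files are distributed as large XML documents, a streaming parser is usually a safer choice than loading a whole file into memory. The following is a minimal sketch only: the element and attribute names (`protein`, `match`, `ipr`, `id`, `dbname`), the accessions and the sample data are all assumptions made for illustration and are not taken from these notes. Check the real layout of match_complete.xml before adapting it.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document. Tag names, attribute names and all
# identifiers are illustrative assumptions, not the documented schema.
SAMPLE = b"""<interpromatch>
  <protein id="X00001">
    <match id="SIG0001" dbname="DB_A"><ipr id="IPR999991"/></match>
    <match id="SIG0002" dbname="DB_B"/>
  </protein>
  <protein id="X00002">
    <match id="SIG0003" dbname="DB_A"><ipr id="IPR999992"/></match>
  </protein>
</interpromatch>"""


def entries_per_protein(stream):
    """Map each protein accession to the set of entry IDs found in its matches."""
    result = {}
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "protein":
            result[elem.get("id")] = {ipr.get("id") for ipr in elem.iter("ipr")}
            # Drop the children of the finished protein to limit memory use.
            elem.clear()
    return result


matches = entries_per_protein(io.BytesIO(SAMPLE))
for accession, entries in sorted(matches.items()):
    print(accession, sorted(entries))
```

For a real file, pass an open binary file handle instead of the in-memory sample.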

Contents and coverage of the current release

InterPro protein matches are now calculated for all UniProtKB and UniParc proteins. The following statistics are for all UniProtKB proteins. InterPro release 16.0 contains 15045 entries, representing:

Active sites 34
Binding sites 22
Domains 4676
Families 10060
PTMs 18
Repeats 235
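As a quick consistency check, the per-type counts listed above can be summed and compared with the stated total of 15045 entries. The variable names in this sketch are illustrative.

```python
# Entry counts by type, copied from the list above.
ENTRY_TYPES = {
    "Active sites": 34,
    "Binding sites": 22,
    "Domains": 4676,
    "Families": 10060,
    "PTMs": 18,
    "Repeats": 235,
}

STATED_TOTAL = 15045  # total quoted for InterPro release 16.0

total = sum(ENTRY_TYPES.values())
print(total, total == STATED_TOTAL)
```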

Last entry: IPR015927

24820 publications in PUBMED are referenced from InterPro.

Signature Database  Version  All Signatures  Integrated Signatures
PANTHER             6.1               30128                   2061
Pfam                21.0               8957                   8957
PIRSF               2.68               1748                   1499
PRINTS              38.0               1900                   1898
ProDom              2005.1             3538                   1041
PROSITE patterns    20.0               1319                   1319
SMART               5.1                 724                    721
TIGRFAMs            6.0                2949                   2933
Gene3D              3.0.0              2147                    783
SUPERFAMILY         1.69               1538                    463
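The share of each member database's signatures that has been integrated into InterPro entries follows directly from the two count columns. The sketch below derives that percentage; it is not a figure published in these notes, and the helper name `integrated_percent` is illustrative.

```python
# (all signatures, integrated signatures) per member database,
# copied from the table above.
SIGNATURES = {
    "PANTHER": (30128, 2061),
    "Pfam": (8957, 8957),
    "PIRSF": (1748, 1499),
    "PRINTS": (1900, 1898),
    "ProDom": (3538, 1041),
    "PROSITE patterns": (1319, 1319),
    "SMART": (724, 721),
    "TIGRFAMs": (2949, 2933),
    "Gene3D": (2147, 783),
    "SUPERFAMILY": (1538, 463),
}


def integrated_percent(total, integrated):
    """Percentage of a database's signatures that are integrated, to one decimal."""
    return round(100.0 * integrated / total, 1)


for name, (total, integrated) in SIGNATURES.items():
    print(f"{name:18s} {integrated_percent(total, integrated):5.1f}%")
```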

Sequence Database     Version    Count  Count of proteins matching
                                        All Signatures     Integrated Signatures
UniProtKB/Swiss-Prot  53.3      276256  263589 (95.4%)     254650 (92.2%)
UniProtKB/TrEMBL      36.3     4672908  3720155 (79.6%)    3517001 (75.3%)
UniProtKB             12.0     4949164  3983744 (80.5%)    3771651 (76.2%)
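The coverage percentages are the matching protein counts divided by the total protein count of each database, and the UniProtKB row is the sum of the Swiss-Prot and TrEMBL rows. The sketch below reproduces those figures from the table; the names `COVERAGE`, `percent` and `combined` are illustrative.

```python
# (protein count, matching all signatures, matching integrated signatures),
# copied from the table above.
COVERAGE = {
    "UniProtKB/Swiss-Prot": (276256, 263589, 254650),
    "UniProtKB/TrEMBL": (4672908, 3720155, 3517001),
}


def percent(matched, total):
    """Coverage as a percentage rounded to one decimal place."""
    return round(100.0 * matched / total, 1)


# UniProtKB as a whole is the column-wise sum of its two sections.
combined = tuple(sum(column) for column in zip(*COVERAGE.values()))

rows = dict(COVERAGE, UniProtKB=combined)
for name, (count, all_sigs, integrated) in rows.items():
    print(f"{name:22s} {count:8d} "
          f"{percent(all_sigs, count):5.1f}% {percent(integrated, count):5.1f}%")
```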

InterPro to GO

Forthcoming changes

  • The next release of InterPro is scheduled for October 2007.
  • In the next release of InterPro, match_complete.xml will contain all UniProtKB proteins, including those that have no match to InterPro entries.