Release Notes
Release 13.0, Friday August 11th 2006
R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), P.Bork (8), P.Bucher (3), L.Cerutti (3), R.Copley (13), E.Courcelle (6), U.Das (1), L.Daugherty (1), M.Dibley (7), R.Finn (5), W.Fleischmann (1), J.Gough (11), D.Haft (9), N.Hulo (2), S.Hunter (1), D.Kahn (6), A.Kanapin (1), A.Kejariwal (14), D.Lonsdale (1), R.Lopez (1), I.Letunic (8), M.Madera (12), J.Maslen (1), C.McAnulla (1), J.McDowall (1), J.Mistry (5), A.Mitchell (1,4), N.Mulder (1), A.N.Nikolskaya (10), S.Orchard (1), C.Orengo (7), M.Pagni (3), R.Petryszak (1), C.Sigrist (2), P.D.Thomas (14), D.Wilson (12), C.H.Wu (10), C.Yates (7).
(1) EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France;
(7) Biochemistry and Molecular Biology Department, University College London, University of London, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genomic Research, Maryland, USA;
(10) Protein Information Resource, Georgetown University Medical Center, Washington, D.C., USA;
(11) Genomic Sciences Centre, RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Japan;
(12) MRC Laboratory of Molecular Biology, Cambridge, UK;
(13) Wellcome Trust Centre for Human Genetics, Oxford, UK;
(14) Computational Biology, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA.
The databases Swiss-Prot and TrEMBL (now part of UniProtKB), PROSITE, PRINTS, Pfam and ProDom joined forces to launch an Integrated Documentation Resource of Protein Families, Domains and Functional Sites, abbreviated InterPro. Since then, SMART, TIGRFAMs, PIRSF and SUPERFAMILY and, more recently, PANTHER and Gene3D have joined InterPro. A detailed description of the project can be found in the user manual.
Changes since last release
- The protein matches have been updated to reflect the latest update of UniProtKB, and new methods from Gene3D, PANTHER, PROSITE, Pfam and SMART have been integrated.
- The Gene3D accession number prefix has changed to 'G3DSA:'.
- PROSITE preliminary profiles and their matches have been removed.
- Links to IntAct, the protein interaction database, have been incorporated, providing manually curated examples of domain-domain interactions.
- Links to Pfam Clan pages are now available; a pop-up display shows the relationships between the InterPro entries for all members of a Pfam clan.
- The Additional Reading field has been updated with new references from the member databases and structural references from the PDB.
- A list of all current UniParc matches to InterPro methods is now available from the ftp site in XML format: uniparc_match.tar.gz. Because of the large size of UniParc, the data has been divided into chunks, and the latest updates are provided in these files at each InterPro release.
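For readers who want to inspect a chunked XML archive such as uniparc_match.tar.gz, the following is a minimal Python sketch. The element names used inside the files are not described in these notes, so the sketch makes no assumption about them and only tallies whatever tag names it encounters; the function name `count_tags` and the local path passed to it are hypothetical.

```python
import tarfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(archive_path):
    """Tally XML element names across every file in a .tar.gz archive."""
    counts = Counter()
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            stream = archive.extractfile(member)
            # iterparse reads each chunk incrementally instead of loading it whole
            for _event, element in ET.iterparse(stream, events=("end",)):
                counts[element.tag] += 1
                element.clear()  # release children once the element is counted
    return counts
```

Calling `count_tags("uniparc_match.tar.gz")` on a downloaded copy would return a `Counter` keyed by tag name, which is a quick way to discover the file layout before writing a more specific parser.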
Contents and coverage of the current release
InterPro protein matches are now calculated for all UniProtKB proteins. For more information see UniProtKB.
InterPro release 13.0 contains 13147 entries, representing 3760 domains, 9080 families, 232 repeats, 32 active sites, 22 binding sites and 21 post-translational modification sites. Overall, there are 13175961 InterPro hits from 2530773 UniProtKB protein sequences. A complete list is available from the ftp site.
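As a quick consistency check, the per-type entry counts quoted above can be summed and compared with the stated total. The short Python sketch below uses only the figures from this paragraph; the dictionary and variable names are illustrative.

```python
# Entry counts by type, as quoted for InterPro release 13.0
entry_counts = {
    "domains": 3760,
    "families": 9080,
    "repeats": 232,
    "active sites": 32,
    "binding sites": 22,
    "post-translational modification sites": 21,
}

stated_total = 13147
computed_total = sum(entry_counts.values())

# The two numbers agree, so the breakdown accounts for every entry
print(computed_total, computed_total == stated_total)
```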
Proteins with at least one InterPro match:
- 92.5% of UniProtKB/Swiss-Prot - 211569 of 228670 proteins
- 76.5% of UniProtKB/TrEMBL - 2319204 of 3031970 proteins
- 77.6% of UniProtKB - 2530773 of 3260640 proteins
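The coverage percentages above follow directly from the protein counts. The Python sketch below recomputes them and confirms that the Swiss-Prot and TrEMBL figures add up to the UniProtKB totals; the helper name `coverage` is illustrative, and all numbers are taken from the lines above.

```python
def coverage(matched, total):
    """Return the percentage of proteins with a match, rounded to one decimal."""
    return round(100.0 * matched / total, 1)


swissprot = (211569, 228670)   # (matched, total) for UniProtKB/Swiss-Prot
trembl = (2319204, 3031970)    # (matched, total) for UniProtKB/TrEMBL

# UniProtKB as a whole is the sum of its two sections
uniprotkb = (swissprot[0] + trembl[0], swissprot[1] + trembl[1])

for name, (matched, total) in [
    ("UniProtKB/Swiss-Prot", swissprot),
    ("UniProtKB/TrEMBL", trembl),
    ("UniProtKB", uniprotkb),
]:
    print(f"{coverage(matched, total)}% of {name} - {matched} of {total} proteins")
```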
Last entry: IPR014023
23000 publications in PubMed are referenced from InterPro.
The next release of InterPro is scheduled for October/November 2006.