Release Notes
Release 14.0, Friday December 8th 2006
R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), P.Bork (8), P.Bucher (3), L.Cerutti (3), R.Copley (13), E.Courcelle (6), U.Das (1), L.Daugherty (1), M.Dibley (7), R.Finn (5), W.Fleischmann (1), J.Gough (11), D.Haft (9), N.Hulo (2), S.Hunter (1), D.Kahn (6), A.Kanapin (1), A.Kejariwal (14), D.Lonsdale (1), R.Lopez (1), I.Letunic (8), M.Madera (12), J.Maslen (1), C.McAnulla (1), J.McDowall (1), J.Mistry (5), A.Mitchell (1,4), N.Mulder (1), A.N.Nikolskaya (10), S.Orchard (1), C.Orengo (7), M.Pagni (3), R.Petryszak (1), C.Sigrist (2), P.D.Thomas (14), D.Wilson (12), C.H.Wu (10), C.Yates (7).
(1) EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France;
(7) Biochemistry and Molecular Biology Department, University College London, University of London, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genomic Research, Maryland, USA;
(10) Protein Information Resource, Georgetown University Medical Center, Washington, D.C., USA;
(11) Unite de Bioinformatique Structurale, Institut Pasteur, Paris, France;
(12) MRC Laboratory of Molecular Biology, Cambridge, UK;
(13) Wellcome Trust Centre for Human Genetics, Oxford, UK;
(14) Computational Biology, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA.
The databases Swiss-Prot and TrEMBL (now part of UniProtKB), PROSITE, PRINTS, Pfam and ProDom joined forces to launch an Integrated Documentation Resource of Protein Families, Domains and Functional Sites, abbreviated InterPro. Since then, SMART, TIGRFAMs, PIRSF, SUPERFAMILY and, more recently, PANTHER and Gene3D have joined InterPro. A detailed description of the project can be found in the user manual.
Changes since last release
- The protein matches have been updated according to the latest update of UniProtKB.
- New methods from Gene3D, PANTHER, PIRSF and TIGRFAMs have been integrated.
- Links to ADAN, SPICE and Dasty have been added to the protein match displays; please see the User_Manual for further information.
- Splice variants have been added to match_complete.xml.
- The files match.xml and match_complete.xml and the UniParc matches to InterPro methods (uniparc_match.tar.gz) have been updated and are available from the FTP site in XML format.
- Note: Because of the large size of UniParc, the data has been divided into chunks, and the latest updates are provided in these files at each InterPro release. After this release, match.xml will be discontinued.
Contents and coverage of the current release
InterPro protein matches are now calculated for all UniProtKB and UniParc proteins. The following statistics are for all UniProtKB proteins:
InterPro release 14.0 contains 13828 entries, representing 3905 domains, 9614 families, 232 repeats, 34 active sites, 22 binding sites and 21 post-translational modification sites. Overall, there are 14689258 InterPro hits from 2798632 UniProtKB protein sequences. A complete list is available from the FTP site.
92.4% of UniProtKB/Swiss-Prot - 222974 of 241365 proteins
76.5% of UniProtKB/TrEMBL - 2575658 of 3368793 proteins
77.5% of UniProtKB - 2798632 of 3610158 proteins
Last entry: IPR014734
23454 publications in PUBMED are referenced from InterPro.
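The entry counts and coverage percentages quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the InterPro distribution: the variable and function names (`entry_types`, `coverage`, `percent`) are illustrative assumptions, and the figures are copied from the statistics in these notes.

```python
# Figures copied from the release 14.0 statistics in these notes.
# All names below are illustrative, not part of any InterPro tool.
entry_types = {
    "domains": 3905,
    "families": 9614,
    "repeats": 232,
    "active sites": 34,
    "binding sites": 22,
    "post-translational modification sites": 21,
}

# (proteins with at least one InterPro hit, total proteins)
coverage = {
    "UniProtKB/Swiss-Prot": (222974, 241365),
    "UniProtKB/TrEMBL": (2575658, 3368793),
}


def percent(matched, total):
    """Return coverage as a percentage rounded to one decimal place."""
    return round(100.0 * matched / total, 1)


# The entry types should add up to the stated number of entries.
total_entries = sum(entry_types.values())

# UniProtKB as a whole is the sum of its Swiss-Prot and TrEMBL sections.
matched_all = sum(matched for matched, _ in coverage.values())
total_all = sum(total for _, total in coverage.values())
coverage["UniProtKB"] = (matched_all, total_all)

print(f"{total_entries} entries")
for name, (matched, total) in coverage.items():
    print(f"{percent(matched, total)}% of {name} - {matched} of {total} proteins")
```

Run as written, the sums and percentages agree with the figures stated in these notes: the six entry types add up to 13828 entries, and the Swiss-Prot and TrEMBL counts add up to the UniProtKB totals.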
Lists of InterPro entries by type and of InterPro to GO mappings are available.
The next release of InterPro is scheduled for March 2007.