InterPro documentation

Release 2.0, October 2000

InterPro has been prepared by:

R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), E.Birney (1), M.Biswas (1), P.Bucher (3), L.Cerutti (5), M.D.R.Croning (1,4), R.Durbin (5), L.Falquet (3), W.Fleischmann (1), H.Hermjakob (1), N.Hulo (2), A.Kanapin (1), Y.Karavidopoulou (1), R.Lopez (1), B.Marx (1), N.Mulder (1), T.Oinn (1), M.Pagni (3), F.Servant (6), C.Sigrist (2), E.Zdobnov (1).

(1) EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France.

1. Introduction
The databases SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and ProDom joined forces to launch an Integrated Resource of Protein Families, Domains and Sites, abbreviated InterPro. A detailed description of the project can be found in the InterPro user manual.

2. Contents of current release
InterPro release 2.0 (October 2000) contains 3204 entries, representing 767 domains, 2372 families, 50 repeats, and 15 post-translational modification sites. Overall, there are 1315512 InterPro hits from 462483 SWISS-PROT + TrEMBL protein sequences. A complete list is available from the ftp site.
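As a quick consistency check on the figures above, the per-type entry counts can be summed and compared with the stated total. This is only an illustrative sketch: the dictionary and variable names are assumptions made for the example and are not part of any InterPro file format or tool.

```python
# Entry counts by type, as quoted above for InterPro release 2.0.
# The names used here are illustrative only (not an InterPro API).
entry_counts = {
    "domains": 767,
    "families": 2372,
    "repeats": 50,
    "post-translational modification sites": 15,
}

# The per-type counts should add up to the stated number of entries.
total_entries = sum(entry_counts.values())
assert total_entries == 3204

# Print each type with its share of all entries.
for entry_type, count in entry_counts.items():
    share = 100 * count / total_entries
    print(f"{entry_type:<40}{count:>6}{share:>7.1f}%")
```

The four types account for all 3204 entries, with families making up roughly three quarters of the release.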

The release was built using the following database versions:

   Database          Version  Entries  Date
   ----------------  -------  -------  -----------
   SWISS-PROT           39.7    88753  02-OCT-2000
   TrEMBL               15.3   373730  20-OCT-2000
   PROSITE             16.25     1424  31-AUG-2000
   prelim. profiles        -      236  25-SEP-2000
   Pfam                  5.5     2478  01-SEP-2000
   PRINTS               27.0     1356  25-AUG-2000
   ProDom             2000.1     1309  07-FEB-2000

The SWISS-PROT and TrEMBL data used include updates to these versions.
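The protein sequence total quoted in section 2 can be cross-checked against the SWISS-PROT and TrEMBL entry counts listed in the table, and an average number of InterPro hits per sequence derived from it. The sketch below is a plain arithmetic illustration; the variable names are assumptions chosen for the example, and the average is only a rough summary because hits are unlikely to be evenly distributed across sequences.

```python
# Figures taken from the database table and from section 2.
# Variable names are illustrative only.
swiss_prot_entries = 88753
trembl_entries = 373730
total_hits = 1315512

# Combined number of SWISS-PROT + TrEMBL protein sequences.
total_sequences = swiss_prot_entries + trembl_entries
assert total_sequences == 462483

# Average number of InterPro hits per protein sequence.
mean_hits_per_sequence = total_hits / total_sequences
print(f"sequences: {total_sequences}")
print(f"mean hits per sequence: {mean_hits_per_sequence:.2f}")
```

The sum reproduces the 462483 sequences reported for this release, which works out to a little under three hits per sequence on average.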

3. Changes since release 1.2
A further 769 ProDom methods have been integrated into InterPro. The protein matches have been updated to reflect the latest updates of SWISS-PROT and TrEMBL, and additional methods from new releases of the member databases have been added. The format of the flatfiles has changed and now consists of two separate files, one for all the data and one for the protein matches. An SRS-based InterProScan suite and a Perl-based stand-alone InterProScan package have been developed. The Perl-based InterProScan has a robust and efficient (parallel) internal architecture that benefits from network-distributed computing and supports queuing systems. It can provide a pre-processed, integrated view of the results.

The InterPro webserver has been redesigned to make the pages more coherent and consistent. It also includes links to a more comprehensive help menu. Within an InterPro entry there are links to pop-up windows that explain terms. In addition, where a parent/child relationship exists, there is a link to a view of the hierarchical tree of the relationship. The match views have also been improved, with legends describing the symbols and colours.

4. Forthcoming changes
The third production release, 3.0, is scheduled for January 2001. For release 3.0, we aim to integrate more of the ProDom database as well as the SMART database. We will also incorporate the Gene Ontology (GO) classification system (

5. We need your help
We welcome any feedback. If you find errors or omissions, please let us know. You can contact us at:

6. Copyright Notice
InterPro - Integrated Resource Of Protein Domains And Functional Sites
Copyright © 2000 The InterPro Consortium.

This manual and the accompanying database may be copied and redistributed freely, without advance permission, provided that this Copyright statement is reproduced with each copy.