InterPro documentation

Release 1.0, March 2000



Acknowledgments

InterPro has been prepared by:



R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), 

E.Birney (1), M.Biswas (1), P.Bucher (3), L.Cerutti (5), 

M.D.R.Croning (1,4), R.Durbin (5), W.Fleischmann (1), 

H.Hermjakob (1), N.Hulo (2), A.Kanapin (1), Y.Karavidopoulou (1), 

R.Lopez (1), B.Marx (1), N.Mian (5), N.Mulder (1), T.Oinn (1),

C.Sigrist (2), E.Zdobnov (1).



(1) EMBL Outstation - European Bioinformatics Institute, 

Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; 

(2) Swiss Institute for Bioinformatics, Geneva, Switzerland; 

(3) Swiss Institute for Experimental Cancer Research, 

Lausanne, Switzerland; 

(4) School of Biological Sciences, The University of 

Manchester, Manchester, UK; 

(5) The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, 

Cambridge, UK; 

(6) CNRS/INRA, Toulouse, France; 





1. Introduction



The databases SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and 

ProDom joined forces to launch a new Integrated Resource of 

Protein Domains and Functional Sites, abbreviated InterPro. A 

detailed description of the project can be found in the 

InterPro user manual.





2. Contents of current release



InterPro release 1.0 (March 2000) contains 2990 entries, 

representing 556 domains, 2373 families, 47 repeats, 

and 14 post-translational modification sites. Overall, 

there are 823000 InterPro hits from 307361 SWISS-PROT + TrEMBL 

protein sequences. A complete list is available from the FTP 

site.
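
As a quick consistency check on the figures quoted above, the 
following Python sketch confirms that the four entry types add 
up to the stated total, and derives the average number of hits 
per protein sequence. The variable names are illustrative 
assumptions and are not part of InterPro itself.

```python
# Figures copied from the release 1.0 text; all names are illustrative.
entry_counts = {
    "domains": 556,
    "families": 2373,
    "repeats": 47,
    "ptm_sites": 14,  # post-translational modification sites
}

# Sum of the per-type counts, to compare with the stated 2990 entries.
total_entries = sum(entry_counts.values())

hits = 823000        # InterPro hits
sequences = 307361   # SWISS-PROT + TrEMBL protein sequences

# Average number of InterPro hits per matched protein sequence.
hits_per_sequence = hits / sequences

print(total_entries)                 # 2990
print(round(hits_per_sequence, 2))   # 2.68
```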



The release was built using the following database versions:



   Database          Version  Entries    Date

   ---------         -------  -------    ------

   SWISS-PROT          38      82229   15-DEC-1999

   TrEMBL              12     225132   15-DEC-1999

   PROSITE             16.0     1038   01-APR-1999

   prelim. profiles     -        241   10-FEB-1999

   Pfam                 5.0     2008   11-JAN-2000

   PRINTS              25.0     1260   07-JAN-2000



The SWISS-PROT and TrEMBL data used include updates to these 

versions. 
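
The sequence total quoted earlier in this section can be 
reproduced from the table above. The following is a minimal 
Python sketch, with the table rows copied in as a literal; the 
names TABLE and parse_rows are hypothetical helpers introduced 
only for illustration.

```python
# Rows copied from the database version table above:
# name, version, number of entries, date.
TABLE = """\
SWISS-PROT          38      82229   15-DEC-1999
TrEMBL              12     225132   15-DEC-1999
PROSITE             16.0     1038   01-APR-1999
prelim. profiles     -        241   10-FEB-1999
Pfam                 5.0     2008   11-JAN-2000
PRINTS              25.0     1260   07-JAN-2000
"""


def parse_rows(text):
    """Return a dict mapping each database name to its entry count."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right so that names containing spaces
        # (e.g. "prelim. profiles") stay in one piece.
        name, version, entries, date = line.rsplit(None, 3)
        counts[name.strip()] = int(entries)
    return counts


counts = parse_rows(TABLE)
protein_sequences = counts["SWISS-PROT"] + counts["TrEMBL"]
print(protein_sequences)  # 307361
```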



3. Changes since the beta release



InterPro entries describe protein families, domains, repeats, 

or post-translational modification sites, whereas the member 

databases PROSITE, PRINTS, and Pfam provide methods to recognise 

these biological objects. To make sure that InterPro entries 

and linked methods are pointing to related information on the 

same biological object, all links have been checked manually 

and corrected where necessary.



InterPro entries now contain a one-line description and an 

abstract, which curators compiled manually by merging 

annotation from the member databases. Each entry also includes 

a list of example proteins indicating diversity within the 

group, and a list of references derived from the member 

databases. The hits against SWISS-PROT and TrEMBL are 

displayed in a table of matches or in a graphical view. Text- 

and sequence-based searches are also available through SRS.





4. Forthcoming changes



The second production release, 2.0, is scheduled for September 

2000. For Release 2.0, we aim to integrate the ProDom database 

and incorporate the Gene Ontology (GO) classification system.





5. We need your help



We welcome any feedback. If you find errors or omissions, 

please let us know. You can contact us at 

Interhelp@ebi.ac.uk.





6. Copyright Notice



InterPro - Integrated Resource Of Protein Domains And 

Functional Sites 

Copyright (C) 2000 The InterPro Consortium. 

This manual and the accompanying database may be copied and 

redistributed freely, without advance permission, provided 

that this Copyright statement is reproduced with each copy.