InterPro documentation

Release 1.2, June 2000


Acknowledgments

InterPro has been prepared by:



R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), 

E.Birney (1), M.Biswas (1), P.Bucher (3), L.Cerutti (5), 

M.D.R.Croning (1,4), R.Durbin (5), W.Fleischmann (1), 

H.Hermjakob (1), N.Hulo (2), A.Kanapin (1), Y.Karavidopoulou 

(1), R.Lopez (1), B.Marx (1), N.Mulder (1), T.Oinn (1), 

F.Servant (6), C.Sigrist (2), E.Zdobnov (1).



(1) EMBL Outstation - European Bioinformatics Institute, 

    Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; 

(2) Swiss Institute for Bioinformatics, Geneva, Switzerland; 

(3) Swiss Institute for Experimental Cancer Research, 

    Lausanne, Switzerland; 

(4) School of Biological Sciences, The University of Manchester, 

    Manchester, UK; 

(5) The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, 

    Cambridge, UK; 

(6) CNRS/INRA, Toulouse, France; 





1. Introduction



The databases SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and 

ProDom joined forces to launch an Integrated Resource of 

Protein Families, Domains and Sites, abbreviated InterPro. A 

detailed description of the project can be found in the InterPro 

user manual.





2. Contents of current release



InterPro release 1.2 (June 2000) contains 3052 entries, 

representing 574 domains, 2418 families, 46 repeats, 

and 14 post-translational modification sites. Overall, 

there are 1060226 InterPro hits from 384572 SWISS-PROT + 

TrEMBL protein sequences. A complete list is available from 

the ftp site.



The release was built using the following database versions:



  

   Database           Version  Entries    Date

   ---------          -------  -------    ------

   SWISS-PROT          38.19   86552      10-JUN-2000

   TrEMBL              13.7    298020     12-JUN-2000

   PROSITE             16.0    1370       01-APR-1999

   prelim. profiles     -      241        10-FEB-1999

   Pfam                5.2     2128       10-APR-2000

   PRINTS              26.1    1310       05-APR-2000

   ProDom              2000.1  540        07-FEB-2000







The SWISS-PROT and TrEMBL data used includes updates of 

these versions. 
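
The figures quoted in this section can be cross-checked with a few 
lines of Python. This is only a sketch: the numbers are copied from 
the text and table above, the variable names are illustrative, and 
the average number of hits per sequence is a derived value that is 
not stated in the release notes.

```python
# Entry counts by type, as quoted for InterPro release 1.2.
entry_types = {
    "domains": 574,
    "families": 2418,
    "repeats": 46,
    "post-translational modification sites": 14,
}

# Sequence counts taken from the database version table.
sequences = {"SWISS-PROT": 86552, "TrEMBL": 298020}

# Total number of InterPro hits quoted in the text.
total_hits = 1060226

total_entries = sum(entry_types.values())
total_sequences = sum(sequences.values())

# Derived figure (an assumption-free division, but not part of the notes).
hits_per_sequence = total_hits / total_sequences

print(total_entries)                 # 3052
print(total_sequences)               # 384572
print(round(hits_per_sequence, 2))   # 2.76
```

The two sums match the totals given in the text (3052 entries and 
384572 protein sequences), which works out to roughly 2.76 hits per 
sequence on average.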



3. Changes since release 1.0



The first 540 ProDom methods have been integrated into 

InterPro. The protein matches have been updated according to 

the latest updates of SWISS-PROT and TrEMBL, and additional 

methods from new releases of the member databases have been 

added.



A new relationship has been introduced which can be used to 

identify multi-domain proteins. This relationship, 

"CONTAINS/FOUND IN", is used to show that a domain can be 

found in more than one type of protein or family of proteins, 

but is not a SUBTYPE in the family sense. The domain is a 

functional entity, which can be found in proteins with 

different domain architectures. The "CONTAINS/FOUND IN"

relationship does not imply that this is always the case, but 

suggests that a protein may contain this domain. It is useful 

in linking InterPro entries which are often associated, but 

not in a family/subfamily relationship. With the introduction 

of this new relationship, some of the previously existing 

parent/child relationships have been altered. 
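
The difference between the two kinds of relationship can be 
illustrated with a small data model. This is a hypothetical sketch, 
not the InterPro schema: the Entry class, its field names, the 
link_contains helper, and the example entry names are all invented 
for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare entries by identity; the links are cyclic
class Entry:
    """Hypothetical, simplified stand-in for an InterPro entry."""
    name: str
    kind: str                                  # e.g. "family" or "domain"
    parent: Optional["Entry"] = None           # family/subfamily (SUBTYPE) link
    contains: List["Entry"] = field(default_factory=list)
    found_in: List["Entry"] = field(default_factory=list)


def link_contains(container: Entry, domain: Entry) -> None:
    """Record both directions of a CONTAINS/FOUND IN relationship."""
    container.contains.append(domain)
    domain.found_in.append(container)


# Parent/child: a subfamily is a subtype of exactly one parent family.
family_a = Entry("Family A", "family")
subfamily_a1 = Entry("Subfamily A1", "family", parent=family_a)

# CONTAINS/FOUND IN: one domain may occur in several unrelated families.
family_b = Entry("Family B", "family")
domain_x = Entry("Domain X", "domain")
link_contains(family_a, domain_x)
link_contains(family_b, domain_x)

found_in_names = [entry.name for entry in domain_x.found_in]
print(found_in_names)    # ['Family A', 'Family B']
print(domain_x.parent)   # None: the domain is not a subtype of either family
```

In this sketch the domain is linked to both families without being 
a child of either, which mirrors the description above: the link 
says the domain may occur in proteins of that family, not that the 
domain belongs to the family hierarchy.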





4. Forthcoming changes



The second production release, 2.0, is scheduled for September 

2000. For Release 2.0, we aim to integrate more of the ProDom 

database and incorporate the Gene Ontology (GO) classification 

system (http://genome-www.stanford.edu/GO/).





5. We need your help



We welcome any feedback. If you find errors or omissions, 

please let us know. You can contact us at Interhelp@ebi.ac.uk.





6. Copyright Notice



InterPro - Integrated Resource Of Protein Domains And 

Functional Sites 

Copyright (C) 2000 The InterPro Consortium. 

This manual and the accompanying database may be copied and 

redistributed freely, without advance permission, provided 

that this Copyright statement is reproduced with each copy.