InterPro documentation

Release 5.2, Sep 2002

InterPro has been prepared by:

R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), M.Biswas (1), P.Bradley (1,4), P.Bork (8), P.Bucher (3), R.Copley (8), E.Courcelle (6), R.Durbin (5), L.Falquet (5), W.Fleischmann (1), J.Gouzy (6), S.Griffiths-Jones (5), D.Haft (9), N.Hulo (2), D.Kahn (6), A.Kanapin (1), M.Krestyaninova (1), R.Lopez (1), I.Letunic (8), N.Mulder (1), S.Orchard (1), M.Pagni (3), D.Peyruc (6), C.Ponting (7), F.Servant (1), C.Sigrist (2).

(1) EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France;
(7) MRC Functional Genetics Unit, Department of Human Anatomy & Genetics, University of Oxford, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genome Research, Maryland, USA.

  Contents
  • 1 - Introduction
  • 2 - Changes since last major release
  • 3 - Contents of current release
  • 4 - Forthcoming changes
  • 5 - Feedback

1. Introduction
The databases Swiss-Prot, TrEMBL, PROSITE, PRINTS, Pfam, and ProDom joined forces to launch an Integrated Resource of Protein Families, Domains and Sites, abbreviated InterPro. SMART joined InterPro earlier this year, and the most recent member to join is TIGRFAMs. A detailed description of the project can be found in the InterPro user manual.

2. Changes since last major release
The number of mappings to the Gene Ontology (GO) classification system has increased since the last release. The protein matches have been updated according to the latest updates of Swiss-Prot and TrEMBL, and additional methods from new releases of the member databases have been added. InterProScan has also been updated to include new member database signatures. Since release 5.0, additional signatures from the latest releases of Pfam and PROSITE, as well as more TIGRFAMs, have been integrated into InterPro. The Database Links field now includes cross-references to TC numbers from the Transport Classification Database.

3. Contents of current release
InterPro release 5.2 contains 5875 entries, representing 1272 domains, 4491 families, 97 repeats, and 15 post-translational modification sites. Overall, there are 2734407 InterPro hits from 799080 Swiss-Prot + TrEMBL protein sequences. A complete list is available from the ftp site.

The release was built using the following database versions:


DATABASE      VERSION    ENTRIES    DATE
Swiss-Prot    40.27      113470     30-AUG-2002
TREMBL        21.12      685610     13-SEP-2002
PROSITE       17.5       1565       21-JUN-2002
PREFILE       N/A        252        18-JUL-2001
PFAM          7.3        3865       17-MAY-2002
PRINTS        33.0       1650       24-JAN-2002
PRODOM        20001.3    1346       28-JAN-2002
SMART         3.1        509        16-NOV-2000
TIGRFAMs      1.2        814        03-AUG-2001


The Swiss-Prot and TrEMBL data used include updates to these versions.
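
The figures quoted in this section can be cross-checked against each other. The short Python sketch below is only an illustration and not part of the InterPro distribution: it copies the numbers from the text above, and all variable names are hypothetical. It checks that the entry types add up to the stated total, that the Swiss-Prot and TrEMBL entry counts add up to the stated number of protein sequences, and it derives the average number of InterPro hits per protein.

```python
# Figures copied from the release notes above; names are illustrative only.
entry_types = {
    "domains": 1272,
    "families": 4491,
    "repeats": 97,
    "ptm_sites": 15,
}
total_entries = 5875

protein_counts = {"Swiss-Prot": 113470, "TrEMBL": 685610}
total_proteins = 799080
total_hits = 2734407

# The four entry types should account for every InterPro entry.
entries_consistent = sum(entry_types.values()) == total_entries

# The protein total should equal the two source databases combined.
proteins_consistent = sum(protein_counts.values()) == total_proteins

# Average number of InterPro hits per matched protein sequence.
hits_per_protein = total_hits / total_proteins

print("entry types sum to total:", entries_consistent)
print("protein counts sum to total:", proteins_consistent)
print("average hits per protein: %.2f" % hits_per_protein)
```

Both consistency checks hold for the release 5.2 figures, and the average works out to roughly 3.4 hits per protein sequence.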

4. Forthcoming changes
We have made progress on plans to let users display the scores of InterProScan sequence search results; the scores for protein matches in each entry will also be made available. All the scores are already in the database, and we are working on display options. InterPro will begin incorporating protein structural information by integrating the SCOP and CATH databases. Links to 3D structure information will be provided where known. We hope to release the InterPro XML file more regularly, in the form of point releases, to keep the data in sync with the website.

5. Feedback
We need your help and would welcome any feedback. If you find errors or omissions please let us know.
You can contact us at: Interhelp@ebi.ac.uk.