InterPro documentation

Release 5.0, May 2002

InterPro has been prepared by:

R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), M.Biswas (1), P.Bradley (1,4), P.Bork (8), P.Bucher (3), R.Copley (8), E.Courcelle (6), R.Durbin (5), L.Falquet (5), W.Fleischmann (1), J.Gouzy (6), S.Griffiths-Jones (5), D.Haft (9), N.Hulo (2), D.Kahn (6), A.Kanapin (1), M.Krestyaninova (1), R.Lopez (1), I.Letunic (8), N.Mulder (1), S.Orchard (1), M.Pagni (3), D.Peyruc (6), C.Ponting (7), F.Servant (1), C.Sigrist (2).

(1) EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France;
(7) MRC Functional Genetics Unit, Department of Human Anatomy & Genetics, University of Oxford, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genomic Research, Maryland, USA.

  Contents
  • 1 - Introduction
  • 2 - Changes since last major release
  • 3 - Contents of current release
  • 4 - Forthcoming changes
  • 5 - Feedback

1. Introduction
The databases SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam and ProDom joined forces to launch an Integrated Resource of Protein Families, Domains and Sites, abbreviated InterPro. SMART and TIGRFAMs have subsequently joined InterPro. A detailed description of the project can be found in the InterPro user manual.

2. Changes since last major release
The number of mappings to the Gene Ontology (GO) classification system has increased since the last release. The protein matches have been brought up to date with the latest SWISS-PROT and TrEMBL updates, and additional methods from new releases of the member databases have been added. InterProScan has also been updated to include the new member database signatures.

A new SRS-based text search for InterPro has been added to the text search page. It lets the user search fields in both InterPro and SWISS-PROT or TrEMBL entries simultaneously, which allows more complex queries than the existing simple text search. The stand-alone Perl InterProScan tool now has an option to include scores in the results where applicable.

3. Contents of current release
InterPro release 5.0 contains 5312 entries, representing 1177 domains, 4028 families, 92 repeats, and 15 post-translational modification sites. Overall there are 2524220 InterPro hits from 734448 SWISS-PROT + TrEMBL protein sequences. A complete list is available from the FTP site.

The release was built using the following database versions:


DATABASE     VERSION   ENTRIES   DATE
SWISS-PROT   40.16     108158    02-MAY-2002
TREMBL       20.6      626290    10-MAY-2002
PROSITE      17.1      1517      31-JAN-2002
PREFILE      N/A       252       18-JUL-2001
PFAM         7.1       3621      17-MAR-2002
PRINTS       33.0      1650      24-JAN-2002
PRODOM       20001.3   1346      28-JAN-2002
SMART        3.1       509       16-NOV-2000
TIGRFAMs     1.2       814       03-AUG-2001

The SWISS-PROT and TrEMBL data used includes updates of these versions.
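
The figures quoted in this section can be cross-checked against each other: the four entry-type counts should add up to the total number of entries, and the SWISS-PROT and TrEMBL entry counts should add up to the number of protein sequences. The following is a minimal Python sketch of that check. All variable names are illustrative only and are not part of any InterPro tool; the values are copied from this document.

```python
# Figures copied from this release note; all names are illustrative only.
entries_by_type = {
    "domains": 1177,
    "families": 4028,
    "repeats": 92,
    "ptm_sites": 15,  # post-translational modification sites
}
total_entries = 5312

sequences_by_database = {"SWISS-PROT": 108158, "TrEMBL": 626290}
total_sequences = 734448
total_hits = 2524220

# The entry types should account for every InterPro entry.
assert sum(entries_by_type.values()) == total_entries

# The two protein databases should account for every matched-against sequence.
assert sum(sequences_by_database.values()) == total_sequences

# Derived figure: average number of InterPro hits per protein sequence.
hits_per_sequence = total_hits / total_sequences
print(f"Average InterPro hits per sequence: {hits_per_sequence:.2f}")
```

Both assertions pass for the release 5.0 figures, so the totals stated above agree with their breakdowns.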

4. Forthcoming changes
We have made progress on plans to let users display the scores of InterProScan sequence search results; the scores for protein matches in each entry will also be made available. All the scores are already in the database, and we are working on display options.

InterPro will begin incorporating protein structural information by integrating the SCOP and CATH databases. Links to 3D structure information will be provided where known.

We hope to release the InterPro XML file more regularly, in the form of point releases, to keep the data in sync with the website.

5. Feedback
We need your help and would welcome any feedback. If you find errors or omissions please let us know.
You can contact us at: Interhelp@ebi.ac.uk.