Release Notes

Release 8.0, Friday June 25 2004

Acknowledgements

R.Apweiler (1), T.K.Attwood (4), A.Bairoch (2), A.Bateman (5), D.Binns (1), P.Bradley (1,4), P.Bork (8), P.Bucher (3), L. Cerutti (3), R.Copley (13), E.Courcelle (6), U.Das (1), R.Durbin (5), W.Fleischmann (1), J.Gough (11), J.Gouzy (6), S.Griffiths-Jones (5), D.Haft (9), N.Harte (1), N.Hulo (2), D.Kahn (6), A.Kanapin (1), M.Krestyaninova (1), D.Lonsdale (1), R.Lopez (1), I.Letunic (8), M.Madera (12), J.Maslen (1), J.McDowall (1), N.Mulder (1), A.N. Nikolskaya (10), S.Orchard (1), M.Pagni (3), D.Peyruc (6), C.Ponting (7), E.Quevillon (1), F.Servant (1), C.Sigrist (2), D.J.Studholme (5), R.Vaughan (1), C.H. Wu (10).

(1) EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(2) Swiss Institute for Bioinformatics, Geneva, Switzerland;
(3) Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland;
(4) School of Biological Sciences, The University of Manchester, Manchester, UK;
(5) The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK;
(6) CNRS/INRA, Toulouse, France;
(7) MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, UK;
(8) Biocomputing Unit, EMBL-Heidelberg, Germany;
(9) The Institute for Genomic Research, Maryland, USA;
(10) Protein Information Resource, Georgetown University Medical Center, Washington, D.C., USA;
(11) Genomic Sciences Centre, RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Japan;
(12) MRC Laboratory of Molecular Biology, Cambridge, UK;
(13) Wellcome Trust Centre for Human Genetics, Oxford, UK.

Introduction

The databases UniProt, PROSITE, PRINTS, Pfam, and ProDom joined forces to launch an Integrated Documentation Resource of Protein Families, Domains and Functional Sites, abbreviated InterPro. SMART, TIGRFAMs, PIR SuperFamily and more recently SUPERFAMILY joined InterPro. A detailed description of the project can be found in the user manual.

Changes since last release

The protein matches have been updated according to the latest update of UniProt. New methods from Pfam, PIR SuperFamily, ProDom, PROSITE and TIGRFAMs have been integrated. Since the last release, the TIGRFAMs HMM matches have been recalculated, making them more specific.

The number of ways in which an InterPro entry can be viewed has been increased; the graphical views include:

  • an overview sorted by UniProt protein accession number,
  • an overview sorted by UniProt protein name,
  • an overview sorted by proteins of known structure,
  • an overview grouped by taxonomy,
  • a detailed graphical view sorted by UniProt protein name,
  • a detailed graphical view of proteins of known structure,
  • a table view for all matching proteins,
  • and a table view for those proteins of known structure.

An additional new view is the 'Architectures' view. The InterPro Domain Architecture (IDA) view facilitates domain composition analysis of proteins with domains within InterPro entries. It presents a graphical representation of protein domain architecture, where the domain architecture of a protein sequence is displayed as a series of non-overlapping domains. This viewer facilitates retrieval of all proteins sharing a common architecture, i.e. proteins containing the same domain(s) or repeat(s) in the same order in the protein sequence.
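
The idea of a domain architecture as an ordered series of non-overlapping domains can be illustrated with a minimal Python sketch. It is not the IDA implementation: the protein accessions, entry identifiers and coordinates below are hypothetical placeholders, and the rule used to resolve overlaps (keep the earlier domain, skip any domain that overlaps it) is an assumption made only for illustration.

```python
from collections import defaultdict

# Hypothetical matches: (protein accession, entry id, start, end).
# All identifiers and coordinates are invented for illustration.
MATCHES = [
    ("P_A", "IPR_X", 10, 80),
    ("P_A", "IPR_Y", 100, 180),
    ("P_B", "IPR_Y", 5, 90),
    ("P_B", "IPR_X", 120, 200),
    ("P_C", "IPR_X", 20, 95),
    ("P_C", "IPR_Y", 110, 190),
]


def architecture(domains):
    """Return the entries of one protein in sequence order.

    Assumption: a domain that overlaps the previously kept domain
    is skipped, so the result is a series of non-overlapping domains.
    """
    ordered = sorted(domains, key=lambda d: (d[1], d[2]))
    result = []
    last_end = 0  # sequence positions are 1-based
    for entry, start, end in ordered:
        if start > last_end:
            result.append(entry)
            last_end = end
    return tuple(result)


def group_by_architecture(matches):
    """Map each architecture to the sorted list of proteins sharing it."""
    per_protein = defaultdict(list)
    for accession, entry, start, end in matches:
        per_protein[accession].append((entry, start, end))

    groups = defaultdict(list)
    for accession, domains in per_protein.items():
        groups[architecture(domains)].append(accession)
    return {arch: sorted(accs) for arch, accs in groups.items()}


if __name__ == "__main__":
    for arch, proteins in group_by_architecture(MATCHES).items():
        print(" - ".join(arch), "->", ", ".join(proteins))
```

In this sketch P_A and P_C share one architecture, whereas P_B contains the same two domains in the opposite order and therefore falls into a different group, which reflects the "same order in the protein sequence" condition described above.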

The Taxonomy Viewer has increased functionality: a pop-up box appears as the mouse cursor is moved over the picture. Clicking on a particular lineage returns the protein overview matches for the selected taxonomy and the main taxonomy branches, with the species sorted and displayed alphabetically. Both the UniProt protein accession number and the protein overview match are clickable and return the detailed matches view for the protein. Full taxonomic information for a species can be retrieved from the Newt taxonomy browser by clicking on the taxonomic ID number next to the species name on the display.

In this release we have included the AstexViewer(tm), which permits a 3D view of the CATH and SCOP domains on the PDB chain(s). It is accessed by clicking on the icon in the second column of the 'Detailed protein matches' views. Upon clicking the icon, both the licence for the use of the AstexViewer(tm) and the Java applet page containing the 3D viewer are loaded. The residues included in the CATH or SCOP domain definition are highlighted on the PDB chain. The view can be rotated by holding down the left mouse button and moving the cursor over the image. Clicking the right mouse button over the image opens a menu to perform various functions, such as adding a ligand to the view. The viewer requires a Java-enabled browser. The software should run on most operating systems and in most internet browsers; so far, the only known exception is Internet Explorer 4, on which the viewer does not work.

Contents of current release

InterPro protein matches are now calculated for all UniProt proteins, which are a combination of UniProt/Swiss-Prot, UniProt/TrEMBL and PIR proteins. For more information see UniProt.

InterPro release 8.0 contains 11007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. Overall, there are 4586349 InterPro hits from 1215489 UniProt protein sequences. A complete list is available from the ftp site.
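
The figures above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the numbers quoted in this paragraph; the variable names are illustrative, and the ratio of hits to sequences is a derived value, not a statistic published with the release.

```python
# Entry counts by type, as quoted in the paragraph above.
ENTRY_COUNTS = {
    "domains": 2573,
    "families": 8166,
    "repeats": 201,
    "active sites": 26,
    "binding sites": 21,
    "post-translational modification sites": 20,
}

TOTAL_ENTRIES = 11007      # total entries quoted for release 8.0
TOTAL_HITS = 4586349       # InterPro hits
TOTAL_PROTEINS = 1215489   # UniProt protein sequences with hits


def totals_consistent():
    """Check that the per-type counts add up to the quoted total."""
    return sum(ENTRY_COUNTS.values()) == TOTAL_ENTRIES


# Derived value: average number of hits per matched protein sequence.
hits_per_protein = TOTAL_HITS / TOTAL_PROTEINS

if __name__ == "__main__":
    print("Counts consistent:", totals_consistent())
    print("Hits per protein: %.2f" % hits_per_protein)
```

The per-type counts sum to the quoted total of 11007 entries, and the hit and sequence totals correspond to roughly 3.77 hits per matched protein sequence.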

The current version of InterPro consists of:

DATABASE              VERSION    ENTRIES
UniProt/Swiss-Prot    43.5        153325
PRINTS                37.0          1850
UniProt/TrEMBL        26.5       1062164
Pfam                  13.0          7426
PROSITE patterns      18.24         1697
PROSITE preprofiles   N/A            125
ProDom                2004.1        1522
SMART                 4.0            663
TIGRFAMs              3.0           1977
PIR SuperFamily       2.41           549
SUPERFAMILY           1.63           552

Forthcoming changes

The next release of InterPro will be release 8.1, scheduled for October 2004. It will include new data from the member databases and further improvements to the Taxonomy servlet and to the InterPro Domain Architecture tool. In addition, work is underway to include extended structural class predictions through CATH HMMs.

Feedback

We need your help and would welcome any feedback. If you find errors or omissions, please let us know. You can contact us at EBI Support.

Copyright

InterPro - Integrated Resource Of Protein Domains And Functional Sites. Copyright (C) 2001 The InterPro Consortium. This manual and the accompanying database may be copied and redistributed freely, without advance permission, provided that this Copyright statement is reproduced with each copy.