Catalytic Site Atlas Version 2.2.12
Catalytic Site Atlas: Help Page

Topics

Version Numbers

The CSA version number is composed of three parts; for example, 2.3.11.

Introduction

The Catalytic Site Atlas (CSA) is a database documenting enzyme active sites and catalytic residues in enzymes of known 3D structure. We have defined a classification of catalytic residues which includes only those residues thought to be directly involved in some aspect of the reaction catalysed by an enzyme. For a full description of the classification, see Reference 2.

The CSA contains 2 types of entry:

  1. Original hand-annotated entries, derived from the primary literature. References for these entries are given.
  2. Homologous entries, found by PSI-BLAST alignment (using an e-value cut-off of 0.0005) to one of the original entries. The equivalent residues, which align in sequence to the catalytic residues found in the original entry, are documented.

Access to the CSA is via PDB code, SWISS-PROT entry or E.C. number. Accessing via PDB code takes you straight to the CSA entry for that PDB, while accessing via SWISS-PROT or E.C. number gives a list of all PDB codes for structures assigned that particular SWISS-PROT identifier or E.C. number. Structures with entries in the CSA are given as hyperlinks.

Each CSA entry lists the catalytic residues found in that entry, using PDB residue numbering. Each site is also marked with an evidence tag, which is either "Literature reference" or "PSI-BLAST hit". If the entry is a PSI-BLAST hit, you can follow the link to the original entry. The active site can be visualised using RasMol; for detailed instructions on how to set this up, see "Displaying Active Sites with RasMol" below.

Each entry contains a link to a list of homologous entries found by PSI-BLAST, and a link to other PDB structures with the same E.C. number or SWISS-PROT identifier as the entry you are viewing.

Homologous entries

Entries in the CSA can be divided into literature entries (for which the catalytic residues are derived directly from papers) and homologous entries, found by PSI-BLAST alignment to one of the original entries.

Homologous entries are found using a PSI-BLAST search against all sequences currently in the Protein Data Bank, plus all sequences in a non-redundant subset of UniProt. UniProt sequences that are matched but have no associated structure are not included in the CSA; however, they may still prove useful for bridging the gap between related structures. The PSI-BLAST search has an e-value cut-off of 0.0005 and goes through five iterations.
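As a rough illustration of the search parameters above, the following Python sketch builds a command line for the NCBI BLAST+ `psiblast` program. This is an assumption made for illustration only: the help page does not say which PSI-BLAST implementation was used, nor whether the 0.0005 cut-off applies to reported hits (`-evalue`, as assumed here) or to profile inclusion (`-inclusion_ethresh`). The file and database names are hypothetical.

```python
def psiblast_command(query_fasta, database, out_path,
                     evalue=0.0005, iterations=5):
    """Build (but do not run) a BLAST+ psiblast command line.

    The defaults mirror the e-value cut-off and iteration count quoted
    in the text; mapping the cut-off to -evalue is an assumption.
    """
    return [
        "psiblast",
        "-query", query_fasta,              # sequence of the literature entry
        "-db", database,                    # e.g. PDB plus non-redundant UniProt
        "-evalue", str(evalue),
        "-num_iterations", str(iterations),
        "-out", out_path,
    ]


# Hypothetical file and database names, for illustration only.
command = psiblast_command("1apx.fasta", "pdb_plus_uniprot_nr", "1apx_hits.txt")
print(" ".join(command))
```

The list returned here could be passed to `subprocess.run` on a machine where BLAST+ and a suitable sequence database are installed.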

Generally speaking, homologous entries are only included in the CSA if the residues which align with the catalytic residues in the parent literature entry are identical in residue type. In other words, there must be no mutations at the catalytic residue positions. There are, however, a few exceptions to this rule:

  1. In order to allow for the many active site mutants in the PDB, one (and only one) catalytic residue per site can be different in type from the equivalent in the parent literature entry. This is only permissible if all residue spacing is identical to that in the parent literature entry, and there are at least two catalytic residues.
  2. Sites with only one catalytic residue are permitted to be mutant provided that the residue number is identical to that in the parent entry.
  3. Fuzzy matching of residues is permitted within the following groups: [V,L,I],[F,W,Y],[S,T],[D,E],[K,R],[D,N],[E,Q],[N,Q]. This fuzzy matching cannot be used in combination with rules (1) or (2) above.
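
The acceptance rules above can be sketched in Python as follows. This is a minimal illustration, not the CSA's actual implementation: the function names are hypothetical, "residue spacing" is assumed to mean the differences between consecutive residue numbers, and chain identifiers are ignored.

```python
# Groups within which residue types may be fuzzy-matched (rule 3).
FUZZY_GROUPS = [set("VLI"), set("FWY"), set("ST"), set("DE"),
                set("KR"), set("DN"), set("EQ"), set("NQ")]


def fuzzy_equal(a, b):
    """True if two one-letter residue types are identical or share a fuzzy group."""
    return a == b or any(a in g and b in g for g in FUZZY_GROUPS)


def spacing(site):
    """Differences between consecutive residue numbers (assumed meaning of 'spacing')."""
    numbers = [number for _, number in site]
    return [b - a for a, b in zip(numbers, numbers[1:])]


def accept_site(parent, homologue):
    """Decide whether an aligned homologous site would be kept.

    parent, homologue: equal-length lists of (residue_type, residue_number),
    where position i of one list aligns with position i of the other.
    """
    if not parent or len(parent) != len(homologue):
        return False
    pairs = [(p[0], h[0]) for p, h in zip(parent, homologue)]
    mismatches = [pair for pair in pairs if pair[0] != pair[1]]
    if not mismatches:
        return True   # general rule: identical residue types
    if all(fuzzy_equal(p, h) for p, h in pairs):
        return True   # rule 3, used on its own
    if len(parent) >= 2 and len(mismatches) == 1:
        return spacing(parent) == spacing(homologue)   # rule 1
    if len(parent) == 1:
        return parent[0][1] == homologue[0][1]         # rule 2
    return False
```

For example, `accept_site([("H", 42), ("D", 60)], [("A", 50), ("D", 68)])` is accepted under rule (1) because only one residue differs and the spacing (18) is unchanged, whereas a site combining one fuzzy match with one true mutation is rejected.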

It is always possible that the catalytic residue assignments in a homologous entry are incorrect. This is a particular danger when the catalytic function of the homologous entry differs from that of the original literature entry. For this reason, homologous entries which have a different EC code from the original literature entry are clearly marked.

Catalytic Site Atlas Entries

CSA entry for 1apx
(1) Original Entry
Title: Peroxidase
Compound: Crystal structure of recombinant ascorbate peroxidase
Swiss-Prot: APX1_PEA (2)
EC Class: 1.11.1.11 (3)
Other CSA Entries:
  Homologues of 1apx (4)
  Entries for Swiss-Prot: APX1_PEA (5)
  Entries for EC: 1.11.1.11 (6)
Other Databases:
  PDB entry: 1apx (7)
  PDBsum entry: 1apx (8)
  Swiss-Prot: APX1_PEA (9)
  IntEnz entry: 1.11.1.11 (10)

  1. Original Entry - indicates that this is one of the original hand-annotated set of enzymes. The entry therefore contains references to the primary literature from which its information was gathered.
  2. Link to SWISS-PROT page at EBI
  3. Link to IntEnz enzyme database at EBI
  4. Link to a list of enzymes found by PSI-BLAST to be homologous to the enzyme shown here (1apx, peroxidase). These enzymes form part of the homologous set of enzyme structures contained in the CSA
  5. Link to a list of enzyme structures which share the same SWISS-PROT identifier as the enzyme shown here. Those with entries in the CSA are shown as hyperlinks
  6. Link to a list of enzyme structures which share the same EC number as the enzyme shown here. Those with entries in the CSA are shown as hyperlinks
  7. Link to PDBe PDB site
  8. Link to PDBsum summary of the enzyme structure
  9. Link to SWISS-PROT page at EBI
  10. Link to IntEnz at EBI

  1. This part of the page lists the active sites found or inferred in a particular enzyme structure
  2. One active site within the structure. Clicking on this link will load the structure into RasMol and highlight this particular site on the structure
  3. This is the evidence tag and will read either "Literature reference" or "PSI-BLAST to 1xyz" where 1xyz is one of the original set of enzymes. The Literature reference tag links to a list of the literature used to put the entry together; the PSI-BLAST tag links to the original CSA entry for 1xyz.
  4. These residues are defined by our classification as being catalytic, in the original set. For all homologous entries, the annotation is inferred.
  5. Sometimes an enzyme has a different activity to the literature entry that was used to annotate it. When this is the case, it suggests that the transfer of annotation may be inaccurate. Where a relative has a different EC number to the original literature entry used to annotate it, this message calls attention to the change.
  6. This link allows you to load the protein into RasMol. Ticking the box next to "Catalytic Site" enables you to highlight that particular site on the structure. You can highlight more than one site on a structure by ticking other boxes.


CSA site details

Recently added CSA entries include details of individual residue function and of the evidence that these residues are catalytic. Where this detail is available, there will be a link from the main page for that PDB ID to a page with the format displayed below. This link is only available for literature entries. If you are looking at an entry identified by PSI-BLAST and want to see residue function details, follow the link to the literature entry, and then follow the link to the details.

  1. Each residue or cofactor has its own section.
  2. This section provides a list of the chemical functions of the residue in catalysis, and the types of group that it acts upon.
  3. This section expands on the descriptions in part (2), providing further detail.
  4. The evidence that this is a catalytic residue is listed. This column provides the PubMed IDs of the publications supporting the role of the residue in question.
  5. The protein to which the evidence applies is listed. Often this will be the same protein whose residues are currently being described, in which case this section reads "Current Protein". If the evidence comes from studies on a related protein, the identity of this protein is provided in the form of a UniProt ID for that protein.
  6. This column gives the nature of the experimental evidence that this residue is catalytic.

Homologues in the CSA

Homologues found for: 1apx (1)
Literature Entries:
Entry    Title                        Species                                 EC Class
         Peroxidase                   Pisum sativum

Homologues found from: 1apx (2)
CSA Entries:
Entry    Title                        Species                                 EC Class
         Oxidoreductase               Saccharomyces cerevisiae
         Oxidoreductase               Saccharomyces cerevisiae
         Oxidoreductase               Saccharomyces cerevisiae
         Oxidoreductase               Saccharomyces cerevisiae
         Oxidoreductase               Saccharomyces cerevisiae
         Oxidoreductase               Saccharomyces cerevisiae
         ...
         Oxidoreductase (h2o2(a))     Saccharomyces cerevisiae
         Oxidoreductase (h2o2 (a))    Saccharomyces cerevisiae
         Oxidoreductase (h2o2(a))     Saccharomyces cerevisiae recombinant
         Oxidoreductase               Armoracia rusticana
         Oxidoreductase               Armoracia rusticana

  1. This lists the homologous entries found for 1apx, and is divided into two parts: first, any homologues found by PSI-BLAST that also have a hand-annotated "literature" entry; second, homologues for which we infer the annotation.
  2. The list of enzymes with inferred annotation. Each entry is a hyperlink to the CSA entry page for that PDB code.


EC Classes in the CSA

EC Search of Catalytic Site Atlas. (1)

PDB files contained in the Catalytic Site Atlas.

EC: 1.11.1.5 (2)
CSA Entries:

  1. Wildcard searches can be performed by leaving a field of the search form blank (e.g. searching on EC number 1.11. ... will find all the peroxidases in the CSA).
  2. PDB files corresponding to the EC number searched for (1.11.1.5 in the example above). There are links to entries in the CSA.
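
The wildcard behaviour described in (1) can be pictured with a small Python sketch. The function name and the treatment of missing trailing fields as blank are assumptions made for illustration; this is not the CSA's own search code.

```python
def ec_matches(query, ec_number):
    """True if a four-field EC number matches a query whose blank fields are wildcards."""
    wanted = [field.strip() for field in query.split(".")]
    wanted += [""] * (4 - len(wanted))   # missing trailing fields count as blank
    actual = ec_number.split(".")
    if len(wanted) != 4 or len(actual) != 4:
        return False
    return all(w == "" or w == a for w, a in zip(wanted, actual))


# A blank third and fourth field selects every EC 1.11.x.x entry.
entries = ["1.11.1.11", "1.11.1.5", "3.4.21.4"]
print([ec for ec in entries if ec_matches("1.11", ec)])
```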



Displaying Active Sites with RasMol.

The Catalytic Site Atlas displays three-dimensional representations of catalytic sites within enzymes using a RasMol script.


Getting RasMol

RasMol is a popular program for displaying molecular structures. It is free and runs on Windows, Unix, and Macintosh/PPC computers.

You can install the program on your own machine by going to: Getting & Installing RasMol.


Configuring Netscape to run RasMol

Netscape version 4

  1. From the Edit option on the menubar select: Preferences.
  2. Locate the Navigator entry in the Category column on the left. If its arrow points to the right, then click on the arrow to open up the subentries Languages and Applications. (If the arrow is pointing downwards, then these should already be listed below the Navigator entry).
  3. Click on Applications.
  4. Click on New and fill in the form as follows:-
  5. Click the application check-box and type the following command into the box:- Or use the Choose button to select the file.

  6. Click on OK


Netscape 6 and Mozilla

Later versions of Netscape and Mozilla require a helper program to launch RasMol.

For Unix, create a file named 'start_rasmol_script.com' containing the following two lines:

#!/bin/csh
xterm -e rasmol -script $1

and make it executable using the command:
chmod +x start_rasmol_script.com

For Windows, create a file named 'start_rasmol_script.bat' containing the following two lines:

cd /D C:\TEMP
"c:\Program Files\rasmol\raswin.exe" -pdb %~1

In Netscape:

  1. From the Edit option on the menubar select: Preferences.
  2. Locate the Navigator entry in the Category column on the left. If its arrow points to the right, then click on the arrow to open up the subentries Languages and Applications. (If the arrow is pointing downwards, then these should already be listed below the Navigator entry).
  3. Click on Applications.
  4. Click on New and fill in the form as follows:-
  5. Click the application check-box and type the following command into the box: Or use the Choose button to select the file.

  6. Click on OK

Note:

To display PDB files using RasMol use the above procedures with:

and run RasMol with the option:

Templates in the CSA

Structural templates are a geometric description of small groups of residues within a protein. Structural templates have been created for the catalytic sites in the CSA. These are available for download individually; alternatively, the whole set of templates can be used to search a structure. The effectiveness of each template at recognising related catalytic sites has been analysed.

Each template represents a single CSA family. A CSA family consists of one structure annotated using information from the literature, plus all its relatives.

Two different types of template have been created for each CSA family. These two template types used different atom subsets to represent the position of each residue. The first type of template represented residues in terms of the positions of their alpha carbon and beta carbon atoms, and was therefore a reflection of backbone orientation. The second type represented each residue using three functional atoms, and was a reflection of the orientation of the ends of the residue sidechains. In each case, the template was constructed from the family member whose catalytic residues were nearest the average position for all catalytic residues in the family.
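
To make the first template type concrete, the Python sketch below collects the alpha carbon (CA) and beta carbon (CB) atoms of chosen residues from PDB-format ATOM records. It is an illustration only: the function name and return layout are hypothetical, it does not reproduce the JESS or TESS template formats, and it relies on the standard fixed-column layout of PDB ATOM lines. Glycine has no beta carbon, so it would contribute only a CA atom.

```python
def ca_cb_atoms(pdb_lines, residues):
    """Collect CA and CB atoms for the given residues.

    pdb_lines: iterable of PDB-format lines.
    residues:  set of (chain_id, residue_number) pairs naming the
               catalytic residues of interest.
    Returns a list of (residue_name, chain_id, residue_number,
    atom_name, (x, y, z)) tuples in file order.
    """
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()      # columns 13-16
        residue_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21]                  # column 22
        residue_number = int(line[22:26])    # columns 23-26
        if atom_name in ("CA", "CB") and (chain_id, residue_number) in residues:
            xyz = (float(line[30:38]),       # columns 31-38
                   float(line[38:46]),       # columns 39-46
                   float(line[46:54]))       # columns 47-54
            atoms.append((residue_name, chain_id, residue_number, atom_name, xyz))
    return atoms
```

Given the lines of a PDB file and a set such as `{("A", 42), ("A", 38)}` (hypothetical residue numbers), the function returns up to two atoms per residue.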

The templates have been analysed to discover how well they can discriminate family matches from random noise. Each CSA family was analysed individually. For each CSA family, a representative template was selected, as mentioned above. This representative template was used to search a non-redundant subset of the PDB, in order to discover the distribution of random matches for that template. These random matches were compared with matches to other members of the CSA family. The ability of the template to discriminate random matches from family matches is quantified by performance statistics described below.

Templates in the CSA

Templates
Ca/Cb atom template (1)
Download template in JESS format (2)
Download template in TESS format
Template performance (3)
Sensitivity 0.758
Predictive accuracy 1.000
RMSD threshold 0.460 Å
Template derived from family member 1a2f
Explanation of this plot (4)
Explanation of this plot (5)
  1. Type of template. For each CSA family, there are two types of template available, which use different atom subsets to represent the position of each residue: one uses the alpha carbon and beta carbon atoms, and the other uses three functional atoms per residue. In each case, the template was constructed from the family member whose catalytic residues were nearest the average position for all catalytic residues in the family.
  2. Each template can be downloaded for use with template-matching programs. Two template formats are available.
  3. These statistics describe the performance of the template at discriminating family matches from random matches.
    Sensitivity
    A measure of how well the template detects real matches. The proportion of family members for this template which are below the RMSD threshold.
    Predictive accuracy
    A measure of how often the matches detected by the template are real. The proportion of matches below the RMSD threshold for this template which are family members.
    RMSD threshold
    This level of RMSD was found to be the optimal dividing line between family members (with lower RMSDs) and random matches (with higher RMSDs), for this template. Hits with RMSDs higher than the threshold may still be meaningful. The RMSD threshold is calculated by maximising Matthews Correlation Coefficient, a measure of how well two categories (family and random matches) are separated.
  4. Sequence-structure plot. This template represents a "family" of enzymes: one enzyme that has had its catalytic residues identified based on the scientific literature, plus other enzymes identified as relatives using PSI-BLAST. One of these has been used to derive the representative template. This plot shows the variability in the family. Each point represents a single family member. The Y-axis represents the structural difference between the catalytic residues of the representative template and the family member in question. The X-axis represents the sequence identity between the representative template and the family member in question.
  5. Histogram of matches. This template represents a "family" of enzymes: one enzyme that has had its catalytic residues identified based on the scientific literature, plus other enzymes identified as relatives using PSI-BLAST. One of these has been used to derive the representative template. RMSDs have been calculated between the representative template and other family members, and also between the representative template and random matches to the rest of the PDB. This histogram shows the distribution of family matches across different levels of RMSD, and also the distribution of random matches. Note that the Y-axis shows the percentage of family hits (or the percentage of random hits) falling in each histogram bin. Because this is a percentage in each case, it obscures the fact that there are hundreds of times more random hits than family hits.
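
The performance statistics in (3) can be illustrated with a short Python sketch. It assumes that a match counts as detected when its RMSD is at or below the threshold, and that candidate thresholds are taken from the observed RMSD values; the function names and the toy RMSD values are invented for illustration and are not CSA data.

```python
import math


def confusion(family_rmsds, random_rmsds, threshold):
    """Count true/false positives and negatives at a given RMSD threshold."""
    tp = sum(r <= threshold for r in family_rmsds)   # family members detected
    fn = len(family_rmsds) - tp                      # family members missed
    fp = sum(r <= threshold for r in random_rmsds)   # random matches detected
    tn = len(random_rmsds) - fp                      # random matches rejected
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Proportion of family members at or below the threshold."""
    return tp / (tp + fn) if tp + fn else 0.0


def predictive_accuracy(tp, fp):
    """Proportion of matches at or below the threshold that are family members."""
    return tp / (tp + fp) if tp + fp else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold(family_rmsds, random_rmsds):
    """Observed RMSD value that maximises the MCC."""
    candidates = sorted(set(family_rmsds) | set(random_rmsds))
    return max(candidates,
               key=lambda t: mcc(*confusion(family_rmsds, random_rmsds, t)))


# Toy RMSD values in Angstroms, invented for illustration.
family = [0.2, 0.3, 0.4, 1.5]
random_hits = [0.9, 1.2, 1.6, 2.0]
threshold = best_threshold(family, random_hits)
tp, fp, tn, fn = confusion(family, random_hits, threshold)
print(threshold, sensitivity(tp, fn), predictive_accuracy(tp, fp))
```

With these toy values the chosen threshold is 0.4, which gives a sensitivity of 0.75 and a predictive accuracy of 1.0: one family member lies above the threshold, mirroring the note that hits above the threshold may still be meaningful.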