Annotating the Overall Reaction in MACiE

Overall annotation in MACiE has two components:
Annotation in ISIS/Base
A number of fields of the overall annotation must be filled in within ISIS/Base. These are:
Figure 2, below, shows an example of the overall annotation that is entered using ISIS.
Once this annotation has been completed, the remainder should be done using the annotation script, available to registered developers.

Annotation using the Script

This script is only available to registered developers of MACiE and is password protected. Below is a screenshot of the overall annotation script. If you would like to become a registered developer, please email me with details of how you would like to contribute.
The overall annotation consists of four sections:

1) Annotation Details

These have nothing to do with the reaction annotation, but state the date the entry was last updated and the people responsible for adding the reaction. The two fields here are:

Date last updated. This should be in the format ddmmyyyy, e.g. 05122007 for 5th December 2007.

Entry created by. This should be the initials of the person annotating the reaction. Current initials used for annotators are: GLH (Gemma Holliday), GJB (Gail Bartlett), DEA (Daniel Almonacid), AM (Andre Minoche), JDF (Julia Fischer), JAR (Judith Reeks).

2) Overall Reaction Identifiers

These are the identifiers that link the current MACiE entry to other databases. The identifiers included are:

MACiE ID. This is the MACiE ID of the current entry. It should always take the form M\d{4} (the letter M followed by four digits), e.g. M0001.

CATH code(s). This links MACiE to the CATH database and allows us to access information on the structural evolution of the enzyme. This is auto-generated from the primary PDB code for the MACiE entry. CATH codes are split into catalytic domains (those domains which furnish at least one catalytic residue) and "other" domains.

KEGG Reaction Identifiers. This should be the reaction identifier in KEGG for the enzyme's overall reaction. It will not always be possible to find an exact correspondence, in which case the field should be left blank.

EzCatDB code. This links MACiE entries to Nozomi Nagano's database (EzCatDB).

SFLD identifier. This links MACiE to the Structure-Function Linkage Database (SFLD). To find the relevant ID in the SFLD, first search for the reaction using the EC number. This presents a list of all the families that perform the reaction with that EC number. Select a family; some members of this family may well have crystal structures associated with them.
Clicking on the number of associated crystal structures gives a list of PDB codes, from which the relevant code can be selected, giving the selected structure's database ID.

Species Name. There are two fields associated with species name: the scientific name and the common name.

UniProt code. This links us to the UniProt database, which gives information on the protein sequences. These may be split into catalytic and non-catalytic chains.

It is not always possible to fill in all the identifier fields, e.g. there may be only a single UniProt identifier and no non-catalytic UniProt identifiers.

3) Overall Reaction Chemistry

This section describes the chemistry occurring in the overall reaction. It should simply reflect the changes between the reactants and products of the overall reaction, and should not take into account the mechanism involved in the individual steps of the reaction.

The Reactive Centres should be annotated. These are defined as any atom at which a reaction occurs, such that a bond is formed, cleaved or altered in order, or any atom at which the oxidation state changes. They should be listed as if a unique ID were assignable to each one, so there may be multiple occurrences of each atom type. Unfortunately, unique IDs are not assignable in ISIS; however, we hope eventually to automate this particular annotation, which will allow unique IDs to be assigned.

All the bonds annotated in MACiE are entered as if a unique ID were assignable to the bond, allowing for multiple occurrences of the same bond type.

The Bonds Involved are those bonds whose change cannot be described by bond formation, cleavage or change in order. An example of a bond involved would be one in which the stereochemistry is changed. They are written as a normal bond, e.g. N-H for a single bond, C=O for a double bond, C#N for a triple bond.

Bonds Formed are those bonds that are formed during the course of the overall reaction.
Bonds Cleaved are those bonds that are broken during the course of the overall reaction. They are written as a normal bond, e.g. N-H for a single bond, C=O for a double bond, C#N for a triple bond.

Note: a multiple bond may be formed or cleaved in the overall reaction. However, the bond should also be included in the bonds changed in order.

Bonds changed in order are those bonds that change in order during the course of a reaction. They are written in a different format from the other bonds involved in the overall reaction: starting bond, starting order, final bond, final order. E.g. a C=O bond going to a C-O bond would be entered as: C=O,2,C-O,1.

Cofactors are small molecules, other than standard amino acids, that assist an enzyme in catalysis. To exclude allosteric regulators, we require a cofactor to be present in the active site. Cofactors can be inorganic (e.g. metal ions) or organic (e.g. PLP), and organic cofactors may sometimes be complexed with metal ions (e.g. heme). These should be annotated in the following way. First, the name of the cofactor should be given, followed by the HET group name from the PDB file. Where there is no HET group name, the word "none" should be used. Then comes the number associated with the HET group (analogous to the residue number), followed by the chain identifier. Again, if these values are not available, the word "none" should be used. All these values should be separated by commas, e.g. FAD,FAD,450,A or, when there is no information available, FAD,none,none,none. If there are multiple cofactors of the same type with no PDB information, please distinguish them by assigning an arbitrary series of numbers (e.g. 1 and 2).

Amino acid residues involved in the reaction should also be annotated here. They should be entered as the three-letter code, followed by the PDB residue number, followed by the chain identifier, e.g. Asp126A.

Finally, the question "Does the enzyme return to its native state?" is posed.
There are three options available here: Yes, No and the default value of a blank field. This question is a control, and is used to determine the number of reactions in which further evidence is needed in order to return the enzyme to a state in which it is able to undergo another reaction. However, if the enzyme does not have any amino acid residues that change in their bonding or electronic configurations, then the enzyme has not changed and so cannot return. This is not a No, and so the field should be left blank. A special case of return is the racemases, which may act upon either enantiomer to produce the other enantiomer. In these cases the reaction is often symmetrical, and so the enzyme is considered to return at the end of a single cycle, even if the end point is not identical to the original starting point.

4) Other Overall Reaction Information

Evidence for the mechanism should also be included. This currently takes one or more of the following values, which are a subset of those also allowed for amino acid residue function:
The evidence terms have been designed to be somewhat analogous to those used by the CSA (although a little more generic), and more can be added if needed.

Because the EC number for an enzyme may change over time, it is important to include Previous EC Numbers in the annotation, especially as the PDB very often lists only the original EC number associated with the crystal structure, and not the current one.

The biological unit (either according to the authors or the computational methods of the PDBe) should be noted. If the biological unit is a homodimer, then it should be annotated as homodimeric. Currently, this is a free-text box.

The location of the active site should also be annotated. This can be within a single domain, at the interface between two domains on a single chain, or at the interface between two (or more) chains; more rarely, the reaction can occur across multiple active sites.

Finally, Overall Reaction Comments may be added to the overall annotation. These are comments that are more appropriate to the entire reaction than to a specific reaction step, and may include information on alternative mechanisms that have been suggested, or simply more detailed information on the protein structure itself.

Once all this data has been input, the script should be updated (using the Update button), at which point it will be possible to add the evidence for the presence and function of the amino acid residues and cofactors. The evidence for the amino acid residues and the cofactors is essentially the same as for the overall reaction, with an extra value:
Completing the Overall Reaction Annotation

When all the annotation has been completed, the Write to File box should be checked and the Update button pressed. This produces a text string in the correct format for copying and pasting into the annotation field of the overall reaction in ISIS/Base.
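The field formats described in the sections above lend themselves to a quick mechanical check before the string is pasted into ISIS/Base. The following Python sketch is not part of the MACiE annotation script, and every function name in it is hypothetical. It assumes that atoms are written as element symbols, that the chain identifier is a single upper-case letter, and that the stated order of a bond must agree with its symbol (1 for -, 2 for =, 3 for #); none of these assumptions is stated explicitly in this guide.

```python
import re
from datetime import datetime

# Bond symbols used in this guide: "-" single, "=" double, "#" triple.
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}
# Assumption: atoms are written as element symbols such as C, N or Fe.
BOND_RE = re.compile(r"[A-Z][a-z]?([-=#])[A-Z][a-z]?")
# Assumption: the chain identifier is a single upper-case letter.
RESIDUE_RE = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z])")


def is_valid_date(text):
    """Return True if text is a real calendar date written as ddmmyyyy."""
    if re.fullmatch(r"\d{8}", text) is None:
        return False
    try:
        datetime.strptime(text, "%d%m%Y")
    except ValueError:
        return False
    return True


def is_valid_macie_id(text):
    """Return True if text has the form M\\d{4}, e.g. M0001."""
    return re.fullmatch(r"M\d{4}", text) is not None


def is_valid_bond(text):
    """Return True for a bond such as N-H, C=O or C#N."""
    return BOND_RE.fullmatch(text) is not None


def is_valid_order_change(text):
    """Check 'starting bond,starting order,final bond,final order'."""
    parts = text.split(",")
    if len(parts) != 4:
        return False
    for bond, order in (parts[0:2], parts[2:4]):
        match = BOND_RE.fullmatch(bond)
        # Assumption: the stated order must agree with the bond symbol.
        if match is None or str(BOND_ORDERS[match.group(1)]) != order:
            return False
    return True


def parse_cofactor(text):
    """Split 'name,HET name,HET number,chain'; the word 'none' becomes None."""
    parts = text.split(",")
    if len(parts) != 4 or not all(parts) or parts[0] == "none":
        raise ValueError(f"expected four comma-separated values: {text!r}")
    name, het, number, chain = (None if p == "none" else p for p in parts)
    return name, het, number, chain


def parse_residue(text):
    """Split a residue such as Asp126A into (code, number, chain)."""
    match = RESIDUE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"expected e.g. Asp126A, got {text!r}")
    code, number, chain = match.groups()
    return code, int(number), chain
```

With these definitions, the examples given in this guide (05122007, M0001, C=O,2,C-O,1, FAD,FAD,450,A, FAD,none,none,none and Asp126A) are all accepted, while a seven-digit date or an order that contradicts its bond symbol is rejected.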