Data Structure in ISIS/Base
MACiE in ISIS/Base has the following format:

However, when it is exported as an RDFile, it is a flat text file. The following hierarchy is implied

Data Structure in MySQL
Although MACiE is a standalone database in its own right, the full strength of MACiE comes from its interactions with Metal-MACiE and CoFactor as well as the core and data databases themselves.

Metal-MACiE, whilst conceptually part of the MACiE database, is designed to function as a separate entity. It organises the available information on the properties and roles of metal ions along the reaction pathways of the catalytic reactions carried out by metalloenzymes, specifically those metalloenzymes in MACiE. The main aim of Metal-MACiE is to aid our understanding of the chemistry that underlies metal-dependent catalysis, in the same manner that MACiE aids our understanding of how enzymes perform catalysis.
CoFactor is a distinct entity and provides a web interface to access hand-curated data extracted from the literature on the 27 organic enzyme cofactors, as well as automatically collected information. CoFactor includes information on the conformational and solvent accessibility variation of the enzyme-bound cofactors, including mechanistic and structural information about the hosting enzymes. Whilst it is closely related to MACiE, it concentrates on the overview of the cofactors, with the step details still being held within MACiE.
There are four basic components to the MACiE database:
- The overall reaction
- The reaction steps
- The citation details
- Automatically generated structural data
There are six basic types of table in the MySQL database:
- The main entry table - this includes all the data specific to a MACiE entry, such as: the representative PDB code and the resolution of that crystal structure; the EC number, along with its breakdown into its constituent parts; the accepted name of the enzyme; any comments relating to either the overall reaction or the chosen PDB code; the species from which the entry has come; whether the enzyme is returned to a state in which it can catalyse a further cycle; and whether the overall reaction can proceed in the reverse direction.
- The overall reaction tables describe the chemistry occurring in the overall reactions in seven tables:
  - Overall substrates and products - this lists the overall substrates and products involved in the reaction and also notes the stoichiometry (count) of those molecules, their molecular weight and atomic compositions.
  - Catalytic amino acid residues - this describes the catalytic residues involved in the reaction mechanism in general terms, with a description of the residue's activity (spectator only, reactant only, or both) and notes on the residue, as well as more detailed information on the base residue type for those residues that are post-translationally modified.
  - Cofactors - similarly to the catalytic amino acid residue table, this describes the cofactors involved in the reaction mechanisms in general terms.
  - Bond changes - this table describes the bond changes occurring in the overall reaction, and lists the bonds formed, cleaved, changed in order and those that are "involved".
  - Reactive centres - this table details the atoms involved as reactive centres in the overall reaction.
  - Evidence - this table details the evidence cited for the mechanism as shown in MACiE. This table is not yet fully implemented.
- The reaction step tables describe the chemistry occurring during each reaction step that is involved in the enzyme mechanism in seven tables:
  - Step - this is the main reaction step table. It records whether the reaction is reversible, any comments associated with it and, where known, a representative PDB structure.
  - Mechanisms - this table describes the Ingold mechanism or reaction type of the step.
  - Mechanism components - this table describes the reaction attributes of the step.
  - Bond changes - this table describes the bond changes occurring in the reaction step, and lists the bonds formed, cleaved and changed in order.
  - Reactive centres - this table details the atoms involved as reactive centres in the reaction step.
  - Residue functions - this table describes the functions, and the location of each function, that the catalytic amino acid residues perform in a given reaction step.
  - Cofactor functions - this table describes the functions that the cofactors perform in a given reaction step. It includes both metal and non-metal cofactors, although the actual annotation details seen in the web version of MACiE are drawn from Metal-MACiE for the metal cofactors.
- The reference tables - these hold the details of all the references cited in MACiE.
The background database contains the various statistics tables:
- Version Statistics - this table contains the various statistics relating to the individual versions of MACiE, including the number of entries, representative PDB codes, unique EC numbers, unique EC sub-subclasses, unique catalytic CATH domains and the total number of unique CATH domains.
- Amino acid residue catalytic propensity - this table is used to calculate the propensity of a residue to be catalytic. For each residue type it holds the total percentage of that residue type in MACiE (for each EC class) and the total count of that residue type in all the proteins in MACiE, the catalytic percentage of that residue type (again, for each EC class) and the number of residues of that type that are catalytic, as well as the propensity, which is calculated as the catalytic percentage divided by the total percentage for that residue type.
- The identifier tables - these describe the various identifiers used in MACiE, including the CATH codes, UniProt codes, previous EC numbers, EzCatDB and SFLD identifiers, and the KEGG to ChEBI correspondences.
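The propensity calculation described for the amino acid residue catalytic propensity table can be illustrated with a minimal Python sketch. The function name and the counts are hypothetical and are not values taken from MACiE. The sketch also assumes that the catalytic percentage is the share of all catalytic residues that belong to the given residue type, which is one reading of the description above.

```python
def catalytic_propensity(catalytic_count, total_catalytic,
                         residue_count, total_residues):
    """Propensity = catalytic percentage / total percentage for one residue type."""
    # Assumed meaning: share of all catalytic residues that are of this type.
    catalytic_pct = 100.0 * catalytic_count / total_catalytic
    # Share of all residues in the proteins that are of this type.
    total_pct = 100.0 * residue_count / total_residues
    return catalytic_pct / total_pct


# Hypothetical counts: 50 of 500 catalytic residues are of this type,
# and the type accounts for 2000 of 100000 residues overall.
propensity = catalytic_propensity(50, 500, 2000, 100000)
```

Under these assumptions, a value above 1 means the residue type is over-represented among catalytic residues relative to its overall abundance.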
All of these tables are then used to automatically generate both the web pages that make up the publicly available database, and the CML.
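The one-to-many relationships between the main entry table, the step table and the step bond change table described above might be sketched as follows. This is only an illustration: the table names, column names and inserted values are hypothetical and do not reproduce the real MACiE schema, and SQLite (from the Python standard library) stands in for MySQL so that the example runs without a database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (              -- hypothetical main entry table
    macie_id  TEXT PRIMARY KEY,
    pdb_code  TEXT,
    ec_number TEXT
);
CREATE TABLE step (               -- hypothetical reaction step table
    step_id     INTEGER PRIMARY KEY,
    macie_id    TEXT NOT NULL REFERENCES entry(macie_id),
    step_number INTEGER NOT NULL,
    reversible  INTEGER
);
CREATE TABLE step_bond_change (   -- hypothetical bond change table
    step_id INTEGER NOT NULL REFERENCES step(step_id),
    bond    TEXT NOT NULL,
    change  TEXT NOT NULL         -- 'formed', 'cleaved' or 'order changed'
);
""")

# Placeholder data, not a real MACiE entry.
conn.execute("INSERT INTO entry VALUES (?, ?, ?)", ("M9999", "XXXX", "0.0.0.0"))
conn.executemany("INSERT INTO step VALUES (?, ?, ?, ?)",
                 [(1, "M9999", 1, 1), (2, "M9999", 2, 0)])
conn.executemany("INSERT INTO step_bond_change VALUES (?, ?, ?)",
                 [(1, "C-O", "formed"), (1, "O-H", "cleaved"), (2, "C-N", "cleaved")])

# Count the bond changes recorded for each step of one entry.
rows = conn.execute("""
    SELECT s.step_number, COUNT(b.bond)
    FROM step AS s
    JOIN step_bond_change AS b ON b.step_id = s.step_id
    WHERE s.macie_id = ?
    GROUP BY s.step_number
    ORDER BY s.step_number
""", ("M9999",)).fetchall()
```

A query of this shape is the kind of lookup a page generator would need in order to list the bond changes for each step of an entry.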
Data Structure in CML
All data in a CML file are held under the top-level cml element, because an XML document must have exactly one top-level element. Each entry starts with a reactionScheme element whose id is a MACiE number (Mxxxx, where x is a digit). All elements have an associated dictRef where appropriate.
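As a rough illustration of these two conventions, the following Python sketch parses a minimal document and checks the root element and the entry identifier. The sample document is a made-up placeholder, XML namespaces are omitted for brevity, and the four-digit pattern is an assumption based on the Mxxxx form given above.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: the letter M followed by exactly four digits.
MACIE_ID = re.compile(r"M[0-9]{4}")

# Placeholder document; real files carry far more content.
doc = """<cml>
  <reactionScheme id="M9999"/>
</cml>"""

root = ET.fromstring(doc)
entry_ids = [scheme.get("id") for scheme in root.findall("reactionScheme")]
all_valid = all(MACIE_ID.fullmatch(entry_id) for entry_id in entry_ids)
```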
The top level reactionScheme may contain the following:
- reactions. Typically there are two of these, one with the role="macie:overallReaction" and one with role="macie:overallProductsAndReactants"
- a metadataList. This includes the metadataList for the references and the general information associated with the overall annotation. The references metadataList contains a metadataList for each reference cited, which in turn contains individual metadata elements for each author, the journal, start page, end page, volume and year.
- name of the enzyme, may also include the name of the species the enzyme came from
- identifiers for the PDB code, CATH code and EC number (each has its own identifier with a specific convention: PDB, CATH and EC respectively). The EC number consists of four labels, each one relating to a specific level of the EC code.
- a reactionStepList which includes all the steps in the reaction. This contains reactionSteps which typically include a single reaction although they could also contain a reactionScheme. A reactionStepList may also contain a list of mappings between reactions.
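The layout listed above can be sketched by building a skeleton entry with Python's xml.etree.ElementTree. The element names and the two role values come from the description; the convention and value attributes, the step id format, and all of the data values are assumptions made for illustration, and namespaces and dictRef attributes are left out.

```python
import xml.etree.ElementTree as ET


def build_entry(macie_id, enzyme_name, ec_number, step_count):
    """Build a skeleton reactionScheme for one entry (illustrative only)."""
    scheme = ET.Element("reactionScheme", id=macie_id)
    # The two overall reactions, distinguished by their role.
    for role in ("macie:overallReaction", "macie:overallProductsAndReactants"):
        ET.SubElement(scheme, "reaction", role=role)
    ET.SubElement(scheme, "metadataList")
    ET.SubElement(scheme, "name").text = enzyme_name
    # One label per level of the EC number (attribute names are assumed).
    ec = ET.SubElement(scheme, "identifier", convention="EC")
    for level in ec_number.split("."):
        ET.SubElement(ec, "label", value=level)
    # Each reactionStep typically wraps a single reaction.
    steps = ET.SubElement(scheme, "reactionStepList")
    for number in range(1, step_count + 1):
        step = ET.SubElement(steps, "reactionStep")
        ET.SubElement(step, "reaction", id=f"{macie_id}.step{number}")
    return scheme


cml = ET.Element("cml")
cml.append(build_entry("M9999", "hypothetical enzyme", "0.0.0.0", 2))
xml_text = ET.tostring(cml, encoding="unicode")
```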
All reactions can have the following children (and all elements may have metadataList children):
- metadataList which contains all the information not suited to any other element.
- reactantList which has reactant children. Reactant amino acid residues will have label children which detail the function of a given amino acid. Reactants (and products) have atomArray and bondArray children, which have atom and bond children, which in turn may have label and bondStereo children. These labels are typically the name given to a specific R group.
- productList which has product children
- spectatorList which has spectator children. Amino acid spectators will have label children which detail the function of a given amino acid spectator.
- mechanism which can have children of labels or mechanismComponents
- reactiveCentre which can have atomList, atomTypeList, bondList and bondTypeList children. Typically only atomTypeLists and bondTypeLists are found in MACiE. Each list can contain atomTypes and bondTypes respectively.
- map which contains the atom-atom mapping.
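Reading the participants back out of a reaction element might look like the sketch below. The sample XML is a hand-written placeholder rather than real MACiE output, and the title and value attributes are assumptions; only the element names follow the list above.

```python
import xml.etree.ElementTree as ET

# Placeholder reaction; names and functions are invented for illustration.
SAMPLE = """<reaction id="r1">
  <reactantList>
    <reactant title="residue A"><label value="function 1"/></reactant>
    <reactant title="substrate"/>
  </reactantList>
  <productList>
    <product title="product"/>
  </productList>
  <spectatorList>
    <spectator title="residue B"><label value="function 2"/></spectator>
  </spectatorList>
</reaction>"""


def summarise(reaction):
    """Map each participant kind to a list of (title, [label values])."""
    summary = {}
    for list_tag, child_tag in (("reactantList", "reactant"),
                                ("productList", "product"),
                                ("spectatorList", "spectator")):
        summary[child_tag] = [
            (item.get("title"),
             [label.get("value") for label in item.findall("label")])
            for item in reaction.findall(f"{list_tag}/{child_tag}")
        ]
    return summary


summary = summarise(ET.fromstring(SAMPLE))
```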
It should be noted that we do not currently have the chemical structures of the reactions in the CML version of MACiE. This is due to a technical issue, which we hope to resolve soon.