Enzyme-catalysed reactions are ubiquitous and essential to the chemistry of life. A great deal of knowledge exists, including structures, gene sequences, mechanisms, metabolic pathways and kinetic data, but it is spread across many different databases and throughout the literature. To consolidate much of this information, two databases were developed:
- MACiE (which stands for Mechanism, Annotation and Classification in Enzymes) is a database of enzyme reaction mechanisms. MACiE was the world's first electronic database of the chemical mechanisms of enzymatic reactions.
- The Catalytic Site Atlas (CSA) is a database documenting enzyme active sites and catalytic residues in enzymes of known 3D structure.
The M-CSA (Mechanism and Catalytic Site Atlas) is a unified resource that combines the data in both MACiE and the CSA to facilitate the searching of active sites and to remove redundancy between the two resources. The first version of the M-CSA was released in September 2017 (publication in preparation) and represents a major update of both the data and the underlying resource architecture.
First, we have combined the MACiE and CSA datasets. The M-CSA now contains 944 entries, of which 401 have a complete mechanism and 543 describe the catalytic site only. In addition, 57 new entries have been added. During the data amalgamation process, 303 entries were found to be identical between the CSA and MACiE, and there were a further 127 duplicate entries within the CSA. These duplicates were retained within the curator-defined parent entry. A literature review has also been performed to update the citations and, where appropriate, the mechanisms. To date, approximately a third of the entries have had this review completed. This version also represents a change in how the data are stored and updated. We no longer use ISIS/Base and now have a new annotation process that directly accesses a PostgreSQL database using the Python Django web framework.
We'd like to thank the following for their help and input into M-CSA: Antonio Ribeiro, Gemma Holliday, Nick Furnham, Jon Tyzack, Katherine Ferris, Neera Borkakoti, Roman Laskowski, John Mitchell and Janet Thornton.
M-CSA has been funded by EMBL and the Wellcome Trust.
MACiE was a collaborative project between the Thornton Group at the European Bioinformatics Institute and the Mitchell Group at the University of St Andrews (initially within the Unilever Centre for Molecular Informatics, part of the University of Cambridge). We also extended our collaboration to include the Bertini Group at the Magnetic Resonance Center (CERM) in Florence (Italy). This aspect of the collaboration drew on CERM's expertise in metalloproteins, and together we developed Metal MACiE, a database of catalytic metal ions, with a view to understanding the roles and activity of catalytic metals in enzymes.
MACiE was first released in December 2005 (PMID:16188925). This version contained 100 non-homologous entries drawn from the CatRes dataset (see: Bartlett et al. (2002) Analysis of Catalytic Residues in Enzyme Active Sites. DOI:10.1016/S0022-2836(02)01036-7, PMID:12421562). The initial web-based version of MACiE consisted of static HTML pages and was not searchable. Entries could be browsed through HTML look-up tables, arranged by EC number, CATH code, PDB code, or MACiE entry number. Approximately half of the entries had 2D-SVG animations associated with them. The underlying data were stored using MDL's ISIS/Base.
Version 2 of MACiE was released in 2007 (PMID:17082206) and extended coverage of MACiE to 202 entries. Several changes were made during the release cycle of Version 2 (from version 2.0, released in January 2007, through to version 2.5, released in December 2010): a further 78 entries were added (bringing the total to 280 entries), and the database was moved from static HTML pages to Perl CGI scripts backed by a MySQL database.
Version 3 of MACiE was released in August 2011 (PMID:22058127) and extended coverage of MACiE to 335 entries. This version of MACiE represented a shift in emphasis for new entries, from non-homologous representatives covering EC reaction space to enzymes with mechanisms of interest to our users and collaborators with a view to exploring the chemical diversity of life.
Many people have been involved in the MACiE project over the years and we would like to thank the following for their valuable contributions: Gemma L. Holliday, Gail J. Bartlett, Daniel E. Almonacid, Roman Laskowski, Syed Asad Rahman, Julia D. Fischer, Claudia Andreini, James Torrance, Sophie T. Williams, Anna M. Goral, Noel M. O'Boyle, Mattias Blomberg, André Minoche, Judith Reeks, John B. O. Mitchell, Peter Murray-Rust, William Pearson, and Janet M. Thornton.
MACiE has been funded by: EMBL, EPSRC, BBSRC, Wellcome Trust, Cambridge Overseas Trust, Gobierno de Chile Ministerio de Planificación y Cooperación, and Unilever.
The CSA was first released in 2002 as the Catalytic Residue dataset (CatRes, PMID:12421562), and was later formally published as the CSA in 2004 (PMID:14681376). Version 1 of the CSA contained 177 original hand-annotated entries (derived from the primary literature, and also called literature entries) and 2,608 homologous entries (determined automatically), which covered around 30% of all EC numbers found in the PDB at that time. Homologous entries were found by performing an iterative PSI-BLAST search using the literature entries; the residues in the homologues were then aligned to the core sequence and the residue correspondences were documented.
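The residue-correspondence step described above can be illustrated with a minimal sketch. It assumes that a gapped pairwise alignment between a literature entry and a homologue is already available (for example, from the homology search) and that residues are counted by simple sequence position; the CSA itself reports PDB residue numbering. The function name `map_residues` and the toy sequences are hypothetical and are not taken from the CSA code base.

```python
def map_residues(aligned_ref, aligned_hom, ref_positions):
    """Transfer residue positions from a reference sequence to a homologue.

    aligned_ref, aligned_hom: equal-length alignment rows, with '-' for gaps.
    ref_positions: 1-based positions of annotated residues in the reference.
    Returns {reference position: homologue position, or None if the
    annotated residue is aligned to a gap in the homologue}.
    """
    if len(aligned_ref) != len(aligned_hom):
        raise ValueError("alignment rows must have equal length")

    wanted = set(ref_positions)
    mapping = {pos: None for pos in wanted}
    ref_pos = 0  # residues consumed so far in the reference
    hom_pos = 0  # residues consumed so far in the homologue

    for ref_char, hom_char in zip(aligned_ref, aligned_hom):
        if ref_char != "-":
            ref_pos += 1
        if hom_char != "-":
            hom_pos += 1
        # Record a correspondence only when both columns hold a residue.
        if ref_char != "-" and hom_char != "-" and ref_pos in wanted:
            mapping[ref_pos] = hom_pos
    return mapping


# Toy alignment (made-up sequences): reference residues 3, 4 and 5 are annotated.
toy_mapping = map_residues("MKH-SDE", "MRHGS-E", [3, 4, 5])
print(toy_mapping)
```

In this toy case, reference residue 4 shifts to homologue position 5 because of the insertion before it, and reference residue 5 has no counterpart because it is aligned to a gap.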
The CSA underwent several annotation cycles, which extended both the coverage and the level of annotation data captured. When Version 2 was released in 2014 (PMID:24319146), the database contained 983 literature entries and well over 32,000 homologue annotations. Version 2 of the CSA implemented a more stringent homology search using FASTA and added chemical similarity information, computed using SMSD, for the substrates and products of each enzyme's overall chemical transformation. The annotation was also extended to include the overall function of the catalytic residues and a brief textual description of that function.
In all versions of the CSA, access was via PDB ID, UniProtKB ID or EC number only, and each CSA entry listed the catalytic residues found in that entry, using PDB residue numbering. Each site was also marked with an evidence tag, either "Literature reference" or "Homologue". When an entry had been found by sequence comparison, it was possible to follow a link to the original entry. In Version 2, a JMol visualisation of the active site was added. Each entry contained a link to a list of homologous entries (identified using the in-house homology method), and a link was also provided to other CSA entries with the same EC number or UniProtKB identifier as the entry being viewed.
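As an illustration only, the entry layout described above could be modelled along the following lines. This is a sketch under stated assumptions, not the CSA schema: the class names, field names and the `find_entries` helper are hypothetical, and the identifiers and residues in the example are placeholders rather than real annotations.

```python
from dataclasses import dataclass
from typing import List, Optional

EVIDENCE_TAGS = ("Literature reference", "Homologue")


@dataclass(frozen=True)
class CatalyticResidue:
    name: str    # three-letter residue code
    chain: str   # PDB chain identifier
    number: int  # PDB residue numbering


@dataclass
class SiteEntry:
    pdb_id: str
    uniprot_id: str
    ec_number: str
    residues: List[CatalyticResidue]
    evidence: str
    source_pdb_id: Optional[str] = None  # original entry, for homologues

    def __post_init__(self):
        if self.evidence not in EVIDENCE_TAGS:
            raise ValueError(f"unknown evidence tag: {self.evidence!r}")
        if self.evidence == "Homologue" and self.source_pdb_id is None:
            raise ValueError("a homologue entry needs a link to its original entry")


def find_entries(entries, *, pdb_id=None, uniprot_id=None, ec_number=None):
    """Return the entries that match every identifier that was supplied."""
    if pdb_id is None and uniprot_id is None and ec_number is None:
        raise ValueError("supply at least one of pdb_id, uniprot_id, ec_number")
    hits = []
    for entry in entries:
        if pdb_id is not None and entry.pdb_id.lower() != pdb_id.lower():
            continue
        if uniprot_id is not None and entry.uniprot_id != uniprot_id:
            continue
        if ec_number is not None and entry.ec_number != ec_number:
            continue
        hits.append(entry)
    return hits


# Placeholder data: identifiers and residues are invented for the example.
lit = SiteEntry("xxx1", "X00001", "0.0.0.1",
                [CatalyticResidue("HIS", "A", 57)], "Literature reference")
hom = SiteEntry("xxx2", "X00002", "0.0.0.1",
                [CatalyticResidue("HIS", "A", 61)], "Homologue",
                source_pdb_id="xxx1")
entries = [lit, hom]

same_ec = find_entries(entries, ec_number="0.0.0.1")
print([e.pdb_id for e in same_ec])
```

Keeping the evidence tag and the link to the original entry on the record itself mirrors the behaviour described above, where a homologue entry could always be traced back to the literature entry it was derived from.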
A number of people have contributed to the CSA over the years as annotators and developers. We would like to thank Jonathan Barker, Carine Berezin, Amy Buchanan-Huges, Lynn Carr, Olivia Chan, Josephine Charalambous, Emma Compton, Atlanta Cook, Jennifer Dawe, Angelica Datta, Christian Drew, Alex Gutteridge, Stephanie Juniat, Roman Laskowski, Oleg Lenive, Mei Leung, Stuart Lucas, Ben McLeod, Malcolm MacArthur, Gary McDowell, Angela Malumbe, Duncan Milburn, Fiona Morgan, James Murray, Nozomi Nagano, Jonathan Ng, Emma Penn, Craig Porter, Judith Reeks, Peter Sarkies, Steven Smith, James Torrance, Annabel Todd, Andrew Wallace, Anna Waters, Sophie Williams, Eleanor Wright, Gemma Holliday, Nick Furnham, and (of course) Janet Thornton.
The CSA was funded by EMBL and the Wellcome Trust (grant number: 081989/Z/07/A).
Other publications relating to M-CSA
- Rahman SA et al. (2016), Bioinformatics, 32, 2065-2066. Reaction Decoder Tool (RDT): extracting features from chemical reactions. DOI:10.1093/bioinformatics/btw096. PMID:27153692.
- Furnham N et al. (2016), J Mol Biol, 428, 253-267. Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies. DOI:10.1016/j.jmb.2015.11.010. PMID:26585402.
- Martínez Cuesta S et al. (2016), Proc Natl Acad Sci U S A, 113, 1796-1801. Exploring the chemistry and evolution of the isomerases. DOI:10.1073/pnas.1509494113. PMID:26842835.
- Furnham N et al. (2014), Nucleic Acids Res, 42, D485-D489. The Catalytic Site Atlas 2.0: cataloging catalytic sites and residues identified in enzymes. DOI:10.1093/nar/gkt1243. PMID:24319146.
- Martínez Cuesta S et al. (2014), Curr Opin Struct Biol, 26, 121-130. The evolution of enzyme function in the isomerases. DOI:10.1016/j.sbi.2014.06.002. PMID:25000289.
- Holliday GL et al. (2014), J Mol Biol, 426, 2098-2111. Exploring the biological and chemical complexity of the ligases. DOI:10.1016/j.jmb.2014.03.008. PMID:24657765.
- Rahman SA et al. (2014), Nat Methods, 11, 171-174. EC-BLAST: a tool to automatically search and compare enzyme reactions. DOI:10.1038/nmeth.2803. PMID:24412978.
- Alcántara R et al. (2013), Nucleic Acids Res, 41, D773-D780. The EBI enzyme portal. DOI:10.1093/nar/gks1112. PMID:23175605.
- Furnham N et al. (2012), PLoS Comput Biol, 8, e1002403. Exploring the evolution of novel enzyme functions within structurally defined protein superfamilies. DOI:10.1371/journal.pcbi.1002403. PMID:22396634.
- Holliday GL et al. (2012), Nucleic Acids Res, 40, D783-D789. MACiE: exploring the diversity of biochemical reactions. DOI:10.1093/nar/gkr799. PMID:22058127.
- Furnham N et al. (2012), Nucleic Acids Res, 40, D776-D782. FunTree: a resource for exploring the functional evolution of structurally defined enzyme superfamilies. DOI:10.1093/nar/gkr852. PMID:22006843.
- Holliday GL et al. (2011), FEBS J, 278, 3835-3845. Characterizing the complexity of enzymes on the basis of their mechanisms and structures with a bio-computational analysis. DOI:10.1111/j.1742-4658.2011.08190.x. PMID:21605342.
- Fischer JD et al. (2010), J Mol Biol, 403, 803-824. The Structures and Physicochemical Properties of Organic Cofactors in Biocatalysis. DOI:10.1016/j.jmb.2010.09.018. PMID:20850456.
- Fischer JD et al. (2010), Bioinformatics, 26, 2496-2497. The CoFactor database: organic cofactors in enzyme catalysis. DOI:10.1093/bioinformatics/btq442. PMID:20679331.
- Andreini C et al. (2009), Bioinformatics, 25, 2088-2089. Metal-MACiE: a database of metals involved in biological catalysis. DOI:10.1093/bioinformatics/btp256. PMID:19369503.
- Rahman SA et al. (2009), J Cheminform, 1, 12. Small Molecule Subgraph Detector (SMSD) toolkit. DOI:10.1186/1758-2946-1-12. PMID:20298518.
- Holliday GL et al. (2009), J Mol Biol, 390, 560-577. Understanding the functional roles of amino acid residues in enzyme catalysis. DOI:10.1016/j.jmb.2009.05.015. PMID:19447117.
- Andreini C et al. (2008), J Biol Inorg Chem, 13, 1205-1218. Metal ions in biological catalysis: from enzyme databases to general principles. DOI:10.1007/s00775-008-0404-5. PMID:18604568.
- O'Boyle NM et al. (2007), J Mol Biol, 368, 1484-1499. Using Reaction Mechanism to Measure Enzyme Similarity. DOI:10.1016/j.jmb.2007.02.065. PMID:17400244.
- Holliday GL et al. (2007), J Mol Biol, 372, 1261-1277. The Chemistry of Protein Catalysis. DOI:10.1016/j.jmb.2007.07.034. PMID:17727879.
- Holliday GL et al. (2007), Nucleic Acids Res, 35, D515-D520. MACiE (Mechanism, Annotation and Classification in Enzymes): novel tools for searching catalytic mechanisms. DOI:10.1093/nar/gkl774. PMID:17082206.
- Torrance JW et al. (2007), J Mol Biol, 369, 1140-1152. The Geometry of Interactions between Catalytic Residues and their Substrates. DOI:10.1016/j.jmb.2007.03.055. PMID:17466330.
- Holliday GL et al. (2006), J Chem Inf Model, 46, 145-157. Chemical Markup, XML, and the World Wide Web. 6. CMLReact, an XML Vocabulary for Chemical Reactions. DOI:10.1021/ci0502698. PMID:16426051.
- Holliday GL et al. (2005), Bioinformatics, 21, 4315-4316. MACiE: a database of enzyme reaction mechanisms. DOI:10.1093/bioinformatics/bti693. PMID:16188925.
- Torrance JW et al. (2005), J Mol Biol, 347, 565-581. Using a library of structural templates to recognise catalytic sites and explore their evolution in homologous families. DOI:10.1016/j.jmb.2005.01.044. PMID:15755451.
- Gutteridge A et al. (2005), Trends Biochem Sci, 30, 622-629. Understanding nature's catalytic toolkit. DOI:10.1016/j.tibs.2005.09.006. PMID:16214343.
- George RA et al. (2005), Proc Natl Acad Sci U S A, 102, 12299-12304. Effective function annotation through catalytic residue conservation. DOI:10.1073/pnas.0504833102. PMID:16037208.
- Porter CT et al. (2004), Nucleic Acids Res, 32, D129-D133. The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. DOI:10.1093/nar/gkh028. PMID:14681376.
- Bartlett GJ et al. (2003), J Mol Biol, 331, 829-860. Catalysing new reactions during evolution: economy of residues and mechanism. PMID:12909013.
- Bartlett GJ et al. (2002), J Mol Biol, 324, 105-121. Analysis of Catalytic Residues in Enzyme Active Sites. DOI:10.1016/S0022-2836(02)01036-7. PMID:12421562.