Examples of why we need ChEBI
Chemical Nomenclature - part 1
A single molecule often has a large number of valid chemical names (Figure 2), whilst a single common name, particularly a chemically ambiguous one, may be used for more than one molecule. This causes major problems when trying to find all references to a compound in the scientific literature.
Paracetamol, acetaminophen, 4-acetamidophenol, N-(4-hydroxyphenyl)acetamide, panadol, tylenol...?
Figure 2 This drug is generally known as paracetamol in the UK but as acetaminophen in the US. It also has numerous names based on its structure (including those conforming to either IUPAC or Chemical Abstracts naming rules), brand names, etc.
ChEBI reduces ambiguity by providing a unique and recommended name along with a stable ChEBI identifier. It also includes a recommended IUPAC name and a collection of synonyms, including brand names and International Nonproprietary Names (INNs) for drugs, which can be used by text mining applications.
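The idea of many names resolving to one stable identifier can be sketched in a few lines of Python. This is only an illustration of the principle, not how ChEBI is implemented: the dictionary, the function names and the identifier `CHEBI:XXXXX` are all assumptions made for the example, and the identifier is a deliberate placeholder so that the exercise below stays open.

```python
# Toy synonym index: every name variant points at the same record.
# "CHEBI:XXXXX" is a placeholder, NOT a real ChEBI identifier.
PLACEHOLDER_ID = "CHEBI:XXXXX"

SYNONYMS = {
    "paracetamol": PLACEHOLDER_ID,                   # common name (UK)
    "acetaminophen": PLACEHOLDER_ID,                 # common name (US)
    "4-acetamidophenol": PLACEHOLDER_ID,             # structure-based name
    "n-(4-hydroxyphenyl)acetamide": PLACEHOLDER_ID,  # structure-based name
    "panadol": PLACEHOLDER_ID,                       # brand name
    "tylenol": PLACEHOLDER_ID,                       # brand name
}


def normalise(name):
    """Trim and lower-case a name so lookups ignore case and padding."""
    return name.strip().lower()


def resolve(name):
    """Return the identifier for a name, or None if the name is unknown."""
    return SYNONYMS.get(normalise(name))


def same_compound(name_a, name_b):
    """True only when both names are known and share one identifier."""
    id_a, id_b = resolve(name_a), resolve(name_b)
    return id_a is not None and id_a == id_b


if __name__ == "__main__":
    print(same_compound("Tylenol", "paracetamol"))
    print(resolve("an unknown name"))
```

A text mining application could use a lookup of this kind to count every mention of a compound under one identifier, whichever synonym the author chose.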
Try it yourself
1. Go to the ChEBI homepage and type "tylenol" into the search box. What is the ChEBI name for this compound?
2. One of the secondary IDs (to the right of the structure) is CHEBI:46191. Return to the ChEBI homepage and search for "46191". What is the ChEBI ID of the record you find?
Chemical Nomenclature - part 2
Another problem with chemical nomenclature is the use of the same chemical name for more than one substance (Figure 3).
Figure 3 Adrenaline can have several possible structures.
ChEBI names are more specific and are followed by an appropriate definition.
Try it yourself
Go to the ChEBI homepage and type "adrenaline" into the search box. Find the biologically active isomer.
A further problem is the missing stereochemical representation of biologically important molecules in the research literature.
ChEBI provides stereospecific illustrations as 2D and 3D figures, where possible. Additional structural information is also available in the form of the IUPAC International Chemical Identifier (InChI) and the Simplified Molecular Input Line Entry System (SMILES).
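To see how a line notation can carry stereochemistry, compare SMILES strings for adrenaline written with and without stereo markers. The sketch below is a rough heuristic, not a chemistry toolkit: it only looks for the characters that SMILES uses for tetrahedral centres (`@`) and double-bond geometry (`/` and `\`). The variable and function names are assumptions made for this example, and the two enantiomer strings are deliberately left unlabelled as (R) or (S).

```python
# SMILES for adrenaline with no stereochemistry: the name alone cannot
# tell you which of the possible structures is meant.
ADRENALINE_FLAT = "CNCC(O)c1ccc(O)c(O)c1"

# The two enantiomers differ only in the chirality marker (@ versus @@).
ADRENALINE_ENANTIOMER_A = "CNC[C@H](O)c1ccc(O)c(O)c1"
ADRENALINE_ENANTIOMER_B = "CNC[C@@H](O)c1ccc(O)c(O)c1"

# Characters SMILES uses to encode stereochemistry.
STEREO_MARKERS = ("@", "/", "\\")


def has_stereo_markers(smiles):
    """Heuristic: True if the string contains any SMILES stereo marker."""
    return any(marker in smiles for marker in STEREO_MARKERS)


if __name__ == "__main__":
    for label, smiles in [
        ("flat", ADRENALINE_FLAT),
        ("enantiomer A", ADRENALINE_ENANTIOMER_A),
        ("enantiomer B", ADRENALINE_ENANTIOMER_B),
    ]:
        print(label, smiles, has_stereo_markers(smiles))
```

A string check like this cannot say whether the stereochemistry is complete or correct; for that, a proper cheminformatics library or the record's InChI would be needed.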
Go to the ChEBI homepage and type "cholesterol" into the search box. Find the stereochemistry.
Annotation of Bioinformatics data
Chemical annotations are commonly captured as free text. The terminologies used vary from one annotator to another, so it can be difficult to interpret the intended meaning.
ChEBI uses standard, consistent vocabularies or ontologies with clearly defined terms (see ChEBI Ontology) for annotations, which makes them semantically compatible and allows biological information to be searched by machine.
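A small sketch shows why defined terms linked in a hierarchy help machine searching: a query for a broad term can also find records annotated with more specific terms beneath it, which free text cannot guarantee. The hierarchy, record names and function names here are invented for illustration and are not copied from the ChEBI ontology.

```python
# Hypothetical toy hierarchy: specific term -> broader term ("is a").
IS_A = {
    "(R)-adrenaline": "adrenaline",
    "(S)-adrenaline": "adrenaline",
    "adrenaline": "catecholamine",
    "dopamine": "catecholamine",
}


def ancestors(term):
    """Walk up the is-a links and return every broader term, nearest first."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def annotated_with(annotations, query):
    """Return records annotated with the query term or any term beneath it."""
    return sorted(
        record
        for record, term in annotations.items()
        if term == query or query in ancestors(term)
    )


if __name__ == "__main__":
    # Hypothetical records, each annotated with one controlled term.
    annotations = {
        "record 1": "(R)-adrenaline",
        "record 2": "dopamine",
        "record 3": "adrenaline",
    }
    print(annotated_with(annotations, "adrenaline"))
    print(annotated_with(annotations, "catecholamine"))
```

With free-text annotations, the same search would depend on each annotator having happened to use the same wording.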
Updating existing information
It can take many years to identify natural products, involving a long series of publications that cover the initial identification of a molecular skeleton (and its possible revision), followed by identification of some stereocentres (and their possible subsequent revisions), and eventually the full identification of the complete structure.
ChEBI is a manually curated database where the information stored is frequently updated with useful literature resources and database cross-references.