- What is ChEBI?
- What classes of compounds are included in ChEBI?
- What are ChEBI's sources of data?
- What is the 'ChEBI Name'?
- What is the 'ChEBI ID'?
- What is the 'IUPAC Name'?
- How does ChEBI handle names of compounds which contain special characters (e.g. italics and Greek letters)?
- Why is there sometimes a 'more structures' link displayed for an entity in ChEBI?
- The structure of an entity is too complex to be viewed clearly on the ChEBI page. How do I magnify it?
- How do I save or transfer a structure?
- What is the IUPAC International Chemical Identifier?
- What is SMILES?
- How is the information in ChEBI validated?
- Can I provide input into ChEBI?
- How frequently is ChEBI updated?
- Why does ChEBI seem to use the name of the free acid as the 'ChEBI name' while listing under 'Synonyms' the names of its associated anions?
- What is the molecule portrayed in the ChEBI logo?
- Why are some synonyms and formulae fully formatted with italics, sub- and super-scripts, Greek characters, etc., while others appear as plain text?
- What is the recommended way for citing ChEBI in publications?
- You say that the 'ChEBI ID' is stable, but why does it seem like the ChEBI IDs of some entries have changed after the last ChEBI release?
- What is the ChEBI Ontology?
- Why are some of the entries and relationships displayed in the Tree View of the ChEBI Ontology shown in blue while others are shown in grey?
1. What is ChEBI?
ChEBI stands for 'Chemical Entities of Biological Interest'. It is a freely available database of 'small molecular entities', developed at the EBI. The term 'molecular entity' encompasses any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity.
2. What classes of compounds are included in ChEBI?
All compounds and other molecular entities within the database are either products of nature or synthetic products used to intervene in the processes of living organisms (either on purpose, as for drugs, or by accident, as for chemicals in the environment). However, only molecular entities not directly encoded by the genome are included, and thus in general nucleic acids, proteins and peptides derived from proteins by cleavage are not found within ChEBI.
3. What are ChEBI's sources of data?
ChEBI has imported its compound data from two main sources. These are:
1. IntEnz - the Integrated relational Enzyme database of the EBI, created in collaboration with the Swiss Institute of Bioinformatics (SIB). IntEnz contains the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) on the nomenclature and classification of enzyme-catalysed reactions.
2. KEGG LIGAND - Part of the Kyoto Encyclopedia of Genes and Genomes, LIGAND is a composite database, one part of which (COMPOUND) is a collection of biochemical compound structures.
ChEBI also incorporates data from other sources. These are listed in the ChEBI User Manual.
4. What is the 'ChEBI Name'?
The ChEBI Name for an entity is the name that we recommend for use by the biological community. It is generally a traditional name, although in some cases it has been modified to enhance clarity, avoid ambiguity and follow current IUPAC recommendations on chemical nomenclature more closely.
5. What is the 'ChEBI ID'?
The ChEBI ID is a unique and stable identifier for an entity, for example CHEBI:27732, and may be cited by external users. It has no chemical significance.
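As an illustration of how such identifiers might be handled in client code, the following Python sketch checks that a string consists of the prefix CHEBI: followed by digits and extracts the numeric part. The function name parse_chebi_id and the exact pattern are assumptions made for this example; they are not taken from ChEBI's own software.

```python
import re

# Assumed pattern: the literal prefix "CHEBI:" followed by one or more digits.
_CHEBI_ID = re.compile(r"CHEBI:(\d+)")


def parse_chebi_id(text: str) -> int:
    """Return the numeric part of a ChEBI ID, or raise ValueError."""
    match = _CHEBI_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a ChEBI ID: {text!r}")
    return int(match.group(1))


print(parse_chebi_id("CHEBI:27732"))  # prints 27732
```

Because the identifier has no chemical significance, the number is best treated as an opaque key rather than as something to sort or compare chemically.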
6. What is the 'IUPAC Name'?
The IUPAC Name is a name provided for an entity based on current recommendations of the International Union of Pure and Applied Chemistry (IUPAC). In most cases a single IUPAC Name is provided for an entity.
7. How does ChEBI handle names of compounds which contain special characters (e.g. italics and Greek letters)?
All entries in ChEBI are fully coded using XML. However, when searching for compounds using the ChEBI search engine, the XML tags may be ignored. For example, results for 3α-hydroxy-5α-pregnane will be obtained by typing 3alpha-hydroxy-5alpha-pregnane into the query box. More detailed information on searching, including the use of wildcards, is given in the separate Help facility on the ChEBI Search pages.
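The idea of typing a plain-text equivalent for a formatted name can be sketched in Python as below. This is only an illustration of the principle, not ChEBI's search implementation: the function name normalise_query, the partial Greek-letter table and the <i>...</i> markup in the usage example are all assumptions.

```python
import re

# Hypothetical, partial mapping from Greek letters to their spelled-out names.
GREEK_TO_NAME = {"α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta", "ω": "omega"}

# Matches anything that looks like a markup tag, e.g. <i> or </i>.
_TAG = re.compile(r"<[^>]+>")


def normalise_query(name: str) -> str:
    """Strip markup tags and spell out the Greek letters listed above."""
    without_tags = _TAG.sub("", name)
    return "".join(GREEK_TO_NAME.get(ch, ch) for ch in without_tags)


print(normalise_query("3α-hydroxy-5α-pregnane"))  # prints 3alpha-hydroxy-5alpha-pregnane
```

A formatted name such as "<i>N</i>-acetyl-β-alanine" (markup assumed for illustration) would be reduced to "N-acetyl-beta-alanine" by the same function.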
8. Why is there sometimes a 'more structures' link displayed for an entity in ChEBI?
In many cases, one structure is not enough to convey fully the complexity of a given entity. ChEBI annotators are therefore able to create and display as many structures as they deem necessary to depict an entity successfully. Clicking on the 'more structures' link (where this is present) will display the extra structures.
9. The structure of an entity is too complex to be viewed clearly on the ChEBI page. How do I magnify it?
Clicking on the 'Applet' button will open an interactive MarvinView applet which allows a structure to be manipulated. Double-clicking on the applet itself or selecting 'Window' in the dropdown menu opens this as a new window which can be maximised, allowing the structure to be viewed full-screen at a higher magnification. Other options in the dropdown menu allow different representations of the structure to be viewed, rotated, etc. Clicking on 'Image' restores the original image view.
10. How do I save or transfer a structure?
Clicking on the Molfile link beneath a structural diagram allows the molfile for that structure to be saved.
11. What is the IUPAC International Chemical Identifier?
The IUPAC International Chemical Identifier, known as the InChI, is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources, thus enabling easier linking of diverse data compilations. Developed by IUPAC, it expresses chemical structures in terms of atomic connectivity, tautomeric state, isotopes, stereochemistry and electronic charge in order to produce a sequence of machine-readable characters unique to the respective molecule. Further information on the InChI is available at http://www.iupac.org/inchi/
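To make the layered nature of an InChI string concrete, the following Python sketch splits a standard InChI into its slash-separated layers. It is a simplified illustration rather than a full parser: it assumes a formula layer is present, names only a few layer prefixes, and keeps just the last occurrence if a prefix is repeated. The function name split_inchi is invented for this example, and ethanol is used purely as a small, well-known test molecule.

```python
# Names for a few common layer prefixes; other layers keep their one-letter prefix.
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protons",
    "i": "isotopes",
}


def split_inchi(inchi: str) -> dict:
    """Split an InChI string into version, formula and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    # Assumption: the second field is always the formula layer.
    version, formula, *layers = inchi[len(prefix):].split("/")
    result = {"version": version, "formula": formula}
    for layer in layers:
        if layer:  # skip empty fields
            result[LAYER_NAMES.get(layer[0], layer[0])] = layer[1:]
    return result


ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
print(split_inchi(ethanol))
```

For ethanol this yields the version 1S, the formula C2H6O, a connectivity layer 1-2-3 and a hydrogen layer 3H,2H2,1H3.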
12. What is SMILES?
SMILES (Simplified Molecular Input Line Entry System) is a simple but comprehensive chemical line notation, created in 1986 by David Weininger and further extended by Daylight Chemical Information Systems, Inc. SMILES specifically represents a valence model of a molecule and is widely used as a data exchange format. Further information on SMILES is available at http://www.daylight.com/smiles/
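As a small illustration of how a SMILES string encodes a molecule, the Python sketch below counts the non-hydrogen atoms in a SMILES string for caffeine. It is deliberately limited: it recognises only unbracketed 'organic subset' atoms and rejects anything containing bracket atoms. The function name count_heavy_atoms is invented for this example, and the caffeine string is a commonly used SMILES for that molecule, not one copied from the ChEBI entry.

```python
import re
from collections import Counter

# Unbracketed organic-subset atoms; two-letter symbols are tried first.
_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def count_heavy_atoms(smiles: str) -> Counter:
    """Count non-hydrogen atoms in a SMILES string without bracket atoms."""
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    # Lower-case (aromatic) symbols are folded into their element symbol.
    return Counter(symbol.capitalize() for symbol in _ATOM.findall(smiles))


caffeine = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
print(dict(count_heavy_atoms(caffeine)))  # prints {'C': 8, 'N': 4, 'O': 2}
```

Hydrogens are implicit in this notation, which is why they do not appear in the count. A cheminformatics toolkit would be needed for anything beyond this kind of surface inspection.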
13. How is the information in ChEBI validated?
All entries in ChEBI are manually checked by a member of the curatorial team before being released for public access. The checking process involves reference to the original data sources and, where necessary, consultation of the primary literature.
14. Can I provide input into ChEBI?
If you have any suggestions for entities which you believe should be included in ChEBI, please contact us using either the email form supplied or our SourceForge project page (both accessible via the Contact ChEBI link on the left-hand menu). You will receive an automated acknowledgement by email, which will be followed up at the earliest opportunity with a personal reply from a member of the ChEBI team.
15. How frequently is ChEBI updated?
Work continues on entering new data into ChEBI and on refining the ontology. Changes arising from this as well as from updates to the source databases are incorporated into ChEBI monthly.
16. Why does ChEBI seem to use the name of the free acid as the 'ChEBI name' while listing under 'Synonyms' the names of its associated anions?
This is a temporary measure. In ChEBI we try to represent the free acids and their associated anions as distinct entities, linked by "is conjugate base of" and "is conjugate acid of" relationships. However, the main data sources of ChEBI often do not make such a distinction and frequently use the terms for anions as synonyms for free acids (e.g. KEGG COMPOUND:C00158 lists "Citrate" and "Citric acid" as synonyms).
17. What is the molecule portrayed in the ChEBI logo?
The molecule portrayed in the ChEBI logo is caffeine (CHEBI:27732).
18. Why are some synonyms and formulae fully formatted with italics, sub- and super-scripts, Greek characters, etc., while others appear as plain text?
ChEBI displays synonyms and formulae exactly as they are portrayed in the source databases. As different databases have different policies on how they display information, these differences are necessarily reflected in the ChEBI entries. In contrast, the ChEBI Name and IUPAC Name(s) for a compound are always fully formatted.
19. What is the recommended way for citing ChEBI in publications?
We recommend citing ChEBI along these lines:
Degtyarenko, K., de Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M. and Ashburner, M. (2008) ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344–D350.
20. You say that the 'ChEBI ID' is stable, but why does it seem like the ChEBI IDs of some entries have changed after the last ChEBI release?
In order to decrease redundancy, ChEBI entries that refer to the same molecular entity are being combined into one entry (merged). In the resulting merged entries, only one ChEBI ID is shown. However, it is still possible to retrieve the same entry using any of the original ChEBI IDs. For example, the identifier CHEBI:8345 will still retrieve the entry for potassium(1+), even though that entry now has the identifier CHEBI:29103.
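The effect of merging can be pictured as a lookup from older identifiers to the identifier now shown on the merged entry. The Python sketch below uses the single pair mentioned above; the table name SECONDARY_TO_PRIMARY and the function resolve are hypothetical and do not describe how ChEBI stores this information internally.

```python
# Older (merged) ID -> ID currently shown on the entry.
# The only pair listed here is the potassium(1+) example from the text.
SECONDARY_TO_PRIMARY = {"CHEBI:8345": "CHEBI:29103"}


def resolve(chebi_id: str) -> str:
    """Return the current ID for an entry, given any of its IDs."""
    return SECONDARY_TO_PRIMARY.get(chebi_id, chebi_id)


print(resolve("CHEBI:8345"))   # prints CHEBI:29103
print(resolve("CHEBI:29103"))  # prints CHEBI:29103
```

Identifiers that were never merged pass through unchanged, so previously cited IDs continue to point at the right entry.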
21. What is the ChEBI Ontology?
The ChEBI Ontology is a structured classification of the entities contained within ChEBI. Its structure is essentially that of a directed acyclic graph (DAG), which differs from a simple taxonomy in that a child term can have many parent terms. Additionally, a number of relationships are incorporated which are cyclic in nature. The ontology comprises three separate sub-ontologies: Chemical Entity, Role (which includes Biological Role, Chemical Role and Application) and Subatomic Particle. It employs a number of different relationships and offers the user a choice of two views: a 'Parents and Children View' (in which the types of relationship between an entry and its immediate parent or children are stated in words) and a 'Tree View' (a graphic which places the ChEBI entry into context within the overall ontology structure). Further information about the sub-ontologies, relationships and views is given in the ChEBI User Manual.
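The difference between a DAG and a simple taxonomy can be sketched in a few lines of Python. The terms below ("entity X", "class A", "class B", "root") are hypothetical placeholders, not real ChEBI entries, and the "is a" edge type is assumed for the example. The reciprocal pair at the end uses the "is conjugate acid of" and "is conjugate base of" relationships named elsewhere in this FAQ to show the kind of relationship that is cyclic.

```python
# Hypothetical "is a" edges: each child maps to the set of its parents.
IS_A = {
    "entity X": {"class A", "class B"},  # one child, two parents
    "class A": {"root"},
    "class B": {"root"},
    "root": set(),
}


def ancestors(term: str, is_a: dict = IS_A) -> set:
    """Collect every term reachable by following parent edges upwards."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Reciprocal relationships form a two-step cycle, so they are kept
# separate from the acyclic "is a" hierarchy in this sketch.
CONJUGATE = [
    ("citric acid", "is conjugate acid of", "citrate"),
    ("citrate", "is conjugate base of", "citric acid"),
]

print(sorted(ancestors("entity X")))  # prints ['class A', 'class B', 'root']
```

Tracking visited terms in the seen set means a shared ancestor such as "root" is reported once, even though it is reachable through both parents.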
22. Why are some of the entries and relationships displayed in the Tree View of the ChEBI Ontology shown in blue while others are shown in grey?
Entries and relationships which have been checked by a ChEBI annotator are shown in blue while preliminary (unchecked) ones are in grey. Clicking on a node within the tree will take the user to the ChEBI entry for that node. Unchecked ChEBI entries accessed by this route display the heading 'Preliminary ChEBI Entry' and the data shown therein must be treated with circumspection.