Chemical Entities of Biological Interest (ChEBI)

1. Introduction

Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. The term ‘molecular entity’ refers to any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity. The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms.

ChEBI incorporates an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified.
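The parent/child structure described above can be pictured as a directed graph. The following is a minimal sketch only, not ChEBI's own implementation: the `Ontology` class, its method names and the example terms are hypothetical, chosen for illustration and not copied from the ChEBI Ontology, and only a single "is a" relationship type is modelled.

```python
from collections import defaultdict


class Ontology:
    """Toy store of 'is a' links between entities (hypothetical, for illustration)."""

    def __init__(self):
        # Map each entity to its direct parents and its direct children.
        self._parents = defaultdict(set)
        self._children = defaultdict(set)

    def add_is_a(self, child, parent):
        """Record that `child` is a kind of `parent`."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def parents(self, entity):
        """Return the direct parents of `entity`."""
        return set(self._parents.get(entity, ()))

    def children(self, entity):
        """Return the direct children of `entity`."""
        return set(self._children.get(entity, ()))

    def ancestors(self, entity):
        """Return every entity reachable by following parent links upwards."""
        seen = set()
        stack = [entity]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative terms only; not taken from the ChEBI Ontology itself.
onto = Ontology()
onto.add_is_a("ethanol", "primary alcohol")
onto.add_is_a("primary alcohol", "alcohol")
onto.add_is_a("alcohol", "organic hydroxy compound")

print(sorted(onto.ancestors("ethanol")))
```

Storing both directions makes it cheap to answer "what are the parents of this entry?" and "what are its children?", which corresponds to the two viewpoints the classification offers.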

ChEBI uses nomenclature, symbolism and terminology endorsed by the following international scientific bodies:

International Union of Pure and Applied Chemistry (IUPAC)
Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB)

Molecules directly encoded by the genome (e.g. nucleic acids, proteins and peptides derived from proteins by cleavage) are not, as a rule, included in ChEBI.

All data in the database is non-proprietary or is derived from a non-proprietary source. It is thus freely accessible and available to anyone. In addition, each data item is fully traceable and explicitly referenced to the original source.

The data on this website is available under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

2. Sources

To create ChEBI, data from a number of sources were incorporated and merged to eliminate redundancy.

Four of the main sources from which the data are drawn are:

IntEnz – the Integrated relational Enzyme database of the EBI. IntEnz is the master copy of the Enzyme Nomenclature, the recommendations of the NC-IUBMB on the Nomenclature and Classification of Enzyme-Catalysed Reactions.
KEGG COMPOUND – one part of the Kyoto Encyclopedia of Genes and Genomes LIGAND database; COMPOUND is a collection of biochemical compound structures.
PDBeChem – the service providing web access to the Chemical Component Dictionary of the wwPDB as loaded into the PDBe database at the EBI.
ChEMBL – a database of bioactive compounds, their quantitative properties and bioactivities, abstracted from the primary scientific literature. It is part of the ChEMBL resources at the EBI.

Other data sources are listed in the Reference Manual.

3. Data

ChEBI shows the following data fields:

ChEBI Identifier – the unique identifier
ChEBI Name – the name recommended for use in biological databases
ChEBI ASCII Name – the ChEBI name with any special characters rendered in ASCII format
Star rating – a rating based on the level of manual annotation
Structure – graphical representation(s) of a molecular structure and associated molfile(s), IUPAC International Chemical Identifier (InChI) and SMILES strings
Formula – molecular formula
Charge
Average Mass
ChEBI Ontology
- Outgoing and incoming view
- Optional tree view of the entry's position within the ChEBI Ontology
IUPAC Name – name(s) generated according to recommendations of IUPAC
INN – International Nonproprietary Name, also known as generic name, assigned by the World Health Organization (WHO)
Synonyms – other names together with an indication of their source
Brand Name – a trade or proprietary name
Database Links – manually curated cross-references to other non-proprietary databases
Registry Number – CAS Registry Number, Beilstein Registry Number, Gmelin Registry Number (if available)
Citations – publications that cite the entity, with hyperlinks to their entries

In addition, a separate page called 'Automatic Xrefs' contains automatically generated cross-references to a number of biological and chemical databases.
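As a rough illustration of how the fields listed above might be held together in a program, here is a minimal sketch. The `ChebiEntry` and `Synonym` classes and all attribute names are assumptions made for this example; they do not reflect ChEBI's actual schema or any official client library. The example values were typed in by hand for illustration and were not retrieved from ChEBI, so they should be checked against the database before use.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional

# ChEBI identifiers take the form "CHEBI:" followed by digits.
_ID_PATTERN = re.compile(r"^CHEBI:\d+$")


@dataclass
class Synonym:
    """Another name for the entity, with an indication of its source."""
    name: str
    source: str


@dataclass
class ChebiEntry:
    """Hypothetical container mirroring the data fields listed above."""
    chebi_id: str
    name: str
    ascii_name: str
    star_rating: int
    formula: Optional[str] = None
    charge: Optional[int] = None
    average_mass: Optional[float] = None
    inchi: Optional[str] = None
    smiles: Optional[str] = None
    iupac_names: list[str] = field(default_factory=list)
    inn: Optional[str] = None
    synonyms: list[Synonym] = field(default_factory=list)
    brand_names: list[str] = field(default_factory=list)
    database_links: dict[str, str] = field(default_factory=dict)
    registry_numbers: dict[str, str] = field(default_factory=dict)
    citations: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject identifiers that do not look like "CHEBI:<digits>".
        if not _ID_PATTERN.match(self.chebi_id):
            raise ValueError(f"not a ChEBI identifier: {self.chebi_id!r}")


# Hand-entered illustrative values; verify against ChEBI before relying on them.
water = ChebiEntry(
    chebi_id="CHEBI:15377",
    name="water",
    ascii_name="water",
    star_rating=3,  # placeholder rating, not looked up
    formula="H2O",
    charge=0,
    average_mass=18.015,
    inchi="InChI=1S/H2O/h1H2",
    smiles="O",
    synonyms=[Synonym(name="dihydrogen oxide", source="(hypothetical source)")],
)

print(water.chebi_id, water.name, water.formula)
```

Optional fields default to `None` or an empty collection because not every entry carries every field; registry numbers, for example, are listed only if available.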

4. Publication

To cite ChEBI:

Hastings, J., Owen, G., Dekker, A., Ennis, M., Kale, N., Muthukrishnan, V., Turner, S., Swainston, N., Mendes, P., and Steinbeck, C. (2016) ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Res.

Other publications:

Hastings, J., de Matos, P., Dekker, A., Ennis, M., Harsha, B., Kale, N., Muthukrishnan, V., Owen, G., Turner, S., Williams, M., and Steinbeck, C. (2013) The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res.
de Matos, P., Alcantara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., and Steinbeck, C. (2010) Chemical entities of biological interest: an update. Nucleic Acids Res.
Degtyarenko, K., Hastings, J., de Matos, P., and Ennis, M. (2009) ChEBI: an open bioinformatics and cheminformatics resource. Current Protocols in Bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.], Chapter 14.
Degtyarenko, K., de Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M. and Ashburner, M. (2008) ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344–D350.

5. Acknowledgements

ChEBI was funded by BBSRC, grant agreement number BB/K019783/1 within the "Bioinformatics and biological resources" fund.

Previous funding:

ChEBI was funded by the European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme.
ChEBI was funded by BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund.
ChEBI was funded by the European Commission under FELICS, contract number 021902 (RII3) within the Research Infrastructure Action of the FP6 "Structuring the European Research Area" Programme.
The project was supported by the BioBabel grant (no. QLRT-CT-2001-00981) of the European Commission.

In addition, we would like to acknowledge the following software support:

ClassyFire