About ChEBI

Introduction

ChEBI is an open-access database and ontology of chemical entities. In ChEBI, the term ‘chemical entity’ refers to any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc. The term also covers groups (part-molecular entities), chemical substances, and classes of molecular entities. The chemical entities in ChEBI are either naturally occurring molecules or synthetic compounds used to intervene in the processes of living organisms. Macromolecules directly encoded by the genome (e.g. nucleic acids, proteins, and peptides derived from proteins by cleavage) are not, as a rule, included in ChEBI. The more than 195,000 entries in the database can be queried using text and structure searches (e.g. by name, molecular formula, InChI, or SMILES). For each entity, ChEBI may also include literature citations, cross-references to other databases, and species data.

ChEBI uses the nomenclature, symbolism and terminology endorsed by the International Union of Pure and Applied Chemistry (IUPAC) and the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). ChEBI also incorporates an ontological classification, which defines the relationships between chemical entities or classes of entities and their parents and/or children; this enables queries based on, for example, chemical class and role.
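As a rough illustration of how parent/child relationships support class-based queries, the sketch below walks a tiny hand-written hierarchy. The entity names and the `IS_A` mapping are hypothetical toy data, not an extract of the ChEBI ontology, and the functions `ancestors` and `members_of` are illustrative helpers rather than part of any ChEBI service.

```python
from collections import deque

# Hypothetical toy data: each entry maps an entity or class to its direct
# parents. Illustrative only; not copied from the ChEBI ontology.
IS_A = {
    "ethanol": ["primary alcohol"],
    "methanol": ["primary alcohol"],
    "primary alcohol": ["alcohol"],
    "alcohol": ["organic hydroxy compound"],
    "organic hydroxy compound": ["chemical entity"],
    "benzene": ["aromatic compound"],
    "aromatic compound": ["chemical entity"],
}


def ancestors(entity, is_a=IS_A):
    """Return every class reachable from `entity` by following parent links."""
    seen = set()
    queue = deque(is_a.get(entity, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(is_a.get(parent, []))
    return seen


def members_of(chem_class, is_a=IS_A):
    """Return, sorted by name, every entry that has `chem_class` as an ancestor."""
    return sorted(e for e in is_a if chem_class in ancestors(e, is_a))


if __name__ == "__main__":
    print(members_of("alcohol"))
```

A query for a class therefore returns both its direct children and everything below them, which is the behaviour a search by chemical class relies on. A role-based query could follow the same pattern over a second mapping.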


How is ChEBI used?

ChEBI is widely used as a small-molecule reference database by resources worldwide and has grown steadily over the years. It is now a crucial part of the global biosciences and informatics infrastructure. In 2017, ChEBI was designated an ELIXIR Core Data Resource, and in 2022 it was selected as a Global Core Biodata Resource in recognition of its fundamental importance to the wider biological and life sciences community. The databases that use ChEBI’s data are many and varied and include, amongst others: Rhea, MetaboLights, UniProt, GO, IEDB, Reactome, PubChem, BioModels, IntAct, SwissLipids, Metaspace and the Ontology Lookup Service. For some of these resources, ChEBI is the sole source of accurate small-molecule structural information linked to a stable identifier. Other resources import ChEBI’s chemical ontology, which, when combined with terms from other ontologies, provides powerful capabilities for data integration, hypothesis generation and reasoning.


Sources

ChEBI was first released in 2004. To create ChEBI, data from a number of sources were incorporated and subjected to merging procedures to eliminate redundancy.
Four of the main sources from which the data were derived are:


  • IntEnz – the Integrated relational Enzyme database and official version of the Enzyme Nomenclature, the recommendations of the NC-IUBMB on the Nomenclature and Classification of Enzyme-Catalysed Reactions. The database has now been deprecated.
  • KEGG COMPOUND – one of the four original KEGG databases, containing a collection of small molecules, biopolymers, and other chemical substances that are relevant to biological systems. It was introduced at the start of the KEGG project in 1995.
  • PDBeChem – A dictionary of chemical components (ligands, small molecules and monomers) referred to in PDB entries and maintained by wwPDB.
  • ChEMBL – A manually curated database of bioactive molecules with drug-like properties.

Over the past several years, data from other sources have also been added to ChEBI, including the HMDB, LINCS, DrugCentral, Metabolomics Workbench and GlyGen databases. Today, however, the main source of data is our submitters: individual users and major databases such as MetaboLights and Rhea, which submit new data via the submissions tool.


Data

Each ChEBI entry may include some or all of the following data fields:

  • ChEBI Identifier – A unique and stable identifier assigned by ChEBI
  • ChEBI Name – the name recommended for use in biological databases
  • ChEBI ASCII Name – the ChEBI name with any special characters rendered in ASCII format
  • Star rating – A rating based on the level of manual curation
  • Definition – A logical and natural language definition of the term
  • Last Modified – The date on which the entry was last modified
  • Submitter – The name of the user or resource who submitted the entry
  • Downloads – The option to download the data of the entry in several file formats
  • Structure – graphical representation(s) of the molecular structure and associated molfile(s), IUPAC International Chemical Identifier (InChI), SMILES strings, and WURCS (for carbohydrates).
  • Formula – Molecular formula
  • Net Charge
  • Average Mass
  • Monoisotopic Mass
  • Species of Metabolite – A table showing the species where the chemical entity is found
  • ChEBI Ontology
    • Outgoing and incoming ontology relations
    • A tree view which shows the position of the entry within the ChEBI Ontology
  • IUPAC Name – name(s) generated according to recommendations of IUPAC
  • INN – International Nonproprietary Name, also known as generic name, assigned by the World Health Organization (WHO)
  • Synonyms – other names together with an indication of their source
  • Brand Name – a trade or proprietary name
  • Database Links – manually curated cross-references to other non-proprietary databases
  • Registry Number – CAS Registry Number, Reaxys Registry Number, Gmelin Registry Number (if available)
  • Citations – Publications which cite the entity along with hyperlinks to the articles.
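
The difference between the Average Mass and Monoisotopic Mass fields can be illustrated with a short calculation. The sketch below is only an illustration of the two definitions, not a description of how ChEBI computes these values: it assumes a flat molecular formula (no brackets, charges or isotope labels), covers only four elements, and uses rounded atomic masses. The names `AVERAGE`, `MONOISOTOPIC`, `parse_formula` and `mass` are hypothetical.

```python
import re

# Rounded standard atomic weights (averaged over natural isotopes), in Da.
AVERAGE = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Rounded masses of the most abundant isotope of each element, in Da.
MONOISOTOPIC = {"H": 1.00782503, "C": 12.0, "N": 14.0030740, "O": 15.9949146}


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C6H12O6' (assumed format)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def mass(formula, table):
    """Sum the per-element masses from `table`, weighted by atom count."""
    return sum(table[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    for formula in ("H2O", "C6H12O6"):
        print(formula,
              round(mass(formula, AVERAGE), 3),
              round(mass(formula, MONOISOTOPIC), 4))
```

The average mass uses isotope-weighted atomic weights and suits bulk quantities, whereas the monoisotopic mass uses only the most abundant isotope of each element and corresponds to the principal peak seen in mass spectrometry.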

In addition, a separate page called 'Automatic Xrefs' contains automatically generated cross-references to a number of biological and chemical databases.


Licensing

All data in the ChEBI database is non-proprietary or is derived from a non-proprietary source. It is thus freely accessible and available to anyone. Each data item is fully traceable and explicitly referenced to the original source.

The data on this website is available under the Creative Commons Attribution 4.0 International License (CC BY 4.0) and is governed by EMBL-EBI’s terms of use and long-term data preservation policy.


Publications

To cite ChEBI:

Malik, A., Arsalan, M., Moreno, C., Mosquera, J., Félix, E., Kizilören, T., Muthukrishnan, V., Zdrazil, B., Leach, A. R., and O'Boyle, N. M. (2025). ChEBI: re-engineered for a sustainable future. Nucleic Acids Res.


Other publications:

  • Hastings, J., Owen, G., Dekker, A., Ennis, M., Kale, N., Muthukrishnan, V., Turner, S., Swainston, N., Mendes, P., and Steinbeck, C. (2016) ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Res.
  • Hastings, J., de Matos, P., Dekker, A., Ennis, M., Harsha, B., Kale, N., Muthukrishnan, V., Owen, G., Turner, S., Williams, M., and Steinbeck, C. (2013) The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res.
  • de Matos, P., Alcantara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., and Steinbeck, C. (2010) Chemical entities of biological interest: an update. Nucleic Acids Res.
  • Degtyarenko, K., Hastings, J., de Matos, P., and Ennis, M. (2009) ChEBI: an open bioinformatics and cheminformatics resource. Current Protocols in Bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.], Chapter 14.
  • Degtyarenko, K., de Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M. and Ashburner, M. (2008) ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344–D350.

Staying in Touch

Stay up to date with the latest ChEBI news and data releases by visiting the ChEBI blog, following us on X (formerly Twitter) and connecting with the EMBL-EBI Chemical Biology Services LinkedIn page.

Feel free to reach out to us via the contact page or by email (chebi-help [at] ebi [dot] ac [dot] uk) if you find any issues in the data or need help with anything.


Acknowledgements

ChEBI is currently funded by the member states of EMBL and the BBSRC, grant agreement number BB/V018566/1 within the "Bioinformatics and biological resources" fund.

Previous funding:

  • ELIXIR staff exchange grant (EBI-2020-SEP12).
  • BBSRC, grant agreement number BB/K019783/1 within the "Bioinformatics and biological resources" fund.
  • European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme.
  • BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund.
  • European Commission under FELICS, contract number 021902 (RII3) within the Research Infrastructure Action of the FP6 "Structuring the European Research Area" Programme.
  • BioBabel grant (no. QLRT-CT-2001-00981) of the European Commission.

We would like to acknowledge the following software support:


  • Django
  • Elasticsearch
  • GitLab
  • GitHub
  • Python
  • Robot OBO
  • ClassyFire
  • Docker
  • Kubernetes
  • Nuxt
  • Luigi
  • Pandas
  • dbt
  • Vue.js
  • Postgres
  • RDKit
  • Pronto
  • Ketcher
  • TS4NFDI