Chemistry Development Kit (CDK)

The Steinbeck group founded and develops the Chemistry Development Kit (CDK), the leading open-source Java library for structural chemo- and bioinformatics. The CDK covers a wide range of functionality needed for virtual compound screening, property prediction and many other tasks in molecular informatics. Besides serving as a foundation for open systems in structural bioinformatics, it is a valuable teaching tool. With 90,000 non-commenting source statements (NCSS) in over 9,000 methods in 900 classes, the CDK offers hands-on examples of the standard algorithms for handling and modifying molecular structures and for calculating their properties, written in a modern object-oriented language and using commonly accepted design patterns.


JChemPaint (JCP) is the editor and viewer for 2D chemical structures developed using the CDK. It is available in several forms: a Java application and two varieties of Java applet. It can also be embedded as a component in other applications. JCP has a well-tested, user-friendly interface whose behaviour is consistent between the application and the applets.

JChemPaint offers:

  • Drawing and deletion of single, double, triple and stereo bonds
  • Ring templates (3-8 atoms) with one-click attachment
  • An extensive template library
  • Colouring of atom types, and other rendering settings
  • Editing of atomic charges, isotopes and hydrogen count
  • Loading and saving of structures in Chemical Markup Language (CML) and as MDL MOL files and SDF files (loading only)
  • Automated Structure Layout, also known as Structure Diagram Generation
  • Loading structures from the Internet using CAS or NSC number
  • Normalisation of structures, currently limited to aromaticity detection
  • Saving bitmap pictures of the structures
  • Saving structures as graphics (PNG, BMP, Scalable Vector Graphics (SVG))
  • PostScript printing

The amount of novel basic research required to develop the CDK goes significantly beyond what one would expect from what looks like a pure infrastructure project. Questions such as how to perceive aromaticity, how to fingerprint structures or how to define pharmacophore queries are often researched and published in the CDK context for the first time.
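To give a feel for what structure fingerprinting involves, here is a minimal sketch of a hashed path fingerprint with a Tanimoto comparison. It is a toy under stated assumptions, not the CDK's implementation: the `Mol` class, the 1024-bit size and the four-atom path limit are all hypothetical choices made for illustration, and bond orders, rings and aromaticity are ignored.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

/** Toy hashed path fingerprint; illustrative only, not the CDK implementation. */
final class PathFingerprintSketch {

    static final int SIZE = 1024;    // number of bits (assumed for this sketch)
    static final int MAX_ATOMS = 4;  // longest path, counted in atoms (assumed)

    /** Hypothetical minimal molecule: element symbols plus undirected bonds. */
    static final class Mol {
        final String[] symbols;
        final List<List<Integer>> neighbours = new ArrayList<>();

        Mol(String... symbols) {
            this.symbols = symbols;
            for (int i = 0; i < symbols.length; i++) neighbours.add(new ArrayList<>());
        }

        Mol bond(int a, int b) {
            neighbours.get(a).add(b);
            neighbours.get(b).add(a);
            return this;
        }
    }

    /** Sets one bit for every simple path of up to MAX_ATOMS atoms. */
    static BitSet fingerprint(Mol mol) {
        BitSet bits = new BitSet(SIZE);
        boolean[] visited = new boolean[mol.symbols.length];
        for (int start = 0; start < mol.symbols.length; start++) {
            walk(mol, start, visited, new ArrayList<>(), bits);
        }
        return bits;
    }

    // Depth-first enumeration of simple paths; each path is hashed to a bit.
    private static void walk(Mol mol, int atom, boolean[] visited,
                             List<String> path, BitSet bits) {
        visited[atom] = true;
        path.add(mol.symbols[atom]);
        bits.set(Math.floorMod(canonical(path).hashCode(), SIZE));
        if (path.size() < MAX_ATOMS) {
            for (int next : mol.neighbours.get(atom)) {
                if (!visited[next]) walk(mol, next, visited, path, bits);
            }
        }
        path.remove(path.size() - 1);
        visited[atom] = false;
    }

    /** A path and its reverse describe the same fragment; keep the smaller string. */
    private static String canonical(List<String> path) {
        String forward = String.join("-", path);
        List<String> reversed = new ArrayList<>(path);
        Collections.reverse(reversed);
        String backward = String.join("-", reversed);
        return forward.compareTo(backward) <= 0 ? forward : backward;
    }

    /** Tanimoto coefficient: shared bits divided by bits set in either fingerprint. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet and = (BitSet) a.clone();
        and.and(b);
        BitSet or = (BitSet) a.clone();
        or.or(b);
        // Two empty fingerprints are treated as identical (a convention of this sketch).
        return or.isEmpty() ? 1.0 : (double) and.cardinality() / or.cardinality();
    }

    public static void main(String[] args) {
        Mol ethanol = new Mol("C", "C", "O").bond(0, 1).bond(1, 2);
        Mol propanol = new Mol("C", "C", "C", "O").bond(0, 1).bond(1, 2).bond(2, 3);
        System.out.println(tanimoto(fingerprint(ethanol), fingerprint(propanol)));
    }
}
```

Because every path found in a substructure also occurs in the larger structure, the bits of the smaller fingerprint are a subset of the larger one. That property is what makes such fingerprints usable as a fast pre-filter for substructure searching, at the cost of occasional false positives from hash collisions.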


OrChem is an Oracle database chemistry plug-in built on the CDK. Various commercial "cartridges" exist that facilitate searching and analyzing chemical data in chemistry databases. OrChem provides similar functionality but is not a cartridge: it does not need Oracle's extensibility architecture because its Java components run as Java stored procedures inside Oracle's standard JVM (Aurora).


In collaboration with Jarl Wikberg's group at Uppsala University, Sweden, we have founded the Bioclipse project to build a plug-in based, rich client desktop workbench for molecular informatics. Bioclipse won the JAX conference audience award for an important European contribution to the development of Eclipse in 2006. In November 2007, the project was recognised with a jury prize in the 4th edition of the Trophées du Libre.

While Uppsala is currently extending Bioclipse towards proteochemometrics, the Steinbeck group is working on plug-ins for spectrum handling and database editing and on extending Bioclipse's systems biology capabilities. The spectrum facilities are grouped in the Speclipse feature. The integration of a Systems Biology Markup Language (SBML) editor and of metabolomics simulations will be the next step. Bioclipse is a state-of-the-art, user-friendly, open desktop application for performing systems biology simulations.


The JCAMP-DX project is the reference implementation of the IUPAC JCAMP-DX spectroscopy data standard. It implements a parser and a writer to convert between JCAMP-DX files and Java objects.
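As a rough illustration of what parsing and writing such files involves, the following sketch reads the labelled data records of a single JCAMP-DX block into a map and writes them back. It is not the JCAMP-DX project's API: the class `JcampSketch` and its methods are hypothetical, and the sketch assumes the basic conventions of the format as we understand them (records start with `##`, the label ends at `=`, `$$` starts a comment, and labels are compared after dropping spaces, hyphens, slashes, underscores and case). Nested blocks, repeated labels and decoding of compressed tabular data are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Toy reader/writer for JCAMP-DX labelled data records; illustrative only. */
final class JcampSketch {

    /** Parses one block into label -> value; a repeated label overwrites the earlier one. */
    static Map<String, String> parse(String text) {
        Map<String, String> records = new LinkedHashMap<>();
        String label = null;
        StringBuilder value = new StringBuilder();
        for (String raw : text.split("\\R")) {
            // Everything after "$$" is a comment.
            int comment = raw.indexOf("$$");
            String line = (comment >= 0 ? raw.substring(0, comment) : raw).trim();
            if (line.isEmpty()) continue;
            int eq = line.indexOf('=');
            if (line.startsWith("##") && eq >= 0) {
                // A new record starts: store the one collected so far.
                if (label != null) records.put(label, value.toString());
                label = normalise(line.substring(2, eq));
                value = new StringBuilder(line.substring(eq + 1).trim());
            } else if (label != null) {
                // Continuation line, e.g. a row of tabular data.
                if (value.length() > 0) value.append('\n');
                value.append(line);
            }
        }
        if (label != null) records.put(label, value.toString());
        return records;
    }

    /** Drops spaces, hyphens, slashes and underscores, then upper-cases the label. */
    static String normalise(String label) {
        return label.replaceAll("[\\s\\-/_]", "").toUpperCase(Locale.ROOT);
    }

    /** Writes the records back, one "##LABEL=value" record each, using normalised labels. */
    static String write(Map<String, String> records) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : records.entrySet()) {
            out.append("##").append(e.getKey()).append('=')
               .append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "##TITLE= Example spectrum $$ a comment\n"
                   + "##JCAMP-DX= 4.24\n"
                   + "##XYDATA=(X++(Y..Y))\n"
                   + "400 1 2\n"
                   + "##END=\n";
        System.out.print(write(parse(src)));
    }
}
```

A real implementation would map these records onto typed spectrum objects rather than strings; the map is used here only to keep the round trip between file and Java objects visible.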