ChEMBL Web Services
Web Service Update (May 2017)
A new version of the ChEMBL web services is now available. It exposes significantly more data from the underlying ChEMBL database and introduces new functionality. In addition to the web services that provide access to ChEMBL data, web services that provide access to commonly used cheminformatic methods are also available.
Web Services Documentation
Online interactive documentation is available for both ChEMBL web services:
ChEMBL Data Web Services
The updated ChEMBL web services are not compatible with the previous version of the ChEMBL web services. For this reason we still provide access to the earlier version and the older documentation. We encourage users to migrate any applications using the previous ChEMBL web services to the updated version described on this page.
Getting Started
The best way to get started is to look at some example URLs that request data from the ChEMBL web services. The table below lists examples with a description of the data each one returns.
Resources
The following table provides a list of all ChEMBL web service resources currently available.
Supported formats
These are the formats currently supported by the API:
Meta Data and Pagination
It is now possible to download all data from a specific ChEMBL web service resource. This is made possible by returning responses from the web services in 'pages', which can be navigated through using a 'page_meta' section. The 'page_meta' section includes information about total number of hits, total number of pages and links to the next and previous pages. An example 'page_meta' section is displayed below:
"page_meta": {
"limit": 20,
"next": "/chembl/api/data/activity.json?limit=20&offset=20",
"offset": 0,
"previous": null,
"total_count": 13520737
}
To download all ChEMBL activity records (>13 million), start with the following URL: https://www.ebi.ac.uk/chembl/api/data/activity. Inspecting the 'page_meta' section of the response gives the link to page 2, e.g. https://www.ebi.ac.uk/chembl/api/data/activity?limit=20&offset=20.
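The paging loop described above can be sketched in Python as follows. This is a minimal sketch rather than the official client: `iter_records` and `fetch_json` are hypothetical names, the fetcher is passed in so that any HTTP library can be plugged in, and the name of the key holding the records (here `activities`) is an assumption that should be checked against a real response.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.ebi.ac.uk"


def iter_records(first_url, records_key, fetch_json):
    """Yield records from every page by following page_meta['next'].

    fetch_json is a caller-supplied function (hypothetical here) that takes a
    URL and returns the decoded JSON document as a dict.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        # records_key is an assumption about the response layout.
        yield from page.get(records_key, [])
        next_link = page["page_meta"]["next"]
        # 'next' is a site-relative path, or null (None) on the last page.
        url = urljoin(BASE_URL, next_link) if next_link else None
```

Because the function is a generator, records are processed as each page arrives and the full result set never has to be held in memory at once.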
Filtering and Ordering
It is possible to apply search filters to all resource requests using a URL friendly query language. For example, it is possible to return all ChEMBL targets that contain the term 'kinase' in the pref_name attribute with the following URL: https://www.ebi.ac.uk/chembl/api/data/target?pref_name__contains=kinase.
The pattern for applying a filter is as follows:
https://www.ebi.ac.uk/chembl/api/data/[resource]?[field]__[filter_type]=[value]
Examples of other filter types are listed in the table below.
To order the results by a particular field, add the 'order_by=[field]' argument to the request. For example, a user can sort targets by pref_name using the following URL: https://www.ebi.ac.uk/chembl/api/data/target?order_by=pref_name
The default ordering is ascending. To return the results in descending order, place a '-' before the field name: https://www.ebi.ac.uk/chembl/api/data/target?order_by=-pref_name
Note that it is possible to combine order_by and filter arguments: https://www.ebi.ac.uk/chembl/api/data/target?pref_name__contains=kinase&order_by=-pref_name
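The filter and ordering pattern can also be assembled programmatically. The sketch below uses only the Python standard library; `build_query_url` is a hypothetical helper name, not part of any ChEMBL package.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.ebi.ac.uk/chembl/api/data"


def build_query_url(resource, filters=None, order_by=None):
    """Compose a resource URL from [field]__[filter_type]=[value] pairs."""
    params = dict(filters or {})
    if order_by:
        # A leading '-' on the field name requests descending order.
        params["order_by"] = order_by
    query = urlencode(params)
    base = f"{API_ROOT}/{resource}"
    return f"{base}?{query}" if query else base


url = build_query_url(
    "target",
    filters={"pref_name__contains": "kinase"},
    order_by="-pref_name",
)
print(url)
```

This reproduces the combined filter and ordering URL shown above; `urlencode` also takes care of percent-encoding any special characters in the filter values.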
Chemical Searching
The 'Substructure' and 'Similarity' web service resources allow the chemical content of ChEMBL to be searched. Like the other resources, these search-based resources accept filtering, paging and ordering arguments. These methods accept SMILES, InChI Key and molecule ChEMBL_ID as arguments, and in the case of similarity searches an additional identity cut-off is needed. Some example molecule searches are provided in the table below.
Searching with an InChI key is only possible for InChI keys found in the ChEMBL database. The system does not attempt to convert an InChI key to a chemical representation.
Molecule Images
The Image resource returns a graphical representation of a ChEMBL molecule. Unlike the other resources it does not accept filtering and paging arguments, but it does accept image-specific arguments. These are defined in the table below.
GET, POST and special characters
In a GET request all the parameters have to be encoded into the URL. Because there is a limit on how long a URL can be, it is often
more convenient to use POST requests instead. POST parameters are embedded in the request body and can be of any size. This is
especially important when retrieving a long list of entities identified by (random) IDs.
The ChEMBL API supports both GET and POST, but since POST has a special meaning in REST (CREATE), a special header has to be added to every POST request:
X-HTTP-Method-Override:GET
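As an illustration, a POST request carrying this header could be prepared with the Python standard library as sketched below. The request is only constructed, not sent. `build_post_query` is a hypothetical helper name, and sending the parameters as a form-encoded body is an assumption; the interactive documentation should be consulted for the body format the service expects.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_query(url, params):
    """Prepare (but do not send) a POST request that the API treats as a GET."""
    # Assumption: parameters travel as a form-encoded request body.
    body = urlencode(params).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"X-HTTP-Method-Override": "GET"},
        method="POST",
    )


# A long ID list goes into the body instead of the URL.
ids = ["CHEMBL6498", "CHEMBL6499", "CHEMBL6505"]
request = build_post_query(
    "https://www.ebi.ac.uk/chembl/api/data/molecule",
    {"molecule_chembl_id__in": ",".join(ids)},
)
```

The prepared object could then be passed to `urllib.request.urlopen`; the same header can be set with any other HTTP library.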
Another issue is character encoding. SMILES strings often contain characters (such as #, % or \) that have a special meaning in URLs.
This is why, when using GET, all parameters should be percent-encoded.
One example is the following SMILES string:
[Na+].CO[C@@H](CCC#C\C=C/CCCC(C)CCCCC=C)C(=O)[O-]
It can be encoded into a URL in the following way: https://www.ebi.ac.uk/chembl/api/data/molecule/%5BNa+%5D.CO%5BC@@H%5D(CCC%23C%5CC=C/CCCC(C)CCCCC=C)C(=O)%5BO-%5D
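The encoding above can be reproduced with `urllib.parse.quote` from the Python standard library. This is a sketch; the `safe` set is chosen here only so that the output matches the URL shown above, leaving `+`, `@`, `(`, `)`, `=` and `/` untouched.

```python
from urllib.parse import quote

API_ROOT = "https://www.ebi.ac.uk/chembl/api/data"

# A raw string keeps the backslash in the SMILES literal.
smiles = r"[Na+].CO[C@@H](CCC#C\C=C/CCCC(C)CCCCC=C)C(=O)[O-]"

# Brackets, '#' and '\' are percent-encoded; characters in `safe` are not.
encoded = quote(smiles, safe="+@()=/")
url = f"{API_ROOT}/molecule/{encoded}"
print(url)
```

Passing `safe=""` would encode every reserved character, including `/`; whether the service treats both forms identically is not stated here and would need to be checked.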
Below is another example: retrieving a molecule (CHEMBL1628285) that has the longest SMILES string currently stored in ChEMBL.
The original SMILES string is:
CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC1OC(CO)C(O)C(O)C1O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC2OC(CO)C(O)C(O)C2O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC3OC(CO)C(O)C(O)C3O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC4OC(CO)C(O)C(O)C4O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC5OC(CO)C(O)C(O)C5O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC6OC(CO)C(O)C(O)C6O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC7OC(CO)C(O)C(O)C7O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC8OC(CO)C(O)C(O)C8O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC9OC(CO)C(O)C(O)C9O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC%10OC(CO)C(O)C(O)C%10O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC%11OC(CO)C(O)C(O)C%11O)C(O)CO.CCCCCCCCCCCCCCCC[NH2+]OC(CO)C(O)C(OC%12OC(CO)C(O)C(O)C%12O)C(O)CO.CCCCCCCCCC(C(=O)NCCc%13ccc(OP(=S)(Oc%14ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%14)N(C)\\N=C\\c%15ccc(Op%16(Oc%17ccc(\\C=N\\N(C)P(=S)(Oc%18ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%18)Oc%19ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%19)cc%17)np(Oc%20ccc(\\C=N\\N(C)P(=S)(Oc%21ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%21)Oc%22ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%22)cc%20)(Oc%23ccc(\\C=N\\N(C)P(=S)(Oc%24ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%24)Oc%25ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%25)cc%23)np(Oc%26ccc(\\C=N\\N(C)P(=S)(Oc%27ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%27)Oc%28ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%28)cc%26)(Oc%29ccc(\\C=N\\N(C)P(=S)(Oc%30ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%30)Oc%31ccc(CCNC(=O)C(CCCCCCCCC)P(=O)(O)[O-])cc%31)cc%29)n%16)cc%15)cc%13)P(=O)(O)[O-]
After encoding, the URL becomes this
CORS and JSONP
Both techniques are supported
Web Service Client
To help users get started with the updated ChEMBL web services,
the existing web service client has also been released. It is written
in the Python programming language and can be installed from the Python
Package Index by typing:
pip install chembl_webresource_client
The client code is open and hosted on GitHub: https://github.com/chembl/chembl_webresource_client.
The following table provides some example use cases of the client.
| Use case |
Code |
| Search molecule by synonym |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
res = molecule.search('viagra')
|
| Search target by gene name... |
from chembl_webresource_client.new_client import new_client
target = new_client.target
gene_name = 'GABRB2'
res = target.search(gene_name)
|
| ... or directly in the target synonym field |
from chembl_webresource_client.new_client import new_client
target = new_client.target
gene_name = 'GABRB2'
res = target.filter(target_synonym__icontains=gene_name)
|
| Given a list of molecule ChEMBL IDs in a CSV file,
produce another CSV file that maps every compound ID
to a list of UniProt accession numbers. |
import csv
from chembl_webresource_client.new_client import new_client
# This will be our resulting structure mapping compound ChEMBL IDs into target UniProt IDs
compounds2targets = dict()
# First, let's just parse the CSV file to extract compound ChEMBL IDs:
with open('compounds_list.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        compounds2targets[row[0]] = set()
# OK, we have our source IDs, let's process them in chunks:
chunk_size = 50
# A list is needed because dictionary key views cannot be sliced in Python 3
keys = list(compounds2targets.keys())
for i in range(0, len(keys), chunk_size):
    # we jump from compounds to targets through activities:
    activities = new_client.activity.filter(molecule_chembl_id__in=keys[i:i + chunk_size])
    # extracting target ChEMBL IDs from activities:
    for act in activities:
        compounds2targets[act['molecule_chembl_id']].add(act['target_chembl_id'])
# OK, now our dictionary maps from compound ChEMBL IDs into target ChEMBL IDs
# We would like to replace target ChEMBL IDs with UniProt IDs
for key, val in compounds2targets.items():
    # We don't know how many targets are assigned to a given compound so again it's
    # better to process targets in chunks:
    lval = list(val)
    uniprots = set()
    for i in range(0, len(val), chunk_size):
        targets = new_client.target.filter(target_chembl_id__in=lval[i:i + chunk_size])
        uniprots |= set(sum([[comp['accession'] for comp in t['target_components']] for t in targets], []))
    compounds2targets[key] = uniprots
# Finally write it to the output CSV file
with open('compounds_2_targets.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for key, val in compounds2targets.items():
        writer.writerow([key] + list(val))
|
| Given a list of molecule ChEMBL IDs in a CSV file,
produce another CSV file that maps every compound ID to a
list of human gene names. |
import csv
from chembl_webresource_client.new_client import new_client
# This will be our resulting structure mapping compound ChEMBL IDs into gene names
compounds2targets = dict()
# First, let's just parse the CSV file to extract compound ChEMBL IDs:
with open('compounds_list.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        compounds2targets[row[0]] = set()
# OK, we have our source IDs, let's process them in chunks:
chunk_size = 50
# A list is needed because dictionary key views cannot be sliced in Python 3
keys = list(compounds2targets.keys())
for i in range(0, len(keys), chunk_size):
    # we jump from compounds to targets through activities:
    activities = new_client.activity.filter(molecule_chembl_id__in=keys[i:i + chunk_size])
    # extracting target ChEMBL IDs from activities:
    for act in activities:
        compounds2targets[act['molecule_chembl_id']].add(act['target_chembl_id'])
# OK, now our dictionary maps from compound ChEMBL IDs into target ChEMBL IDs
# We would like to replace target ChEMBL IDs with gene names
for key, val in compounds2targets.items():
    # We don't know how many targets are assigned to a given compound so again it's
    # better to process targets in chunks:
    lval = list(val)
    genes = set()
    for i in range(0, len(val), chunk_size):
        targets = new_client.target.filter(target_chembl_id__in=lval[i:i + chunk_size])
        for target in targets:
            for component in target['target_components']:
                for synonym in component['target_component_synonyms']:
                    if synonym['syn_type'] == "GENE_SYMBOL":
                        genes.add(synonym['component_synonym'])
    compounds2targets[key] = genes
# Finally write it to the output CSV file
with open('compounds_2_genes.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for key, val in compounds2targets.items():
        writer.writerow([key] + list(val))
|
| Find compounds similar to given SMILES query with similarity threshold of 85% |
from chembl_webresource_client.new_client import new_client
similarity = new_client.similarity
res = similarity.filter(smiles=r"CO[C@@H](CCC#C\C=C/CCCC(C)CCCCC=C)C(=O)[O-]", similarity=85)
|
| Find compounds similar to aspirin (CHEMBL25) with similarity threshold of 70% |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
similarity = new_client.similarity
aspirin_chembl_id = molecule.search('aspirin')[0]['molecule_chembl_id']
res = similarity.filter(chembl_id=aspirin_chembl_id, similarity=70)
|
| Perform substructure search using SMILES |
from chembl_webresource_client.new_client import new_client
substructure = new_client.substructure
res = substructure.filter(smiles="CN(CCCN)c1cccc2ccccc12")
|
| Perform substructure search using ChEMBL ID |
from chembl_webresource_client.new_client import new_client
substructure = new_client.substructure
substructure.filter(chembl_id="CHEMBL25")
|
| Get a single molecule by ChEMBL ID |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
m1 = molecule.get('CHEMBL25')
|
| Get a single molecule by SMILES |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
m1 = molecule.get('CC(=O)Oc1ccccc1C(=O)O')
|
| Get a single molecule by InChI Key |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
molecule.get('BSYNRYMUTXBXSQ-UHFFFAOYSA-N')
|
| Get many compounds by their ChEMBL IDs |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
records = molecule.get(['CHEMBL6498', 'CHEMBL6499', 'CHEMBL6505'])
|
| Get many compounds by a list of SMILES |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
records = molecule.get(['CNC(=O)c1ccc(cc1)N(CC#C)Cc2ccc3nc(C)nc(O)c3c2',
'Cc1cc2SC(C)(C)CC(C)(C)c2cc1\\N=C(/S)\\Nc3ccc(cc3)S(=O)(=O)N',
'CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H]3CCCN3C(=O)C(CCCCN)CCCCN)C(C)(C)C)C(=O)O'])
|
| Get many compounds by a list of InChI Keys |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
records = molecule.get(['XSQLHVPPXBBUPP-UHFFFAOYSA-N', 'JXHVRXRRSSBGPY-UHFFFAOYSA-N', 'TUHYVXGNMOGVMR-GASGPIRDSA-N'])
|
| Obtain the pChEMBL value for compound |
from chembl_webresource_client.new_client import new_client
activities = new_client.activity
res = activities.filter(molecule_chembl_id="CHEMBL25", pchembl_value__isnull=False)
|
| Obtain the pChEMBL value for a specific compound AND a specific target |
from chembl_webresource_client.new_client import new_client
activities = new_client.activity
activities.filter(molecule_chembl_id="CHEMBL25", target_chembl_id="CHEMBL612545", pchembl_value__isnull=False)
|
| Get all approved drugs |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
approved_drugs = molecule.filter(max_phase=4)
|
| Get approved drugs for lung cancer |
from chembl_webresource_client.new_client import new_client
drug_indication = new_client.drug_indication
molecules = new_client.molecule
lung_cancer_ind = drug_indication.filter(efo_term__icontains="LUNG CARCINOMA")
lung_cancer_mols = molecules.filter(molecule_chembl_id__in=[x['molecule_chembl_id'] for x in lung_cancer_ind])
|
| Get all molecules in ChEMBL with no Rule-of-Five violations |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
no_violations = molecule.filter(molecule_properties__num_ro5_violations=0)
|
| Get all biotherapeutic molecules |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
biotherapeutics = molecule.filter(biotherapeutic__isnull=False)
|
| Return molecules with molecular weight <= 300 |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
light_molecules = molecule.filter(molecule_properties__mw_freebase__lte=300)
|
| Return molecules with molecular weight <= 300 AND pref_name ends with nib |
from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
light_nib_molecules = molecule.filter(molecule_properties__mw_freebase__lte=300).filter(pref_name__iendswith="nib")
|
| Get all Ki activities related to the hERG target |
from chembl_webresource_client.new_client import new_client
target = new_client.target
activity = new_client.activity
herg = target.search('herg')[0]
herg_activities = activity.filter(target_chembl_id=herg['target_chembl_id']).filter(standard_type="Ki")
|
| Get all activities related to the Open TG-GATES project |
from chembl_webresource_client.new_client import new_client
activity = new_client.activity
res = activity.search('"TG-GATES"')
|
| Get all activities for a specific target with assay type 'B' OR 'F' |
from chembl_webresource_client.new_client import new_client
activity = new_client.activity
res = activity.filter(target_chembl_id='CHEMBL3938', assay_type__iregex='(B|F)')
|
| Search for ADMET-related inhibitor assays |
from chembl_webresource_client.new_client import new_client
assay = new_client.assay
res = assay.search('inhibitor').filter(assay_type='A')
|
| Get cell line by cellosaurus id |
from chembl_webresource_client.new_client import new_client
cell_line = new_client.cell_line
res = cell_line.filter(cellosaurus_id="CVCL_0417")
|
| Filter drugs by approval year and name |
from chembl_webresource_client.new_client import new_client
drug = new_client.drug
res = drug.filter(first_approval=1976).filter(usan_stem="-azosin")
|
| Get tissue by BTO ID |
from chembl_webresource_client.new_client import new_client
tissue = new_client.tissue
res = tissue.filter(bto_id="BTO:0001073")
|
| Get tissue by Caloha id |
from chembl_webresource_client.new_client import new_client
tissue = new_client.tissue
res = tissue.filter(caloha_id="TS-0490")
|
| Get tissue by Uberon id |
from chembl_webresource_client.new_client import new_client
tissue = new_client.tissue
res = tissue.filter(uberon_id="UBERON:0000173")
|
| Get tissue by name |
from chembl_webresource_client.new_client import new_client
tissue = new_client.tissue
res = tissue.filter(pref_name__istartswith='blood')
|
| Search documents for 'cytokine' |
from chembl_webresource_client.new_client import new_client
document = new_client.document
res = document.search('cytokine')
|
| Search for compound in UniChem |
from chembl_webresource_client.unichem import unichem_client as unichem
ret = unichem.get('AIN')
|
| Resolve InChI Key to InChI using UniChem |
from chembl_webresource_client.unichem import unichem_client as unichem
ret = unichem.inchiFromKey('AAOVKJBEBIDNHE-UHFFFAOYSA-N')
|
| Convert SMILES to CTAB |
from chembl_webresource_client.utils import utils
aspirin = utils.smiles2ctab('O=C(Oc1ccccc1C(=O)O)C')
|
| Convert SMILES to image and image back to SMILES |
from chembl_webresource_client.utils import utils
aspirin = 'CC(=O)Oc1ccccc1C(=O)O'
im = utils.smiles2image(aspirin)
mol = utils.image2ctab(im)
smiles = utils.ctab2smiles(mol).split()[2]
assert smiles == aspirin
|
| Compute fingerprints |
from chembl_webresource_client.utils import utils
aspirin = utils.smiles2ctab('O=C(Oc1ccccc1C(=O)O)C')
fingerprints = utils.sdf2fps(aspirin)
|
| Compute Maximal Common Substructure |
from chembl_webresource_client.utils import utils
smiles = ["O=C(NCc1cc(OC)c(O)cc1)CCCC/C=C/C(C)C", "CC(C)CCCCCC(=O)NCC1=CC(=C(C=C1)O)OC", "c1(C=O)cc(OC)c(O)cc1"]
mols = [utils.smiles2ctab(smile) for smile in smiles]
sdf = ''.join(mols)
result = utils.mcs(sdf)
|
| Compute various molecular descriptors |
import json
from chembl_webresource_client.utils import utils
aspirin = utils.smiles2ctab('O=C(Oc1ccccc1C(=O)O)C')
num_atoms = json.loads(utils.getNumAtoms(aspirin))[0]
mol_wt = json.loads(utils.molWt(aspirin))[0]
log_p = json.loads(utils.logP(aspirin))[0]
tpsa = json.loads(utils.tpsa(aspirin))[0]
descriptors = json.loads(utils.descriptors(aspirin))[0]
|
| Standardize molecule |
from chembl_webresource_client.utils import utils
mol = utils.smiles2ctab("[Na]OC(=O)Cc1ccc(C[NH3+])cc1.c1nnn[n-]1.O")
st = utils.standardise(mol)
|
Example Queries
The table below provides a list of example searches a user may wish to carry out using the ChEMBL web services. The aim of the list is to highlight the type of data that can be retrieved from ChEMBL using the web services. The examples can be adapted, extended and chained together to build up more complex workflows.
| Description | Example URL |
| Get all approved drugs | https://www.ebi.ac.uk/chembl/api/data/molecule?max_phase=4 |
| Get all molecules in ChEMBL with no Rule-of-Five violations | https://www.ebi.ac.uk/chembl/api/data/molecule?molecule_properties__num_ro5_violations=0 |
| Get all biotherapeutic molecules | https://www.ebi.ac.uk/chembl/api/data/molecule?biotherapeutic__isnull=false |
| Get all functional/phenotypic assays (assay_type=F), from the literature (src_id 1) | https://www.ebi.ac.uk/chembl/api/data/assay?assay_type=F&src_id=1 |
| Get all binding assays (assay_type=B), which also contain the term 'insulin' | https://www.ebi.ac.uk/chembl/api/data/assay?assay_type=B&description__icontains=insulin |
| Get all Clearance activity values (standard_type=CL), for rat, mouse and human | https://www.ebi.ac.uk/chembl/api/data/activity?standard_type=CL&target_organism__in=Homo%20sapiens,Rattus%20norvegicus,Mus%20musculus |
| Get all cell lines, which end with the term 'carcinoma' | https://www.ebi.ac.uk/chembl/api/data/cell_line?cell_source_tissue__iendswith=carcinoma |
| Get all mechanism of action details for muscarinic acetylcholine receptor antagonists | https://www.ebi.ac.uk/chembl/api/data/mechanism?mechanism_of_action__icontains=Muscarinic%20acetylcholine%20receptor&action_type=ANTAGONIST |
| Get all targets (single protein, protein complexes, protein families etc.), which contain UniProt accession Q13936 | https://www.ebi.ac.uk/chembl/api/data/target?target_components__accession=Q13936 |
| Get the entity type for CHEMBL1000 | https://www.ebi.ac.uk/chembl/api/data/chembl_id_lookup?chembl_id=CHEMBL1000 |
Using the client vs making raw URL queries
There are two alternative methods of accessing the API: through the Python client or via raw URL queries.
The advantage of the client is that it handles pagination and the whole HTTP layer (deciding between GET and POST, character encoding etc.).
It also caches results on the client side and provides a very convenient interface, similar to a Django QuerySet.
The advantage of making raw URL queries is that they can be executed from any programming language. The following Jupyter (IPython)
notebook compares both methods: https://github.com/chembl/mychembl/blob/master/ipython_notebooks/09_myChEMBL_web_services.ipynb.
Use Cases
The following use case is provided as an example of how ChEMBL web service calls can be chained together to answer complicated questions using ChEMBL data.
Investigating the potency of approved drugs against their efficacy targets
Since ChEMBL includes both mechanism of action information for approved drugs and pharmacology data from published assays, it is interesting to combine this information and investigate the potency of a drug against its efficacy target. This might be done either to confirm or refute the proposed target assignment, or to better understand how the in-vitro potency of a compound might relate to clinical efficacy or ADMET properties. The following example shows how this type of analysis could be carried out using the ChEMBL web services.
- Use the molecule end point to retrieve a list of approved drugs (max_phase=4):
https://www.ebi.ac.uk/chembl/api/data/molecule?max_phase=4
Using ChEMBL_20, this will retrieve 2795 drugs in total. We will use CHEMBL998 (loratadine) as an example, but the same workflow could be repeated for the others.
- Use the mechanism end point to retrieve the mechanism of action and target of each drug:
https://www.ebi.ac.uk/chembl/api/data/mechanism?molecule_chembl_id=CHEMBL998
Loratadine is reported to be a histamine H1 receptor antagonist, represented by the ChEMBL target CHEMBL231.
- Use the assay end point to identify any binding assays (assay_type=B) for the human histamine H1 receptor:
https://www.ebi.ac.uk/chembl/api/data/assay?target_chembl_id=CHEMBL231&relationship_type=D&assay_type=B
Note the relationship_type=D filter restricts the results to assays where we are confident that the human receptor was tested and not an orthologue. A total of 213 assays are identified. We will take CHEMBL1909156 as an example.
- Combine the results of the above queries to identify any potency measurements for loratadine in these assays:
https://www.ebi.ac.uk/chembl/api/data/activity?molecule_chembl_id=CHEMBL998&assay_chembl_id=CHEMBL1909156
This assay reports an IC50 value of 170nM and a Ki measurement of 20nM for loratadine against the histamine H1 receptor.
This process could be repeated for the other 212 assays by either iterating through them individually or, where a sufficiently small number of assays are returned, using the __in filter on the activity end point to retrieve several assays at once e.g.,
https://www.ebi.ac.uk/chembl/api/data/activity?molecule_chembl_id=CHEMBL998&assay_chembl_id__in=CHEMBL830379,CHEMBL1909156,CHEMBL882906,CHEMBL691450
An additional 2 assays are identified in this way, reporting an IC50 value of 290nM and a Ki value of 414nM. These values could be averaged, or the lowest taken, to give an indication of the average potency of the compound, or the assay conditions could be investigated further to try to identify which assay might be most reliable or informative (the ChEMBL identifier for the document from which the data are extracted is also provided by the activity end point).
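The final aggregation step (averaging the values or taking the lowest) could look like the sketch below. The four values are the ones quoted in the walkthrough above, in nM. The record layout with `standard_type` and `standard_value` keys is an assumption about the shape of the activity response, and `summarise_potency` is a hypothetical helper name. IC50 and Ki values are kept apart because they are different kinds of measurement.

```python
from collections import defaultdict
from statistics import mean

# Potency values for loratadine quoted in the walkthrough above (all in nM).
activities = [
    {"standard_type": "IC50", "standard_value": "170"},
    {"standard_type": "Ki", "standard_value": "20"},
    {"standard_type": "IC50", "standard_value": "290"},
    {"standard_type": "Ki", "standard_value": "414"},
]


def summarise_potency(records):
    """Group values by measurement type and return (mean, lowest) for each."""
    grouped = defaultdict(list)
    for record in records:
        # Values are converted to float in case they arrive as strings.
        grouped[record["standard_type"]].append(float(record["standard_value"]))
    return {kind: (mean(values), min(values)) for kind, values in grouped.items()}


summary = summarise_potency(activities)
print(summary)
```

For these four values the mean IC50 is 230 nM (lowest 170 nM) and the mean Ki is 217 nM (lowest 20 nM).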
More examples
As mentioned above, there is an IPython notebook with example API calls and the corresponding Python client code: https://github.com/chembl/mychembl/blob/master/ipython_notebooks/09_myChEMBL_web_services.ipynb. Additionally, there is a comprehensive test suite
covering almost all the functionality offered by the API. It can be found in the Python client library: https://github.com/chembl/chembl_webresource_client/blob/master/chembl_webresource_client/tests.py.
Cheminformatic Utils Web Services
In addition to the data-focused web services described above, a utility web
service that provides RESTful access to commonly used cheminformatic
methods has been released. These web services have been developed under
the name 'ChEMBL Beaker' and provide users with access to chemical
property calculators (e.g. molecular weight), structure format
converters (e.g. molfile to SMILES) and a structure standardiser. The
methods exposed by this service provide a RESTful interface to
the RDKit cheminformatics library. These services also provide access to methods that extract chemical information from images. The OSRA library is used to expose this functionality. Online documentation is available.
Publications
There are a few publications, describing the API from the technical side as well as providing further examples and ideas for using it:
Getting help
If you have any API specific questions please send them to chembl-help@ebi.ac.uk.
Bugs
If you find a bug or any other problem, please consider creating a new issue on GitHub.