ChEMBL Team Research Interests

The underlying theme of our research interests is to provide infrastructure components built upon the ChEMBL databases for applied and translational drug discovery. All these research topics offer the opportunity to study and develop cutting-edge Open Source drug discovery technologies in a highly collaborative, diverse and international setting.
  1. Peptides as Drugs
    Many biological processes are modulated by peptide binding, and consequently many drugs are peptides themselves, peptide derivatives, or structural mimics of peptides. We are interested in understanding the pharmacological profiles of peptides and in developing a series of 'rules' to guide the selection of either peptide therapeutics or the potential molecular targets for peptide drugs. We are also interested in analysing the thermodynamics and SAR of peptides binding to their macromolecular targets.

  2. Non-human Secreted Proteins as New Therapies
    Surprisingly, several classes of very important human biological therapeutics are derived from non-human sources; for example, hirudin from medicinal leeches. This project will apply data-mining approaches to characterise the features associated with immunotolerance of non-host proteins, and then develop a series of approaches to identify potential candidate therapeutic proteins from a wide variety of organisms (e.g. mammals, ticks, flukes, nematodes). The project will integrate sequence- and structure-based data-mining methods with KDD/data-mining methods to predict function, pharmacokinetic and affinity properties.

  3. Monoclonal Antibody Drug 'Rescue'
    Monoclonal antibody drugs (mAbs) are viewed as highly specific therapies with low clinical failure rates. However, attrition of mAb therapies often occurs at a late and expensive stage of their clinical progression, with success often crucially dependent on the choice of disease model and trial design. As part of our CandiStore project, we have accumulated a unique set of clinical-stage mAbs and their targets and ligands. This project will apply modelling, docking, text-mining and KDD approaches to understand and predict new clinical applications of previously failed mAbs. The project will also attempt to discover general features for success/failure of this important class of therapy.

  4. Automated Drug Design ('Robot-Chemist')
    The discovery and optimisation of novel, well-tolerated drugs is becoming an increasingly important commercial challenge. We have assembled a large training set of SAR data in our StARlite database, and have produced proof-of-principle applications in areas such as bioisostere discovery and affinity optimisation. This project will build a catalog of empirically observed and synthetically tractable transformations from a given chemical starting point, and attempt to objectively score their likely effect on bioactivity and, in particular, on drug-like properties such as metabolism, frank toxicity and absorption. This project will provide an excellent introduction to the principles of medicinal chemistry from a compound design perspective, and also a firm grounding in a broad variety of KDD approaches.
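
    The scoring idea above can be sketched in a few lines. This is only an illustration under stated assumptions: each observation is taken to be a pair of (transformation label, change in measured activity), the labels such as "H>>F" are hypothetical strings rather than a real StARlite or ChEMBL encoding, and the mean change is used as a deliberately naive score.

```python
from collections import defaultdict
from statistics import mean


def score_transformations(observations, min_examples=2):
    """Average the observed activity change for each transformation.

    observations: iterable of (transformation_label, delta_activity) pairs.
    Transformations seen fewer than min_examples times are left unscored.
    """
    grouped = defaultdict(list)
    for transformation, delta in observations:
        grouped[transformation].append(delta)
    return {
        t: mean(deltas)
        for t, deltas in grouped.items()
        if len(deltas) >= min_examples
    }


# Toy observations, invented purely for illustration.
toy_observations = [
    ("H>>F", 0.5),
    ("H>>F", 0.25),
    ("OH>>OMe", -1.0),  # seen only once, so it is not scored
]

scores = score_transformations(toy_observations)
print(scores)
```

    A realistic score would also need to account for the chemical context of each transformation and for the spread of the observed changes, not only their mean.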

  5. Drug Design Strategies for Robustness to Acquired Resistance
    Acquired resistance, the selection of mutant forms of a target under the selective pressure of a cidal drug, is of increasing importance in both anti-microbial and anti-cancer targeted therapies. In the vast majority of cases, this resistance can be understood at a structural level, once the 3-D structure of the drug-target complex is established. This project will provide an integrative approach for the prediction of alternate functional forms of a target, under simulated evolutionary pressure of drug binding. Techniques such as comparative modelling, sequence analysis, Monte Carlo simulation and QSAR approaches will be applied, and the methods then used at genome scale to identify targets and compound design strategies likely to be robust to acquired resistance. Depending on progress made during the major part of the project, the methods could be applied to understand differential drug response caused by genetic diversity/cSNPs in the human genome for currently approved therapies.

  6. Variation in drug targets
    Identify all splice variants (and SNPs) in known drug targets. Map them onto the 3-D structure where known, and identify those within known or potential drug-binding domains/sites. Predict the functional consequences of the variations where possible. This project would make use of a wide variety of ChEMBL data and also involve analysis of many other EMBL-EBI resources (ePDB, UniProt, Ensembl, etc.). It will also lead to the development of novel methods to assess the consequences of sequence variations.
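
    As a rough illustration of the mapping step, the sketch below flags variants that fall inside a binding site. It assumes the variants and the binding-site residues are already expressed in the same protein numbering; the `Variant` class, the `variants_in_sites` helper and the toy data are hypothetical and are not taken from ChEMBL, UniProt or Ensembl.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A single amino-acid substitution in protein coordinates (hypothetical model)."""
    position: int  # 1-based residue number
    ref: str       # reference amino acid
    alt: str       # variant amino acid


def variants_in_sites(variants, site_residues):
    """Return the variants whose position lies in the given binding-site residues."""
    site = set(site_residues)
    return [v for v in variants if v.position in site]


# Toy data, invented purely for illustration.
toy_variants = [Variant(12, "A", "V"), Variant(87, "T", "M"), Variant(140, "G", "D")]
toy_site = range(80, 96)  # residues 80..95 lining a hypothetical pocket

hits = variants_in_sites(toy_variants, toy_site)
print([f"{v.ref}{v.position}{v.alt}" for v in hits])
```

    In practice the harder part is the coordinate mapping itself (genome to transcript to protein to structure), which this sketch takes as given.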

  7. Text mining to enrich ChEMBL functional assays
    Develop or apply ontologies around the existing functional assay data in ChEMBL (phenotypes/diseases/physiological processes) to allow this resource to be searched and utilised more effectively. Use text mining in conjunction with these ontologies to identify additional 'relevant' abstracts to complement the existing resource (e.g., compounds associated with under-represented but important phenotypes or key disease pathways). After applying methods to assess their relevance, these abstracts could form an automated supplement to ChEMBL or be prioritised for data extraction into the main ChEMBL resource.

  8. How do drugs affect gene expression?
    Survey and harvest literature experiments containing small molecule treatments in gene expression datasets (e.g. ArrayExpress) and identify up/down regulated genes. Map these onto pathways associated with the known drug targets (and diseases) for these compounds. How do the compounds affect the downstream pathways versus the drug target itself? Are there undocumented effects on pathways known to be associated with disease? This project would make use of ChEMBL data and also other EMBL-EBI data (e.g. ArrayExpress and Reactome/IntAct data).
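
    The two steps described here (calling genes up or down, then mapping them onto pathways) can be sketched as follows. The sketch assumes expression changes are available as log2 fold changes and uses a two-fold cut-off (|log2 FC| >= 1), which is an arbitrary illustrative choice; the gene and pathway names are placeholders, not real ArrayExpress or Reactome identifiers.

```python
def classify(log2_fc, threshold=1.0):
    """Label a gene as 'up', 'down' or 'unchanged' from its log2 fold change."""
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"


def pathway_summary(log2_changes, gene_to_pathways, threshold=1.0):
    """Count up- and down-regulated genes per pathway.

    log2_changes: dict mapping gene name to log2 fold change after treatment.
    gene_to_pathways: dict mapping gene name to an iterable of pathway names.
    """
    summary = {}
    for gene, fc in log2_changes.items():
        direction = classify(fc, threshold)
        if direction == "unchanged":
            continue
        for pathway in gene_to_pathways.get(gene, ()):
            counts = summary.setdefault(pathway, {"up": 0, "down": 0})
            counts[direction] += 1
    return summary


# Toy data, invented purely for illustration.
toy_changes = {"GENE_A": 2.1, "GENE_B": -1.5, "GENE_C": 0.2, "GENE_D": 1.0}
toy_pathways = {
    "GENE_A": ["pathway_1"],
    "GENE_B": ["pathway_1", "pathway_2"],
    "GENE_C": ["pathway_2"],
    "GENE_D": ["pathway_2"],
}

summary = pathway_summary(toy_changes, toy_pathways)
print(summary)
```

    A real analysis would replace the fixed cut-off with a proper differential-expression test and would compare the pathway counts against a background expectation.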

  9. Data-mining to discover privileged med chem groups/scaffolds
    Is it possible to identify molecular fragments that have an inherent biological 'liability' and are therefore unlikely to be found in drug molecules? This project is of fundamental importance in both drug lead identification and lead optimisation. This analysis would use the med. chem., clinical candidate and launched drug databases of ChEMBL to identify and analyse chemical fragments that consistently do not progress from test-tube to the clinic.
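
    One simple way to frame this question is to compute, for each fragment, the fraction of compounds containing it that reached the clinic. The sketch below does this for a toy data set. It assumes compounds have already been decomposed into fragment labels by some other tool; the labels, the `min_count` cut-off and the data are hypothetical.

```python
def progression_rates(compounds):
    """Fraction of compounds containing each fragment that reached the clinic.

    compounds: iterable of (set_of_fragment_labels, reached_clinic) pairs.
    """
    seen, progressed = {}, {}
    for fragments, reached_clinic in compounds:
        for frag in fragments:
            seen[frag] = seen.get(frag, 0) + 1
            if reached_clinic:
                progressed[frag] = progressed.get(frag, 0) + 1
    return {frag: progressed.get(frag, 0) / n for frag, n in seen.items()}


def never_progressing(compounds, min_count=2):
    """Fragments seen at least min_count times that never reached the clinic."""
    counts = {}
    for fragments, _ in compounds:
        for frag in fragments:
            counts[frag] = counts.get(frag, 0) + 1
    rates = progression_rates(compounds)
    return sorted(f for f, r in rates.items() if r == 0.0 and counts[f] >= min_count)


# Toy data, invented purely for illustration.
toy_compounds = [
    ({"frag_A", "frag_B"}, True),
    ({"frag_A"}, False),
    ({"frag_C"}, False),
    ({"frag_C", "frag_B"}, False),
    ({"frag_B"}, True),
]

rates = progression_rates(toy_compounds)
flagged = never_progressing(toy_compounds)
print(rates, flagged)
```

    With real data the raw rate would need a statistical test against the overall attrition rate, since most compounds fail regardless of which fragments they contain.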

  10. In vitro to in vivo translation
    There is frequently no simple relationship between in-vivo and in-vitro bioassay data. One hypothesis to explain this is that 'free drug concentration', when considered in conjunction with in-vitro potency, is an important property in determining in-vivo efficacy. Examples would include in-vivo/in-vitro metabolism, brain concentration/pKi and efficacy, cell-based or whole blood activity versus protein affinity, and QT prolongation versus hERG affinity. Data from the ChEMBL database will be analysed to investigate whether the in-vivo effects can be rationalised in terms of free drug concentrations and in-vitro binding.
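
    The free-drug hypothesis can be made concrete with a small worked calculation. The sketch below assumes the free concentration is the total concentration multiplied by the fraction unbound, that pKi is the negative log10 of Ki in mol/L, and that target occupancy follows simple one-site equilibrium binding. The function names and the numbers are illustrative assumptions, not ChEMBL data.

```python
def free_concentration(total_molar, fraction_unbound):
    """Free (unbound) concentration = total concentration x fraction unbound."""
    if not 0.0 <= fraction_unbound <= 1.0:
        raise ValueError("fraction_unbound must lie between 0 and 1")
    return total_molar * fraction_unbound


def ki_from_pki(pki):
    """Convert pKi (-log10 of Ki in mol/L) back to Ki in mol/L."""
    return 10.0 ** (-pki)


def occupancy(free_molar, ki_molar):
    """Fractional target occupancy for simple one-site equilibrium binding."""
    return free_molar / (free_molar + ki_molar)


# Toy numbers, invented purely for illustration.
total = 1e-6  # 1 uM total concentration
fu = 0.05     # 5% of the drug is unbound
pki = 8.0     # corresponds to Ki = 10 nM

free = free_concentration(total, fu)
ki = ki_from_pki(pki)
print(f"free/Ki = {free / ki:.2f}, occupancy = {occupancy(free, ki):.2f}")
```

    In this toy case the total concentration is 100-fold above Ki but the free concentration is only about five-fold above it, which is the kind of gap the hypothesis proposes as an explanation for in-vitro/in-vivo disconnects.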