Hands-on with WEKA
In this section, we will present two hands-on examples of using WEKA. The first uses clustering to explore a CRISPR dataset, and the second uses classification to identify cancer-causing genes.
Exploring targets using CRISPR-Cas9 screens
CRISPR are arrays of regularly spaced repeats first detected in prokaryotes [13, 14], where they play an important role in the bacterial immune response to viruses [15]. Since its discovery in 1993, the CRISPR system has been thoroughly characterised at the biochemical, genetic and molecular levels, culminating in its versatile capability for genome editing in mammals and for altering the expression of gene products (patent US08697359B1). For more details on the 20-year journey of CRISPR, read Lander, 2016 [16].
CRISPR holds the promise of revolutionising drug discovery [17] because of the ease and precision with which it can introduce changes to the DNA sequence in a high-throughput fashion. The system has been applied to a variety of disease models, e.g. cancer cell lines, iPS neuronal cells, T cells, organoids and many more. In this section of the course, we will use drug targets prioritised in the study on systematic CRISPR-Cas9 screening in a 1,000 cancer cell line panel [18].