Machine and deep learning for biosynthetic gene cluster (BGC) discovery
Trainer: Maaly Nassar
Overview: This session trains participants to use machine and deep learning models to discover a variety of biosynthetic gene clusters (BGCs) in the literature and in whole genomes. BGCs are clusters of genes that produce secondary metabolites, such as antibiotics, antiviral and antitumor compounds, pollutant-biodegrading enzymes, and antigens. This session focuses on BGCs involved in T-cell immunity. Given the popularity of DNA vaccines, we will also try to retrieve the DNA sequences of potential microbial antigens from their corresponding BGCs as vaccine targets.
Learning outcomes
By the end of this session you will be able to:
- Understand the different types of biosynthetic gene clusters (BGCs) and their implications in environmental phenomena as well as health and disease.
- Understand the roles of saccharide BGCs in T-cell immunity and diseases.
- Gain insight into different BGC discovery tools and how they were trained.
- Use Python libraries and trained machine and deep learning models to identify saccharide BGCs in whole genomes.
Materials:
- Presentation slides
- GitLab from emerald: https://gitlab.com/maaly7/emerald_bgcs_annotations
- GitLab for the BGC discovery: https://gitlab.com/maaly7/bgc_discovery_for_t_cell_immunology