Course at EMBL-EBI
Protein function prediction with machine learning and interactive analytics
Do you want to learn how to develop models to predict protein function? Do you want to analyse and exploit the growing volume of biological data? Do you want to develop basic skills in novel machine learning approaches and big data technologies?
This workshop explores how to conduct functional annotation of proteins through machine learning (ML) approaches. Participants will gain an insight into existing public protein data resources and how novel approaches can be used to analyse and explore these data to gain new understanding of protein function. The workshop will introduce Apache Spark and Apache Zeppelin, technologies for fast data processing and integrating analytics respectively.
Who is this course for?
This workshop is aimed at researchers and bioinformaticians from across industry and academia who are looking to leverage machine learning approaches in protein function prediction. It will guide participants through the use of big data to build analytical workflows on publicly available biological data.
Participants will require prior experience in the use of the command line interface and confidence in a programming language to fully benefit from the workshop. Please contact us if you have any questions about the course's suitability before you apply.
What will I learn?
Learning outcomes
After this course you should be able to:
- Search and locate protein data of interest
- Conduct interactive analytics and data transformation using machine learning approaches
- Create simple analytical workflows using publicly available data
- Discern new biological insights about protein function
- Develop models for predicting protein function
Course content
The workshop will cover the following topics:
- UniProt knowledgebase
- Apache Zeppelin framework
- Apache Spark
- Machine learning approaches
- Protein functional annotation
Trainers
EMBL-EBI, UK
EMBL-EBI, UK
EMBL-EBI, UK
Programme
Day 1 – Wednesday 29 May 2019
Time | Session | Trainer |
---|---|---|
08:30 | Shuttle from Cambridge Station | |
09:00-09:15 | Arrival and registration | |
09:15-09:45 | Welcome and introduction to workshop | Tom Hancocks |
09:45-10:45 | Functional annotation of proteins in UniProt | Hermann Zellner |
10:45-11:00 | Break | |
11:00-12:30 | Protein data retrieval | Hermann Zellner & Rabie Saidi |
12:30-13:00 | Lunch | |
13:00-15:00 | Introduction to machine learning and big data | Rabie Saidi |
15:00-15:15 | Break | |
15:15-16:00 | Introduction to Spark & Zeppelin | Rabie Saidi |
16:00-17:00 | Spark & Zeppelin practical | Rabie Saidi |
17:00 | End of day | |
17:15 | Shuttle to Cambridge Station | |
Day 2 – Thursday 30 May 2019
Time | Session | Trainer |
---|---|---|
08:30 | Shuttle from Cambridge Station | |
09:00-09:15 | Arrival | |
09:15-10:30 | Data transformation | Rabie Saidi |
10:30-10:45 | Break | |
10:45-12:30 | Exploratory data analysis | Rabie Saidi |
12:30-13:00 | Lunch | |
13:00-15:00 | Creation and application of prediction models with Spark/MLlib | Rabie Saidi |
15:00-15:30 | Break | |
15:30-16:45 | Creation and application of prediction models with Spark/MLlib | Rabie Saidi |
16:45-17:00 | Course wrap-up and feedback | Tom Hancocks |
17:00 | End of day | |
17:15 | Shuttle to Cambridge Station | |
Attendance at this workshop is allocated on a first come, first served basis.
Registration closes two weeks before the course, so please register as soon as you can. Once you have registered and we have received payment, we can provide a letter of support should you require a visa to travel to the UK. Applying for a visa can take several weeks, and a visa may not be granted in time if you register just before the closing date. If you are unable to attend, please notify us as quickly as possible so that we can offer your place to someone else.
Once you have registered, please send Marina Pujol (mpujol@ebi.ac.uk) a picture of yourself and a Microsoft Word (.docx) document containing three short paragraphs: a biography, your work history and a description of your current research interests. Each paragraph should be no more than 100 words.
The registration fee covers your lunch, refreshments and a shuttle between Cambridge Station and the Wellcome Genome Campus. Accommodation is not included and you will need to make your own arrangements.
Further learning
A short webinar summarising the main content of this workshop is available here: