Course at EMBL-EBI
Practical biocuration
This three-day course will introduce concepts vital to biocuration, including metadata and ontologies, the role of identifiers, and extracting data from literature.
During the course, there will be a focus on practical skills required by biocurators, including programming for handling large biological datasets, and querying databases. Participants will have the opportunity to use these skills in a real-life practical biocuration exercise during the group projects.
There will also be an opportunity to network with biocurators and learn more about their roles and data resources.
Group projects
A major element of this course is a group project, in which participants will be placed in small groups to work together on a challenge set by trainers from EMBL-EBI and external institutes. This allows participants to explore the role of biocurators and gain hands-on experience in biocuration. The group work will culminate in a presentation session involving all participants on the final day of the course, giving an opportunity for wider discussion in the whole course group.
Groups are mentored and supported by the trainers who set the initial challenge, but the groups will be responsible for driving their projects forward, with all members expected to take an active role. Groups are organised before the course.
Basic outlines of the projects on offer this year are given below. In your registration you must indicate your first and second choice of project. Not all projects may be offered, and final decisions on which projects will be run during the course will be made based on the number of applicants per project.
Group project: Curating a protein complex
In this project, a new protein complex has been published and it is your role to find out as much information as possible about the complex. Using literature and bioinformatics resources, you will provide an overview of the protein complex, including a full list of the complex members, a description of the role of each protein and an overview of the role of the complex. You will use resources such as UniProt, PDB and Europe PMC.
Project mentor: Michele Magrane
Group project: Exploring data and metadata from a genome-wide association study
The GWAS Catalog is a richly annotated database of human genome-wide association studies, which analyse associations between genetic variants and a disease or other trait of interest in a sample of individuals from a particular population. In this mini project you will examine a GWAS publication in detail to extract information about the traits and samples under investigation and decide how to represent these metadata using standardised vocabularies and ontologies. You will also use simple command-line tools to look at a GWAS dataset and explore the role of standard formats and quality control in biocuration.
Project mentor: Elliot Sollis
Group project: Using alignments to improve Rfam families: a case of curating non-coding RNA families
Rfam is a database of non-coding RNA (ncRNA) families. Each family is built from a sequence alignment, metadata describing the family and an Infernal model. Families are created by curating alignments from publications or user-submitted alignments. In this project we will demonstrate the process of updating an existing family with an improved alignment. We will examine the reports generated during curation, including the phylogenetic distribution, sequence alignment and model results, to determine how the family should be updated to reflect the improved alignment. Additionally, we will show how to connect this information to resources like Wikipedia.
Project mentor: Nancy Ontiveros
Who is this course for?
Anyone interested in becoming a biocurator, new biocurators or biologists planning to use biocuration skills in their research.
What will I learn?
Learning outcomes
After the course you should be able to:
- Discuss the varied role of biocurators and different types of biocuration
- Describe how metadata and ontologies are used by biological data resources
- Extract biological data from literature
- Explain some challenges that arise in biocuration
- Use some tools and techniques for the handling of biological data
Course content
During this course you will learn about:
- What biocuration is
- The different types of biological databases and how they can relate to each other
- The role of metadata and ontologies
- Literature mining for biocuration
- Handling datasets with SQL and Python
- Tools and techniques used by biocurators for biological data handling
Trainers
- Dayane Rodrigues Araújo, EMBL-EBI
- Sarah Dyer, EMBL-EBI
- George Georghiou, Novartis
- Matthew Jeffryes, EMBL-EBI
- Antonia Nilsson Lock, EMBL-EBI
- Michele Magrane, EMBL-EBI
- Aleena Mushtaq, EMBL-EBI
- Claire O'Donovan, EMBL-EBI
- Nancy Ontiveros Palacios, EMBL-EBI
- Pedro Raposo, EMBL-EBI
- Elena Speretta, EMBL-EBI
- Elliot Sollis, EMBL-EBI
- Anna Swan, EMBL-EBI
- Krishna Kumar Tiwari, EMBL-EBI
- Simone Weyand, EMBL-EBI
- Aleix Puig, EMBL-EBI
Programme
| Time | Topic | Trainer |
|---|---|---|
| Day one – Tuesday 16 May 2023 | | |
| 09:00 | Bus from Cambridge | |
| 09:45 – 10:15 | Arrival and registration | |
| 10:15 – 11:30 | Course introduction and icebreaker | Dayane Rodrigues Araújo and Anna Swan |
| 11:30 – 12:30 | Introduction to biocuration | Claire O'Donovan |
| 12:30 – 13:30 | Lunch | |
| 13:30 – 15:00 | Metadata and ontologies | Antonia Lock and Aleix Puig |
| 15:00 – 15:30 | Break | |
| 15:30 – 17:00 | Identifiers and mapping | Sarah Dyer and Aleena Mushtaq |
| 17:00 – 18:15 | Drinks and networking | |
| 18:30 – 20:00 | Dinner | |
| 20:00 | Bus to Cambridge | |
| Day two – Wednesday 17 May 2023 | | |
| 08:45 | Bus from Cambridge | |
| 09:30 – 10:30 | Extracting data from literature | Matthew Jeffryes |
| 10:30 – 11:00 | Break | |
| 11:00 – 12:00 | Extracting data from literature | Matthew Jeffryes |
| 12:00 – 13:00 | Challenges in biocuration | George Georghiou |
| 13:00 – 14:00 | Lunch | |
| 14:00 – 16:00 | Programming for biocuration | Pedro Raposo and Elena Speretta |
| 16:00 – 16:30 | Break | |
| 16:30 – 18:00 | Introduction to mini projects | Dayane Rodrigues Araújo, Anna Swan, and project mentors |
| 18:30 – 20:30 | Dinner | |
| 20:30 | Bus to Cambridge | |
| Day three – Thursday 18 May 2023 | | |
| 08:45 | Bus from Cambridge | |
| 09:30 – 11:00 | Mini project: group work | Project mentors |
| 11:00 – 11:30 | Break | |
| 11:30 – 13:00 | Mini project: group work | Project mentors |
| 13:00 – 14:00 | Lunch | |
| 14:00 – 14:30 | Mini project: finalise presentations | Project mentors |
| 14:30 – 15:30 | Mini project: presentations | All |
| 15:30 – 16:00 | Course wrap-up | Dayane Rodrigues Araújo and Anna Swan |
| 16:15 | Bus to Cambridge | |
Please read our page on application support before starting your application. In order to be considered for a place on this course, you must complete the online application form.
Places will be allocated on a first-come, first-served basis, so please register soon to avoid disappointment. The application deadline is 23:59 on Monday 6 February 2023. Incomplete registrations will not be processed.
The registration fee of £75.00 includes:
- Catering as detailed on the course programme, including two dinners
- A daily shuttle service to/from Cambridge
- Bespoke course handbook with links to all course materials
- Use of a computer in the EMBL-EBI training suite throughout the course
Catering
The course includes catering as detailed in the programme. Registrants will be asked about any dietary requirements and allergies upon registration.
Course materials
The course materials will be made available online after the course. They will include a mixture of presentations and practicals.