Course at EMBL-EBI
Finding evidence in research publications
This workshop introduces tools and approaches used to discover biologically relevant data in the research literature. Participants will learn the basics of programmatic analysis of scientific literature and explore the principles of dictionary-based text mining through relevant case studies. The workshop has a strong focus on practical exercises and group project work, giving participants hands-on experience of applying what they have learnt to biologically relevant problems.
Do you want to identify publicly available datasets cited in research literature to use them in your future analyses? Do you want to develop basic skills to programmatically access and analyse scientific literature? Do you want to learn about the basics of text analytics to make the most of literature review?
Published research contains a wealth of new data and evidence. Automating literature review and text analysis can help extract valuable knowledge from millions of research publications. Modern tools for literature analysis can help you identify potential drug targets, predict host-pathogen interactions, or infer growth regulators for crops, based on published findings.
Who is this course for?
This workshop is aimed at life science researchers who are interested in extracting data and evidence from the research literature. It will help those who want to identify cited datasets for reuse, further analysis, background research, or as supporting data for their own hypotheses. The workshop will also be of interest to those who are applying, or planning to apply, literature analysis or text mining in their own research projects.
Participants will benefit from an undergraduate-level knowledge of biology and should ideally have some bioinformatics experience and/or a basic understanding of programmatic access. Please note that this workshop requires no prior knowledge of text analytics and no computer programming skills.
Regardless of their current knowledge, we encourage participants to explore this short series of recorded webinars introducing programmatic access.
What will I learn?
Learning outcomes
At the end of the workshop you should be able to:
- Search for and locate publicly available open-access datasets in the scientific literature
- Use programmatic tools to search for data in the literature
- Appreciate the value and limitations of text-mining approaches
- Develop basic sets of heuristics for text analytics
- Apply the information you have discovered in the context of your own research
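As an illustration of what a programmatic literature search can look like, here is a minimal Python sketch that builds a search URL for the Europe PMC RESTful web service. It is not taken from the workshop materials. The base URL, the `ACCESSION_TYPE` query field and the `format` and `pageSize` parameters are assumptions based on the public Europe PMC service, and `build_search_url` is a hypothetical helper name. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed base URL of the Europe PMC RESTful search service.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(topic, accession_type=None, page_size=25):
    """Return a search URL for articles that mention `topic`.

    If `accession_type` is given (for example "pdb"), the query is narrowed
    to articles that cite at least one data accession of that type
    (assumed Europe PMC query syntax).
    """
    query = f'"{topic}"'
    if accession_type:
        query += f" AND ACCESSION_TYPE:{accession_type}"
    params = {"query": query, "format": "json", "pageSize": page_size}
    # urlencode percent-encodes quotes and colons and turns spaces into "+".
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("host-pathogen interaction", accession_type="pdb"))
```

The resulting URL could be opened in a browser or passed to any HTTP client. Keeping URL construction separate from the request makes the query easy to inspect before it is sent.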
Course content
During this workshop you will learn about:
- How researchers share and cite data
- How to search for data cited in the literature
- Basics of dictionary-based text mining
- Basics of ontologies
- Programmatic tools to find data and evidence in the literature
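To give a flavour of dictionary-based text mining, the sketch below tags terms from a small dictionary in a sentence. It is an illustrative toy, not the method taught in the workshop. The dictionary entries and the function names `build_pattern` and `annotate` are hypothetical, and a real dictionary would normally be derived from a curated resource or ontology.

```python
import re

# Hypothetical toy dictionary: lower-cased surface form -> (entity type, preferred label).
DICTIONARY = {
    "tp53": ("gene", "TP53"),
    "p53": ("gene", "TP53"),
    "breast cancer": ("disease", "breast cancer"),
    "malaria": ("disease", "malaria"),
}


def build_pattern(terms):
    """Compile one regex that matches any term as a whole word, longest term first."""
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def annotate(text, dictionary=DICTIONARY):
    """Return (start, end, matched text, entity type, preferred label) tuples."""
    pattern = build_pattern(dictionary)
    annotations = []
    for match in pattern.finditer(text):
        # Look up the match case-insensitively via its lower-cased form.
        entity_type, label = dictionary[match.group(0).lower()]
        annotations.append(
            (match.start(), match.end(), match.group(0), entity_type, label)
        )
    return annotations


if __name__ == "__main__":
    sentence = "Mutations in p53 are common in breast cancer."
    for annotation in annotate(sentence):
        print(annotation)
```

Exact whole-word matching is simple and transparent, but it also shows a typical limitation of the approach: a variant that is not in the dictionary, such as "antimalarial", is not annotated, and an ambiguous term is tagged regardless of its context.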
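Tools and resources covered in the programme include the Europe PMC API, the Europe PMC Annotations platform and Annotations API, PDBe and Open Targets.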
Please note that this workshop does not cover text mining.
Trainers
- Tom Hancocks, EMBL-EBI, UK
- Dayane Rodrigues Araujo, EMBL-EBI, UK
- Michele Magrane, EMBL-EBI, UK
- Kalpana Paneerselvam, EMBL-EBI, UK
- Nurul Nadzirin, EMBL-EBI, UK
- Zoe Pendlington, EMBL-EBI, UK
- Maaly Nassar, EMBL-EBI, UK
- Xiao Yang, EMBL-EBI, UK
Programme
Day 1 – Tuesday 10 September 2019
Finding data cited in research literature

| Time | Session | Trainer |
|---|---|---|
| 08:30 | Shuttle from Bus Stop 5, Cambridge Station | |
| 09:00-09:15 | Arrival and registration | |
| 09:15-09:30 | Welcome and introduction to the workshop | Tom Hancocks |
| 09:30-10:30 | Introduction to EMBL-EBI data resources | Tom Hancocks |
| 10:30-10:45 | Break | |
| 10:45-11:15 | How researchers share and cite data | Michele Magrane |
| 11:15-12:30 | Search tools for data cited in the literature | Dayane Araujo |
| 12:30-13:00 | Lunch | |
| 13:00-13:45 | Searching for datasets on a particular research topic | Dayane Araujo |
| 13:45-14:00 | Introduction to REST API | Dayane Araujo |
| 14:00-14:30 | Introduction to Europe PMC API | Dayane Araujo |
| 14:30-14:45 | REST API data module | Dayane Araujo |
| 14:45-15:00 | PDBe: WESTLIFE project | Nurul Nadzirin |
| 15:00-15:15 | Break | |
| 15:15-17:00 | Programmatic search for datasets in research articles - project | Dayane Araujo & Maaly Nassar |
| 17:00 | End of day | |
| 17:15 | Shuttle to Bus Stop 5, Cambridge Station | |

Day 2 – Wednesday 11 September 2019
Scientific literature as big data: an introduction to text analysis

| Time | Session | Trainer |
|---|---|---|
| 08:30 | Shuttle from Bus Stop 5, Cambridge Station | |
| 09:00-09:15 | Arrival and registration | |
| 09:15-09:20 | Scientific literature as big data - the value of text and data mining | Dayane Araujo |
| 09:20-09:35 | Finding protein-protein interaction evidence in the literature | Kalpana Paneerselvam |
| 09:35-10:00 | Europe PMC Annotations platform | Dayane Araujo |
| 10:00-10:45 | Annotation reviewing | Dayane Araujo |
| 10:45-11:00 | Break | |
| 11:00-11:30 | Introduction to Annotations API | Dayane Araujo |
| 11:30-12:30 | Project work and feedback | All |
| 12:30-13:00 | Lunch | |
| 13:00-13:10 | Limitations of text mining | Xiao Yang & Dayane Araujo |
| 13:10-13:40 | Basics of dictionary-based text mining | Xiao Yang & Dayane Araujo |
| 13:40-14:10 | Introduction to ontologies | Aravind Venkatesan |
| 14:10-14:30 | Open Targets case study | Zoe Pendlington |
| 14:30-15:00 | DIY text mining project | Xiao Yang & Dayane Araujo |
| 15:15-15:30 | Break | |
| 15:30-16:45 | DIY text mining project | Xiao Yang & Dayane Araujo |
| 16:45-17:00 | Feedback and wrap-up | Tom Hancocks |
| 17:00 | End of workshop | |
| 17:15 | Shuttle to Bus Stop 5, Cambridge Station | |

Attendance at this workshop is allocated on a first come, first served basis.
The registration fee covers your lunch, refreshments and a shuttle between Bus Stop 5, Cambridge Station and the Wellcome Genome Campus. Accommodation is not included and you will need to make your own arrangements.
Registration closes two weeks prior to the workshop, so please register as soon as you can. Once you have registered and we have received payment, we can provide a letter of support should you require a visa to travel to the UK. Applying for a visa can take several weeks, so you may not be able to obtain one in time if you register just before the closing date. If you are unable to attend, please notify us as quickly as possible so that we can offer your place to someone else.
Once you have registered please send Meredith Willmott (meredith@ebi.ac.uk) a picture of yourself and a Microsoft Word (.docx) document containing three short paragraphs with a biography, work history and description of your current research interests; each paragraph should be no more than 100 words.
Further reading
More information on the topics covered in the workshop can be found in these papers:
- Kafkas Ş, Kim JH, McEntyre JR. Database citation in full text biomedical articles. PLoS One. 2013;8(5):e63184.
- Kafkas Ş, Dunham I, McEntyre J. Literature evidence in Open Targets - a target validation platform. Journal of Biomedical Semantics. 2017 Jun;8(1):20.
Additional information can also be viewed in these webinars: