Finding Evidence in Research Publications

Date:

 Tuesday 10 - Wednesday 11 September 2019

Venue: 

European Bioinformatics Institute (EMBL-EBI) - Training Room 1 - Wellcome Genome Campus, Hinxton, Cambridge,  CB10 1SD, United Kingdom

Application opens: 

Thursday 21 March 2019

Application deadline: 

Friday 23 August 2019

Participation: 

First come, first served

Contact: 

Meredith Willmott

Registration fee: 

£90.00

Overview

Do you want to identify publicly available datasets cited in research literature to use them in your future analyses? Do you want to develop basic skills to programmatically access and analyse scientific literature? Do you want to learn about the basics of text analytics to make the most of literature review?

Published research contains a wealth of new data and evidence. Automating literature review and text analysis can help extract valuable knowledge from millions of research publications. Modern tools for literature analysis can help you identify potential drug targets, predict host-pathogen interactions, or infer growth regulators for crops, based on published findings.

This two-day workshop introduces tools and approaches used to discover biologically relevant data in the research literature. Participants will be introduced to the basics of programmatic analysis of scientific literature and explore the principles of dictionary-based text-mining, explained using relevant case studies.

The workshop has a strong focus on practical exercises and group project work, giving participants hands-on experience of tackling biologically relevant problems using what they have learnt.

Audience

The workshop is aimed at life science researchers who are interested in extracting data and evidence from research literature. It will help those who want to identify cited datasets for reuse, further analysis, background research, or as supporting data for their own hypotheses. The workshop will also be of interest to those who are applying, or planning to apply, literature analysis or text mining in their own research projects.

Participants will benefit from an undergraduate-level knowledge of biology. Ideally, participants should also have some bioinformatics experience and/or a basic understanding of programmatic access. Please note that this workshop requires no prior knowledge of text analytics or computer programming.

Regardless of your current knowledge, we encourage participants to explore this short series of recorded webinars introducing programmatic access.

 

Syllabus, tools and resources

During this workshop you will learn about:

  • How researchers share and cite data
  • How to search for data cited in the literature
  • Basics of dictionary-based text mining
  • Basics of ontologies
  • Programmatic tools to find data and evidence in the literature

Tools and resources covered include: 

Please note that this workshop does not cover text mining.

Outcomes

At the end of the workshop you should be able to:

  • Search for and locate publicly available open-access datasets in the scientific literature
  • Use programmatic tools to search for data in the literature
  • Appreciate the value and limitations of text-mining approaches
  • Develop basic sets of heuristics for text analytics
  • Apply the information you have discovered in the context of your own research
     

Programme

Day 1 – Tuesday 10 September 2019
Finding data cited in research literature
08:30 Shuttle to campus  
09:00-09:15 Arrival and registration  
09:15-09:30 Welcome and introduction to the workshop Tom Hancocks
09:30-10:30 Introduction to EMBL-EBI data resources Tom Hancocks
10:30-10:45 Break  
10:45-11:15 How researchers share and cite data Michelle Magrane
11:15-12:30 Search tools for data cited in the literature Dayane Araujo
12:30-13:00 Lunch  
13:00-13:45 Searching for datasets on a particular research topic Dayane Araujo
13:45-15:00 Introduction to programmatic data searches Dayane Araujo & Nurul Nadzirin
15:00-15:15 Break  
15:15-17:00 Programmatic search for datasets in research articles Dayane Araujo
17:00 End of day  
17:15 Shuttle to Station Road, Cambridge  
Day 2 – Wednesday 11 September 2019
Scientific literature as big data: an introduction to text analysis
08:30 Shuttle to campus  
09:00-09:15 Arrival  
09:15-09:20 Scientific literature as big data - the value of text and data mining Dayane Araujo
09:20-09:35 Finding protein-protein interaction evidence in the literature IntAct Team
09:35-10:45 Europe PMC Annotations platform Dayane Araujo
10:45-11:00 Break  
11:00-12:30 Introduction to Annotations API Dayane Araujo
12:30-13:00 Lunch  
13:00-13:40 Introduction to text analysis Xiao Yang & Dayane Araujo
13:40-14:10    
14:30-15:15 Open session on text mining Dayane Araujo
15:15-15:30 Break  
15:30-16:45 Open session on text mining Dayane Araujo
16:45-17:00 Feedback and wrap-up Tom Hancocks
17:00 End of workshop  
17:15 Shuttle to Cambridge Station