Date: Tuesday 10 - Wednesday 11 September 2019
Venue: European Bioinformatics Institute (EMBL-EBI) - Training Room 1 - Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
Application opens: Thursday 21 March 2019
Application deadline: Friday 23 August 2019
Participation: First come, first served
Do you want to identify publicly available datasets cited in the research literature and use them in your future analyses? Do you want to develop basic skills to programmatically access and analyse scientific literature? Do you want to learn the basics of text analytics to make the most of your literature reviews?
Published research contains a wealth of new data and evidence. Automating literature review and text analysis can help extract valuable knowledge from millions of research publications. Modern tools for literature analysis can help you identify potential drug targets, predict host-pathogen interactions, or infer growth regulators for crops, based on published findings.
This two-day workshop introduces tools and approaches used to discover biologically relevant data in the research literature. Participants will be introduced to the basics of programmatic analysis of scientific literature and explore the principles of dictionary-based text-mining, explained using relevant case studies.
The workshop has a strong focus on practical exercises and group project work to give participants hands-on experience, tackling biologically relevant problems based on what they have learnt in this workshop.
The workshop is aimed at life science researchers who are interested in extracting data and evidence from research literature. It will help those who want to identify cited datasets for reuse, further analyses, background research, or as supporting data for their own hypotheses. The workshop will also be of interest to those who are applying, or planning to apply, literature analysis or text mining in their own research projects.
Participants will benefit from an undergraduate-level knowledge of biology. Participants should ideally have some bioinformatics experience and/or a basic understanding of programmatic access. Please note that this workshop requires no prior knowledge of text analytics and no computer programming skills.
Regardless of your current knowledge, we encourage participants to explore this short series of recorded webinars introducing programmatic access.
Syllabus, tools and resources
During this workshop you will learn about:
- How researchers share and cite data
- How to search for data cited in the literature
- Basics of dictionary-based text mining
- Basics of ontologies
- Programmatic tools to find data and evidence in the literature
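As a rough illustration of the dictionary-based text mining listed above, the sketch below matches a small term dictionary against a sentence. It is not taken from the workshop materials: the dictionary entries, the identifiers and the function names are all hypothetical, and real dictionaries are typically derived from curated resources and ontologies.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-dictionary: lowercase term -> made-up identifier.
# Keys must be lowercase because matches are looked up after lowercasing.
DICTIONARY: Dict[str, str] = {
    "tp53": "GENE:0001",
    "brca1": "GENE:0002",
    "breast cancer": "DISEASE:0001",
}


def annotate(text: str, dictionary: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (start offset, matched text, identifier) for each dictionary hit."""
    if not dictionary:
        return []
    # Try longer terms first so multi-word entries are preferred over shorter ones.
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (match.start(), match.group(0), dictionary[match.group(0).lower()])
        for match in pattern.finditer(text)
    ]


sentence = "Mutations in BRCA1 and TP53 are linked to breast cancer."
for start, term, identifier in annotate(sentence, DICTIONARY):
    print(start, term, identifier)
```

Exact string matching like this is simple and transparent, but it misses synonyms and spelling variants and cannot resolve ambiguous terms, which is one reason to weigh both the value and the limitations of such approaches.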
Tools and resources covered include:
Please note that this workshop does not cover text mining.
At the end of the workshop you should be able to:
- Search for and locate publicly available, open-access datasets cited in the scientific literature
- Use programmatic tools to search for data in the literature
- Appreciate the value and limitations of text-mining approaches
- Develop basic sets of heuristics for text analytics
- Apply the information you have discovered in the context of your own research
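To give a flavour of what a programmatic literature search can look like, the sketch below only builds a request URL and does not contact any server. It assumes the public Europe PMC RESTful search service with its `query`, `format` and `pageSize` parameters, and the `ACCESSION_TYPE` search field used in the example query; none of these details come from the workshop description, so check the service documentation before relying on them. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the Europe PMC RESTful search service.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query: str, page_size: int = 25) -> str:
    """Build a search URL for a query string; no network request is made."""
    params = {"query": query, "format": "json", "pageSize": page_size}
    return SEARCH_URL + "?" + urlencode(params)


# Example query (assumed syntax): articles on malaria that cite PDB accessions.
url = build_search_url("ACCESSION_TYPE:pdb AND malaria")
print(url)
```

The resulting URL could then be fetched with any HTTP client and the JSON response parsed to list the matching articles.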
More information on the topics covered in the workshop can be found in these papers:
- Kafkas Ş, Kim JH, McEntyre JR. Database citation in full text biomedical articles. PLoS One. 2013;8(5):e63184.
- Kafkas Ş, Dunham I, McEntyre J. Literature evidence in Open Targets - a target validation platform. Journal of Biomedical Semantics. 2017;8(1):20.
Additional information can also be viewed in these webinars:
Day 1 – Tuesday 10 September 2019
Finding data cited in research literature

| Time | Session | Speaker |
|---|---|---|
| 08:30 | Shuttle to campus | |
| 09:00-09:15 | Arrival and registration | |
| 09:15-09:30 | Welcome and introduction to the workshop | Tom Hancocks |
| 09:30-10:30 | Introduction to EMBL-EBI data resources | |
| 10:45-11:15 | How researchers share and cite data | Michelle Magrane |
| 11:15-12:30 | Search tools for data cited in the literature | Dayane Araujo |
| 13:00-13:45 | Searching for datasets on a particular research topic | Dayane Araujo |
| 13:45-15:00 | Introduction to programmatic data searches | Dayane Araujo & Nurul Nadzirin |
| 15:15-17:00 | Programmatic search for datasets in research articles | Dayane Araujo |
| 17:00 | End of day | |
| 17:15 | Shuttle to Station Road, Cambridge | |
Day 2 – Wednesday 11 September 2019
Scientific literature as big data: an introduction to text analysis

| Time | Session | Speaker |
|---|---|---|
| 08:30 | Shuttle to campus | |
| 09:15-09:20 | Scientific literature as big data - the value of text and data mining | Dayane Araujo |
| 09:20-09:35 | Finding protein-protein interaction evidence in the literature | IntAct Team |
| 09:35-10:45 | Europe PMC Annotations platform | Dayane Araujo |
| 11:00-12:30 | Introduction to Annotations API | Dayane Araujo |
| 13:00-13:40 | Introduction to text analysis | Xiao Yang & Dayane Araujo |
| 14:30-15:15 | Open session on text mining | Dayane Araujo |
| 15:30-16:45 | Open session on text mining | Dayane Araujo |
| 16:45-17:00 | Feedback and wrap-up | Tom Hancocks |
| 17:00 | End of workshop | |
| 17:15 | Shuttle to Cambridge Station | |