
Semantic Enrichment of the Scientific Literature Workshop

30, 31 March and 1 April 2009

Dear Associates,

We would like to invite you to the EBI workshop on Semantic Enrichment of the Scientific Literature which is co-organised and sponsored by the BootStrep project (contact Dietrich Rebholz-Schuhmann, rebholz@ebi.ac.uk) and the EBI Industry Programme (contact Dominic Clark, clark@ebi.ac.uk).

The workshop venue is the Wellcome Trust Conference Centre, Hinxton, Cambridge, UK.

Motivation
Over the last 10 years, innovation has changed the ways in which scientific publications are gathered and delivered to the public. Since the start of the electronic era:

  • publishers have moved from paper presentation of their content to electronic delivery;
  • the US National Library of Medicine (NLM) has opened up its archives and delivered Medline abstracts to the public in electronic form;
  • open access publishers have been making their content freely available;
  • curation teams are increasingly working with the publishers to gather ever more data and provide it to the public; and
  • proposals have been made that authors should contribute more details to their manuscripts (FEBS Letters experiment).

These changes require novel ways to capture and deliver the content to the public and to exchange the content and the annotations between different sites. This has led to increased activities to capture more information from the authors, to align it with the bioinformatics data resources, to deliver the content as part of the scientific literature and to improve the interoperability between existing automatic systems for text processing and exploitation.
This workshop will focus on semantic enrichment of the scientific literature. To this end, workshop participants will have the opportunity to hear about and discuss solutions that capture information from the authors directly and that deliver documents with their annotations. Furthermore, we will discuss what different user groups need from the scientific literature of the future, e.g. librarians, researchers, the automatic text processing and data mining research community, ontologists and others.

Participants
The workshop is intended for all users who benefit from better information retrieval (e.g., librarians and information scientists supporting industrial researchers) and information provision (e.g., the bioinformatics research community). It is also aimed at members of the text mining research community, members of publishing companies, industrial users of scientific information, curation teams and teams working on ontological or terminological resources.

Intended Outcomes
The intended outcomes of the workshop are:

  • to exchange views on the reuse of literature in all possible ways and to the benefit of all involved parties;
  • to gain a better understanding of the infrastructure requirements arising from the automatic gathering and distribution of semantically enriched text (open standards and connectivity);
  • to exchange views on what publishers could contribute to better exploitation of the publicly available literature; and
  • to assess current solutions for the gathering of semantic details from the authors while writing their scientific manuscripts.

Programme Overview

The workshop programme has been divided into three separate days – each with its own focus.
The foci are as follows:

  • March 30th: Reliable factual data from the literature based on ontological resources (open meeting, Francis Crick Auditorium). Presentations and discussions on advanced solutions to model ontological resources for gathering facts from the literature and for integration into a fact database: gene regulatory events as a working example.
  • March 31st: Semantic Enrichment of the literature for the benefit of all users (open meeting, Francis Crick Auditorium). Existing solutions for the processing and standardization of annotations in the scientific literature: authoring solutions, access to data through publishers, requirements from curators and “information scientists” in pharmaceutical and other industrial companies.
  • April 1st: Efficient exchange of scientific literature for automatic exploitation and reuse of information (participation by invitation only, James Watson Pavilion). This is a focussed discussion forum that will address a number of issues and opportunities with key stakeholder groups.

Acknowledgement
We are grateful to Ian Dix (AstraZeneca) and Ian Harrow (Pfizer) for their advice in the construction of the workshop programme. This workshop is sponsored by the EC STREP project ‘BootStrep’ (FP6-028099, www.bootstrep.org) and by the Industry Programme at the EBI (http://www.ebi.ac.uk/industry/ind-prog-index.html).

Agenda:

Time Agenda
Monday 30 March 2009 - Reliable factual data from the literature based on ontological resources (open meeting, Francis Crick Auditorium).
Presentations and discussions on advanced solutions to model ontological resources for gathering facts from the literature and for integration into a fact database: gene regulatory events as a working example.

09.00 Registration Tea/Coffee
10.30 Welcome and Introductions (Dietrich Rebholz‐Schuhmann)

10.30 Session 1: Semantic representation of Gene Regulatory Events (GREs)
• [10.30] Keynote: "Refine and PathText, which combines Text Mining with Pathways" (Junichi Tsujii, NaCTeM, Manchester, UK and University of Tokyo, Tokyo, Japan)
• [11.00] "Gene regulation ontology: design and exploitation for information extraction" (Jung‐Jae Kim, EBI, Hinxton, Cambridge, UK)
• [11.30] "OregAnnO: curated gene regulatory events" (Stephen Montgomery, Sanger, Wellcome Trust Genome Campus, UK)
• [12.00] "Ontology development for information extraction (ODIE) from clinical text" (Wendy Chapman, University of Pittsburgh)

12.30 Lunch

13.45 Session 2: Identification of Gene Regulation Events in the scientific literature (BootStrep)
• [13.45] "BioLexicon" (Simonetta Montemagni, CNR, Pisa)
• [14.15] "Identification of gene regulatory events from the literature" (Friedrich‐Schiller University, Jena, Germany)
• [14.45] "Language Resource Assessment for Information Access" (Su Jian, InfoComm Research, Singapore)

15.30 Tea/Coffee

16.00 Session 3: Clinical data and Novel publishing solutions
• [16.00] "Exploiting semantic technologies to build an application ontology" (James Malone, EMBL-EBI)
• [16.00] "Embedding semantic data during manuscript authoring" (Lynn Fink, University of California San Diego)
• [16.30] "PaperMaker: consistency analysis of published manuscripts" (Piotr Pezik, EBI, Hinxton, Cambridge, UK)

17.00 Closing remarks and Discussion (Dietrich Rebholz-Schuhmann)
18.15 Pre-dinner drinks and networking
19.30 Dinner, Conference Centre Restaurant

Tuesday 31 March 2009 - Semantic Enrichment of the literature for the benefit of all users (open meeting, Francis Crick Auditorium).

Existing solutions for the processing and standardization of annotations in the scientific literature: authoring solutions, access to data through publishers, requirements from curators and “information scientists” in pharmaceutical and other industrial companies.


08.30 Registration Open (tea/coffee)
09.15 Welcome and Introductions (Dietrich Rebholz-Schuhmann & Dominic Clark)

09.30 Session 1: Users' and publishers' perspectives on future publishing

  • 09.30 Keynote: Completing the Circuit: Publishing with Linked Data, Eric Neumann, Clinical Semantics Group
  • 10.00 The Needs of Information Scientists in Pharma Companies, Jasen Chooramun, AstraZeneca
  • 10.30 The Needs of Bioinformaticians in Pharma Companies, Ian Harrow, Phoebe Roberts, Pfizer

11.00 Tea/Coffee

11.30 Session 1 (continued)

  • 11.30 Elixir: European Infrastructure to Support Interoperability Between Text Repositories and Biological Databases, Alfonso Valencia, CNIO, Madrid, Spain
  • 12.00 CALBC + UKPMC: Producing a Large Scale Annotated Corpus for Standardized Literature, Dietrich Rebholz-Schuhmann, EBI, UK
  • 12.30 Intelligent Information Management, Stefano Bertolo, European Commission, Luxembourg

12.45 Lunch

14.00 Session 2: Semantic Enrichment

  • 14.00 Keynote: Reconciling Annotations Made to Different Terminologies: The Role of Terminology Integration Services, Olivier Bodenreider, NLM

  • 14.30 From Text to Knowledge and Back. Semantic Enrichment for Knowledge Discovery, Sophia Ananiadou, NaCTeM, Manchester
  • 15.00 On the Semantics of Semantic Enrichment - Conceptual Resources for Text Mining Analytics and Information Access, Udo Hahn, Friedrich-Schiller University, Jena, Germany

15.30 Tea/Coffee

16.00 Session 3: New Solutions for Publishing

  • 16.00 Keynote: Semantic Enrichment of Elsevier's Literature, Anita de Waard, Elsevier, Amsterdam, NL
  • 16.30 WikiGene: Novel Ways to Publish Information on the Web, Robert Hoffmann, MIT
  • 17.00 Use of Electronic Healthcare Records and Biomedical Literature/Databases for the Early Detection of Adverse Drug Events (EU-ADR), Erik van Mulligen, Erasmus Medical Center, Rotterdam, NL

17.30 Closing remarks and Discussion (Dietrich Rebholz-Schuhmann)
18.30 Departure for Workshop Dinner
19.00 Workshop Dinner, The Cricketers, Clavering