Xtreme Reasoning Jamboree (XR-Jam)
Validation of assertions in different biomedical data resources
This project is a small-scale jamboree, running for about one year, that explores different information-technology solutions for validating biomedical knowledge. The main goal is to establish interoperability between biomedical data resources, which has not been achieved in the past using ontological resources. In subsequent steps, different information retrieval, data mining and reasoning techniques will be applied to validate the available information.
Typical questions are:
- In which parts of the proteome can the functional annotation of the protein-interaction partners be confirmed from different resources?
- If we annotate diseases with different kinds of phenotypes across selected species based on related anatomical structures, in which species do we find parts of the protein-interaction network that correlate across species?
- Can we formulate a chain of cause-consequence relations from the molecular level to the phenotypic level that would explain the development of a selected disease under the condition that the gene is involved?
- If we extract assertions from the scientific literature, how many of these assertions can be validated against the biomedical data resources?
The project could start with a focus on immunological questions, since part of the funding is contributed by the CALBC project, which leads to the annotation and normalisation of all Medline abstracts on immunology for integration into a triple store. A larger focus would be the validation of different gene-disease associations across all data resources. Preliminary work on the integration of biomedical data into triple stores was completed in the Rebholz group in 2010.
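The question of how many literature-derived assertions can be validated against existing resources can be pictured as a comparison of triples. The following is a minimal sketch only, assuming that both the extracted assertions and the reference data are already normalised to the same identifiers; the `Assertion` type, the `validate` function and all identifiers shown are hypothetical placeholders, not part of any project resource.

```python
from typing import Dict, Iterable, NamedTuple


class Assertion(NamedTuple):
    """A subject-predicate-object triple, e.g. a gene-disease association."""
    subject: str
    predicate: str
    obj: str


def validate(extracted: Iterable[Assertion],
             reference: Iterable[Assertion]) -> Dict[Assertion, bool]:
    """Mark each extracted assertion as confirmed if it also occurs in the reference set."""
    known = set(reference)
    return {a: a in known for a in extracted}


# Hypothetical placeholder identifiers, not real database entries.
reference = [
    Assertion("gene:G1", "associated_with", "disease:D1"),
    Assertion("gene:G2", "associated_with", "disease:D2"),
]
extracted = [
    Assertion("gene:G1", "associated_with", "disease:D1"),
    Assertion("gene:G3", "associated_with", "disease:D1"),
]

result = validate(extracted, reference)
confirmed = sum(result.values())
print(f"{confirmed} of {len(result)} extracted assertions confirmed")
```

Exact matching is the simplest possible criterion; in practice the comparison would depend on how well identifiers are normalised across resources, which is the interoperability problem the project addresses.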
Other questions that would be relevant to the project are:
- How interoperable are all available ontological resources taken together? Which modifications improve interoperability and the outcome of the work on the aforementioned questions?
- Can we use reasoning to improve the retrieval of information across all “relevant” biomedical data resources?
- What are typical questions that can be answered using reasoning in combination with ontological resources in the biomedical research infrastructure that could not be answered through other retrieval methods?
- Which reasoners perform best at a large scale? Which setup of semantic resources improves the performance of reasoning overall, and which data resources are required to answer the aforementioned questions?
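As one illustration of how reasoning might improve retrieval, the sketch below compares an exact term match with a lookup that also follows is-a relations. It is a toy example under stated assumptions: the hierarchy `IS_A`, the `ANNOTATIONS` table and all term and record names are hypothetical, and a real setup would use an ontology reasoner rather than this hand-written traversal.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: child term -> direct parent terms (is-a).
IS_A: Dict[str, List[str]] = {
    "t_cell_activation": ["lymphocyte_activation"],
    "b_cell_activation": ["lymphocyte_activation"],
    "lymphocyte_activation": ["immune_response"],
}

# Hypothetical annotations: record identifier -> annotated term.
ANNOTATIONS: Dict[str, str] = {
    "rec1": "t_cell_activation",
    "rec2": "immune_response",
    "rec3": "cell_cycle",
}


def ancestors(term: str) -> Set[str]:
    """Collect all terms reachable from `term` by following is-a links upwards."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def retrieve(query: str) -> List[str]:
    """Return records annotated with `query` or with any of its subclasses."""
    return sorted(
        rec for rec, term in ANNOTATIONS.items()
        if term == query or query in ancestors(term)
    )


exact = sorted(rec for rec, term in ANNOTATIONS.items() if term == "immune_response")
inferred = retrieve("immune_response")
print(exact)     # ['rec2']
print(inferred)  # ['rec1', 'rec2']
```

The record annotated with the more specific term is found only when the is-a hierarchy is taken into account, which is the kind of gain over plain retrieval that the questions above ask about.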
The project will be set up as an interdisciplinary research project involving bioinformaticians and biologists, logicians and ontologists, computer scientists, and computational linguists. Participants in the project can receive funding from my research team for the research work, for travel to conferences and for publications.