Poster Abstracts for Category D: Databases & database integration


Poster D01
GreenExpress: A Web-Based Bioinformatics Infrastructure for Plant Genomic Information
Asaf Madi, Alexandra Sirota, Tzvika Keilin, Tzvika Kushnirski, Keren Alfasi, Elad Balmor, Anat Katz, Eytan Shvimmer, Hanne Volpin
Agricultural Research Organization - Volcani Center
Abstract:
GreenExpress, a bioinformatic infrastructure that integrates publicly available data with proprietary data gathered from Volcani Center researchers, provides the user with access to an array of tools. Among these are gene searches by user-provided expression profiles and a unique tool that performs virtual RACE of a sequence using the existing database of contigs. Mining the data revealed that organ-absent and organ-specific gene expression may be a result of interactions of the two gene sets. In addition, results from various statistical analyses of the plant genome data will be presented.

Contact: asafmadi [at] hotmail.com

Keywords: Plant Data Integration, Gene Expression, SSR


Poster D02
The Israeli Gene Bank Documentation Center: A Novel Infrastructure for Knowledge Discovery
Alexandra Sirota, Hanne Volpin, Asaf Madi, Rivka Hadas
Agricultural Research Organization - Volcani Center
Abstract:
The Israel Gene Bank (IGB) Documentation Center integrates proprietary and publicly available data from sources like GRIN, PIW and Rotem. It presents extensive species information at the IGB website with the aim of giving immediate insights into species nomenclature, collection information, species characterization, medicinal uses, and cultivation information. Knowledge discovery tools were applied to the data within this novel bioinformatics infrastructure, and the results from two approaches that enable prediction of candidate plants to be screened for beneficial compounds are discussed.

Contact: alexandra_sirota [at] hotmail.com

Keywords: Medicinal Uses, Wild Crop Relatives


Poster D03
APID, an Integrated Web Platform to Explore and Evaluate Protein-Protein Interaction Networks
Carlos Prieto, Javier De Las Rivas
Cancer Research Center (CIC USAL-CSIC)
Abstract:
Assessing the reliability of the interactome network and broadening its coverage are currently two of the main research areas in protein interactions. To these ends, APID (Agile Protein Interaction DataAnalyzer, http://bioinfow.dep.usal.es/apid/) has been developed to analyze and integrate, in a comparative web platform, the main currently known data about protein-protein interactions. At present (June 2006) the tool contains 38271 proteins and 128730 interactions, including an important set of human proteins (7774).

Contact: jrivas [at] usal.es

Keywords: Protein Interactions, Databases, Networks


Poster D04
A Locus Specific Gene Variant Database for Autosomal Dominant Polycystic Kidney Disease: PKDB
Alexander Gout (1,2), Neilson Martin (3), David Ravine (2)
(1) The Walter and Eliza Hall Institute of Medical Research; (2) Western Australian Institute for Medical Genetics, School of Medicine and Pharmacology, UWA; (3) School of Psychology, Division of Health Sciences, Curtin University of Technology
Abstract:
Autosomal dominant polycystic kidney disease arises from mutations in the PKD1 and PKD2 genes. The Polycystic Kidney Disease Mutation Database (PKDB) is an internet-accessible relational database containing comprehensive information about germline and somatic disease-causing variants within the two genes, as well as polymorphisms and variants of indeterminate pathogenicity. The database has been launched with curated data comprising 1015 PKD1 and PKD2 gene variant reports. An advanced search facility has been developed to allow online interrogation of the data (URL: http://www.pkd.waimr.uwa.e

Contact: gout [at] wehi.edu.au

Keywords: ADPKD, PKD1, PKD2, Mutation Database


Poster D05
XML Format for Codon Usage Data
Denis Shestakov (1), Tapio Salakoski (2)
(1) Turku Centre for Computer Science, University of Turku; (2) Department of Information Technology, University of Turku
Abstract:
In this work, we present CUData XML Schema (an XML format) to represent calculated codon frequencies and codon bias indices. Our motivation is to provide researchers involved in codon usage studies with a convenient self-describing format which is both program- and user-friendly. In general, the CUData XML Schema is capable of representing occurrences of overlapping or non-overlapping words of variable length in nucleotide sequences. The proposed format can also represent most known codon usage-based measures. Description of the format, examples, and data converters are available.

Contact: denshe [at] utu.fi

Keywords: Codon Usage, XML Format
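
The word-counting idea in the abstract above can be illustrated with a minimal sketch. It counts overlapping or non-overlapping words of a chosen length in a nucleotide sequence and writes the result as XML. The function names and the element and attribute names (wordUsage, word, seq, count, frequency) are assumptions made for this example; they are not taken from the CUData XML Schema itself.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_words(seq, length=3, overlapping=False):
    """Count words of a given length in a nucleotide sequence."""
    seq = seq.upper()
    # Non-overlapping words advance by the word length; overlapping by one base.
    step = 1 if overlapping else length
    return Counter(
        seq[i:i + length]
        for i in range(0, len(seq) - length + 1, step)
    )


def to_xml(counts):
    """Serialize counts using hypothetical element names (not the CUData schema)."""
    total = sum(counts.values())
    root = ET.Element("wordUsage", total=str(total))
    for word in sorted(counts):
        ET.SubElement(
            root, "word",
            seq=word,
            count=str(counts[word]),
            frequency=f"{counts[word] / total:.4f}",
        )
    return ET.tostring(root, encoding="unicode")


# Non-overlapping words of length 3 starting at position 0, i.e. codons in frame 0.
codons = count_words("ATGGCTGCTTAA")
print(to_xml(codons))
```

Setting overlapping=True with another length gives general word statistics (for example dinucleotide counts) from the same function.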


Poster D07
GeneDecks: Gene-Set Analyses of GeneCards Annotations
S. Ron (1), L. Strichman-Almashanu (1), M. Shmoish (1), O. Greenshpan (1), A. Sirota (1), A. Madi (1), T. Iny-Stein (1), N. Rosen (1), I. Dalah (1), O. Shmueli (1), M. Safran (2), Y. Aumann (3), D. Lancet (1)
(1) Department of Molecular Genetics, Weizmann Institute of Science; (2) Department of Biological Services, Weizmann Institute of Science; (3) Department of Computer Science, Bar Ilan University
Abstract:
Analysis of high-throughput experiments depends on the annotation quality of the studied genes. GeneDecks, an annotation analysis tool for gene sets based on the GeneCards human gene database, aims to broaden descriptor analysis and to discover relationships between descriptor pairs that highlight the roles of a gene set. Using descriptors from GO, mutant phenotypes, normal tissue expression and chromosomal location, we present a preliminary analysis of two sets of schizophrenia- and diabetes-related genes, showing that GeneDecks could effectively discover unique biological themes.

Contact: shany.ron [at] weizmann.ac.il

Keywords: Enrichment, Annotation, Data-Mining


Poster D08
JAFA: A Protein Function Prediction Meta-Server
Tim Harder (1,2), Iddo Friedberg (2), Adam Godzik (2)
(1) University of Applied Sciences, Bingen, Germany; (2) Burnham Institute for Medical Research, San Diego, CA, USA
Abstract:
The Joined Assembly of Function Annotations (JAFA) is a function prediction meta-server. JAFA accepts protein sequences as input, queries several different function prediction programs, and merges and scores the predictions using a simple weighted plurality vote. In this manner, JAFA synergizes several partial predictions into a more complete picture. JAFA can query any function prediction server which reports its results using Gene Ontology.

Contact: tharder [at] fh-bingen.de

Keywords: Automated Function Prediction, GO
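
The weighted plurality vote mentioned in the abstract above could look roughly like the following sketch. Each server casts one vote per GO term it predicts, scaled by that server's weight, and the terms are ranked by total score. The server names, weights and function name are hypothetical; the abstract does not describe how JAFA assigns its weights.

```python
from collections import defaultdict


def weighted_plurality(predictions, weights):
    """Merge GO-term predictions from several servers.

    predictions: {server_name: iterable of GO term IDs}
    weights:     {server_name: weight of that server's vote}
    Returns (term, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for server, terms in predictions.items():
        for term in set(terms):              # one vote per server per term
            scores[term] += weights.get(server, 1.0)
    # Sort by descending score, breaking ties by term ID for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical servers and weights, for illustration only.
predictions = {
    "server_a": ["GO:0003677", "GO:0005524"],
    "server_b": ["GO:0003677"],
    "server_c": ["GO:0016301", "GO:0005524"],
}
weights = {"server_a": 1.0, "server_b": 0.5, "server_c": 2.0}

ranked = weighted_plurality(predictions, weights)
print(ranked)
```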


Poster D10
Functional Annotation with Blast2GO v.2: A Comprehensive Gene Ontology Based Framework for Function Analysis
Stefan Goetz (1), Ana Conesa (2), Juan Miguel Garcia-Gomez (1), Manuel Talon (2), Montserrat Robles (1)
(1) Grupo de Informatica Biomedica, ITACA, Universidad Politecnica de Valencia, Valencia, Spain; (2) Centro de Genomica, Instituto Valenciano de Investigaciones Agrarias, Moncada, Valencia, Spain
Abstract:
Blast2GO (B2G) is a comprehensive Gene Ontology based framework for functional annotation and analysis. We present here the range of functionality included in the software's second version.
B2G covers the whole process in a transparent manner, from similarity searches, data-source mappings and annotation tasks through to function analysis. Major improvements and new features, such as high-performance graph visualisation, a variety of data input options and extended annotation possibilities, have turned B2G into a complete solution for generating and analysing functional annotation.

Contact: stefang [at] fis.upv.es

Keywords: Blast2GO, Gene Ontology, Annotation, Function


Poster D11
CoryneRegNet: An Integrative Bioinformatics Platform for the Analysis of Transcription Factors and Regulatory Networks
Jan Baumbach, Sven Rahmann, Andreas Tauch
Bielefeld University
Abstract:
CoryneRegNet is an ontology-based data warehouse designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of bacteria. It integrates the transcription factor binding site matching software PoSSuMsearch, network visualization as graphs and network comparison features.

Contact: jan.baumbach [at] cebitec.uni-bielefeld.de

Keywords: CoryneRegNet, Database, PSSM, PWM, Network
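
For readers unfamiliar with binding site matching by position-specific scoring matrices (PSSMs), the following is a minimal sketch of the basic idea: slide the matrix along the sequence and report every window whose summed score reaches a threshold. This is a naive scan written for illustration and is not the algorithm used by PoSSuMsearch; the matrix values and the threshold are made up.

```python
def scan_pssm(sequence, pssm, threshold):
    """Return (position, score) for every window scoring at or above threshold.

    pssm: list of dicts, one per motif column, mapping base -> score.
    """
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the column score of each base in the window.
        score = sum(column[base] for column, base in zip(pssm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy 3-column matrix with made-up integer scores, favoring the motif "TGA".
toy_pssm = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": -1, "C": -1, "G": 2, "T": -1},
    {"A": 2, "C": -1, "G": -1, "T": -1},
]

hits = scan_pssm("CCTGACTGT", toy_pssm, threshold=3)
print(hits)
```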


Poster D13
WormBase - A Comprehensive Resource for C. elegans
Michael Han (1), Todd Harris (2), William Spooner (3), Lincoln Stein (2), Richard Durbin (1), The WormBase Consortium
(1) Wellcome Trust Sanger Institute; (2) Cold Spring Harbor Laboratory; (3) European Bioinformatics Institute
Abstract:
WormBase is a comprehensive resource of experimental and computational data from Caenorhabditis elegans. In addition to the presentation and collection of data, it is continually refined by manual curation. Data is freely available via websites, web services and by direct access to the database. In addition, flatfiles of different datasets are available for download.

Contact: mh6 [at] sanger.ac.uk

Keywords: Model Organism, Database, C. elegans


Poster D14
Two Web Applications: A Tool for Protein Family Analysis and a Resource for Genomic Research
Florian Odronitz, Martin Kollmar
Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany
Abstract:
We present two web applications with a database backend and a user-friendly interface.
The first tool supports researchers in analyzing large protein families. The sequence-centric system has a set of import/export functions and can trigger different external programs automatically when data is entered or changed. The web interface offers different ways to browse the database and search for records. Information on sequences, domain composition, species, taxonomy and publications can be retrieved. Statistics on protein families can be displayed as tables and graphs. All output is generated on each request.
The second web application provides up-to-date information on sequencing projects (genomic and EST) and the corresponding species. It offers a modular search mechanism to construct complex queries easily. Search requests can also be saved to the database and rerun on a regular basis to monitor changes in the database. The results of a search can be viewed using a set of specialized perspectives.

Contact: mako [at] nmr.mpibpc.mpg.de

Keywords: Database, Web Application, Genomics, Sequences


Poster D15
Predicting Homologous Complexes in Large Protein-Protein Interaction Networks
Jae-Hun Choi, Jong-Min Park, Soo-Jun Park
Bioinformatics team, ETRI
Abstract:
In this paper, we propose a method for predicting biological complexes in which the interactions among proteins can be described specifically. The method suggests candidate complexes for a target species by homology-based transfer of complexes determined experimentally in other well-studied species such as yeast, Drosophila and mouse.

Contact: jhchoi [at] etri.re.kr

Keywords: Complex, PPI, Homology
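
One simple way to picture the homology-based transfer described in the abstract above is sketched below: each known complex is mapped onto the target species through an ortholog table and kept as a candidate if enough of its members have an ortholog. The coverage criterion, the function name and all identifiers are assumptions for this example; the abstract does not specify the authors' actual procedure.

```python
def transfer_complexes(source_complexes, orthologs, min_coverage=0.5):
    """Map known complexes onto a target species via an ortholog table.

    source_complexes: {complex_name: set of source-species protein IDs}
    orthologs:        {source protein ID: target protein ID}
    min_coverage:     minimum fraction of members that must have an ortholog
    """
    candidates = {}
    for name, members in source_complexes.items():
        mapped = {orthologs[p] for p in members if p in orthologs}
        if members and len(mapped) / len(members) >= min_coverage:
            candidates[name] = mapped
    return candidates


# Made-up identifiers, for illustration only.
yeast_complexes = {
    "complex_1": {"y1", "y2", "y3"},
    "complex_2": {"y4", "y5", "y6", "y7"},
}
yeast_to_target = {"y1": "t1", "y2": "t2", "y4": "t4"}

candidates = transfer_complexes(yeast_complexes, yeast_to_target)
print(candidates)
```

Here complex_1 is kept because two of its three members map to the target species, while complex_2 is dropped because only one of four does.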


Poster D17
Implementation of a Dynamic Query Processing Model to Support Biological Data Integration
Myungeun Lim, Myung-Guen Chung, Yong-Ho Lee, Soo-Jun Park
Electronics and Telecommunications Research Institute
Abstract:
Large volumes of biological data are being produced by laboratories and accessed through the web. Because these data are heterogeneous in format and representation, data integration has become a major issue in the biological domain. We provide a method for querying distributed data sources. We propose a way of expressing a user's dynamic needs to the data sources and have implemented a query processing system to execute the resulting queries. The model analyzes each query and extracts data in real time using wrappers. Using this model, we implemented a prototype integration system based on the mediator model.

Contact: melim [at] etri.re.kr

Keywords: Data Integration, Query Processing
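
The mediator and wrapper arrangement named in the abstract above can be sketched in a few lines. Each wrapper translates a query in the mediator's vocabulary into its own source's field names and translates the matching records back; the mediator fans the query out and merges the answers. The class names, field names and in-memory sources are all hypothetical and stand in for real remote data sources.

```python
class DictWrapper:
    """Wraps a source stored as a list of dicts with its own field names."""

    def __init__(self, records, field_map):
        self.records = records
        self.field_map = field_map        # mediator field -> source field

    def query(self, field, value):
        source_field = self.field_map[field]
        for rec in self.records:
            if rec.get(source_field) == value:
                # Translate the record back into the mediator's vocabulary.
                yield {m: rec.get(s) for m, s in self.field_map.items()}


class Mediator:
    """Fans a query out to every wrapper and merges the results."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, field, value):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.query(field, value))
        return results


# Two hypothetical sources that name the same attributes differently.
source_a = DictWrapper(
    [{"gene_symbol": "TP53", "org": "human"}],
    {"gene": "gene_symbol", "species": "org"},
)
source_b = DictWrapper(
    [{"symbol": "TP53", "taxon": "mouse"}, {"symbol": "BRCA1", "taxon": "human"}],
    {"gene": "symbol", "species": "taxon"},
)

merged = Mediator([source_a, source_b]).query("gene", "TP53")
print(merged)
```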


Poster D19
SABIO-RK: Integrating Reaction Kinetics Data for Systems Biology
Olga Krebs, Andreas Weidemann, Ulrike Wittig, Renate Kania, Martin Golebiewski, Saqib Mir, Isabel Rojas
Scientific Databases and Visualization Group, EML Research gGmbH
Abstract:
The mathematical modelling and simulation of biochemical reaction networks requires experimental data about reaction kinetics, describing the dynamics of the reactions with their respective parameters determined under certain environmental conditions. To address these needs we have developed an integrated database system, SABIO-RK (System for the Analysis of Biochemical Pathways - Reaction Kinetics), to store and offer access to information about metabolic pathways, biochemical reactions and their kinetics in a comprehensive, standardised and highly interrelated manner.

Contact: olga.krebs [at] eml-r.villa-bosch.de

Keywords: Database, Reaction Kinetics, Pathway


Poster D20
DigraBase - A Multigraph-based Data Store for Integrating Biological Data
Darren Otgaar, Junaid Gamieldien, Fernando Martinez, Dan Jacobson
National Bioinformatics Network, South Africa
Abstract:
We present DigraBase, a system for large-scale integration and annotation of distributed biological data. We have designed a multigraph-based data store for satisfying complex integration tasks across heterogeneous datasets. A graph-theoretic query engine has been constructed to assist the biologist in testing biological hypotheses in silico before committing to wet lab studies. DigraBase is a powerful tool for making sense of high-throughput biological data.

Contact: darren [at] nbn.ac.za

Keywords: Graph, Integration, Knowledge Base, Query
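
To make the multigraph idea in the abstract above concrete, here is a minimal sketch of a directed multigraph in which several labeled edges may connect the same pair of nodes, with a query that follows only edges of chosen types. The class, edge labels and node names are invented for illustration and do not reflect DigraBase's actual data model or query engine.

```python
from collections import defaultdict, deque


class MultiGraph:
    """Directed multigraph: several labeled edges may join the same node pair."""

    def __init__(self):
        self.edges = defaultdict(list)    # node -> [(label, neighbor), ...]

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def reachable(self, start, allowed_labels):
        """Breadth-first search following only edges with allowed labels."""
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for label, neighbor in self.edges.get(node, []):
                if label in allowed_labels and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen - {start}


# Hypothetical entities and relationship types.
g = MultiGraph()
g.add_edge("geneA", "encodes", "protA")
g.add_edge("protA", "interacts_with", "protB")
g.add_edge("protA", "co_cited_with", "protB")     # parallel edge, different evidence
g.add_edge("protB", "participates_in", "pathwayX")

print(g.reachable("geneA", {"encodes", "interacts_with"}))
```

Restricting the allowed labels lets the same store answer different questions, for example which pathways a gene can be linked to using interaction evidence only.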


Poster D21   Late-Breaking Results
The Seventh Layer of Clinical Genomics - An Infrastructure for Using Genomic Data in Healthcare
Amnon Shabo, Dolev Dotan, Ohad Greenshpan
IBM Research Lab in Haifa, Israel
Abstract:
Clinical Genomics is an interdisciplinary field dealing with the use of genetic and genomic data in actual clinical practice. Current Clinical Genomics IT solutions lack a layer of interrelations between specific data items in both domains, and offer the storage of clinical and genomic data in a side-by-side fashion, correlated only by a patient ID. Clinical Genomics Level Seven (CGL7) is a set of web services being developed at the IBM Haifa Research Lab, endeavoring to create those missing interrelations and provide higher level infrastructure for clinical decision support applications. CGL7 follows the "encapsulate & bubble-up" paradigm, which underlies the HL7 (Health Level Seven) Clinical Genomics Standards. Its main features are: (1) encapsulating raw genomic data in clinical information constructs, (2) annotating the genetic data of the patient using ontological methods and computational biology methods, (3) bubbling-up the most clinically-significant items and (4) linking them with known clinical phenotypes found in publicly-available sources as well as with clinical phenotypes found in the patient's electronic health record. CGL7 is designed to serve as a specialized infrastructure layer for decision support applications such as case-based reasoning applications that look for similar cases to support clinical decision at the point of care.

Contact: shabo [at] il.ibm.com


Poster D22
MetaboBase: A Computational Platform for Metabolomics Data Integration, Storage and Interpretation
Ilya Venger (1,2), Sergey Gerzon (1), Ilana Rogachev (1), Sergey Malitsky (1), Asaph Aharoni (1)
(1) Department of Plant Sciences, The Weizmann Institute of Science, Rehovot, Israel; (2) Department of Molecular Genetics, The Weizmann Institute of Science, Rehovot, Israel
Abstract:
Metabolomics data represent the concentrations of all the small molecules inside a living cell under a specific condition. To date, there is no general platform that unifies data acquired from different experimental platforms. We are developing MetaboBase, a database for the storage and analysis of metabolomics information. This platform will incorporate the storage of raw metabolomics data, standardized description of samples and information on known metabolites or metabolic markers. The processing capabilities of the program will link raw metabolomics data to known metabolites.

Contact: asaph.aharoni [at] weizmann.ac.il

Keywords: Metabolomics, Database, GC-MS, LC-MS