The greatest challenge facing the molecular biology
community today is to make sense of the wealth of
data that has been produced by the genome sequencing projects.
Traditionally, molecular biology research was
carried out entirely at the experimental laboratory
bench, but the huge increase in the scale of data
being produced in this genomic era has made it
necessary to incorporate computers into the research
process.
Sequence generation and the subsequent storage, interpretation and analysis of sequences are entirely
computer-dependent tasks. However, the molecular biology of an organism is very complex,
with research being carried out at different levels including the genome, proteome,
transcriptome and metabolome. Following the explosion in the volume of genomic data,
similar increases in data have been observed in the fields of proteomics, transcriptomics and metabolomics.
The first challenge facing the bioinformatics community today is the intelligent and
efficient storage of this mass of data. It is then the community's responsibility to provide
easy and reliable access to this data. The data itself is meaningless before analysis,
and its sheer volume makes it impossible for even a trained biologist to begin
to interpret it manually. Therefore, incisive computer tools must be developed to allow the
extraction of meaningful biological information.
There are three central biological processes
around which bioinformatics tools must be developed:
DNA sequence determines protein sequence
Protein sequence determines protein structure
Protein structure determines protein function
The integration of information learned about these key biological processes should allow
us to achieve the long-term goal of a complete understanding of the biology of organisms.
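
The first of the three processes listed above, DNA sequence determining protein sequence, is the simplest to illustrate in code. The following is a minimal sketch in Python, assuming the standard genetic code, a coding-strand sequence and a single forward reading frame starting at the first base. The names CODON_TABLE and translate are hypothetical and chosen only for this illustration; real tools must also handle alternative genetic codes, the reverse strand and the other reading frames.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons (TTT, TTC, TTA, ...) to a one-letter amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence into a protein sequence.

    Assumes the reading frame starts at the first base. Translation stops
    at the first stop codon; codons containing unrecognised characters
    are reported as "X". Trailing bases that do not form a full codon
    are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

The other two processes, structure from sequence and function from structure, have no comparably simple lookup rule, which is part of why they call for more sophisticated bioinformatics tools.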