Data integration is a longstanding challenge for bioinformatics (Figure 18) but can be a tremendously powerful means of gathering evidence for or against a hypothesis. For example, integrating data from transcriptomics, proteomics and metabolomics experiments can help to build evidence that a particular pathway is involved in a disease, or in resistance to a drug.

Some of the challenges associated with data integration:
Many data resources: there are many to maintain, new ones keep appearing, only 20% have a sustainable future, and they are not always easy to find.
Different query interfaces.
Variable results: formats, schemas and data content differ between resources.
Redundancy and inconsistency within and between resources.
Figure 18 Some of the challenges associated with data integration. Based on a figure provided by Sandra Orchard.
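To make the schema and consistency challenges concrete, here is a minimal Python sketch. It assumes two hypothetical resources that describe the same proteins with different field names and capitalisation; all accessions, gene names, pathway values and function names are invented for illustration and do not come from any real database. The sketch maps both sources onto one shared schema, collapses redundant records, and flags entries where the sources disagree.

```python
# Hypothetical records from two resources (all values are invented).
resource_a = [
    {"accession": "P00001", "gene": "GENE1", "pathway": "glycolysis"},
    {"accession": "P00002", "gene": "GENE2", "pathway": "apoptosis"},
    {"accession": "P00001", "gene": "GENE1", "pathway": "glycolysis"},  # redundant copy
]
resource_b = [
    {"id": "p00001", "gene_symbol": "gene1", "pathway_name": "Glycolysis"},
    {"id": "P00002", "gene_symbol": "GENE2", "pathway_name": "Cell cycle"},  # disagrees with A
    {"id": "P00003", "gene_symbol": "GENE3", "pathway_name": "Apoptosis"},
]

# Map each resource's field names onto one shared schema.
KEYS_A = {"accession": "accession", "gene": "gene", "pathway": "pathway"}
KEYS_B = {"id": "accession", "gene_symbol": "gene", "pathway_name": "pathway"}


def normalise(record, key_map):
    """Rename fields to the shared schema and normalise capitalisation."""
    out = {target: record[source].strip() for source, target in key_map.items()}
    out["accession"] = out["accession"].upper()
    out["gene"] = out["gene"].upper()
    out["pathway"] = out["pathway"].lower()
    return out


def integrate(sources):
    """Merge records by accession, collecting the distinct values per field."""
    merged = {}
    for records, key_map in sources:
        for record in records:
            norm = normalise(record, key_map)
            entry = merged.setdefault(
                norm["accession"], {"gene": set(), "pathway": set()}
            )
            entry["gene"].add(norm["gene"])
            entry["pathway"].add(norm["pathway"])
    return merged


def find_conflicts(merged):
    """Return the fields for which the sources report more than one value."""
    conflicts = {}
    for accession, fields in merged.items():
        clashing = {
            name: sorted(values)
            for name, values in fields.items()
            if len(values) > 1
        }
        if clashing:
            conflicts[accession] = clashing
    return conflicts


merged = integrate([(resource_a, KEYS_A), (resource_b, KEYS_B)])
conflicts = find_conflicts(merged)
print(sorted(merged))
print(conflicts)
```

Using sets means that redundant records collapse automatically, while genuine disagreements remain visible as fields with more than one value. In practice, resolving those disagreements usually needs curation or further evidence rather than code alone.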

As with systems modelling, data integration helps you to generate hypotheses, but it must be combined with experimental approaches to test them.
