Annotating data with CMPO

Annotating datasets

As we have already seen, there are many benefits to annotating data with ontology terms. Although this can be a daunting prospect for those new to ontologies, annotation is being made easier by simple tools designed for use by biologists.

EMBL-EBI provides the Webulous tool suite for collaborative ontology development, allowing users to populate ontology design patterns via simple user interfaces. Webulous has been designed to support a scenario where knowledge about a domain is collected in spreadsheets and transformed into statements within an ontology according to pre-defined ontology design patterns. CMPO is available via a Webulous Google Add-on for Google Spreadsheets, which lets you annotate your own data using drop-down validation boxes. Populated templates are then sent back to a server for processing. Domain experts can use this approach to annotate their own data and generate a basic ontology themselves.
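The idea of populating a design pattern from spreadsheet rows can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Webulous code: the pattern string, the column names, the allowed values and the function names are all hypothetical, and the allowed-value sets simply stand in for the drop-down validation boxes.

```python
import csv
import io

# Hypothetical design pattern: a phenotype is a quality that inheres in an entity.
PATTERN = "'{quality}' and inheres_in some '{entity}'"

# Hypothetical controlled vocabularies, standing in for drop-down validation lists.
ALLOWED = {
    "entity": {"nucleus", "cell", "mitotic spindle"},
    "quality": {"increased size", "decreased size", "absent"},
}

# Hypothetical spreadsheet content, one phenotype per row.
SHEET = """label,entity,quality
enlarged nucleus,nucleus,increased size
small cell,cell,decreased size
"""


def populate_pattern(row):
    """Validate one spreadsheet row and fill the pattern with its values."""
    for column, allowed in ALLOWED.items():
        if row[column] not in allowed:
            raise ValueError(f"{row[column]!r} is not an allowed {column}")
    # Extra keys such as 'label' are ignored by str.format.
    return PATTERN.format(**row)


def sheet_to_statements(text):
    """Turn CSV text into a mapping of label -> populated pattern."""
    rows = csv.DictReader(io.StringIO(text))
    return {row["label"]: populate_pattern(row) for row in rows}


if __name__ == "__main__":
    for label, statement in sheet_to_statements(SHEET).items():
        print(f"{label}: {statement}")
```

Rejecting values outside the allowed sets mirrors what a validation box does in the spreadsheet: every populated row is guaranteed to fit the pattern before it is processed further.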

Automating annotation

Of course, you may want to annotate much larger datasets that would take a long time to complete by hand. To speed up the annotation of high-throughput -omics data, we need to explore ways of automating the process. ZOOMA from EMBL-EBI is one such method; it aims to provide an automated service that takes free-text data descriptors and maps them computationally to existing ontology terms, so that the user does not have to carry out the mapping by hand.


Figure 8 Example ZOOMA interface with free-text and EFO term input.
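To make the idea of mapping free-text descriptors to ontology terms concrete, here is a toy sketch using approximate string matching from the Python standard library. It does not reproduce how ZOOMA works internally and does not call the ZOOMA service. The term labels, the `EX:` identifiers (placeholders, not real ontology accessions), the function name and the cutoff value are all assumptions made for illustration.

```python
import difflib

# Hypothetical label-to-identifier table; the identifiers are placeholders,
# not real ontology accessions.
TERMS = {
    "increased cell size": "EX:0000001",
    "decreased cell size": "EX:0000002",
    "round cell": "EX:0000003",
}


def map_descriptor(text, cutoff=0.6):
    """Return (label, identifier, score) for the closest label, or None."""
    # Normalise case and whitespace before comparing.
    query = " ".join(text.lower().split())
    best = difflib.get_close_matches(query, list(TERMS), n=1, cutoff=cutoff)
    if not best:
        return None  # nothing similar enough: leave for manual curation
    label = best[0]
    score = difflib.SequenceMatcher(None, query, label).ratio()
    return label, TERMS[label], round(score, 2)


if __name__ == "__main__":
    for descriptor in ["Increased  cell size", "increased cell sizes", "zzz qqq"]:
        print(descriptor, "->", map_descriptor(descriptor))
```

Returning a score alongside the match, and returning nothing below the cutoff, lets low-confidence descriptors be set aside for a curator to review instead of being annotated silently.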