Enriched metadata fields in MGnify based on text-mining of associated publications

18 Nov 2021 - 17:52

Automated metadata annotations derived from microbiome publications are now available within the MGnify resource.

A major limitation in comparing microbiome datasets is often the lack of contextual metadata describing the sample or the experimental methods used to obtain the sequence data. To address this, we have partnered with Europe PMC to automatically extract relevant metadata terms from the publications associated with microbiome datasets, improving the range and depth of metadata available to the user. The annotations include terms relating to sequencing platform, extraction kit, primers, sample environment, and many more.

The annotations are generated by the EMERALD (Enriching MEtagenomics Results using Artificial intelligence and Literature Data) project, a collaboration between MGnify and Europe PMC that uses text-mining approaches to improve microbiome metadata. Within Europe PMC, key metadata terms are identified in publications, marked up within the text, and linked to relevant ontologies where possible. These annotated terms are then pulled into MGnify and summarised on the publication pages, as well as on the sample pages.

Within MGnify, the annotations are presented as a hierarchical tree that lets the user navigate the terms and see each one in context, as a snippet of the surrounding text from the publication. From this snippet, it is possible to jump directly to the corresponding position in the publication on Europe PMC to investigate the annotated term in more detail.
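
To make the idea of a hierarchical tree with contextual snippets more concrete, here is a minimal Python sketch. It is not MGnify or Europe PMC code: the record fields (`category`, `term`, `prefix`, `suffix`), the helper names `build_tree` and `render`, and the sample sentences are all assumptions made up for illustration, and the real annotation format may differ.

```python
from collections import defaultdict

# Hypothetical annotation records. The field names and example sentences are
# invented for illustration and do not reflect the real MGnify or Europe PMC schema.
annotations = [
    {"category": "Sequencing platform", "term": "Illumina MiSeq",
     "prefix": "Amplicons were sequenced on an ", "suffix": " instrument."},
    {"category": "Sequencing platform", "term": "Illumina MiSeq",
     "prefix": "Reads from the ", "suffix": " run were quality filtered."},
    {"category": "Sample environment", "term": "soil",
     "prefix": "DNA was extracted from ", "suffix": " collected at three sites."},
]


def build_tree(records):
    """Group records as category -> term -> list of context snippets."""
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        # Mark the annotated term with brackets inside its surrounding text.
        snippet = f"{rec['prefix']}[{rec['term']}]{rec['suffix']}"
        tree[rec["category"]][rec["term"]].append(snippet)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {category: dict(terms) for category, terms in tree.items()}


def render(tree):
    """Return the tree as indented text: category, then term, then snippets."""
    lines = []
    for category in sorted(tree):
        lines.append(category)
        for term in sorted(tree[category]):
            snippets = tree[category][term]
            lines.append(f"  {term} ({len(snippets)})")
            lines.extend(f"    {snippet}" for snippet in snippets)
    return "\n".join(lines)


tree = build_tree(annotations)
print(render(tree))
```

Grouping first by category and then by term mirrors the navigation described above: a user expands a metadata category, picks a term, and reads each occurrence in the context of its surrounding sentence.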

You can read more about this new feature in our blog post.
