Migrating to the cloud has long been problematic for the life science research community, because researchers need to access huge datasets spread across different storage types and locations from a variety of cloud environments.
These use cases were the primary reason why, in 2016, EMBL-EBI became a partner in the EC-funded Helix Nebula Science Cloud project. Helix Nebula Science Cloud is a pre-commercial procurement project that brings together public research organisations and commercial cloud providers in pursuit of a hybrid cloud platform to support data-intensive research activities. One of the most interesting products to emerge from the three-year project is Onedata.

Based in Poland and now in its sixth year, Onedata was formed by a group of developers working with use cases from the scientific research community. Their aim was to create a decentralised, open source hybrid cloud platform that solves large-scale data transfer and processing problems. It is a high-performance data management solution that offers unified data access across globally distributed environments and multiple types of underlying storage, allowing users to share, collaborate and perform computations on the stored data without modifying their applications.
Onedata provides the following features:
For any research organisation these four features are essential. Onedata can provide a single virtual storage system on top of multiple physical file systems distributed over any type of storage. It offers direct, block-level data access and on-the-fly data prefetching for efficient data-intensive computations. It is a tailored service for multi-cloud, grid, HPC and desktop environments, and it is compatible with enterprise-grade Linux distributions.
And if that’s not enough, it’s also fully open source. You can pay for support if you want it, but the tool itself is freely available.
There are many scenarios in which Onedata could be useful to EMBL-EBI:
Within the Technical Services Cluster, the Virtualisation & Cloud team are testing Onedata as a potential tool for providing Embassy Cloud virtual machines with access to the global filesystems at EMBL-EBI. Currently, these filesystems cannot be mounted on Embassy for security reasons, but Onedata’s security model would allow us to manage the access effectively enough for this to become a reality.
Work is also ongoing to provide Onedata access to pods inside Kubernetes deployments in a seamless and robust manner.
Onedata has a REST API for managing data, users and groups. This API can integrate with the ELIXIR AAI infrastructure using industry-standard protocols, eliminating the need to manage local accounts on every system. Permissions can be managed at either the user or the group level.
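To give a feel for what scripting against such a REST API might look like, here is a minimal Python sketch that builds an authenticated request for the current user's details. It is an illustration under stated assumptions, not a description of our deployment: the hostname onezone.example.org and the token value are hypothetical placeholders, and the /api/v3/onezone/user path and X-Auth-Token header reflect our reading of the public Onedata API documentation, so check them against the version you run. Obtaining a token through ELIXIR AAI is not shown.

```python
import json
import urllib.request

# Assumption: base path of the Onezone REST API, taken from the public
# Onedata documentation. Verify it against the version you deploy.
API_PREFIX = "/api/v3/onezone"


def build_onedata_request(host: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request.

    `host` and `token` are supplied by the caller. The X-Auth-Token
    header name is an assumption based on the Onedata documentation.
    """
    url = f"https://{host}{API_PREFIX}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"X-Auth-Token": token, "Accept": "application/json"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical host and token, for illustration only.
request = build_onedata_request("onezone.example.org", "/user", "REPLACE_WITH_TOKEN")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes the sketch easy to inspect without a live service. Against a real deployment, passing the request to fetch_json(request) would return the decoded response.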
The TSC’s Cloud Consultants are working towards a pre-production system, with the intention of having a production system available sometime next year.
Author: C.D. Tiwari (cdtiwari@ebi.ac.uk), Cloud Bioinformatics Application Architect