What is EMPIAR?

EMPIAR, the Electron Microscopy Public Image Archive, is a free public resource for raw two-dimensional image data from 3D bioimaging experiments, as well as for certain 3D bioimaging datasets. For example, it accepts 3D datasets obtained using transmission or scanning electron microscopy and electron or soft X-ray tomography.

Why do we need EMPIAR?

EMPIAR provides easy access to the state-of-the-art raw data underpinning 3D cryo-EM structures of biomacromolecules and molecular machines. It complements the Electron Microscopy Data Bank (EMDB), where the corresponding 3D structures are stored, and the Protein Data Bank (PDB), which stores atomic models of macromolecular structures. It is powered by the BioImage Archive, a resource that stores and distributes biological images that are useful to life-science researchers (Figure 1). EMPIAR data have been used to facilitate the development and validation of methods, which in turn should lead to better 3D structures.

Figure 1 Distribution of data between archives.

The development of EMPIAR was prompted by calls from the electron microscopy community highlighting the urgent need to archive raw image data related to EMDB structures. Notably, two workshops run by EMBL-EBI and the Open Microscopy Environment (OME) kick-started the process, and in 2014 EMBL-EBI received funding to develop the resource.

The ability to store data in EMPIAR is important because it enables new science and facilitates and accelerates method development in this rapidly evolving field. Archived data can be reused multiple times, which also increases the number of citations the original study receives.

EMPIAR has been designed to handle very large datasets, with sizes in the terabyte range. It uses the Aspera software plugin and supports Globus to enable the safe transfer of large datasets. As of March 2020, EMPIAR contains 282 entries averaging ~900 GB in size, with 62 entries exceeding 1 TB each.
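
To give a feel for the scale these figures imply, the rough arithmetic can be sketched as follows. This is only a back-of-the-envelope illustration: the entry count and mean size are the March 2020 figures quoted above, while the 1 Gbit/s link speed and the function names are assumptions made for the example and say nothing about the speeds that Aspera or Globus actually achieve.

```python
# Back-of-the-envelope figures; only the entry count and mean size come from the text.
ENTRY_COUNT = 282      # entries as of March 2020 (from the text)
MEAN_ENTRY_GB = 900    # approximate mean entry size in GB (from the text)

# ASSUMPTION: a sustained 1 Gbit/s link, chosen purely for illustration.
ASSUMED_LINK_GBIT_PER_S = 1.0


def total_size_tb(entries: int, mean_gb: float) -> float:
    """Approximate archive size in decimal terabytes (1 TB = 1000 GB)."""
    return entries * mean_gb / 1000


def transfer_hours(size_gb: float, link_gbit_per_s: float) -> float:
    """Idealised transfer time in hours, ignoring protocol overhead."""
    seconds = size_gb * 8 / link_gbit_per_s  # GB -> Gbit, then divide by the rate
    return seconds / 3600


if __name__ == "__main__":
    total = total_size_tb(ENTRY_COUNT, MEAN_ENTRY_GB)
    hours = transfer_hours(MEAN_ENTRY_GB, ASSUMED_LINK_GBIT_PER_S)
    print(f"Approximate total archive size: {total:.1f} TB")
    print(f"One average entry at the assumed link speed: {hours:.1f} h")
```

Under these assumptions the archive holds roughly 254 TB in total, and a single average entry would take about two hours to download even on an uninterrupted 1 Gbit/s connection, which is why tools built for large transfers matter here.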