Policies and Processing Procedures

Authored by the EMDB team
Version: 1.0
Date: 2020/01/03
Valid from: 2020/01/03

Preface

This document details the policies and procedures governing the Electron Microscopy Data Bank (EMDB), a public archive for three-dimensional electron cryo-microscopy reconstructions and tomograms of bio-macromolecules and their complexes as well as of cellular components and cells. Since the archive organises its data into EMDB entries, this document also covers the policies regarding what constitutes an entry, as well as the requirements for entry depositions into the archive. It also documents the underlying entry data processing, the current processing procedures and the public release of information and data. EMDB aims to update its processing procedures in line with the evolving field of three-dimensional transmission electron cryo-microscopy (3DEM), structure determination techniques and practices, and annotation methods. Since EMDB is a Worldwide Protein Data Bank (wwPDB) core archive, exceptions to the EMDB policies for issues not covered in this document will be considered on a case-by-case basis by the wwPDB principal investigators. Queries about EMDB policies or procedures should cite the date and version given above and be sent to the wwPDB staff at deposit-help@mail.wwpdb.org.

Table of contents

  1. Definitions
    1. Abbreviations
    2. Terms
  2. Introduction

I. Definitions

A. Abbreviations

Abbreviation  Meaning
3DEM          3D (Transmission) Electron cryo-Microscopy
3DSEM         3D Scanning Electron Microscopy
EC            Electron Crystallography
EM            Electron Microscopy, usually understood to be transmission electron cryo-microscopy or cryo-tomography
EMDB          Electron Microscopy Data Bank
EMPIAR        Electron Microscopy Public Image Archive
ET            Electron cryo-Tomography
FSC           Fourier Shell Correlation
HD            Helical Diffraction
PDB           Protein Data Bank
PI            Principal Investigator
SP            Single Particle
SXT           Soft X-ray Tomography
wwPDB         Worldwide Protein Data Bank

B. Terms

Term                            Description
Atomic model                    Atomic coordinates that are determined by experimental measurements on sample specimens containing biological macromolecules
Author(s)                       Anyone designated by the PI(s) who has in any way contributed to the EMDB entry
Depositor                       Someone who is associated with a specific EMDB submission and who has permission to modify the deposition. The depositor may or may not be the PI. There is only one depositor for any given deposition
EM experiment                   Experiment(s) that result(s) in a final EM map
EM map                          All EM reconstructions, tomograms, volume maps, electric-potential maps, etc. are collectively referred to as “maps”
EMDB header file                An XML file with entry metadata that conforms to the EMDB data model
EMDB metadata                   Any data that describes and gives additional information about an EM experiment
EMPIAR archive                  The authoritative public copy of the EMPIAR archive as maintained by and at EMBL-EBI and from which any and all public EMPIAR files can be downloaded. There may be official mirror sites of the public archive
Half-maps                       A pair of EM maps derived from the original dataset that result from the division of the available data into two non-overlapping half-datasets that are independently processed
OneDep                          The deposition and annotation system used by wwPDB
OneDep communication module     A module within the OneDep system used for all communications between wwPDB and entry contact authors
PI(s)                           The PI of an EMDB entry is scientifically responsible for the study that generated the data in that entry. There may be more than one PI
Primary map                     The primary map is the EM map that is shown and described in the associated literature publication
Single-particle reconstruction  An EM sub-technique that uses images of single particles to generate a 3D reconstruction
Electron Tomography             An EM sub-technique whereby a tilt-series of images is used to generate a 3D reconstruction
wwPDB PIs                       The joint PIs of all core wwPDB member organisations

II. Introduction

The EMDB archives 3DEM reconstructions and tomograms resulting from several types of 3DEM methods (see 1.2 for the full list of accepted EM sub-techniques). The archive accepts and distributes 3DEM volume maps, auxiliary files, and metadata from EM experiments and organises them as entries. EMDB entries can be deposited with (map-model entries) or without (map-only entries) associated atomic coordinates through the wwPDB deposition and annotation system OneDep. This allows simultaneous deposition of EM maps to the EMDB archive and any corresponding atomic models to the PDB archive. Please refer to the OneDep deposition page for more details. Policies and procedures pertaining to atomic models are set out in the PDB policy document.

1. Entry requirements

1.1 Entry representation

An EMDB entry must contain one EM map, referred to as the primary map, resulting from an EM experiment. An image representing the primary map must be included in the entry. The entry may also contain other EM maps (referred to as additional maps) as well as auxiliary files. The table below lists the possible categories of additional maps and auxiliary files in an EMDB entry, the number of files allowed in each category, and their sub-categories.

Category (quantity allowed)  Sub-categories
Additional maps
  Raw map (1)                Unfiltered, unsharpened and unmasked raw map on which the primary map is based
  Half-maps (1 pair)         Unmasked raw half-maps for single-particle analysis and single-particle-based helical reconstructions
  Masks (any)                Primary/raw map mask, segmentation/focused refinement mask and half-map mask
  Segmentations (any)        Segments (3D regions-of-interest)
  Other EM maps (any)        Difference maps, maps showing alternative conformations and/or compositions, maps with differential processing (e.g. filtering, sharpening and masking)
Auxiliary files
  FSC (any)                  Half-map FSC, Map-model FSC, Cross-validation FSCs
  Structure factors (1)      Structure factors
  Layer-lines (1)            Layer-lines

For each EMDB entry (with the exception of ET) it is recommended (and likely to become mandatory in future) to provide at least four EM maps (items 1, 2 and 3 below, counting the two half-maps separately), all of which should be tagged/categorised:

  1. Primary map
  2. Raw map
  3. Unmasked half-maps
  4. Optional masks and the above auxiliary file categories

1.2 Entry acceptance

EMDB archives data from EM experiments using sample specimens of bio-macromolecules and their complexes, cellular components and cells. In order to accept a deposition, EMDB minimally requires the following data:

  1. Three-dimensional (3D) EM map
  2. Data-collection details
  3. Sample-composition information
  4. EM map-determination details
  5. Information about the authors

EM deposition data must result from one of these EM sub-techniques:

  1. 2D or 3D electron crystallography
  2. Helical reconstruction (single-particle-based or diffraction-based)
  3. Single-particle analysis
  4. Subtomogram averaging
  5. Tomography

An EMDB entry describes a single EM experiment. EM data should be collected on a single specimen, but may derive from multiple data-collection sessions.

It is mandatory to provide an EM map for an entry. More than one EM map may be provided (see 1.1), but only one is designated as the primary map.

EMDB archives 3DEM maps, whereas raw 2D image data from the same experiment can be deposited to the Electron Microscopy Public Image Archive (EMPIAR). EMPIAR also archives 3D maps from certain imaging modalities that are not accommodated by EMDB (e.g., 3DSEM, SXT).

For map-model entries, it is a requirement that the primary map and its derived atomic model are in the same orientation and position in coordinate space (i.e. the atomic model fits into the EM map).

1.3 Composite entry requirement

A composite map is a map constructed by combining two or more EM maps. It can be deposited as the primary map of an EMDB entry if the constituent maps are derived from the same EM specimen under the same conditions. A composite EM map will be accepted by EMDB on the condition that the raw consensus map from which these components originate is also deposited as a separate EMDB entry. Constituent EM maps must also be deposited as separate EMDB entries together with their individual raw maps.

2. Entry authorship

2.1 PI(s)

The PI(s) is responsible for the EM experiment that generated the EMDB data. It is the responsibility of the PI(s), either by themselves or via the designated contact author, to:

  • Upload and provide entry data and metadata or delegate this task to the depositor (see 2.2)
  • Assign entry, contact and citation authors for an entry
  • Ensure consent for data deposition to EMDB has been given by all entry authors (see 2.3)
  • If there is more than one PI for an entry, come to a mutual decision on all issues regarding the entry
  • Resolve any conflicts between entry authors
  • Request any changes to an EMDB entry
  • Approve the final version of all EMDB data
  • Notify wwPDB if an entry is to be made obsolete or withdrawn
  • Agree to public distribution of all entry data without restriction

Ultimately, it is the PI(s) who is responsible for the EMDB entry. Therefore, the PI(s) and anyone listed as a contact author in the entry are informed about deposition, status changes and release via email through the OneDep communication module. In some cases, the PI(s) may be contacted with questions directly.

2.2 Depositor

The depositor is responsible for entry submission. It is the responsibility of the depositor to:

  • Follow instructions from the PI(s)
  • Provide PI(s) contact information
  • Provide a verifiable email address to the OneDep system to initiate deposition
  • Provide all mandatory deposition data and metadata related to the EM experiment
  • Present all information accurately (entry, contact and citation author names and roles, entry title, experimental and publication details, etc.)

2.3 Entry author(s)

Entry authors are associated with an EMDB entry as assigned by the PI(s) and listed by the depositor during submission. Entry-author details must include last name and initials. A structural genomics centre can be listed as an entry author.

2.4 Contact author(s)

Contact authors are responsible for correspondence with wwPDB regarding the EMDB entry. All PIs are automatically also EMDB contact authors. At least one contact author, designated by the PI(s), should be "responsible for correspondence" with wwPDB. It is the responsibility of the latter contact author to:

  • Respond to any queries following deposition in a timely manner
  • Inform wwPDB of any desired changes to an EMDB entry on behalf of the PI(s)
  • Notify wwPDB when the primary citation is published

Anyone listed as a contact author in the entry is informed about deposition, modifications, status changes, release, obsoletion and withdrawal.

2.5 Citation author(s)

Citation authors are authors of the primary publication associated with an EMDB entry.

3. Deposition requirements

3.1 Deposition-data requirements

It is highly recommended that the EM data and metadata deposited are as complete as possible. There are specific deposition requirements for map, figure, FSC data and other EM data. These are outlined in the following subsections.

3.1.1 Map requirements

EM maps (primary map, masks, half-maps and any other additional maps) must be deposited in either CCP4 or MRC format. Deposited EM maps in the MRC format are converted to the CCP4 format following the CCP4 definition. EMDB distributes all maps in the CCP4 format.
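Before uploading, a depositor may want to confirm that a file at least looks like an MRC/CCP4 map. The following is a minimal sketch, not the check performed by OneDep. It assumes the common MRC2014/CCP4 header layout (a 1024-byte header, grid dimensions and data mode in the first four 32-bit words, and the identifier "MAP " at byte offset 208) and a little-endian file. The function name `read_map_header` is hypothetical, and some older files may lack the identifier.

```python
import struct

HEADER_SIZE = 1024       # bytes in the fixed MRC/CCP4 header (assumed layout)
MAP_STAMP_OFFSET = 208   # byte offset of the 'MAP ' identifier (word 53)


def read_map_header(header: bytes) -> dict:
    """Return grid size and data mode from a little-endian MRC/CCP4 header.

    Hypothetical helper for illustration; raises ValueError when the bytes
    do not look like a map header.
    """
    if len(header) < HEADER_SIZE:
        raise ValueError("header shorter than 1024 bytes")
    if header[MAP_STAMP_OFFSET:MAP_STAMP_OFFSET + 4] != b"MAP ":
        raise ValueError("missing 'MAP ' identifier")
    # Words 1-4: number of columns, rows, sections, and the data mode.
    columns, rows, sections, mode = struct.unpack_from("<4i", header, 0)
    if min(columns, rows, sections) <= 0:
        raise ValueError("non-positive grid dimensions")
    return {"columns": columns, "rows": rows, "sections": sections, "mode": mode}


def read_map_header_from_file(path: str) -> dict:
    """Read only the first 1024 bytes of a file and inspect them."""
    with open(path, "rb") as handle:
        return read_map_header(handle.read(HEADER_SIZE))
```

Reading only the header keeps the check cheap for large volumes. A fuller check would also honour the machine stamp for byte order, which this sketch deliberately ignores.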

3.1.2 Figure requirements

It is mandatory for every EMDB entry to have a figure representative of the entry; the depositor is required to provide such a figure which is not subject to any copyright restrictions. The figure format must be JPEG, GIF, TIFF, PNG or any other conventional image format.

3.1.3 FSC data requirements

If FSC data is provided for an EMDB entry, it must conform to the FSC XML schema. FSC data should be given as floating point values for Fourier shell correlation (y-coordinates; dimensionless numbers in the range [-1,+1]) against spatial frequency (x-coordinates; dimension of Å⁻¹) up to the Nyquist frequency. More than one FSC file may be deposited for each EMDB entry (see 1.1). The deposited FSC file must relate to one of the deposited EM maps in the entry.

EMDB accepts half-map FSC, map-model FSC, cross-validation FSC_work and cross-validation FSC_test files.
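The numerical constraints on FSC data can be checked before a curve is written out to the XML schema. The sketch below is illustrative only and does not read or write the FSC XML format. It assumes the curve is already available as (spatial frequency, correlation) pairs and that the Nyquist frequency is 1/(2 × voxel size). The function names and the requirement that frequencies increase strictly are assumptions made for this example.

```python
import math
from typing import Iterable, List, Tuple


def nyquist_frequency(voxel_size_angstrom: float) -> float:
    """Nyquist spatial frequency in 1/Å for a voxel size given in Å."""
    if voxel_size_angstrom <= 0:
        raise ValueError("voxel size must be positive")
    return 1.0 / (2.0 * voxel_size_angstrom)


def check_fsc_curve(points: Iterable[Tuple[float, float]],
                    voxel_size_angstrom: float,
                    tolerance: float = 1e-6) -> List[str]:
    """Return a list of problems found in (frequency, correlation) pairs."""
    problems = []
    limit = nyquist_frequency(voxel_size_angstrom)
    previous = -math.inf
    for index, (frequency, correlation) in enumerate(points):
        # Correlation values are dimensionless and must lie in [-1, +1].
        if not -1.0 <= correlation <= 1.0:
            problems.append(f"point {index}: correlation {correlation} outside [-1, +1]")
        # Frequencies are in 1/Å and must not exceed the Nyquist frequency.
        if frequency < 0 or frequency > limit + tolerance:
            problems.append(f"point {index}: frequency {frequency} outside [0, Nyquist]")
        # Assumption for this sketch: points are sorted by frequency.
        if frequency <= previous:
            problems.append(f"point {index}: frequencies are not strictly increasing")
        previous = frequency
    return problems
```

For example, with a voxel size of 1.0 Å the Nyquist frequency is 0.5 Å⁻¹, so a point at 0.6 Å⁻¹ would be reported.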

3.1.4 Other EM data requirements

For the EC, SP and HD sub-techniques, additional data may be deposited. EC and SP depositions may include a structure factor file that must be in mmCIF or MTZ format, and must at least include h, k, l, F, Sigma(F) (and/or I and Sigma(I)). HD entries may be supplemented with layer-line information.

4. Deposition and accession codes

4.1 Deposition code

A deposition code is provided as soon as a depositor creates a new EMDB or joint EMDB and PDB deposition session. It is a 10-digit number preceded by “D_” (D_xxxxxxxxxx, where x is a digit from 0-9). This code is a unique identifier used internally by the OneDep deposition and annotation system and should not be used to refer to the entry publicly.

4.2 EMDB accession code

An EMDB accession code is issued upon submission of a OneDep deposition. It consists of the string “EMD-” followed by a number with at least four digits (EMD-xxxx...x, where x is a digit from 0-9). The accession code is automatically issued and sent in an email to the contact author(s) of the entry and can be used to refer to the entry publicly (e.g., in publications). It is not possible to request a specific EMDB accession code or a range of EMDB codes. In future, the number of digits in EMDB accession codes will increase as the available range of codes is exhausted. EMDB codes already assigned will remain unchanged (i.e., they will not be left-padded with zeros). Accession codes of maps in the EMDB archive and their associated atomic models in the PDB archive will be cross-referenced in both archives.
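The two code formats described in this section can be expressed as regular expressions. This is a minimal sketch for checking the shape of a code in text or scripts. The pattern and function names are hypothetical, and a match says nothing about whether the code has actually been issued.

```python
import re

# "D_" followed by exactly ten digits (internal OneDep deposition code).
DEPOSITION_CODE = re.compile(r"D_[0-9]{10}")

# "EMD-" followed by four or more digits (public EMDB accession code).
ACCESSION_CODE = re.compile(r"EMD-[0-9]{4,}")


def is_deposition_code(text: str) -> bool:
    """True if the whole string has the D_xxxxxxxxxx shape."""
    return DEPOSITION_CODE.fullmatch(text) is not None


def is_accession_code(text: str) -> bool:
    """True if the whole string has the EMD-xxxx...x shape."""
    return ACCESSION_CODE.fullmatch(text) is not None
```

The explicit `[0-9]` class is used instead of `\d` so that only ASCII digits match, and the open-ended `{4,}` quantifier accommodates longer accession codes as the range expands.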

5. Processing procedures

5.1 Deposition process

Data deposition to EMDB consists of the following steps:

  • Maps are converted into standard CCP4 format
  • Data, metadata and information are provided in OneDep by the depositor
  • Data and metadata are validated by the OneDep system
  • wwPDB staff assist where there are issues with the deposition

Once the deposition has been submitted, the depositor may request entry changes by communication with the wwPDB biocurators.

5.1.1 EMDB data privacy

Upon release, EM data and metadata of an EMDB entry are distributed publicly without restriction.

5.1.2 Personal data privacy

The contact details of a depositor that are mandatory for an EMDB entry are first and last names, email address, phone number, ORCID iD, type of organisation and country. wwPDB, and by implication EMDB, keeps these contact details private and secure. Last name and initials of the entry author(s) and citation author(s) are also mandatory during deposition and are publicly available in EMDB. In the course of deposition, citation author(s) may also provide their ORCID iD, which is recorded in the entry’s header file.

5.2 Biocuration process

Deposited data enter the biocuration process which is carried out by wwPDB staff and includes the following steps:

  • Check for discrepancies between the EM data submitted and the metadata provided (e.g., mismatches in resolution reported in the metadata and inferred from FSC data)
  • Evaluation of EM data through visual inspection (e.g. whether an appropriate contour level is provided)
  • Checking that the primary map and its derived atomic model are in the same coordinate space (map-model entries)

If discrepancies are found during biocuration, they are communicated to the contact author(s) for resolution. Any issues that prevent successful processing of the EMDB entry will lead to a change of the status code to WAIT (see 5.4.3). If an issue cannot be resolved between wwPDB annotation staff and the contact author(s), the matter will be referred to the wwPDB PIs for a decision.

When biocuration is complete, the EMDB entry is sent to the contact author(s) for final review and approval. If no communication is received within three weeks, it is assumed that the entry is approved. Issues identified by the wwPDB biocuration staff or from the contents of the wwPDB validation report will be discussed with the authors in order to resolve them. An entry for which these issues cannot be resolved will be withdrawn upon expiration of the one-year hold unless a publication describing the entry is available. In that case, the entry will be released with a caveat record. If a publication describing a recently withdrawn entry appears in the literature, the withdrawn status of an entry may be reversed by wwPDB.

5.3 Validation process

EM validation reports are generated to provide quality assessment of EM maps and any PDB atomic models within a deposition. It is highly recommended that the official validation report generated after biocuration is provided to journal editors and referees as part of the manuscript submission and review process. The reports are date-stamped and display the wwPDB logo. For more information see the wwPDB Validation Reports home page.

5.4 Release

The release of an entry depends upon the fulfilment of the release requirements, the release instruction provided and the current status of an entry. These three aspects are described in further detail below, followed by an outline of the release process.

5.4.1 Release requirements

An EMDB entry can progress to the release phase within OneDep under the following conditions:

  • All deposition requirements have been met (see section 3)
  • The EMDB entry is correctly completed and successfully validated
  • All metadata are consistent with descriptions of the data
  • Biocuration is complete

For entries where an atomic model is also deposited with a primary map, it is a requirement that the coordinates satisfy the PDB policy conditions.

5.4.2 Release instruction

The release instruction is given within the deposition session. It is selected by the depositor during deposition. EMDB strongly encourages release instructions that lead to the release of entry data as soon as possible rather than those that put the data on hold. Descriptions of the available release instructions and their associated release codes are given below.

Immediate Release (REL) entries are to be released when the EMDB entry is approved.

Hold until PUBlication (HPUB) entries are placed on hold until publication or until one year from the date of deposition, whichever comes first.

Hold (HOLD) entries are placed on hold for up to one year from the date of deposition.

REL entries are scheduled for release after authors have approved the processed files. It is normal practice for authors to review and approve curated entries before they are released. If the contact author does not reply within three weeks after the validation report is made available to them, wwPDB will deem the entry to have been approved by the authors and, provided there are no outstanding issues with the entry, it will be released. Entries can be released without citation information and updated with this information at a later date.

HPUB/HOLD entries will be released either when release is requested by the authors or by a journal, or when the wwPDB becomes aware of a publication describing the entry.

HPUB/HOLD entries cannot be held for more than one year beyond the date of deposition. If an entry remains unreleased at the end of the hold period, it must either be released or withdrawn. Ten months after deposition, the authors of unreleased entries will be contacted to provide an answer as to whether they wish to release or withdraw the entry before the one-year anniversary of the deposition date.
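The hold-period dates follow directly from the deposition date. The sketch below computes the ten-month reminder and the one-year limit. It assumes calendar-month arithmetic with the day clamped to the end of shorter months (so a deposition on 29 February expires on 28 February the following year). This document does not state how such edge cases are handled, and the function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def hold_milestones(deposited: date) -> dict:
    """Dates at which authors are contacted and at which the hold ends."""
    return {
        # Authors of unreleased entries are contacted ten months after deposition.
        "reminder": add_months(deposited, 10),
        # Entries cannot be held beyond one year from the deposition date.
        "hold_expires": add_months(deposited, 12),
    }
```

For a deposition made on 2020-01-03, this gives a reminder date of 2020-11-03 and a hold expiry of 2021-01-03.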

Once made aware of a publication (electronic or print, whichever is published sooner) describing an EMDB entry, wwPDB will neither delay the release, nor permit the withdrawal, of that entry. Submissions to public preprint archives are deemed to be publications.

Publication dates and citation information can be received from authors, some journals, and the user community. In addition, scanning of the literature for publications is carried out. While every effort to track citations and release entries is made in a timely manner, it is ultimately the responsibility of the authors to notify wwPDB when publication occurs.

Authors may withdraw their unreleased entries, provided the publication citing the entry has not been published. Withdrawn entries will remain in the list of unreleased entries in the EMDB archive (status WDRN).

An entry with outstanding issues identified during biocuration or validation (see 5.2) that cannot be resolved will be withdrawn upon expiration of the one-year hold unless a publication describing the entry is available. In that case, the entry will be released by wwPDB staff.

In addition to the above instructions provided at the time of deposition, there are two further instructions regarding release (Withdrawn and Obsolete) that can be assigned to an entry.

Withdrawn (WDRN). An EMDB entry can only be withdrawn if it has not been released. Withdrawn entries have no data and metadata in the public EMDB archive. The assigned EMDB code will not be re-used.

Obsolete (OBS). An EMDB entry can be made obsolete if its data or metadata is already in the public archive. Any updates to a primary map will require obsoleting the current EMDB entry and depositing a new one. Upon obsoletion, the entry is moved to a separate area in the public EMDB archive and thus remains publicly accessible.

5.4.3 Entry status

The EMDB entry status describes the current state of an entry that is assigned within OneDep. The following status codes are possible:

Status Code  Description
AUTH         The entry has been annotated and is awaiting approval
HOLD         The entry is on hold for a year (see 5.4.2)
HPUB         The entry is on hold until publication (see 5.4.2)
OBS          The entry has been made obsolete
POLC         The entry is awaiting a policy decision; no processing will take place until a decision is made
PROC         The entry has been submitted and is being processed
REFI         The entry is awaiting the publication of an associated peer-reviewed journal article as the EM map is derived from archived raw data by another PI (*)
REL          The entry has been publicly released (see 5.4.2)
REPL         The entry primary map has been replaced and re-submitted, but is not processed
WAIT         The entry is awaiting further information before processing can be completed
WDRN         The entry has been withdrawn before public release (see 5.4.2)

(*) EM maps derived from data generated by other research groups (e.g., from data made publicly accessible in the EMPIAR archive) may be deposited, but will not be processed or released until an associated peer-reviewed publication becomes available. The same policy applies if an existing EMDB entry generated by another research group is used to produce a new model deposition. A model derived from an existing EMDB entry and generated by authors other than those of the entry, can reference the original EM map only if an associated peer-reviewed publication for the model is available. The model will then be referenced in the EMDB entry.
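For scripts that track entries, the status codes above map naturally onto an enumeration. The following is a sketch under stated assumptions: the class and helper names are hypothetical, the descriptions are abbreviated from the table, and the two helpers encode only the rules stated in this document (withdrawal is possible only before release; obsoletion only after release). They do not model the additional rule that an entry cannot be withdrawn once its publication is known.

```python
from enum import Enum


class EntryStatus(Enum):
    """EMDB entry status codes with abbreviated descriptions."""
    AUTH = "annotated, awaiting author approval"
    HOLD = "on hold for up to one year"
    HPUB = "on hold until publication"
    OBS = "made obsolete"
    POLC = "awaiting a policy decision"
    PROC = "submitted, being processed"
    REFI = "awaiting publication of an associated article"
    REL = "publicly released"
    REPL = "primary map replaced, not yet processed"
    WAIT = "awaiting further information"
    WDRN = "withdrawn before public release"


# Statuses whose files are accessible in the public archive.
PUBLIC_STATUSES = frozenset({EntryStatus.REL, EntryStatus.OBS})


def can_be_withdrawn(status: EntryStatus) -> bool:
    """Only entries that were never released (and are not yet withdrawn)."""
    return status not in PUBLIC_STATUSES and status is not EntryStatus.WDRN


def can_be_obsoleted(status: EntryStatus) -> bool:
    """Only entries that are currently released."""
    return status is EntryStatus.REL
```

A status string taken from a record can be converted with `EntryStatus["HPUB"]`, which raises `KeyError` for codes that are not in the table.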

5.4.4 Release process

Releasing an entry involves making the data and/or metadata available in the public EMDB archive. The release process includes the following steps:

  • After successful processing, a notification requesting final review of the entry is sent to the contact author(s), whatever the release instruction. Any issues with the EMDB entry can be raised with the wwPDB staff. If the contact author(s) has not raised any issues after three weeks, the EMDB entry is considered to be approved. Once approved, the EMDB entry status is changed to that of the requested release instruction and an email notification is automatically sent to the contact author(s).
  • Upon release, the files associated with the EMDB entry will be made available in the public archive.
  • The deadline for release requests is 12:00 noon on Thursday (local time at processing site).
  • The weekly EMDB release takes place every Wednesday at 00:00 UTC.
  • EMDB entries are available from the wwPDB distribution sites (EMDB, RCSB and PDBj). Instructions to retrieve EMDB data files and metadata are posted on the wwPDB website.
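The weekly release time can be computed from any timezone-aware timestamp. The sketch below returns the first Wednesday 00:00 UTC strictly after a given moment. The function name is hypothetical, and the sketch does not model which weekly release a request falls into relative to the Thursday deadline, since the list above does not spell out that mapping.

```python
from datetime import datetime, timedelta, timezone

WEDNESDAY = 2  # datetime.weekday() counts Monday as 0


def next_weekly_release(now: datetime) -> datetime:
    """Return the first Wednesday 00:00 UTC strictly after `now`."""
    if now.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    days_ahead = (WEDNESDAY - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    # If it is already Wednesday, 00:00 UTC has passed (or is exactly now),
    # so move on to the following week.
    if candidate <= now_utc:
        candidate += timedelta(days=7)
    return candidate
```

For example, for Friday 2020-01-03 12:00 UTC the function returns Wednesday 2020-01-08 00:00 UTC.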

6. Modification of entries

6.1 Entry changes before release

Before public release, most issues can be resolved by communicating with wwPDB staff. After submission of an EMDB entry, the deposition is locked for further modifications and changes. Additional data or information may be required, in which case the deposition may be temporarily unlocked by wwPDB staff. This option is used sparingly to avoid creating inconsistencies in the metadata.

6.2 Entry changes after release

In some cases, it may be necessary to modify an entry after release. This may apply to both the data and the metadata (e.g., update of the primary citation details). However, the primary map of an EMDB entry cannot be changed after release. Instead, the entry would need to be made obsolete and then replaced by an entirely new deposition. Occasionally, additional data files may need to be added or unintentionally uploaded files may need to be removed. Such changes should be discussed with wwPDB staff and should be justified by the contact author(s). Depending on the nature of the changes, it may take some time for them to be reflected in the public archive.

7. Removal of entries

7.1 Entry obsoletion

An EMDB entry can only be made obsolete after it has been released. Obsoleting an entry changes the status code of the deposition to OBS and moves its files out of the active part of the public archive into a separate area of the archive where the EMDB entry will still be publicly accessible. Entry obsoletion can be requested, and should be justified, by the contact author(s). Once acted upon, an email confirming the obsoletion is sent to the contact author(s). Obsoletion can be followed by superseding, whereby a new primary map is deposited. Superseding results in the assignment of a new EMDB accession code. In the case of map-model entries, obsoletion of the map will lead to obsoletion of the derived atomic model.

7.2 Entry withdrawal

If no data or metadata for an EMDB entry is publicly available, the entry can be withdrawn on request of the contact author(s). An entry that is withdrawn will never be released and no data and metadata will become publicly available. In OneDep internal records, the status code of the entry will be changed to WDRN. A list of all withdrawn EMDB entry IDs is available in the public archive. The contact author(s) cannot withdraw an entry once wwPDB is aware that the primary citation of the corresponding EMDB entry is published (see 5.4.2). Instead, the EMDB entry would need to be released, made obsolete, and then superseded (see 7.1).

7.3 Entry removal in unusual circumstances

Circumstances may arise in which the integrity, correctness, ownership or provenance of data is called into question. In such unusual circumstances, EMDB may receive a request to make the entry obsolete.

Examples of cases in which this might occur are:

  • The publication describing an EM map is retracted by (some of) the contact authors, their home institute, or the journal in which it was published
  • An official investigative body (e.g. the Office of Research Integrity in the USA) recommends retraction

In cases like these, efforts will be made to communicate with entry authors regarding the matter before a decision is made by the wwPDB PIs.