Electron Microscopy Data Bank

Documentation

EMDB map data model

The EM Data Bank (EMDB) accepts and distributes 3D map volumes derived from several types of EM reconstruction methods, including single particle averaging, helical averaging, 2D crystallography, and tomography. Since its inception in 2002, the EMDB map distribution format has followed the CCP4 definition (CCP4 map format), which is widely recognized by software packages used by the structural biology community. The CCP4 map format is closely related to the MRC map format used in the 3DEM community (MRC map format); CCP4 is slightly more restrictive, in that voxel positions are limited to a grid that includes the Cartesian coordinate origin (0,0,0). Further details can be found here.
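As an illustration of the map format, the following minimal sketch reads the leading fields of a 1024-byte CCP4/MRC header using only the Python standard library. It assumes a little-endian file and only checks the 'MAP ' identifier; the function names and the synthetic demo header are illustrative and are not part of any EMDB tool.

```python
import struct

# Leading header fields as laid out in the CCP4/MRC2014 header description.
# Assumption: little-endian byte order (the machine stamp is not inspected).
HEADER_SIZE = 1024
LEADING_FIELDS = struct.Struct("<10i6f3i3f")  # 22 four-byte words = 88 bytes


def parse_map_header(raw: bytes) -> dict:
    """Return grid size, start indices, cell and axis order from a map header."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("a CCP4/MRC header is 1024 bytes long")
    if raw[208:212] != b"MAP ":
        raise ValueError("missing 'MAP ' identifier at bytes 208-211")
    values = LEADING_FIELDS.unpack_from(raw, 0)
    return {
        "grid_size": values[0:3],        # NX, NY, NZ
        "mode": values[3],               # data type code, e.g. 2 = 32-bit float
        "start": values[4:7],            # NXSTART, NYSTART, NZSTART (integers)
        "sampling": values[7:10],        # MX, MY, MZ
        "cell_lengths": values[10:13],   # CELLA, in angstroms
        "cell_angles": values[13:16],    # CELLB, in degrees
        "axis_order": values[16:19],     # MAPC, MAPR, MAPS
        "density_stats": values[19:22],  # DMIN, DMAX, DMEAN
    }


def make_demo_header() -> bytes:
    """Build a synthetic header so the sketch runs without a real map file."""
    header = bytearray(HEADER_SIZE)
    LEADING_FIELDS.pack_into(
        header, 0,
        64, 64, 64, 2, 0, 0, 0, 64, 64, 64,
        128.0, 128.0, 128.0, 90.0, 90.0, 90.0,
        1, 2, 3,
        -1.0, 1.0, 0.0,
    )
    header[208:212] = b"MAP "
    return bytes(header)


info = parse_map_header(make_demo_header())
print(info["grid_size"], info["start"], info["axis_order"])
```

For a real map, the first 1024 bytes of the (decompressed) file would be passed to parse_map_header instead of the synthetic header.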

EMDB header data model

Every EMDB entry has a header file containing metadata (e.g., sample, detector, microscope, image processing) describing the experiment. The header file is an XML file, and its structure and content are described by an XSD data model. In a highly dynamic field such as cryo-EM, there is a constant need to adapt and modify the schema to keep it up to date with the most recent developments. We consult extensively with the EM community regarding such issues and version the schema according to the policy described here.
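Because the header is plain XML, it can be inspected with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a small embedded fragment. The element and attribute names in the fragment (emd, emdb_id, admin, title and so on) are assumptions made for illustration; the schema documentation is the authoritative source for the real names.

```python
import xml.etree.ElementTree as ET

# Illustrative header fragment. Element and attribute names are assumptions
# for this sketch and should be checked against the published XSD.
SAMPLE_HEADER = """
<emd emdb_id="EMD-0000" version="3.0">
  <admin><title>Example entry</title></admin>
  <sample><name>Example sample</name></sample>
  <map><file>emd_0000.map.gz</file></map>
</emd>
"""


def summarize_header(xml_text: str) -> dict:
    """Collect the root tag, its attributes and the top-level sections."""
    root = ET.fromstring(xml_text.strip())
    return {
        "root": root.tag,
        "attributes": dict(root.attrib),
        "sections": [child.tag for child in root],
        # Hypothetical path; returns None if the element is absent.
        "title": root.findtext("admin/title"),
    }


summary = summarize_header(SAMPLE_HEADER)
print(summary["attributes"].get("version"), summary["sections"])
```

Reading the version attribute first is one way for a client to decide which data model a given header file follows before interpreting the rest of it.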

Data model version 1.9

This has been a long-term stable version of the data model. It was replaced in 2018 with an updated model, but XML header files in version 1.9 continue to be distributed in parallel for at least one year to give EMDB users ample time to switch. Note that version 1.9 header files are generated on a best-effort basis: they are back-translated from more recent versions that are richer in content, and will therefore not contain all the information found in those versions.

Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 1.9 header files

Data model version 3.0 (current model)

This data model replaced version 1.9. Header files corresponding to both data models will be distributed in parallel, with a view to stopping the distribution of the version 1.9 files in 2019 once users have had a chance to adopt version 3.0.

This version adds a number of features including:

An improved description of direct electron detectors, specimen preparation and tomography experiments.
A hierarchical description of the overall sample composition in combination with a low-level description of the macromolecular composition to allow the description of both molecular and cellular samples.
Specific data items describing the half-maps and segmentations included with the entry.

Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 3.0 header files

EMDB segmentation data model

Segmentation is the decomposition of 3D volumes into regions that can be associated with defined objects. Following several consultations with the EM community (Patwardhan et al., 2012; Patwardhan et al., 2014; Patwardhan et al., 2017), the EMDB is in the process of developing tools to support deposition of volume segmentations with structured biological annotation, which is here defined as the association of data with identifiers (e.g., accession codes from UniProt) and ontologies taken from well-established bioinformatics resources. To our knowledge, none of the segmentation formats widely used in electron microscopy and related fields currently support structured biological annotation. Third-party use of segmentations is further impeded by the prevalence of segmentation file formats and their lack of interoperability.

EMDB therefore proposed an open segmentation file format called EMDB-SFF to capture basic segmentation data from application-specific segmentation file formats and provide the means for structured biological annotation. In this way, EMDB-SFF will not only enable depositions of segmentations but also act as a file interchange format between different applications and facilitate analysis of 3D reconstructions. Furthermore, EMDB-SFF supports the description of multiple transforms for a segment, thus allowing a segment to be used to describe the placement of a sub-tomogram average onto a tomographic reconstruction.
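To make the idea of multiple transforms per segment concrete, the sketch below places copies of one set of coordinates (standing in for a sub-tomogram average) at several positions. It assumes each transform is a 3×4 matrix combining a rotation with a translation; that representation and all names here are assumptions for illustration rather than a description of how EMDB-SFF stores transforms.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix3x4 = Sequence[Sequence[float]]  # three rows of (r0, r1, r2, translation)


def apply_transform(matrix: Matrix3x4, point: Point) -> Point:
    """Apply a 3x4 affine transform (rotation plus translation) to one point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


def place_copies(points: List[Point], transforms: List[Matrix3x4]) -> List[List[Point]]:
    """Return one transformed copy of the point set per transform."""
    return [[apply_transform(m, p) for p in points] for m in transforms]


# Two hypothetical placements of the same average in a tomogram.
SHIFT_X_10 = (
    (1.0, 0.0, 0.0, 10.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
)
ROTATE_Z_90_SHIFT_X_5 = (
    (0.0, -1.0, 0.0, 5.0),
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
)

average = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
placed = place_copies(average, [SHIFT_X_10, ROTATE_Z_90_SHIFT_X_5])
print(placed)
```

The geometry is stored once and only the transforms differ between copies, which is what makes this approach compact when a tomogram contains many instances of the same particle.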

Model

EMDB-SFF files have the following features:

Segmentation metadata:
- name
- version (of schema)
- details (free-form text)
- global external references, e.g. specimen scientific identifier
- bounding box
- primary descriptor contained, i.e. one of ‘three_d_volume’, ‘mesh_list’, or ‘shape_primitive_list’ (see schema documentation)
- list of software used to create the segmentation (name, version, processing details)
- list of transforms referenced by segments e.g. transform to place the sub-tomogram average in the tomogram
Hierarchical ordering of segments through the use of segment IDs and parent IDs;
Four geometrical representations of segments (volumes, contours, meshes, shapes);
Can store subtomogram averages and how they map into the parent tomogram through the use of transforms;
List of associated external references per segment;
List of associated complexes and macromolecules in a related EMDB entry.
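The hierarchical ordering through segment IDs and parent IDs can be pictured as a tree. The following sketch rebuilds that tree from (segment ID, parent ID) pairs and lists everything beneath a given segment. Using 0 as the parent ID of top-level segments is an assumption made for this example, as are the function names.

```python
from collections import defaultdict
from typing import Dict, List


def build_children(parent_of: Dict[int, int]) -> Dict[int, List[int]]:
    """Invert a segment-ID -> parent-ID mapping into parent -> children lists."""
    children = defaultdict(list)
    for segment_id, parent_id in parent_of.items():
        children[parent_id].append(segment_id)
    return dict(children)


def descendants(children: Dict[int, List[int]], root_id: int) -> List[int]:
    """Return all segments below root_id in depth-first order."""
    found = []
    stack = list(reversed(children.get(root_id, [])))
    while stack:
        segment_id = stack.pop()
        found.append(segment_id)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(children.get(segment_id, [])))
    return found


# Hypothetical segmentation: segments 1 and 2 are top level (parent 0),
# 3 and 4 are parts of 1, and 5 is a part of 3.
PARENT_OF = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3}

tree = build_children(PARENT_OF)
print(tree)
print(descendants(tree, 1))
```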

Each segment in a segmentation can consist of two types of descriptors:

textual descriptors;
geometric descriptors.

Textual descriptors consist of either free-form text or standardised terms. Standardised terms should be taken from a published ontology or list of identifiers.

Geometric descriptors can take one or more of the following representations:

‘three_d_volume’ for 3D volumes;
‘mesh_list’ for lists of meshes each of which consists of a set of vertices and polygons;
‘shape_primitive_list’ for lists of shape primitives (ellipsoid, cuboid, cone, cylinder).
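The split between textual and geometric descriptors can be sketched as a small in-memory structure. The classes below are a hypothetical illustration, not the sfftk or sfftk-rw API; only the three geometric type names come from the description above, and the accession value is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Geometric representation names as listed in the model description.
GEOMETRIC_TYPES = ("three_d_volume", "mesh_list", "shape_primitive_list")


@dataclass
class ExternalReference:
    """A standardised term: a resource name plus an identifier within it."""
    resource: str
    accession: str


@dataclass
class Segment:
    segment_id: int
    parent_id: int
    description: str = ""  # free-form textual descriptor
    external_references: List[ExternalReference] = field(default_factory=list)
    geometry: Dict[str, Any] = field(default_factory=dict)

    def add_geometry(self, kind: str, payload: Any) -> None:
        """Attach one geometric representation, rejecting unknown kinds."""
        if kind not in GEOMETRIC_TYPES:
            raise ValueError(f"unknown geometric descriptor: {kind!r}")
        self.geometry[kind] = payload


segment = Segment(segment_id=1, parent_id=0, description="example region")
# Placeholder accession, not a real identifier.
segment.external_references.append(ExternalReference("UniProt", "EXAMPLE-0001"))
segment.add_geometry("mesh_list", [{"vertices": [], "polygons": []}])
segment.add_geometry("shape_primitive_list", [{"shape": "ellipsoid"}])
print(sorted(segment.geometry))
```

Keeping the geometry in a mapping keyed by representation name reflects the statement that a segment may carry one or more representations at the same time.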

Documentation

Download

The current schema (version 0.8.0.dev1) is available here.

Documentation

Complete documentation of the schema is available here.

Auxiliary Tools

sfftk-rw

sfftk-rw is a Python toolkit for reading and writing EMDB-SFF files only. It is part of a family of tools designed to work with EMDB-SFF files.

sfftk-rw has the following utilities:

convert - interconvert between XML, HDF5 and JSON file formats of the EMDB-SFF data model;
view - view a file summary.

The full documentation is available at readthedocs.

Download

The latest version (0.7.1) runs only on Python 3 and may be installed using pip install sfftk-rw. Alternatively, the source code may be obtained from GitHub.

sfftk

sfftk provides a shell command and a Python API to process EMDB-SFF files.

The following utilities are available using sfftk:

convert - Conversion of application-specific segmentation file formats to EMDB-SFF. Currently, sfftk supports the following formats:
- AmiraMesh (.am)
- Amira HyperSurface (.surf)
- Segger (.seg)
- EMDB Map masks (.map)
- Stereolithography (.stl)
- IMOD (.mod)
notes - Annotation of EMDB-SFF files.
view - Brief summaries of segmentation files.
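When preparing files for conversion, it can help to check up front whether a file's extension matches one of the formats listed above. The helper below is a small stand-alone sketch built from that list; it is not part of sfftk, and its names are illustrative.

```python
from pathlib import Path
from typing import Optional

# Extensions and format names taken from the list of supported formats.
SUPPORTED_FORMATS = {
    ".am": "AmiraMesh",
    ".surf": "Amira HyperSurface",
    ".seg": "Segger",
    ".map": "EMDB Map masks",
    ".stl": "Stereolithography",
    ".mod": "IMOD",
}


def identify_format(filename: str) -> Optional[str]:
    """Return the format name for a filename, or None if it is not listed."""
    return SUPPORTED_FORMATS.get(Path(filename).suffix.lower())


for name in ("cell.SEG", "membrane.stl", "notes.txt"):
    print(name, "->", identify_format(name))
```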

Read the full documentation here.

Download

The latest development version (version 0.5.5.dev1) of sfftk may be downloaded/installed from PyPI or the source may be obtained from GitHub.
