Documentation
Summary
- EMDB data model
- EMDB header data model
- EMDB segmentation data model
- Policies
- Search engine
- Chart builder
- FAQ
- Deposition
EMDB map data model
The EM Data Bank (EMDB) accepts and distributes 3D map volumes derived from several types of EM reconstruction methods, including single particle averaging, helical averaging, 2D crystallography, and tomography. Since its inception in 2002, the EMDB map distribution format has followed the CCP4 definition (CCP4 map format), which is widely recognized by software packages used by the structural biology community. The CCP4 map format is closely related to the MRC map format used in the 3DEM community (MRC map format); CCP4 is slightly more restrictive in that voxel positions are limited to a grid that includes the Cartesian coordinate origin (0,0,0). Further details can be found here.
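As a rough illustration of what the CCP4/MRC map header carries, the sketch below decodes a few fields from the first 1024 bytes of a map file using only the Python standard library. The function name `read_ccp4_header` and the returned dictionary keys are hypothetical, and little-endian byte order is assumed by default; a real reader should also consult the machine stamp and the extended header.

```python
import struct


def read_ccp4_header(raw: bytes, byte_order: str = "<") -> dict:
    """Decode a few fields from a 1024-byte CCP4/MRC map header.

    byte_order is a struct prefix ("<" little-endian, ">" big-endian);
    little-endian is an assumption made for this sketch.
    """
    if len(raw) < 1024:
        raise ValueError("a CCP4/MRC header is 1024 bytes long")

    # Words 1-10: columns, rows, sections, data mode,
    # start indices, and grid sampling along each cell edge.
    nc, nr, ns, mode, ncstart, nrstart, nsstart, nx, ny, nz = struct.unpack(
        byte_order + "10i", raw[0:40]
    )
    # Words 11-16: cell edge lengths (angstroms) and cell angles (degrees).
    a, b, c, alpha, beta, gamma = struct.unpack(byte_order + "6f", raw[40:64])

    # Voxel size is the cell edge divided by the grid sampling (if non-zero).
    voxel_size = tuple(
        edge / n if n else None for edge, n in ((a, nx), (b, ny), (c, nz))
    )
    return {
        "dimensions": (nc, nr, ns),
        "mode": mode,
        "start": (ncstart, nrstart, nsstart),
        "cell_lengths": (a, b, c),
        "cell_angles": (alpha, beta, gamma),
        "voxel_size": voxel_size,
        # Word 53 (byte offset 208) holds the ASCII tag "MAP ".
        "is_map": raw[208:212] == b"MAP ",
    }
```

In practice the bytes would come from something like `open(path, "rb").read(1024)`, where `path` points to a downloaded map file.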
EMDB header data model
Every EMDB entry has a header file containing metadata (e.g., sample, detector, microscope, image processing) describing the experiment. The header file is an XML file whose structure and content are described by an XSD data model. Because cryo-EM is a highly dynamic field, there is a constant need to adapt and modify the schema to keep it up to date with the most recent developments. We consult extensively with the EM community regarding such issues and version the schema according to the policy described here.
Data model version 1.9
This was the long-term stable version of the data model. It was replaced in 2018 with an updated model, but XML header files in version 1.9 continue to be distributed in parallel for at least one year to give EMDB users ample time to switch. Note that version 1.9 header files are generated on a best-effort basis by back-translation from more recent versions that are richer in content, so they will not contain all the information found in those versions.
Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 1.9 header files
Data model version 3.0 (current model)
This data model replaced version 1.9. Header files corresponding to both data models will be distributed in parallel, with a view to stopping distribution of the version 1.9 files in 2019 once users have had a chance to adopt version 3.0.
This version adds a number of features including:
- An improved description of direct electron detectors, specimen preparation and tomography experiments.
- A hierarchical description of the overall sample composition in combination with a low-level description of the macromolecular composition to allow the description of both molecular and cellular samples.
- Specific data items describing the half-maps and segmentations included with the entry.
Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 3.0 header files
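Because header files for both data models are distributed in parallel, a consumer may want to check which model a given file follows before parsing it. The sketch below is one minimal way to do that with the standard library. It assumes the root element of the header carries a `version` attribute beginning with numeric major and minor parts; the function names and any element names used with it are hypothetical and should be checked against the schema documentation.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def header_schema_version(xml_text: str) -> Tuple[int, int]:
    """Return (major, minor) from the root element's 'version' attribute.

    Assumption: the attribute exists and starts with 'major.minor'.
    """
    root = ET.fromstring(xml_text)
    version = root.get("version")
    if version is None:
        raise ValueError("root element carries no 'version' attribute")
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("expected a version of the form 'major.minor...'")
    return int(parts[0]), int(parts[1])


def is_legacy_header(xml_text: str) -> bool:
    """True if the header follows data model version 1.9."""
    return header_schema_version(xml_text) == (1, 9)
```

A caller could then route version 1.9 files to a legacy parser and everything else to a version 3.0 parser.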
EMDB segmentation data model
Segmentation is the decomposition of 3D volumes into regions that can be associated with defined objects. Following several consultations with the EM community (Patwardhan et al., 2012; Patwardhan et al., 2014; Patwardhan et al., 2017), the EMDB is developing tools to support deposition of volume segmentations with structured biological annotation, defined here as the association of data with identifiers (e.g., accession codes from UniProt) and ontologies taken from well-established bioinformatics resources. To our knowledge, none of the segmentation formats widely used in electron microscopy and related fields currently supports structured biological annotation. Third-party use of segmentations is further impeded by the large number of segmentation file formats and their lack of interoperability. EMDB therefore proposed an open segmentation file format called EMDB-SFF to capture basic segmentation data from application-specific segmentation file formats and to provide the means for structured biological annotation. In this way, EMDB-SFF will not only enable deposition of segmentations but also act as a file interchange format between different applications and facilitate analysis of 3D reconstructions. Furthermore, EMDB-SFF supports the description of multiple transforms for a segment, thus allowing a segment to describe the placement of a sub-tomogram average onto a tomographic reconstruction.
Model
EMDB-SFF files have the following features:
- Segmentation metadata:
  - name
  - version (of schema)
  - details (free-form text)
  - global external references, e.g. specimen scientific identifier
  - bounding box
  - primary descriptor contained, i.e. one of ‘three_d_volume’, ‘mesh_list’, or ‘shape_primitive_list’ (see schema documentation)
  - list of software used to create the segmentation (name, version, processing details)
  - list of transforms referenced by segments, e.g. the transform that places the sub-tomogram average in the tomogram
- Hierarchical ordering of segments through the use of segment IDs and parent IDs;
- Four geometrical representations of segments (volumes, contours, meshes, shapes);
- Storage of subtomogram averages and how they map into the parent tomogram through the use of transforms;
- List of associated external references per segment;
- List of associated complexes and macromolecules in a related EMDB entry
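To make the role of transforms concrete, the sketch below applies an affine transform to a 3D point, which is the operation needed to place a coordinate of a sub-tomogram average into the parent tomogram. It assumes the transform is given as a 3x4 matrix (a 3x3 rotation/scale part plus a translation column) in row-major order; that layout and the function name `apply_transform` are assumptions for illustration, not something stated in the text above.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def apply_transform(matrix: Sequence[Sequence[float]], point: Point) -> Point:
    """Apply a 3x4 affine transform (assumed row-major) to a 3D point.

    Each row is [r0, r1, r2, t]: the point is multiplied by the 3x3
    part and the fourth column is added as a translation.
    """
    if len(matrix) != 3 or any(len(row) != 4 for row in matrix):
        raise ValueError("expected a 3x4 matrix")
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)
```

Applying the same segment geometry with several different matrices gives one placement per copy of the averaged particle in the tomogram.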
Each segment in a segmentation can consist of two types of descriptors:
- textual descriptors;
- geometric descriptors.
Textual descriptors consist of either free-form text or standardised terms. Standard terms should be provided from a [published] ontology or list of identifiers.
Geometric descriptors can take one or more of the following representations:
- ‘three_d_volume’ for 3D volumes;
- ‘mesh_list’ for lists of meshes each of which consists of a set of vertices and polygons;
- lists of shape primitives (ellipsoid, cuboid, cone, cylinder).
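The model described above can be pictured with a small in-memory sketch. The classes below are purely illustrative: the class and attribute names are hypothetical, they are not the sfftk or sfftk-rw API, and the convention that a parent ID of 0 marks a top-level segment is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Geometric descriptor names as listed in the text above.
GEOMETRY_KINDS = ("three_d_volume", "mesh_list", "shape_primitive_list")


@dataclass
class Segment:
    segment_id: int
    parent_id: int = 0  # assumption: 0 means "no parent"
    description: str = ""  # free-form textual descriptor
    external_references: List[str] = field(default_factory=list)
    geometry: Dict[str, object] = field(default_factory=dict)

    def add_geometry(self, kind: str, payload: object) -> None:
        """Attach one geometric representation to this segment."""
        if kind not in GEOMETRY_KINDS:
            raise ValueError(f"unknown geometric descriptor: {kind}")
        self.geometry[kind] = payload


@dataclass
class Segmentation:
    name: str
    version: str
    details: str = ""
    segments: List[Segment] = field(default_factory=list)

    def children_of(self, parent_id: int) -> List[int]:
        """IDs of the segments whose parent is parent_id."""
        return [s.segment_id for s in self.segments if s.parent_id == parent_id]
```

A segment may hold more than one geometric representation at once, which is why `geometry` is keyed by descriptor name rather than holding a single value.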
Documentation
Download
The current schema (version 0.8.0.dev1) is available here.
Documentation
Complete documentation of the schema is available here.
Auxiliary Tools
sfftk-rw
sfftk-rw is a Python toolkit for reading and writing EMDB-SFF files only. It is part of a family of tools designed to work with EMDB-SFF files.
sfftk-rw has the following utilities:
- convert - interconvert between XML, HDF5 and JSON file formats of the EMDB-SFF data model;
- view - view a file summary
The full documentation is available at readthedocs.
Download
The latest version (0.7.1) runs only on Python 3 and may be installed using pip install sfftk-rw. Alternatively, the source code may be obtained from GitHub.
sfftk
sfftk provides a shell command and a Python API to process EMDB-SFF files.
The following utilities are available using sfftk:
- convert - Conversion of application-specific segmentation file formats to EMDB-SFF. Currently, sfftk supports the following formats:
  - AmiraMesh (.am)
  - Amira HyperSurface (.surf)
  - Segger (.seg)
  - EMDB Map masks (.map)
  - Stereolithography (.stl)
  - IMOD (.mod)
- notes - Annotation of EMDB-SFF files.
- view - Brief summaries of segmentation files.
Read the full documentation here.
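Before handing a file to the convert utility, it can be handy to check whether its extension is among the supported input formats listed above. The helper below is a hypothetical convenience for that check; it only encodes the list from this page and says nothing about how sfftk itself identifies formats.

```python
from pathlib import Path
from typing import Optional

# Extensions and format names taken from the list of supported formats above.
SUPPORTED_INPUTS = {
    ".am": "AmiraMesh",
    ".surf": "Amira HyperSurface",
    ".seg": "Segger",
    ".map": "EMDB Map mask",
    ".stl": "Stereolithography",
    ".mod": "IMOD",
}


def input_format(filename: str) -> Optional[str]:
    """Return the format name for a supported extension, else None."""
    return SUPPORTED_INPUTS.get(Path(filename).suffix.lower())
```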
Download
The latest development version (version 0.5.5.dev1) of sfftk may be downloaded/installed from PyPI or the source may be obtained from GitHub.
Publications
- Patwardhan, Ardan, Robert Brandt, Sarah J. Butcher, Lucy Collinson, David Gault, Kay Grünewald, Corey Hecksel et al. Building bridges between cellular and molecular structural biology. eLife 6 (2017).
- Patwardhan, Ardan, Alun Ashton, Robert Brandt, Sarah Butcher, Raffaella Carzaniga, Wah Chiu, Lucy Collinson et al. A 3D cellular context for the macromolecular world. Nature structural & molecular biology 21, no. 10 (2014): 841-845.
- Patwardhan, Ardan, José-Maria Carazo, Bridget Carragher, Richard Henderson, J. Bernard Heymann, Emma Hill, Grant J. Jensen et al. Data management challenges in three-dimensional EM. Nature structural & molecular biology 19, no. 12 (2012): 1203-1207.