
IIMS Workshop - 3D_EM Data Format



Electron Microscopy Data Base (EMD): 3D-EM Macromolecular Structure Database

An Electron Microscopy Database has been set up at the European Bioinformatics Institute (EBI) with funding from the European Union project IIMS (http://www.ebi.ac.uk/pdbe/docs/IIMS.html).

What is the EMD aim?

EMD allows the management, organisation and dissemination of data on the structures of biological macromolecules solved by three-dimensional electron microscopy.

The new 3D-EM database will provide a facility for storing volume maps, relevant textual descriptors and data files containing figures and sections. Where applicable, the database will also contain layer-line data and structure factor files.

The database model has been integrated with the Macromolecular structure database at the EBI that also contains the PDB.

Will the data deposited in the EMD be accessible to anybody?

Yes. The author can place a release lock-in period of up to 4 years on the 3D-EM map, while the descriptive information will be released immediately, once it has been reviewed by the authors.

How can I deposit data in the EMD database?

The deposition system is now activated at:
http://www.ebi.ac.uk/pdbe-emdep/emdep

What is the information that can be deposited together with the 3D-EM map?

Apart from the 3D-EM map, other complementary information will be stored:

Textual descriptors:

Complementary data files:

A. Three orthogonal slices of the map (as image files, for easy
    visualization of the data)

    These will be mandatory and for immediate release. The author will
    choose the slices to be provided.

B. Supplementary figures (for illustrating important aspects)

C. 3D masks (for iso-surface rendering purposes; binary volume format)

    These will be optional and for immediate release.

D. Structure factors (only crystallography) and layer line data (only
    helical reconstruction)


What is the EMD map file format?

3D-EM volumes will be stored as binary files in CCP4 format.


During deposition, other EM map formats can be uploaded. These uploaded volume files will be automatically converted to CCP4 via the em2em program (http://www.imagescience.de/em2em/welcome.htm). For very large volume files, direct FTP access can be arranged by notifying the EMD at emdep@ebi.ac.uk.
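As an illustration of what "binary files in CCP4 format" means in practice, the following minimal sketch decodes a few leading fields of a CCP4 map header. It assumes the standard CCP4 layout (a 1024-byte header of 4-byte words) and little-endian byte order; the function names and the synthetic demo header are hypothetical and are not part of the EMD software.

```python
import struct

# Assumption: standard CCP4 map header, 1024 bytes of 4-byte words.
HEADER_SIZE = 1024
MODE_NAMES = {0: "1-byte integer", 1: "2-byte integer", 2: "4-byte float"}


def read_ccp4_header(raw: bytes) -> dict:
    """Decode a few leading header fields (little-endian byte order assumed)."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("a CCP4 header is 1024 bytes long")
    # Words 1-4: number of columns, rows, sections, and the data mode.
    nc, nr, ns, mode = struct.unpack_from("<4i", raw, 0)
    # Words 11-16: cell lengths (Angstroms) and cell angles (degrees).
    cell = struct.unpack_from("<6f", raw, 40)
    # Word 53 (byte offset 208) carries the identification string "MAP ".
    stamp_ok = raw[208:212] == b"MAP "
    return {
        "columns": nc,
        "rows": nr,
        "sections": ns,
        "mode": MODE_NAMES.get(mode, f"mode {mode}"),
        "cell": cell,
        "stamp_ok": stamp_ok,
    }


def make_demo_header() -> bytes:
    """Build a synthetic header for demonstration only (hypothetical values)."""
    header = bytearray(HEADER_SIZE)
    struct.pack_into("<4i", header, 0, 50, 50, 50, 2)
    struct.pack_into("<6f", header, 40, 190.0, 190.0, 190.0, 90.0, 90.0, 90.0)
    header[208:212] = b"MAP "
    return bytes(header)


info = read_ccp4_header(make_demo_header())
print(info)
```

For a real file, the first 1024 bytes read in binary mode would be passed to `read_ccp4_header`; a production reader should also inspect the machine stamp to determine byte order rather than assume it.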


Textual information

Together with the 3D-EM volume file, a set of textual annotations is included in the Electron Microscopy Database (EMD). This textual information is available for download in XML format. An EMD XML entry file is not only a well-formed XML document, but it also conforms to the EMD XML Schema.
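The difference between a well-formed document and a schema-conforming one can be illustrated with a minimal sketch. The check below only tests well-formedness using the Python standard library; it does not validate against the EMD XML Schema, which would need a schema-aware validator. The helper name and the two test strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML. No schema validation is done."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = "<emd_entry><deposition><title>demo</title></deposition></emd_entry>"
# In the next string <title> is never closed, so the document is not well formed.
bad = "<emd_entry><deposition><title>demo</deposition></emd_entry>"

print(is_well_formed(good))
print(is_well_formed(bad))
```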

For a complete definition of the EMD XML Schema, you can access:

  • XML Schema definition file (current version)
  • XML Schema on-line documentation

    The information relevant to an EMD entry will be contained in an <emd_entry> element, and will be uniquely identified by its accession code. The general layout of an EMD entry file will be as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <emd_entry ...>
       <deposition> ... </deposition>
       <map> ... </map>
       <sample> ... </sample>
       <experiment> ... </experiment>
       <processing> ... </processing>
       <supplement> ... </supplement>
    </emd_entry>

    The contents of the main sections of information within an EMD entry, which have been included in the example above, are described in the following sections:

    1. Deposition
    2. Map
    3. Sample
      1. Sample component
        1. Biochemical entity
    4. Experiment
      1. Sample preparation
      2. Vitrification
      3. Image recording
      4. Image scanning
    5. Processing
      1. Reconstruction
      2. Data set
    6. Supplemental material
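The general layout described here can be checked programmatically. The sketch below reports which of the six main sections are absent from an entry. Whether every section is mandatory is not stated in this document, so the function only reports and does not reject; the skeleton entry and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# The six main sections named in the general layout of an EMD entry.
EXPECTED_SECTIONS = [
    "deposition", "map", "sample", "experiment", "processing", "supplement",
]


def missing_sections(entry: ET.Element) -> list:
    """List the expected top-level sections that the entry does not contain."""
    present = {child.tag for child in entry}
    return [name for name in EXPECTED_SECTIONS if name not in present]


# Hypothetical skeleton entry with the <supplement> section left out.
skeleton = """<emd_entry>
   <deposition/>
   <map/>
   <sample/>
   <experiment/>
   <processing/>
</emd_entry>"""

entry = ET.fromstring(skeleton)
print(missing_sections(entry))
```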

    1. Deposition:

    Contains context information relevant to the EMD entry record
    XML tag <deposition>
    XPath location: /emd_entry/deposition
    Child elements:

    <title>

    Title: descriptive title for the EMD entry.

    <entry_release_date>

    Entry release date:
    date of release of the EMD entry

    <map_release_date>

    Map release date:
    date of release of 3D-EM volume file

    <author_list>

    List of authors:
    list of authors of the deposition.
    >> to be further described

    <primary_reference>

    Primary reference:
    citation of the primary publication related to the 3D-EM
    experiment
    >> to be further described
    Example:
     
    <deposition>
       <title>3D volume of the DnaB helicase</title>
       <entry_release_date>2002-08-13</entry_release_date>
       <map_release_date>2002-08-13</map_release_date>
       <author_list num_auth="2">
          <author order="1">
             <last_name>San Martin</last_name>
             <name_initials>C.</name_initials>
          </author>
          <author order="2">
             <last_name>Carazo</last_name>
             <name_initials>J.M.</name_initials>
          </author>
       </author_list>
       <primary_reference pub_type="journal_article" published="1">
          <journal_article pmdid="9562559">
             <year>1998</year>
             <journal>Structure</journal>
             <volume>6</volume>
             <issue>4</issue>
             <first_page>501</first_page>
             <last_page>509</last_page>
             <article_title>Three-dimensional reconstructions from cryoelectron microscopy 
             images reveal an intimate complex between helicase DnaB and its loading partner
             DnaC
             </article_title>
             <authors>San Martin C, Radermacher M, Wolpensinger B, Engel A, Miles CS, 
             Dixon NE, Carazo JM
             </authors>
          </journal_article>
       </primary_reference>
    </deposition>
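As a minimal sketch of how the deposition section might be read, the code below extracts the title, the two release dates and the ordered author list from a shortened copy of the example. It assumes that dates use the YYYY-MM-DD form seen in the example and that the `order` attribute defines author order; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Shortened copy of the deposition example (primary reference omitted).
DEPOSITION = """<deposition>
   <title>3D volume of the DnaB helicase</title>
   <entry_release_date>2002-08-13</entry_release_date>
   <map_release_date>2002-08-13</map_release_date>
   <author_list num_auth="2">
      <author order="1">
         <last_name>San Martin</last_name>
         <name_initials>C.</name_initials>
      </author>
      <author order="2">
         <last_name>Carazo</last_name>
         <name_initials>J.M.</name_initials>
      </author>
   </author_list>
</deposition>"""


def summarise_deposition(dep: ET.Element) -> dict:
    """Collect title, release dates and authors sorted by their order attribute."""
    authors = sorted(
        dep.findall("author_list/author"),
        key=lambda a: int(a.get("order")),
    )
    names = [
        f'{a.findtext("last_name")} {a.findtext("name_initials")}'
        for a in authors
    ]
    return {
        "title": dep.findtext("title"),
        "entry_release": date.fromisoformat(dep.findtext("entry_release_date")),
        "map_release": date.fromisoformat(dep.findtext("map_release_date")),
        "authors": names,
    }


summary = summarise_deposition(ET.fromstring(DEPOSITION))
print(summary["title"], summary["authors"])
```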


    2. Map
    XML tag <map>
    XPath location: /emd_entry/map
    Child elements:

    <file>

    File:
    name of the binary file where the 3D-EM volume is stored
    <data_type>
    Data type:
    logical data type used for storing values of the 3D-EM volume (4-byte float, 2-byte integer, etc.).
    <num_row> <num_col> <num_sec>
    Number of values (rows, columns and sections):
    number of elements in the 3D-EM volume along each dimension.
    <spacing_row> <spacing_col> <spacing_sec>
    Spacing [Å] (rows, columns and sections):
    length, in Angstroms, of each volume element along each of the three dimensions.
    <origin_row> <origin_col> <origin_sec>
    Origin [Å] (rows, columns and sections)
    {not included for the moment}.
    <minimum> <maximum> <average> <std>
    Data value statistics:
    minimum, maximum, average and standard deviation values of the 3D-EM volume
    <auth_threshold>
    Author's threshold:
    threshold value proposed by the author to perform a surface rendering of the 3D-EM volume
    <enforced_symmetry>
    Enforced symmetry:
    symmetry constraints applied to the data values of the 3D-EM volume
    >> to be further described
    Example:
     
    <map>
           <file class="map" format="ccp4">emd000100.map</file>
           <data_type>float</data_type>
           <num_row>50</num_row>
           <num_col>50</num_col>
           <num_sec>50</num_sec>
           <spacing_row units="A">3.8</spacing_row>
           <spacing_col units="A">3.8</spacing_col>
           <spacing_sec units="A">3.8</spacing_sec>
           <origin_row units="A">95.0</origin_row>
           <origin_col units="A">95.0</origin_col>
           <origin_sec units="A">95.0</origin_sec>
           <minimum>-9.1411295</minimum>
           <maximum>11.271612</maximum>
           <average>0.0001</average>
           <std>2.9369693</std>
           <auth_threshold>5.0</auth_threshold>
           <enforced_symmetry>
              <details>six-fold symmetry around Z axis</details>
           </enforced_symmetry>
    </map>
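The number of values and the spacing together give the physical size of the volume along each axis (count multiplied by spacing). The sketch below computes this from a shortened copy of the map example; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the map example: only the grid size and spacing are kept.
MAP_FRAGMENT = """<map>
   <num_row>50</num_row>
   <num_col>50</num_col>
   <num_sec>50</num_sec>
   <spacing_row units="A">3.8</spacing_row>
   <spacing_col units="A">3.8</spacing_col>
   <spacing_sec units="A">3.8</spacing_sec>
</map>"""


def box_size_angstrom(map_elem: ET.Element) -> tuple:
    """Return the box length in Angstroms along (row, col, sec)."""
    sizes = []
    for axis in ("row", "col", "sec"):
        count = int(map_elem.findtext(f"num_{axis}"))
        spacing = float(map_elem.findtext(f"spacing_{axis}"))
        sizes.append(count * spacing)
    return tuple(sizes)


box = box_size_angstrom(ET.fromstring(MAP_FRAGMENT))
print(box)
```

With 50 values at a spacing of 3.8 Å, each edge of the box is about 190 Å long.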


    3. Sample
    Describes the nature of the biological sample studied, corresponding to the
    reconstructed 3D-EM map
    Corresponds to XML tag <sample>
    XPath location: /emd_entry/sample
    Child elements:

    <name>
    Name:
    descriptive name for the biological sample.
    <aggregation_state>
    Aggregation state:
    either single particle, icosahedral particle, helix, or 2D crystal.
    <mol_wt_theo> <mol_wt_exp> <mol_wt_method>
    Molecular weight [MDa]:
    theoretical and experimental molecular weight of the sample. Details on the method used for experimental determination can be provided.
    <details>
    Details:
    any other relevant information on the biological sample.
    <sample_component_list>
    List of sample components:
    A sample is further characterised by the detailed description of each of its distinct components (corresponding to child elements <sample_component>).

    3.1. Sample component

    A sample component is either a cellular component, a protein, a nucleic
    acid, a virus, a ligand or a label.
    Corresponds to XML tag: <sample_component>
    XPath location: /emd_entry/sample/sample_component_list/sample_component
    Child elements:

    <name>
    Name:
    descriptive name for the sample component (curated, using names found in relevant databases; see Nomenclature)
    <auth_name>
    Author's name:
    descriptive name for the sample component given by the author
    <details>
    Details:
    any other relevant information on the sample component.
    <nomenclature>
    Nomenclature:
    includes cross-references to relevant databases for nomenclature (GO for cellular components, InterPro for proteins, and ICTVdb for viruses)
    >> to be further described
    <natural_source>
    Natural source:
    for components directly isolated or purified from their source
    >> to be further described
    <comp_degree>
    Compositional degree:
    <virus>
    Virus:
    in the case of a viral component, further information should be provided.
    Child elements:

    <empty>
    Empty viral particle:
    are the viral particles in the sample used to obtain the 3D-EM volume empty (i.e. do they lack nucleic acid)?
    <enveloped>
    Enveloped:
    do the viral particles have a lipid envelope?
    <entity_list>
    List of biochemical entities:
    If the sample component can be further described in terms of its biochemical composition, then a list of distinct biochemical items (<entity> elements) is provided.
    3.1.1. Biochemical entity
    Each of the biochemically distinct elements of a given sample component.
    Corresponds to XML tag <entity>
    XPath location: /emd_entry/sample/sample_component_list/sample_component/entity_list/entity
    Child elements:

    <name>
    Name:
    descriptive name for the biochemical entity.
    <oligomeric_details>
    Oligomeric details:
    <number_copies>
    Number of copies:
    <details>
    Details:
    any other relevant information on the biochemical entity.
    <mol_entity> or <poly_entity>
    An entity is either a molecular entity or polypeptide entity:
    3.1.1.a Molecular entity
    <mol_entity> child elements:

    <systematic_name>
    Systematic name:
    <common_name>
    Common name:
    <formula>
    Formula:
    3.1.1.b Polypeptide entity
    <poly_entity> child elements:
    <mutant_flag>
    Mutant flag:
    <mutation_string>
    Mutation string:
    <fragment_flag>
    Fragment flag:
    <sequence>
    Sequence:
    <source>
    Source:
    A source is either a natural source or an engineered source:
    <engineered_source>
    Engineered source:
    For sample components obtained from an expression system, a full description at the biochemical level should be provided.
    >> to be further described
    <natural_source>
    Natural source:
    >> to be further described

    Example:

    <sample>
       <name>DnaB helicase hexamer</name>
       <aggregation_state>single particle</aggregation_state>
       <mol_wt_theo units="MDa">0.314340</mol_wt_theo>
       <sample_component_list num_comp="2">
          <sample_component id="ID000001" class="protein">
             <name>DnaB helicase</name>
             <auth_name>DnaB</auth_name>
             <details></details>
             <nomenclature class="InterPro (protein)">
                <ref_interpro _id="IPR001198">
                   <name>DnaB helicase</name>
                </ref_interpro>
             </nomenclature>
             <comp_degree>exact</comp_degree>
             <entity_list num_elem="1">
                <entity id="ID000002" class="polypeptide entity">
                   <name>String</name>
                   <oligomeric_details>trimer of dimers</oligomeric_details>
                   <number_copies>6</number_copies>
                   <details>String</details>
                   <poly_entity>
                       <sequence>
                       MAGNKPFNKQQAEPRERDPQVAGLKVPPHSIEAEQSVLGGLMLDNERWDDVAERVVADDF
                       YTRPHRHIFTEMARLQESGSPIDLITLAESLERQGQLDSVGGFAYLAELSKNTPSAANIS
                       AYADIVRERAVVREMISVANEIAEAGFDPQGRTSEDLLDLAESRVFKIAESRANKDEGPK
                       NIADVLDATVARIEQLFQQPHDGVTGVNTGYDDLNKKTAGLQPSDLIIVAARPSMGKTTF
                       AMNLVENAAMLQDKPVLIFSLEMPSEQIMMRSLASLSRVDQTKIRTGQLDDEDWARISGT
                       MGILLEKRNIYIDDSSGLTPTEVRSRARRIAREHGGIGLIMIDYLQLMRVPALSDNRTLE
                       IAEISRSLKALAKELNVPVVALSQLNRSLEQRADKRPVNSDLRESGSIEQDADLIMFIYR
                       DEVYHENSDLKGIAEIIIGKQRNGPIGTVRLTFNGQWSRFDNYAGPQYDDE
                       </sequence>
                       <source>
                          <engineered_source>
                             <organism>Escherichia coli</organism>
                             <host_organism>Escherichia coli</host_organism>
                             <host_vector>pPS560</host_vector>
                             <host_vector_type>plasmid</host_vector_type>
                          </engineered_source>
                       </source>
                   </poly_entity>
                </entity>
             </entity_list>
          </sample_component>
       </sample_component_list>
    </sample>
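The nesting of sample, sample components and biochemical entities can be walked with the XPath locations given in this section. The following minimal sketch lists each entity with its parent component and joins a sequence that is wrapped over several lines. The fragment is a shortened copy of the example with the sequence truncated to its first 20 residues for illustration; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample example. The sequence is deliberately
# truncated to its first 20 residues to keep the illustration small.
SAMPLE = """<sample>
   <name>DnaB helicase hexamer</name>
   <sample_component_list>
      <sample_component id="ID000001" class="protein">
         <name>DnaB helicase</name>
         <entity_list num_elem="1">
            <entity id="ID000002" class="polypeptide entity">
               <number_copies>6</number_copies>
               <poly_entity>
                  <sequence>
                  MAGNKPFNKQ
                  QAEPRERDPQ
                  </sequence>
               </poly_entity>
            </entity>
         </entity_list>
      </sample_component>
   </sample_component_list>
</sample>"""


def list_entities(sample: ET.Element) -> list:
    """Return (component name, entity id, copies, sequence) for every entity."""
    rows = []
    for comp in sample.findall("sample_component_list/sample_component"):
        for entity in comp.findall("entity_list/entity"):
            raw = entity.findtext("poly_entity/sequence", default="")
            # Remove line breaks and indentation from the wrapped sequence.
            sequence = "".join(raw.split())
            copies = int(entity.findtext("number_copies"))
            rows.append((comp.findtext("name"), entity.get("id"), copies, sequence))
    return rows


rows = list_entities(ET.fromstring(SAMPLE))
print(rows)
```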


    4. Experiment
    Contains information relevant to the experimental techniques and methods used to obtain structural data.
    Corresponds to XML tag <experiment>
    XPath location: /emd_entry/experiment
    Child elements: <sample_preparation> <vitrification> <imaging> <image_scans>

    4.1. Sample preparation

    Contains information relevant to sample conditions prior to loading onto grid support.
    Corresponds to XML tag <sample_preparation>
    XPath location: /emd_entry/experiment/sample_preparation
    Child elements:

    <buffer>

    Buffer:
    describes the composition of the sample buffer.
    >> to be further described
    <crystal_grow>
    Crystal growth:
    describes the conditions for crystal growth.
    >> to be further described
    <sample_support>
    Sample support:
    describes the support of the sample.
    Child elements:
    <film_material>
    Film material
    <method>
    Method
    <grid_material>
    Grid material
    <grid_mesh_size>
    Grid mesh size
    <grid_type>
    Grid type
    <pretreatment>
    Pretreatment
    <details>
    Details

    4.2. Vitrification

    Contains information relevant to the method and cryogen used in rapid freezing of the sample on the grid prior to its insertion in the electron microscope
    XML tag: <vitrification>
    XPath location: /emd_entry/experiment/vitrification
    Child elements:

    <cryogen>
    Cryogen:
    substance used for rapid-freezing
    <humidity>
    Humidity [%]:
    humidity (%) in the vicinity of the vitrification process
    <temperature>
    Temperature [Kelvin]
    <instrument>
    Instrument:
    the type of instrument used in the vitrification process
    <method>
    Method:
    description of the technique used for vitrification
    <time_resolved_state>
    Time resolved state:
    The length of time between an event affecting the sample and the induction of vitrification, together with a description of the event
    <details>
    Details:
    any other relevant information on the vitrification process

    4.3. Image recording

    Contains information relevant to the microscope settings and parameters used to obtain the structural data.
    XML tag: <imaging>
    XPath location: /emd_entry/experiment/imaging
    Child elements:

    <microscope>
    Microscope:
    model and manufacturer of the electron microscope used.
    <specimen_holder>
    Specimen holder:
    details on the holder of the sample (type, model and/or manufacturer)
    <accelerating_voltage>
    Accelerating voltage [kV]:
    value of accelerating voltage (in kV) used for imaging
    <illumination_mode>
    Illumination mode:
    mode of illumination (e.g. flood beam, spot scan)
    <imaging_mode>
    Imaging mode:
    operating mode of the EM (e.g. bright field, dark field, diffraction)
    <nominal_cs>
    Nominal Cs [mm]:
    spherical aberration coefficient (Cs) in millimeters of the objective lens (nominal value)
    <nominal_defocus_min> <nominal_defocus_max>
    Nominal defocus [nm]:
    defocus value of the objective lens (in nanometers) used to obtain the recorded images
    <tilt_angle_min> <tilt_angle_max>
    Tilt angle [°]:
    angle at which the specimen was tilted to obtain recorded images
    <nominal_magnification>
    Nominal magnification:
    value of the magnification indicated by the microscope readout
    <calibrated_magnification>
    Calibrated magnification:
    magnification value obtained for a known standard just prior to, during or just after the imaging experiment
    <electron_source>
    Electron source:
    type of source of electrons (e.g. field emission, LaB6, tungsten)
    <electron_dose>
    Electron dose [electrons/Å²]:
    total electron dose received by the sample (electrons per square Angstrom)
    <energy_filter>
    Energy filter:
    type of energy filter spectrometer apparatus
    <energy_window>
    Energy window [eV]:
    energy filter range in electron volts (eV) set by spectrometer.
    <temperature>
    Temperature [Kelvin]:
    mean specimen stage temperature during imaging in the microscope
    <detector>
    Detector:
    detector used for recording images. Usually film or CCD camera.
    <details>
    Details:
    any other relevant information on the image recording phase.

    4.4. Image scanning

    Contains information on the image scanning device and parameters for digitization of the images.
    XML tag: <image_scans>
    XPath location: /emd_entry/experiment/image_scans
    Child elements:

    <scanner>
    Scanner:
    manufacturer, model and/or type of instrument used for digitization.
    <sampling_size>
    Sampling size [microns]:
    sampling step size (microns) set on the scanner.
    <details>
    Details:
    any other relevant information on the image scanning phase.


    Example:

    <experiment>
       <sample_preparation>
          <buffer>
             <details>
             protein samples were diluted (>50-fold) to 40 um/ml in 50 mM Tris.HCl pH 7.6,
             2mM DTT, 5mM MgCl2, 200 mM NaCl, 0.25 mM ADP
             </details>
          </buffer>
          <sample_support>
             <film_material>carbon</film_material>
             <grid_material>molybdenum</grid_material>
             <grid_type>holey</grid_type>
             <pretreatment>30 s of glow discharge</pretreatment>
          </sample_support>
       </sample_preparation>
       <vitrification>
          <cryogen_name>liquid ethane</cryogen_name>
          <instrument>double-side blotting device</instrument>
          <method>quick plunging</method>
       </vitrification>
       <imaging>
          <microscope>Philips EM 420</microscope>
          <accelerating_voltage units="kV">100.0</accelerating_voltage>
          <illumination_mode>spot scan</illumination_mode>
          <imaging_mode>bright-field</imaging_mode>
          <nominal_defocus_min units="nm">1.5</nominal_defocus_min>
          <nominal_defocus_max units="nm">1.5</nominal_defocus_max>
          <tilt_angle_min units="degrees">45.0</tilt_angle_min>
          <tilt_angle_max units="degrees">45.0</tilt_angle_max>
          <nominal_magnification>60000</nominal_magnification>
          <electron_dose units="e/A**2">10.0</electron_dose>
          <detector class="film">
             <detector_model>Kodak SO-163</detector_model>
          </detector>
       </imaging>
    </experiment>
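Several imaging parameters carry their unit in a `units` attribute. As a minimal sketch, the code below collects every such child of `<imaging>` as a (value, units) pair, using a shortened copy of the example; the function name is hypothetical, and elements without a `units` attribute are skipped.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <imaging> part of the experiment example.
IMAGING = """<imaging>
   <microscope>Philips EM 420</microscope>
   <accelerating_voltage units="kV">100.0</accelerating_voltage>
   <nominal_defocus_min units="nm">1.5</nominal_defocus_min>
   <nominal_defocus_max units="nm">1.5</nominal_defocus_max>
   <nominal_magnification>60000</nominal_magnification>
   <electron_dose units="e/A**2">10.0</electron_dose>
</imaging>"""


def quantities(elem: ET.Element) -> dict:
    """Map each child that has a units attribute to a (value, units) pair."""
    result = {}
    for child in elem:
        units = child.get("units")
        if units is not None:
            result[child.tag] = (float(child.text), units)
    return result


measured = quantities(ET.fromstring(IMAGING))
for tag, (value, units) in measured.items():
    print(f"{tag}: {value} {units}")
```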


    5. Processing:
    Corresponds to XML tag <processing>
    XPath location: /emd_entry/processing
    Child elements: <reconstruction> <em_dataset>

    5.1. Reconstruction

    Contains information on the 3D reconstruction procedure from the 2D projection images
    Corresponds to XML tag <reconstruction>
    XPath location: /emd_entry/processing/reconstruction
    Child elements:

    <method>
    Method:
    general method used for the 3D reconstruction, e.g. lattice line fitting, layer lines-Bessel functions, spherical harmonics, backprojection, ...
    <algorithm>
    Algorithm:
    <software>
    Software:
    information on the software packages used in the reconstruction procedure.
    >> to be further described
    <ctf_correction>
    CTF correction:
    method used for Contrast Transfer Function (CTF) compensation
    <resolution_by_author>
    Resolution (by author) [Å]:
    final resolution (in Angstroms) of the 3D reconstruction
    Child elements: <resol_row> <resol_col> <resol_sec> <method>
    <details>
    Details:
    any other relevant information on the reconstruction procedure.

    5.2. Data set

    Contains information on the data collected (2D projection images) from
    which the final 3D reconstruction was obtained.
    Corresponds to XML tag <em_dataset>
    XPath location: /emd_entry/processing/em_dataset
    Child elements: <xtal2D> or <icosahedron> or <helix> or <single_particle>
    Details on the data set depend on the nature of the experiment:

    5.2.a. 2D crystal:

    Corresponds to XML tag <xtal2D>
    XPath location: /emd_entry/processing/em_dataset/xtal2D
    Child elements:


    <a_length> <b_length> <c_length>
    Length (a, b, c) of unit cell [Å]:
    <alpha> <beta> <gamma>
    Angles (alpha, beta, gamma) of unit cell [°]:
    <plane_group>
    Two-sided plane group:
    <details>
    Details:
    <structure_factors>
    Structure factors:
    >> to be further described

    5.2.b. Icosahedron:

    Corresponds to XML tag <icosahedron>
    XPath location: /emd_entry/processing/em_dataset/icosahedron
    Child elements:

    <num_digital_images>
    Number of digital images:
    <num_projections>
    Number of projections:
    <t_number>
    T number:
    <euler_angle_distribution>
    Euler angle distribution:
    <details>
    Details:

    5.2.c. Helix:

    Corresponds to XML tag <helix>
    XPath location: /emd_entry/processing/em_dataset/helix
    Child elements:

    <delta_phi>
    Delta phi:
    <delta_z>
    Delta z:
    <hand>
    Hand:
    <axial_symmetry>
    Axial symmetry:
    <details>
    Details:
    <layer_lines>
    Layer lines:
    >> to be further described

    5.2.d. Single particle:

    Corresponds to XML tag <single_particle>
    XPath location: /emd_entry/processing/em_dataset/single_particle
    Child elements:

    <num_digital_images>

    Number of digital images:
    <num_projections>
    Number of projections:
    <t_number>
    T number:
    <euler_angle_distribution>
    Euler angle distribution:
    <details>

    Details:

    Example:

    <processing>
       <reconstruction>
          <method>angular refinement</method>
          <algorithm>algebraic reconstruction technique (ART)</algorithm>
          <software>
             <name>XMIPP</name>
          </software>
          <software>
             <name>SPIDER</name>
          </software>
          <resolution_by_author units="A">
             <resol_row>34.5</resol_row>
             <resol_col>34.5</resol_col>
             <resol_sec>34.5</resol_sec>
             <method>differential phase residual</method>
          </resolution_by_author>
       </reconstruction>
       <em_dataset>
          <single_particle>
             <num_digital_images></num_digital_images>
             <num_projections>568</num_projections>
             <details>
             only a prevalent view of the sample could be detected 
             in the 0 degrees tilt micrographs
             </details>
          </single_particle>
       </em_dataset>
    </processing>
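Since `<em_dataset>` holds exactly one of `<xtal2D>`, `<icosahedron>`, `<helix>` or `<single_particle>`, a reader has to find out which kind is present. The minimal sketch below does that and also collects the software names and the author's resolution from a shortened copy of the example; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the processing example.
PROCESSING = """<processing>
   <reconstruction>
      <software><name>XMIPP</name></software>
      <software><name>SPIDER</name></software>
      <resolution_by_author units="A">
         <resol_row>34.5</resol_row>
         <resol_col>34.5</resol_col>
         <resol_sec>34.5</resol_sec>
      </resolution_by_author>
   </reconstruction>
   <em_dataset>
      <single_particle>
         <num_projections>568</num_projections>
      </single_particle>
   </em_dataset>
</processing>"""

# The four alternative data set elements named in section 5.2.
DATASET_KINDS = ("xtal2D", "icosahedron", "helix", "single_particle")


def describe_processing(proc: ET.Element) -> tuple:
    """Return (software names, resolution per axis, data set kind)."""
    software = [s.findtext("name") for s in proc.findall("reconstruction/software")]
    res = proc.find("reconstruction/resolution_by_author")
    resolution = tuple(
        float(res.findtext(tag)) for tag in ("resol_row", "resol_col", "resol_sec")
    )
    kinds = [c.tag for c in proc.find("em_dataset") if c.tag in DATASET_KINDS]
    if len(kinds) != 1:
        raise ValueError("expected exactly one data set kind in <em_dataset>")
    return software, resolution, kinds[0]


software, resolution, kind = describe_processing(ET.fromstring(PROCESSING))
print(software, resolution, kind)
```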


    6. Supplemental material (other data files)

    Contains information on supplementary data items and material to the 3D-EM map
    Corresponds to XML tag <supplement>
    XPath location: /emd_entry/supplement
    Child elements:

    <slice_col> <slice_row> <slice_sec>

    Slices:
    three orthogonal slices of the map are provided for visualization purposes.

    <mask_set>

    Masks:
    a binary volume to segment or highlight relevant features of the 3D map (e.g.: specimen vs. background)

    <figure_set>

    Figures:
    any image with appropriate caption text to illustrate the 3D map.

    Example:

    <supplement>
       <slice_row num="25">
          <file class="image" format="CCP4">emd000100_row.map</file>
       </slice_row>
       <slice_col num="25">
          <file class="image" format="CCP4">emd000100_col.map</file>
       </slice_col>
       <slice_sec num="25">
          <file class="image" format="CCP4">emd000100_sec.map</file>
       </slice_sec>
       <figure_set num="2">
          <figure id="ID000003">
             <file class="image" format="gif">emd000100_f01.gif</file>
             <caption>
             Variation in the rotational energy corresponding to harmonics
             3 and 6 among successive slices of the reconstruction
             </caption>
          </figure>
          <figure id="ID000004">
             <file class="image" format="gif">emd000100_f02.gif</file>
             <caption>
             Tilt pairs of frozen hydrated DnaB. Untilted micrographs 
             on the left and 45° tilted micrographs on the right
             </caption>
          </figure>
       </figure_set>
       <mask_set num="1">
           <mask id="ID000005">
               <file class="mask" format="msk">emd000100_m01.msk</file>
               <caption>
               Mask corresponds to protein versus background, representing 100%
               of the expected volume (mean protein density: 1.33 g/cm3)
               </caption>
           </mask>
       </mask_set>
    </supplement>
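Every supplementary item points to its data through a `<file>` element with `class` and `format` attributes. As a minimal sketch, the code below groups the referenced file names by class, using a shortened copy of the example; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the supplement example: one slice, one figure, one mask.
SUPPLEMENT = """<supplement>
   <slice_row num="25">
      <file class="image" format="CCP4">emd000100_row.map</file>
   </slice_row>
   <figure_set num="1">
      <figure id="ID000003">
         <file class="image" format="gif">emd000100_f01.gif</file>
      </figure>
   </figure_set>
   <mask_set num="1">
      <mask id="ID000005">
         <file class="mask" format="msk">emd000100_m01.msk</file>
      </mask>
   </mask_set>
</supplement>"""


def files_by_class(supplement: ET.Element) -> dict:
    """Group the names of all referenced files by their class attribute."""
    grouped = {}
    # iter() visits every <file> element at any depth, in document order.
    for file_elem in supplement.iter("file"):
        grouped.setdefault(file_elem.get("class"), []).append(file_elem.text.strip())
    return grouped


grouped = files_by_class(ET.fromstring(SUPPLEMENT))
print(grouped)
```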