Discussion Meeting of Project: Integration of Information about Macromolecular Structure
(IIMS)
1. Metadescriptors
Discussion of which Metadescriptors should be mandatory and
which optional began by working through the Discussion Document
of 20th Feb 2001 from Monica Chagoyen (MC) and Richard Newman (RN).
A list of mandatory and optional data-items for the
3D-EM/MSD was generated, those areas requiring further discussion were
outlined, and plans were made to complete the 3D-EM Descriptors list.
2. Recruitment
Positions for Partners 3 and 4 remained unfilled following the
initial advertisement, and it was agreed to readvertise them.
3. Timetable
Plans were drawn up to present examples of 3D-EM
data-deposition, with the mandatory and optional data-items,
at the 3DEM Gordon Conference from 24th-29th June 2001.
In addition it was agreed that the IIMS project be
represented in a general discussion at the Gordon Conference, in the form of a
joint meeting with the RCSB and EBI, together with a Poster presentation.
4. File Format
Consideration will also be given to the image format and to header
information additional to that currently used in CCP4 format image files.
The aim here would be to store useful machine-generated information
produced during the processing of EM data.
This was agreed by the participants.
It was also agreed that journals should be approached and
advised to require a PDB code before publication of 3D-EM structures, as
is the case for X-ray and NMR structures.
Required Metadescriptors for the 3D-EM MSD
Remarks within /* */ are for comment, please.
MANDATORY DATA-ITEMS:
GENERAL INFORMATION
Author(s)
Corresponding author:
Name
Affiliation
Reference paper
Release information for volume data: (choose one of the following)
[ ] immediate release
[ ] release on publication
[ ] lock-in
please give desired hold period (up to 5 years): _____ years
DATA INFORMATION
EM map / volume
file /* did we agree finally on CCP4 format?
RN: yes, but extend header information (work in progress) */
title
size:
number of columns
number of rows
number of sections
spacing (nm):
along columns
along rows
along sections
enforced symmetry
volume origin (or center position): x, y, z
/* which is the standard coordinate system we adopt for 3D-EM MSD maps?
We need to define the relationship between (x,y,z) and (col,row,sec)
unambiguously, or refer always to col,row,sec.
RN: We have attached a suggested co-ord scheme at the end of this document */
/* Pending Crystallographic issues. Relevant points here:
1. We are interested in getting from the authors a map of the
biologically relevant unit (as is the case of the PDB/MSD for
atomic structures).
2. To be homogeneous across the database, maps should be represented
in a rectangular coordinate system.
3. Crystallographers usually work with the unit cell as the
coordinate frame.
MC: Questions:
- both in real and reciprocal space?
RN: yes, but real space co-ords are submitted to the PDB
- which is the content of a map reconstructed by crystallography?
(unit cell and the asymmetric unit?)
RN: asym. unit
- are we asking also for these maps?
RN: yes, for 2D and 3D xtals
MC: Do we solve it if we ask for the map of the biological unit (in a
rectangular coordinate frame) AND the final structure factors? In that
case, structure factors for crystallography should also be required.
RN: our EM volume map should be treated like the atomic coords
submitted to the PDB by X-ray crystallographers and NMR spectroscopists;
structure factors should be optional, as for X-ray structures.
RN: we should bear in mind the size of the volume map and how long it
would take to transfer. If we could apply symmetry operators to the
asymmetric unit and recreate the biological entity that way, it would
reduce transmission times for deposition. */
- resolution estimate:
- value along rows (A)
- value along columns (A)
- value along sections (A)
- /* the exact way of describing how these numbers were
obtained depends on the reconstruction schema */
For single particles:
- FSC curve
- Pictures/Illustrations (at least one): /* for immediate release */
- file
- Three orthogonal slice images: /* for immediate release */
row slice image file corresponds to row number in volume
column slice image file corresponds to column number in volume
section slice image file corresponds to section number in volume
BIOLOGICAL INFORMATION
Macromolecular complex:
Name
Aggregation state: /* a better name for this? */
(choose one of the following)
[ ] 3D crystal
[ ] 2D crystal
[ ] helix
[ ] icosahedral
[ ] single particles
[ ] individual structure
Name
Type (e.g. protein, RNA, DNA, lipid, sugar, nucleotide, metal ligand, etc.)
Source:
[ ] engineered source (i.e. expression system)
[ ] natural source
Name of (natural or gene) source organism
Engineered mutation(s) [y/n]
Additional information for (icosahedral?) viruses /* Steve, please check */
empty (y/n)
enveloped (y/n)
EXPERIMENTAL DETAILS
Microscope type:
[ ] TEM
[ ] STEM
Specimen temperature range: /* we need to work on the options allowed here */ RN: OK now?
[ ] helium temperature
[ ] nitrogen
[ ] other
Reconstruction method:
[ ] Lattice line fitting (i.e. crystallography)
[ ] Layer line (i.e. helical)
[ ] Spherical harmonics (i.e. icosahedral)
[ ] Other (including different single particle approaches and tomographic methods)
Crystal parameters:
unit cell data for 2D:
(gamma) (x, y)
plane group symmetry (for 2D crystals)
unit cell data for 3D:
(alpha,beta,gamma)
(x,y,z)
space group symmetry (for 3D crystals)
common phase origin ("undefined" value should be allowed)
/* how is it expressed? coordinate points in the 3D space??? */
Helical parameters: /* to be defined (Richard) */RN helix repeat C
layerline data:
rho (reciprocal radial coordinate)
amplitude, phase.
layerline value n (the Bessel function order)
layerline number l
Icosahedral parameters: /* to be defined (Steve) */
(RN: PDB PROCEDURE FOR PREPARING ATOMIC COORDINATE DATA
1. ORGANIZATION OF COORDINATE SECTION
The complete asymmetric unit must be deposited except for virus coat proteins.)
CONVENTIONS FOR 3D-EM Data at the 3D-EM MSD
1. We propose to adopt the same coordinate system as the PDB
data, i.e. the Right-Handed Cartesian Coordinate System (X, Y, Z) where
X = Y x Z
Y = Z x X
Z = X x Y
Graphically:
(out of page) Z .-------------- Y
|
|
|
|
X
Basic Matrix operations for rotation and translation in PDB
are given by:
R: rotation matrix
T: translation matrix
|X'| |R11 R12 R13| |X| |T1|
|Y'| = |R21 R22 R23|*|Y|+|T2|
|Z'| |R31 R32 R33| |Z| |T3|
In PDB files operations of this type are found, for example,
in the MTRIXn records for transformations expressing non-crystallographic
symmetry:
MTRIX1 = (M11, M12, M13, V1)
MTRIX2 = (M21, M22, M23, V2)
MTRIX3 = (M31, M32, M33, V3)
We propose also to adopt this convention for the
representation of relative orientations.
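As an illustration of the convention above, the following is a minimal sketch of applying a rotation R followed by a translation T to a point (X, Y, Z). The function name apply_transform and the example values are hypothetical, chosen only for illustration; they are not taken from the PDB format description.

```python
def apply_transform(rotation, translation, point):
    """Return R * point + T for a 3x3 rotation matrix and a 3-vector translation.

    rotation    : rows (R11 R12 R13), (R21 R22 R23), (R31 R32 R33)
    translation : (T1, T2, T3)
    point       : (X, Y, Z)
    """
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical example: a 90 degree rotation about Z, then a shift of 10 along X.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [10.0, 0.0, 0.0]

print(apply_transform(R, T, (1.0, 0.0, 0.0)))  # (10.0, 1.0, 0.0)
```

Each MTRIXn record supplies one row of R together with the matching component of T, so three records together define one such operation.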
2. Mapping 3D-EM volumes in Cartesian Coordinate System.
Each 3D-EM map has its own coordinate system in terms of the
rectangular grid formed by its voxels (i.e., in terms of columns (fastest), rows
and sections (slowest) when represented in a 1D array).
Our proposal for mapping a 3D-EM Volume grid in the PDB
Coordinate system is the following:
A. (col, row, sec) along (Y, X, Z)
B. Plus additional information on
- StartingVoxel = 0 or 1 (first voxel is labelled as 0 or as 1).
Our proposal: first voxel labelled as 0.
Additionally, the author should provide the following information:
- Spacing = (size_col, size_row, size_sec) (in nm or A)
- OriginMapping = 0 or 0.5 (at voxel vertex or center)
- VolumeOrigin (or center position): (O_col, O_row, O_sec)
This will position the final 3D-EM map unambiguously within
the same coordinate system as the atomic resolution data, making it possible
to define other geometric elements within this
coordinate frame (e.g. symmetry axes), as well as the relative
orientations between two structures (two 3D-EM maps or a 3D-EM map and an
atomic model).
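The proposed mapping could be sketched as follows. This is only one possible reading of the proposal: it assumes that VolumeOrigin is the physical position of the first voxel along (col, row, sec), in the same units as Spacing, and that OriginMapping is added to the zero-based voxel index before scaling. The function and parameter names are hypothetical.

```python
def voxel_to_cartesian(index, spacing, volume_origin,
                       starting_voxel=0, origin_mapping=0.0):
    """Map a voxel index (col, row, sec) to Cartesian (X, Y, Z).

    Assumptions (not fixed by the proposal itself):
    - volume_origin is given along (col, row, sec) in the units of spacing;
    - origin_mapping is 0 for the voxel vertex and 0.5 for the voxel center.
    """
    # Physical position along each grid axis: col, row, sec.
    col, row, sec = (
        origin + (i - starting_voxel + origin_mapping) * size
        for i, size, origin in zip(index, spacing, volume_origin)
    )
    # Proposal A: (col, row, sec) run along (Y, X, Z), so X takes the row value.
    return (row, col, sec)


# Hypothetical example: voxel (col=2, row=3, sec=4), referred to voxel centers.
xyz = voxel_to_cartesian(
    index=(2, 3, 4),
    spacing=(1.5, 1.5, 2.0),
    volume_origin=(0.0, 0.0, 0.0),
    starting_voxel=0,
    origin_mapping=0.5,
)
print(xyz)  # (5.25, 3.75, 9.0)
```

Note that the returned tuple swaps the first two grid axes, which is the practical consequence of placing columns along Y and rows along X.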
RN: PDB COORDINATE SYSTEMS AND TRANSFORMATIONS
a. The coordinates distributed by the Protein Data Bank
give the atomic positions measured in Angstroms along three
orthogonal directions. Unless otherwise specified, the
default axial system (detailed below) will be assumed.
b. If a, b, c describe the crystallographic cell edges and
A, B, C are unit vectors in the default orthogonal Angstrom system, then:
1) A, B, C and a, b, c have the same origin.
2) A is parallel to a.
3) B is parallel to (a X b) X A (cross product between C and A).
4) C is parallel to a X b (i.e., c*) (cross product between a and b).
c. The matrix which premultiplies the column vector of the
fractional crystallographic coordinates to yield the distributed coordinates in
the A, B, C system is:
| a   b cos(gamma)   c cos(beta)                                       |
| 0   b sin(gamma)   c (cos(alpha) - cos(beta) cos(gamma)) / sin(gamma) |
| 0   0              V / (a b sin(gamma))                               |
where V = abc (1 - cos**2(alpha) - cos**2(beta) - cos**2(gamma)
+ 2 cos(alpha) cos(beta) cos(gamma))**(1/2)
d. You need to supply along with the coordinates:
1) A transformation from the submitted to the orthogonal
coordinates that will be distributed by the PDB.
2) A transformation from the submitted to fractional
crystallographic coordinates.
e. The distributed entry will contain:
1) ORIGX - transformation from the distributed to the submitted coordinates.
2) SCALE - transformation from the distributed to the fractional coordinates.
If the submitted coordinates are fractions of the unit cell edges or are in the
default orthogonal system, the ORIGX and SCALE transformations will be given
default values.
f. The MTRIX transformations express approximate or exact
non-crystallographic symmetry elements in the structure. Provide these in the
space of the submitted coordinates. These transformations will be transformed
so that they operate in the distributed coordinate system.
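As a worked illustration of items c and e above, the sketch below builds the matrix of item c from the cell parameters, uses it to convert fractional coordinates to the orthogonal A, B, C system, and inverts it by back substitution to go from orthogonal to fractional coordinates (the direction that item e assigns to SCALE). It assumes angles given in degrees; all function names and the example cell are hypothetical.

```python
import math


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Matrix of item c: fractional -> orthogonal Angstrom coordinates.

    Cell edges a, b, c in Angstroms; angles alpha, beta, gamma in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    volume = a * b * c * math.sqrt(
        1 - ca ** 2 - cb ** 2 - cg ** 2 + 2 * ca * cb * cg
    )
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, volume / (a * b * sg)],
    ]


def fractional_to_orthogonal(m, frac):
    """Premultiply the fractional column vector by the matrix m."""
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))


def orthogonal_to_fractional(m, xyz):
    """Invert the upper-triangular matrix m by back substitution."""
    w = xyz[2] / m[2][2]
    v = (xyz[1] - m[1][2] * w) / m[1][1]
    u = (xyz[0] - m[0][1] * v - m[0][2] * w) / m[0][0]
    return (u, v, w)


# Hypothetical monoclinic cell, used only to show a round trip.
m = orthogonalization_matrix(10.0, 20.0, 30.0, 90.0, 100.0, 90.0)
xyz = fractional_to_orthogonal(m, (0.25, 0.5, 0.75))
back = orthogonal_to_fractional(m, xyz)
```

Because the matrix is upper triangular, no general matrix inversion is needed; the round trip should recover the starting fractional coordinates to within floating-point rounding.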