PREPARING ATOMIC COORDINATE DATA
1. ORGANIZATION OF COORDINATE SECTION
The complete asymmetric unit must be deposited except for virus coat
proteins.
The general organization of records in a Protein Data Bank
coordinate entry is as follows:
descriptive portion, including the complete sequence
coordinates for chain n
coordinates for het groups associated with chain n
coordinates for solvents associated with chain n
het groups/solvents not assigned to a specific chain
Repeat the middle three items for each chain in the entry.
2. MULTIPLE CHAINS
If the entry contains multiple chains, each chain must be
uniquely identified in the ATOM/HETATM records.
Note that it is VERY IMPORTANT that a chemically continuous chain
be represented as one chain, even if some of its residues were not
located. It is equally essential that a chain that has been cleaved
be represented as two chains.
3. RESIDUE NUMBERING
a. In the case of multiple identical chains, the residue
numbering should be the same for each chain (e.g., A 1 -
A 257 and B 1 - B 257, not A 1 - A 257 and B 301 - B 557).
If you have reasons for using a different numbering
scheme, provide an explanation in the REMARK section of
your deposition.
b. If the residue numbering is not sequential, provide an
explanation in the REMARK section of your deposition.
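The numbering rules above can be checked before deposition. The
following is a minimal sketch, not part of any PDB software. It assumes
the standard fixed-column ATOM layout (chain identifier in column 22,
sequence number in columns 23 - 26), which this section does not itself
spell out, and it ignores insertion codes. The function names are
hypothetical.

```python
def residue_numbers_by_chain(pdb_lines):
    """Map each chain identifier to its residue numbers, in file order."""
    chains = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]           # column 22 (assumed layout)
        res_seq = int(line[22:26])    # columns 23 - 26 (assumed layout)
        numbers = chains.setdefault(chain_id, [])
        if not numbers or numbers[-1] != res_seq:
            numbers.append(res_seq)   # record each residue only once
    return chains


def numbering_breaks(numbers):
    """Return (previous, current) pairs where numbering is not consecutive."""
    return [(p, c) for p, c in zip(numbers, numbers[1:]) if c != p + 1]
```

Comparing the lists for two identical chains (for example
chains["A"] == chains["B"]) shows whether they share one numbering
scheme. Any pair returned by numbering_breaks marks a place that calls
for an explanation in the REMARK section.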
4. SOLVENTS
a. Present solvent molecules in the symmetry position closest
to the associated macromolecule or provide an explanation
in the REMARK section of your deposition.
b. Identify water atoms as O, 1H and 2H, and identify the
residue as HOH. The sequence numbers must be numeric and
unique within each chain.
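The water numbering rule in item b can be checked with a short script.
This is a sketch under stated assumptions: it uses the standard
fixed-column layout (atom name in columns 13 - 16, alternate location
indicator in column 17, residue name in columns 18 - 20, chain
identifier in column 22, sequence number in columns 23 - 26), and it
counts one oxygen atom per water molecule. The function name is
hypothetical.

```python
from collections import Counter


def check_water_numbering(pdb_lines):
    """Return (non_numeric, duplicates) for the oxygen atoms of HOH residues."""
    non_numeric = []
    counts = Counter()
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[17:20] != "HOH" or line[12:16].strip() != "O":
            continue
        chain_id, alt_loc, field = line[21], line[16], line[22:26].strip()
        try:
            number = int(field)
        except ValueError:
            non_numeric.append((chain_id, field))
            continue
        # Alternate sites of one water legitimately share a sequence number,
        # so the alternate location indicator is part of the counting key.
        counts[(chain_id, number, alt_loc)] += 1
    duplicates = sorted({(c, n) for (c, n, _), k in counts.items() if k > 1})
    return non_numeric, duplicates
```

Both returned lists should be empty for a file that follows the rule.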
5. ALTERNATIVE CONFORMATIONS
a. If a residue has atoms occupying alternative sites, present
the atom records for that residue together sequentially,
with their alternate location indicators. Use the same
residue name, chain identifier, sequence number, and
insertion code for each alternate site of the same atom.
b. If the total occupancy for any atom presented in alternate
conformations differs from 1.0, provide an explanation
in the REMARK section of your deposition.
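The occupancy rule in item b can be checked by summing the occupancies
of all alternate sites of each atom. The sketch below assumes the
standard fixed-column layout (atom name in columns 13 - 16, alternate
location indicator in column 17, residue name in columns 18 - 20, chain
identifier in column 22, sequence number in columns 23 - 26, insertion
code in column 27, occupancy in columns 55 - 60). The function name and
the tolerance of 0.01 are illustrative choices, not PDB requirements.

```python
from collections import defaultdict


def altloc_occupancy_problems(pdb_lines, tolerance=0.01):
    """Return {atom key: total occupancy} for alternate-site atoms whose
    occupancies do not sum to 1.0 within the given tolerance."""
    totals = defaultdict(float)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[16] == " ":
            continue                  # no alternate location indicator
        key = (
            line[17:20],              # residue name
            line[21],                 # chain identifier
            line[22:26].strip(),      # sequence number
            line[26],                 # insertion code
            line[12:16].strip(),      # atom name
        )
        totals[key] += float(line[54:60])
    return {k: t for k, t in totals.items() if abs(t - 1.0) > tolerance}
```

An empty result means every atom with alternate sites has a total
occupancy of 1.0. Each key that is returned identifies an atom that
needs an explanation in the REMARK section.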
6. TERMINAL OXYGENS
a. Present terminal oxygen atoms as atom OXT for proteins,
and as O3P or O5* for nucleic acids. These appear with
the C-terminal, 3'- or 5'-terminal residues. The residue
name, chain identifier, sequence number and insertion code
must be the same as for the terminal residue.
b. For chains with gaps due to disorder, a residue at a gap
must not carry both OXT and O. Either rename one of them as
N of the next residue or remove it; the remaining atom must
be named O. OXT appears only at the true end of the chain.
7. OCCUPANCY AND/OR TEMPERATURE FACTOR FIELDS
If you use these fields to carry information other than
occupancy and temperature factor, explain this in the REMARK
section of your deposition. When temperature factors are
provided, the tempFactor field (columns 61 - 66) contains the
isotropic B value, even when ANISOU records are provided.
8. COORDINATE SYSTEMS AND TRANSFORMATIONS
a. The coordinates distributed by the Protein Data Bank give
the atomic positions measured in Angstroms along three
orthogonal directions. Unless otherwise specified, the
default axial system (detailed below) will be assumed.
b. If a, b, c describe the crystallographic cell edges and
A, B, C are unit vectors in the default orthogonal
Angstrom system, then:
1) A, B, C and a, b, c have the same origin.
2) A is parallel to a.
3) B is parallel to (a X b) X A (cross product between
C and A).
4) C is parallel to a X b (i.e., c*) (cross product
between a and b).
c. The matrix which premultiplies the column vector of the
fractional crystallographic coordinates to yield the
distributed coordinates in the A, B, C system is:
    a    b cos(gamma)    c cos(beta)
    0    b sin(gamma)    c (cos(alpha) - cos(beta) cos(gamma)) / sin(gamma)
    0    0               V / (a b sin(gamma))
where V = a b c (1 - cos**2(alpha) - cos**2(beta) - cos**2(gamma)
          + 2 cos(alpha) cos(beta) cos(gamma))**(1/2)
d. You need to supply along with the coordinates:
1) A transformation from the submitted to the orthogonal
coordinates that will be distributed by the PDB.
2) A transformation from the submitted to fractional
crystallographic coordinates.
e. The distributed entry will contain:
1) ORIGX - transformation from the distributed to the
submitted coordinates.
2) SCALE - transformation from the distributed to the
fractional coordinates.
If the submitted coordinates are fractions of the unit
cell edges or are in the default orthogonal system, the
ORIGX and SCALE transformations will be given default
values.
f. The MTRIX transformations express approximate or exact
non-crystallographic symmetry elements in the structure.
Provide these in the space of the submitted coordinates.
These transformations will be transformed so that they
operate in the distributed coordinate system.
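The matrix of item c and its inverse can be computed directly from the
cell constants. The sketch below is illustrative only: the function
names are hypothetical, angles are taken in degrees, and it relies on
the fact that in the default orthogonal system the transformation from
distributed to fractional coordinates (the role of SCALE in item e) is
the inverse of the matrix in item c.

```python
import math


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Matrix of item c: fractional -> orthogonal Angstrom coordinates."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    volume = a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [a,   b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0,    volume / (a * b * sg)],
    ]


def invert_upper_triangular(m):
    """Inverse of a 3 x 3 upper triangular matrix (orthogonal -> fractional)."""
    d0, d1, d2 = m[0][0], m[1][1], m[2][2]
    return [
        [1.0 / d0, -m[0][1] / (d0 * d1),
         (m[0][1] * m[1][2] - m[0][2] * d1) / (d0 * d1 * d2)],
        [0.0, 1.0 / d1, -m[1][2] / (d1 * d2)],
        [0.0, 0.0, 1.0 / d2],
    ]


def transform(matrix, vector):
    """Premultiply a column vector by a 3 x 3 matrix."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]
```

For example, transform(orthogonalization_matrix(a, b, c, alpha, beta,
gamma), [x, y, z]) converts fractional coordinates to Angstroms, and
applying the inverted matrix to the result recovers the fractional
values.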
9. NMR MODELS
a. Multiple models in NMR entries must be aligned.
b. Deposit multiple models in one file. Coordinates for
each model must begin with the record MODEL N and end with
the record ENDMDL. Models must be numbered sequentially
starting with 1.
c. If you are depositing a minimized average structure, the
structure must be in a separate file that does not contain
MODEL and ENDMDL records.
d. If you supply coordinates for lone pairs of electrons,
they should be presented in the REMARK section of this
deposition form. Please do not include them within the
coordinates.
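The MODEL and ENDMDL rules of item b can be verified before deposition.
The following is a minimal sketch with a hypothetical function name. It
checks only that MODEL and ENDMDL records are paired and that model
numbers run sequentially from 1, and it assumes the model number is the
first token after the record name.

```python
def check_model_records(pdb_lines):
    """Return a list of problems found in the MODEL/ENDMDL records."""
    problems = []
    expected = 1
    inside = False
    for n, line in enumerate(pdb_lines, start=1):
        record = line[:6].strip()
        if record == "MODEL":
            if inside:
                problems.append("line %d: MODEL before ENDMDL" % n)
            tokens = line[6:].split()
            number = int(tokens[0]) if tokens and tokens[0].isdigit() else None
            if number != expected:
                problems.append(
                    "line %d: expected MODEL %d, found %s" % (n, expected, number))
            expected += 1
            inside = True
        elif record == "ENDMDL":
            if not inside:
                problems.append("line %d: ENDMDL without MODEL" % n)
            inside = False
    if inside:
        problems.append("end of file: missing ENDMDL")
    return problems
```

An empty list means the records are consistent with item b. The sketch
does not test whether the models are aligned, as item a requires.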
COMPLETING THE AUTODEP DEPOSITION FORM
Provide all requested data as appropriate.
1. Atom names must be given in two (2) parts: the atom
name and its alternate location indicator. A question
mark (?) must be used as placeholder if the alternate
location indicator is null.
For example, CA ?, N ?, CG1 A, and CG1 B.
2. Residue names must be given in four (4) parts: the
residue name, chain identifier, sequence number, and
insertion code, using "?" as a placeholder for any null part.
For example:
GLY ? 31 ?
indicates GLY with a blank chain
identifier, sequence number 31, and no
insertion code
VAL B 235 A
refers to VAL in chain B, sequence
number 235, insertion code A
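The two naming conventions above can be generated mechanically. The
sketch below is illustrative: the function names are hypothetical, and
separating the parts with single spaces is an assumption based on the
examples shown here, not a documented AutoDep requirement.

```python
def _part(value):
    """Return the value, or "?" when it is missing or blank."""
    text = "" if value is None else str(value).strip()
    return text if text else "?"


def atom_entry(atom_name, alt_loc=None):
    """Two-part atom name: atom name and alternate location indicator."""
    return " ".join([atom_name.strip(), _part(alt_loc)])


def residue_entry(res_name, chain_id, seq_num, ins_code=None):
    """Four-part residue name: name, chain, sequence number, insertion code."""
    return " ".join(
        [res_name.strip(), _part(chain_id), str(seq_num), _part(ins_code)])
```

With these helpers, atom_entry("CG1", "A") gives "CG1 A" and
residue_entry("GLY", " ", 31) gives "GLY ? 31 ?", matching the examples
above.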