Hello again -
In what I think was private email (I get confused with so many discussions
going on) Peter asked where to find the 5HVP example file. I used to be able
to generate the example directly from the dictionary, but I found when I tried
to do that just now that recent versions of the dictionary have broken my
program that does this (I know, I know, that's the problem with tools that
are just hacks).
Anyway, I thought it might be worth sending all of you the most recent
version of the example that I have - it is dated 1993-10-15. Have fun.
Paula
- - - - -
THE MACROMOLECULAR CIF DICTIONARY - 15 Oct 1993
Paula Fitzgerald
Merck Research Laboratories
P. O. Box 2000, Ry50-105
Rahway, New Jersey 07065
(908) 594-5510 (voice)
(908) 594-5510 (FAX)
paula_fitzgerald@merck.com (email)
(n.b. CIF example current as of 15-Oct-93, but notes have not been updated
since 20-May-93).
CIF = Crystallographic Information File
CIF is a subset of STAR (Self-defining Text Archive and Retrieval format;
S.R. Hall (1991) J. of Chemical Information and Computer Sciences,
31, 326-333). The format is suitable for archiving all types of text and
numerical data, in any order. The goals of CIF are generality, upward
compatibility, flexibility and electronic publication.
CIF was developed by the IUCr Working Party on Crystallographic Information, in
an effort sponsored by the IUCr Commission on Crystallographic Data and the
IUCr Commission on Journals. The result of this effort was a dictionary of
data items sufficient for archiving the small molecule crystallographic
experiment and its results (S.R. Hall, F.H. Allen and I.D. Brown (1991)
Acta Cryst. A47, 655-685). This dictionary was adopted by the IUCr at its
1990 Congress in Bordeaux. CIF is now the format in which structure papers
are submitted to Acta Crystallographica Section C; software has been
developed to automatically typeset a paper from a CIF.
In 1990, the IUCr formed a working group to expand the dictionary to include
data items relevant to the macromolecular crystallographic experiment.
This working group is chaired by Paula Fitzgerald (Merck); the members of
the group are Enrique Abola (Protein Data Bank), Helen Berman (Rutgers),
Phil Bourne (Columbia), Eleanor Dodson (York), Art Olson (Scripps),
Wolfgang Steigemann (Martinsried), Lynn Ten Eyck (UCSD) and Keith Watenpaugh
(Upjohn).
The short term goal of the working group is to fulfill the mandate set by the
IUCr: to define CIF data names that need to be added to the core CIF
dictionary in order to adequately describe the macromolecular
crystallographic experiment and its results. The working group also has
long term goals: to provide sufficient data names so that
the experimental section of a structure paper could be written automatically
and to facilitate the development of tools so that computer programs can
easily interface directly with the CIF. This involves generating a
community-wide consensus about the completeness and accuracy of the data
names and soliciting the involvement of the community in the development of
the needed tools.
The two and a half years in which the macromolecular CIF effort has been
underway have coincided with years of great change at the Protein Data Bank.
The exponentially increasing volume of coordinate depositions demands a
completely automated data processing protocol. In addition, widespread
frustration has been growing with the fact that so much valuable information
is stored in free-format remarks, and with the limitations imposed by the
current fixed-format PDB data structure. Both of these factors have
caused the PDB to realize that a new format is needed, and thus the PDB has
decided that it will adopt CIF (or a subset of CIF) as its new exchange
format.
A draft version of the macromolecular CIF dictionary is now largely complete.
Data items have been added to the CIF core to describe the phasing process,
to describe more fully the quality of the diffraction data, and to describe
the results of structure refinement. In addition to these experimental
matters, a key effort has been made to define descriptors for the structure
that will allow the user (crystallographer and non-crystallographer alike)
to rapidly extract the biological structure (as distinct from the contents
of the asymmetric unit). Other data items have been developed to maintain
compatibility with the current PDB format.
Most data items now have entries in the draft version of the dictionary,
although in some instances the definitions are sketchy, and in some cases
probably downright wrong. The working group has found at this point that
the best way to reveal difficulties in the dictionary is to work through
examples.
Although we hesitate to begin circulating a document that we know still is
not complete, we must start to get input from the community. This is
a format that we will all have to deal with at one level or another,
because it will be the new format for the PDB. We hope it will have much
wider usage than that, providing a mechanism for transparent interchange
between different programming environments, and providing a solution for the
vexing problems of structure archiving that we all deal with. It thus
behooves everyone in the macromolecular community to take an interest in the
format while it is still fluid and changes (even major ones) can still be
implemented.
The example below will give you a flavor of what we are trying to do. All
comments will be listened to, but those that propose constructive
alternatives to unpopular features will be listened to most carefully. To
some degree our hands are tied by features of the CIF core (32-character
limits for data names, one level of nesting for loops) but in other areas we
have greater flexibility.
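
The 32-character limit mentioned above is easy to check mechanically. What follows is only a minimal sketch in Python, not part of any working group tool. It assumes the limit counts the leading underscore, and the function name and the last (deliberately overlong) data name are hypothetical, invented for illustration.

```python
MAX_NAME_LENGTH = 32  # limit quoted in the text; assumed to include the leading "_"


def overlong_names(names, limit=MAX_NAME_LENGTH):
    """Return the data names whose length exceeds `limit` characters."""
    return [name for name in names if len(name) > limit]


names = [
    "_atom_sites_Cartn_transform_axes",            # taken from the example
    "_reflns_shell_count_measured_obs",            # taken from the example
    "_entity_nonp_number_of_nh_atoms",             # taken from the example
    "_a_hypothetical_data_name_that_is_too_long",  # invented, not in the dictionary
]

# Only the invented name should be reported.
print(overlong_names(names))
```

Several names in the example sit exactly at the limit, which is why some of them are abbreviated so heavily.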
I would like to thank all of the members of the working group for their input
into the process, but I would like to particularly acknowledge the efforts
of Helen Berman, Phil Bourne and Keith Watenpaugh, who have labored mightily
to bring the dictionary to its present state.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
loop_
_atom_site_group_PDB
_atom_site_type_symbol
_atom_site_label_atom_id
_atom_site_label_res_id
_atom_site_label_asym_id
_atom_site_label_seq_id
_atom_site_label_alt_id
_atom_site_Cartn_x
_atom_site_Cartn_y
_atom_site_Cartn_z
_atom_site_occupancy
_atom_site_B_iso_or_equiv
_atom_site_footnote_id
_atom_site_entity_id
_atom_site_entity_seq_num
ATOM N N VAL A 11 ? 25.369 30.691 11.795 1.00 17.93 ? 1 11
ATOM C CA VAL A 11 ? 25.970 31.965 12.332 1.00 17.75 ? 1 11
ATOM C C VAL A 11 ? 25.569 32.010 13.808 1.00 17.83 ? 1 11
ATOM O O VAL A 11 ? 24.735 31.190 14.167 1.00 17.53 ? 1 11
ATOM C CB VAL A 11 ? 25.379 33.146 11.540 1.00 17.66 ? 1 11
ATOM C CG1 VAL A 11 ? 25.584 33.034 10.030 1.00 18.86 ? 1 11
ATOM C CG2 VAL A 11 ? 23.933 33.309 11.872 1.00 17.12 ? 1 11
ATOM N N THR A 12 ? 26.095 32.930 14.590 1.00 18.97 4 1 12
ATOM C CA THR A 12 ? 25.734 32.995 16.032 1.00 19.80 4 1 12
ATOM C C THR A 12 ? 24.695 34.106 16.113 1.00 20.92 4 1 12
ATOM O O THR A 12 ? 24.869 35.118 15.421 1.00 21.84 4 1 12
ATOM C CB THR A 12 ? 26.911 33.346 17.018 1.00 20.51 4 1 12
ATOM O OG1 THR A 12 3 27.946 33.921 16.183 0.50 20.29 4 1 12
ATOM O OG1 THR A 12 4 27.769 32.142 17.103 0.50 20.59 4 1 12
ATOM C CG2 THR A 12 3 27.418 32.181 17.878 0.50 20.47 4 1 12
ATOM C CG2 THR A 12 4 26.489 33.778 18.426 0.50 20.00 4 1 12
ATOM N N ILE A 13 ? 23.664 33.855 16.884 1.00 22.08 ? 1 13
ATOM C CA ILE A 13 ? 22.623 34.850 17.093 1.00 23.44 ? 1 13
ATOM C C ILE A 13 ? 22.657 35.113 18.610 1.00 25.77 ? 1 13
ATOM O O ILE A 13 ? 23.123 34.250 19.406 1.00 26.28 ? 1 13
ATOM C CB ILE A 13 ? 21.236 34.463 16.492 1.00 22.67 ? 1 13
ATOM C CG1 ILE A 13 ? 20.478 33.469 17.371 1.00 22.14 ? 1 13
ATOM C CG2 ILE A 13 ? 21.357 33.986 15.016 1.00 21.75 ? 1 13
# - - - - data truncated for brevity - - - -
ATOM C C1 APS C 300 1 4.171 29.012 7.116 0.58 17.27 1 2 ?
ATOM C C2 APS C 300 1 4.949 27.758 6.793 0.58 16.95 1 2 ?
ATOM O O3 APS C 300 1 4.800 26.678 7.393 0.58 16.85 1 2 ?
ATOM N N4 APS C 300 1 5.930 27.841 5.869 0.58 16.43 1 2 ?
# - - - - data truncated for brevity - - - -
_atom_sites_Cartn_transform_axes 'c along z, astar along x, b along y'
_atom_sites_Cartn_tran_matrix_11 58.39
_atom_sites_Cartn_tran_matrix_12 0.00
_atom_sites_Cartn_tran_matrix_13 0.00
_atom_sites_Cartn_tran_matrix_21 0.00
_atom_sites_Cartn_tran_matrix_22 86.70
_atom_sites_Cartn_tran_matrix_23 0.00
_atom_sites_Cartn_tran_matrix_31 0.00
_atom_sites_Cartn_tran_matrix_32 0.00
_atom_sites_Cartn_tran_matrix_33 46.27
_atom_sites_fract_tran_matrix_11 0.017126
_atom_sites_fract_tran_matrix_12 0.000000
_atom_sites_fract_tran_matrix_13 0.000000
_atom_sites_fract_tran_matrix_21 0.000000
_atom_sites_fract_tran_matrix_22 0.011534
_atom_sites_fract_tran_matrix_23 0.000000
_atom_sites_fract_tran_matrix_31 0.000000
_atom_sites_fract_tran_matrix_32 0.000000
_atom_sites_fract_tran_matrix_33 0.021612
loop_
_atom_sites_alt_id
_atom_sites_alt_details
'?'
; Atom sites with the alternate id set to null are not modelled in alter-
nate conformations
;
'1'
; Atom sites with the alternate id set to 1 have been modelled in
alternate conformations with respect to atom sites marked with alternate
conformation id 2. The conformations of amino acid side chains
and solvent atoms with alternate id set to 1 correlate with the
conformation of the inhibitor marked with alternate id 1. They
have been given an occupancy of 0.58 to match the occupancy assigned
to the inhibitor.
;
'2'
; Atom sites with the alternate id set to 2 have been modelled in
alternate conformations with respect to atom sites marked with alternate
conformation id 1. The conformations of amino acid side chains
and solvent atoms with alternate id set to 2 correlate with the
conformation of the inhibitor marked with alternate id 2. They
have been given an occupancy of 0.42 to match the occupancy assigned
to the inhibitor.
;
'3'
; Atom sites with the alternate id set to 3 have been modelled in
alternate conformations with respect to atoms marked with alternate
conformation id 4. The conformations of amino acid side chains
and solvent atoms with alternate id set to 3 do not correlate with
the conformation of the inhibitor. These atom sites have arbitrarily
been given an occupancy of 0.50.
;
'4'
; Atom sites with the alternate id set to 4 have been modelled in
alternate conformations with respect to atoms marked with alternate
conformation id 3. The conformations of amino acid side chains
and solvent atoms with alternate id set to 4 do not correlate with
the conformation of the inhibitor. These atom sites have arbitrarily
been given an occupancy of 0.50.
;
loop_
_atom_sites_alt_ens_id
_atom_sites_alt_ens_details
'Ensemble 1-A'
; The inhibitor binds to the enzyme in two, roughly twofold symmetric,
alternate conformations.
This conformational ensemble includes the more populated conformation of
the inhibitor (id=1) and the amino acid side chains and solvent structure
that correlate with this inhibitor conformation.
Also included are one set (id=3) of side chains with alternate conform-
ations when the conformations are not correlated with the inhibitor
conformation.
;
'Ensemble 1-B'
; The inhibitor binds to the enzyme in two, roughly twofold symmetric
alternate conformations.
This conformational ensemble includes the more populated conformation of
the inhibitor (id=1) and the amino acid side chains and solvent structure
that correlate with this inhibitor conformation.
Also included are one set (id=4) of side chains with alternate conform-
ations when the conformations are not correlated with the inhibitor
conformation.
;
'Ensemble 2-A'
; The inhibitor binds to the enzyme in two, roughly twofold symmetric
alternate conformations.
This conformational ensemble includes the less populated conformation of
the inhibitor (id=2) and the amino acid side chains and solvent structure
that correlate with this inhibitor conformation.
Also included are one set (id=3) of side chains with alternate conform-
ations when the conformations are not correlated with the inhibitor
conformation.
;
'Ensemble 2-B'
; The inhibitor binds to the enzyme in two, roughly twofold symmetric
alternate conformations.
This conformational ensemble includes the less populated conformation of
the inhibitor (id=2) and the amino acid side chains and solvent structure
that correlate with this inhibitor conformation.
Also included are one set (id=4) of side chains with alternate conform-
ations when the conformations are not correlated with the inhibitor
conformation.
;
loop_
_atom_sites_alt_gen_ens_id
_atom_sites_alt_gen_alt_id
'Ensemble 1-A' '?'
'Ensemble 1-A' '1'
'Ensemble 1-A' '3'
'Ensemble 1-B' '?'
'Ensemble 1-B' '1'
'Ensemble 1-B' '4'
'Ensemble 2-A' '?'
'Ensemble 2-A' '2'
'Ensemble 2-A' '3'
'Ensemble 2-B' '?'
'Ensemble 2-B' '2'
'Ensemble 2-B' '4'
loop_
_atom_sites_footnote_id
_atom_sites_footnote_text
1
; The inhibitor binds to the enzyme in two alternate orientations. The
two orientations have been assigned alternate location indicators *1*
and *2*.
;
2
; Side chains of these residues adopt alternate orientations that corre-
late with the alternate orientations of the inhibitor.
Side chains with alternate location indicator *1* and occupancy 0.58
correlate with inhibitor orientation *1*.
Side chains with alternate location indicator *2* and occupancy 0.42
correlate with inhibitor orientation *2*.
;
3
; The positions of these water molecules correlate with the alternate
orientations of the inhibitor.
Water molecules with alternate location indicator *1* and occupancy 0.58
correlate with inhibitor orientation *1*.
Water molecules with alternate location indicator *2* and occupancy 0.42
correlate with inhibitor orientation *2*.
;
4
; Side chains of these residues adopt alternate orientations that do not
correlate with the alternate orientation of the inhibitor.
;
5
; The positions of these water molecules correlate with alternate orien-
tations of amino acid side chains that do not correlate with alternate
orientations of the inhibitor.
;
loop_
_atom_type_symbol
_atom_type_oxidation_number
_atom_type_scat_Cromer_Mann_a1
_atom_type_scat_Cromer_Mann_b1
_atom_type_scat_Cromer_Mann_a2
_atom_type_scat_Cromer_Mann_b2
_atom_type_scat_Cromer_Mann_a3
_atom_type_scat_Cromer_Mann_b3
_atom_type_scat_Cromer_Mann_a4
_atom_type_scat_Cromer_Mann_b4
_atom_type_scat_Cromer_Mann_c
C 0 2.31000 20.8439 1.02000 10.2075
1.58860 0.568700 0.865000 51.6512 0.21560
N 0 12.2126 0.005700 3.13220 9.89330
2.01250 28.9975 1.16630 0.582600 -11.529
O 0 3.04850 13.2771 2.28680 5.70110
1.54630 0.323900 0.867000 32.9089 0.250800
S 0 6.90530 1.46790 5.20340 22.2151
1.43790 0.253600 1.58630 56.1720 0.866900
CL -1 18.2915 0.006600 7.20840 1.17170
6.53370 19.5424 2.33860 60.4486 -16.378
_audit_creation_date '92-12-08'
_audit_creation_method
; Created by hand from PDB entry 5HVP, from the JBC paper describing this
structure and from laboratory records
;
_audit_update_record
; 92-12-09 adjusted to reflect comments from Brian McKeever
92-12-10 adjusted to reflect comments from Helen Berman
92-12-12 adjusted to reflect comments from Keith Watenpaugh
;
loop_
_audit_author_name
_audit_author_address
'Fitzgerald, Paula M.D.'
; Department of Biophysical Chemistry
Merck Research Laboratories
P. O. Box 2000, Ry80M203
Rahway, New Jersey 07065
USA
;
'McKeever, Brian M.'
; Department of Biophysical Chemistry
Merck Research Laboratories
P. O. Box 2000, Ry80M203
Rahway, New Jersey 07065
USA
;
'Van Middlesworth, J.F.'
; Department of Biophysical Chemistry
Merck Research Laboratories
P. O. Box 2000, Ry80M203
Rahway, New Jersey 07065
USA
;
'Springer, James P.'
; Department of Biophysical Chemistry
Merck Research Laboratories
P. O. Box 2000, Ry80M203
Rahway, New Jersey 07065
USA
;
_audit_contact_author_name 'Fitzgerald, Paula M.D.'
_audit_contact_author_address
; Department of Biophysical Chemistry
Merck Research Laboratories
P. O. Box 2000, Ry80M203
Rahway, New Jersey 07065
USA
;
_audit_contact_author_phone '908 594 5510'
_audit_contact_author_fax '908 594 6645'
_audit_contact_author_email 'paula_fitzgerald@merck.com'
_cell_length_a 58.39(5)
_cell_length_b 86.70(12)
_cell_length_c 46.27(6)
_cell_angle_alpha 90.00
_cell_angle_beta 90.00
_cell_angle_gamma 90.00
_cell_volume 234237
_cell_special_details
; The cell parameters were refined every twenty frames during data integra-
tion. The cell lengths given are the mean of 55 such refinements; the
esds given are the root mean square deviations of these 55 observations
from that mean.
;
_cell_measurement_temperature 293(3)
_cell_measurement_theta_min 11
_cell_measurement_theta_max 31
_cell_measurement_wavelength 1.54
loop_
_citation_id
_citation_coordinate_linkage
_citation_title
_citation_country
_citation_page_first
_citation_page_last
_citation_year
_citation_journal_abbrev
_citation_journal_volume
_citation_journal_issue
_citation_journal_coden_ASTM
_citation_journal_coden_ISSN
_citation_journal_coden_PDB
_citation_book_title
_citation_book_publisher
_citation_book_coden_ISBN
_citation_special_details
primary yes
; Crystallographic analysis of a complex between human immunodeficiency
virus type 1 protease and acetyl-pepstatin at 2.0-Angstroms resolution.
;
US 14209 14219 1990 'J. Biol. Chem.' 265 ?
HBCHA3 0021-9258 071 ? ? ?
; The publication that directly relates to this coordinate set.
;
2 no
; Three-dimensional structure of aspartyl-protease from human
immunodeficiency virus HIV-1.
;
UK 615 619 1989 'Nature' 337 ?
NATUAS 0028-0836 006 ? ? ?
; Determination of the structure of the unliganded enzyme.
;
3 no
; Crystallization of the aspartylprotease from human immunodeficiency virus,
HIV-1.
;
US 1919 1921 1989 'J. Biol. Chem.' 264 ?
HBCHA3 0021-9258 071 ? ? ?
; Crystallization of the unliganded enzyme.
;
4 no
; Human immunodeficiency virus protease. Bacterial expression and
characterization of the purified aspartic protease.
;
US 2307 2312 1989 'J. Biol. Chem.' 264 ?
HBCHA3 0021-9258 071 ? ? ?
; Expression and purification of the enzyme.
;
loop_
_citation_author_citation_id
_citation_author_name
primary 'Fitzgerald, P.M.D.'
primary 'McKeever, B.M.'
primary 'Van Middlesworth, J.F.'
primary 'Springer, J.P.'
primary 'Heimbach, J.C.'
primary 'Leu, C.-T.'
primary 'Herber, W.K.'
primary 'Dixon, R.A.F.'
primary 'Darke, P.L.'
2 'Navia, M.A.'
2 'Fitzgerald, P.M.D.'
2 'McKeever, B.M.'
2 'Leu, C.-T.'
2 'Heimbach, J.C.'
2 'Herber, W.K.'
2 'Sigal, I.S.'
2 'Darke, P.L.'
2 'Springer, J.P.'
3 'McKeever, B.M.'
3 'Navia, M.A.'
3 'Fitzgerald, P.M.D.'
3 'Springer, J.P.'
3 'Leu, C.-T.'
3 'Heimbach, J.C.'
3 'Herber, W.K.'
3 'Sigal, I.S.'
3 'Darke, P.L.'
4 'Darke, P.L.'
4 'Leu, C.-T.'
4 'Davis, L.J.'
4 'Heimbach, J.C.'
4 'Diehl, R.E.'
4 'Hill, W.S.'
4 'Dixon, R.A.F.'
4 'Sigal, I.S.'
_computing_data_collection 'Collect (Siemens)'
_computing_data_reduction 'Xengen (Howard)'
_computing_phasing_MR 'Merlot (Fitzgerald)'
_computing_molecular_graphics 'Protein (Steigemann), Frodo (Jones)'
_computing_structure_refinement 'Protin/Prolsq (Konnert, Hendrickson)'
_database_code_PDB 5HVP
loop_
_database_remark_num_PDB
_database_remark_text_PDB
1 REMARK 2
2 REMARK 2 RESOLUTION. 2.0 ANGSTROMS.
3 REMARK 3
4 REMARK 3 REFINEMENT. BY THE RESTRAINED LEAST-SQUARES PROCEDURE OF J.
5 REMARK 3 KONNERT AND W. HENDRICKSON (PROGRAM *PROLSQ*). THE R
6 REMARK 3 VALUE IS 0.176 FOR 12901 REFLECTIONS IN THE RESOLUTION
7 REMARK 3 RANGE 8.0 TO 2.0 ANGSTROMS WITH I .GT. SIGMA(I).
# - - - - data truncated for brevity - - - -
loop_
_database_rev_num_PDB
_database_rev_author_name_PDB
_database_rev_date_PDB
_database_rev_date_original_PDB
_database_rev_status_PDB
_database_rev_mod_type_PDB
1 'Fitzgerald, Paula M.D.' 91-10-15 90-04-30 'full release' 0
_diffrn_ambient_temperature 293(3)
_diffrn_crystal_environment
; Mother liquor from the reservoir of the vapor diffusion experiment,
mounted in room air
;
_diffrn_crystal_physical_device
; 0.7 mm glass capillary, sealed with dental wax
;
_diffrn_crystal_treatment
; Equilibrated in rotating anode radiation enclosure for 18 hours prior
to beginning of data collection.
;
_diffrn_measurement_method 'omega scan'
_diffrn_measurement_details
; 440 frames, 0.20 degrees, 150 sec, detector distance 12 cm, detector angle
22.5 degrees
;
_diffrn_measure_device_type '3-circle camera'
_diffrn_measure_device_part 'Supper model x'
_diffrn_measure_device_details 'none'
_diffrn_radiation_collimation '0.3 mm double pinhole'
_diffrn_radiation_monochromator graphite
_diffrn_radiation_type 'Cu K\a'
_diffrn_radiation_wavelength 1.54
_diffrn_rad_detector_type 'multiwire'
_diffrn_rad_detector_part 'Siemens'
_diffrn_rad_source_type 'rotating anode'
_diffrn_rad_source_part 'Rigaku RU-200'
_diffrn_rad_source_power '50 kV, 180 mA'
_diffrn_rad_source_target '8 mm x 0.4 mm broad-focus'
loop_
_entity_id
_entity_type
_entity_name_common
_entity_name_systematic
_entity_source
_entity_special_details
1 polymer 'HIV-1 protease' ECx.x.x.x
; Clone obtained from HIV strain NY-5.
Expressed in E. coli.
;
; The enzymatically competent form of HIV protease is a
dimer. This entity corresponds to one monomer of an
active dimer.
;
2 non-polymer 'acetyl-pepstatin'
'acetyl-Ile-Val-Asp-Sta-Ala-Ile-Sta'
'Natural product isolated from actinomycetes'
; Statine: (4S,3S)-4-amino-3-hydroxy-6-methylheptanoic
acid. Acetyl-pepstatin was isolated by Dr. K. Oda, Osaka
Prefecture University, and provided to us by Dr. Ben
Dunn, University of Florida, and Dr. J. Kay, University
of Wales.
;
3 water 'water' ? ? ?
loop_
_entity_keywords_entity_id
_entity_keywords_text
1 'polypeptide'
2 'natural product'
2 'inhibitor'
2 'reduced peptide'
loop_
_entity_nonp_id
_entity_nonp_entity_id
_entity_nonp_formula
_entity_nonp_formula_weight
_entity_nonp_number_of_nh_atoms
_entity_nonp_model_source
_entity_nonp_model_details
APS 2 'C31 H55 N5 O9' 641.8 45
'Built by hand using ChemNote in Quanta (MSI)'
'Geometry idealized using AMF (Merck)'
loop_
_entity_nonp_atom_entity_id
_entity_nonp_atom_atom_id
_entity_nonp_atom_type_symbol
_entity_nonp_atom_model_Cartn_x
_entity_nonp_atom_model_Cartn_y
_entity_nonp_atom_model_Cartn_z
2 1 C -0.15600 -0.90770 -2.11270
2 2 C -0.20530 -1.10010 -0.59490
2 3 O -0.51270 -2.16520 -0.06340
2 4 N 0.09550 -0.00790 0.11530
2 5 C 0.14840 -0.01830 1.58870
2 6 C 1.41550 -0.79710 2.04770
2 7 C 2.71100 -0.17870 1.47350
2 8 C 1.50570 -0.94000 3.58320
2 9 C 0.20050 1.42100 2.12200
2 10 O 0.58080 2.37350 1.43910
2 11 N -0.15500 1.55910 3.40030
# - - - - data truncated for brevity - - - -
loop_
_entity_nonp_bond_entity_id
_entity_nonp_bond_atom_id_1
_entity_nonp_bond_atom_id_2
_entity_nonp_bond_type
2 1 2 sing
2 2 3 doub
2 2 4 sing
2 4 5 sing
2 5 6 sing
2 5 9 sing
2 6 7 sing
2 6 8 sing
2 9 10 doub
2 9 11 sing
# - - - - data truncated for brevity - - - -
loop_
_entity_poly_entity_id
_entity_poly_type
_entity_poly_formula_weight
_entity_poly_non_s_chirality
_entity_poly_non_s_linkage
_entity_poly_non_s_monomer
_entity_poly_type_details
1 polypeptide(L) 10916 no no no ?
loop_
_entity_poly_seq_entity_id
_entity_poly_seq_num
_entity_poly_seq_mon_id
1 1 PRO 1 2 GLN 1 3 ILE 1 4 THR 1 5 LEU
1 6 TRP 1 7 GLN 1 8 ARG 1 9 PRO 1 10 LEU
1 11 VAL 1 12 THR 1 13 ILE 1 14 LYS 1 15 ILE
1 16 GLY 1 17 GLY 1 18 GLN 1 19 LEU 1 20 LYS
1 21 GLU 1 22 ALA 1 23 LEU 1 24 LEU 1 25 ASP
# - - - - data truncated for brevity - - - -
_exptl_crystal_grow_method 'hanging drop'
_exptl_crystal_grow_apparatus 'Linbro plates'
_exptl_crystal_grow_atmosphere 'room air'
_exptl_crystal_grow_pH 4.7
_exptl_crystal_grow_temp 18(3)
_exptl_crystal_grow_time 'approximately 2 days'
loop_
_exptl_crystal_grow_com_id
_exptl_crystal_grow_com_sol_id
_exptl_crystal_grow_com_name
_exptl_crystal_grow_com_volume
_exptl_crystal_grow_com_conc
_exptl_crystal_grow_com_details
1 1 'HIV-1 protease' '0.002 ml' '6 mg/ml'
; The protein solution was in a buffer containing 25 mM NaCl, 100 mM NaMES/
MES buffer, pH 7.5, 3 mM NaAzide
;
2 2 'NaCl' '0.200 ml' '4 M' 'in 3 mM NaAzide'
3 2 'Acetic Acid' '0.047 ml' '100 mM' 'in 3 mM NaAzide'
4 2 'Na Acetate' '0.053 ml' '100 mM'
; in 3 mM NaAzide. Buffer components were mixed to produce a pH of 4.7
according to a ratio calculated from the pKa. The actual pH of solution 2
was not measured.
;
5 2 'water' '0.700 ml' 'neat' 'in 3 mM NaAzide'
_refine_ls_number_reflns 12901
_refine_ls_number_restraints 6609
_refine_ls_number_parameters 7032
_refine_ls_R_Factor_obs 0.176
_refine_ls_weighting_scheme calc
_refine_ls_weighting_details
; Sigdel model of Konnert-Hendrickson:
Sigdel: Afsig + Bfsig*(sin(theta)/lambda-1/6)
Afsig = 22.0, Bfsig = -150.0 at the beginning of refinement.
Afsig = 15.5, Bfsig = -50.0 at the end of refinement.
;
loop_
_refine_ls_restr_type
_refine_ls_restr_target
_refine_ls_restr_model
_refine_ls_restr_number
_refine_ls_restr_criterion
_refine_ls_restr_rejects
'bond_d' 0.020 0.018 1654 '> 2\s' 22
'angle_d' 0.030 0.038 2246 '> 2\s' 139
'planar_d' 0.040 0.043 498 '> 2\s' 21
'planar' 0.020 0.015 270 '> 2\s' 1
'chiral' 0.150 0.177 278 '> 2\s' 2
'singtor_nbd' 0.500 0.216 582 '> 2\s' 0
'multtor_nbd' 0.500 0.207 419 '> 2\s' 0
'xyhbond_nbd' 0.500 0.245 149 '> 2\s' 0
'planar_tor' 3.0 2.6 203 '> 2\s' 9
'staggered_tor' 15.0 17.4 298 '> 2\s' 31
'orthonormal_tor' 20.0 18.1 12 '> 2\s' 1
loop_
_refine_ls_shell_d_res_low
_refine_ls_shell_d_res_high
_refine_ls_shell_reflns
_refine_ls_shell_R_factor_obs
8.00 4.51 1226 0.196
4.51 3.48 1679 0.146
3.48 2.94 2014 0.160
2.94 2.59 2147 0.182
2.59 2.34 2127 0.193
2.34 2.15 2061 0.203
2.15 2.00 1647 0.188
loop_
_refine_occupancy_class
_refine_occupancy_treatment
_refine_occupancy_value
_refine_occupancy_details
'protein' fix 1.00 ?
'solvent' fix 1.00 ?
'inhibitor orientation 1' fix 0.58 ?
'inhibitor orientation 2' fix 0.42
; The inhibitor binds to the enzyme in two alternate conformations. The
occupancy of each conformation was adjusted so as to result in approxi-
mately equal mean thermal factors for the atoms in each conformation.
;
loop_
_refine_B_iso_class
_refine_B_iso_treatment
'protein' isotropic
'solvent' isotropic
'inhibitor' isotropic
_reflns_data_reduction_method
; Xengen program scalei. Anomalous pairs were merged. Scaling proceeded
in several passes, beginning with 1-parameter fit and ending with
3-parameter fit.
;
_reflns_data_reduction_details
; Merging and scaling based on only those reflections with I > \s(I).
;
_reflns_d_resolution_high 2.00
_reflns_d_resolution_low 8.00
_reflns_limit_h_max 22
_reflns_limit_h_min 0
_reflns_limit_k_max 46
_reflns_limit_k_min 0
_reflns_limit_l_max 57
_reflns_limit_l_min 0
_reflns_number_observed 7228
_reflns_observed_criterion '> 1 \s(I)'
_reflns_special_details none
loop_
_reflns_shell_d_res_low
_reflns_shell_d_res_high
_reflns_shell_meanI/sigI_obs
_reflns_shell_count_measured_obs
_reflns_shell_count_unique_obs
_reflns_shell_possible_%_obs
_reflns_shell_Rmerge_F_obs
31.38 3.82 69.8 9024 2540 96.8 1.98
3.82 3.03 26.1 7413 2364 95.1 3.85
3.03 2.65 10.5 5640 2123 86.2 6.37
2.65 2.41 6.4 4322 1882 76.8 8.01
2.41 2.23 4.3 3247 1714 70.4 9.86
2.23 2.10 3.1 1140 812 33.3 13.99
_struct_title
; HIV-1 protease complex with acetyl-pepstatin
;
loop_
_struct_keywords
'enzyme-inhibitor complex'
'aspartyl protease'
'structure-based drug design'
'static disorder'
loop_
_struct_asym_id
_struct_asym_entity_id
_struct_asym_special_details
A 1 'one monomer of the dimeric enzyme'
B 1 'one monomer of the dimeric enzyme'
C 2 'one partially occupied position for the inhibitor'
D 2 'one partially occupied position for the inhibitor'
loop_
_struct_biol_id
_struct_biol_special_details
1
; significant deviations from twofold symmetry exist in this dimeric
enzyme
;
2
; The drug binds to this enzyme in two roughly twofold symmetric modes.
Hence this biological unit (2) is roughly twofold symmetric to biological
unit (3). Disorder in the protein chain indicated with alternate
indicator 1 should be used with this biological unit.
;
3
; The drug binds to this enzyme in two roughly twofold symmetric modes.
Hence this biological unit (3) is roughly twofold symmetric to biological
unit (2). Disorder in the protein chain indicated with alternate
indicator 2 should be used with this biological unit.
;
loop_
_struct_biol_gen_biol_id
_struct_biol_gen_asym_id
_struct_biol_gen_symmetry
1 A 1_555
1 B 1_555
2 A 1_555
2 B 1_555
2 C 1_555
3 A 1_555
3 B 1_555
3 D 1_555
loop_
_struct_biol_keywords_biol_id
_struct_biol_keywords_text
1 'aspartyl-protease'
1 'aspartic-protease'
1 'acid-protease'
1 'aspartyl-proteinase'
1 'aspartic-proteinase'
1 'acid-proteinase'
1 'enzyme'
1 'protease'
1 'proteinase'
1 'dimer'
2 'drug-enzyme complex'
2 'inhibitor-enzyme complex'
2 'drug-protease complex'
2 'inhibitor-protease complex'
3 'drug-enzyme complex'
3 'inhibitor-enzyme complex'
3 'drug-protease complex'
3 'inhibitor-protease complex'
loop_
_struct_conf_id
_struct_conf_conf_type_id
_struct_conf_beg_label_res_id
_struct_conf_beg_label_asym_id
_struct_conf_beg_label_seq_id
_struct_conf_end_label_res_id
_struct_conf_end_label_asym_id
_struct_conf_end_label_seq_id
_struct_conf_special_details
HELX1 HELX-RHAL ARG A 87 GLN A 92 ?
HELX2 HELX-RHAL ARG B 287 GLN B 292 ?
STRN1 STRN PRO A 1 LEU A 5 ?
STRN2 STRN CYS B 295 PHE B 299 ?
STRN3 STRN CYS A 95 PHE A 99 ?
STRN4 STRN PRO B 201 LEU B 205 ?
# - - - - data truncated for brevity - - - -
TURN1 TURN-TY1P ILE A 15 GLN A 18 ?
TURN2 TURN-TY2 GLY A 49 GLY A 52 ?
TURN3 TURN-TY1P ILE A 55 HIS A 69 ?
TURN4 TURN-TY1 THR A 91 GLY A 94 ?
# - - - - data truncated for brevity - - - -
loop_
_struct_conf_type_id
_struct_conf_type_criteria
_struct_conf_type_reference
HELX-RHAL 'author judgement' ?
STRN 'author judgement' ?
TURN-TY1 'author judgement' ?
TURN-TY1P 'author judgement' ?
TURN-TY2 'author judgement' ?
TURN-TY2P 'author judgement' ?
loop_
_struct_conn_id
_struct_conn_conn_type_id
_struct_conn_par1_label_res_id
_struct_conn_par1_label_asym_id
_struct_conn_par1_label_seq_id
_struct_conn_par1_label_atom_id
_struct_conn_role_par1
_struct_conn_symmetry_par1
_struct_conn_par2_label_res_id
_struct_conn_par2_label_asym_id
_struct_conn_par2_label_seq_id
_struct_conn_par2_label_atom_id
_struct_conn_role_par2
_struct_conn_symmetry_par2
_struct_conn_special_details
C1 saltbr ARG A 87 NZ1 positive 1_555 GLU A 92 OE1 negative 1_555 ?
C2 hydrog ARG B 287 N donor 1_555 GLY B 292 O acceptor 1_555 ?
# - - - - data truncated for brevity - - - -
loop_
_struct_conn_type_id
_struct_conn_type_criteria
_struct_conn_type_reference
saltbr
'negative to positive distance > 2.5 \%A, < 3.2 \%A' ?
hydrog
'N to O distance > 2.5 \%A, < 3.5 \%A, N O C angle < 120 degrees' ?
loop_
_struct_site_id
_struct_site_special_details
'P2 site C'
; residues with a contact < 3.7 \%A to an atom in the P2 moiety of the
inhibitor in the conformation with _struct_asym_id = C
;
'P2 site D'
; residues with a contact < 3.7 \%A to an atom in the P2 moiety of the
inhibitor in the conformation with _struct_asym_id = D
;
loop_
_struct_site_gen_id
_struct_site_gen_site_id
_struct_site_gen_label_res_id
_struct_site_gen_label_asym_id
_struct_site_gen_label_seq_id
_struct_site_gen_symmetry
_struct_site_gen_special_details
1 1 VAL A 32 1_555 ?
2 1 ILE A 47 1_555 ?
3 1 VAL A 82 1_555 ?
4 1 ILE A 84 1_555 ?
5 2 VAL B 232 1_555 ?
6 2 ILE B 247 1_555 ?
7 2 VAL B 282 1_555 ?
8 2 ILE B 284 1_555 ?
loop_
_struct_site_keywords_site_id
_struct_site_keywords_text
'P2 site C' 'binding site'
'P2 site C' 'binding pocket'
'P2 site C' 'P2 site'
'P2 site C' 'P2 pocket'
'P2 site D' 'binding site'
'P2 site D' 'binding pocket'
'P2 site D' 'P2 site'
'P2 site D' 'P2 pocket'
_symmetry_cell_setting orthorhombic
_symmetry_Int_Tables_number 18
_symmetry_space_group_name_H-M 'P 21 21 2'
loop_
_symmetry_equiv_pos_as_xyz
+x,+y,+z -x,-y,z 1/2+x,1/2-y,-z 1/2-x,1/2+y,-z
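
- - - - -
As an illustration of how a program might read one of the loops in the example above, here is a minimal sketch in Python. It is not a real CIF parser. It assumes a single loop_ whose values are bare or single-quoted tokens, with no semicolon-delimited text fields and with loop_ on its own line, and it borrows shell-style quoting from the standard shlex module, which is only an approximation of the CIF quoting rules. The function name read_simple_loop is hypothetical.

```python
import shlex


def read_simple_loop(text):
    """Parse one simple loop_ into a list of {data name: value} dictionaries.

    Simplifying assumptions: no ';' text fields, 'loop_' on its own line,
    and quoting that shlex can handle (an approximation of the CIF rules).
    """
    names, values = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line == "loop_":
            continue
        if line.startswith("_") and not values:
            names.append(line)                 # still reading the header
        else:
            values.extend(shlex.split(line))   # flatten all data tokens
    if not names or len(values) % len(names) != 0:
        raise ValueError("value count is not a multiple of the name count")
    width = len(names)
    return [dict(zip(names, values[i:i + width]))
            for i in range(0, len(values), width)]


sample = """
loop_
_struct_asym_id
_struct_asym_entity_id
_struct_asym_special_details
A 1 'one monomer of the dimeric enzyme'
B 1 'one monomer of the dimeric enzyme'
C 2 'one partially occupied position for the inhibitor'
D 2 'one partially occupied position for the inhibitor'
"""

rows = read_simple_loop(sample)
for row in rows:
    print(row["_struct_asym_id"], row["_struct_asym_entity_id"])
```

Because the values are flattened before being grouped, the same sketch copes with loops that put several rows on one line, as the _entity_poly_seq loop does. Loops containing text fields, such as the _citation loop, would need a proper tokenizer.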