Summary of changes; Release of mmCIF 0.8
Paula Fitzgerald (paula_fitzgerald@Merck.Com)
Thu, 7 Mar 96 13:41:41 EST
Hello again -
Yesterday afternoon and evening, Helen, John and I once again found ourselves
gathered in central New Jersey, trying to at last really wrap up this stage
of dictionary review and modification. When last we wrote (on January 29th)
we had dealt with what we considered the minor issues, and were bringing you
all up to date with those changes before tackling the larger and more
complicated issues.
We began dealing with the larger issues that same afternoon, and continued
that effort throughout February. As of today, we feel that we have done what
we can with the larger issues, and we are ready to release a new version of
the dictionary to you. But not only to you - we would like at this point to
open the review process to a larger audience, by posting notices about all of
this to some of the major structural newsgroups and mailing lists. But
before we do that, we would like to give all of you one more chance to look
things over. As we really want to get this moving, we will plan on posting
the announcements on March 14 (carefully avoiding the Ides of March).
We would also like to draw your attention to the new and improved mmCIF Web
page, which will be available in the usual place starting today (March 7).
We have reorganized things a bit, added a lot of supporting and introductory
material, and in general tried to make this a lot more understandable to the
general user. Give it a look. The new version of the dictionary (version
0.8, a major step forward) is also there.
Now, for the nuts and bolts. This message will end with the audit trail, as
usual, but each of the major issues that we dealt with deserves a few words
of its own.
1) There had been a discussion of alternative nomenclatures for the
sequence, residue name, residue number, etc. identifiers in the atom site
record. After a lot of discussion, we decided to introduce alternative
items for each of these, not just for the residue number, which is what
we had done before. These new data items have names like
_atom_site.auth_seq_id, completely parallel to the current names like
_atom_site.label_seq_id (auth is an abbreviation for author's notation).
So a completely independent second set has been introduced throughout the
dictionary, providing the user complete freedom to adopt a personal
nomenclature. The primary data items, which do have to obey some rules,
remain the default and are the mandatory data items; the author's
notation data items are always optional.
There is one confusing thing about this - in order to be consistent, we
changed the meaning of _atom_site.label_seq_id to be the one that must run
from 1-n (and whose parent is _entity_poly_seq.num); previously
_atom_site.entity_seq_num had played this role. We truly hope this does
not generate too much confusion - it really was necessary to make a
cleaner schema.
2) Matrices and vectors for non-crystallographic symmetry have been added to
the STRUCT_NCS category. That category group was reorganized a good bit
to make the data structure cleaner.
3) There had been a suggestion that we needed an ENTITY_LINK category to
handle linkages that occur at the entity level (for instance, a disulfide
bond between the A and B chains of insulin.) We began to create such a
category, but then realized that it was completely redundant with the
CHEM_COMP_LINK category, so we stepped back from that idea and simply
created a generalized CHEM_LINK category that can be referred to from
both ENTITY and CHEM_COMP.
4) The data type code "char" was changed to "line". The complete set is now
text (text of any length), line (limited to one line), and code (limited
to one word).
5) We talked at length about the issue of NMR ensembles of structures and how
to handle them. Our decision was that the concept of ensembles was
already provided for in the ATOM_SITES_ALT category and its
subcategories. This may take a bit more justification and explanation,
but we are convinced that these data items, which are already present,
will handle this whole issue cleanly.
6) A completely new SOFTWARE category has been created, replacing the old
COMP_PROG category.
7) Another big item was how to specify segments of structure (or of several
structures) in order to point to entries in external databases. It had
been suggested that we do this at the entity level, but after a lot of
thought we decided that such database references were really annotation
of the structure, and that they properly belonged in the STRUCT category
group, where all other structure annotation takes place. So we created
STRUCT_REF, which we think handles all of the cases that were raised with
us. The old ENTITY_REFERENCE categories are gone.
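To make item 1 a little more concrete, here is a minimal Python sketch of an atom site record that carries both sets of identifiers. The label_* and auth_* names follow the data item names in the dictionary (the label_* ones other than label_seq_id are assumed to parallel the auth_* items listed in the audit trail); the AtomSite class, the label_seq_ok helper and the sample values are hypothetical and only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AtomSite:
    # Mandatory identifiers, tied to the mmCIF description of structure.
    label_asym_id: str
    label_comp_id: str
    label_seq_id: int            # runs from 1 to n; parent is _entity_poly_seq.num
    label_atom_id: str
    # Optional author's notation; free-form, so kept as strings here.
    auth_asym_id: Optional[str] = None
    auth_comp_id: Optional[str] = None
    auth_seq_id: Optional[str] = None
    auth_atom_id: Optional[str] = None


def label_seq_ok(sites, n_residues):
    """Return True if every label_seq_id lies in the range 1..n_residues."""
    return all(1 <= s.label_seq_id <= n_residues for s in sites)


# Hypothetical data: the author numbers residues from 101 and uses "102A".
sites = [
    AtomSite("A", "GLY", 1, "CA", auth_asym_id="A", auth_seq_id="101"),
    AtomSite("A", "ALA", 2, "CA", auth_asym_id="A", auth_seq_id="102"),
    AtomSite("A", "SER", 3, "CA", auth_asym_id="A", auth_seq_id="102A"),
]
```

The point of the sketch is only that the label_* values stay rule-bound while the auth_* values can hold whatever appears in the publication.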
There were several other smaller things that got taken care of - they appear
in the audit trail, but don't really require discussion here.
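As a small illustration of the three type codes in item 4, the Python sketch below picks the narrowest code that fits a value. The rules are our reading of the one-line descriptions above (text of any length, one line, one word), not the regular expressions used in the dictionary itself, and the function name narrowest_type is hypothetical.

```python
def narrowest_type(value: str) -> str:
    """Classify a value as 'code', 'line', or 'text' under the assumed rules."""
    if "\n" in value or "\r" in value:
        return "text"    # spans more than one line
    if value and not any(ch.isspace() for ch in value):
        return "code"    # a single word with no whitespace
    return "line"        # fits on one line, may contain spaces
```

For example, narrowest_type("GLY") gives 'code', while a phrase with spaces falls back to 'line'.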
Again - thank you all so much for all of your suggestions and comments, and
for helping to bring the dictionary so close to being a final and complete
document.
Paula, John, Helen
- - - - - - - - - - - - - -
0.7.31 1996-02-12
;
Changes (JDW):
+ Added data items for _database_pdb_matrix.tvect_matrix[][] and
_database_pdb_matrix.tvect_vector[].
+ Generalized category CHEM_LINK to handle descriptions of
any type of linkage. Created CHEM_COMP_LINK to describe
linkages between components, and ENTITY_LINK to describe
linkages between entities (and within entities between
nonsequential components). Both CHEM_COMP_LINK and ENTITY_LINK
reference the linkage description in the CHEM_LINK_* categories.
;
0.7.32 1996-02-17
;
Changes (JDW):
+ _atom_site.entity_id renamed _atom_site.label_entity_id.
+ _atom_site.entity_seq_num deleted.
+ added items _atom_site.auth_asym_id, _atom_site.auth_atom_id,
_atom_site.auth_comp_id, and _atom_site.auth_seq_id. These
items provide placeholders for alternative nomenclature that
may be used by the author.
+ Set the parentage for _atom_site.label_seq_id to
_entity_poly_seq.num. All components of the atom site label
(_atom_site.label_*) are now linked to the mmCIF hierarchical
description of structure. The data items in _atom_site.auth_*
may be used by authors to provide alternative identifiers
in the atom site which conform with the scheme that is used in
the publication of the structure.
+ added category group mm_atom_site_auth_label
+ added auth_asym_id, auth_atom_id, auth_comp_id, and auth_seq_id
child data items to the categories: GEOM_ANGLE, GEOM_BOND,
GEOM_CONTACT, STRUCT_CONF, STRUCT_CONN, STRUCT_MON_NUCL,
STRUCT_PROT, STRUCT_PROT_CIS, STRUCT_NCS_DOM_GEN,
STRUCT_SHEET_HBOND, STRUCT_SHEET_RANGE, and STRUCT_SITE_GEN.
;
0.7.33 1996-02-19
;
Changes (JDW):
+ Replaced category COMP_PROG with category SOFTWARE supplied by
P. Bourne.
+ Fine tuned some values of _item_type.code. Fixed regular expression
for code and ucode.
;
0.7.34 1996-02-20
;
Changes (JDW):
+ Integrated STRUCT_REF, STRUCT_REF_SEQ and STRUCT_REF_SEQ_DIF from
PMDF.
+ Removed ENTITY_REFERENCE and ENTITY_POLY_SEQ_DIF.
+ Integrated modified categories STRUCT_NCS_DOM, STRUCT_NCS_DOM_LIM,
STRUCT_NCS_ENS, STRUCT_NCS_ENS_GEN, and STRUCT_NCS_OPER from PMDF.
+ changed _item_type.code's 'char' and 'uchar' to 'line' and 'uline'.
;
0.8.0 1996-03-06
;
Changes (PMDF, HB, JDW):
+ Added unit type 8pi2_angstroms_squared for B anisotropic temperature factors,
and added conversion factor for this new unit type in the
ITEM_UNITS_CONVERSION category.
+ Changed _item_type.code for _symmetry_equiv.id to 'code'
+ Added default value 'no' to _chem_comp.mon_nstd_flag.
;
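The unit name in the 0.8.0 entry reflects the standard relation B = 8 pi^2 U between a B temperature factor and the corresponding mean-square displacement U, both in square angstroms. A short Python sketch of that conversion follows; the function names are hypothetical, and the actual entry in the ITEM_UNITS_CONVERSION category is not reproduced here.

```python
import math

# 8 * pi^2, the factor relating B and U (roughly 78.96)
EIGHT_PI_SQUARED = 8.0 * math.pi ** 2


def u_to_b(u: float) -> float:
    """Convert a mean-square displacement U to a B factor."""
    return EIGHT_PI_SQUARED * u


def b_to_u(b: float) -> float:
    """Convert a B factor back to a mean-square displacement U."""
    return b / EIGHT_PI_SQUARED
```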
********************************************************************************
Dr. Paula M. D. Fitzgerald ______________ voice and FAX: (908) 594-5510
Merck Research Laboratories ______________ email: paula_fitzgerald@merck.com
P.O. Box 2000, Ry50-105 ______________ or bean@merck.com
Rahway, NJ 07065 USA
(for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)
********************************************************************************