mmCIF Software Tools

The CIFLIB Class Library

CIFLIB is a software library that was developed to provide an application interface to information in CIF format. The entity relationship diagram of the CIFLIB class library in the figure below illustrates that CIFLIB follows the essentially relational organization used in the mmCIF dictionary. CIFLIB provides functions which perform the following types of operations:

The previous diagram shows how this software library facilitates integrating the CIF interchange format with other applications. As the figure illustrates, CIFLIB provides complete access to the DDL, CIF dictionaries and CIF data files. This library can be used to build wrappers and filters around existing applications which need to access CIF data. Since CIFLIB provides complete access to the dictionary data model, the library can be conveniently used as an in-memory database or as a loader for an external database.

Accessing data in CIF format using CIFLIB is a multistep process. CIFLIB first reads the DDL dictionary. Once the DDL is loaded and checked against internally coded rules based on DDL 2.1, a CIF dictionary which is based on this DDL is read and checked. This process can be quite time consuming for large dictionaries, so a provision has been made to retain the state of any file which has been checked in an auxiliary file. This auxiliary file will be used in preference to the original file in subsequent file accesses if its modification date is more recent. Finally, CIF data files are read and checked with respect to the CIF dictionary. In any file access, CIFLIB provides complete access to the data blocks containing the DDL, the CIF dictionary, and any number of blocks containing user data.
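The rule for preferring the auxiliary file can be illustrated with a short sketch. This is not CIFLIB code: the function name choose_source and the file names in the usage note are hypothetical, and the comparison simply assumes that "more recent" refers to the file system modification time.

```python
import os

def choose_source(original_path, cache_path):
    """Return the path to read: the checked auxiliary file if it is newer.

    Hypothetical helper; the source text only states that the auxiliary
    file is preferred when its modification date is more recent.
    """
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) > os.path.getmtime(original_path)):
        return cache_path
    # No auxiliary file, or the original was edited after it was checked.
    return original_path
```

A caller would pass the dictionary file and its auxiliary file, for example choose_source("mmcif.dic", "mmcif.dic.cache"), and re-run the full check whenever the original path is returned.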

Each CIF file may be divided into data block sections. CIFLIB treats each data block as an independent database loaded into the data model defined in its associated dictionary. The CIF DDL is at the top of the chain and provides the data model for a CIF dictionary. The CIF dictionary in turn provides the data model for CIF data files. CIFLIB provides functions to read, write and merge data blocks. Any number of data blocks can be managed by the library.
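As a rough illustration of how a file divides into independent data blocks, the sketch below splits CIF text at each data_ header. The function name split_data_blocks is hypothetical, and the sketch deliberately ignores multi-line text fields and comments, so it is only suitable for simple input.

```python
def split_data_blocks(text):
    """Map each data block name to the non-empty lines that follow its header.

    Simplified on purpose: a line inside a semicolon-delimited text field
    that happens to start with "data_" would be misread as a header.
    """
    blocks = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("data_"):
            current = stripped[len("data_"):]
            blocks[current] = []
        elif current is not None and stripped:
            blocks[current].append(stripped)
    return blocks


sample = """data_first
_cell.length_a 10.0
data_second
_cell.length_a 20.0
"""
blocks = split_data_blocks(sample)
```

Each entry of blocks can then be handled as its own small database, which mirrors the way the text describes CIFLIB treating data blocks.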

Within each individual data block, category groups provide a mechanism for organizing categories into conceptually meaningful collections. CIFLIB provides functions to obtain the list of category groups defined within a data block as well as the names of the member categories of each group.

The library provides a set of functions for accessing category level features within a data block. These functions provide a complete list of the categories specified within a data block, the list of data items specified within each category, and the number of rows of data in a category. The attributes of a category defined in the CIF dictionary such as the category description, category examples, member data items, key data items, and member subcategories can also be obtained.

Functions are provided to read, write and update individual data items, rows of data items, and columns of data items. These functions also check the integrity of item values with respect to their dictionary definitions. Access to all of the item attributes defined in the CIF dictionary is provided, and convenience functions are provided for the most commonly used attributes such as alias names, data type, default value, and enumeration.

CIFLIB provides a set of functions which give information about parent/child relationships, and provide access to the parent and child item values. The parent and child relationships returned by the functions in this section span a single generation; however, complicated hierarchies of parentage can be easily traced.
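Tracing a longer hierarchy from single-generation lookups amounts to repeating the lookup until no parent remains. The sketch below assumes the single-generation relationships are available as a mapping from child item to parent item; the function name trace_ancestors and the item names are hypothetical and are not taken from CIFLIB or from any dictionary.

```python
def trace_ancestors(item, parent_of):
    """Follow single-generation parent links until the chain ends.

    parent_of maps a child item name to its parent item name.
    A visited set guards against accidental cycles in the input.
    """
    chain = []
    seen = {item}
    current = parent_of.get(item)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


# Hypothetical item names, used only for illustration.
parent_of = {
    "_grandchild.ref_id": "_child.id",
    "_child.id": "_parent.id",
}
ancestors = trace_ancestors("_grandchild.ref_id", parent_of)
```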

Although the extended DDL uses different conventions for naming data, it provides a mechanism to reference alternative data names. The mmCIF dictionary uses this feature to show the correspondence between the mmCIF data items and the existing core CIF data items. Because the mmCIF dictionary incorporates all of the definitions in the core CIF dictionary, it is possible for software developed for the extended DDL to use the mmCIF dictionary to read, write and check data items derived from either dictionary.
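The alias mechanism can be pictured as a lookup table from alternative names to mmCIF names. In the sketch below the two alias pairs are quoted from memory of the mmCIF dictionary and should be checked against the dictionary itself; the names ALIASES, to_mmcif_name and normalize are hypothetical.

```python
# Core CIF data name -> mmCIF data name (pairs should be verified
# against the alias definitions in the mmCIF dictionary).
ALIASES = {
    "_cell_length_a": "_cell.length_a",
    "_atom_site_Cartn_x": "_atom_site.Cartn_x",
}

def to_mmcif_name(name, aliases=ALIASES):
    """Return the mmCIF name for an alias, or the name itself if unknown."""
    return aliases.get(name, name)

def normalize(record, aliases=ALIASES):
    """Rewrite every key of a name -> value record to its mmCIF name."""
    return {to_mmcif_name(key, aliases): value for key, value in record.items()}


mixed = {"_cell_length_a": "10.0", "_cell.length_b": "12.0"}
normalized = normalize(mixed)
```

After normalization, data written with core CIF names and data written with mmCIF names can be checked against the same definitions.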

The CIFLIB C Language Application Program Interface

A C language application program interface to CIFLIB has been developed to provide a convenient functional interface to the CIFLIB class library. A reference manual which describes each interface function in detail is available. CIFLIB provides a set of functions which access the error codes generated by those library functions which perform integrity checking. The CIFLIB functions which access and update individual item values return only a single error code. Functions providing read access return only the first error encountered in checking the target item. Similarly, functions providing update access return only the first error encountered in the checking process; however, all of the errors that may be detected during an I/O operation are appended to the warning or error lists maintained for each data block. Higher-level functions, which read and write files and data blocks, also append their diagnostic codes to internal error and warning lists. A set of functions has been provided to access and refresh these lists. Functions are also provided to translate individual error codes and to print the contents of an entire data block.
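The reporting pattern described above, where a call returns the first error while every error is also kept on a per-block list, can be sketched as follows. This is an illustration in Python rather than the C API, and all names here (Code, DataBlock, check_item, clear_errors, and the two checks) are hypothetical.

```python
from enum import IntEnum

class Code(IntEnum):
    """Hypothetical diagnostic codes; the real CIFLIB codes are not shown here."""
    OK = 0
    WRONG_TYPE = 1
    NOT_IN_ENUMERATION = 2

def check_is_integer(value):
    return Code.OK if value.lstrip("-").isdigit() else Code.WRONG_TYPE

def make_enumeration_check(allowed):
    def check(value):
        return Code.OK if value in allowed else Code.NOT_IN_ENUMERATION
    return check

class DataBlock:
    def __init__(self):
        self.errors = []  # every (item name, code) pair detected so far

    def check_item(self, name, value, checks):
        """Run all checks, record every failure, return only the first."""
        failures = [code for code in (check(value) for check in checks)
                    if code != Code.OK]
        self.errors.extend((name, code) for code in failures)
        return failures[0] if failures else Code.OK

    def clear_errors(self):
        """Reset the list, analogous to refreshing the diagnostic lists."""
        self.errors.clear()
```

A caller that only needs a pass or fail answer can test the return value, while a reporting tool can walk the errors list afterwards to show everything that was found.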

An mmCIF to HTML Converter

To demonstrate some of the functionality of the C API, a dictionary-to-HTML converter application has been developed to provide flexible access to the contents of the mmCIF and DDL dictionaries on the World Wide Web.

The HTML mmCIF dictionary is organized so that a user can flexibly navigate through the hierarchy of the definitions and between all data item relationships. The first level of presentation is a page of category groups and group descriptions. The contents of each category group can be explored and specific categories within each group can be selected. Each category is presented on a page which includes all of the DDL attributes pertaining to the category description. From within the category presentation, individual data items can be selected. The data item presentation includes all of the relevant DDL attributes and selections for all related data items.

The CIFPARSE Function Library

In many instances it is desirable to be able to read and write information in CIF format without the overhead of dictionary-based integrity processing. The CIFPARSE function library was developed to provide convenient tools to access CIF data without any semantic checking. CIFPARSE uses a simple lex/yacc parser to read and check the syntax of mmCIF data files. CIF data is stored as strings in a simple data structure, and a collection of access functions is provided to retrieve individual items of data.
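The idea of syntax-only reading, with every value kept as a string, can be sketched with a small hand-written reader. This is not the CIFPARSE lex/yacc grammar: the names tokenize and parse are hypothetical, and the sketch handles only data_ headers, single items, loop_ tables, simple quoted strings and whole-line comments.

```python
import re

# A token is a quoted string or a run of non-whitespace characters.
TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|(\S+)""")

def tokenize(text):
    """Yield (value, was_quoted) pairs, skipping whole-line comments."""
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        for m in TOKEN.finditer(line):
            if m.group(3) is not None:
                yield m.group(3), False
            else:
                yield (m.group(1) if m.group(1) is not None else m.group(2)), True

def is_name(token):
    value, quoted = token
    return not quoted and value.startswith("_")

def is_structural(token):
    value, quoted = token
    low = value.lower()
    return not quoted and (value.startswith("_") or low.startswith("data_") or low == "loop_")

def parse(text):
    """Return {block name: {item name: [string values]}} with no semantic checks."""
    tokens = list(tokenize(text))
    blocks, block, i = {}, None, 0
    while i < len(tokens):
        value, quoted = tokens[i]
        if not quoted and value.lower().startswith("data_"):
            block = blocks.setdefault(value[len("data_"):], {})
            i += 1
        elif block is None:
            raise ValueError("content found before the first data_ header")
        elif not quoted and value.lower() == "loop_":
            i += 1
            names = []
            while i < len(tokens) and is_name(tokens[i]):
                names.append(tokens[i][0])
                i += 1
            values = []
            while i < len(tokens) and not is_structural(tokens[i]):
                values.append(tokens[i][0])
                i += 1
            if not names or len(values) % len(names) != 0:
                raise ValueError("malformed loop_")
            for column, name in enumerate(names):
                block[name] = values[column::len(names)]
        elif is_name(tokens[i]):
            if i + 1 >= len(tokens) or is_structural(tokens[i + 1]):
                raise ValueError("missing value for " + value)
            block[value] = [tokens[i + 1][0]]
            i += 2
        else:
            raise ValueError("unexpected token " + value)
    return blocks


sample = """data_demo
_cell.length_a 10.5
loop_
_atom_site.id
_atom_site.type_symbol
1 N
2 'C A'
"""
parsed = parse(sample)
```

Numbers stay as strings ("10.5" rather than 10.5), which matches the description of data stored as strings and leaves interpretation to the calling application.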

This library is particularly useful for applications that exchange CIF data with other applications or that repeatedly access static CIF data which has already been semantically validated. A reference manual which describes each function in detail is available.

The CIFOBJ Class Library

The design of the CIFLIB class library closely follows the relational model that is specified in DDL 2.1. Categories in mmCIF dictionaries and data files are mapped into tabular data structures in the class, and the data access methods provided by the class are all row oriented. This design is satisfactory for applications which access CIF data in the CIF dictionary one category at a time; however, it is not particularly efficient for applications which access dictionary data item-wise. This pattern of access is common for applications which have to assemble all of the dictionary attributes of a data item for the purpose of integrity processing. The CIFOBJ class library was developed in order to provide an object view of the mmCIF dictionary. The class library has two components, as illustrated in the diagram below. The first component builds a persistent store of objects of four types: item, sub-category, category, and dictionary. Each object is a container for all relevant attributes for that object type. CIFOBJ accesses and checks the dictionary contents using methods provided by CIFLIB. The CIFOBJ loader class assembles the dictionary objects and passes these to the object storage manager. The second component of the CIFOBJ class library provides methods for building dictionary objects from the persistent store. CIFOBJ implements methods to access all of the attributes for each object type and returns this information as a string or an array of strings. A reference manual which describes these classes in detail is available.
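The difference between the row-oriented view and the item-wise object view can be shown with a small sketch that regroups category tables by item name. It is not CIFOBJ code: the function build_item_objects, the table layout, and the item names are hypothetical, and the category and attribute names are only meant to resemble DDL-style dictionary tables.

```python
from collections import defaultdict

def build_item_objects(tables, key="name"):
    """Regroup row-oriented category tables into one attribute container per item.

    tables maps a category name to a list of rows (dicts); each row is
    assumed to carry the item name under the given key.
    """
    items = defaultdict(dict)
    for category, rows in tables.items():
        for row in rows:
            item_name = row[key]
            for attribute, value in row.items():
                if attribute != key:
                    label = f"{category}.{attribute}"
                    items[item_name].setdefault(label, []).append(value)
    return dict(items)


tables = {
    "item_type": [
        {"name": "_demo.id", "code": "int"},
        {"name": "_demo.kind", "code": "char"},
    ],
    "item_enumeration": [
        {"name": "_demo.kind", "value": "a"},
        {"name": "_demo.kind", "value": "b"},
    ],
}
objects = build_item_objects(tables)
```

Once the regrouping has been done, all attributes of one item are available from a single lookup such as objects["_demo.kind"], instead of one scan per category table. Every attribute is returned as a list of strings here, a simplification of the "string or an array of strings" behaviour described above.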

The CIFTABLE Class Library

The CIFTable package contains several files that come together in the SSTable class, a searchable string table. The SSTable class has a number of characteristics that make it useful for a wide variety of applications:

A reference manual is available for the public and static interface to the SSTable class.