README.cyclops

Information for cyclops 2.1.4, 12 May 97
_________________________________________________________________

Before using this software, please read the NOTICE and please read
the IUCr Policy on the Use of the Crystallographic Information File
(CIF)
_________________________________________________________________

             *%%%%%*
          *%%%%%%%%%%%*
        *%%%%%/\|/\%%%%%*
       *%%%%%%|--2--|%%%%%%*   C Y C L O P S 2 ...... STAR data name checker
        *%%%%%\/|\/%%%%%*
          *%%%%%%%%%%%*        Version 2.1.4
             *%%%%%*           12 May 1997

CYCLOPS2 is a fortran program for checking STAR data names against data
-------- name dictionaries written in DDL-STAR format proposed by Tony
         Cook of ORAC Ltd., Leeds. Data names may be checked in any
         text file.

                       CYCLOPS Version 2
                              by

        Sydney R. Hall (syd@crystal.uwa.edu.au)
        Crystallography Centre
        University of Western Australia
        Nedlands 6009, AUSTRALIA

                              and

        Herbert J. Bernstein (yaya@bernstein-plus-sons.com)
        Bernstein + Sons
        5 Brewster Lane
        Bellport, NY 11713, U.S.A.

Version 2 handles DDL1 and DDL2 dictionaries, and uses CIFtbx2

The latest program source and information is available from:

Em: syd@crystal.uwa.edu.au       ,-_|\      Sydney R. Hall
    sendcif@crystal.uwa.edu.au  /     \     Crystallography Centre
Fx: +61 9 380 1118          --> *_,-._/     University of Western Australia
Ph: +61 9 380 2725                   v      Nedlands 6009, AUSTRALIA
_________________________________________________________________

This version of CYCLOPS is released with CIFtbx2. Before reading this
document, please read the CIFtbx2 README file. The basic instructions
for building CYCLOPS2 are included with those for building CIFtbx2.
Additional information needed for the installation of CYCLOPS appears
in the section INSTALLATION NOTES, below. If you are on a UNIX system,
the Makefile for CIFtbx2 can be used to build and test cyclops.

PROGRAM OPERATION

CYCLOPS reads the text containing data names from the standard input
device (normally device 5). One or more STAR data name dictionaries
(in DDL format) are opened (normally from unit 1).
A report on the data names is output to the standard output device
(normally device 6). Messages are output to the standard error device
(normally device 0). If an error is encountered in attempting to write
to the standard error device, further message output is diverted to
the standard output device.

NOTE: If no dictionaries are specified on the command line, an attempt
is made to open a file named "STARDICT", which is used either as a
dictionary or as the source of a list of names of dictionaries.

STARDICT may directly contain a dictionary as in version 1 of CYCLOPS,
or contain a list of file names of dictionaries, one per line. In the
latter case the list must begin with a line whose first 5 characters
are "#DICT".

The list of unreferenced data names is suppressed unless the line
"#VERBOSE" appears in STARDICT, or the command line argument "-v yes"
is used.

In a UNIX-like environment, the program is run as:

    cyclops [-i input_text] [-o validation_output] \
            [-d dictionary] [-p priority] [-c catck] \
            [-f command_file] [-v verbose] [-s short]

where:

    input_text defaults to $CYCLOPS_INPUT_TEXT or stdin
    validation_output defaults to $CYCLOPS_VALIDATION_OUT or stdout
    dictionary defaults to $CYCLOPS_CHECK_DICTIONARY
        (multiple dictionaries may be specified)
    input_text of "-" is stdin, validation_output of "-" is stdout
    -c has values of t or 1 or y vs. f or 0 or n,
        (default f, i.e. no category checking),
    -v has values of t or 1 or y vs. f or 0 or n,
        (default f, i.e. non-verbose),
    -s has values of t or 1 or y vs. f or 0 or n,
        (default f, i.e. not short),
        short restricts output to items not in dictionaries
    -p has values of first, final or nodup
        (default first for first dictionary has priority)
    a command file may contain additional arguments

1. INSTALLATION

Here is the recommended procedure for implementing and testing this
version of cyclops.

1.0. Before you try to install this version of cyclops2

*** ========================================================== ***
*** ========================================================== ***
*** ==>>> You must have ciftbx version 2.5.4 or greater  <<<== ***
*** ==>>> installed in a directory named ciftbx.src.     <<<== ***
*** ==>>> The scripts mkdecompln and rmdecompln, which   <<<== ***
*** ==>>> come with ciftbx, must be installed in the     <<<== ***
*** ==>>> top level directory and executable.            <<<== ***
*** ==>>> To test cyclops, you must have compressed      <<<== ***
*** ==>>> copies of cif_core.dic and cif_mm.dic installed<<<== ***
*** ==>>> in a directory named dictionaries.             <<<== ***
*** ========================================================== ***
*** ========================================================== ***

The directory structure within which you will work is

                    top level directory
                    -------------------
                             |
                             |
              ------------------------------
              |              |             |
         dictionaries    ciftbx.src   cyclops.src
         ------------    ----------   -----------

You may have acquired this package in one of several forms. The most
likely are as a "C-shell Archive," a "Shell Archive", or as separate
files. The idea is to get to separate files, all in the same
directory, named cyclops.src, parallel to the directory ciftbx.src,
but let's start with the possibility that you got the package as one
big file, i.e. in one of the archive file formats. Place the archive
in the top level directory.

*** ========================================================== ***
*** ========================================================== ***
*** ==>>> The files in this kit will unpack into a       <<<== ***
*** ==>>> directory named cyclops.src.  It is a good idea<<<== ***
*** ==>>> to save the current contents of cyclops.src    <<<== ***
*** ==>>> and then to make the directory empty           <<<== ***
*** ========================================================== ***
*** ========================================================== ***

If you are on a machine which does not provide a unix-like shell, you
will need to take apart the archive by hand using a text editor.
We'll get to that in a moment.

1.1. ON A UNIX MACHINE

If you have the shell archive on a unix machine, follow the
instructions at the front of the archive, i.e. save the uncompressed
archive file as "file", then, if the archive is a "Shell Archive"
execute "sh file". If the archive is a "C-Shell Archive" execute
"csh file".

1.2. IF YOU DON'T HAVE UNIX

If sh or csh are not available, then it is best to start with the
"C-Shell Archive" and do the steps that follow. If you must use the
"Shell Archive" you should be aware that the lines you want to extract
have been prefixed with "X", while most of the lines you want to
discard have not. For a "C-Shell Archive" such prefixes are rare and
the file is easier to read.

Assume you have a "C-Shell Archive". Use your editor to separate the
different parts of the file into individual files in your workspace.
Each part starts with a lot of unixisms, then several blank lines and
then two lines which identify the file, and most importantly, contain
the text

    "CUT_HERE_CUT_HERE_CUT_HERE"

You can look at the line before and the line after to see if you are
at the head or tail of a file. Use your editor to search for the
"CUT_HERE" lines. Each part is carefully labeled and indicates the
recommended filename for the separated file. On some machines these
filenames may need to be altered to suit the OS or compiler.

1.3. MANIFEST

The partitions are as follows:

    part  filename                      description

      1   cyclops.src/README.cyclops    this file
      2   cyclops.src/MANIFEST          a list of files in the kit
      3   cyclops.src/Makefile          a control file for make to
                                        compile and test the code
      4   cyclops.src/cyclops.cmn       CYCLOPS common block
      5   cyclops.src/cyclops.f         CYCLOPS fortran source
      6   cyclops.src/cyclops_test.prt  CYCLOPS stdout test output
      7   cyclops.src/mtest.cyc         CYCLOPS STARCHEK output

2. COMPILING AND EXECUTING

Here are the recommended steps for a UNIX system. Vary this according
to the requirements of your OS and compiler.

You will find it simplest to work if you place the cyclops files
together in a common subdirectory named 'cyclops.src'. Be very careful
if you place them in a directory with other files, since some of the
build and test instructions remove or overwrite existing files,
especially with extensions such as '.o', '.lst', or '.diff'.

On a UNIX system, most of what you need to do to build and test
cyclops is laid out in Makefile.

*** Be sure to examine and edit this file appropriately before
    using it. ***

But, once properly edited, all you should need to do is 'make clean'
to remove old object files, 'make all' to build a new version of
'cyclops' and 'make tests' to test what you have built.

For non-UNIX-like environments, you will have to provide replacements
for iargc, getarg and getenv.
The following are reasonable possibilities:

C     Report a single command line argument
      integer function iargc(dummy)
      iargc=1
      return
      end
C     Supply '-vn' as that single argument
      subroutine getarg(narg,string)
      integer narg
      character*(*) string
      string=char(0)
      if (narg.eq.1) string='-vn'
      return
      end
C     Supply fixed file names in place of environment variables
      subroutine getenv(evar,string)
      character*(*) evar,string
      string=char(0)
      if(evar.eq."CYCLOPS_INPUT_TEXT")
     *  string='STARTEXT'//char(0)
      if(evar.eq."CYCLOPS_VALIDATION_OUT")
     *  string='STARCHEK'//char(0)
      if(evar.eq."CYCLOPS_CHECK_DICTIONARY")
     *  string=char(0)
      return
      end

This combination of substitute routines would "wire-in" CYCLOPS2 to
read its input text from a file named STARTEXT, write the validation
output to a file named STARCHEK, and check names against STARDICT,
the default dictionary or file listing dictionaries.

Note that the PARAMETER 'MAXBUF' should contain the maximum number of
characters contained on a single text line. The default value is 200.

If you don't wish to use the Makefile or can't, then here are the
essential steps to build cyclops:

(a) compile 'cyclops.f' [note that you will need the files
    'cyclops.cmn', 'ciftbx.cmn', 'ciftbx.cmv', 'ciftbx.cmf', and
    'ciftbx.sys' in the same directory as cyclops.f, or you will need
    to change the fortran includes.]

(b) if you have not already done so, compile 'ciftbx.f' and
    'hash_funcs.f' to create the object files 'ciftbx.o' and
    'hash_funcs.o'.

(c) link 'cyclops.o', 'ciftbx.o' and 'hash_funcs.o' together to make
    the program 'cyclops'

(d) if at all possible, please test cyclops by creating a file named
    'STARDICT' with the three lines

        #DICT
        cif_core.dic
        cif_mm.dic

    and place a copy of cif_core.dic and of cif_mm.dic into the same
    directory. Then execute the command

        ./cyclops < mtest.prt > STARCHEK 2> cyclops_test.lst

    Compare STARCHEK to 'mtest.cyc' and 'cyclops_test.lst' to
    'cyclops_test.prt'. There should not be any differences.

(e) if you have any problems with this process please report them to
    Herbert J. Bernstein [em: yaya@bernstein-plus-sons.com,
    ph: 1-516-286-1339, fax: 1-516-286-1999] for the changes from
    CYCLOPS to CYCLOPS2 in particular, or to Syd Hall
    [em: syd@crystal.uwa.edu.au fx: 61(9)3801118] for general ciftbx
    and CYCLOPS issues. If in doubt as to where your problem lies,
    send it to whichever one of us is more likely to be convenient to
    your time-zone, and we will try to sort things out for you.

FILES USED

    dictionary input      input on device 1
                          Optionally a list of dictionaries,
                          1 per line after a #DICT line
    validation output     output on device 6 ('stdout')
    Input text            input on device 1, if a file,
                          5 if 'stdin'
    Message device        output on device 0 ('stderr')
                          may be diverted to the validation
                          output file
    Direct access         in/out on device 3

SAMPLE REPORT

The following is a portion of CYCLOPS report output when run to check
the file mtest.lst from this kit against the dictionaries cif_core.dic
and cif_mm.dic:

 CYCLOPS Check List
 ------------------

 Dictionary data names  = 2244
 New data names in text =    4

 [1] Dictionary cif_core.dic 2.0.1 data names =  624
 [2] Dictionary cif_mm.dic 0.9.01 data names = 1620

 Data names NOT in Dictionary                    Line Numbers

 _blat1 . . . . . . . . . . . . . . . . . . .    9  11  94  96 197 199
                                               306 312 318 324 330
 _blat2 . . . . . . . . . . . . . . . . . . .   13  15  98 100 201 203
                                               303 309 315 321 327
 _dummy_test  . . . . . . . . . . . . . . . .    5   7  90  92 193 195
                                               217
 _rubish_here . . . . . . . . . . . . . . . .  447

 [1] Dictionary cif_core.dic 2.0.1
 [2] Dictionary cif_mm.dic 0.9.01                Line Numbers

 [2] _atom_site.calc_attached_atom  . . . . .  429
 [1]   = _atom_site_calc_attached_atom         428
 [2] _atom_site.calc_flag . . . . . . . . . .  426
 [1]   = _atom_site_calc_flag                  425
 [2] _atom_site.fract_x . . . . . . . . . . .   38  44  50 406
 [1]   = _atom_site_fract_x                    405
 [2] _atom_site.fract_y . . . . . . . . . . .   39  45  51 410
 [1]   = _atom_site_fract_y                    409
 [2] _atom_site.fract_z . . . . . . . . . . .   40  46  52 414
 [1]   = _atom_site_fract_z                    413

WARNING

There were major changes to CYCLOPS2 in the transition to version
2.1.0. In addition to major restructuring of the output, input and
output files are handled very differently. CYCLOPS2 now reads the
input text file from the standard input file, writes the report to the
standard output file, and writes messages to the standard error file.
The redirection to specific files in UNIX may be given either by
listing the source and target filenames on the command line or by
using "<" and ">". This makes it easier to use CYCLOPS2 as a "filter"
in UNIX-like environments, but invalidates jobs which depend on the
former CYCLOPS2 behavior inherited from CYCLOPS: writing the report to
a file named "STARCHEK" and messages to the standard output file. For
example, if, on a UNIX system, under sh, you had a command line of the
form:

    cyclops < text > summary

the new command line could be

    cyclops < text > STARCHEK 2> summary

CHANGES

CYCLOPS2 works very much the same way CYCLOPS version 1 works.
However, the dictionary processing, for good or ill, is now that used
by CIFtbx2. This means that some user-defined dictionaries which do
not follow CIF-style dictionary practices may produce warning
messages. We had to make this change to be able to find all the name
definitions in the newer DDL2-based dictionaries, since some name
definitions are embedded in loops quite remote from the defining tags.

The STARCHEK output has been reorganized to be more practical with
very large dictionaries. The list of data names not defined in any
dictionary now comes first. This is followed by the data names which
are both defined and referenced. Finally, the names which are defined
but not referenced are listed. The number of line numbers listed for
any name has been reduced from 99 to 19. If there are more than 19
lines on which a name is used, this is indicated after the 10th line
number by ".." with the first 10 and the last 9 line numbers being
explicitly given.

To avoid having to copy large dictionaries and to allow for multiple,
layered dictionaries, the file STARDICT may optionally contain a list
of paths to dictionaries to load. To distinguish this case from a
dictionary loaded directly from STARDICT, the first 5 characters of
STARDICT must be "#DICT" followed by a list of dictionary file names,
one per line. If the dictionaries are not in the same directory, give
full path names to the proper directories.

***** WARNING ***** CYCLOPS2 will overwrite an existing STARCHEK file
without warning messages.

Release 2.1.4 restored the missing code for the "-f command-file"
command line option. Examples were changed to use cif_core.dic and
cif_mm.dic. Several typos in the documentation were corrected. The
Makefile was updated to use compressed dictionaries. The new command
line option "-c catck" to control dictionary category checking was
added. The default is not to check.

Release 2.1.3 added the option "-p priority" to the command line, and
changed the handling of STARDICT to prevent opening of STARDICT if any
dictionaries are specified on the command line. The default list of
tags to be ignored was updated from the DDL0 list to the DDL1.4 list.
The scan for tags enclosed in parentheses, brackets or braces was made
more robust.

Release 2.1.2 added the option "#SHORT" to STARDICT and "-s short" to
the command line to allow for a short output of unreferenced names
only. A bug which prevented recognition of names of the form "*_name"
was corrected. Duplicate dictionary file names are now detected and
removed.

Release 2.1.1 made the necessary changes to adapt to the
reorganization of 'ciftbx.cmn' and 'ciftbx.sys' in ciftbx 2.5.1.
Release 2.1.0 reorganized the report, so that "[n]" is used to
identify the fact that a tag came from dictionary "n" in a
consolidated report for all dictionaries sorted alphabetically by
name, first giving the data names which were not defined in any
dictionary, then the data names which were defined and referenced,
and, optionally (depending on the command-line argument -v[y|n]), a
section reporting the data names defined in the dictionary, but not
referenced.

Release 2.0.4 used ciftbx 2.4.6 to report upper/lower case versions of
data item names.

Release 2.0.3 introduced support for aliases. The output is annotated
with the remarks "Referenced Alias: ..." or "Unreferenced Alias: ..."
listing the name of any referenced or unreferenced aliases to the name
being cited. An unreferenced alias is not given a further listing, but
a referenced alias will be listed with its own citations.

Release 2.0.2 fixed a problem with identification of data names within
quoted strings. In this release, the first blank after the initial
underscore terminates the search for the end of the token.

Release 2.0.1 differed from release 2.0 in having the format of the
REPORT output cleaned up for better column alignment and some missing
parameters on some of the cerr calls corrected.

SIMPLIFIED OPERATION

See PROGRAM OPERATION, above, for the full cyclops command line.
However, for routine use, an approach very similar to that followed
with CYCLOPS may still be followed.

This simple script file 'cyclops.sh' might be useful to execute
CYCLOPS on Unix machines.

*** ===>>> Note the change in line 12 from the same example in
           earlier versions of 'cyclops.sh' <<<=== ***

#! /bin/sh
DICT=/usr1/syd/cif
EXT=cif_core.dic
if [ $# -eq 2 ]; then
EXT=$2
fi
DICT=$DICT/$EXT
rm -f STARCHEK
rm -f STARDICT
echo "#DICT" > STARDICT
echo "$DICT" >> STARDICT
cyclops < "$1" > STARCHEK
rm STARDICT
vi STARCHEK

The first argument of the script is the text file to check. The
optional second argument is the name of a dictionary in the directory
named by DICT.

As a first test, after compiling and linking cyclops.f to create
cyclops, use the above script to check 'cif_core.dic' itself. Enter:

    cyclops.sh cif_core.dic

Note that the two "extra" data names detected in the dictionary arose
from an appendix and an example. Use the listed line numbers to check
this in the dictionary file.

If you have a CIF that you wish to validate against the IUCr core
dictionary, enter cyclops.sh followed by the name of the CIF.

If you have a CIF that you wish to validate against another dictionary
(say, cifdic.P92), enter cyclops.sh followed by the name of the CIF
and then cifdic.P92.
_________________________________________________________________

Updated 27 May 1997
yaya@bernstein-plus-sons.com

NOTICE

Creative endeavors depend on the lively exchange of ideas. There are
laws and customs which establish rights and responsibilities for
authors and the users of what authors create. This notice is not
intended to prevent you from using the software and documents in this
package, but to ensure that there are no misunderstandings about terms
and conditions of such use.

Please read the following notice carefully. If you do not understand
any portion of this notice, please seek appropriate professional legal
advice before making use of the software and documents included in
this software package. In addition to whatever other steps you may be
obliged to take to respect the intellectual property rights of the
various parties involved, if you do make use of the software and
documents in this package, please give credit where credit is due by
citing this package, its authors and the URL or other source from
which you obtained it, or equivalent primary references in the
literature with the same authors.

Some of the software and documents included within this software
package are the intellectual property of various parties, and
placement in this package does not in any way imply that any such
rights have in any way been waived or diminished.

With respect to any software or documents for which a copyright
exists, ALL RIGHTS ARE RESERVED TO THE OWNERS OF SUCH COPYRIGHT.
Even though the authors of the various documents and software found
here have made a good faith effort to ensure that the documents are
correct and that the software performs according to its documentation,
and we would greatly appreciate hearing of any problems you may
encounter, the programs and documents and any files created by the
programs are provided **AS IS** without any warranty as to
correctness, merchantability or fitness for any particular or general
use.

THE RESPONSIBILITY FOR ANY ADVERSE CONSEQUENCES FROM THE USE OF
PROGRAMS OR DOCUMENTS OR ANY FILE OR FILES CREATED BY USE OF THE
PROGRAMS OR DOCUMENTS LIES SOLELY WITH THE USERS OF THE PROGRAMS OR
DOCUMENTS OR FILE OR FILES AND NOT WITH AUTHORS OF THE PROGRAMS OR
DOCUMENTS.

THE IUCR POLICY ON THE USE OF THE CRYSTALLOGRAPHIC INFORMATION FILE
(CIF)

[This is a text printout of the IUCr web page at
http://www.iucr.ac.uk/iucr-top/ipr.html as of 26 January 1997.
Please consult the IUCr for current policy]

The Crystallographic Information File (Hall, Allen & Brown, 1991) is,
as of January 1992, the recommended method for submitting publications
to Acta Crystallographica Section C. The International Union of
Crystallography holds the Copyright on the CIF, and has applied for
Patents on the STAR File syntax which is the basis for the CIF format.

It is a principal objective of the IUCr to promote the use of CIF for
the exchange and storage of scientific data. The IUCr's sponsorship of
the CIF development was motivated by its responsibility to its
scientific journals, which set the standards in crystallographic
publishing. The IUCr intends that CIFs will be used increasingly for
electronic submission of manuscripts to these journals in future. The
IUCr recognises that, if the CIF and the STAR File are to be adopted
as a means for universal data exchange, the syntax of these files must
be strictly and uniformly adhered to. Even small deviations from the
syntax would ultimately cause the demise of the universal file
concept. Through its Copyrights and Patents the IUCr has taken the
steps needed to ensure strict conformance with this syntax.

The IUCr policy on the use of the CIF and STAR File processes is as
follows:
_________________________________________________________________

  * 1 CIFs and STAR Files may be generated, stored or transmitted,
      without permission or charge, provided their purpose is not
      specifically for profit or commercial gain, and provided that
      the published syntax is strictly adhered to.

  * 2 Computer software may be developed for use with CIFs or STAR
      files, without permission or charge, provided it is distributed
      in the public domain. This condition also applies to software
      for which a charge is made, provided that its primary function
      is for use with files that satisfy condition 1 and that it is
      distributed as a minor component of a larger package of
      software.

  * 3 Permission will be granted for the use of CIFs and STAR Files
      for specific commercial purposes (such as databases or network
      exchange processes), and for the distribution of commercial
      CIF/STAR software, on written application to the IUCr Executive
      Secretary, 2 Abbey Square, Chester CH1 2HU, England. The nature,
      terms and duration of the licences granted will be determined by
      the IUCr Executive and Finance Committees.
_________________________________________________________________

In summary, the IUCr wishes to promote the use of the STAR File
concepts as a standard universal data file. It will insist on strict
compliance with the published syntax for all applications. To assist
with this compliance, the IUCr provides public domain software for
checking the logical integrity of a CIF, and for validating the data
name definitions contained within a CIF. Detailed information on this
software, and the associated dictionaries, may be obtained from the
IUCr Office at 5 Abbey Square, Chester CH1 2HU, England.
_________________________________________________________________

IUCr Webmaster
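
As a supplement to the STARDICT description under PROGRAM OPERATION
and CHANGES, the list form of STARDICT (a first line beginning with
"#DICT", then one dictionary file name per line) could be generated
with a short shell function. This is only a sketch: the helper name
make_stardict and the temporary working directory are assumptions made
for illustration and are not part of the cyclops kit.

```shell
#!/bin/sh
# make_stardict (hypothetical helper): write a STARDICT list file.
#   $1      path of the STARDICT file to create
#   $2...   dictionary file names, written one per line
make_stardict() {
    out=$1
    shift
    # The list form must start with a line whose first 5 characters
    # are "#DICT"; otherwise STARDICT is read as a dictionary itself.
    echo "#DICT" > "$out"
    for dict in "$@"; do
        echo "$dict" >> "$out"
    done
}

# Work in a scratch directory so no existing STARDICT is overwritten.
workdir=$(mktemp -d)
make_stardict "$workdir/STARDICT" cif_core.dic cif_mm.dic
cat "$workdir/STARDICT"
```

The file printed by the last command has the same three lines as the
STARDICT used in test step (d). If the dictionaries are not in the
directory from which cyclops is run, full path names would be passed
to the function instead.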