TABLE OF CONTENTS

               GENERAL INFORMATION ..................... intro

               REFERENCING THE PHASES PACKAGE .......... 0.00

               GETTING STARTED ......................... 1.00
                 Accessing on line documentation ....... 1.01
                 Template scripts and files ............ 1.02
                 Flow Charts ........................... 1.03
                 File Formats .......................... 1.04

               PROGRAM WRITEUPS ........................ 2.00
                 Phasit ................................ 2.01
                 Bndry ................................. 2.02
                 Fsfour ................................ 2.03
                 Mapinv ................................ 2.04
                 Pamfile ............................... 2.05
                 Mapview ............................... 2.06
                 Gmap .................................. 2.07
                 Missng ................................ 2.08
                 Mrgdf ................................. 2.09
                 Mrgbdf ................................ 2.10
                 Rd31 .................................. 2.11
                 Mk31b ................................. 2.12
                 Psrch ................................. 2.13
                 Cmbiso ................................ 2.14
                 Cmbano ................................ 2.15
                 Topdel ................................ 2.16
                 Gref .................................. 2.17
                 Import ................................ 2.18
                 Extrmap ............................... 2.19
                 Extrmsk ............................... 2.20
                 Mapavg ................................ 2.21
                 Maporth ............................... 2.22
                 Lsqrot ................................ 2.23
                 Lsqrotgen ............................. 2.24
                 Skew .................................. 2.25
                 Bldcel ................................ 2.26
                 Mdlmsk ................................ 2.27
                 Mrgmsk ................................ 2.28
                 Trnmsk ................................ 2.29
                 Rdhead ................................ 2.30
                 Precess ............................... 2.31
                 O_to_SP ............................... 2.32
                 Xpl_phi ............................... 2.33
                 Pdb_cds ............................... 2.34
                 Rmheavy ............................... 2.35
                 Ctour ................................. 2.36
                 Viewplt ............................... 2.37
                 Plttek ................................ 2.38
                 Mkpost ................................ 2.39
                 Pstats ................................ 2.40
                 Hndchk ................................ 2.41
                 Sloext ................................ 2.42
 
               EXAMPLES ................................ 3.00
                 Pamfile ............................... 3.01
                 Initial phasing ....................... 3.02
                 Solvent levelling ..................... 3.03
                 Doall scripts ......................... 3.04
                 Expected output ....................... 3.05

               NATIVE, DIFFERENCE AND "CALCULATED"
               PATTERSON MAPS .......................... 4.00

               REFINING HEAVY ATOM PARAMETERS .......... 5.00

               HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE
               AND CROSS DIFFERENCE FOURIER MAPS ....... 6.00

               CREATING/EDITING SOLVENT MASKS .......... 7.00

               INCORPORATION OF PARTIAL STRUCTURES ..... 8.00

               REDUCED BIAS NATIVE, COMBINED AND
               DIFFERENCE FOURIER MAPS ................. 9.00

               INCORPORATION OF NONCRYSTALLOGRAPHIC
               SYMMETRY AVERAGING ..................... 10.00
                 Averaging with Multiple Crystals ..... 10.01
                 Averaging Difference or 2FO-FC Maps .. 10.02
                 Sample Input Files for Averaging ..... 10.03

               DENSITY MODIFICATION WITH MOLECULAR
               REPLACEMENT DERIVED PHASE INFORMATION .. 11.00

               PHASE EXTENSION ........................ 12.00
 
               MAD PHASING ............................ 13.00
 
               VMS USER INFORMATION ................... 14.00

               UNIX SHELL SCRIPTS ..................... 15.00

               VMS COMMAND PROCEDURES ................. 16.00

PHASES

PHASES is a package of computer programs designed to compute phase angles for diffraction data from macromolecular crystals. The package is complete in that it contains programs for the following: merging and scaling of native and derivative data sets; analyzing difference statistics; computing Patterson and electron density maps; searching for peaks; refining heavy atoms (or protein domains as rigid groups); computing phases by MIR (multiple isomorphous replacement), SIR (single isomorphous replacement), SAS (single wavelength anomalous scattering), SIRAS (single isomorphous replacement supplemented with anomalous scattering), MIRAS (multiple isomorphous replacement supplemented with anomalous scattering) or from atomic coordinates for an input model; noncrystallographic symmetry averaging; combining phases from a partial structure with MIR etc phases; computation and analysis of cross difference or Bijvoet difference Fourier maps; and for phase extension and refinement.

Once an initial set of phases is generated, programs are included to improve them by carrying out solvent levelling with negative density truncation and/or combination with model based phase information and/or averaging over noncrystallographic symmetry. Solvent levelling is facilitated by the automatic protein-solvent boundary determination method (Wang, in Methods in Enzymology 115, 1985) which is implemented here entirely in reciprocal space in a much more efficient manner than in previous programs. If applied to SIR or SAS starting phases, the programs can also carry out the ISIR or ISAS phasing procedures described by Wang.

The programs are written in FORTRAN 77 with the exception of a single C interface subroutine (facilitating use of X-Window graphics in some programs) and are applicable to any space group without requiring changes by the user. The program package was written by W. Furey, VA Medical Center and University of Pittsburgh, Dept. of Crystallography.
The package consists of 5 major programs and many utility programs as follows:

PROGRAM         FUNCTION

PHASIT.F        Computes MIR, SIR, MODEL etc phases from input atomic parameters and diffraction data. Can refine heavy atoms or derivative scaling parameters in "phase refinement" mode.
BNDRY.F         Computes coefficients for automatic boundary determination. Determines protein-solvent boundary mask, flattens solvent and applies negative density truncation, combines phases from external sources (map inversion or from partial structures) with original phase information, extends phases to higher resolution.
FSFOUR.F        Space group general 3D FFT program for electron density calculations.
MAPINV.F        Space group general 3D FFT program for structure factor calculations.
* MAPVIEW.F     Interactive contouring/map viewing program. Allows user to view maps and masks, and trace/edit solvent or averaging masks.
CTOUR.F         Creates contoured plots from FSFOUR maps, either as individual sections, mono or stereo projections. Plots can be viewed directly or converted to PostScript.
GMAP.F          Extracts region from a FSFOUR map and creates corresponding maps for the graphics programs TOM, O or CHAIN. Also can create skeleton files for TOM or O.
MISSNG.F        Selects reflections for phase extension.
MRGDF.F         Generates coefficients for isomorphous difference Fourier or cross Fourier.
MRGBDF.F        Generates coefficients for Bijvoet difference Fourier or cross Fourier.
RD31.F          Converts internal binary file to ASCII for examination and/or editing.
MK31B.F         Restores ASCII version of file to binary.
PSRCH.F         Searches Fourier map and lists unique peaks.
CMBISO.F        Combines native and derivative isomorphous replacement data into one file and scales the derivative data to the native.
CMBANO.F        Combines native data and derivative anomalous scattering data into one file and scales the derivative data to the native.
TOPDEL.F        Examines isomorphous/anomalous scattering differences, identifies and rejects outliers, prepares file for difference Pattersons.
GREF.F          Refines heavy atom parameters against isomorphous or anomalous scattering differences; refines protein domains, substructures etc as rigid groups against native data.
RMHEAVY.F       Temporarily removes density in map from heavy atoms, to aid in accurate solvent mask generation.
IMPORT.F        Allows user to introduce his own phases and Hendrickson-Lattman coefficients (computed by external programs) into the PHASES package for subsequent calculations. This allows one to bypass the PHASIT program.
XPL_PHI.F       Creates input reflection file for X-PLOR from a PHASES style phased file.
* PRECESS.F     Lets one construct and interactively examine "pseudo" precession or "pseudo" difference precession photos made from reflection files.
* VIEWPLT.F     Displays up to 10 plots created by CTOUR on workstation or X-Window capable monitor.
PLTTEK.F        Displays plots created by CTOUR on terminals capable of using TEKTRONIX 4010 emulation.
MKPOST          Converts plots created by CTOUR to PostScript.
PDB_CDS.F       Converts coordinate files from PDB to PHASES format, and vice versa.
EXTRMAP.F       Extracts a region (submap) from the standard FSFOUR map for use in averaging, skewing etc.
EXTRMSK.F       Extracts a region (submap) from the standard solvent mask for possible editing, skewing etc.
MAPAVG.F        Averages one or more maps to impose noncrystallographic symmetry.
MAPORTH.F       Orthogonalizes non-orthogonal map (and optionally mask) for use in refinement of noncrystallographic symmetry operator.
LSQROT.F        Refines purely rotational noncrystallographic symmetry operator against electron density.
LSQROTGEN.F     Refines general noncrystallographic symmetry operator (arbitrary rotational angle, with translation) against electron density.
SKEW.F          Skews a map (and optionally a mask) to a new, and arbitrarily oriented cell.
BLDCEL.F        Rebuilds a complete unit cell map (and optionally, mask) from an input asymmetric unit submap (and optionally, mask).
MDLMSK.F        Creates a mask from coordinates in an input atomic model, for use in averaging, NC symmetry operator refinement or use in solvent flattening.
MRGMSK.F        Merges multiple masks created by MDLMSK into a single mask.
TRNMSK.F        Transforms mask created in a "skewed" cell back to the normal cell.
HNDCHK.F        Interpolates density from a map at specified sites, usually for the purpose of determining the proper hand.
SLOEXT.F        Controls number of iterations and rate of phase extension to higher resolution.
RDHEAD.F        Dumps header from averaging map (submap) or mask files for examination.
O_TO_SP.F       Extracts spherical polar angles and axis location for use in PHASES from rotation matrix/translation vector produced by program "O".
PSTATS.F        Tabulates mean phase difference between two phase sets as a function of d spacing.

* Two versions are provided for these interactive graphical programs, one for Silicon Graphics hardware which uses the GL, and the other (called MAPVIEW_X, PRECESS_X, VIEWPLT_X) which can run on any system with a color monitor supporting the X-Window protocol (which also includes SGI).

Versions of the package are supplied for Silicon Graphics, Sun, IBM R6000, ESV and DEC ALPHA (both OSF and OPENVMS) workstations, but it should be easy to port to other machines. It generally will not be necessary to modify the programs as they are "self adjusting" and can accommodate various sized problems. However, in the unlikely event that modifications are necessary, messages indicating what should be changed will be printed (it will never involve more than 3 lines of code). Each program is independent and can be run in standalone mode, and individual write-ups are supplied for each. Usually however, a command file is submitted to invoke an entire sequence of program executions constituting a complete phasing application.
Template command files are given to carry out the entire phasing process for both UNIX and VMS based computers.

Comments and/or inquiries should be made to:

     Dr. William Furey
     Biocrystallography Laboratory
     PO BOX 12055
     VA Medical Center
     University Drive C
     Pittsburgh, Pa 15240
     tel (412) 683-9718
     E-Mail fureyw@VMS.CIS.PITT.EDU

The documentation is organized as follows. First a description of the necessary files is provided, followed by an overview of the phasing process. Flow charts are then given indicating how the programs can be used to carry out frequently needed tasks. Next come the individual program write-ups, which include descriptions of computations performed by the major programs, followed by sample input files both for routine phasing and for solvent flattening. Descriptions of procedures for refining heavy atoms, creating/editing solvent masks, incorporating partial structure information, noncrystallographic symmetry averaging, phase extension and MAD phasing are then provided. Finally, a listing of all template command procedures for both UNIX and VMS systems is given.
0.00 REFERENCING THE PHASES PACKAGE

When publishing results obtained from use of the software, a statement should be included like "all heavy atom refinement, phasing, solvent flattening, noncrystallographic symmetry averaging, map calculations etc. (or whatever is appropriate) were carried out with the PHASES package (Furey & Swaminathan, 1995)." This refers to the following paper:

     "PHASES-95: A Program Package for the Processing and Analysis of Diffraction Data from Macromolecules", W. Furey & S. Swaminathan, in MACROMOLECULAR CRYSTALLOGRAPHY, a volume of Methods in Enzymology, eds. C. Carter & R. Sweet, Academic Press, Orlando, Fl. (1996), in press.
1.00 GETTING STARTED

The first thing to do is to prepare an input parameter file specifying the cell constants, symmetry information etc. This file is referred to as the "standard parameter file" throughout the PHASES package, and is often called "PAMFIL" generically in specific program writeups. One should select a name for it which is indicative of the particular structure being worked on, and rapidly communicates to the user that it is a parameter file. For example, PDC.PAM might be a good choice for phasing pyruvate decarboxylase. The main purpose of this file is to ensure consistency in cell constants, symmetry, lattice type etc throughout all programs, and to eliminate redundant input of these parameters by the user. In addition one can optionally specify the name of a "running log file." If this is done then in addition to normal output to either the screen or individual log files for each program, a copy of all printed output is also appended to a single file, preceded by a time stamp indicating what program was run and when. Thus one can maintain a complete history of all computations and results in a single log file.

Each standard parameter file should contain the following information in the indicated sequence.

LOGFILE=FILENAME
     Where FILENAME is the name of the desired "running" log file. If no cumulative log is desired, enter LOGFILE=NULL. There must be no spaces immediately preceding or following the "=". Upper or lower case is permitted.

LATTICE=X
     Where "X" is either P, A, B, C, I, F or R. There must be no spaces immediately preceding or following the "=". Upper or lower case is permitted for the word LATTICE, but only UPPER case for the single character symbol.

A, B, C, ALPHA, BETA, GAMMA
     Unit cell constants, in angstroms and degrees. Readable in free format, i.e. at least one blank or comma separating entries.

NSYM
     Number of equivalent positions in the space group. Do NOT include additional translations associated with centering conditions for non-primitive lattices, i.e. for space group C2 NSYM=2. (This entry is read in free format.)

The NSYM symmetry operators follow, one operator per line EXACTLY as indicated in the International Tables for X-Ray Crystallography. The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral lattices the HEXAGONAL CELL AND SYMMETRY OPERATORS SHOULD BE USED, along with the lattice type R.

The following sample serves as a complete template for a parameter file, for space group P2(1)2(1)2(1):

     LOGFILE=seb.rlog
     LATTICE=P
     45.331 68.33 79.62 90. 90. 90.
     4
     X,Y,Z
     1/2-X,-Y,1/2+Z
     1/2+X,1/2-Y,-Z
     -X,1/2+Y,1/2-Z

Once a suitable parameter file is created, the phasing process can begin. One starts phasing by preparing one or more "scaled" or "merged" files containing x-ray diffraction data. The files will vary depending on whether isomorphous replacement or anomalous scattering data is to be used for phasing. Each file should be ASCII (read in free format) with all records containing the same type of information. Each record should contain

     H, K, L, FP, Sig(FP), FPH, Sig(FPH)
          for isomorphous replacement, or
     H, K, L, F+, Sig(F+), F-, Sig(F-)
          for native anomalous scattering, or
     H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
          for derivative anomalous scattering

where

     H, K, L    = Miller indices (integers).
     FP, FPH    = Native and derivative structure factor amplitudes.
     F+, F-     = Structure factor amplitudes for reflection. F+ corresponds to indices H, K, L, F- to -H, -K, -L.
     FPH+, FPH- = Derivative structure factor amplitudes. FPH+ corresponds to indices H, K, L, FPH- to -H, -K, -L.
     Sig(X)     = Estimated standard deviation for quantity X.

A separate file should be prepared for each derivative/anomalous scattering data set. For isomorphous replacement and derivative anomalous scattering data the FPH values should have already been properly scaled to the FP values.
If more than one data set is to be used for phasing, then ALL F VALUES SHOULD BE ON THE SAME SCALE. Indeed, for MIR phasing it is best to keep corresponding FP values IDENTICAL in each data set. The "scaled/merged" files are usually prepared by the programs CMBISO or CMBANO and are generally given filenames ending in ".scl", but they can also be generated externally by the user. It is always desirable however, to use the ".scl" ending as some of the programs in PHASES will deduce the file format from the ending of the filename. Once these files are prepared, they can be used to create difference Pattersons to identify heavy atom sites. A control file containing heavy atom parameters (either for the derivative, or anomalous scatterers) must then be prepared, and GREF or PHASIT can be run. If PHASIT is simply used to compute structure factors from a model, then the ".scl" files are not needed, but reflection and coordinate files must still be supplied. One can use the phase file output from PHASIT directly to compute an electron density map with FSFOUR, or the file can be used with programs BNDRY, FSFOUR and MAPINV to carry out solvent levelling, negative density truncation, phase extension and phase combination analogous to Wang's ISIR procedure. If the latter is selected, then the file output from PHASIT should be named "phasit.31" In general, programs CMBISO and/or CMBANO are used to prepare all reflection data files. Then TOPDEL is run to reject outliers and select data for difference Patterson calculations to be performed by FSFOUR. The Patterson map can be interactively contoured and examined in MAPVIEW, searched for peaks in PSRCH, or contoured to generate hard copies as PostScript files with CTOUR and MKPOST. Once heavy atom locations are identified, they can be refined by GREF or PHASIT. The heavy atom parameters and data are then used in PHASIT to compute SIR, MIR phases etc. 
Next MISSNG is run followed by the solvent flattening/negative density truncation/phase extension iterations carried out by BNDRY and invoked by the procedure DOALL. If more than one derivative is needed, or one wants to search for additional heavy atom sites, programs MRGDF and/or MRGBDF can then be used to create difference or cross difference coefficients, FSFOUR computes the map, and MAPVIEW, PSRCH or CTOUR are used to identify peaks again. One can also use the difference coefficients files produced by PHASIT to compute "double difference" type maps to search for minor sites. The new heavy atom parameters are then included in PHASIT, and the process is repeated. This procedure can be cycled over as many derivatives or data sets as needed. As a final step, it is often useful to hold the "solvent flattened" phases fixed in PHASIT, and refine the heavy atom parameters again. This final set of heavy atom parameters is then used to compute final MIR etc phases in PHASIT, which are then used to start a final round of solvent flattening. The final map resulting from these phases can be interactively contoured and examined in MAPVIEW, converted to graphics map format (e.g. for TOM, O or CHAIN) and skeletonized by GMAP, or hard copies can be prepared by CTOUR and MKPOST.
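The layout of the standard parameter file described earlier in this section can be checked mechanically before any program is run. The following is only a minimal sketch in Python; it is not part of the PHASES distribution, and the names ParamFile and parse_pamfile are hypothetical. It assumes the five items appear in exactly the order given above and applies the stated rules (no spaces around "=", upper case lattice symbol, first operator X,Y,Z).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParamFile:
    """Hypothetical container for the contents of a standard parameter file."""
    logfile: Optional[str]   # None when LOGFILE=NULL
    lattice: str             # one of P, A, B, C, I, F, R
    cell: List[float]        # A, B, C, ALPHA, BETA, GAMMA
    symops: List[str]        # NSYM operators, first one X,Y,Z


def parse_pamfile(text: str) -> ParamFile:
    """Parse parameter-file text laid out as described in section 1.00."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # LOGFILE=name (keyword in either case, NULL means no running log)
    key, _, value = lines[0].partition("=")
    if key.upper() != "LOGFILE" or not value:
        raise ValueError("first record must be LOGFILE=name with no spaces")
    logfile = None if value.upper() == "NULL" else value

    # LATTICE=X (keyword in either case, symbol must be upper case)
    key, _, value = lines[1].partition("=")
    if key.upper() != "LATTICE" or value not in set("PABCIFR"):
        raise ValueError("second record must be LATTICE=P, A, B, C, I, F or R")
    lattice = value

    # Six cell constants in free format (blanks or commas as separators)
    cell = [float(tok) for tok in lines[2].replace(",", " ").split()]
    if len(cell) != 6:
        raise ValueError("expected A, B, C, ALPHA, BETA, GAMMA")

    # NSYM followed by NSYM operators, one per line
    nsym = int(lines[3].replace(",", " ").split()[0])
    symops = lines[4:4 + nsym]
    if len(symops) != nsym:
        raise ValueError("fewer symmetry operators than NSYM")
    if symops[0].replace(" ", "").upper() != "X,Y,Z":
        raise ValueError("first operator should always be X,Y,Z")

    return ParamFile(logfile, lattice, cell, symops)


# The P2(1)2(1)2(1) template given in the text above.
SAMPLE = """LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
"""

params = parse_pamfile(SAMPLE)
print(params.lattice, params.cell, len(params.symops))
```

A check of this kind only confirms the layout; it does not verify that the operators actually match the intended space group.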
1.01 ACCESSING ON LINE DOCUMENTATION

The complete PHASES manual (what you are reading now) is maintained online in the file PHASES.WUP. This file generally resides in the top level of the PHASES directory, which initially is a subdirectory under "export" (on UNIX systems), but its location may vary depending on how one installs the software. On OpenVMS systems it can be accessed by referring to PHASES_DOC:PHASES.WUP (if one installs the software as described later). It is recommended that each user make a copy of the manual in his own working directory so it can be examined without fear of destroying the original.

The manual is a simple ASCII text file and can be examined in the editor of your choice. All program write-ups begin with the program name followed by a single space and then by the word "WRITE-UP" (all in uppercase), so that, for example, to get to the write-up for program FSFOUR one can simply enter an editor and search for "FSFOUR WRITE-UP" or just "FSFOUR W". This will position the editor at the appropriate place in the manual. Just be sure to exit the editor without making any changes. Indeed, it may be desirable to set the file protection so that it can be read but not written.
1.02 TEMPLATE SCRIPTS AND FILES

Included with the PHASES distribution are a series of sample control files (*.sh or *.com files) as well as sample input data (*.d or *.dat files). As initially distributed, these files reside in the top level of the PHASES directory (itself a subdirectory under "export" on UNIX systems, or in PHASES_TEMPL, if installed as suggested in the "VMS USER INFORMATION" section). The "*.sh" files are UNIX shell scripts to invoke one or more programs, while the "*.com" files accomplish the same tasks under OpenVMS. Similarly, the "*.d" and "*.dat" files are sample data inputs for programs under the UNIX and OpenVMS operating systems, respectively. Generally the "*.d" and "*.dat" files are identical.

It is suggested that each user copy these files to his working directory to serve as templates for new applications. This will minimize the possibility of typing errors, and also serve as an example for a particular calculation. Indeed, it may be desirable to open two windows, one editing the template file and the other positioned to examine the appropriate write-up as described in the preceding section.
1.03 FLOW CHARTS

        Native file        Derivative file
             .                    .
             v                    v
        ************************************
        *         CMBISO or CMBANO         *
        ************************************
                         .
                "Scaled/Merged" file
                         .
                         v
                 *****************
                 *    TOPDEL     *
                 *****************
                         .
                    "Patt" file
                         .
                         v
                 *****************
                 *    FSFOUR     *
                 *****************
                         .
                    "Map" file
                         .
          ...............................
          .              .              .
          v              v              v
     ***********    ***********    ***********
     *  PSRCH  *    * MAPVIEW *    *  CTOUR  *
     ***********    ***********    ***********

Path for initial processing of a derivative data set, includes merging and scaling native and derivative data, rejecting outliers, computing difference Patterson maps and examination.


     "Scaled/Merged" file(s)                       --> PHASIT
     PHASIT -- "Phased" file                       --> MRGDF or MRGBDF
     BNDRY  -- "Phased" file (after solvent
               flattening)                         --> MRGDF or MRGBDF
     "Scaled/Merged" file                          --> MRGDF or MRGBDF
     MRGDF or MRGBDF -- "Cross phase" file         --> FSFOUR
     PHASIT -- "difference file"                   --> FSFOUR
     FSFOUR -- "Map" file                          --> PSRCH, MAPVIEW or CTOUR

Paths for generating and examining "cross difference" Fourier, "cross Bijvoet difference" Fourier, "double difference" Fourier or "double Bijvoet difference" Fourier, started either by generating SIR, MIR etc phases, or using "solvent flattened" phases.


     PHASIT -- "Phased" file                       --> FSFOUR  (initial pass only)
     PHASIT -- "Phased" file                       --> BNDRY   ("anchor" phases, all passes)
     FSFOUR --> BNDRY --> MAPINV --> BNDRY --> FSFOUR          (iteration loop)
     Partial structure file                        --> BNDRY   (optional)
     Native file, PHASIT "Phased" file             --> MISSNG
     MISSNG -- "Extension" file                    --> BNDRY   (optional)

Path for solvent flattening process, as implemented in the "doall" procedure. Starts with SIR, MIR etc phases and includes mask generation, solvent flattening and phase combination iterations. The leftmost and rightmost branches are optional, for inclusion of partial structure information and phase extension, respectively. The FSFOUR-BNDRY-MAPINV loop performs the iterations. The PHASIT output is fed directly to FSFOUR only during the initial pass, to generate the first map. In all passes it is fed to BNDRY to serve as the "anchor" phases in the phase combination step.


     PHASIT                                        --> FSFOUR  (initial pass only)
     PHASIT                                        --> BNDRY   ("anchor" phases, all passes)
     FSFOUR --> EXTRMAP --> MAPAVG --> BLDCEL --> BNDRY
     BNDRY  --> MAPINV --> BNDRY --> FSFOUR                    (iteration loop)
     "Envelope" mask                               --> MAPAVG, BLDCEL
     "Extension" file                              --> BNDRY   (optional)

Path followed during solvent flattening iterations modified to include noncrystallographic symmetry averaging. The PHASIT output is fed directly to FSFOUR only during the initial pass, to generate the first map. In all passes it is fed to BNDRY to serve as the "anchor" phases in the phase combination step. The "extension" file is optional, and is used for phase extension only.
1.04 FILE FORMATS

Most of the programs in the PHASES package utilize the same internal file formats, chosen for combinations of simplicity and efficiency. The major files used are now described.

1) "Input" files.

Entering data initially into the package assumes one can prepare reflection files either in free format, as XENGEN-like "MULISTS", or as "SCALEPACK" style files. Thus input structure factor files can have any of the following record formats.

FREE FORMAT, i.e.

     h, k, l, F, sig(F)          (ASCII, read in free format)

The free format input file is generally assumed in the programs if the filename ends in ".DAT" or ".dat", and sometimes will be assumed if no other file type is deduced from the ending of the filename. The "free format" implies that the values in each record are separated by at least one blank space or a comma.

or XENGEN like "MULIST", i.e.

     h, k, l, res, F, sig(F), F+, sig(F+), F-, sig(F-), iflag

     in format ( 3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 )

The "iflag" status flag is optional. If present, it will be used to screen for viable anomalous scattering data. If absent, only values with F+ and F- greater than zero will be used when anomalous data is needed. MULIST format is generally assumed within the programs if the filename ends in ".MU" or ".mu".

or SCALEPACK style files, i.e. the file starts with a variable number of header records, the total number of which is given by one plus 2 times the number given in the first header record (format I5). After the header records (usually 3) the data follows as individual records containing

     h, k, l, I+, sig(I+), I-, sig(I-)

     in format ( 3I4, 4F8.0 )

Note that unlike the other formats, for SCALEPACK files INTENSITIES and their standard deviations are given instead of AMPLITUDES. Also, the files need not contain Bijvoet pair data as the last two items may be missing, as would be the case if the data were reduced treating Friedel mates as equivalent.
SCALEPACK format is generally assumed within the programs if the filename ends in ".SCA" or ".sca".

2) "Scaled" (and merged) structure factor files.

These files are produced by CMBISO or CMBANO, starting with input files of type (1). The files are ASCII, with each record containing

     h, k, l, FP, sig(FP), FPH, sig(FPH)
     or
     h, k, l, FP+, sig(FP+), FP-, sig(FP-)
     or
     h, k, l, FP, sig(FP), FPH+, sig(FPH+), FPH-, sig(FPH-)

     in format ( 3I4, 6F10.2 )

for either isomorphous, native anomalous or derivative anomalous data sets, respectively. SCALED files are generally assumed within the programs if the filename ends in ".SCL" or ".scl".

3) "Phased" structure factor files.

These files are produced mainly by PHASIT and BNDRY, but can also be generated by other programs. There are two types of "phased" files, depending on whether or not probability distributions are available. Both types of files are BINARY, but can be converted to ASCII by the utility program RD31. The first type, the normal or "long" format, has records containing

     h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM

where h, k, l, A_B, C_D and MK are INTEGERS, and the others REALS. The Hendrickson-Lattman probability distribution coefficients are packed two per word, in the A_B and C_D entries according to

     A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384
     C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384

FO is the observed protein structure factor amplitude, PHIbest the "best" (centroid) phase in degrees, and FOM the associated figure of merit. MK is the restricted phase indicator, such that if MK=1 there are no restrictions. If MK > 1, then the reflection is centric, with one of the allowed phases given by 15*(MK-1), and the other 180 degrees away from it.

There is also an alternate version of the "long format" phase file, obtained only from running option 3 of BNDRY with IOTYP=1, which has FO and FC replacing FOM*FO and FO in the records.
This file type is used ONLY if one wants to do solvent flattening and/or NC symmetry averaging iterations on DIFFERENCE or 2FO-FC MAPS. Its usage is explained elsewhere in the documentation.

The second type, or "short" format, has records containing

     h, k, l, FO, FC, PHI

where FC is the "calculated" structure factor amplitude, as typically computed from input coordinates for a model in PHASIT, GREF, or output from the map inversion program MAPINV. Note that the Fourier program FSFOUR only reads the first six entries in a record, so that in general, EITHER type of "phased" file can be used for map calculations. However, some map types might be accessible with only one of the formats (e.g. difference maps). Both long and short format PHASED files are generally recognized within the programs if the filename ends in ".31".

4) "Mask" files.

These files are binary, with the same record format applying both to "solvent masks" and "averaging masks." The file starts with a header record containing

     A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX

with the first 6 values REAL*4, the next 9 INTEGER*4, the lengths in Angstroms and the angles in degrees.

     NX, NY, NZ  = Number of grid points defining one "cell length" along the respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ.

     IXMN, IXMX,
     IYMN, IYMX,
     IZMN, IZMX  = Minimum, maximum grid index defining map region such that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.

The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 BYTE values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code:

           DO 30 IY=IYMN,IYMX
           DO 20 IZ=IZMN,IZMX
        20 WRITE(LU)(MSK(IX,IY,IZ),IX=IXMN,IXMX)
        30 CONTINUE

Note that the mask entries are FORTRAN type BYTE (INTEGER*1). For solvent masks, the entries will either be 0 (protein) or 2 (solvent).
For averaging masks only the values 0, 10, 20, 30, 40 etc are meaningful as they indicate the grid point is inside the primary envelope for molecules 1, 2, etc. The masks can be displayed with program MAPVIEW, and program RDHEAD can be used to list the header record. 5) "FSFOUR" maps. These maps are produced by FSFOUR (and BLDCEL). They are binary, and contain a variable number of header records followed by the map. The map ALWAYS covers one full cell. See FSFOUR write-up (and possibly examine the program source) for further details. 6) "Submaps" Also referred to as "averaging" maps. These map files are binary, with the same header and record structure as "mask" files, except that the density values are written as FORTRAN type REAL instead of mask values. They are usually prepared by MAPVIEW or EXTRMAP, but can be generated by MAPORTH, SKEW, TRNMSK etc. Note that RDHEAD can also be used to list the header record. 7) "Extension" file. Used for phase extension, and created by program MISSNG. This file is ASCII, and contains a list of reflection indices, Fobs and phase probability distribution coefficients, for reflections absent on the main "phased" file, but for which native amplitudes and possibly phase probability distribution coefficients are available. It is used only for phase extension. The records simply contain h, k, l, Fobs, A_B, C_D in format ( 3I4, F10.2, 2I12 ) where the distribution coefficients are packed as in a normal phased file. If no distribution coefficients are available the A_B and C_D values are zero.
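The packing rule quoted above for the A_B and C_D words (in "long" phased files and in extension files) can be illustrated with a short Python sketch. The function names and the range check are assumptions of this example and are not part of the PHASES package. Python's int() truncates toward zero in the same way as IFIX, but the Fortran programs presumably work in single precision, so results for values that are not exactly representable may differ in the last digit.

```python
SHIFT = 16384   # offset added to each scaled coefficient (from the packing rule)
RADIX = 32768   # multiplier separating the two packed values (2**15)


def pack_pair(first, second):
    """Pack two coefficients (A and B, or C and D) into one integer word."""
    hi = int(first * 100) + SHIFT    # int() truncates toward zero, like IFIX
    lo = int(second * 100) + SHIFT
    # Assumed range check: both halves must fit in 15 bits to unpack cleanly.
    if not (0 <= hi < RADIX and 0 <= lo < RADIX):
        raise ValueError("coefficient outside the packable range")
    return hi * RADIX + lo


def unpack_pair(word):
    """Recover the two coefficients (to 0.01 precision) from a packed word."""
    hi, lo = divmod(word, RADIX)
    return (hi - SHIFT) / 100.0, (lo - SHIFT) / 100.0


if __name__ == "__main__":
    word = pack_pair(1.5, -0.25)
    print(word, unpack_pair(word))
```

Under this rule a pair of zero coefficients packs to 536887296 rather than 0, so a stored word of 0 (as written to extension files when no distribution is available) can be told apart from a genuinely flat distribution and should be tested for before unpacking.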
2.00 PROGRAM WRITEUPS This section includes writeups for each individual program. For the major programs (PHASIT, BNDRY etc), in addition to describing the required input data, a complete description of how the program works is included. Often, suggested strategies are provided as well. First time users should read at least the PHASIT, BNDRY, FSFOUR, MAPINV and MAPVIEW writeups completely.
2.01 PHASIT WRITE-UP

PHASIT can be run in one of two modes, protein phasing mode or structure
factor calculation mode. Some of the input data is common to both modes, but
other data is needed only for the particular mode invoked. First, the data
that is always needed is described.

INPUT DATA (UNIT 5)

CARD 1 - PAMFIL (free format)

     PAMFIL = name of parameter file containing cell and symmetry
              information.

CARD 2 - MODE, NXSCAT (free format)

     MODE   = 0 for protein phase calculations.
            = 1 for structure factor calculations.
     NXSCAT = number of additional atomic types for which scattering
              factors will be input. Note that 20 types are already
              stored in the program (see below), thus this is usually
              nonzero only for exotic atoms or wavelengths other than
              CU K alpha.

The following block of cards should be included only if NXSCAT > 0.
Up to 5 additional atomic types may be input. For each additional atomic
type, include the following 3 records:

     REC 1  (A(J),J=1,4) (free format)

            A(J) = Coefficients for analytical approximation to
                   scattering factors, as in Int. Tables, Vol IV,
                   pages 99-101.

     REC 2  (B(J),J=1,4), C (free format)

            B(J), C = Coefficients for analytical approximation to
                      scattering factors, as in Int. Tables, Vol IV,
                      pages 99-101.

     REC 3  DEL f', DEL f'' (free format)

            DEL f'  = real part of anomalous scattering correction term.
            DEL f'' = imaginary part of anomalous scattering correction
                      term.

The appropriate remaining data should be supplied only for the mode
selected.

**** additional input for protein phasing mode (MODE=0) ****

CARD 3 + 3*NXSCAT - NSETS, NOREF, N (free format)

     NSETS = number of data sets (derivatives) to use in phasing
             (max = 30)
     NOREF = 0 for protein phase calculation only.
           = 1 for protein phase calculation plus "phase refinement" of
               derivative parameters.
     N     = minimum number of contributing data sets for the phase of
             an acentric reflection to be output.
CARD 4 +3*NXSCAT - OUTREF (free format) OUTREF = Name of output reflection file to contain the final protein phases. The following block of cards 1-6, must then be repeated for each data set 1) TITLE = anything (free format) 2) FILEIN = input merged data filename (free format) 3) FILOUT = output difference Fourier filename (free format) 4) DCUT, SIGCUT, ISOFLG, SCLFPH, BOVFPH, SCLFH, ( EC(I),I=1,4 ) (free format) DCUT = minimum allowed d spacing. SIGCUT = minimum allowed F/sig value. ISOFLG = 0 for isomorphous replacement data. = 1 for native anomalous scattering data. = 2 for derivative anomalous scattering data. SCLFPH = scale factor multiplying FPH (obs) to scale it to FP (obs). Usually =1. unless refined in previous run. BOVFPH = overall thermal factor, applied to FPH (obs) to scale it to FP (obs). Applied as exp(BOVFPH*ssthol) * FPH. Usually = 0. unless refined in previous run. SCLFH = scale factor multiplying |FH|(calc) to scale it to the observed data. If unknown, input 0. and it will be computed. (EC(I),I=1,4) = coefficients for 3 term polynomial, used to generate "standard" E (lack of closure, based on intensity) values as function of |FP|., and the minimum allowed value of E. If unknown, input 0. for each and they will be computed. 5) NA = (number of heavy atoms/anomalous scatterers with known positions, free format) 6 etc) ATNAME, X, Y, Z, B, OCC, ITYPE FORMAT(7X,A8,5F10.5,I5) ATNAME = anything ITYPE = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20 for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4, I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3, respectively. ITYPE = 21 through 20+NXSCAT for the additional types, in the same order as originally input by the user. OCC = Occupancy factor X,Y,Z = Fractional atomic coordinates B = Thermal factor. Note that if B is > 0., then it is assumed to be an isotropic thermal factor. 
              If B is input as 0., then the temperature factor is assumed
              to be anisotropic with the B11, B22, B33, B12, B13, B23
              elements being supplied on the immediately following
              record. If B is < 0., then the temperature factor is
              assumed to be isotropic with magnitude = ABS(B), but it
              will be converted to anisotropic prior to use in the
              program.

The following record should be included ONLY if the supplied B value is less
than or equal to 0. for the preceding atom.

6a etc) B11, B22, B33, B12, B13, B23, BRES, SIG    FORMAT(8F10.5)

     B11, B22, B33,
     B12, B13, B23 = Components of anisotropic thermal factor tensor.
                     If B (previous record) is < 0., then these fields
                     are irrelevant as the program will compute them by
                     converting |B| to anisotropic.
     BRES = Possible target value for restraining the isotropic
            equivalent of the anisotropic temperature factor. If
            BRES > 0., then a restraint term of the form
            WT*(BRES-BEQ)**2 is included in the least squares equations.
     SIG  = Sigma for restraint term, used only if BRES is > 0.
            WT is 1/SIG**2. (Suggested value = 0.5)

Include cards 6 (and possibly 6a) for each of the NA atoms.

***** END OF INPUT, UNLESS HEAVY ATOM REFINEMENT WAS REQUESTED *****

If "phase refinement" was requested (NOREF=1), then include the following
cards.

CARD A) NPASS, FMCUT, NHVCYL, IWT, IEXC, NFIXP, MAXLIK (free format)

     NPASS  = # of times protein phases are to be recomputed, i.e. # of
              refinement passes (max=10). Protein phases are held fixed
              during each pass, and updated at the end of each pass.
     FMCUT  = Figure of merit cutoff. Reflections will not be used in
              phase refinement if the associated figure of merit is
              < FMCUT.
     NHVCYL = # of refinement cycles to be performed in each pass
              (max=50). Each cycle can refine heavy atom and/or scaling
              parameters for any data set.
     IWT    = 0 for refinement weights based on expected lack of closure.
            = 1 for refinement weights based on estimated accuracy of
                current protein phase.
            = 2 for unit weights.
IEXC = 0 to exclude contribution to protein phase distribution from each data set when parameters for that data set are being refined. = 1 to include contributions to protein phase distribution from all possible data sets during refinement. NFIXP = 0 for normal operation (uses protein phases based on current heavy atom data during refinement). = 1 to read in externally derived protein phases, and hold them fixed during heavy atom refinement. If NFIXP=1, then IEXC is reset 1, and IWT is reset to 0 if it was 1. MAXLIK = 0 for conventional parameter refinement. = 1 for "Maximum Likelihood" parameter refinement. **** The following card should be included ONLY if NFIXP=1 **** CARD A' FXDFIL (free format) FXDFIL = name of file containing the protein phases to be held fixed and used during refinement. The following card set B,C,D must then be repeated for each of the NHVCYL cycles requested. CARD B) IVSET (free format) IVSET = data set number (in order as originally input) of set for which derivative parameters are to be refined. CARDS C) (IVAR(J),J=1,5 or 10) (free format) Variable selection information IVAR(1) = 1 to refine x coordinate, 0 to hold fixed IVAR(2) = 1 to refine y coordinate, 0 to hold fixed IVAR(3) = 1 to refine z coordinate, 0 to hold fixed IVAR(4) = 1 to refine occupancy, 0 to hold fixed IVAR(5) = 1 to refine B (or B11), 0 to hold fixed IVAR(6) = 1 to refine B22, 0 to hold fixed IVAR(7) = 1 to refine B33, 0 to hold fixed IVAR(8) = 1 to refine B12, 0 to hold fixed IVAR(9) = 1 to refine B13, 0 to hold fixed IVAR(10)= 1 to refine B23, 0 to hold fixed Card C must be repeated for as many atoms as are in the specified data set. Each card refers to a single atom, in the same order as originally input. Note that IVAR(6-10) are appropriate only if the corresponding atom was input with (or converted to) an anisotropic temperature factor. 
CARD D) (IVSCL(I),I=1,3) (free format)

     IVSCL(1) = 1 to refine SCLFPH, 0 to hold fixed
     IVSCL(2) = 1 to refine BOVFPH, 0 to hold fixed
     IVSCL(3) = 1 to refine SCLFH, 0 to hold fixed

     Note! For native anomalous scattering data sets, IVSCL(1) and
     IVSCL(2) must be 0.

**** FILES ****

The input "scaled/merged" reflection files have already been described.

The output protein phase file OUTREF is binary and contains records with the
following:

     H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FOM

where

     H, K, L = Miller indices (integers)
     FMFO    = Figure of merit weighted structure factor amplitude
               (either FOM * FP or FOM * F+)
     FO      = Observed structure factor amplitude (either FP or F+)
     PHIBEST = Best (centroid) phase, in degrees.
     IPRAB,
     IPRCD   = Hendrickson-Lattman coefficients A,B,C,D for the phase
               probability distribution used, packed two per word as
               (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
               (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
     MK      = Restricted phase indicator. For general reflections MK=1,
               for centric reflections MK > 1 and one of the allowed
               phase values is (MK-1)*15 degrees (the other possibility
               is 180 degrees away).
     FOM     = Figure of merit associated with PHIBEST and used for
               weighting.

The output files "FILOUT" are "short form" phase files suitable for
computing difference Fouriers, double difference Fouriers, observed
difference Pattersons or "calculated" difference Pattersons for each data
set, via the MAPTYP=1,3,6,7 options, respectively, in FSFOUR. They can be
used to identify more heavy atom sites, to generate difference Pattersons or
to generate "calculated difference Pattersons" from the input heavy atom
model for comparison with the "observed difference Pattersons". These files
actually contain records with

     IH,IK,IL,FHobs,FHcalc,PHI_Hcalc
     IH,IK,IL,(FP+ - FP-)obs,(FP+ - FP-)calc,(PHI_PRO-90)
     IH,IK,IL,(FPH+ - FPH-)obs,(FPH+ - FPH-)calc,(PHI_PRO-90)

for isomorphous, native anomalous and derivative anomalous data sets,
respectively.
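How PHIBEST, FOM, MK and the Hendrickson-Lattman coefficients in an OUTREF record relate to one another can be illustrated with a short Python sketch. It assumes the usual Hendrickson-Lattman form P(phi) proportional to exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)). The function names and the 5 degree sampling step are illustrative choices of this example, and the code is not taken from PHASIT.

```python
import math


def allowed_phases(mk):
    """Return the two allowed phases (degrees) for a centric reflection,
    or None when MK = 1 (no restriction)."""
    if mk <= 1:
        return None
    first = (15.0 * (mk - 1)) % 360.0
    return first, (first + 180.0) % 360.0


def best_phase_and_fom(a, b, c, d, mk=1, step_deg=5.0):
    """Centroid ("best") phase in degrees and figure of merit.

    Acentric reflections are sampled around the phase circle; centric
    reflections are evaluated only at their two allowed phases.
    """
    phases = allowed_phases(mk)
    if phases is None:
        n = int(round(360.0 / step_deg))
        phases = [i * step_deg for i in range(n)]
    logs = []
    for p in phases:
        r = math.radians(p)
        logs.append(a * math.cos(r) + b * math.sin(r)
                    + c * math.cos(2.0 * r) + d * math.sin(2.0 * r))
    top = max(logs)                       # subtract the maximum to avoid overflow
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    x = sum(w * math.cos(math.radians(p)) for w, p in zip(weights, phases)) / total
    y = sum(w * math.sin(math.radians(p)) for w, p in zip(weights, phases)) / total
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)


if __name__ == "__main__":
    print(best_phase_and_fom(0.0, 2.0, 0.0, 0.0))        # unimodal, acentric
    print(best_phase_and_fom(0.0, 1.0, 0.0, 0.0, mk=7))  # centric, 90 or 270
```

For a bimodal distribution (for example only C nonzero) the centroid has a figure of merit near zero and the returned phase carries little meaning, which is the situation a single SIR data set produces for acentric reflections.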
If phase refinement is requested (NOREF=1) and protein phases are to be explicitly input (NFIXP=1), then an additional file FXDFIL with the same structure as the output phase file above must also be supplied to provide the protein phase information. If MAXLIK = 0 only the indices, PHIBEST and FOM will be used. If MAXLIK = 1 the Hendrickson-Lattman coefficients will also be used. In protein phasing mode the program expects to read in one or more "merged" data files, i.e. files with records containing H, K, L, FP, SFP, FD, SFD for isomorphous replacement data, H, K, L, F+, SF+, F-, SF- for native anomalous scattering data or H, K, L, FP, SFP, FPH+, SFPH+, FPH-, SFPH- for derivative anomalous scattering data. It is assumed that the native and derivative data has already been properly scaled together (via CMBISO or CMBANO). If more than one data set is input containing native F values (FP), corresponding FP values are assumed to be identical (on same scale) in each set, as would be the case if each derivative set was scaled to the same native set with CMBISO. It is not necessary for any given reflection to be present in all sets. If more than one data set is supplied, but a reflection is present in only one of them, then the resulting output phase for that reflection will correspond to an SIR (or SAS) calculation rather than MIR. One can however, request that acentric reflection phases be output only if N or more data sets contributed, where N is an input parameter. Thus an N value of 2 would insure that output phases are generated only for cases where the phase ambiguity has been resolved (in principle). For centric reflections there is no phase ambiguity, hence the N criterion is not applied. If only one data set is input, then N should be 1 to insure that all computed phases (either SIR or SAS) are output. NOTE!!!! If both NATIVE anomalous scattering and other types of data sets are input, THE NATIVE ANOMALOUS SCATTERING SETS SHOULD BE THE LAST ONES INPUT. 
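The N criterion described above can be sketched in Python as follows. As the PROGRAM STRUCTURE part of this write-up explains, Hendrickson-Lattman coefficients for the same reflection are added across data sets while a counter records how many sets contributed. The data layout used here (one dictionary per data set, keyed by the indices) and the function name are assumptions made for illustration only.

```python
def combine_data_sets(data_sets, n_min):
    """Sum Hendrickson-Lattman coefficients over data sets and apply N.

    data_sets: list of dicts mapping (h, k, l) -> ((A, B, C, D), MK)
    n_min:     the input parameter N
    Returns a dict of reflections that would be output: every centric
    reflection (MK > 1) and each acentric one with count >= n_min.
    """
    combined = {}
    for data_set in data_sets:
        for hkl, (hl, mk) in data_set.items():
            entry = combined.get(hkl)
            if entry is None:
                # First time these indices are seen: store and start counting.
                combined[hkl] = {"hl": list(hl), "mk": mk, "count": 1}
            else:
                # Seen before: add the coefficients and bump the counter.
                entry["hl"] = [s + t for s, t in zip(entry["hl"], hl)]
                entry["count"] += 1
    return {hkl: e for hkl, e in combined.items()
            if e["mk"] > 1 or e["count"] >= n_min}


if __name__ == "__main__":
    set1 = {(1, 2, 3): ((1.0, 0.5, 0.0, 0.0), 1),
            (2, 0, 0): ((0.5, 0.0, 0.0, 0.0), 1),
            (0, 0, 4): ((2.0, 0.0, 0.0, 0.0), 13)}
    set2 = {(1, 2, 3): ((0.5, 0.5, 0.25, 0.0), 1)}
    for hkl, entry in sorted(combine_data_sets([set1, set2], 2).items()):
        print(hkl, entry)
```

With N = 2 the acentric reflection present in only one set is dropped, while the centric reflection is kept regardless of how many sets contributed.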
If both anomalous and isomorphous data sets are input then the F and SIG
values for the anomalous data should be on the same scale as the isomorphous
data. This will happen automatically if CMBISO and CMBANO are used to
prepare the data files and the same native set was used as input. If NATIVE
anomalous scattering data is to be used IN ADDITION TO OTHER DATA TYPES,
then it is convenient to also run it through CMBANO to put it on the scale
of the other data, and then edit the output file to strip away the extra FP
and Sig(FP) fields. This is needed to conform to the file format for native
anomalous scattering sets, yet be properly scaled for consistency with the
other data sets. If only multiple anomalous scattering data sets are input,
then F values for all sets are assumed to be on the same scale, and the
heavy atom parameters should correspond to the same hand, and be consistent
with the input indices.

IT IS ASSUMED THAT WHEN MULTIPLE DATA SETS ARE INPUT, THE ORIGIN AND HAND
ARE CONSISTENT THROUGHOUT ALL DATA SETS.

**** additional input for SF calculation mode (MODE=1) ****

CARD 3 + 3*NXSCAT - INPREF (free format)

     INPREF = Name of file containing the input reflections for which
              structure factors will be computed.

CARD 4 + 3*NXSCAT - INPCDS (free format)

     INPCDS = Name of file containing the input atomic coordinates.

CARD 5 + 3*NXSCAT - OUTSF (free format)

     OUTSF = Name for output file containing the calculated structure
             factors.

CARD 6 + 3*NXSCAT - KRES,(KILRES(I),I=1,KRES) (free format)

     KRES = Number of residues to be omitted from structure factor
            calculation.
     (KILRES(I),I=1,KRES) = residue numbers for the KRES residues to be
            omitted.

CARD 7 + 3*NXSCAT - IMODE, IHLCF, ISIGA (free format)

     IMODE = 0 if atomic type to be derived from first character of atom
               name (see below)
           = 1 if atomic type explicitly input (see below)
     IHLCF = 0 "Short" Fourier output. File contains Fobs, Fcalc, phase.
           = 1 "Full" Fourier output.
File contains FM*Fobs, Fobs, phase, Hendrickson-Lattman coefs etc. NOTE! IHLCF is meaningful only when ISIGA is zero, as the nature of the output file is determined for ISIGA > 0 as described below. ISIGA = 0 If "full" file output is requested (IHLCF=1), Bricogne's modification of Sim's weights are to be used to construct the phase probability distributions. = 1 For "Full" file output but with distributions based on Sigma_A weights. = 2 For "short" file output appropriate for reduced bias difference maps based on sigma_A weighting (use Fo-FC option in FSFOUR). = 3 For "short" file output appropriate for reduced bias native maps based on sigma_A weighting (use 2FO-FC option in FSFOUR). **** FILES **** INPREF - Input structure factor file. Several types of files can be used here, and the type of file is deduced from the last part of the filename. Allowed file types include binary (31 type files, either long format or short format), any of the "merged" files, "MULISTS", SCALEPACK style files or files in free format. If the filename ends with ".31", then a binary style "phased" file is assumed, which can be the output from a previous PHASIT or BNDRY run. Either long or short format files can be used, and the program will figure out which type was input and pick up the indices and Fobs values appropriately. The records thus would contain either h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM (long format) or h, k, l, FO, FC, PHI (short format) Note that previous files output from PHASIT, structure factor mode with ISIGA > 1 or output in "phasing mode" as a "difference coefficient file" are NOT appropriate as they do NOT contain FO explicitly. Similarly, long format files output from BNDRY with IOTYP=1 are not appropriate as they do not contain FO in the second amplitude slot. If the file name ends with ".MU" or ".mu", then it is assumed to be an ASCII "MULIST" i.e. a file generated by program MAKEMU (in the XENGEN system) or by program FBSCALE. 
In that case each record is assumed to contain H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). Only the indices and F values will be used. If the filename ends with ".SCA" or ".sca", then an ASCII SCALEPACK file is assumed. After a variable number of header records (see the FILE FORMATS section), reflection records follow and contain H, K, L, I+, sig(I+), I-, sig(I-) in format (3I4, 4F8.1) Note the use of intensities rather than F's. The last two items in each record may be omitted. If present, they would be used only if I+ was not measured. If the filename ends with anything other than ".31", ".MU", ".mu", ".SCA" or ".sca", the file is assumed to be ASCII and is read in free format. The records are assumed to contain H, K, L, FO where H, K, L = Miller indices (integers) FO = Observed structure factor amplitude Note that this is appropriate for any of the "scaled and merged" files output by CMBISO or CMBANO, and generic files as well. INPCDS - Input atomic coordinate file, ASCII with format ( 1X, A1, 5X, A1, I3, A4, 5F10.5, I5). Each record should contain CHN, RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP where CHN = single character chain identifier (not used) RT = single letter amino acid code (not used) IRES = sequence number (used only if rejecting residues) ATOM = atom name (used only if IMODE=0) X,Y,Z = fractional atomic coordinates B = Isotropic thermal factor OCC = Occupancy factor ITYP = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20 for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4, I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3, respectively. ITYP = 21 through 20+NXSCAT for the additional types, in the same order as originally input by the user. Note that if IMODE=0, then atomic types are derived from the first character of the atom name, but only C,N,O,S or Fe will be recognized. 
Include one record of this type for each atom.

OUTSF - The output structure factor file differs, depending on the values of
IHLCF and ISIGA.

If ISIGA = 0 and IHLCF = 0, the file is binary with each record containing

     H, K, L, FO, FC, PHIcalc

where

     H, K, L = Miller indices (integers)
     FO      = Observed structure factor amplitude
     FC      = Calculated structure factor amplitude (scaled to input
               set)
     PHIcalc = Calculated phase angle in degrees.

If ISIGA = 1 (or ISIGA = 0 AND IHLCF = 1) the file is binary with each
record containing

     H, K, L, FMFO, FO, PHIcalc, IPRAB, IPRCD, MK, FOM

where

     H, K, L = Miller indices (integers)
     FMFO    = Figure of merit weighted structure factor amplitude
               FOM * FO
     FO      = Observed structure factor amplitude FO
     PHIcalc = Calculated phase, in degrees.
     IPRAB,
     IPRCD   = Hendrickson-Lattman coefficients A,B,C,D for phase
               probability distribution centered on calculated phase,
               packed two per word as
               (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
               (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
     MK      = Restricted phase indicator. For general reflections MK=1,
               for centric reflections MK > 1 and one of the allowed
               phase values is (MK-1)*15 degrees (the other possibility
               is 180 degrees away).
     FOM     = Figure of merit associated with PHIcalc and used for
               weighting.

Note that this record structure is identical to that produced in protein
phasing mode, although the probability distributions will all be unimodal.

If ISIGA = 2 the file is binary with each record containing

     H, K, L, FOM*FO, D*FC, PHIcalc

with the parameters as previously described, and D is as defined in Read's
Sigma_A procedure. This file is appropriate for reduced bias DIFFERENCE
maps, and should be used in FSFOUR with the FO-FC option.

If ISIGA = 3 the file is binary with each record containing

     H, K, L, FOM*FO, D*FC, PHIcalc       for acentric reflections and
     H, K, L, FOM*FO/2, 0., PHIcalc       for centric reflections

with the parameters as previously described, and D is as defined in Read's
Sigma_A procedure.
This file is appropriate for reduced bias NATIVE maps, and should be used in FSFOUR with the 2FO-FC option. In structure factor calculation mode, a set of reflection indices and observed F values are read in from one file (which can be the output file generated from a previous run of PHASIT or BNDRY). Atomic coordinates, occupancies and thermal parameters are read in from another file. Structure factors are then computed for all input reflections, and a binary output file is written. Records in the binary file differ depending on which options (IHLCF and ISIGA parameters) were selected. In one case a "short" form of the phase file is written, generally containing Fobs, Fcalc and the phase. The output structure factor file then is identical (in structure) to that produced by MAPINV, thus it can be used in option 3 of the BNDRY program to combine phase information from the partial (or complete, but tentative) structure with other phase information. If combined with an output from PHASIT (protein phasing mode), then SIR, MIR etc phases can be combined with those from the model structure. If combined with an output from BNDRY, then partial structure phases can be combined with MIR, etc phases AFTER density modification. The file can also be used directly to compute electron density, difference density, "residue deleted" maps etc., based on phases and amplitudes computed from the input model. Provisions are available to omit various residues from the structure factor calculation, thereby facilitating use of the file for computation of "residue deleted" electron density maps. If other options are selected, after calculating the structure factors and scaling them to the observed data Hendrickson-Lattman coefficients are also computed, based either on Bricogne's modification of Sim's weighting scheme or on Read's Sigma_A procedure. The output file then can contain FM*Fobs, Fobs, Phi, HL coefficients, restricted phase indicator and figure of merit. 
In that case the output file structure is identical to that produced by BNDRY, or by PHASIT in protein phasing mode. The file can then also be used to compute Fourier maps, but conventional DIFFERENCE Fouriers can NOT be computed since the Fcalcs are not present on the file. It can however, then be used as the "anchored" phases to which other phase information can be "tethered", i.e. replace the MIR phases. It can also be input to MISSNG, so that phase extension can be tethered to the partial structure phases in subsequent density modification cycles. By invoking other options the file can contain coefficients appropriate for "reduced bias" native or difference maps, based on Randy Read's Sigma_A procedure. **** PHASIT PROGRAM STRUCTURE **** In protein phasing mode the following events take place. For each data set the program will do the following: 1) Read in all reflections and reject those which fail to pass the supplied d and F/SIG cutoff information. 2) The indices of each accepted reflection are transformed (if needed) to correspond to a "standard" asymmetric unit, systematic absences are rejected, and phase restrictions are identified for centric reflections. If the data set contains anomalous scattering data centric reflections are rejected. All other reflections are stored. 3) Heavy atom parameters are read in and structure factors are computed based on the heavy atom positions, using the appropriate scattering factors for isomorphous or anomalous scattering data, respectively. 4) A suitable number of reflections are chosen from which difference magnitudes ABS(FP - FPH), ABS(F+ - F-) or ABS(FPH+ - FPH-) are used to scale the heavy atom structure factors. For isomorphous replacement data all reliable centric reflections are used, if any are present. If there is an insufficient number of centric reflections, the selected list is augmented by the 25% largest differences for acentric data. 
For anomalous scattering data, the 25% largest differences ABS(F+ - F-) etc are used. If the user input a scale factor, then it is used instead of the computed value. R factors are then reported after scaling the heavy atom structure factors. 5) The data is grouped into ranges based on the magnitude of FO or (F+ + F-)/2, and rms E values (lack of closure) are computed for each range. All centric data (possibly augmented with acentric data as described above) are used to determine E values in the isomorphous replacement case. In the anomalous scattering case only the 25% strongest differences are used. For centric isomorphous replacement data the input sig(FP) and sig(FPH) values are used to remove from the E values the components arising from measurement error, and the remaining lack of closure value is halved. The components due to measurement error are then added back. This enables the E values determined from centric data to be applicable to the acentric data. A three term polynomial is then fit by least squares to the rms E values as functions of FO or (F+ + F-)/2. If the user input the polynomial coefficients, then this step is bypassed. 6) From the scaled heavy atom structure factors, input amplitudes and computed E values, Hendrickson-Lattman coefficients are computed to represent the SIR (or SAS) phase probability distributions. For the centric isomorphous replacement data the E values are first adjusted to "undo" the downscaling making them appropriate for acentric data. 7) SIR (or SAS) phases are then computed by integrating over the distributions to yield "best" phases and the associated figure of merit. Figure of merit statistics are then output, along with an estimate of the "phasing power" ( FH(calc)/E or 2.*FH"(calc)/E ) as a function of resolution. Note that for the purpose of phasing power calculations E values are based on amplitude differences, whereas for the actual probability distributions E values are based on intensity differences. 
8) The indices, observed and calculated amplitudes, input standard deviations, Hendrickson-Lattman coefficients, calculated phase components and restricted phase indicators are output to a scratch file. After repeating the procedures 1-8 for each data set, phase information from all sets is combined as follows: 9) The scratch files are rewound and read. The first time unique indices are encountered, they are stored along with FP (or F+), the restricted phase indicator and the Hendrickson-Lattman coefficients. A counter is also saved to keep track of the number of data sets (probability distributions) contributing to each reflection. If the same reflection is encountered again, the Hendrickson-Lattman coefficients are added to those already saved and the counter is incremented. 10) For each unique reflection, the cumulative Hendrickson-Lattman coefficients are used to generate the combined phase probability distribution. The distribution is then integrated to yield the "best" (centroid) phase and associated figure of merit. The computed phase is then saved, and the number of contributing data sets and restricted phase indicator are examined. If the reflection is acentric, the number of data sets contributing to that particular distribution is compared to N (input value) to decide whether or not to output the reflection. 11) The indices, figure of merit weighted FP (or F+), FP (or F+), best phase, Hendrickson-Lattman coefficients (for combined distribution), restricted phase indicator and figure of merit are then output for each centric reflection and for those acentric reflections passing the "N" criteria. Figure of merit statistics are then output for the final phase set. A "difference Fourier coefficient" file is also written for each data set enabling one to search for additional sites, or to compare Pattersons "calculated" from the input sites with the "observed" difference Pattersons. 
    Both difference maps (showing all heavy atom sites) and "double
    difference maps" (after subtracting out the input heavy atoms) can be
    computed with the same file, as can the "observed" and "calculated"
    difference Pattersons.

12) If more than one data set was input, the scratch files are then rewound
    and read again to recompute the "phasing power" and "bias" for each data
    set. This time however, the phasing power calculations are based on lack
    of closure values obtained using the new protein phases. In theory, for
    data sets containing only small errors, the phasing power for each data
    set should increase relative to its initial value if the multiple data
    sets are consistently resolving the phase ambiguity. Large decreases
    indicate an inconsistent derivative or lack of isomorphism beyond a
    given d spacing, and generally result from incorrect signs of many
    isomorphous or anomalous scattering differences. Usually there will be
    small decreases observed when more than 2 or 3 data sets are used. This
    means that some of the signs of delta F are inconsistent, which is
    unavoidable with experimental data. Also, the phasing power is
    essentially the "signal to noise ratio" for each data set, thus when it
    falls below 1.00 the data probably does more harm than good. A good
    policy is to truncate each data set at the resolution where the phasing
    power falls to about 1.00.

    The "mean relative error" M.R.E., defined as

         (1/N) * SUM ( e(phi)**2 / (2.*E**2) )

    where e(phi)**2 is the lack of closure, weighted over all possible
    protein phases for each reflection, is also output for each data set,
    and should be about 0.5 if the E's are properly determined. In addition,
    the mean phase "bias" toward heavy atom phases is listed both as a
    function of resolution, and overall for each data set. Since there
    should be no correlation between true protein and heavy atom phases, the
    mean bias should be 90 degrees for each data set.
    If it deviates significantly from 90 degrees, one (or possibly more
    correlated) data set(s) is/are likely to be dominating the phasing
    process, and biasing the results.

13) If more than one data set was input and derivative parameters are NOT
    being refined (NOREF=0), the program then starts a second cycle by
    updating the E value polynomial coefficients for each set as before, but
    this time using probability weighted averages over all possible protein
    phase values for each reflection. The updated E values are then used to
    recompute Hendrickson-Lattman coefficients for each set. New SIR or SAS
    phases are then computed and figure of merit statistics are listed for
    each set separately. The results are then written to new scratch files.
    Steps 9-12 are then repeated to produce and evaluate new combined
    distributions. Statistics are given as before, but this time the mean
    absolute phase shift (in degrees) from the previous cycle is output as
    well. Only the results of this final cycle will appear on the output
    phase and difference coefficients files. This recycling procedure
    generally improves results since phases are based on what are normally
    more accurate E values. This is especially true for the anomalous
    scattering data sets, since the original E's were estimated from a small
    subset of data based on crude (though reasonable) statistical arguments.
    The program then terminates.

14) If atomic or scaling parameters ARE being refined (NOREF=1), for each
    data set a check is made to determine whether E value polynomial
    coefficients have been updated yet for it (as for example, in a previous
    run). If not, new coefficients are determined as in step 13, and new SIR
    or SAS phases are computed based on them. If the E coefficients are
    updated for ANY set, then all sets are combined again to determine new
    protein phases and statistics as before.
Once updated polynomial coefficients are available for each set, and protein phase estimates have been obtained based on them, refinement of parameters then proceeds. The program loops over each set to be refined as follows: If externally derived protein phases are to be used (NFIXP=1), the indices, phases, FOM'S (and distribution coefficients if maximum likelihood refinement is requested) are read in and stored. Otherwise, protein phases and figures of merit are recomputed using contributions to the combined phase probability distributions from either ALL data sets, or from all EXCLUDING the set currently being refined, as indicated by the user supplied parameter IEXC. For the set being refined, heavy atom structure factors and derivatives are then computed, and FPH(calc) (or FPH+(calc), FPH-(calc)) and its derivatives with respect to the variable parameters are computed, using the selected protein phases. Contributions to the Cullis and Kraut R factors are then accumulated. If the current figure of merit exceeds the input cutoff, the derivatives are included in the buildup of least squares equations minimizing the weighted lack of closure with respect to the selected variable parameters. If MAXLIK=0 the quantity minimized is SUM [ W*(|FPH|(obs) - |FPH|(calc))**2] for isomorphous or SUM [ W*((|FPH+|(obs)-|FPH-|(obs)) - (|FPH+|(calc)-|FPH-|(calc)) )**2 ] for anomalous scattering data sets, respectively, where W is 1./E**2, 1./E'**2 (E' is the RMS E value (based on amplitudes) only for the contributing data sets), or unity as selected by the user via the parameter IWT. If MAXLIK=1, instead of computing |FPH|(calc) at the single value of phi(Protein)=phi(best), the equations above are modified to include contributions from all possible values of phi(Protein), with each suitably weighted by the probability associated with phi(Protein). 
Thus in the isomorphous case the quantity minimized becomes

      SUM [ W * SUM [ P(i) * (|FPH|(obs) - |FPH|(calc,i))**2 ] ]

where P(i) is the probability for the value of phi(Protein) used in the calculation of |FPH|(calc,i), and phi(Protein) is stepped over the phase circle in 5 degree increments. A similar expression is used in the anomalous case. The least squares equations are solved by matrix inversion, and the parameters are then updated. The following R factors are reported.

                 SUM | ||FPH|(obs) +/- |FP|(obs)| - |FH|(calc) |
      R Cullis = ----------------------------------------------
                       SUM | |FPH|(obs) +/- |FP|(obs)|

with the sum taken over all centric reflections.

                 SUM | |FPH|(obs) - |FPH|(calc) |
      R Kraut  = ---------------------------------
                          SUM |FPH|(obs)

with the sum taken over all acentric reflections (isomorphous case).

                 SUM ||FPH+|(obs)-|FPH+|(calc)| + ||FPH-|(obs)-|FPH-|(calc)|
      R Kraut  = -----------------------------------------------------------
                              SUM |FPH+|(obs) + |FPH-|(obs)

with the sum taken over all acentric reflections (anomalous case).

After NHVCYL refinement cycles, the heavy atom structure factors and R factors are recomputed based on the new parameters. Steps 6-12 are then repeated to generate new protein phases, and the E values are updated as in step 13. The whole process is repeated for each of the NPASS passes requested. After each pass, the mean absolute phase shift over all reflections is output. After the last pass, the protein phase and difference coefficients files are written, and a new file NEWPARAMS.INP is created, which is a copy of the original input deck except that the new heavy atom parameters, scale and E coefficients replace the original ones. This deck can be used for further refinement in a subsequent job. Note that within a pass, protein phases are held fixed (except for possible removal of contributions from the derivative being refined). They are updated only after the end of each pass, and even then, only if externally derived phases are NOT being used.
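The R factor definitions above can be sketched in a few lines of Python. This is only an illustration, not code from PHASIT: the function names, the list-based inputs and the explicit per-reflection sign argument for R Cullis (+1 or -1, standing in for the "+/-" choice, which the write-up does not spell out) are all assumptions made for the example.

```python
def r_cullis(fph_obs, fp_obs, fh_calc, signs):
    """R Cullis over centric reflections.

    signs[i] is +1 or -1 and selects |FPH| + |FP| or |FPH| - |FP|;
    how the sign is chosen per reflection is an assumption left to the caller.
    """
    num = 0.0
    den = 0.0
    for fph, fp, fh, s in zip(fph_obs, fp_obs, fh_calc, signs):
        diff = abs(fph + s * fp)      # | |FPH|(obs) +/- |FP|(obs) |
        num += abs(diff - fh)         # lack of closure for this reflection
        den += diff
    return num / den


def r_kraut_iso(fph_obs, fph_calc):
    """R Kraut over acentric reflections, isomorphous case."""
    num = sum(abs(o - c) for o, c in zip(fph_obs, fph_calc))
    return num / sum(fph_obs)


def r_kraut_ano(fplus_obs, fplus_calc, fminus_obs, fminus_calc):
    """R Kraut over acentric reflections, anomalous case."""
    num = sum(abs(po - pc) + abs(mo - mc)
              for po, pc, mo, mc in zip(fplus_obs, fplus_calc,
                                        fminus_obs, fminus_calc))
    den = sum(po + mo for po, mo in zip(fplus_obs, fminus_obs))
    return num / den
```

All amplitudes passed in are assumed to be on a common scale already; the sketch does no scaling, weighting or figure of merit filtering.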
***** NOTES ON PHASE REFINEMENT ***** During phase refinement, one generally excludes contributions to the protein phase probability distributions from the data set for which parameters are being refined (IEXC = 0). This is because the assumption is that the protein phases and heavy atom parameters are independent, which will not be true if the derivative contributed to the protein phases. Indeed, it may not be strictly true even if contributions to protein phases are omitted from the derivative, if it has heavy atom sites in common with another derivative that IS contributing. On the other hand, successful phase refinement of parameters depends on REASONABLY ACCURATE protein phases being available. This presents a problem when only a few derivatives are to be used. If protein phase contributions come from only one derivative (the one not being refined), then the protein phases are very poorly determined as they are actually SIR phases. Phase refinement then usually results in reduction of the FH scale factor and most occupancies. The end result is a degradation of most all statistical indicators, but little or no change in the figure of merit. In this case it may be desirable to ignore the correlation, and include all contributions to the protein phase (IEXC=1), which results in stable, although slow refinement. In that case the expected improvement is usually obtained, but the bias toward heavy atom phases may be slightly larger than desired. It is sometimes useful to do this even with 3 or more derivatives. Also, note that the R Cullis and R Kraut values are dependent on the current protein phases. Thus if contributions from the set being refined are excluded, these factors will generally increase as they do not reflect the final protein phases, but only the phases in use at the time they were computed. 
For this reason, it is always desirable to include all possible contributions (IEXC=1) at least in the last cycle, just to get the final Cullis and Kraut R factors which correspond to the MIR phases for publication purposes. The parameter shifts need not be used. It is often desirable to read in externally derived protein phases, and hold them fixed for use in heavy atom parameter refinement. This could be the case, for example, if the initial parameters are poorly determined, but a "solvent flattened" and/or "symmetry averaged" map looks reasonable. In that case, protein phases obtained from the map (and possibly combined with the original phases) might be better suited for parameter refinement than the original phases were. These "EXTERNAL" phases can be input and used during parameter refinement (NFIXP=1). In that case, the program still computes new protein phases after each refinement pass for the purpose of updating statistics, E values and final output, but the phases which were input are ALWAYS used UNCHANGED during every refinement cycle. The output phases however, will always correspond to those computed from the current heavy atom parameters, and can be used to start a new round of solvent flattening. IT IS STRONGLY SUGGESTED that one always do at least one round of refinement against solvent flattened phases in this manner, AND USE THE NEW PARAMETERS TO INITIATE A FINAL ROUND OF SOLVENT FLATTENING! An important aspect of phase refinement is that it enables refinement of the derivative to native scaling parameters. These parameters should initially be 1. and 0. for SCLFPH and BOVFPH, as CMBISO or CMBANO has equated the scattering from native and derivative data sets. While this is adequate for initial heavy atom determination, it can not be strictly correct as the presence of the additional heavy atoms MUST increase the scattering for the derivative crystal relative to that from the native. 
Thus refinement of the FPH scale factor should increase it to slightly more than unity, the exact value being limited by the composition of the native and derivative crystals. If the FPH scale factor falls below unity, it can not correspond to reality. There is however, no restriction on the BPH scale factor (which is actually a delta B, between the native and derivative data sets). Since the data sets have already been "thermally" scaled (in CMBISO or CMBANO), refinement of BPH generally results only in small shifts, which can be positive or negative. Also, note that all changes in the derivative scaling parameters are TEMPORARILY applied internally in the program. The input "merged" data files for each set are NOT modified in any manner, and still correspond to the scaling applied in CMBISO or CMBANO. The cross-phase Fourier coefficient generating programs MRGDF and MRGBDF can apply the additional scaling parameters, if desired, for the purpose of generating difference or cross difference Fouriers which reflect the new scaling parameters. Also, note that in principle one can refine both the derivative FPH and FH scale factors simultaneously, but since they are correlated, in practice this sometimes leads to poor results. This is particularly the case with derivative anomalous scattering data. In that case, it may be best to refine only one of these two parameters in any given cycle, and alternate refinement of them between cycles. Refinement of the native-derivative scale factors works best when initiated against FIXED EXTERNAL PHASES (e.g. solvent flattened and/or NC symmetry averaged).

For maximum likelihood phase refinement one has considerably more flexibility in the weights and in the figure of merit cutoff. Since the contributions will be weighted by their probabilities anyway, one can greatly reduce the figure of merit cutoff, perhaps even to include all reflections.
It might also be useful to then refine with the "exterior" weights unity (IWT=2) so that the probabilities will be the only weights applied. During maximum likelihood refinement there is no need to exclude contributions from the derivative being refined. Note that maximum likelihood refinement can also be done with external phases (NFIXP=1). In the program, although contributions to the matrix (and hence the parameter shifts) come from all points on the probability distribution, for statistical purposes the R factors are still reported only while assuming phi(protein) = phi(best). In structure factor calculation mode the following events take place: 1) All atomic parameters are read and checked to insure that the atom type is recognized, and that enough storage exists to do the calculation. If any residues were targeted for rejection, atoms in the residue have their scattering factors set to zero to effectively eliminate them from the input list. 2) Each reflection is read in and the corresponding structure factor is computed based on the atomic parameters input. The indices, FO, FC and phase are stored, and sums for the least squares calculation of a scale factor and for computation of the correlation coefficient between observed and calculated amplitudes are incremented. 3) After all structure factors are generated, the scale factor relating FO to FC is computed, all FC's are rescaled and an R factor (based on F) and correlation coefficient are computed. 4) The R factor, correlation coefficient and number of reflections processed is listed. 5) If both IHLCF=0 and ISIGA=0, the indices, FO, scaled FC and phase are output for each reflection and the program terminates. 6) If IHLCF=1 and ISIGA=0, the data are sorted, and mean values of abs(Fo**2 - Fc**2) in various resolution shells are computed. A three term polynomial is then fit to the delta data as a function of resolution. 
For each reflection, the indices (and phase) are converted, if needed, to the "standard" asymmetric unit, and the expected value of abs (Fo**2 - Fc**2) is obtained from the polynomial and is used to compute Hendrickson-Lattman coefficients for the reflection using Bricogne's modification of Sim's weighting scheme, i.e.

      W = 2 * FO * FC / < | FO**2 - FC**2 | >
      A = W * COS (Phi calc)
      B = W * SIN (Phi calc)
      C = 0
      D = 0

where the mean < | FO**2 - FC**2 | > is taken at the sin(theta)/lambda of the reflection. The distributions are evaluated (to get the figures of merit), and the indices, Fm*Fo, Fo, Phi, Hendrickson-Lattman coefs, restricted phase indicator, and Fm are written to the output file. A sum to compute the mean figure of merit is also updated. The mean figure of merit is listed, and the program terminates.

7) If ISIGA > 0 the indices and phases are transformed to the standard asymmetric unit, the data are sorted on resolution, and are converted to normalized structure factors. Both sigma_A and "D" values are then computed for each shell as described by Read. Distribution coefficients are then computed as described above except that

      W = 2. * Sigma_A * Eobs * Ecal / (1. - Sigma_A**2)   for acentric data
      W =      Sigma_A * Eobs * Ecal / (1. - Sigma_A**2)   for centric data

The distributions are then evaluated to get the figure of merit, and coefficients appropriate for conventional electron density, reduced bias native or reduced bias difference maps are written to the output file as requested. The mean figure of merit is then reported.

Note that the options (IHLCF=1 and ISIGA=0) or ISIGA=1 are very useful if one wishes to "solvent flatten" or "average" a map which is obtained from a model, i.e. a molecular replacement solution, since it provides an "MIR like" phase file which can be used to "tether" subsequent phase information to (via BNDRY, option 3), while the other options are useful for direct examination of maps or to provide model based phases for phase combination with MIR like information.
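As a rough illustration of the Sigma_A weighting in step 7, the sketch below computes W, forms the Hendrickson-Lattman coefficients A and B (with C = D = 0), and evaluates the resulting distribution on a grid to get a centroid phase and figure of merit. It is a minimal sketch and not the PHASIT implementation: the function names are invented for the example, the 5 degree sampling step is an assumption, and only the unrestricted (acentric) evaluation over the full phase circle is shown.

```python
import math


def sigma_a_weight(sigma_a, e_obs, e_calc, centric):
    """W from the Sigma_A expressions; centric data get half the acentric weight."""
    w = sigma_a * e_obs * e_calc / (1.0 - sigma_a ** 2)
    return w if centric else 2.0 * w


def hl_from_calc_phase(w, phi_calc_deg):
    """Hendrickson-Lattman coefficients (A, B, C, D) for a unimodal distribution."""
    phi = math.radians(phi_calc_deg)
    return (w * math.cos(phi), w * math.sin(phi), 0.0, 0.0)


def evaluate_distribution(a, b, c, d, step_deg=5):
    """Centroid phase (degrees) and figure of merit by sampling the phase circle.

    P(phi) is taken proportional to
    exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)).
    The step size is an assumption made for this sketch.
    """
    sx = sy = total = 0.0
    for deg in range(0, 360, step_deg):
        p = math.radians(deg)
        prob = math.exp(a * math.cos(p) + b * math.sin(p)
                        + c * math.cos(2 * p) + d * math.sin(2 * p))
        sx += prob * math.cos(p)
        sy += prob * math.sin(p)
        total += prob
    fom = math.hypot(sx, sy) / total
    best = math.degrees(math.atan2(sy, sx)) % 360.0
    return best, fom
```

For example, `evaluate_distribution(*hl_from_calc_phase(sigma_a_weight(0.8, 1.2, 1.0, False), 60.0))` returns a centroid phase at the calculated phase and a figure of merit that grows with W. Centric reflections, whose phases are restricted to two values, would need a two-point evaluation instead of the full circle.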
2.02 BNDRY WRITE-UP

BNDRY is a program to facilitate solvent flattening, negative density truncation and/or phase combination. If the starting phases come from a single derivative or single wavelength anomalous scattering data, then it can be used to carry out Wang's ISIR/ISAS procedure. If the starting phases come from MIR or any other source, it can be used for "phase refinement" via solvent flattening and/or negative density truncation. The program can also be used for phase extension to higher resolution or to missing reflections within the input resolution, or for combining partial structure information with MIR, SIR etc data. BNDRY can be run in one of four modes, with each mode carrying out a particular function. The mode desired is selected via an input parameter. The input required, output generated and tasks performed for each mode are now described.

INPUT CONTROL DATA (UNIT 5)

RECORD 1  PAMFIL (free format)
      PAMFIL = Name of file specifying cell parameters, symmetry information etc.

RECORD 2  IOPT (free format)
      IOPT = 0 Convert input Fourier coefficients to those appropriate for protein-solvent boundary determination.
           = 1 Construct mask map identifying protein and solvent regions from "smeared" map input.
           = 2 Modify electron density map via solvent leveling and negative density truncation.
           = 3 Combine phase information from new source with input Hendrickson-Lattman representation of phase probability distributions.

Each run of the program should invoke only one of the options. Depending on the option called, the following data must also be given.

***** for IOPT=0 *****

RECORD 3  RAD (free format)
      RAD = sphere radius, for weighted averaging of density in boundary determination (typically 2.5-3 times minimum d spacing)

RECORD 4  INPREF (free format)
      INPREF = name of input reflection/phase file. This file should have been generated by creating a map with the desired starting phases, setting all density values (omitting the F000 term) < 0 to 0, and inverting it.
RECORD 5 OUTREF (free format) OUTREF = name of output reflection/phase file. A map generated with this data is "locally averaged", or "smeared", and is appropriate for protein/solvent boundary determination. **** FILES **** INPREF = (binary file) = Input Fourier coefficients, from inversion of original electron density after negative density truncation, as computed by program MAPINV. One reflection per record containing H, K, L, FO, FC, PHI where H, K, L = INTEGERS, FC,PHI=REALS and PHI is in degrees. FO not used. OUTREF = (binary file) = Output Fourier coefficients, same form as INPREF but modified to correspond to "smeared" map appropriate for boundary determination. ***** for IOPT=1 ***** RECORD 3 MAPFL1 (free format) MAPFL1 = Name of input "locally averaged" or "smeared" map file. This map should have been produced by FSFOUR from the coefficients in file OUTREF (option 0). RECORD 4 MSKFIL (free format) MSKFIL = Name of output mask file. RECORD 5 PSOL (free format) PSOL = bulk solvent fraction (by volume, typically 0.3-0.6) If the bulk solvent fraction is unknown, it can be estimated reasonably well by the formula PSOL= 0.97 - (1.22 * Z * Mw)/V where V is the unit cell volume (cubic angstroms), Mw is the molecular weight of the protein (Daltons), and Z is the number of protein molecules of weight Mw in the unit cell. This assumes a standard value for protein volume, and that 3% of the solvent is tightly bound to protein. **** FILES **** MAPFL1 = (binary file) = input electron density map ("smeared" map) as computed by program FSFOUR from coefficients generated in option 0. MSKFIL = (binary file) = output mask map file. After one header record, each subsequent record corresponds to one input record of the map (as type BYTE). Values of 2 indicate solvent region, anything else indicates protein region. Note that the mask can be displayed/edited by program MAPVIEW. 
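The PSOL estimate given under RECORD 5 for IOPT=1 is easy to evaluate by hand, but a small helper makes the units explicit. The function name and the cell values in the usage note are hypothetical, chosen only to illustrate the formula as stated in the write-up.

```python
def estimate_solvent_fraction(z, mw, cell_volume):
    """PSOL = 0.97 - (1.22 * Z * Mw) / V.

    z           -- number of protein molecules of weight mw in the unit cell
    mw          -- molecular weight of the protein, in Daltons
    cell_volume -- unit cell volume, in cubic angstroms
    """
    if cell_volume <= 0:
        raise ValueError("unit cell volume must be positive")
    return 0.97 - (1.22 * z * mw) / cell_volume
```

With hypothetical values Z = 4, Mw = 25000 and V = 250000 cubic angstroms, the estimate is 0.97 - 0.488 = 0.482, which falls inside the typical 0.3-0.6 range quoted for PSOL.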
***** for IOPT=2 *****

RECORD 3  MAPFL1 (free format)
      MAPFL1 = Name of input electron density map file to be modified.

RECORD 4  MSKFIL (free format)
      MSKFIL = Name of input protein/solvent mask file.

RECORD 5  MAPFL2 (free format)
      MAPFL2 = Name of output (modified) electron density map file.

RECORD 6  SVAL (free format)
      SVAL = empirical constant, used to approximate F000/V (See program description that follows)
             Typical values:
                .060 (3.0 Angstrom data)
                .086 (3.5 Angstrom data)
                .112 (4.0 Angstrom data)
                .250 (6.0 Angstrom data)

**** FILES ****

MAPFL1 = (binary file) = input original (unmodified) electron density map as generated by program FSFOUR
MSKFIL = (binary file) = input mask map (from run with IOPT=1)
MAPFL2 = (binary file) = output modified electron density map (same structure as input map)

***** for IOPT=3 *****

RECORD 3  IEXT, DCUT, DAMP, ICMB, IOTYP (free format)
      IEXT = 0 For no phase extension.
           = 1 To extend phases to additional amplitudes given on file EXTREF (up to DCUT resolution).
           = 2 Same as 1, but phases AND AMPLITUDES also generated for any other missing reflections up to DCUT angstrom resolution.
      DCUT = d spacing cutoff, in angstroms, for extension. (Note that extension is only possible to the reflection index range (or to symmetry related reflections) specified in input to MAPINV.)
      DAMP = Damping factor (in range 0-1.) for weighting contribution of input probability distribution to the combined distribution. Usually 1.0, which applies the true weight. If < 1., will downweight contribution of original distribution, i.e. increase relative weight of new (map inversion or partial structure) distribution to the combined distribution.
      ICMB = 0 To use Bricogne's modification of Sim's weighting procedure during phase combination.
           = 1 To use Read's Sigma_a procedure for weighting during phase combination.
      IOTYP = 0 For normal output, i.e. phase file to contain FOM*FO and FO in the amplitude slots.
            = 1 For modified output, i.e.
phase file to contain FO and FC in the amplitude slots. (This is used only if one wants to do NC symmetry averaging on difference or 2FO-FC maps, or solvent flattening on 2FO-FC maps). RECORD 4 INPPRB (free format) INPPRB = Name of input phase probability distribution file. RECORD 5 INPFC (free format) INPFC = Name of input calculated structure factor file. (Obtained from inversion of modified map or computed from a partial structure). RECORD 5A EXTREF (free format) ****** include this record ONLY IF IEXT > 0 ***** EXTREF = Name of input file containing additional structure factor amplitudes (and possibly phase probability distribution coefficients) for phase extension. RECORD 6 OUTPRB (free format) OUTPRB = Name of output phase probability distribution file, corresponding to combined (and possibly phase extended) data. **** FILES **** INPPRB = (binary file) = Input Fourier and probability distribution coefficients, one reflection per record containing H,K,L,FM*FO,FO,PHI,IPRAB,IPRCD,MK, FOM where H, K, L,IPRAB,IPRCD,MK= integers, FO,FM*FO,PHI,FOM=real and PHI is in degrees. This file usually is prepared by program PHASIT or IMPORT. But if it is a previous output from BNDRY, then new phase information can be combined with phases AFTER density modification instead of with the original phases. In the latter case IOTYP should have been set to 0 when the file was originally created. INPFC = (binary file) = Input Fourier coefficients (from inversion of modified map or from partial structure), with records containing H,K,L,FO,FC,PHI as output from MAPINV or PHASIT. FO is not used. EXTREF = (ASCII, free format) = Input reflection file. Each record should contain H, K, L, FNAT, A_B and C_D where H, K, L = INTEGERS, FNAT=REAL and A_B, C_D are INTEGERS. If phase probability distribution coefficients are available they are packed two per word in A_B and C_D as in a normal "phased" file. If they are not available A_B and C_D are zero. This file is needed only if IEXT > 0. 
Phases will be extended to these reflections, subject to DCUT criteria. This file usually is prepared by program MISSNG.

OUTPRB = (binary file) = Output Fourier coefficients, after combination of phase information, with same structure as on INPPRB, except that the Hendrickson-Lattman coefficients, phase and figure of merit correspond to the new phase. Note that the FO entry will actually contain FC for reflections which were "amplitude" extended. If IOTYP=1, then the amplitude slots in each record will contain FO and FC instead of FM*FO and FO, enabling different types of Fourier maps to be computed during density modification runs, if desired.

WARNING! The IOTYP=1 option is fine if the output file is to be used ONLY for Fourier calculations, but it must NOT be used later in BNDRY or PHASIT as an input file, since they expect FO to be picked up from the second amplitude slot!

**** BNDRY PROGRAM STRUCTURE ****

Depending on the value of IOPT, the following events take place.

For IOPT = 0

The input sphere radius RAD is read in along with the current reflection data and phase information. A unique set of reflections is selected and the Fourier transform of the weighting function W = 1-r/RAD with W = 0. for r > RAD is then computed for each reflection. The transform FS is as follows

      A  = 4 * pi * RAD * sin(theta)/lambda
      FS = 4 * pi * RAD**3 * [ 2 * ( 1 - COS(A) ) - A * SIN(A) ] / A**4

The input structure factor amplitudes are multiplied by FS, and the modified data is written out. In the original Wang algorithm, the weighting function was applied by convolution with the electron density in direct space, after zeroing out negative densities. Instead, we zero out the negative densities, invert the truncated map to obtain structure factors, multiply the structure factors by the transform of the weighting function, and compute a new modified map from the resulting modified structure factors.
This is identical to the original method except that it is much more efficient, particularly for large maps and/or large RAD. It also does not require the time-saving approximation of repeating the procedure twice with half of the desired RAD as was sometimes needed with the direct space algorithm. The resulting coefficients will generate a map which is equivalent to zeroing out negative densities and then taking a weighted average (with weight W) of all density within RAD angstroms of each grid point in the input map.

For IOPT = 1

The input fractional volume known (or thought) to represent solvent is read in and converted to the corresponding number of grid points in the map. The modified electron density map is then read and a histogram is generated keeping track of how many grid points have associated electron density of a given value. Starting from the lowest density value, an electron density threshold RHOCUT is incremented until the total number of grid points having density less than RHOCUT equals the number of grid points representing solvent. The modified electron density map is then rewound and read again, but this time as each record is read, a "MASK" value is defined for each grid point depending on whether the corresponding density exceeds RHOCUT. Mask values of 2 represent the solvent region, anything else the protein region. The mask map is then output to a file. Note that the mask can be displayed/edited with program MAPVIEW.

For IOPT = 2

An empirical constant SVAL is read in and is used to approximate F000/V (on scale of input map) from the relationship

      < Rho solvent > + F000/V
      ------------------------ = SVAL
         Rho(Max) + F000/V

An electron density map and a mask map are read in. Using the mask map to discriminate protein and solvent regions, the mean density in the solvent region is computed, and the maximum density in the protein region is determined.
From these three quantities F000/V (on scale of input map) is then estimated from the relationship above. A new modified map is then constructed such that the electron density at each grid point is equal to < Rho solvent > + F000/V if in solvent region. Maximum of (Rho input) + F000/V or zero if in protein region. Thus solvent leveling and negative density truncation are enforced. This is identical to the procedure in Wang's ISIR programs. For IOPT = 3 Indices, structure factor amplitudes, Hendrickson-Lattman coefficients and restricted phase indicators are read in and stored for the original set of phased reflections. If phase extension is requested then a file containing additional reflections for which amplitudes (and possibly phase probability distribution coefficients) are available is also read in and the data stored. Then new computed structure factors (either from inversion of a modified map or from a model based calculation) are read in. Indices and phases from the computed structure factors are transformed (if needed) to the standard asymmetric unit, systematic absences are rejected, phase restrictions are determined and the DCUT criteria is imposed (for phase extended reflections). The new indices are compared with those stored, and if a match is found the new phases and amplitudes are paired up with the old. If no match is found and AMPLITUDE extension was requested, the unpaired reflections are written to a scratch file. The FC's are then scaled to the FO's by least squares based on the original phased reflections, and the paired data are then sorted on resolution and divided into ranges according to sin(theta)/lambda. If Sim weighting or AMPLITUDE extension is requested the mean sin(theta)/lambda and mean ABS(FO**2 - FC**2) are computed for each range, and a three term polynomial in sin(theta)/lambda is fit to the mean ABS(FO**2-FC**2) by least squares. 
If Sigma_A weighting is requested the ranges are then used to determine normalized structure factor amplitudes from which the Sigma_A values are derived. For all paired reflections, new Hendrickson-Lattman coefficients for the map inversion/partial structure data are computed according to

      A = W * COS (Phi calc)
      B = W * SIN (Phi calc)
      C = 0
      D = 0

where for Sim weighting

      W = 2 * FO * FC / < | FO**2 - FC**2 | >

with the mean < | FO**2 - FC**2 | > taken at the sin(theta)/lambda of the reflection, and for Sigma_A weighting

      W = 2. * Sigma_A * EO * EC / (1. - Sigma_A**2)   for acentric data
      W =      Sigma_A * EO * EC / (1. - Sigma_A**2)   for centric data

Test calculations (on a single model structure) indicated that the Sigma_A weighting gave slightly worse results when used to combine solvent flattened and MIR phases, but slightly better results when combining model based phases with MIR phases. Regardless of weighting, the new coefficients are combined with any input coefficients. Prior to phase combination, the original input coefficients are damped (if desired by the user), to increase the relative contribution of the newly introduced (map inversion or model based) information. Usually the damping factor is 1. (no damping), but in cases where phase combination involves a fairly small (percentage wise) partial structure fragment, damping the input coefficients can increase the impact of the partial structure contribution. The combined phase probability distributions are then evaluated and integrated to yield "best" (centroid) phases and the associated figure of merit. The combined phase information is then written to the output file in one of two user selected formats, and summary statistics are listed which include R factors, correlation coefficients and mean figures of merit for original, extended and all reflections.

It is important to note that when phasing with only anomalous scattering data at one wavelength, phase extension will ALWAYS be required to insure that centric reflections get phased.
Also, when phase extension is requested and input reflection data for extension (generated by program MISSNG) includes probability distributions, then the distributions will always be combined with those from the map inversion/partial structure. For extended reflections input without distribution coefficients, the output phases will correspond exactly to those from the map inversion/partial structure.

If AMPLITUDE extension was requested, the previously written scratch file is rewound, read and the FC's rescaled (using the previously determined scale factor). After insuring that only unique reflections are selected, these data are also passed to the output file, except that the figure of merit and HL coefficients are computed using

      W = FC * FC / < | FO**2 - FC**2 | >

with the mean < | FO**2 - FC**2 | > taken at the sin(theta)/lambda of the reflection. This option is generally used to fill in missing (usually low order) reflections within the resolution range of the measured data, NOT for extension to higher resolution. It is usually done only after convergence is obtained with all other data.

Use of the BNDRY program for density modification, control files and important considerations are discussed later where sample inputs are given.
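The transform FS used for IOPT=0 can be checked numerically against a direct radial integration of the weighting function W = 1 - r/RAD. The sketch below assumes the exponents in the printed formula are RAD**3 in the numerator and A**4 in the denominator, which is what the analytic transform of W gives; the function names, the small-A series branch and the midpoint integration are choices made for this example and are not taken from BNDRY.

```python
import math


def smear_transform(rad, stol):
    """FS for the weight W = 1 - r/rad, with stol = sin(theta)/lambda in 1/angstrom."""
    a = 4.0 * math.pi * rad * stol
    if a < 1e-2:
        # Series limit of [2(1 - cos a) - a sin a] / a**4, used to avoid
        # cancellation near a = 0: 1/12 - a**2/180 + ...
        return 4.0 * math.pi * rad ** 3 * (1.0 / 12.0 - a * a / 180.0)
    bracket = 2.0 * (1.0 - math.cos(a)) - a * math.sin(a)
    return 4.0 * math.pi * rad ** 3 * bracket / a ** 4


def smear_transform_numeric(rad, stol, n=20000):
    """Midpoint-rule integral of W(r) * sin(kr)/(kr) over the sphere, for checking."""
    k = 4.0 * math.pi * stol
    h = rad / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        kr = k * r
        sinc = math.sin(kr) / kr if kr > 0.0 else 1.0
        total += (1.0 - r / rad) * 4.0 * math.pi * r * r * sinc * h
    return total
```

At sin(theta)/lambda = 0 the transform reduces to the integral of W over the sphere, pi * RAD**3 / 3, which is a convenient sanity check on any reimplementation.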
2.03 FSFOUR WRITE-UP

PURPOSE- To calculate three dimensional Fourier transforms (maps) when given a set of Fourier coefficients and control cards.

This program will calculate maps by using a multivariate variable radix fast Fourier transform algorithm. The only restrictions are that the number of grid points along each axis is even, and is a product of the factors 2, 3, 4, or 5. Each factor can be used more than once. The program is fully general so that all space groups can be handled. The input structure factors must fall in the following range:

      -NX/2 < h < NX/2
      -NY/2 < k < NY/2
      -NZ/2 < l < NZ/2

where NX, NY, NZ are the number of grid points along the a, b, and c axes, respectively. Input structure factors outside the range will be omitted from the calculation.

INPUT DATA (UNIT 5)

CARD 1  PAMFIL (free format)
      PAMFIL = Name of input file containing cell and symmetry information.

CARD 2  TITLE (free format)
      TITLE = anything

CARD 3  NCENT,NX,NY,NZ,MAPTYP,IPRINT,NPIC,NORN,INPF,GSP,DCUT (free format)
      NCENT = 0 for noncentrosymmetric space groups
            = 1 for centrosymmetric space groups
      NX, NY, NZ = number of grid points along the a, b and c axes, respectively. If an input value is inconsistent with the factoring scheme, the next largest acceptable value will be used. If zero, see GSP below.
      MAPTYP = Fourier coefficient selection integer
             = 1 for FO*exp(i*PHIC)
             = 2 for FC*exp(i*PHIC)
             = 3 for (FO-FC)*exp(i*PHIC)
             = 4 for (2*FO-FC)*exp(i*PHIC)
             = 5 for (FO-FC)**2 (difference Pattersons)
             = 6 for FO**2
             = 7 for FC**2
             = 8 for -i*(FH+ - FH-)*exp(i*PHIH+) (Bijvoet difference Fourier)
             = 9 for (3*FO-2*FC)*exp(i*PHIC)
      IPRINT = 0 for no printing of map
             = 1 for printing of map
      NPIC = number of non-hydrogen atoms in the asymmetric unit. (Not used within the program but is passed on to program PSRCH via the map file. Should not exceed 140).
      NORN = 0 for XZ sections
           = 1 for YZ sections
           = 2 for XY sections
           ***** CAUTION ***** If the map file is to be input to programs PSRCH, MAPINV, MAPVIEW, GMAP or CTOUR, NORN must be 0.
      INPF = 0 for binary reflection file input.
           = 1 for formatted reflection file input.
      GSP = Desired grid spacing in angstroms. Defaults to 1.0, applied only if NX=NY=NZ=0 to determine number of grid points along each axis.
      DCUT = minimum d spacing cutoff, in angstroms, for acceptance of input reflections.

CARD 4  INPREF (free format)
      INPREF = Name of input reflection file.

CARD 5  MAPFIL (free format)
      MAPFIL = Name of output map file.

CARD 6  LEVEL, (XLIM(I), I=1,3) (free format)
      ***** this card should be included ONLY if IPRINT is nonzero *****
      LEVEL = scan level. Peaks greater than the scan level are underlined with **. If zero, defaults to 100.
      XLIM(1), XLIM(2), XLIM(3) = printing limits. The map will be printed from 0 to XLIM (fractional) along each axis.

********* NOTES ON THE PROGRAM **********

The input reflection file is terminated by an end of file, and should contain records with H, K, L, FOBS, FCAL, PHI where the first three variables are INTEGERS and the remainder REALS. PHI should be in degrees. The file may either be formatted or binary as indicated by the parameter INPF. If it is formatted the format is assumed to be (3I4, 2F10.2, F7.2). If the input file contains records with H,K,L,FPH,FP,PHI then MAPTYP=5 can be used to compute isomorphous difference Pattersons (PHI is not used). If the input records contain H,K,L,F(H,K,L),F(-H,-K,-L),PHI(H,K,L), then MAPTYP=5 will compute anomalous difference Pattersons, and MAPTYP=8 can be used to compute Bijvoet difference Fouriers. Note that if a binary file is input each record must contain six words even if all of them are not used in the calculation, i.e. PHI is not needed if MAPTYP=5,6 or 7, but some value still must be supplied. The output map file is binary and contains NSYM + 2 header records followed by the map.
If NORN = 0, the map is written such that each record contains NX density values (one row along x), with NZ consecutive records constituting each section of constant y, i.e. y is slowest varying. If NORN = 1, the positions of x and y are interchanged. If NORN = 2, the positions of y and z are interchanged. All map values are integers scaled as described below. When NORN=0 the map file is suitable for input to programs PSRCH for locating peaks, to MAPINV for modification followed by inversion, to MAPVIEW for interactive contouring and display, to GMAP for conversion to TOM/O or CHAIN formats and for creation of skeletons, or to CTOUR to create hard copies of contoured plots. If NORN is nonzero the only recourse is to print the map within this program. ***** SCALING THE DATA ***** Two scales are used, one for the binary map file and one for the printed map output,if requested. If the input coefficients were on an absolute scale, then the absolute electron density is obtained as follows: rho (absolute) = 10.*(printed map value)/(V*scale) + F000/V rho (absolute) = (value on binary map output file)/(V*scale) + F000/V where V is the unit cell volume and scale is given on the lineprinter output. F000 is the total number of electrons in the unit cell. Note that even if F000 is supplied on the input file, it will not be used in the program, and must be added as indicated above. Also note that the PRINTED output is limited to two digits per density value, but the density is NOT rescaled to a maximum of 99. This means that values of 99 merely imply a density of AT LEAST 99. ***** BIJVOET DIFFERENCE FOURIERS ***** When maptyp=8 is selected, and the input reflection file contains records with H, K, L, F(H,K,L), F(-H,-K,-L), PHI(H,K,L) then a "Bijvoet difference Fourier" will be computed. In this case the map consists of only the "imaginary" part of the electron density, and should show strong positive peaks only at the sites of anomalous scatterers (if the hand is correct). 
The multiplication factor -i is applied only after expansion to a hemisphere to effectively interchange real and imaginary parts of the density, as the program would normally only compute the "real" part. ***** FILES ***** INPREF - Input Fourier coefficient file, can be either formatted or binary as determined by input parameter INPF. Records should contain h, k, l, Fobs, Fcal, Phi with h,k,l INTEGERS and Fobs, Fcal, Phi REALS. Phi is in degrees. If INPF=1, then file should be formatted with FORMAT(3I4,2F10.2,F7.2) MAPFIL - Binary map file output. Contains NSYM+2 header records followed by the map, as described earlier.
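The grid restriction stated in the PURPOSE paragraph (an even number of points that is a product of the factors 2, 3, 4 and 5) can be checked before a job is submitted. The following Python sketch is not part of FSFOUR. The function names are hypothetical, "next largest acceptable value" is assumed to mean the smallest acceptable value not below the request, and the way GSP is turned into a point count (cell edge divided by spacing, rounded up) is likewise an assumption.

```python
import math


def is_acceptable_grid(n):
    """True if n is even and has no prime factors other than 2, 3 and 5.

    The factor 4 allowed by the write-up is 2*2, so it needs no separate test.
    """
    if n <= 0 or n % 2 != 0:
        return False
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def next_acceptable_grid(n):
    """Smallest acceptable grid count >= n (assumed reading of the write-up)."""
    candidate = max(n, 2)
    while not is_acceptable_grid(candidate):
        candidate += 1
    return candidate


def grid_from_spacing(cell_edge, gsp=1.0):
    """Grid count for a cell edge in angstroms at spacing gsp.

    Assumption: points = ceil(edge / gsp), then adjusted upward to an
    acceptable value. The write-up does not state the exact rule.
    """
    return next_acceptable_grid(math.ceil(cell_edge / gsp))


if __name__ == "__main__":
    for edge in (45.331, 68.33, 79.62):
        print(edge, grid_from_spacing(edge))
```

Under these assumptions a request for 46 points would be adjusted to 48, since 46 = 2*23 contains a factor outside the permitted set.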
2.04 MAPINV WRITE-UP

PURPOSE- To calculate three dimensional Fourier coefficients (structure factors) when given an electron density map and control cards. This program will calculate structure factors by using a multivariate variable radix fast Fourier transform algorithm to invert an electron density map. The program is fully general so that all space groups can be handled. It is assumed that the input map was prepared by program FSFOUR. Structure factors may be calculated for reflections in the following range:

        h .ge. -NX/2 and h .lt. NX/2
        k .ge. -NY/2 and k .lt. NY/2
        l .ge. 0     and l .lt. NZ/2

where NX, NY, NZ are the number of grid points along the a, b, and c axes, respectively, in the input map.

INPUT DATA (UNIT 5)

CARD 1  PAMFIL (free format)
        PAMFIL = Name of input file containing cell and symmetry parameters.

CARD 2  TITLE (free format)
        TITLE  = anything

CARD 3  MAPFIL (free format)
        MAPFIL = Name of input map file.

CARD 4  SFOUT (free format)
        SFOUT  = Name of output structure factor file.

CARD 5  IPRNT, IPAIR, HMIN, HMAX, KMIN, KMAX, LMAX (free format)
        IPRNT  = 0 for no printing of structure factors.
               = 1 for printout
        IPAIR  = 0 for no pairing of calculated structure factors with observed data.
               = 1 to combine calculated structure factors with observed data (supplied on auxiliary file) and output R factor to the line printer
               = 2 same as 1, but a separate file with the combined data is also written.
        HMIN   = limiting values defining range of
        HMAX   = indices for which structure factors
        KMIN   = will be calculated
        KMAX   = (LMIN is always 0)
        LMAX   =

CARD 5A AUXINP (free format)
        ***** This card should be included ONLY if IPAIR > 0 *****
        AUXINP = Name of input file containing auxiliary structure factors for scaling.

CARD 5B AUXOUT (free format)
        ***** This card should be included ONLY if IPAIR = 2 *****
        AUXOUT = Name of output file to contain calculated structure factors scaled to (and paired with) the auxiliary structure factors.
CARD 6  SC, F000, IMOD, IRHOMN (free format)
        SC     = scale factor applied to calculated structure factors (see below). If 0. defaults to 1.
        F000   = total number of electrons in the unit cell (see below).
        IMOD   = 0 for no modification of map prior to transformation
               = 1 to modify map prior to transformation according to input criterion
               =-1 same as 1 but the resulting density is also squared prior to transformation.
        IRHOMN = modification criterion (applied if IMOD .ne. 0). If (rho input + IRHOMN) < 0, rho will be reset to 0. If F000 is supplied, IRHOMN is automatically set to correspond to non-negativity of electron density.

********** NOTES ON THE PROGRAM **********

The input map file MAPFIL is assumed to have been generated with program FSFOUR. It is binary, terminated with an end of file, and after a few header records, contains the electron density map represented as records (of integers) along x. y is the slowest varying coordinate.

All calculated structure factors within the index range specified will be output to file SFOUT. Note that this may include redundant (symmetry related) as well as systematically absent reflections, if they fall within the specified index range. The output file is binary, with records of H,K,L,FCALC,FCALC,PHI and is terminated by an end of file. H,K and L are INTEGERS whereas FCALC and PHI are REALS. PHI is in degrees. Note that FCALC is duplicated within each record so that the file structure is consistent with the input required by program FSFOUR.

If IPAIR > 0, then in addition to writing file SFOUT, the program pairs the observed structure factor amplitudes on input file AUXINP with the corresponding calculated amplitudes and phases, and uses the combined data in one cycle of least squares refinement of a scale factor. The resulting R factor between observed and calculated amplitudes is then output to the lineprinter. Note that the input reflection data on file AUXINP is restricted to the same range of indices as the calculated data.
If input values fall out of bounds they will be ignored. Therefore, if data were collected with L negative, it will have to be transformed by symmetry before it can be used successfully on file AUXINP.

If IPAIR = 2, the results are identical to those obtained with IPAIR = 1, except that the combined (and rescaled) data is also output to a separate file AUXOUT. The new file is of the same form as SFOUT, but with records consisting of H,K,L,FOBS,FCALC,PHICALC for only those reflections which were input on file AUXINP.

***** SCALING THE DATA *****

It is often desirable to control the scale of the calculated structure factors. If the input electron density map was generated from structure factors which are related to an absolute scale by:

        F(input to FSFOUR) = k * F(absolute)

then k should be input for SC to obtain calculated structure factors on an absolute scale. If SC = 0. (or 1.), then the calculated structure factors will be on the same scale as those used to generate the map (unless IMOD = -1, in which case they will be much larger). Note that this scaling applies only to the output on file SFOUT (and the lineprinter, if IPRNT .ne. 0). If IPAIR .eq. 2, then the calculated structure factors on file AUXOUT will always be scaled for best agreement with those supplied on file AUXINP.

***** MODIFYING THE MAP *****

The following applies only if IMOD .ne. 0. Inclusion of F000 will result in imposing non-negativity of electron density everywhere in the map prior to inversion, provided SC is reasonably well known. If SC is unknown, then F000 and SC on card 6 should be zero and IRHOMN should be input to control the type and degree of modification. Intelligent use of this parameter would then require knowledge of the input map values prior to running the job. IRHOMN should be equal to F000/V on the same scale as the input map.
If IMOD = -1, IRHOMN is first added to each density value, resulting values below zero are set to zero, and each value is then squared prior to Fourier transformation. This is equivalent to imposing non-negativity, followed by one cycle through the tangent formula. Phases can therefore be tangent formula refined or extended.

***** FILE REQUIREMENTS *****

MAPFIL - input map file, binary, as output by program FSFOUR
SFOUT  - output file with all calculated structure factors, binary, six word records as described earlier
AUXINP - auxiliary input structure factor file (required only if IPAIR .ne. 0), binary, six word records in same form as SFOUT (only H,K,L and FOBS are used)
AUXOUT - output auxiliary structure factor file, binary, six word records as described earlier.
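The map modification controlled by IMOD and IRHOMN can be illustrated with a short Python sketch. It is not the MAPINV code, and the function name is hypothetical. It follows the IMOD = -1 description literally (add IRHOMN, reset negative results to zero, then square). For IMOD = 1 it is assumed that the shifted value is the one retained, which the write-up does not state explicitly.

```python
def modify_density(rho, irhomn, imod):
    """Apply a MAPINV-style modification to a list of integer map values.

    imod =  0 : values are returned unchanged
    imod =  1 : add irhomn, reset negative results to zero
    imod = -1 : as for 1, then square each value
    Assumption: for imod = 1 the shifted (not the original) value is kept.
    """
    if imod == 0:
        return list(rho)
    modified = []
    for value in rho:
        shifted = value + irhomn
        if shifted < 0:
            shifted = 0            # impose non-negativity
        if imod == -1:
            shifted = shifted * shifted
        modified.append(shifted)
    return modified


if __name__ == "__main__":
    section = [-5, 0, 7]
    print(modify_density(section, 3, 1))
    print(modify_density(section, 3, -1))
```

The squaring step explains the remark in the scaling notes that structure factors computed with IMOD = -1 come out much larger than the input coefficients.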
2.05 PAMFILE WRITE-UP

This is not a program, but rather a description of a "standard parameter file" which is read by all programs in the PHASES package, and several auxiliary programs as well. The main purpose of this file is to insure consistency in cell constants, symmetry, lattice type etc. throughout all programs, and to eliminate redundant input of these parameters by the user. In addition one can optionally specify the name of a "running log file." If this is done then in addition to normal output to either the screen or individual log files for each program, all printed output is also appended to a single file, preceded by a time stamp indicating what program was run and when. Thus one can maintain a complete history of all computations and results in a single log file.

The standard parameter file is often referred to generically in program write-ups as "PAMFIL." One should select a name for it which is indicative of the particular structure being worked on, and rapidly communicates to the user that it is a parameter file. For example, PDC.PAM might be a good choice for phasing pyruvate decarboxylase.

Each standard parameter file should contain the following information in the indicated sequence.

LOGFILE=FILENAME
        Where FILENAME is the name of the desired "running" log file. If no cumulative log is desired, enter LOGFILE=NULL. There must be no spaces immediately preceding or following the "=". Upper or lower case is permitted.

LATTICE=X
        Where "X" is either P,A,B,C,I,F or R. There must be no spaces immediately preceding or following the "=". Upper or lower case is permitted for the word LATTICE, but only UPPER case for the single character symbol.

A, B, C, ALPHA, BETA, GAMMA
        Unit cell constants, in angstroms and degrees. Readable in free format, i.e. at least one blank or comma separating entries.

NSYM
        Number of equivalent positions in the space group. Do NOT include additional translations associated with centering conditions for non-primitive lattices, i.e.
for space group C2 NSYM=2. (This entry is read in free format.)

The NSYM symmetry operators follow, one operator per line EXACTLY as indicated in the International Tables for X-Ray Crystallography. The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral lattices the HEXAGONAL AXES AND SYMMETRY OPERATORS SHOULD BE USED, along with the lattice type R.

The following sample serves as a complete template for a parameter file, for space group P2(1)2(1)2(1):

        LOGFILE=seb.rlog
        LATTICE=P
        45.331 68.33 79.62 90. 90. 90.
        4
        X,Y,Z
        1/2-X,-Y,1/2+Z
        1/2+X,1/2-Y,-Z
        -X,1/2+Y,1/2-Z
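A parameter file can be checked against the rules above before it is used. The Python sketch below is only an illustration: the function name and the returned dictionary layout are hypothetical, and it assumes that each entry occupies its own line in the sequence described, with no comment or blank-line conventions beyond that.

```python
def parse_pamfile(text):
    """Parse the text of a standard parameter file into a dictionary.

    Assumed layout: LOGFILE=, LATTICE=, six cell constants, NSYM, then
    NSYM symmetry operators, one entry per line in that order.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    key, _, logfile = lines[0].partition("=")
    if key.upper() != "LOGFILE" or not logfile:
        raise ValueError("first entry must be LOGFILE=name (no spaces around =)")

    key, _, lattice = lines[1].partition("=")
    if key.upper() != "LATTICE" or lattice not in set("PABCIFR"):
        raise ValueError("second entry must be LATTICE=X with X in P,A,B,C,I,F,R")

    # Free format: entries separated by blanks or commas.
    cell = [float(tok) for tok in lines[2].replace(",", " ").split()]
    if len(cell) != 6:
        raise ValueError("expected A, B, C, ALPHA, BETA, GAMMA")

    nsym = int(lines[3].replace(",", " ").split()[0])
    operators = lines[4:4 + nsym]
    if len(operators) != nsym:
        raise ValueError("fewer symmetry operators than NSYM")
    if operators[0].replace(" ", "").upper() != "X,Y,Z":
        raise ValueError("first operator must be X,Y,Z")

    return {
        "logfile": None if logfile.upper() == "NULL" else logfile,
        "lattice": lattice,
        "cell": cell,
        "nsym": nsym,
        "operators": operators,
    }


if __name__ == "__main__":
    sample = """LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
"""
    print(parse_pamfile(sample))
```

Running the sketch on the P2(1)2(1)2(1) template reports four operators and the cell 45.331, 68.33, 79.62, 90, 90, 90.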
2.06 MAPVIEW WRITE-UP MAPVIEW is an interactive program to contour and display electron density map sections, display mask sections and to facilitate construction of one or more "molecular masks", by allowing one to interactively "trace out" envelope boundaries with a cursor tied to a mouse. The selected map (and mask) regions may be output for use in other programs, or simply displayed on a workstation monitor. The program is extremely useful for examining ANY map, be it an electron density, difference density or Patterson map, but it is essential for creation of molecular masks for use in noncrystallographic symmetry averaging. The program functions interactively, and must be run on a workstation with a color monitor. Two program versions are available, one specific for Silicon Graphics systems which uses the GL, and another (called mapview_X) which uses X-window graphics. The X-window version also functions on SGI hardware, but the original mapview runs only on SGI systems. If running mapview on an SGI, be sure to initiate the program from a WINTERM window (as opposed to from an XTERM window). mapview_X can be initiated from either window type. Both programs use only the left and middle mouse buttons to accept user input (and the keyboard). When either program is started up, the following sequence of events takes place. ***** MAP SELECTION ***** The program first prompts the user for the name of a "map" file, and inquires whether or not it is a "FSFOUR" type file. Usually it will be a FSFOUR file (which runs over the whole unit cell), but if not, one can input a map as an "averaging" style file (e.g xz sections, as generated by EXTRMAP,MAPORTH,MAPAVG,SKEW or the saved output from an earlier run of this program). ***** MASK SELECTION ***** The program then prompts to find out whether a mask will be created or used. 
If one simply wants to look at contoured maps it won't, but to create/examine/edit masks for noncrystallographic symmetry averaging purposes or to look at solvent boundary maps answer yes. If yes, the program prompts as to whether a previously created mask file should be used, and if so, for the file name.

***** MAP REGION SELECTION INFORMATION *****

If a "FSFOUR" style map was input, the program then prompts for the minimum and maximum coordinates in each direction, to define the region to be displayed. Any values are allowed (including multiple cells, i.e. the range can span one or more cell edges, as any needed unit cell translations will be done automatically). The program then prompts for the desired section orientation, i.e. xz, xy or yz sections. Note that if a previously created mask file is recovered, then only xz sections can be used, and the range selected MUST cover EXACTLY that used when the mask was created. For looking at the solvent masks created by BNDRY this means one must select the entire cell, i.e. x,y,z all ranging from 0. to 0.999, although program EXTRMSK can be used first if a different region must be chosen. If an "averaging" style map was input this section is bypassed as only xz sections are possible, and the range will cover precisely that in the input map.

***** SECTION SELECTION INFORMATION *****

The program will then inform the user as to how many map sections are present, how they are numbered, and inquire which section should be displayed initially.

***** CONTOUR LEVEL INFORMATION *****

The program then informs the user what the minimum, maximum and sigma values for the entire input electron density map are. It then prompts for minimum, maximum and increment values for contour levels. The program then contours and displays the requested section, and enters interactive mode.
*************** I N T E R A C T I V E M O D E ***************

When in interactive mode, a menu is displayed and all subsequent actions are requested by the user pressing the left mouse button while the cursor is in the desired menu option area. The following actions then take place, when the appropriate item is selected.

***** NEXT SECTION OPTION *****

Replaces the display with the contoured map corresponding to the next (adjacent, higher) map section.

***** PREV SECTION OPTION *****

Replaces the display with the contoured map corresponding to the previous (adjacent, lower) map section.

***** NEW SECTION OPTION *****

Prompts user for section to be displayed. It lets the user "jump around" the map, rather than having to scroll up or down with multiple "prev" or "next" requests.

***** C LEVEL OPTION *****

Prompts user for new contour level information, then recontours and displays the currently selected section.

***** NEW DIRECTION OPTION *****

Allows user to reselect display region range and/or map orientation. Allowed only when FSFOUR style maps are in use.

***** ADD NEXT SECTION OPTION *****

Contours and displays next (adjacent, higher) section, but doesn't clear old one first, i.e. can be used to create projections, since the contoured sections accumulate one on top of another on the screen. Note!! There is no "shifting" as sections are added, thus projections are strictly true only when viewed down an axis orthogonal to the section. (When only a small number of sections are accumulated it is reasonably valid even for nonorthogonal systems, as long as the cell angles are not extreme).

***** ADD PREV SECTION OPTION *****

Same as "ADD NEXT SECTION" option, but the previous (adjacent, lower) section is contoured and added. Same considerations as above.

***** SET MASK NO. OPTION *****

Active only when masks are in use. Allows user to select which mask number (from 1 to 12) is to be used during mask tracing via "TRACE MASK" option.
This allows one to create multiple molecular envelope masks, usually one for each molecule within the asymmetric unit. The currently active mask no. will be displayed at the bottom right of the screen, along with a sample of the color assigned to that mask. ***** EXIT OPTION ***** If the orientation is not such that xz sections are in use, the program simply terminates. If xz sections are in use, the program prompts as to whether the map region selected should be saved, and if masks are in use, whether the masks should be saved. If either maps or masks are to be saved, the user is informed as to what region is currently available. If masks are in use and the "MAKE ASU" option had previously been selected, the user is also informed about the ranges delineating the "molecule" as determined from the mask. The user is then prompted as to what region should be saved. The saved region can only be the entire region currently available, or a subset of it. The user is then prompted for file names for each saved file. The "saved" map file is in "AVERAGING" style format, and thus can be read back in to the program later, or can be input to the averaging programs. The mask file can also be read back and/or used with the averaging programs. ***** CLEAR OPTION ***** Clears the screen and redraws the current section, with all parameters unchanged. Useful if one wants to start over and recreate a mask for a specific section, if not happy with the original choice. Also, if the system is very busy, occasionally switching from an open textport back to interactive mode can leave remnants of the textport on the screen. This option can then be used to correct the display. ***** SAVE IMAGE OPTION ***** Pressing the left mouse button while in the "SAVE IMAGE" menu area will save the entire screen contents as an "image" file, with the name "MAPV_N.RGB", where N is a one or two digit number. Numbers start from zero and are automatically incremented each time an image is saved. 
Up to 100 images can be made in any job. Note!! This function is only operational in the SGI version. The following additional menu options are functional only when masks are in use ***** SHOW MASK OPTION ***** Displays "masks" for current section (if available). Each masked grid point is displayed as a colored dot superimposed on the contoured section. Unassigned points (outside all molecular envelopes) are shown in blue. Points inside molecular envelopes are shown with the color of the particular envelope mask (with black, i.e no color for mask 1). If the "MAKE ASU" option was run, points within molecular envelopes which are redundant (by crystal symmetry) are shown in red. Points outside molecular envelopes, but related to those within molecular envelopes by crystal symmetry are shown in green. ***** MAKE ASU OPTION ***** Should be invoked after all needed mask sections are created. Will prompt the user for the standard parameter file specifying the symmetry information. The program will then examine each point within all molecular envelopes, and check for redundant (by crystallographic symmetry) entries. Mask values for redundant points will then be changed. After using this option, the "SHOW MASK" option will flag redundant points with the color red. If one is happy with the particular asymmetric unit "retained" (not red), then one can exit as the averaging programs will recognize the updated mask value as redundant and ignore it. If however, one feels a symmetry related part of the redundant area should have been retained instead, then the masks should be recreated for the offending sections. After examining all sections, the total number of redundant points, and unique points within all envelopes is output. After identifying redundant points, the user is prompted as to whether points related by symmetry to those within the envelopes should also be identified. 
If so, the "SHOW MASK" option will color all grid points related to those within the envelopes green. This is useful to insure that all density is accounted for, and to emphasize intermolecular borders. Note that the "MAKE ASU" option SHOULD NOT be used with SKEWED maps. (If a mask is created on a "skewed" map, program TRNMSK can convert it to correspond to normal sections, so that this option can still be used, provided the original extracted map used to create the skewed map is input as well). ***** COPY PREV MASK OPTION ***** Copies mask for previous (adjacent, lower) section to current section, and displays it. It is automatically saved, but one can recreate it with the "TRACE MASK" option, if desired. ***** COPY NEXT MASK OPTION ***** Copies mask for next (adjacent, higher) section to current section, and displays it. It is automatically saved, but one can recreate it with the "TRACE MASK" option, if desired. ***** TRACE MASK OPTION ***** Invoking this option allows the cursor and mouse buttons to be used to trace out the boundary for one or more electron density "islands" in order to isolate individual molecular envelope or averaging boundaries. Press the LEFT mouse button to invoke the option. Then move cursor into map region, and press the LEFT mouse button once at a point on the boundary. Move to new point on boundary and press again. A blue line will be drawn connecting to the previous point. Continue selecting points all along the boundary until you connect back to the initial point. If other "islands" are needed, move the cursor to a point on the boundary of the next "island", and press the MIDDLE mouse button. As before, continue selecting points around the new boundary with the LEFT mouse button until connected to the initial point for this "island". Repeat as often as needed, starting each new island with the MIDDLE mouse button. 
When the section is completed, move the cursor back to the "TRACE MASK" menu area and press the LEFT button again to let the program know you are done. The mask section will then be "filled" such that a blue dot will be placed at each grid point outside the envelope(s) to show you the mask you created. Note that such a mask must be created for each section which contains part of the envelopes. Note also that the mask creation is done for only one section at a time, thus if multiple sections are displayed (e.g. "ADD NEXT SEC" or "ADD PREV SEC" options were used), the created mask is only for the last (most recent) section. This will always correspond to the section indicated at the top of the plot area. Also be aware that the mask created for each section is initialized (with respect to the mask no. currently active) each time the option is invoked (although the screen display will not be redrawn). Thus once invoked you must complete the mask for that section. Once the process is completed (area "filled"), the mask is saved for that section, and can be displayed at a later time with the "SHOW MASK" option. You can then use the "SET MASK NO." option to select a different mask, and use "TRACE MASK" again on the same section. In this way multiple mask envelopes (with each encompassing a different molecule) can be created, which will be needed for averaging if the noncrystallographic symmetry is not purely rotational or the rotational symmetry is not N-fold where N is an integer. HINTS!!! When pure rotational noncrystallographic symmetry is present, the boundary is much easier to determine if the mask is constructed on a "skewed" map such that sections are orthogonal to the NC symmetry axis. Also, it is much clearer to see if three or four sections are accumulated via "ADD NEXT" or "ADD PREV" commands. A useful procedure therefore is to accumulate 4 sections and create an initial mask, then select "PREV SECTION" twice, followed by "ADD SECTION" three times. 
This will position you one section higher than the first mask, but with a projection over 4 sections visible. The same thing can be done in the reverse direction with the related commands. You can do this repeatedly as you step through the map. Also, often the density changes slowly, so you may be able to simply copy the previous or next section's mask, and use it unchanged. If you do such a copy but then decide to edit the mask (via TRACE MASK option), the new mask will overwrite the old, but the visible display will reflect the superposition of both old and new masks, and will be confusing. To convince yourself that the new one is in fact used, just select "CLEAR" followed by "SHOW MASK", to see the actual mask saved.

Note that when pure rotational symmetry (N-fold) is involved, only one mask is needed which encompasses all related molecules. In that case the mask itself should generally display the same symmetry as the contents of the envelope, thus masks over noncrystallographic twofold symmetry regions should also look like they have twofold symmetry, and only density which at least approximately obeys the expected NC symmetry should be encompassed by the mask.

************** FILE FORMATS ***************

FSFOUR style map files are described in the FSFOUR write-up. If a map and/or mask file is generated by the program (or a non-FSFOUR style map is to be read in), the following formats apply.

***** "AVERAGED" style map file (binary) format *****

RECORD 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
        with first 6 values REAL*4, next 9 INTEGER*4, lengths in angstroms, angles in degrees.

        NX = Number of grid points defining one "cell length" along
        NY = respective axis. Implicitly defines grid spacing as
        NZ = del x = A/NX, del y = B/NY and del z = C/NZ

        IXMN, IXMX = Minimum, maximum grid index defining map region such
        IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.
        IZMN, IZMX =

The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along x, starting at IXMN. y is slowest varying, i.e. the file could have been created with the following FORTRAN code:

              DO 30 IY=IYMN,IYMX
              DO 20 IZ=IZMN,IZMX
           20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
           30 CONTINUE

***** all mask files (binary) format *****

Header record identical to that in "AVERAGED" style map file. Mask records similar to "AVERAGED" style map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Mask values which are 0, 10, 20, 30, 40 etc. indicate that the corresponding grid point is part of the envelope for masks 1,2,3,4,5 etc., respectively. All other values indicate the points do not belong to any mask envelope, and should not be used for refinement or averaging.
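The two conventions in these formats, grid index to fractional coordinate and mask value to mask number, can be sketched in Python. This is an illustration only: the function names are hypothetical, and the upper limit of 12 masks is taken from the "SET MASK NO." option rather than from the file format description.

```python
def fractional_coordinate(index, n_grid):
    """Fractional coordinate of a grid index.

    From the header definitions, x = IX * (del x) / A with del x = A/NX,
    which reduces to IX / NX. Indices may be negative or exceed NX.
    """
    return index / n_grid


def mask_number(value, max_masks=12):
    """Mask number (1-based) for a stored mask byte, or None.

    Values 0, 10, 20, ... mark points inside masks 1, 2, 3, ...
    Any other value means the point belongs to no envelope.
    Assumption: at most max_masks masks, as in the SET MASK NO. option.
    """
    if value < 0 or value % 10 != 0:
        return None
    number = value // 10 + 1
    return number if number <= max_masks else None


if __name__ == "__main__":
    print(fractional_coordinate(-12, 48))
    for byte in (0, 30, 5, 110, 120):
        print(byte, mask_number(byte))
```

Reading the binary records themselves is not shown, because the layout of FORTRAN unformatted records depends on the compiler and machine that wrote them.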
2.07 GMAP WRITE-UP

Gmap (Graphics Map) can be used to create map and skeleton files for use with external graphics programs. The program is interactive and prompts for all information. It reads a standard FSFOUR map and prompts the user to provide range information (fractional coordinate limits) for the region to be extracted. Any region is valid, including positive and negative coordinates, and spanning of unit cells. It prompts for the names of the input and output map files, and the type of output file desired. Currently one can output one of two map file types: TOM/O or CHAIN. TOM/O files can be read by the programs TOM (IRIS version of FRODO) as well as by O. CHAIN files can be read by the CHAIN program. If a CHAIN file is requested the user will be asked if CHAIN is to be run on an SGI or ESV workstation. The standard deviation for the output map is then given to aid in contour level selection later in the graphics programs.

The user is then asked if a skeleton should be generated, and if so, for the base and step level for skeletonisation. Suggested base and step levels are 1.25*sigma and sigma, respectively. The user is then prompted for the minimum length to designate main chain skeletons (typically 10). After skeletonisation, one is prompted for the type of skeleton output desired. Two types are available: TOM skeleton files and O skeleton data blocks. In both cases one is prompted for the output file name. If an O skeleton data block is requested, the O molecule name (not to exceed 5 characters) is also prompted for. One can create both TOM and O skeleton output in the same job.

INPUT FILE: Standard FSFOUR map, in the default orientation (NORN=0).

NOTES:

1) Generation of the graphics maps can be highly machine specific. The current versions of GMAP, if run on an IRIS, SUN, ESV, IBM R6000, or DEC ALPHA (OSF or OPENVMS) workstation will produce map files readable by TOM, O or CHAIN on SGI or ESV workstations.
In all cases the map files are directly readable by TOM, O or CHAIN on the target workstation, but will have to be transferred to the graphics workstation (by ftp, with type BINARY set) unless the disks are cross mounted via NFS. At present there is no provision to create map files for older (non-workstation, eg PS300) graphics programs. 2) The map files produced are binary, random access "DSN6" like maps, and should be used directly in the graphics programs without the need for running "mappage", "vaxmap" or any other formatting programs. 3) With some versions of TOM, it may be necessary to recompile TOM on SGI workstations with the f77 flag -old_rl set for the map files to be used correctly. 4) TOM style skeleton files are binary files which can be used with the TOM program on an IRIS, even if they are created with the VAX version of GMAP. If such files are generated on a VAX they must be transferred to an IRIS (via ftp, with the type binary flag set) prior to use on the workstation. If GMAP is run on an IRIS, the TOM style skeleton file can be used directly. O style skeleton data blocks are ascii files, and can be created and used interchangeably on all computer systems. 5) Some versions of TOM limit skeleton files to a maximum of 16000 skeleton points. The current program can generate larger files since O can handle them, but it may be necessary to reduce the region and/or increase the base until the number of points is below 16000 if TOM is to be used. 6) The skeletonisation routine is essentially that which originated in Uppsala, and is simply incorporated in GMAP for convenience.
2.08 MISSNG WRITE-UP MISSNG is a program to compare reflections in the main phased data set with those in other files, and output those reflections absent in the main file but present in the others. The output file thus contains candidates for phase extension. Usually the main (phased) data set is the output from PHASIT and contains all reflections which survived the cutoffs applied. The additional input files should include the complete native set, from which the "merged" data files input to PHASIT were created, but can also include other "phased" files (i.e. files containing phase probability distribution coefficients). In this way additional reflections for which native amplitudes are available (and possibly phase probability distribution coefficients) can be selected for phase extension via option 3 in BNDRY. The program is interactive, and prompts for the input and output file names, whether or not an additional "phased" file is to be included ( perhaps obtained from a partial structure via PHASIT, SF mode with IHLCF=1 and ISIGA=0 or 1), a d spacing cutoff, and the parameter file. The output file contains all additional reflections currently missing from the PHASIT file out to DCUT resolution for which native amplitudes are available. If the additional phased file was included, then for those reflections the phase distribution coefficients are also output and they will be used during phase combination in BNDRY. For those reflections without distribution coefficients, the calculated phase will be output in BNDRY as there is no phase information to combine with. The output file is generally called "extrfl.d". First the main phased file is read. Then if an additional phased file is to be used it is read and the reflections compared with those in the main file. The additional reflections are then written out along with the distribution coefficients. 
The native file is then read and the indices are transformed to the standard asymmetric unit, and compared to all reflections previously encountered. Those reflections not yet utilized will then also be written to the output file, but will not contain distribution coefficients. ***** FILES ***** The input main phased file should be a PHASIT style output file (long format, i.e. includes probability distribution coefficients). Only the indices are used, so that this file may contain either FM*FO and FO or FO and FC in the amplitude slots. If an additional phased file is used, it should also be a PHASIT style output file (long format), but it MUST contain FM*FO and FO in the amplitude slots as indices, FO and the distribution coefficients are to be used. It could thus be generated in PHASIT, SF mode with IHLCF=1 and ISIGA=0 or 1, or in PHASIT, phasing mode. The input complete native file can be one of three types. If the filename ends with ".MU" or ".mu", then a XENGEN like MULIST is assumed. Thus the file should contain records with H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). The "Iflag" parameter is not used and may be absent. If the filename ends with ".SCA" or ".sca", then a SCALEPACK file is assumed. After a variable number of header records (see the FILE FORMATS section), reflection records follow and contain H, K, L, I+, sig(I+), I-, sig(I-) in format (3I4, 4F8.1) Note the use of intensities rather than F's. The last two items in each record may be omitted. If present, they would be used only if I+ was not measured. If the file name does not end with ".MU", ".mu", ".SCA" or ".sca" each record is assumed to contain H, K, L, F, Sig(F) and is read in free format, i.e. each item must be separated by at least one space or a comma. The indices must be INTEGERS and the F and Sig(F) values REALS. 
In all cases the corresponding F values should be identical to (or at least on the same scale as) those input to PHASIT.

The output file is generally named extrfl.d for compatibility with the supplied template command procedures. Its records will contain

     H, K, L, FNAT, A_B, C_D     in format (3I4, F10.2, 2I12)

where the distribution coefficients are packed two per word in A_B and C_D according to

     A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384
     C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384

If distribution coefficients are not available the A_B and C_D values are zero. This file can be used for phase extension in option 3 of BNDRY.

Note that MISSNG should always be used to prepare the file for phase extension rather than some other program. This is because MISSNG will ensure that the output indices correspond to the same standard asymmetric unit as the rest of the program package. If this is not done, it is possible for redundant (symmetry related) reflections to creep into the data set.
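The packing of two distribution coefficients into one word can be illustrated with a short Python sketch. It is not part of the package: the function names are hypothetical, IFIX is assumed to truncate toward zero (as Python's int() does), and each scaled coefficient is assumed to lie between -16384 and 16383 so that it fits in one 15 bit slot.

```python
OFFSET = 16384   # shifts a scaled coefficient into a non-negative range
BASE = 32768     # 2**15, the size of one packed slot


def pack_pair(a, b):
    """Pack two coefficients into one integer word, as in the A_B formula."""
    ia = int(a * 100)   # int() truncates toward zero (assumed to match IFIX)
    ib = int(b * 100)
    if not (-OFFSET <= ia < OFFSET and -OFFSET <= ib < OFFSET):
        raise ValueError("coefficient out of range for 15 bit packing")
    return (ia + OFFSET) * BASE + ib + OFFSET


def unpack_pair(word):
    """Invert pack_pair; a zero word means no coefficients are available."""
    if word == 0:
        return None
    hi, lo = divmod(word, BASE)
    return ((hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0)
```

Under these assumptions pack_pair(1.5, -0.25) gives 541802471, and unpacking that word returns (1.5, -0.25). Packing A = B = 0 gives 536887296 rather than 0, so a zero word cannot be confused with a pair of zero coefficients.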
2.09 MRGDF WRITE-UP MRGDF is a program to create coefficients for difference or cross-difference Fourier synthesis calculations, i.e. try to solve a new derivative from phase information obtained from one or more other derivatives. It can also be used simply to search for additional heavy atom sites once initial estimates of protein phases become available, although the "difference coefficients" file output from PHASIT may be better suited in this case since one can then also subtract out the heavy atoms already present in the model, and also generate a "calculated" Patterson for comparison with the "observed" one. The program is interactive and prompts for the names of the input and output files, and a d spacing cutoff. The output file can be used in FSFOUR with MAPTYP=3 to compute the difference or the cross-difference Fourier synthesis. If the input derivative file is one of the "merged" data files originally input to PHASIT, then the coefficients output can be used to compute a difference Fourier to identify other heavy atom sites which may have been overlooked. In that case it is not a "cross-difference" Fourier but a straight difference Fourier. Program PSRCH can be used to list the strongest peaks in the map. The program will also prompt the user to supply values for derivative to native scale and delta B factors, if rescaling is requested. If utilized, this option enables the user to change the scaling originally carried out in CMBISO, to reflect the fact that additional scattering power is present in the derivative data set. In that case the new scaling parameters should be those determined from PHASIT in phase refinement mode. ***** FILES ***** The input scaled data file is identical (in form) to the isomorphous "merged" data file input to PHASIT. Each record should contain H, K, L, FP, SIG(FP), FPH, SIG(FPH) with FP and FPH already properly scaled together as in CMBISO. 
This file refers to the new derivative which is to be solved, or to a current derivative for which one wants to search for additional heavy atom sites. It is read in free format. The input protein phase file can be one of two types. Usually it will be the last output file from BNDRY, or an output file from PHASIT (in protein phasing mode). In general, it should contain the best available phases. The form of the file would then be identical to that output from BNDRY or PHASIT (in protein phasing mode). It is also possible however, to input a protein phase file which contains records with the "short" reflection file form (only h,k,l,fo,fc,phi) as generated by GREF or PHASIT (in structure factor calculation mode with IHLCF=0). In that case there is no figure of merit present, thus FOM= 1. is used during generation of the output coefficients. This would be the case if the protein phases come from a complete (or partial) protein model based structure factor calculation. The program can automatically determine which type of file was input. Note however, that a protein phase file generated by GREF should NOT be used here if GREF was run using Bijvoet difference magnitudes as " FOBS", as the phases on the file are then shifted by 90 degrees relative to their true values. (See GREF writeup). The output file is binary and is suitable for input to FSFOUR. Each record contains H, K, L, FOM*FPH, FOM*FP, PHICALC where the indices are INTEGERS, the other quantities REALS and PHICALC is in degrees. The figure of merit and phase come from the phased file while FPH and FP come from the derivative data file.
2.10 MRGBDF WRITE-UP

MRGBDF is a program to create coefficients for Bijvoet difference or cross Bijvoet difference Fourier synthesis calculations, i.e. try to determine anomalous scatterer locations from phase information obtained from one or more other derivatives. It can also be used simply to search for additional anomalous scatterer sites once initial estimates of protein phases become available, although the "difference coefficients" file output from PHASIT may be better suited in this case since one can then also subtract out the heavy atoms already present in the model, and also generate a "calculated" Patterson for comparison with the "observed" one.

The program is interactive and prompts for the names of the input and output files, and a d spacing cutoff. A merged file input may be either a "native anomalous scattering" or "derivative anomalous scattering" type file (see PHASIT write-up), and the user will be prompted to identify the type. The output file can be used in FSFOUR with MAPTYP=8 to compute the Bijvoet-difference Fourier synthesis.

If the input Bijvoet pair file is one of the "merged" data files originally input to PHASIT, then the coefficients output can be used to compute a Bijvoet difference Fourier to identify additional anomalous scatterer sites which may have been overlooked. In that case it is not a "cross" Fourier but a straight Bijvoet difference Fourier. If the Bijvoet pair file input corresponds to a new derivative, then the coefficients can be used for a "cross" Bijvoet difference Fourier to reveal the locations of anomalous scatterers in the new derivative.

The program is also useful to determine whether the heavy atom hand designation agrees with the anomalous scattering data. When everything is consistent, the maps computed should reveal POSITIVE peaks at the appropriate anomalous scattering sites for all Bijvoet pair data sets. If the hand for the heavy atoms used in phasing is inconsistent, then the map will reveal NEGATIVE peaks at the true sites, i.e. those related to the input (incorrect) set BY A CENTRE OF SYMMETRY. Program PSRCH can be used to list the strongest peaks (both positive and negative) in the map, and program HNDCHK can be used to aid in hand determination by examining the density precisely at any arbitrary location. In general, one could phase the data using both possible hands and check the results as just described.

If derivative Bijvoet differences are used, the program will prompt the user to supply values for derivative to native scale and delta B factors, if rescaling is requested. If utilized, this option enables the user to change the scaling originally carried out in CMBANO, to reflect the fact that additional scattering power is present in the derivative data set. In that case the new scaling parameters should be those determined from PHASIT in phase refinement mode.

***** FILES *****

The input Bijvoet pair file is identical (in form) to one of the "merged" data files input to PHASIT. Each record should contain either

     H, K, L, F+, SIG(F+), F-, SIG(F-)
or
     H, K, L, FP, SIG(FP), FPH+, SIG(FPH+), FPH-, SIG(FPH-)

This file refers to the new derivative which is to be solved, or to a current data set for which one wants to search for additional anomalous scatterer sites. It is read in free format.

The input protein phase file can be one of two types. Usually it will be the last output file from BNDRY, or an output file from PHASIT (in protein phasing mode). In general, it should contain the best available phases. The form of the file would then be identical to that output from BNDRY or PHASIT (in protein phasing mode). It is also possible however, to input a protein phase file which contains records with the "short" reflection file form (only h,k,l,fo,fc,phi) as generated by GREF or PHASIT (in structure factor calculation mode, IHLCF=0). In that case there is no figure of merit present, thus FOM= 1. is used during generation of the output coefficients. This would be the case if the protein phases come from a complete (or partial) protein model based structure factor calculation. The program can automatically determine which type of file was input. Note however, that a phase file generated by GREF should NOT be used here unless GREF was used to compute structure factors from a complete protein model.

The output file is binary and is suitable for input to FSFOUR. Each record contains

     H, K, L, FOM*F+, FOM*F-, PHI+

where the indices are INTEGERS, the other quantities REALS and PHI+ is in degrees.

     PHI+ =  PHICALC for + type output file.
     PHI+ = -PHICALC for - type output file.

The figure of merit and PHICALC come from the phased file while F+ and F- come from the Bijvoet pair data file. The Bijvoet difference Fourier should be computed with coefficients

     -i * (FOM*F+ - FOM*F-) * exp (i * PHI+)

where the -i factor is applied after expansion to a hemisphere, and during the expansion, care is taken to ensure that the differences are "flipped" if putting the reflection into the desired hemisphere involves an inversion. This is taken care of automatically in FSFOUR if MAPTYP=8 is selected.
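The coefficient expression above can be evaluated for a single record with a few lines of Python. This is only a sketch for checking numbers by hand: the function name is hypothetical, and it deliberately ignores the hemisphere expansion and the "flipping" of differences that FSFOUR performs.

```python
import cmath
import math


def bijvoet_coefficient(fom_fplus, fom_fminus, phi_plus_deg):
    """Return -i * (FOM*F+ - FOM*F-) * exp(i * PHI+) as a complex number.

    fom_fplus and fom_fminus are the already weighted amplitudes stored in
    the output record; phi_plus_deg is PHI+ in degrees.
    """
    diff = fom_fplus - fom_fminus
    return -1j * diff * cmath.exp(1j * math.radians(phi_plus_deg))
```

For example, with FOM*F+ = 10, FOM*F- = 8 and PHI+ = 90 degrees the coefficient is (to rounding) the real number 2, which shows how the -i factor rotates the phase by -90 degrees.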
2.11 RD31 WRITE-UP

RD31 is a program which reads a binary phase file output from PHASIT, BNDRY, GREF, MAPINV, MRGDF, MRGBDF or IMPORT and converts it to a formatted file. The formatted file can be examined and edited if desired, or it can be used to interface the current phase information with other programs. Originally this program was used to convert the binary file (generated on a Silicon Graphics IRIS computer) to ASCII so that it could be transferred over an ethernet link for use on a VAX.

The program is interactive and prompts only for the names of the input and output files. It can read both the "full" style reflection records (includes probability distribution coefficients, FOM etc), or the "short" style records produced by MAPINV, GREF, MRGDF etc, and automatically determine which type was input. Program MK31B can then be used to recreate the binary file from the output of this program.

***** FILES *****

The input binary file should be the phased file output from PHASIT (in protein phasing mode), BNDRY or IMPORT. It can also be the short phase file output from GREF, MAPINV, MRGDF, MRGBDF or PHASIT (in structure factor calculation mode, IHLCF=0).

The output ASCII file contains records with

     H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FM
     in FORMAT ( 3I4, 2F10.2, F7.2, 2I12, I5, F6.3 )

     H,K,L   = Miller indices
     FMFO    = Figure of merit weighted structure factor amplitude (either FOM * FP or FOM * F+)
     FO      = Observed structure factor amplitude (either FP or F+)
     PHIBEST = Best (centroid) phase, in degrees.
     IPRAB,
     IPRCD   = Hendrickson-Lattman coefficients A,B,C,D for the phase probability distribution used, packed two per word as
               IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
               IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
     MK      = Restricted phase indicator. For general reflections MK=1, for centric reflections MK > 1 and one of the allowed phase values is (MK-1)*15 degrees (the other possibility is 180 degrees away).
     FM      = Figure of merit (FOM) associated with PHIBEST and used for weighting.

If a file with "short" records was input, only the indices, F values and phase are written in the output records.
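For users who want to read the "full" ASCII records from another program, the fixed column layout can be handled as in the following Python sketch. The function names are hypothetical and the code only mirrors the FORMAT and the MK rule stated above; it is not taken from RD31 or MK31B.

```python
# Field widths and types for (3I4, 2F10.2, F7.2, 2I12, I5, F6.3)
WIDTHS = (4, 4, 4, 10, 10, 7, 12, 12, 5, 6)
KINDS = (int, int, int, float, float, float, int, int, int, float)


def format_full_record(h, k, l, fmfo, fo, phibest, iprab, iprcd, mk, fm):
    """Write one record in the fixed column layout of the ASCII file."""
    return (f"{h:4d}{k:4d}{l:4d}{fmfo:10.2f}{fo:10.2f}{phibest:7.2f}"
            f"{iprab:12d}{iprcd:12d}{mk:5d}{fm:6.3f}")


def parse_full_record(line):
    """Split one fixed column record into its ten typed fields."""
    fields = []
    pos = 0
    for width, kind in zip(WIDTHS, KINDS):
        fields.append(kind(line[pos:pos + width]))
        pos += width
    return tuple(fields)


def allowed_centric_phases(mk):
    """Return the two allowed phases (degrees) for MK > 1, else None."""
    if mk <= 1:
        return None          # general reflection, phase is unrestricted
    first = (mk - 1) * 15
    return (first % 360, (first + 180) % 360)
```

A record written with format_full_record is 74 characters long and is read back unchanged by parse_full_record. For MK=7 the allowed phases come out as 90 and 270 degrees.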
2.12 MK31B WRITE-UP MK31B reads the ASCII version of the phase file created by RD31 and creates a binary version. Both "full" or "short" files can be input, and the program automatically determines which type was provided. See RD31 section for details on the files. The program is interactive and prompts only for the names of the input and output files.
2.13 PSRCH WRITE-UP

PSRCH is a program which searches the map computed by FSFOUR and lists the coordinates and heights for the largest peaks (positive or negative). Only unique peaks will be listed. It has other options useful in small molecule crystallographic applications, but for protein work only the peak list is used, hence the other options are bypassed and will not be described. This is essentially the peak search program used in the MULTAN78 direct methods programming package.

INPUT DATA (UNIT 5)

CARD 1    PAMFIL   (free format)
     PAMFIL = Name of input file containing cell parameters and symmetry information (used only to get running log file name).

CARD 2    MAPFIL   (free format)
     MAPFIL = Name of map file input.

CARD 3    NPEAK, NEGP   (free format)
     NPEAK  = Number of largest peaks to list (Max=180)
            = 0 gives the default of (11*n)/9 where n = number of independent atoms (excluding H) as input to FSFOUR
     NEGP   = 0 to list positive peaks, 1 to list negative peaks

***** FILES *****

The input binary map file should be the output file from FSFOUR. FSFOUR must have been run with the parameter NORN=0 specifying the XZ section orientation.
2.14 CMBISO WRITE-UP

CMBISO is an interactive program to merge native data with derivative isomorphous replacement data. It is used to prepare a "merged" file for PHASIT to phase by the isomorphous replacement method, to make an input file for TOPDEL to create difference Patterson coefficients, to make input files for MRGDF for cross Fouriers and to make input files for GREF for refining heavy atom sites against isomorphous differences.

The program prompts for the names of input and output files, and whether or not one wants to include additional non-Wilson scaling corrections. The program matches up all derivative reflection data with the corresponding native data, scales the derivative data to the native, and outputs merging R factor statistics and statistics regarding the isomorphous differences. The reflections need not be indexed identically in both input files as symmetry information is used to match up the data. Each input data set however, should contain only unique reflections. Note that Friedel's law is assumed to hold when matching up the data.

The data sets are initially scaled by computing a relative Wilson plot, and applying scale and thermal corrections derived from it. The user is then asked whether additional non-Wilson scaling should also be done. If it is, then the user is asked whether anisotropic or local scaling should be done.

If anisotropic scaling is requested, the reciprocal lattice vectors are orthogonalized and the elements of a symmetric 3x3 scaling tensor are refined by two cycles of least squares. The anisotropic scaling is then applied to all of the derivative reflections, and the tensor elements are printed out. For an isotropic distribution the diagonal elements should be 1.0 and the off diagonal elements zero. Thus deviations from these quantities indicate the degree of anisotropy.
If local scaling is requested, then a scale factor for each reflection is determined by a least squares fit of the F's for all neighboring reflections within a given sphere radius to the corresponding native F's, neglecting the central reflection to be scaled. For each reflection the sphere radius is initially chosen to encompass about 125 reflections, and the derived scale factor is accepted if at least 80 neighbors are found. If needed, the sphere radius will be incrementally adjusted until either a preset maximum is reached, or 80 neighbors are found. If the maximum is reached, then the scale factor will still be accepted if 40 neighbors are found. If not, the program will stop and indicate that the data set is too sparse for meaningful local scaling. The mean and minimum number of neighbors used is then listed. For both anisotropic and local scaling, the minimum and maximum scale factors that were applied are listed. ***** FILES ***** Each input file must contain records with Miller indices and the corresponding reflection data values. The files however, can be one of three types. If the file name ends with ".MU" or ".mu", then it is assumed to be a "MULIST" i.e. a file generated by program MAKEMU (in the XENGEN system) or by program FBSCALE. In that case each record is assumed to contain H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). Only the indices, F and Sig(F) are used. The "Iflag" parameter may be absent. If the filename ends with ".SCA" or ".sca", then a SCALEPACK file is assumed. After a variable number of header records (see the FILE FORMATS section), reflection records follow and contain H, K, L, I+, sig(I+), I-, sig(I-) in format (3I4, 4F8.1) Note the use of intensities rather than F's. The last two items in each record may be omitted. If present, they would be used only if I+ was not measured. 
If the file name does not end in ".MU", ".mu", ".SCA" or ".sca" each record is assumed to contain H, K, L, F, Sig(F) and is read in free format, i.e. each item must be separated by at least one space or a comma. The indices must be INTEGERS and the F and Sig(F) values REALS. The output file contains records with H, K, L, FP, Sig(FP), FPH, Sig(FPH) in format ( 3I4, 4F10.2). The last 2 quantities are rescaled to match the native data set. This file is suitable for input to PHASIT using the ISOFLG=0 option, or for input to MRGDF to prepare coefficients for difference or cross difference Fouriers.
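The local scaling procedure described in this write-up can be sketched in Python as follows. Several details are assumptions made only for illustration: the least squares fit is taken to be a single multiplicative scale k minimising the sum of (FP - k*FPH)**2 over the neighbors, the radius is grown in fixed steps, and the caller supplies the reciprocal space distance of each neighbor. Only the thresholds of 80 and 40 neighbors come from the text; the function and parameter names are hypothetical.

```python
def local_scale(neighbors, start_radius, max_radius, step,
                wanted=80, minimum=40):
    """Return (scale, n_used) for one derivative reflection.

    neighbors: list of (distance, fp, fph) tuples for the OTHER reflections;
    the central reflection being scaled must already be left out.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    radius = start_radius
    while True:
        inside = [(fp, fph) for dist, fp, fph in neighbors if dist <= radius]
        if len(inside) >= wanted or radius >= max_radius:
            break
        radius = min(radius + step, max_radius)   # enlarge the sphere
    if len(inside) < minimum:
        raise ValueError("data set too sparse for meaningful local scaling")
    # Least squares solution for k in FP = k * FPH (assumed form of the fit)
    num = sum(fp * fph for fp, fph in inside)
    den = sum(fph * fph for fp, fph in inside)
    return num / den, len(inside)
```

In this sketch a scale is accepted as soon as 80 neighbors are inside the sphere, or with as few as 40 once the maximum radius has been reached, which follows the acceptance rule given above.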
2.15 CMBANO WRITE-UP

CMBANO is a program to merge native data with derivative anomalous scattering data. It is used to prepare a "merged" file for use in PHASIT to phase by derivative anomalous scattering, to make an input file for TOPDEL to create Bijvoet difference Patterson coefficients, to make input files for MRGBDF for Bijvoet difference Fouriers and to get input files for GREF for refining heavy atom sites against anomalous scattering differences.

The program is interactive and prompts for the names of the input and output files, and whether or not one wants to include anisotropic scaling corrections. The program matches up all derivative Bijvoet pairs with the corresponding native data, and scales the derivative data to the native. The reflections need not be indexed identically in both input files as symmetry information is used to match up the data. Each input data set however, should contain only unique reflections.

The data sets are initially scaled by computing a relative Wilson plot and applying the scale and thermal corrections derived from it. The user is then asked whether any additional non-Wilson scaling should be done. If it is, then the user is asked whether anisotropic or local scaling should be done.

If anisotropic scaling is requested, the reciprocal lattice vectors are orthogonalized and the elements of a symmetric 3x3 scaling tensor are refined by two cycles of least squares. The anisotropic scaling is then applied to all of the derivative reflections, and the tensor elements are printed out. For an isotropic distribution the diagonal elements should be 1.0 and the off diagonal elements zero. Thus deviations from these quantities indicate the degree of anisotropy.

If local scaling is requested, then a scale factor for each reflection is determined by a least squares fit of the F's for all neighboring reflections within a given sphere radius to the corresponding native F's, neglecting the central reflection to be scaled.
For each reflection the sphere radius is initially chosen to encompass about 125 reflections, and the derived scale factor is accepted if at least 80 neighbors are found. If needed, the sphere radius will be incrementally adjusted until either a preset maximum is reached, or 80 neighbors are found. If the maximum is reached, then the scale factor will still be accepted if 40 neighbors are found. If not, the program will stop and indicate that the data set is too sparse for meaningful local scaling. The mean and minimum number of neighbors used is then listed. For both anisotropic and local scaling, the minimum and maximum scale factors that were applied are listed. Note that the additional scaling (anisotropic or local), if invoked, is applied ONLY TO SCALE THE DERIVATIVE DATA TO THE NATIVE, and NOT to scale F+ to F- within the derivative. Thus one may still want to apply some type of additional scaling to the F+,F- values prior to running this program. ***** FILES ***** The native data file must contain Miller indices and the corresponding reflection data values. The derivative data file must contain Miller indices and the corresponding reflection data including Bijvoet pairs. The files however, can be one of three types. If the file name ends with ".MU" or ".mu", then it is assumed to be a "MULIST" i.e. a file generated by program MAKEMU (in the XENGEN system) or by program FBSCALE. In that case each record is assumed to contain H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). For the native file only the indices, F and Sig(F) are used. For the derivative file only the indices, F+, Sig(F+), F-, and Sig(F-) are used. If "Iflag" is present in the derivative file, then it will be used to screen for viable anomalous scattering data, and one can consult the XENGEN documentation for the meaning of the Iflag variable. If it is absent, then only reflections with both F+ and F- values greater than zero will be used. 
(This criterion is also applied even if Iflag is present.)

If the filename ends with ".SCA" or ".sca", then a SCALEPACK file is assumed. After a variable number of header records (see the FILE FORMATS section), reflection records follow and contain

     H, K, L, I+, sig(I+), I-, sig(I-)     in format (3I4, 4F8.1)

Note the use of intensities rather than F's. The last two items in each record may be omitted FOR THE NATIVE SET. If present in the native file they would be used only if I+ was not measured. For the derivative file all quantities should in general be present in each record, since the reflection will not be used if either member of the Bijvoet pair is missing. Only reflections with all F's greater than zero will be used.

If the filename does not end in ".MU", ".mu", ".SCA" or ".sca" each record is assumed to contain

     H, K, L, F, Sig(F)                        if it is the native file and
     H, K, L, F+, Sig(F+), F-, Sig(F-)         if it is the derivative file.

The file is then read in free format i.e. each item must be separated by at least one space or a comma. The indices must be INTEGERS and all F and Sig values REALS. Reflections are accepted only if the F (or F+ AND F-, for derivative data) is/are greater than zero.

It is absolutely ESSENTIAL that only valid measurements of BOTH F+ and F- are used by the program, and the screening criteria above are designed to ensure that. However, depending on the source of the data it is still possible for invalid reflections to be used. For example, some data reduction programs output only the mean F and del ano, and the user then writes a small program to convert this to the individual F+, F- values. If a del ano of zero is encountered, it MAY mean that one of the F's (either F+ or F-) WAS NEVER MEASURED! Yet the conversion program might output this as a Bijvoet pair with a Bijvoet difference of zero! Likewise some data reduction programs, even if they output individual F+ and F- values, actually set one of them equal to the other if only one was measured. This again leads to erroneous Bijvoet differences. Know what your data reduction program is doing, and be very wary of any non-centric reflections with F+ EXACTLY equal to F-, or with del ano equal to zero!

The output file contains records with

     H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)     in format ( 3I4, 6F10.2).

The last 4 quantities are rescaled to match the native data set. This file is suitable for input to PHASIT using the ISOFLG=2 option, or for input to MRGBDF to prepare coefficients for Bijvoet difference or cross Bijvoet difference Fouriers.
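Before running CMBANO it can be worthwhile to scan the derivative file for the suspicious pairs mentioned in this write-up. The following Python sketch shows one way to do so. The rejection rule (both F+ and F- must be greater than zero) is the one stated in the text; the "suspect" category, the function name and the centric flag supplied by the caller are assumptions made for illustration and are not part of CMBANO.

```python
def screen_bijvoet_pair(f_plus, f_minus, centric=False):
    """Classify one derivative record as 'reject', 'suspect' or 'ok'."""
    if f_plus <= 0.0 or f_minus <= 0.0:
        return "reject"     # one member of the pair is missing or invalid
    if not centric and f_plus == f_minus:
        return "suspect"    # exact equality may hide an unmeasured mate
    return "ok"


def count_by_class(pairs):
    """Tally the classes for a list of (f_plus, f_minus, centric) tuples."""
    counts = {"reject": 0, "suspect": 0, "ok": 0}
    for f_plus, f_minus, centric in pairs:
        counts[screen_bijvoet_pair(f_plus, f_minus, centric)] += 1
    return counts
```

A large number of "suspect" pairs among the non-centric reflections would suggest that the data reduction or conversion step has filled in unmeasured values, and that the source of the data should be checked.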
2.16 TOPDEL WRITE-UP TOPDEL is a program to examine/select reflection data based on the magnitude of isomorphous or anomalous scattering differences. It can read in any of the "merged" files input to PHASIT and will select from them data based on user supplied d spacing and F/sigma cutoffs. All selected data is then sorted according to the magnitudes of the differences, and the 15 largest differences are displayed along with the mean and standard deviation for all selected data. Generally one uses the list to identify (and reject) outliers (reflections with abnormally large differences) for the purpose of Patterson map (and possibly phasing) calculations. If outliers are present they usually will show up as a "break" at the top of the otherwise smoothly diminishing difference magnitude list. The program is interactive and prompts for the names of files, type of data file input, and the d spacing and F/sigma cutoffs. After displaying the list, the user is prompted as to whether or not a file should be prepared for Patterson map calculations. If a map is requested the user is prompted for the output file name, what percentage of the selected data is to be output, and how many reflections (starting from the top of the sorted list) are to be rejected. Thus if 60% of the data is requested with two rejections, the two largest differences are rejected and the remaining largest differences are output until 60% of the total number of selected reflections is reached. Note! If outliers are rejected, the rejection only applies to the output Fourier file. If you want to reject the outliers from all subsequent phasing calculations, you must manually edit the outliers from the "merged" file input to PHASIT. ***** FILES ***** The program will prompt for the input file name and type of file input. The allowed types are all ASCII, read in free format and can be any of the "merged" files accepted by PHASIT, i.e. 
records of either h, k, l, FP, Sig(FP), FPH, Sig(FPH) or h, k, l, FP+, Sig(FP+), FP-, Sig(FP-) or h, k, l, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-) For the first type isomorphous replacement difference magnitudes ABS(FP-FPH) are selected, displayed and (possibly) output. For the other types anomalous scattering difference magnitudes are used instead. If a Fourier output file for Patterson map calculations is requested by the user a binary file will be written with 6 words per record. The first three words are INTEGERS and the other three REALS. The records will contain h, k, l, FPH, FP, 0. or h, k, l, F+, F-, 0. depending on whether isomorphous or anomalous data was input. Note that the output file can be used in FSFOUR to compute either isomorphous or anomalous difference Patterson maps by setting MAPTYP=5. Native Patterson maps can also be computed with the file by setting MAPTYP=6.
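The selection of reflections for the Patterson file can be sketched in Python as below. The text leaves one detail open, namely whether the rejected outliers count toward the requested percentage; this sketch assumes they do not, so that the full quota is filled from the reflections following the rejected ones. The function name and the record layout (index triple, difference magnitude) are hypothetical.

```python
def select_for_patterson(records, percent, n_reject):
    """Pick the reflections to output, largest differences first.

    records: list of ((h, k, l), difference_magnitude) tuples that already
    passed the d spacing and F/sigma cutoffs.
    """
    ranked = sorted(records, key=lambda rec: rec[1], reverse=True)
    n_out = int(len(ranked) * percent / 100.0)   # quota from selected total
    # Skip the n_reject largest differences, then fill the quota (assumed).
    return ranked[n_reject:n_reject + n_out]
```

With ten selected reflections, a request for 60% and two rejections, this returns the six reflections ranked third to eighth by difference magnitude.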
2.17 GREF WRITE-UP

GREF is a program for the refinement of rigid groups against x-ray diffraction data. It can be used to refine entire protein domains, substructures or even individual atoms. It is space group general and can refine up to 24 groups with a total of up to 20000 atoms in the asymmetric unit. Each input atom must belong to a group, but the atoms can be partitioned into groups in any possible manner. The program is also used to refine heavy atom positions by defining each heavy atom to be a "one atom group", and refining only the group centroid position, occupancy and (possibly) thermal factor (i.e. omitting refinement of the group orientation parameters).

The program can read any of the "merged" reflection files input to PHASIT (for heavy atom refinement against isomorphous or anomalous scattering differences) or a general file for protein group refinement against native data. Scattering factors can be either the normal type for refinement against native data or isomorphous differences, or 2*delta f" for refinement against anomalous scattering differences. One can select all data, only centric data, or only acentric data for the calculations. The output refined coordinate file can be used as input to PHASIT for subsequent phase calculations.

INPUT DATA (UNIT 5)

CARD 1    PAMFIL   (free format)
     PAMFIL = Name of input file containing cell parameters, symmetry information etc.

CARD 2    INPCDS   (free format)
     INPCDS = Name of file containing input atomic coordinates.

CARD 3    INPREF   (free format)
     INPREF = Name of file containing input reflection data.

CARD 4    NCYCLS,IFOUR,SC,TO,CUTS,CUTMN,CUTMX,IWGHT,NXSCAT   (free format)
     NCYCLS = # of refinement cycles (= 0 for a single structure factor calculation)
     IFOUR  = 0 for no Fourier coefficient output.
            = 1 to write final Fourier coefficients to file.
     SC     = scale factor, such that Fobs=SC*Fcalc.
     TO     = Overall isotropic thermal factor. If zero, then individual thermal factors for each group must be supplied.
If non-zero, applies to all atoms, and thermal factors for each group SHOULD NOT be input. CUTS = Data selection cutoff. Rejects data with Fobs < CUTS*Sig(Fobs). CUTMN = Data selection cutoff. Rejects data with sin(theta)/lambda < CUTMN. CUTMX = Data selection cutoff. Rejects data with sin(theta)/lambda > CUTMX. IWGHT = Weighting factor indicator. = 0 for weights of 1./Sig(Fobs)**2 = 1 for unit weights. NXSCAT = number of additional atomic types for which scattering factors will be input. Note that 20 types are already stored in the program (see below), thus this is usually nonzero only for exotic atoms or wavelengths other than CU K alpha. CARD 4A OUTREF (free format) ***** Include this card ONLY if IFOUR=1 ***** OUTREF = Name of output phased reflection file CARD 4B OUTCDS (free format) ***** Include this card ONLY if NCYCLS > 0 ***** OUTCDS = Name of output coordinate file CARDS 4C Extra scattering factors (all free format) ***** Include this set of records ONLY if NXSCAT > 0 ***** Up to 5 additional atomic types may be input. For each additional atomic type, include the following 3 records REC 1 (A(J),J=1,4) (free format) A(J) = Coefficients for analytical approximation to scattering factors, as in Int. Tables, Vol IV, pages 99-101. REC 2 (B(J),J=1,4) , C (free format) B(J) = Coefficients for analytical approximation to scattering C = factors, as in Int. Tables, Vol IV, pages 99-101. REC 3 DEL f' , DEL f'' (free format) DEL f' = real part of anomalous scattering correction term. DEL f'' = imaginary part of anomalous scattering correction term. CARD 5 IFLTYP,ICLTYP,ISFTYP,MINCEN,IMODE (free format) These parameters indicate what type of reflection file is input, what type of data is to be used, and what scattering factors are to be used. IFLTYP = 0 for h,k,l,FO,Sig(FO) input, uses FOBS=FO and SIG= Sig(FO). = 1 for h,k,l,FP,Sig(FP),FPH,Sig(FPH) input, uses FOBS=ABS(FP-FPH) and SIG=mean Sig. 
          = 2 for h,k,l,FP+,Sig(FP+),FP-,Sig(FP-) input, uses FOBS=ABS(FP+ - FP-) and SIG=mean Sig.
          = 3 for h,k,l,FP,Sig(FP),FPH+,Sig(FPH+),FPH-,Sig(FPH-) input, uses FOBS=ABS(FPH+ - FPH-) and SIG=(Sig(FPH+)+Sig(FPH-))/2.
   ICLTYP = 0 to use all data types.
          = 1 to use only centric data.
          = 2 to use only acentric data.
   ISFTYP = 0 to use normal scattering factors.
          = 1 to use only 2.*delta f" as scattering factors.
   MINCEN = Minimum number of centric reflections to be used without including 25% strongest differences for acentric reflections. Applied only if ICLTYP=1 (suggested value=75, but space group considerations may dictate otherwise, see NOTES).
   IMODE  = 0 if atom types are derived from first character in atom name (only C,N,O,S,Fe recognized).
          = 1 if atom type code number is explicitly input for each atom.

CARD 6   (NGP, NAG(I), I=1,NGP)   (free format)
   NGP    = # of groups (each atom must belong to a group)
   NAG(I) = # of atoms in group I. It is assumed that the first NAG(1) atoms form group 1, the next NAG(2) atoms form group 2 etc.

CARD 7   (OCC(I), I=1,NGP)   (free format)
   OCC(I) = Occupancy factor for group I.

CARD 8   (BETA(I), I=1,NGP)   (free format)
   BETA(I) = Isotropic thermal factor for group I.
   NOTE! Include this card only if TO is zero.

FOLLOWING CARDS ARE TO BE INCLUDED ONLY IF REFINING ( NCYCLS > 0 )

CARD 9   ISC,ITO   (free format)
   ISC = 0 to hold scale factor fixed.
       = 1 to refine.
   ITO = 0 to hold overall thermal factor fixed.
       = 1 to refine. (It can be refined only if TO > 0.)

The following card must be repeated for each of the NGP groups.

CARDS 10   ITX,ITY,ITZ,IRX,IRY,IRZ,IBT,IOC   (free format)
   ITX,ITY,ITZ = 0 to hold group centroid fixed at respective x, y, or z coordinate.
               = 1 to refine group centroid translation for respective coordinate.
   IRX,IRY,IRZ = 0 to hold group orientation angle fixed with respect to x, y, or z axis (orthogonal axes).
               = 1 to refine group rotation angle about corresponding axis (all rotations about group centroid).
   IBT = 0 to hold group thermal factor fixed.
       = 1 to refine. (applicable only if TO is non zero.)
   IOC = 0 to hold group occupancy factor fixed.
       = 1 to refine.

***** FILES *****

INPREF - INPUT REFLECTION DATA (free format)

This file can have any of 4 types of structures, with the particular type designated by IFLTYP. All types are read in free format, i.e. each item must be separated by at least one blank or a comma. For all types there should be data for one reflection in each record.

For IFLTYP=0   records with h, k, l, FP, Sig(FP), with the data used as input. Typically native data for protein refinement.

For IFLTYP=1   records with h, k, l, FP, Sig(FP), FPH, Sig(FPH) and the data used is ABS(FP-FPH), mean Sigma. Typically isomorphous replacement data for heavy atom refinement.

For IFLTYP=2   records with h, k, l, FP(h,k,l), Sig( FP(h,k,l) ), FP(-h,-k,-l), Sig( FP(-h,-k,-l) ) and the data used is ABS( FP(h,k,l) - FP(-h,-k,-l) ), mean Sigma. Typically native anomalous scattering data for heavy atom refinement.

For IFLTYP=3   records with h, k, l, FP, Sig(FP), FPH(h,k,l), Sig( FPH(h,k,l) ), FPH(-h,-k,-l), Sig( FPH(-h,-k,-l) ) and the data used is ABS( FPH(h,k,l) - FPH(-h,-k,-l) ) and ( Sig( FPH(h,k,l) ) + Sig( FPH(-h,-k,-l) ) )/2. Typically derivative anomalous scattering data for heavy atom refinement.

INPCDS - INPUT ATOMIC PARAMETERS   FORMAT (1X,A1,5X,A1,I3,A4,5F10.5,I5)

One atom per record, containing CHN,RTYPE,IRES,ATOM,X,Y,Z,B,OC,ITYP
   CHN   = Single character chain identifier (not used)
   RTYPE = One letter amino acid code (not used)
   IRES  = Sequence number (not used)
   ATOM  = Atom name (used only if IMODE=0)
   X,Y,Z = Fractional atomic coordinates
   B     = Isotropic thermal factor (not used, as TO or BETA values supersede it)
   OC    = Occupancy factor (not used, as OCC value supersedes it)
   ITYP  = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20 for C,N,O,S,Fe,Pt,Hg,Au,Pb,Os,I,Zn,Ca,Mg,Cd,U,P,Br,Cl, or Sm, respectively.
         = 21 through 20+NXSCAT for the additional types, in same order as originally input.
           (This field used only if IMODE=1)

OUTCDS - OUTPUT ATOMIC PARAMETER FILE (same data and format as INPCDS)

Generated only if NCYCLS > 0. Contains the new parameters. If heavy atom refinement was done this file can be inserted in a PHASIT deck for phase calculations. It can also be input back into GREF for further refinement. The format is compatible with the Hendrickson-Konnert program PROTIN, or with PHASIT's structure factor calculation mode. Note that this file may have the same name as INPCDS, although the original contents will then be destroyed.

OUTREF - OUTPUT FOURIER COEFFICIENTS (BINARY)

Generated following the last structure factor calculation only if IFOUR=1. Contains records of h, k, l, FOBS, FCAL, PHI where the indices are INTEGERS and the other data REALS. PHI is in degrees. This file can be used in program FSFOUR for Fourier calculations, and converted to ASCII by program RD31.

***** NOTES *****

1) Generally when refining heavy atom parameters against isomorphous differences only centric data is used. In space groups where there is an insufficient number of centric reflections to refine all needed parameters, the program will include the 25% strongest differences for acentric reflections. The input parameter MINCEN determines how many acceptable centric reflections MUST be found to SUPPRESS the automatic inclusion of the acentric data. A good value of MINCEN is 75, but in some space groups it may be necessary to set it to a large (unobtainable) number to force automatic inclusion of the strongest differences for acentric data. An example of this would be space group P2(1) with more than one heavy atom input. In that case there might be a hundred or more centric reflections, but all will be h,0,l reflections, thus the y coordinate of the second atom can not be refined without including some acentric data. Similarly, in space group P1 only centric reflections should be requested (even though there aren't any), but MINCEN should be nonzero.
This will force ONLY THE 25% LARGEST ACENTRIC DIFFERENCES to be used. For orthorhombic space groups there generally are enough centric reflections to refine all parameters, thus MINCEN=75 is usually sufficient. The MINCEN parameter is used only if ICLTYP=1, i.e when only centric reflections are requested. 2) When refining heavy atom parameters against anomalous scattering differences, one should use only a certain percentage of the largest differences. If ICLTYP=2 and ISFTYP=1, then heavy atom refinement against anomalous scattering differences is assumed and the program will automatically select the 25% strongest Bijvoet differences for use in the calculations. 3) If anomalous scattering data is used (ISFTYP=1), then the output phases on file 31 are not correct as they are 90 degrees less than their true values. This results from use of scattering factors of 2*delta f" instead of i*(2*delta f"). The computed structure factor amplitudes however, are correct thus the refinement is still valid. Note also that the structure factor calculation is insensitive to the hand of the heavy atoms. 4) Although the refined values of the thermal factors and occupancies are output on the new parameter file, they are ignored on input as these parameters are set based on values in the input control file. Accordingly, one must update the control file if additional cycles are to resume where previous cycles left off. Only the new positional parameters are used from the input coordinate file. One must also update the scale factor in the input control file in a similar manner. The new values of the scale, thermal and occupancy factors are listed on the log file. 5) If a group contains only a single atom, then the three orientation angles can not be refined. If a group contains only two atoms, then only two of the orientation angles can be refined (it usually doesn't matter which two, although it may if the inter-atom vector happens to be parallel to one of the orthogonal axes ). 
If only one atom is input, then the scale factor and occupancy factor can not both be refined as they are identical in that situation. 6) With low resolution data, occupancy and thermal factors are highly correlated and often can not be refined simultaneously.
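
The way IFLTYP selects the observation (CARD 5) and the way the CARD 4 cutoffs reject data can be sketched as follows. This is only an illustration in Python, not GREF's own code: the function names are hypothetical, "mean Sig" is assumed to be the arithmetic mean of the two sigmas on the record, and the free-format reader handles only blanks and commas as separators.

```python
def parse_free_format(line):
    """Split one free-format record on blanks and/or commas (simplified)."""
    return [float(tok) for tok in line.replace(",", " ").split()]


def gref_observation(ifltyp, values):
    """Return (FOBS, SIG) from the numeric fields that follow h, k, l.

    Assumption: 'mean Sig' is the arithmetic mean of the two sigmas.
    """
    if ifltyp == 0:                      # FO, Sig(FO)
        fo, sig = values
        return fo, sig
    if ifltyp == 1:                      # FP, Sig(FP), FPH, Sig(FPH)
        fp, sig_fp, fph, sig_fph = values
        return abs(fp - fph), (sig_fp + sig_fph) / 2.0
    if ifltyp == 2:                      # FP+, Sig(FP+), FP-, Sig(FP-)
        f_plus, sig_plus, f_minus, sig_minus = values
        return abs(f_plus - f_minus), (sig_plus + sig_minus) / 2.0
    if ifltyp == 3:                      # FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
        _fp, _sig_fp, fph_plus, sig_plus, fph_minus, sig_minus = values
        return abs(fph_plus - fph_minus), (sig_plus + sig_minus) / 2.0
    raise ValueError("IFLTYP must be 0, 1, 2 or 3")


def passes_cutoffs(fobs, sig, stol, cuts, cutmn, cutmx):
    """Apply the CUTS, CUTMN and CUTMX rejection tests; stol = sin(theta)/lambda."""
    if fobs < cuts * sig:
        return False
    if stol < cutmn or stol > cutmx:
        return False
    return True
```

For example, a record "1 2 3 100.0 2.0 110.0 4.0" read with IFLTYP=1 gives FOBS=10 and SIG=3 under these assumptions, and with CUTS=2 it would be kept because 10 is not less than 2*3.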
2.18 IMPORT WRITE-UP IMPORT is a program which allows the user to enter the PHASES package with externally derived phase information. It is generally used when one wants to bypass the PHASIT program, i.e. phases and Hendrickson-Lattman coefficients are computed via some external program, and one wants to use this phase information within the PHASES package, often for solvent flattening or phase combination with a partial structure. The program is interactive and prompts for the names of input and output files. It then reads the externally prepared reflection file, converts the indices, phase and Hendrickson-Lattman coefficients to correspond to the reflection in the "standard" PHASES asymmetric unit, identifies reflections with restricted phases, and writes the information in a PHASIT style file. This output file can be used within the package wherever a PHASIT file could be used, i.e. in FSFOUR, MISSNG, BNDRY, MRGDF, MRGBDF, RD31 etc. ***** FILES ***** The input file should be an ASCII (formatted) file with each record containing the following data: H, K, L, FOBS, FOM, PHI, A, B, C, D where H,K,L = Miller indices FOBS = Native structure factor amplitude FOM = Figure of merit PHI = Best (centroid) phase, in degrees A,B,C,D = Hendrickson-Lattman coefficients for phase probability distribution The file is read in free format, i.e. items must be separated by at least one blank space or comma. The indices are read as Fortran INTEGERS whereas all other data are read as REALS. The output PHASIT style binary file contains records with H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FM where H,K,L = Miller indices FMFO = Figure of merit weighted structure factor amplitude (either FOM * FP or FOM * F+) FO = Observed structure factor amplitude (either FP or F+) PHIBEST = Best (centroid) phase, in degrees. 
   IPRAB, IPRCD = Hendrickson-Lattman coefficients A,B,C,D for the phase probability distribution used, packed two per word as
                  IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
                  IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
   MK      = Restricted phase indicator. For general reflections MK=1. For centric reflections MK > 1 and one of the allowed phase values is (MK-1)*15 degrees (the other possibility is 180 degrees away).
   FM      = Figure of merit (FOM) associated with PHIBEST and used for weighting.

See the PHASIT write-up for more information.
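
The packing of two Hendrickson-Lattman coefficients into one integer word, and the meaning of MK, can be sketched in Python as below. The function names are hypothetical. IFIX is taken to truncate toward zero, and the range check (each offset value must lie in 0..32767) is an assumption added so that unpacking is unambiguous. Python uses double precision, so for some values the last digit may differ from what a REAL*4 FORTRAN calculation would store.

```python
OFFSET = 16384   # added to each scaled coefficient so it is non-negative
BASE = 32768     # multiplier separating the two packed values


def pack_pair(first, second):
    """Pack two coefficients as (IFIX(first*100)+16384)*32768 + IFIX(second*100)+16384."""
    hi = int(first * 100) + OFFSET     # int() truncates toward zero, like IFIX
    lo = int(second * 100) + OFFSET
    if not (0 <= hi < BASE and 0 <= lo < BASE):
        raise ValueError("coefficient outside the packable range")
    return hi * BASE + lo


def unpack_pair(word):
    """Recover the two coefficients (to 0.01 precision) from a packed word."""
    hi, lo = divmod(word, BASE)
    return (hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0


def allowed_centric_phases(mk):
    """Return the two allowed phases (degrees) for MK > 1, or None for MK = 1."""
    if mk <= 1:
        return None
    first = (mk - 1) * 15.0
    return first, (first + 180.0) % 360.0
```

With this scheme A=1.5 and B=-2.25 pack to the word 541802271, and MK=7 corresponds to allowed phases of 90 and 270 degrees.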
2.19 EXTRMAP WRITE-UP

Program EXTRMAP extracts a region from an input electron density map prepared by FSFOUR, and writes it to a file in a form suitable for input to any of the averaging programs (MAPORTH, MAPAVG, SKEW, LSQROT etc.). The file can also be used in MAPVIEW if the non-fsfour map mode is specified. This program is generally used to extract a 3D region from the map which encompasses the dimer, trimer etc to be averaged, i.e. a crystallographic asymmetric unit. There are no restrictions on the specified output region, i.e. unit cell edges can be crossed, both in the positive and negative direction. Note that the same thing can be done with MAPVIEW in interactive mode, but EXTRMAP is better suited for incorporation into a batch control file for multiple cycle runs.

INPUT DATA (UNIT 5)

RECORD I     PAMFIL   (free format)
   PAMFIL = Input parameter file, used only to get the "running log" filename.

RECORD II    INPMAP   (free format)
   INPMAP = Input map file, as generated by FSFOUR

RECORD III   OUTMAP   (free format)
   OUTMAP = Output map file

RECORD IV    XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX   (free format)
   XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX = Minimum and maximum coordinates, fractional, defining volume to be extracted and output.

***** FILES *****

INPUT MAP FILE (BINARY) - standard FSFOUR output, in default orientation, i.e. NORN=0

OUTPUT MAP FILE (BINARY) - contains extracted region

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees.
   NX, NY, NZ = Number of grid points defining one "cell length" along respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ
   IXMN, IXMX, IYMN, IYMX, IZMN, IZMX = Minimum, maximum grid index defining map region such that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.

The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code:

      DO 30 IY=IYMN,IYMX
      DO 20 IZ=IZMN,IZMX
   20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
   30 CONTINUE
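
For readers who need to inspect an extracted map outside the package, the record layout above can be read with a short Python sketch. It is not part of PHASES. It assumes the file is a FORTRAN unformatted sequential file in which every record is bracketed by 4-byte little-endian length markers (a common compiler convention, but compiler and platform dependent), and the function names are hypothetical.

```python
import io
import struct


def write_record(stream, payload):
    """Write one record bracketed by 4-byte length markers (assumed layout)."""
    marker = struct.pack("<i", len(payload))
    stream.write(marker + payload + marker)


def read_record(stream):
    """Read one length-bracketed record and return its payload bytes."""
    head = stream.read(4)
    if len(head) < 4:
        raise EOFError("no more records")
    (nbytes,) = struct.unpack("<i", head)
    payload = stream.read(nbytes)
    (tail,) = struct.unpack("<i", stream.read(4))
    if tail != nbytes:
        raise ValueError("record markers disagree; wrong marker size or byte order?")
    return payload


def read_map(stream):
    """Read header and density rows; rows are keyed by (IY, IZ), IY slowest."""
    header = struct.unpack("<6f9i", read_record(stream))   # 6 REAL*4 + 9 INTEGER*4
    cell = header[:6]                                      # A, B, C, AL, BE, GA
    nx, ny, nz, ixmn, iymn, izmn, ixmx, iymx, izmx = header[6:]
    row_len = ixmx - ixmn + 1
    rho = {}
    for iy in range(iymn, iymx + 1):
        for iz in range(izmn, izmx + 1):
            rho[(iy, iz)] = struct.unpack("<%df" % row_len, read_record(stream))
    return cell, (nx, ny, nz), (ixmn, ixmx, iymn, iymx, izmn, izmx), rho


def grid_to_fractional(index, npoints):
    """x (fractional) = IX * (del x) / A, which reduces to IX / NX."""
    return index / npoints
```

The same reader could be adapted to mask files by unpacking each row as signed bytes (format character "b") instead of REAL*4 values, again subject to the record-marker assumption.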
2.20 EXTRMSK WRITE-UP

Program EXTRMSK extracts a region from an input solvent mask file (prepared by BNDRY) and writes it to a file in a form suitable for input to any of the averaging programs (MAPORTH, MAPAVG, SKEW, LSQROT etc.). This program is needed only if one wants to edit a SOLVENT mask, and even then only if the desired asymmetric unit volume must span a cell edge. Note that a similar result can frequently be obtained with MAPVIEW provided one selects the entire map region, i.e. x,y,z all going from 0 to .999, and the mask from BNDRY is read in as well. Then upon exiting from MAPVIEW one can save a subset of the map and mask. However, if the desired subregion to be extracted does not lie completely within the bounds of the input map (for example, a cell edge must be crossed), then this program must be used instead. In EXTRMSK there are no restrictions on the specified output region, i.e. unit cell edges can be crossed, both in the positive and negative direction.

INPUT DATA (UNIT 5)

RECORD I     PAMFIL   (free format)
   PAMFIL = Input parameter file, used only to get the "running log" filename.

RECORD II    INPMSK   (free format)
   INPMSK = Input mask file, as generated by BNDRY

RECORD III   OUTMSK   (free format)
   OUTMSK = Output mask file

RECORD IV    XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX   (free format)
   XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX = Minimum and maximum coordinates, fractional, defining volume to be extracted and output.

***** FILES *****

INPUT (AND OUTPUT) MASK FILES (BINARY)

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees.
   NX, NY, NZ = Number of grid points defining one "cell length" along respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ
   IXMN, IXMX, IYMN, IYMX, IZMN, IZMX = Minimum, maximum grid index defining map region such that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.

The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 BYTE values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code where the array MASK is FORTRAN type BYTE:

      DO 30 IY=IYMN,IYMX
      DO 20 IZ=IZMN,IZMX
   20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX)
   30 CONTINUE
2.21 MAPAVG WRITE-UP

Program MAPAVG averages electron density map regions according to noncrystallographic symmetry. The NC symmetry related regions may be in the same map, in different maps (crystals) or both. The program expects the names of all input (unaveraged) map files, all corresponding mask files, all output (averaged) map files and the operators defining the noncrystallographic symmetry. The input map files should be created from FSFOUR maps by running EXTRMAP or MAPVIEW to extract the map region which encompasses only the dimer, trimer etc to be averaged for each crystal. For cross-crystal averaging monomers may be used as well. Each mask map must cover EXACTLY the same region as its corresponding input map. The mask map is generally created by MAPVIEW, and possibly transformed by TRNMSK, although if it is derived from an atomic model it may be created by MDLMSK. The operators are generally refined by LSQROT or LSQROTGEN prior to use in averaging. If cross-crystal averaging is done, an additional least squares refinement pass is automatically included prior to averaging to put the density maps from different crystals on a common scale. After averaging, however, each output map will be on the same scale as it was originally input.

INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)
   PAMFIL = Name of input parameter file, used only to get the "running log" filename.

CARD II    NCRYST   (free format)
   NCRYST = Number of different crystals (maps) to be used. (maximum = 6)

The following block of cards III-VII must be repeated NCRYST times, once for each crystal.

CARD III   INPMAP   (free format)
   INPMAP = Name of input (unaveraged) map file for this crystal.

CARD IV    INPMSK   (free format)
   INPMSK = Name of input mask file for this crystal.

CARD V     OUTMAP   (free format)
   OUTMAP = Name of output (averaged) map file for this crystal.
CARD VI    NMOL, (MSK(j), j=1,NMOL)   (free format)
   NMOL      = Total number of molecules related by noncrystallographic symmetry WITHIN THIS CRYSTAL (eg 2 for twofold, 3 for threefold etc, MAX=12. Note that it may be one if only cross-crystal averaging of monomers is used)
   MSK(1)    = Mask no. identifying envelope mask for molecule 1 in this crystal
   MSK(2)    = Mask no. identifying envelope mask for molecule 2 in this crystal
     .
     .
   MSK(NMOL) = Mask no. identifying envelope mask for molecule NMOL in this crystal

   Note that the mask numbers should correspond to those used during mask creation (1-12), and refinement of the operator(s).

The following card must be repeated NMOL-1 times, with each entry providing the operator which moves molecule 1 to each additional NC related molecule WITHIN THIS CRYSTAL. For example, for a pure threefold, the operator which moves molecule 1 to molecule 2 and the operator which moves molecule 1 to molecule 3 must both be supplied, although the parameters will be the same except for CHI. In that case all three molecules may have the same mask no. Note that if NMOL=1 this card should NOT be included!

CARD(S) VII   PHI, PSI, CHI, OX, OY, OZ, T   (free format)
   PHI, PSI, CHI = Spherical polar angles defining direction and rotational order of noncrystallographic symmetry axis, oriented with respect to orthogonal frame with X along a, Y along c* cross a, and Z along x cross y (i.e. c*).
                   Psi  = angle between NC symmetry axis and +Y axis.
                   Phi  = angle between projection of NC symmetry axis on XZ plane and +X axis.
                   +Phi = CCW rotation about +Y axis as measured from +X axis.
                   +Chi = CW rotation about the directed axis, when viewed from the +axis toward the origin.
                   All angles in degrees.
   OX, OY, OZ    = Origin of NC symmetry rotation axis, in angstroms with respect to the orthogonal axes. The axis passes through this point.
   T             = Post rotation translational shift (in angstroms) parallel to the rotation axis.
Note that the transformation operator input is defined as that which moves molecule 1 to molecule J (both molecules within this crystal, with J ranging from 2 to NMOL) via Xj = (Rm) (X1 - Xo) + Xo + T*Rx where Rm is a 3x3 rotation matrix expressed in terms of the spherical polar angles, Xj, X1 are 3 element column vectors containing new and old coordinates, respectively, Xo is a 3 element column vector containing coordinates of the origin point for the rotation axis, T is a post rotation translation shift scalar (in angstroms) and Rx is a 3 element column vector containing direction cosines of the rotation axis. The translation shift T is for a translation parallel to the rotation axis (screw like) as translations in any other direction can be achieved simply by changing the rotation axis origin. An initial estimate of T can be obtained from two points P1, P2 related by the NC symmetry from T = DX cos(PHI)sin(PSI) + DY cos(PSI) -DZ sin(PHI)sin(PSI) where DX = P2x-P1x, DY = P2y-P1y, DZ = P2z-P1z and the P's are expressed in the orthogonal axial system. Note the directionality of the transformation (P1 going to P2 as opposed to P2 going to P1) affects the sign of T (and CHI). THIS IS THE END OF INPUT UNLESS DOING CROSS-CRYSTAL AVERAGING **** The following cards should be included ONLY if NCRYST > 1 **** Cards VIII must be repeated NCRYST -1 times, with each entry providing the operator which moves molecule 1 in crystal 1 to molecule 1 in crystal 2, molecule 1 in crystal 1 to molecule 1 in crystal 3, molecule 1 in crystal 1 to molecule 1 in crystal 4 etc. CARD(S) VIII PHI, PSI, CHI, OX, OY, OZ, T (free format) PHI = PSI = All defined as described above. Note that the operator CHI = is applied to ORTHOGONAL coordinates in crystal 1 to OX = generate ORTHOGONAL coordinates in the target crystal. OY = OZ = T = NOTES: Each input mask must coincide exactly with its corresponding input map. 
CROSS-CRYSTAL AVERAGING: If the different crystals contain different aggregation states of the molecule within their respective asymmetric units (eg monomer in one crystal, dimer in another), then the crystal with the lowest aggregation state should come first in the input list, and mask assignments in the other crystals must uniquely identify molecules of this same size. Thus for example, if a crystal contained a dimer having pure NC twofold symmetry and it was the only crystal used, normally a single mask encompassing the entire dimer would be supplied (mask numbers would be identical for molecules 1 and 2). If however, in addition to averaging over this twofold, one also averages with another crystal form containing only a monomer, then the monomer crystal should come first in the list, and different mask numbers must be used within the dimer crystal to distinguish the individual monomers. If all crystal forms contain the same basic unit (eg dimers, trimers etc), then individual mask numbers for each monomer are not required, but may still be used as long as it is done consistently in all crystals.

***** FILES *****

INPUT MAP FILES (BINARY)

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees.
   NX, NY, NZ = Number of grid points defining one "cell length" along respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ
   IXMN, IXMX, IYMN, IYMX, IZMN, IZMX = Minimum, maximum grid index defining map region such that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.

The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e.
the file could have been created with the following FORTRAN code: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE INPUT MASK FILES (BINARY) Header record identical to map file. Mask records similar to normal map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with mask values of 0, 10, 20, 30, 40 etc will be used (i.e. inside envelope masks 1,2,3,4,5 etc, respectively). OUTPUT MAP FILES (BINARY) Identical (in structure) to input map file, but contains density "averaged" over the specified points.
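
The operator defined for CARD(S) VII, Xj = (Rm)(X1 - Xo) + Xo + T*Rx, and the initial estimate of T can be illustrated with the Python sketch below. It is not MAPAVG's code and the function names are hypothetical. The axis direction cosines are taken from the T formula in the write-up. The rotation uses the standard right-handed axis-angle (Rodrigues) formula, and whether that sense agrees with the write-up's "+Chi = CW" convention is an assumption that should be checked against a known pair of related points. A twofold (CHI=180) is unaffected by the choice.

```python
import math


def axis_direction(phi_deg, psi_deg):
    """Direction cosines Rx of the NC axis, as implied by the T formula."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    return (math.cos(phi) * math.sin(psi),
            math.cos(psi),
            -math.sin(phi) * math.sin(psi))


def apply_operator(x1, phi_deg, psi_deg, chi_deg, origin, t):
    """Xj = Rm (X1 - Xo) + Xo + T*Rx in orthogonal angstrom coordinates.

    Assumption: positive CHI is a right-handed rotation about Rx.
    """
    u = axis_direction(phi_deg, psi_deg)
    chi = math.radians(chi_deg)
    c, s = math.cos(chi), math.sin(chi)
    v = [x1[i] - origin[i] for i in range(3)]
    dot = sum(u[i] * v[i] for i in range(3))
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    # Rodrigues formula: v*cos + (u x v)*sin + u*(u.v)*(1 - cos)
    rotated = [c * v[i] + s * cross[i] + (1.0 - c) * dot * u[i] for i in range(3)]
    return tuple(rotated[i] + origin[i] + t * u[i] for i in range(3))


def estimate_t(p1, p2, phi_deg, psi_deg):
    """T = DX cos(PHI)sin(PSI) + DY cos(PSI) - DZ sin(PHI)sin(PSI)."""
    u = axis_direction(phi_deg, psi_deg)
    return sum((p2[i] - p1[i]) * u[i] for i in range(3))
```

As a check of the definitions, a twofold axis along +Y (PSI=0) through the origin with T=2 moves the point (1,0,0) to (-1,2,0), and estimate_t applied to that pair of points returns 2.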
2.22 MAPORTH WRITE-UP

This program orthogonalizes an electron density map for later use with programs LSQROT or LSQROTGEN. This program is only needed if one wants to refine the noncrystallographic symmetry operator(s), and even then only if the unit cell is not orthogonal. One generally extracts a region from the map encompassing one dimer, trimer etc. with MAPVIEW or EXTRMAP, and inputs only the extracted map here for orthogonalization. One can optionally orthogonalize an input mask file in addition to the map if desired, as may be the case if the mask will be used to delineate volumes to use in the refinement.

INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)
   PAMFIL = Input file specifying cell parameters, symmetry, used only to get "running log" file.

CARD II    INPMAP   (free format)
   INPMAP = Input (non-orthogonal) map, from EXTRMAP or MAPVIEW

CARD III   IRANGE   (free format)
   IRANGE = 0 for normal operation.
          = 1 to simply compute orthogonal coordinates for all points in input map, and output range (in orthogonal coordinates) which just encompasses all of the input map. The range can then be used to determine map parameters for a subsequent run with IRANGE=0.

******** following CARDS read only if IRANGE = 0 ********

CARD IV    OUTMAP   (free format)
   OUTMAP = Output map file to contain orthogonal map.

CARD V     a', b', c'   (free format)
   a', b', c' = Cell lengths (in angstroms) for the output orthogonal map. They should be large enough to cover the same volume as the input map, when it is referenced in the orthogonal system. New orthogonal cell a',b',c' has a' along old a, b' along old c* cross old a, c' along a' cross b' (i.e. old c*)

CARD VI    MGX,MGY,MGZ,LXMN,LXMX,LYMN,LYMX,LZMN,LZMX   (free format)
   MGX, MGY, MGZ = Number of grid points defining one "cell length" along the respective orthogonal axis. Implicitly defines grid spacing as del x = a'/MGX, del y = b'/MGY and del z = c'/MGZ
   LXMN, LXMX, LYMN, LYMX, LZMN, LZMX = Minimum, maximum grid index defining output map region such that x (fractional) = LX * (del x) / a' etc. There are no restrictions on magnitudes or signs.

CARD VII   IMASK   (free format)
   IMASK = 0 for no mask input.
         = 1 for input mask corresponding to input map. The mask will also be "orthogonalized" so that it can be used by LSQROT or LSQROTGEN.

******** following cards read ONLY if IMASK=1 ********

CARD VIII  INPMSK   (free format)
   INPMSK = Input mask file, from MAPVIEW or EXTRMSK

CARD IX    OUTMSK   (free format)
   OUTMSK = Output (orthogonal) mask file

NOTES: If a mask is input, it must coincide exactly with the input map.

******** FILES ********

INPUT MAP FILE (BINARY)

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees.
   NX, NY, NZ = Number of grid points defining one "cell length" along respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ
   IXMN, IXMX, IYMN, IYMX, IZMN, IZMX = Minimum, maximum grid index defining map region such that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.

The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code:

      DO 30 IY=IYMN,IYMX
      DO 20 IZ=IZMN,IZMX
   20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
   30 CONTINUE

INPUT MASK FILE (BINARY)

Header record identical to map file. Mask records similar to normal map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with mask values of 0, 10, 20, 30, 40 etc will be transformed (i.e. inside envelope masks 1,2,3,4,5 etc, respectively).
OUTPUT MAP FILE (BINARY) Identical (in structure) to input map file, but cell, map and density values correspond to the orthogonal "cell". OUTPUT MASK FILE (BINARY) Header record identical to map file. Mask records similar to normal MASK records, but cell, map and mask values correspond to the orthogonal "cell".
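
The orthogonal frame used here (X along a, Y along c* cross a, Z along c*) corresponds to a standard fractional-to-orthogonal matrix, sketched below in Python together with the kind of range calculation that IRANGE=1 reports. This is an illustration under that reading of the frame definition, not MAPORTH's own code, and the function names are hypothetical.

```python
import math
from itertools import product


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Rows of the matrix M with X_orth = M * x_frac (angles in degrees)."""
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    sg = math.sin(math.radians(gamma))
    # unit-cell volume divided by a*b*c
    vol = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return ((a, b * cg, c * cb),
            (0.0, b * sg, c * (ca - cb * cg) / sg),
            (0.0, 0.0, c * vol / sg))


def fractional_to_orthogonal(m, frac):
    """Convert one fractional coordinate triple to orthogonal angstroms."""
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))


def orthogonal_range(m, fmin, fmax):
    """Bounding box (min, max per axis) of a fractional box in the orthogonal frame.

    The map is linear, so the extremes occur at the eight corners.
    """
    corners = [fractional_to_orthogonal(m, corner)
               for corner in product(*zip(fmin, fmax))]
    lows = tuple(min(p[i] for p in corners) for i in range(3))
    highs = tuple(max(p[i] for p in corners) for i in range(3))
    return lows, highs
```

For a monoclinic cell with a=10, b=20, c=30 and beta=120 degrees, the fractional point (0,0,1) maps to approximately (-15, 0, 25.98) angstroms, which shows why the orthogonal cell lengths on CARD V generally have to differ from the original cell lengths to cover the same volume.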
2.23 LSQROT WRITE-UP This program refines the orientation and position of a noncrystallographic symmetry axis by least squares. It is applicable only to pure rotational noncrystallographic symmetry axes, and the rotation must be N-FOLD where N is an integer. An input map (which MUST be orthogonal) is read in along with control information specifying initial values for the operator, and what area in the map to consider. The map area considered can be all points within a given distance from an arbitrary input point, all points within an input mask, or all points simultaneously satisfying both conditions. The input map usually encompasses a dimer, trimer etc and was extracted from a FSFOUR map via programs MAPVIEW or EXTRMAP. The input mask, if one is used, must correspond precisely to the input map. If the input map (and mask) does not correspond to an orthogonal system, program MAPORTH should be used to convert them before they can be used here. After each cycle of refinement the correlation coefficient is printed along with the new parameters. One should always start with a low resolution map (roughly 6 angstrom data, on a 2 angstrom grid) in case the initial estimates are inaccurate. As the refinement converges higher resolution data and finer map grids should be used. It is often sufficient to refine within a sphere of radius 25-35 angstroms centered on a point on the rotational axis near the center of gravity of the dimer, trimer etc. This enables refinement of the operator without the need for an input mask (although one will be needed later for averaging). Usually correlation coefficients of about 0.4 or higher (in a 4 angstrom map) indicate the noncrystallographic symmetry axis is well positioned, and that averaging will be useful. 
INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)
   PAMFIL = Input file specifying cell and symmetry parameters, used only to get "running log" file

CARD II    INPMAP   (free format)
   INPMAP = Input map (orthogonal)

CARD III   PHI, PSI, OX, OY, OZ, NFOLD   (free format)
   PHI, PSI   = Spherical polar angles defining direction of the noncrystallographic symmetry axis, oriented with respect to orthogonal frame with X along a, Y along c* cross a, and Z along x cross y (i.e. c*).
                PSI  = angle between NC symmetry axis and +Y axis.
                PHI  = angle between projection of NC symmetry axis on XZ plane and +X axis.
                +PHI = CCW rotation about +Y axis as measured from +X axis.
   OX, OY, OZ = Origin of NC symmetry rotation axis, in angstroms with respect to the orthogonal axes. The axis passes through this point.
   NFOLD      = Order of the rotational axis, e.g. 2,3,4 etc.

CARD IV    NOBS, NCYCLE, ISPHER, IMASK   (free format)
   NOBS   = 2 times number of reflections used to compute map (used only to compute sigmas)
   NCYCLE = Number of refinement cycles
   ISPHER = 0 use all points in map
          = 1 use only grid points within a specified sphere.
   IMASK  = 0 for no mask input
          = 1 to only use grid points within envelope specified by input mask. (also subject to ISPHER criteria)

CARD V     INPMSK   (free format)
   ******** include this card ONLY if IMASK=1 ********
   INPMSK = Input mask file

CARD VI    XCEN, YCEN, ZCEN, RAD   (free format)
   ******** include this card ONLY if ISPHER=1 ********
   XCEN, YCEN, ZCEN = Sphere center, in Angstroms, with respect to orthogonal coordinate system.
   RAD              = Sphere radius, in Angstroms

CARD VII   ( IVAR(I), I=1,5 )   (free format)
   Variable selection information
   IVAR(1) = 1 to refine PHI, 0 to hold fixed
   IVAR(2) = 1 to refine PSI, 0 to hold fixed
   IVAR(3) = 1 to refine OX, 0 to hold fixed
   IVAR(4) = 1 to refine OY, 0 to hold fixed
   IVAR(5) = 1 to refine OZ, 0 to hold fixed

NOTES: Input map must be orthogonal.
If the crystal system does not have orthogonal axes, program MAPORTH must be run to orthogonalize the map (and mask, if one is to be used). If a mask is input, it must coincide exactly with the input map. Normally all parameters are refined, but occasionally one must use the IVAR selection flags to hold a parameter fixed. An example would be the case where PSI is close to 0, in which case PHI is then indeterminate (and irrelevant!). One could then hold PHI fixed to avoid matrix singularities. ******** FILES ******** INPUT MAP FILE (BINARY) record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs. IZMN, IZMX = The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE INPUT MASK FILE (BINARY), if needed Header record identical to map file. Mask records similar to normal map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with mask values of 0, 10, 20, 30, 40 etc will be used (i.e. inside envelope masks 1,2,3,4,5 etc, respectively).
2.24 LSQROTGEN WRITE-UP

This program refines the orientation and position of a general noncrystallographic symmetry operator by least squares. It is applicable to any noncrystallographic symmetry transformation, including arbitrary rotation angles and post rotation translations. The operator being refined may relate regions of density within the same crystal or in different crystals, so that cross-crystal averaging is possible.

The input map(s) (which MUST be orthogonal) are read in along with control information specifying initial values for the operator, the number of crystals, and what areas in the map(s) to consider. The map areas considered can be all points within a given distance from arbitrary input points, all points within input masks, or all points simultaneously satisfying both conditions.

The input map(s) usually encompass a dimer, trimer etc. and are extracted from FSFOUR maps via programs MAPVIEW or EXTRMAP. The input masks, if used, are prepared by MAPVIEW or MDLMSK and must correspond precisely to the input maps. In the single crystal case, if masks are used, different mask values corresponding to the different molecules present allow a single mask file to be used (see MAPVIEW). One merely specifies which molecules (mask numbers) are to be used in refinement. If the input map (and its corresponding mask) does not correspond to an orthogonal system, program MAPORTH must be used to convert them before they can be used for operator refinement here.

After each cycle of refinement the correlation coefficient is printed along with the new parameters. In the two crystal case a scale factor relating the density within the appropriate envelopes in each crystal is automatically refined.

In most cases one should start with a low resolution map (roughly 6 angstrom data, on a 2 angstrom grid) in case the initial operator parameters are inaccurate. As the refinement converges, higher resolution data and finer map grids should be used.
It is often sufficient to refine considering only density within spheres of radius 15-25 angstroms, centered on either the aggregate centroid (if pure rotational symmetry is present), or centered on each molecule's center of gravity (the general case). This enables one to refine the operator without the need for an input mask (although a mask will be needed later for averaging). Usually correlation coefficients of about 0.4 or higher (in a 4 angstrom map) indicate the noncrystallographic symmetry axis is well positioned, and that averaging will be useful.

INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)

           PAMFIL = Input file specifying cell and symmetry parameters,
                    used only to get "running log" file

CARD II    NCYCLE, NCRYST   (free format)

           NCYCLE = No. of cycles of least squares refinement.
           NCRYST = No. of crystals (1 or 2). Normally 1 but for
                    cross-crystal averaging it should be 2.

CARD III   MAPFILE1   (free format)

           MAPFILE1 = Name of file containing map for crystal 1.

           ***** Include card IIIA ONLY if NCRYST=2 *****

CARD IIIA  MAPFILE2   (free format)

           MAPFILE2 = Name of file containing map for crystal 2.

CARD IV    PHI, PSI, CHI, OX, OY, OZ, T   (free format)

           PHI, PSI, CHI = Spherical polar angles defining direction and
                           rotational order of noncrystallographic symmetry
                           axis, oriented with respect to orthogonal frame
                           with X along a, Y along c* cross a, and Z along
                           X cross Y (i.e. c*).
                           PSI = angle between NC symmetry axis and +Y axis.
                           PHI = angle between projection of NC symmetry
                           axis on XZ plane and +X axis.
                           +PHI = CCW rotation about +Y axis as measured
                           from +X axis.
                           +CHI = CW rotation about the directed axis, when
                           viewed from the +axis toward the origin.
           OX, OY, OZ    = Origin of NC symmetry rotation axis, in Angstroms
                           with respect to the orthogonal axes. The axis
                           passes through this point.
           T             = Post rotation translational shift (in Angstroms)
                           parallel to the rotation axis.
Note that the transformation operator refined is defined as that which moves molecule "A" to molecule "B" via

           Xb = (Rm) (Xa - Xo) + Xo + T*Rx

where Rm is a 3x3 rotation matrix expressed in terms of the spherical polar angles, Xb, Xa are 3 element column vectors containing new and old coordinates, respectively, Xo is a 3 element column vector containing coordinates of the origin point for the rotation axis, T is a post rotation translation shift scalar (screw like, in Angstroms) and Rx is a 3 element column vector containing direction cosines of the rotation axis. All components of the operator are given in terms of orthogonal coordinates in Angstroms, and the operator is applied to (and yields) orthogonal coordinates.

For cross-crystal averaging applications molecule "A" is assumed to reside in crystal 1 and molecule "B" in crystal 2. The ORTHOGONAL coordinate systems in both crystals are then simply superimposed. Note that the operator is defined relative to only the (common) ORTHOGONAL axes, so one need not be concerned about its orientation relative to each set of CRYSTAL axes.

CARD V     ISPHERE_A, MSK_A   (free format)

           ISPHERE_A = 0 consider all points in map (subject only to MSK_A
                         criteria below) for molecule A.
                     = 1 consider only grid points within specified sphere
                         for molecule A. (also subject to MSK_A criteria
                         below).
           MSK_A     = 0 no mask input for molecule A.
                     = 1 consider only grid points within envelopes
                         specified by input mask for molecule A.
                         (also subject to ISPHERE_A criteria)

           Note that ISPHERE_A and MSK_A should not BOTH be 0, although
           both can be 1 in which case both criteria are applied for grid
           point selection.

           ***** Include cards VA and VB ONLY if MSK_A = 1 *****

CARD VA    MASK_A   (free format)

           MASK_A = Mask number (from 1-12) identifying points within
                    molecular envelope for "A". The value should correspond
                    to that used during mask creation.

CARD VB    MASKFILE1   (free format)

           MASKFILE1 = Name of file containing mask for crystal 1.
***** Include card VC ONLY if ISPHERE_A = 1 ***** CARD VC XCENA,YCENA,ZCENA,RADA (free format) XCENA = Sphere center, in Angstroms, with respect to orthogonal YCENA = coordinate system, situated in molecule "A". ZCENA = RADA = Sphere radius, in Angstroms, for molecule "A" CARD VI ISPHERE_B, MSK_B (free format) ISPHERE_B = 0 consider all points in map (subject only to MSK_B criteria below) for molecule B. = 1 consider only grid points within specified sphere for molecule B. (also subject to MSK_B criteria below). MSK_B = 0 no mask input for molecule B. = 1 consider only grid points within envelopes specified by input mask for molecule B. (also subject to ISPHERE_B criteria) Note that ISPHERE_B and MSK_B should not BOTH be 0, although both can be 1 in which case both criteria are applied for grid point selection. ***** Include cards VIA and VIB ONLY if MSK_B = 1 ***** CARD VIA MASK_B (free format) MASK_B = Mask number (from 1-12) identifying points within molecular envelope for "B". The value should correspond to that used during mask creation. CARD VIB MASKFILE2 (free format) MASKFILE2 = Name of file containing mask for crystal 2. (may be same as MASKFILE1, if NCRYST=1). ***** Include card VIC ONLY if ISPHERE_B = 1 ***** CARD VIC XCENB,YCENB,ZCENB,RADB (free format) XCENB = Sphere center, in Angstroms, with respect to orthogonal YCENB = coordinate system, situated in molecule "B". ZCENB = RADB = Sphere radius, in Angstroms, for molecule "B" CARD VII ( IVAR(I), I=1,7 ) (free format) Variable selection information IVAR(1) = 1 to refine PHI, 0 to hold fixed IVAR(2) = 1 to refine PSI, 0 to hold fixed IVAR(3) = 1 to refine CHI, 0 to hold fixed IVAR(4) = 1 to refine OX, 0 to hold fixed IVAR(5) = 1 to refine OY, 0 to hold fixed IVAR(6) = 1 to refine OZ, 0 to hold fixed IVAR(7) = 1 to refine T, 0 to hold fixed NOTES: Input maps (and masks, if used) must be orthogonal. 
If the crystal systems do not have orthogonal axes, program MAPORTH must be run to orthogonalize the maps (and masks, if used). If masks are input, they must coincide exactly with their corresponding maps.

Normally all parameters are refined, but occasionally one must use the IVAR selection flags to hold one or more parameters fixed. An example would be the case where PSI is close to 0, in which case PHI is then indeterminate (and irrelevant!). One could then hold PHI fixed to avoid matrix singularities. Also, in cases where pure translations are involved, one could hold all of the angles fixed.

The translation shift T is for a translation parallel to the rotation axis (screw like), as translations in any other direction can be achieved simply by changing the rotation axis origin. An initial estimate of T can be obtained from two points P1, P2 related by the NC symmetry from

           T = DX cos(PHI)sin(PSI) + DY cos(PSI) - DZ sin(PHI)sin(PSI)

where DX = P2x-P1x, DY = P2y-P1y, DZ = P2z-P1z and the P's are expressed in the orthogonal axial system. Note the directionality of the transformation (P1 going to P2 as opposed to P2 going to P1) affects the sign of T (and CHI).

If the transformation operator is available as a simple 3x3 matrix and 1x3 vector WHICH OPERATES ON ORTHOGONAL COORDINATES AS IN THE PROTEIN DATA BANK FRAMEWORK, then the program O_TO_SP can be used to convert that representation to the spherical polar angles, axis offset and post rotation translation needed here. An example of this usage would be to get the matrix and vector from one of the "lsq" options in the graphics program "O", and use O_TO_SP to convert the information to PHASES style.

******** FILES ********

INPUT MAP FILES (BINARY)

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis.
Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs. IZMN, IZMX = The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE INPUT MASK FILES (BINARY), if needed Header record identical to map file. Mask records similar to normal map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with mask values of 0, 10, 20, 30, 40 etc will be used (i.e. inside envelope masks 1,2,3,4,5 etc, respectively).
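The operator definition Xb = (Rm)(Xa - Xo) + Xo + T*Rx and the initial estimate of T given in the notes above can be sketched in Python as follows. The direction cosines Rx are taken from the T formula in the write-up. The rotation itself uses the standard right-handed Rodrigues formula, which is an assumption: the program's sign convention for CHI may be the opposite, in which case CHI should be negated. All function names are hypothetical and are not part of LSQROTGEN.

```python
import math


def axis_direction(phi_deg, psi_deg):
    """Direction cosines Rx of the NC symmetry axis (same terms as the T formula)."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    return (math.cos(phi) * math.sin(psi),
            math.cos(psi),
            -math.sin(phi) * math.sin(psi))


def rotate_about_axis(v, k, chi_deg):
    """Rotate v about unit vector k by chi (ASSUMED right-handed; negate chi if needed)."""
    chi = math.radians(chi_deg)
    c, s = math.cos(chi), math.sin(chi)
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    return tuple(c * v[i] + s * k_cross_v[i] + (1.0 - c) * k_dot_v * k[i]
                 for i in range(3))


def apply_operator(xa, phi, psi, chi, origin, t):
    """Xb = Rm (Xa - Xo) + Xo + T*Rx, all in orthogonal Angstrom coordinates."""
    rx = axis_direction(phi, psi)
    shifted = tuple(xa[i] - origin[i] for i in range(3))
    rotated = rotate_about_axis(shifted, rx, chi)
    return tuple(rotated[i] + origin[i] + t * rx[i] for i in range(3))


def estimate_t(p1, p2, phi, psi):
    """T = DX cos(PHI)sin(PSI) + DY cos(PSI) - DZ sin(PHI)sin(PSI), with D = P2 - P1."""
    rx = axis_direction(phi, psi)
    return sum((p2[i] - p1[i]) * rx[i] for i in range(3))
```

Because a rotation about the axis leaves the component along the axis unchanged, `estimate_t` applied to a point and its image under `apply_operator` recovers T whichever sense is used for CHI.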
2.25 SKEW WRITE-UP

Program SKEW converts a "normal" input map (and optionally, mask) to a "skewed" cell, such that the new b axis will correspond to a specified direction. The input map is usually created by MAPVIEW or EXTRMAP, and the input mask (if any) is usually created by MAPVIEW or EXTRMSK.

This program is generally used if the input noncrystallographic symmetry operator is purely rotational, with the rotation angle given by 360/N degrees where N is an integer, i.e. pure twofolds, threefolds etc. In that case it is much easier to create the averaging mask (in MAPVIEW) when looking directly down the noncrystallographic symmetry axis, and convert it back to the standard orientation (via program TRNMSK) for use in averaging cycles.

Thus one would use MAPVIEW or EXTRMAP to extract a region from the map encompassing the dimer, trimer etc. to be averaged, skew that extracted map, input it into MAPVIEW to trace out the mask in the skewed direction, and then convert the skewed mask from MAPVIEW back into the standard orientation with TRNMSK. One would then input the standard (non-skewed) mask into MAPVIEW again, along with the submap from which the skewed map was originally created, and invoke the MAKE ASU option. This last step allows one to check for redundant entries (by CRYSTAL symmetry) within the envelope, and to correct them prior to saving the final mask for averaging.

INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)

           PAMFIL = Input parameter file containing cell and symmetry
                    information, used only to get "running log" file

CARD II    INPMAP   (free format)

           INPMAP = Input map file, from MAPVIEW or EXTRMAP

CARD III   PHI, PSI, OX, OY, OZ   (free format)

           PHI, PSI = Spherical polar angles defining direction of
                      noncrystallographic symmetry axis, oriented with
                      respect to orthogonal frame X,Y,Z with X along a,
                      Y along c* cross a, and Z along X cross Y (i.e. c*).
                      PSI = angle between NC symmetry axis and +Y axis.
PHI = angle between projection of NC symmetry axis on XZ plane and +X axis. +PHI= CCW rotation about +Y axis as measured from +X axis. OX = Origin of NC symmetry rotation axis, in angstroms with OY = respect to the orthogonal axes. The axis passes through this point. OZ = CARD IV IRANGE, IMASK (free format) IRANGE = 0 for normal operation (coordinate range for OUTPUT skewed map will be input to the program by the user) = 1 to determine coordinate range for OUTPUT map which just encompasses the input volume, output it, and stop. IMASK = 0 for normal operation (only skewed map created) = 1 to additionally create skewed mask, from input mask ****** FOLLOWING CARDS READ ONLY IF IRANGE=0 ****** CARD V OUTMAP (free format) OUTMAP = Output (skewed) map file CARD VI CELL,MX,MY,MZ, LXMN,LXMX,LYMN,LYMX,LZMN,LZMX (free format) CELL = length (in Angstroms) for "cell" parameters in "skewed" cell. MX = Number of grid points defining one "cell length" along MY = respective axis in "skewed" cell. Implicitly defines grid spacing as del x = CELL/MX, del y = CELL/MY and del z = CELL/MZ MZ = LXMN, LXMX = Minimum, maximum grid index defining output map region LYMN, LYMX = such that x (fractional) = LX * (del x) / CELL etc. There are no restrictions on magnitudes or signs. LZMN, LZMX = ****** FOLLOWING CARDS READ ONLY IF IMASK=1 ****** CARD VII INPMSK (free format) INPMSK = Input mask file (standard orientation) CARD VIII OUTMSK (free format) OUTMSK = Output (skewed) mask file. ******** FILES ******** INPUT (AND OUTPUT) MAP FILES (BINARY) record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. 
There are no restrictions on magnitudes or signs. IZMN, IZMX = The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE NOTES: If a mask is input, it must coincide exactly with the input map. INPUT (AND OUTPUT) MASK FILES (BINARY) Header record identical to map files. Mask records similar to normal map records except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Grid points with mask values of 0, 10, 20, 30, 40 etc correspond to envelope masks 1,2,3,4,5 etc, respectively.
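The correspondence between mask values and envelope numbers stated above (0, 10, 20, ... for envelopes 1, 2, 3, ...) can be sketched in Python as below. The upper limit of 12 is taken from the mask number range quoted for LSQROTGEN and MDLMSK. How non-envelope points are encoded is not stated here, so the sketch simply treats any other value as "not in an envelope"; that treatment and the function names are assumptions.

```python
def mask_value(mask_number):
    """Byte value stored for envelope mask_number (1 -> 0, 2 -> 10, ...)."""
    if not 1 <= mask_number <= 12:
        raise ValueError("mask number must be between 1 and 12")
    return 10 * (mask_number - 1)


def envelope_number(value):
    """Envelope number for a stored value, or None if the value marks no envelope."""
    if value >= 0 and value % 10 == 0 and value // 10 < 12:
        return value // 10 + 1
    return None  # ASSUMPTION: every other value means 'outside all envelopes'


def select_points(mask, wanted_numbers):
    """Grid points (ix, iy, iz) whose mask value belongs to one of the wanted envelopes."""
    return sorted(point for point, value in mask.items()
                  if envelope_number(value) in wanted_numbers)
```

The largest value, 110 for envelope 12, still fits in a signed INTEGER*1.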
2.26 BLDCEL WRITE-UP

BLDCEL is a program to rebuild an electron density map (and optionally a mask) covering one complete unit cell, from an input map (and mask) covering an asymmetric unit. Typically an asymmetric unit encompassing a dimer, trimer etc. is extracted from a FSFOUR map by EXTRMAP or MAPVIEW, and averaging is done only within that submap in program MAPAVG. The averaged asymmetric unit map and corresponding mask are input here, along with the FSFOUR map from which the asymmetric unit map was extracted.

An exact copy of the FSFOUR map is first created, but then density values at grid points which lie within the averaging envelopes are replaced by their values from the averaged map. Averaged density also replaces values at grid points related by crystal symmetry to the input averaged map. Thus the output map corresponds to the averaged map, except that it covers one full cell and conforms to the space group symmetry. Density values at grid points which were not averaged simply retain their values from the original map. The format is identical to that produced by FSFOUR, thus this map is suitable for inversion, peak search etc.

If desired, the input asymmetric unit mask can also be expanded to a full cell mask. There is never any need to do this for averaging purposes, but it is useful if one wants to use SOLVENT FLATTENING masks which were edited by program MAPVIEW. In that case one would extract a region from the solvent mask (created by BNDRY, option 1) which covers an asymmetric unit, by using MAPVIEW or EXTRMSK. Edit the extracted solvent mask in MAPVIEW. Then use the MAKE ASU expansion option in MAPVIEW to create an edited mask file obeying crystal symmetry. Finally, input the edited mask file here, and request that it also be expanded to a full cell. The output mask file can then be used for solvent flattening, since it covers a full cell, obeys space group symmetry and has the same structure as a normal solvent flattening mask.
INPUT DATA (UNIT 5) RECORD 1 PAMFIL (free format) PAMFIL = Input parameter file specifying cell and symmetry information. RECORD 2 INPMAP (free format) INPMAP = Input map file (unaveraged, full cell, from FSFOUR) RECORD 3 OUTMAP (free format) OUTMAP = Output map file (full cell, averaged) RECORD 4 INPASU (free format) INPASU = Input map (averaged, asymmetric unit) RECORD 5 INPMSK (free format) INPMSK = Input mask (asymmetric unit) RECORD 6 NEWMSK (free format) NEWMSK = 0 for no expansion of mask to full cell = 1 to also expand input mask to full cell RECORD 7 OUTMSK (free format) ******* include this record only if NEWMSK=1 ******* OUTMSK = Output mask file (full cell) ***** FILES ***** INPUT MAP FILE "INPMAP" (BINARY) Standard FSFOUR map, default orientation i.e. NORN=0 OUTPUT MAP FILE "OUTMAP" (BINARY) Same structure as input file "INPMAP" INPUT MAP FILE "INPASU" (BINARY) record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs. IZMN, IZMX = The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE INPUT MASK FILE "INPMSK" (BINARY) Header record identical to map file INPASU. Mask records similar to normal map records in INPASU except that the mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with mask values of 0, 10, 20, 30, 40 etc will be used (i.e. 
inside envelope masks 1,2,3,4,5 etc, respectively). OUTPUT MASK FILE OUTMSK (BINARY) Same structure as input mask file INPMSK, but covers full cell
2.27 MDLMSK WRITE-UP

This program creates a mask file by identifying all points on a map grid which are within a given distance from any atom in an input model. It therefore can be used to create model based masks for use in averaging (or solvent flattening, if the mask is symmetrized in MAPVIEW and expanded to a full cell in BLDCEL).

The program is interactive and prompts for the names of the standard parameter file, an input coordinate file, an output mask file, the atomic radius, a mask number (from 1 to 12) uniquely identifying the molecule and parameters for the map grid. The output mask file can be viewed in MAPVIEW, provided one selects a map region identical to the mask region, and uses the same grid.

If multiple molecules are present in the asymmetric unit, MDLMSK can be run several times, once for each molecule separately, specifying a unique mask number for each, but always covering the same region (a volume which encompasses all molecules) and using the same map grid. The separate output files can then be combined into a single mask file with MRGMSK, which will retain the identity of each molecular envelope mask. This combined mask file can then be used for averaging or for refinement of noncrystallographic symmetry operators in LSQROTGEN. If the noncrystallographic symmetry is purely rotational with periodicity N where N is an integer, then only a single mask is needed which encompasses the entire dimer, trimer etc. In that case the output mask can be used for refinement in LSQROT.

****** FILES ******

INPUT COORDINATE FILE - ASCII with format ( 7X, A1, I3, A4, 5F10.5, I5)

Each record should contain RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP where

           RT      = single letter amino acid code (not used)
           IRES    = sequence number (MUST be present and unique for
                     each residue)
           ATOM    = atom name (not used)
           X, Y, Z = fractional atomic coordinates
           B       = Isotropic thermal factor (not used)
           OCC     = Occupancy factor (not used)
           ITYP    = Atomic type identifier (not used)

Note!
This format is identical to that used by PHASIT, in structure factor calculation mode. OUTPUT MASK FILE (BINARY) record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs. IZMN, IZMX = The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 BYTE values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code, where array MASK is defined to be FORTRAN type BYTE: DO 30 IY=IYMN,IYMX DO 20 IZ=IZMN,IZMX 20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX) 30 CONTINUE
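The selection rule MDLMSK is described as applying (flag every grid point within a given distance of any atom) can be sketched in Python as below. This is a simplified illustration, not the program's algorithm: it assumes an orthogonal cell so that fractional coordinates convert to Angstroms by simple scaling, it ignores crystal symmetry and cell translations, and the function name and argument layout are hypothetical.

```python
def model_mask_points(atoms_frac, cell_lengths, periods, lo, hi, radius):
    """Return the set of grid points (ix, iy, iz) within `radius` Angstroms of any atom.

    atoms_frac   : list of fractional (x, y, z) atomic coordinates
    cell_lengths : (A, B, C) in Angstroms; cell ASSUMED orthogonal
    periods      : (NX, NY, NZ) grid points per cell length
    lo, hi       : (IXMN, IYMN, IZMN) and (IXMX, IYMX, IZMX)
    """
    a, b, c = cell_lengths
    nx, ny, nz = periods
    # Fractional -> Angstrom coordinates (valid only for an orthogonal cell).
    atoms = [(x * a, y * b, z * c) for x, y, z in atoms_frac]
    radius_sq = radius * radius
    inside = set()
    for iy in range(lo[1], hi[1] + 1):
        for iz in range(lo[2], hi[2] + 1):
            for ix in range(lo[0], hi[0] + 1):
                # Grid spacing is del x = A/NX etc., as in the header description.
                point = (ix * a / nx, iy * b / ny, iz * c / nz)
                for atom in atoms:
                    dist_sq = sum((point[i] - atom[i]) ** 2 for i in range(3))
                    if dist_sq <= radius_sq:
                        inside.add((ix, iy, iz))
                        break
    return inside
```

For a non-orthogonal cell the scaling step would have to be replaced by a proper orthogonalization of the fractional coordinates.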
2.28 MRGMSK WRITE-UP

This program is used to merge two different mask files into a single mask file. It is needed when one wants to create an averaging mask from a set of input atomic coordinates, and the noncrystallographic symmetry present is not purely rotational with order of rotation N where N is an integer, i.e. if arbitrary rotations and/or post rotation translations are required. In that case MDLMSK should be run two or more times, with each run generating a mask for a different molecule in the asymmetric unit, and SPECIFYING A UNIQUE MASK NUMBER. Program MRGMSK can then combine all of the individual mask files into one, which can be used for averaging, operator refinement etc.

All of the input masks must cover precisely the same map volume, and use the same map grid spacing. The output mask will correspond to this volume and spacing as well. If both of the input masks identify the same grid point as being within their envelopes, the status of that point is changed to indicate non-molecule, since it is not clear to which molecule it should belong. The number of such overlapping points is output. The output mask can be examined in MAPVIEW, in which case the different molecular masks will be shown in different colors.

This program is interactive and prompts for the input and output mask files, and the standard parameter file.

******** FILES ********

INPUT (AND OUTPUT) MASK FILES (BINARY)

record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms, angles in degrees. NX = Number of grid points defining one "cell length" along NY = respective axis. Implicitly defines grid spacing as del x = A/NX, del y = B/NY and del z = C/NZ NZ = IXMN, IXMX = Minimum, maximum grid index defining map region such IYMN, IYMX = that x (fractional) = IX * (del x) / A etc. There are no restrictions on magnitudes or signs.
IZMN, IZMX =

The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each containing one row (IXMX-IXMN+1 BYTE values) along X, starting at IXMN. Y is slowest varying, i.e. the file could have been created with the following FORTRAN code, where array MASK is defined to be FORTRAN type BYTE:

      DO 30 IY=IYMN,IYMX
      DO 20 IZ=IZMN,IZMX
   20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX)
   30 CONTINUE
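The merging rule described for MRGMSK (points claimed by both input masks become non-molecule, and the number of such points is reported) can be sketched in Python as below. To avoid guessing how a non-molecule point is encoded in the file, the sketch represents each mask as a dictionary holding only envelope points, keyed by grid index with the mask number as value; this representation and the function name are assumptions made for illustration.

```python
def merge_masks(mask1, mask2):
    """Merge two envelope dictionaries {(ix, iy, iz): mask_number}.

    Returns (merged, overlap_count). A point present in both inputs is dropped
    from the result, i.e. treated as non-molecule, and counted as an overlap.
    """
    merged = dict(mask1)
    overlap_count = 0
    for point, number in mask2.items():
        if point in mask1:
            # Ambiguous ownership: belongs to neither molecule in the output.
            del merged[point]
            overlap_count += 1
        else:
            merged[point] = number
    return merged, overlap_count
```

Merging more than two masks can be done by calling the function repeatedly, although a point dropped in one pass could then be reclaimed by a later mask, which may differ from what the program does.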
2.29 TRNMSK WRITE-UP

TRNMSK is a program to transform a mask file constructed in a "skewed" cell back to its conventional cell. In certain situations (when the noncrystallographic symmetry is purely rotational), it is highly advantageous to trace out the mask (via MAPVIEW) in terms of a cell "skewed" such that the new b axis corresponds to the noncrystallographic symmetry axis direction. In that case it is usually obvious where the NC symmetry breaks down, and the envelope mask for averaging is readily obtained. However, skewing the map is not necessary for the averaging cycles, thus it is desirable to transform the "skewed" mask, once created, back to the original cell for all subsequent calculations. TRNMSK accomplishes this.

INPUT DATA (UNIT 5)

CARD I     PAMFIL   (free format)

           PAMFIL = Input file containing cell parameters and symmetry,
                    used only to get "running log" file

CARD II    INPMAP   (free format)

           INPMAP = Input map file (corresponds to region extracted from
                    original map before it was "skewed")

CARD III   INPMSK   (free format)

           INPMSK = Input mask file (skewed)

CARD IV    OUTMSK   (free format)

           OUTMSK = Output mask file (unskewed), covers same region as
                    INPMAP

CARD V     PHI, PSI, OX, OY, OZ   (free format)

           PHI, PSI, OX, OY, OZ = Input spherical polar angles and origin
                                  as originally input to SKEW

******** FILES ********

INPUT MAP - As described for SKEW

INPUT (AND OUTPUT) MASKS - As described for SKEW
2.30 RDHEAD WRITE-UP

This program can be used to print out the header information on any map or mask file used in the noncrystallographic symmetry averaging options. Since all of the programs assume that map and mask files refer to the same structure, cover precisely the same region and use the same grid, it is important to be sure that this is the case. In fact, all of the software verifies this at run time anyway, but it is still useful at times to see exactly what is on the header record, in order to find out what went wrong if inconsistencies are reported.

The program prompts for the name of the map or mask file, and prints the header information regarding cell constants, map periods and grid range covered by the contents of the file. Note that this program can be used for ANY mask file, including solvent masks created by BNDRY. It can also be used for ANY submap file arising from the averaging related software (MAPVIEW output, EXTRMAP, SKEW, MAPORTH, MAPAVG) but NOT for FSFOUR maps, which always cover a full cell anyway.

INPUT (AND OUTPUT) MASKS - As described for SKEW
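The quantities implied by a header record (grid spacing, fractional range covered, number of records that follow) and the map versus mask consistency check mentioned above can be sketched in Python as below. The functions take the 15 header values already unpacked, in the order A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX; the function names and the returned dictionary layout are hypothetical and do not reproduce the actual RDHEAD output.

```python
def describe_header(header):
    """Derive grid spacing, fractional range and record counts from 15 header values."""
    a, b, c = header[0:3]
    nx, ny, nz = header[6:9]
    ixmn, iymn, izmn = header[9:12]
    ixmx, iymx, izmx = header[12:15]
    return {
        # del x = A/NX, del y = B/NY, del z = C/NZ (Angstroms)
        "spacing": (a / nx, b / ny, c / nz),
        # x (fractional) = IX * (del x) / A, which reduces to IX / NX
        "frac_range": ((ixmn / nx, ixmx / nx),
                       (iymn / ny, iymx / ny),
                       (izmn / nz, izmx / nz)),
        # One record per (IY, IZ) pair, each holding one row along X
        "n_records": (iymx - iymn + 1) * (izmx - izmn + 1),
        "row_length": ixmx - ixmn + 1,
    }


def same_grid(header_a, header_b):
    """True if two files share cell, map periods and grid range."""
    return tuple(header_a) == tuple(header_b)
```

Exact equality is used for the cell constants here; a tolerance could be substituted if headers written by different programs differ in the last digit.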
2.31 PRECESS WRITE-UP

PRECESS can be used to construct and display "pseudo" precession photographs created from input reflection data files. On SGI hardware either precess or precess_X can be used. On other hardware only precess_X can be used. Note also that if precess is run on an SGI workstation, it should be invoked from a WINTERM window and NOT from an XTERM window. Precess_X can be invoked from either window.

The user can interactively select the zone to display, and scroll up or down through neighboring zones, selecting information for any reflection by moving the cursor to it. Several input file formats are recognized, including any of the "scaled" files used within PHASES, XENGEN style "MULISTS" or "UREFLS" files, SCALEPACK style files or a simple free format input file. Data within the displayed zone are grouped into bins (256 for the IRIS GL version, and 101 for the X-window version) based on intensity, and displayed with a corresponding gray scale scheme. Alternatively, a full color display can be used. If requested, a pseudo background based on the mean sigmas as a function of resolution can be added to the display, creating a realistic image complete with beam stop shadow. If a "scaled" file is input, the user is prompted for the file type and to select either the native intensities or intensity differences (isomorphous or anomalous, depending on the input file type) for display.

The program will first prompt for an input "parameter file". The program then prompts for a file name, which should be the name of a data file with a .mu, .urf, .scl, .sca or a .dat extension, for a resolution cutoff, for the desired zone, whether a pseudo background is to be included, and whether a color or continuous gray scale photograph display is desired. After reading in the data file, a reasonable color scheme is determined and the desired zone is displayed along with a menu.
The data are selected, when possible, in a manner which preserves anomalous scattering information and, when possible, the actual measurements for symmetry related reflections are used. Finally, however, rather than leaving "holes" in the picture, symmetry operations (including Friedel's relationship) are used to fill in missing data. Thus the resulting display will conform to the true diffraction symmetry if all required data were present on the input file, but if some reflections were missing but their symmetry mates were present, the intensities for the mates are used.

Moving the cursor to a menu item and pressing a mouse button will then carry out the selected option. In most cases any of the mouse buttons will suffice, but for some items the buttons have different functions.

The "UP" and "DOWN" menu items change the color map intensity thresholds, and thus the image intensity scale. For each of these options the left mouse button makes a slight change, the middle button a moderate change and the right button a substantial change.

Pressing any mouse button while in the "EXIT" field terminates the program.

Pressing the left or right mouse button while in the "ZONE" field will toggle the next zone index direction (indicated by the arrow). Pressing the middle mouse button while in this field will read in and display the next (or previous) zone as desired, using the current color intensity scheme.

Moving the cursor in the data display area causes the resolution and intensity at the current cursor position to be displayed. If one is near a Bragg reflection, however, the indices, integrated intensity and its standard deviation are displayed along with the resolution.

Pressing any mouse button while in the "NEW DIRECTION" field will allow the user to select another zonal direction (e.g. hk0 when h0l was originally chosen), and/or select a new resolution cutoff.
Pressing any mouse button while in the "SAVE IMAGE" menu area will save the entire screen contents as an "image" file, with the name "prec_N.rgb", where N is a one or two digit number. Numbers start from zero and are automatically incremented each time an image is saved. Up to 100 images can be made in any job. NOTE!! This option is not yet functional on the X-window version of precess.

For the purpose of photographing the display, it is often desirable to remove the menu and color map since there is usually too much contrast variation between them and the frame data for both to be reliably recorded with the same exposure. The menu display can be toggled on/off by pressing any mouse button while the cursor is anywhere to the right of the menu items. Note however, that when the menu is off all other functions are disabled, thus it must be toggled back on to restore interactive functionality, and to enable exiting from the program.

***** FILES *****

The type of input reflection file is deduced from the ending part of the filename. Recognized endings are:

           .MU, .mu, .SCL, .scl, .SCA, .sca, .URF, .urf, .DAT or .dat

XENGEN output will typically be either .MU format (having F's, NOT I's), or .URF format, although the .URF files can currently be used only if they were created on a UNIX computer.

Any of the "scaled" file formats accepted by PHASIT can also be used here, and will be assumed if a ".SCL" or ".scl" ending is used. If this option is input, the user also will be asked as to what type of data is in the file, and whether to display the native intensities or intensity differences (isomorphous or anomalous, as appropriate for the file type).

If the filename ends with ".SCA" or ".sca", then a SCALEPACK file is assumed. After a variable number of header records (see the FILE FORMATS section), reflection records follow and contain

           H, K, L, I+, sig(I+), I-, sig(I-)    in format (3I4, 4F8.1)

Note the use of intensities rather than F's.
The last two items in each record may be omitted. If present, they would be used only if I+ was not measured. A general, free format file can be used and is assumed if the file ends in ".DAT" or ".dat", in which case each record must contain IH, IK, IL, F, SIG(F) readable in free format, i.e. at least one blank or a comma separates the entries.
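The file type rules above can be illustrated with a short Python sketch. It is not part of the PHASES package; the function names and the descriptive labels are hypothetical, and the free format parser assumes that every entry is present (it does not handle the empty fields that a true free format read might tolerate).

```python
import re

# Hypothetical labels for the file types named in the text above.
FILE_TYPES = {
    "mu": "XENGEN .MU (F's)",
    "urf": "XENGEN .URF",
    "scl": "PHASIT scaled file",
    "sca": "SCALEPACK (intensities)",
    "dat": "free format IH, IK, IL, F, SIG(F)",
}

def reflection_file_type(filename):
    """Deduce the file type from the ending of the filename (upper or lower case)."""
    if "." not in filename:
        return None
    ending = filename.rsplit(".", 1)[1].lower()
    return FILE_TYPES.get(ending)

def parse_dat_record(line):
    """Parse one .DAT record: entries separated by blanks and/or commas."""
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    ih, ik, il = (int(f) for f in fields[:3])
    f_obs, sig_f = (float(f) for f in fields[3:5])
    return ih, ik, il, f_obs, sig_f
```

For example, parse_dat_record("1 2,3  45.6, 1.2") yields the indices 1, 2, 3 followed by F = 45.6 and SIG(F) = 1.2.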
2.32 O_TO_SP WRITE-UP

Interactive program to extract, from a 3D transformation matrix and translation vector, the corresponding spherical polar angles, axis location and post rotation translation along the axis direction. Thus, for example, one can obtain an estimate of the transformation operator via the "O" program, use this program to convert the NC symmetry operator information to PHASES format, and use the PHASES routines for operator refinement, averaging, skewing etc.

The user is prompted for the elements of the transformation matrix and vector, which can be found in the appropriate O data block. Note that the program is not limited to transformations from "O". As long as the input transformation is to be applied to orthogonal coordinates which are orthogonalized as in the PDB, the output will be valid.
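As an illustration of the kind of decomposition involved, the following Python sketch extracts the rotation axis, the rotation angle and the translation component along the axis from a matrix and vector. It is a minimal sketch using the standard axis-angle formulas, not the O_TO_SP code itself; the function name is hypothetical, the operator is assumed to act as x' = R x + t on orthogonal coordinates, and the conversion of the axis direction into the PHASES spherical polar angle convention and the axis location are not shown.

```python
import math

def screw_parameters(rot, trans):
    """Return (axis, kappa_degrees, shift) for the operator x' = rot * x + trans.

    rot   : 3x3 proper rotation matrix as nested lists (assumed orthonormal)
    trans : translation vector of length 3
    """
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    # Clamp to guard against rounding just outside the valid cosine range.
    cos_kappa = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    kappa = math.acos(cos_kappa)

    # The antisymmetric part of the matrix equals 2*sin(kappa) times the axis.
    raw = [rot[2][1] - rot[1][2],
           rot[0][2] - rot[2][0],
           rot[1][0] - rot[0][1]]
    norm = math.sqrt(sum(c * c for c in raw))
    if norm < 1.0e-8:
        raise ValueError("rotation of 0 or 180 degrees: axis needs special handling")
    axis = [c / norm for c in raw]

    # Post rotation translation along the axis direction.
    shift = sum(a * t for a, t in zip(axis, trans))
    return axis, math.degrees(kappa), shift
```

For a rotation of 90 degrees about the z axis combined with the translation (1, 2, 3), the sketch returns the axis (0, 0, 1), an angle of 90 degrees and a shift of 3 along the axis.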
2.33 XPL_PHI WRITE-UP Program to convert the PHASES binary phase file (long format) to a reflection file suitable for input to the program XPLORE. The program prompts for input and output file names. All reflections in the input file are passed to the output file. The output file will contain the indices, Fobs, phi and the figure of merit. It is suitable for refinement in XPLORE, with or without phase restraints. ***** FILES ***** INPUT - Binary phase file (long format), as generated from PHASIT or BNDRY OUTPUT - ASCII file, containing indices, Fobs, phi (centroid) and associated figure of merit.
2.34 PDB_CDS WRITE-UP

Interactive program to interchange coordinate files between PDB and PHASES formats. The user is first prompted to determine if the conversion will be from an input PDB file to a PHASES file. If the response is no, the opposite direction is assumed. The user is then prompted for input and output file names, and whether or not occupancy and/or thermal factors are to be reset. If either is to be reset, a prompt for the appropriate new value is given.

The user is then asked what residue range is to be selected for output, and what chain ID (single character). The chain ID must match that given in the input file. If no chain ID is present, supply a blank in response to the prompt. After writing the selected atoms to the output file, the user is prompted again for another range/chain ID. Processing continues until no more ranges/chains are requested. If a PHASES style output file is requested, it is suitable for use in programs PHASIT or GREF.

The program will recognize the 20 standard amino acids with their appropriate three letter codes (PDB format, all upper case) or single letter codes (PHASES format, upper case). It will also recognize several additional "residue" types, with the following three letter and one letter codes, respectively.

     SUL U,  WAT O,  TDP Z,  HEM X,  CAL B,  CAD J,  MAG m,
     ZNC z,  PO4 p,  ADE a,  CYT c,  GUA g,  THY t,  EXT e

Note that the codes are case sensitive. If a residue type is input that is not one of the above, it will still be processed but will be converted to the "extra" type (EXT or e) and a message will be output. This presents no problem as residue types are never used anywhere in the PHASES package. However, if a residue type was converted when going to PHASES coordinates, one will have to remember to manually convert it back to its original designation following each run of PDB_CDS regenerating PDB coordinates.

When going from PDB to PHASES coordinates, the atom type must also be deduced from the atom name.
To do this the program will use the first one or two characters of the atom name, possibly in combination with the residue type. If the atom name starts with the letters C, N, O, S, P, FE, ZN or MG (all upper case) the appropriate atom type will be recognized. However, if the residue type is CAL or CAD, then an atom name starting with CA or CD will be recognized as calcium or cadmium, respectively. If the atom name is inconsistent with any of the above, it is still processed but the type will be set to carbon and a message will be output. Note that even if the atom type is not recognized, you can still use the file within the PHASES package. You will just have to manually reset the atom type code number in the output file to an appropriate number, and possibly input the additional scattering factor information to PHASIT and/or GREF. PHASIT and GREF always recognize the code numbers 1,2,...20 as corresponding to C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4, I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- and Sm+3, respectively. Thus any additional atom types must start with the code number 21. See the PHASIT and GREF writeups for details on how to input the scattering factors. The residue names are never used within the PHASES package. ***** FILES ***** INPUT - either a standard PDB file or a PHASES style coordinate file OUTPUT - the inverse of what was input.
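The atom type rules described above can be sketched in Python as follows. This is an illustration only, not the PDB_CDS source: the function name is hypothetical, and the order in which the prefixes are tested (residue-specific CA/CD first, then the two letter prefixes FE, ZN and MG, then the single letters) is an assumption, since the write-up does not state it. The code numbers are those listed above for PHASIT and GREF.

```python
# Code numbers recognized by PHASIT and GREF for the atom types that
# PDB_CDS can deduce (taken from the list 1,2,...20 given in the text).
ATOM_CODES = {"C": 1, "N": 2, "O": 3, "S": 4, "FE": 5,
              "ZN": 12, "CA": 13, "MG": 14, "CD": 15, "P": 17}

def deduce_atom_code(atom_name, residue_type):
    """Return the PHASES atom type code deduced from a PDB atom name.

    The test order is an assumption made for this sketch.
    """
    name = atom_name.strip()
    # CA and CD mean calcium and cadmium only in CAL and CAD residues.
    if residue_type == "CAL" and name.startswith("CA"):
        return ATOM_CODES["CA"]
    if residue_type == "CAD" and name.startswith("CD"):
        return ATOM_CODES["CD"]
    for prefix in ("FE", "ZN", "MG"):
        if name.startswith(prefix):
            return ATOM_CODES[prefix]
    if name[:1] in ("C", "N", "O", "S", "P"):
        return ATOM_CODES[name[:1]]
    # Unrecognized names are still processed, with the type set to carbon.
    print("atom name %s not recognized, type set to carbon" % name)
    return ATOM_CODES["C"]
```

With these rules an atom named CA in an alanine residue is treated as carbon (code 1), while the same name in a CAL residue is treated as calcium (code 13).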
2.35 RMHEAVY WRITE-UP

This program is used to temporarily remove electron density near heavy atom sites from an input map, so that a solvent mask can be accurately created from the map via the automatic boundary procedure. The strong density often found near heavy atom sites in initial MIR or SIR maps can lead to an extension of the protein mask near the heavy atom sites. Since these sites are nearly always on the protein surface, the effect is that the protein envelope can incorrectly extend into the solvent region (and if the envelope is tight, therefore be depleted elsewhere in the protein region).

A list of heavy atom sites is read in along with a map and a distance cutoff. The atoms are expanded by space group symmetry, and if any grid point is within the cutoff distance from any atom in the expanded list, its electron density value is set to zero. The modified map is then output. Note that this map is normally used ONLY for the purpose of SOLVENT MASK GENERATION. The original MIR or SIR map is still used for all solvent flattening computations.

The procedure DOALL.SH will automatically accomplish this, provided the file names in the supplied template procedures are adhered to. If one does not wish to do this, simply comment out the two lines

     rmheavy < rmhv.d >> mask1.l
     mv nohv.map four.map

in each of the files mask1.sh, mask2.sh and mask3.sh

Typically the input file will contain coordinates of all heavy metals that were used in the phasing, and a distance cutoff of about 2.5 angstroms.

INPUT DATA (UNIT 5)

RECORD 1    PAMFILE   (free format)

     PAMFILE = Input parameter file specifying cell and symmetry
               information.

RECORD 2    INPMAP   (free format)

     INPMAP  = Input map file (full cell, from FSFOUR)

RECORD 3    OUTMAP   (free format)

     OUTMAP  = Output map file (full cell, as in input)

RECORD 4    NA, RAD   (free format)

     NA      = Number of input heavy atom sites
     RAD     = Distance cutoff, in angstroms.

The NA input atomic coordinate records now follow

RECORDS 5   ATNAME, X, Y, Z, B, OCC, ITYPE   FORMAT(7X,A8,5F10.5,I5)

     ATNAME  = not used
     ITYPE   = not used
     OCC     = not used
     X,Y,Z   = Fractional atomic coordinates
     B       = not used

Note that the coordinate format is identical to that used in PHASIT, thus a copy of the earlier phasing deck can be made and edited for use here.

***** FILES *****

INPUT MAP  - Standard FSFOUR map, covering a full cell
OUTPUT MAP - Same format as input map, but with heavy atom density
             removed
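The density removal step can be pictured with the following Python sketch. It is a simplified illustration rather than the RMHEAVY algorithm as implemented: the function name and the nested list representation of the map are hypothetical, the cell is assumed to be orthogonal, the list of sites is assumed to have been expanded by space group symmetry already, and the cutoff is assumed to be smaller than half of the shortest cell edge so that the nearest periodic image can be found by simple rounding.

```python
import math

def remove_heavy_density(density, cell_edges, sites, rad):
    """Zero all grid points within rad angstroms of any heavy atom site.

    density    : nested list density[ix][iy][iz] covering one full cell
    cell_edges : (a, b, c) in angstroms, orthogonal cell assumed
    sites      : fractional (x, y, z) sites, already symmetry expanded
    rad        : distance cutoff in angstroms
    """
    nx, ny, nz = len(density), len(density[0]), len(density[0][0])
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                frac = (ix / nx, iy / ny, iz / nz)
                for site in sites:
                    # Fractional differences to the nearest periodic image.
                    delta = [f - s - round(f - s) for f, s in zip(frac, site)]
                    dist = math.sqrt(sum((d * edge) ** 2
                                         for d, edge in zip(delta, cell_edges)))
                    if dist <= rad:
                        density[ix][iy][iz] = 0.0
                        break
    return density
```

Because the map is modified in place, a copy of the original map should be kept if it is needed for later solvent flattening, which mirrors the use of separate input and output map files by the program.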
2.36 CTOUR WRITE-UP CTOUR is a program to create contoured plots of electron density maps which can then be displayed or printed. The program accepts an input map which is prepared by FSFOUR in the default orientation (NORN=0), along with limit, direction and contouring information. The output consists of one or more generic metafiles which can be converted to the format needed for a given display by the appropriate driver program, several of which are provided. Multiple plots can be created within a single run, with each plot consisting of either an individual map section, a mono projection over multiple sections, or a stereo projection over multiple sections. Any map region may be selected and viewed down either a direct cell axis or a reciprocal cell axis (the latter used for projections). The metafiles created will have the names plt001.plt, plt002.plt etc and can be viewed via the driver programs VIEWPLT or VIEWPLT_X (on SGI or X-window supporting workstations), by PLTTEK (on terminals supporting TEKTRONIX 4010 graphics) or converted to PostScript for subsequent printing by MKPOST. Note that in general program MAPVIEW (or MAPVIEW_X) would be preferred to examine contoured plots, since it allows the interactive selection (and modification) of orientation, region and contouring intervals. In some instances CTOUR has advantages however, as it facilitates creation of hard copies for examination away from the terminal or workstation, for creation of minimaps, and for stereo plots. CTOUR is very useful for examination of difference Patterson maps, where for example, all of the Harker sections can be generated and then displayed simultaneously with the program VIEWPLT (or VIEWPLT_X). It is recommended that one first examine the plots with VIEWPLT or VIEWPLT_X before converting to PostScript as this can be done extremely rapidly, whereas printing and even simply displaying PostScript files can be much more time consuming. 
INPUT DATA (UNIT 5)

CARD I      MAPFIL   (free format)

     MAPFIL = Input map file (from FSFOUR, in default orientation
              i.e. NORN=0)

CARD II     CMIN,CMAX,CSTEP,IGRID,PSIZE,VDIS,RSCALE   (free format)

     CMIN, CMAX, CSTEP
            = Minimum, maximum and increment for contour levels,
              on the scale set by RSCALE (see below)
     IGRID  = 0 To include labels and border on plots
            = 1 To include labels, border and grid lines on plots
                (facilitates coordinate measurement for Pattersons)
            = 2 To eliminate labels, grid lines and border on plots
     PSIZE  = Plot size in inches (usually 10. if hard copy is to be
              produced).
     VDIS   = View distance in inches (usually 30., used only for
              stereo plots. Decreasing it increases the stereo
              effect).
     RSCALE = Sets density scale for contours. If 0., then density
              is scaled such that the largest value in the unit cell
              is 999. If > 0., then the density is on an absolute
              scale (minus the F000/V term) when the F's used in map
              creation are related to an absolute scale by the factor
              RSCALE, i.e. when F(abs)=RSCALE*F(input). Regardless of
              the choice, the min, max and sigma for the map on the
              chosen scale will be listed on the output, and can be
              used to set contour levels for a subsequent run.

**** The following card can be repeated as many times as desired ****

CARDS III   NSEC,XMN,XMX,YMN,YMX,ZMN,ZMX,NORN   (free format)

     NSEC   = 0 for individual sections, one plot per section
            = 1 for mono projection, one plot for entire range
            = 2 for stereo projection, one plot for entire range
     XMN, XMX, YMN, YMX, ZMN, ZMX
            = Minimum, maximum coordinates (fractional) in a, b and
              c directions defining map volume to be contoured.
     NORN   = 1 view as YZ sections (look down a or a*)
            = 2 view as XZ sections (look down b or b*)
            = 3 view as XY sections (look down c or c*)

************** EXAMPLES **************

1) The following script will compute a Patterson map and contour three Harker sections. Three generic plot files (having the names pltNNN.plt where NNN is a three digit number) will be created. We start contouring at about 3% of the origin peak height, which will be scaled to 999., and increase in steps of 1% of the origin peak. We request that labels and a grid are included to facilitate coordinate measurement. Finally, we convert the generic plot files to PostScript. The corresponding PostScript files will have the names pltNNN.pst

     #compute the difference Patterson map
     #
     fsfour << eod > fsfour.l
     seb.pam
     Difference Patterson, 3A
     0 48 72 80 5 0 20 0 0 0 0.
     patt.ref
     patt.map
     eod
     #
     #now contour three Harker sections
     #
     ctour << eod2 > ctour.l
     patt.map
     30. 999. 10. 1 10. 30. 0.
     0 0.5 0.5 0.0 1.0 0.0 1.0 1
     0 0.0 1.0 0.5 0.5 0.0 1.0 2
     0 0.0 1.0 0.0 1.0 0.5 0.5 3
     eod2
     #
     #now convert all generic plot files to PostScript
     #
     mkpost *.plt
     #

2) The following script will compute an MIR map and generate a series of plots. First a small mono projection down the b* axis is created. Then a minimap is made, contouring individual sections. We start contouring at one sigma and increase to the maximum in steps of sigma. Min, max and sigma values were obtained from the log of a prior short run which contoured only a single section. Labels and a border are requested, but no grid lines. Finally, all generic plot files are converted to PostScript.

     #compute the solvent flattened MIR map
     #
     fsfour << eod > fsfour.l
     pdc.pam
     PDC MIR MAP, 3A
     0 144 80 120 1 0 20 0 0 0 0.
     phi16cy.31
     mir.map
     eod
     #
     #now contour both a projection and individual sections
     #
     ctour << eod2 > ctour.l
     mir.map
     146. 999. 146. 0 10. 30. 0.
     1 -.5 .5 -.05 .05 -.5 .5 2
     0 -.42 .45 -.45 .42 -.08 .56 2
     eod2
     #
     #now convert all generic plot files to PostScript
     #
     mkpost *.plt
     #
2.37 VIEWPLT WRITE-UP VIEWPLT is an interactive program to display one or more plots created by CTOUR on an IRIS workstation. The analogous program VIEWPLT_X can be used to display the plots on any workstation supporting the X-Window protocol. As with MAPVIEW and PRECESS, if VIEWPLT is run on an SGI workstation it should be invoked only from a WINTERM window, and NOT from an XTERM window. VIEWPLT_X can be invoked from either window type. When invoked, the program prompts for the number of plot files to display (max = 10), and then for each file name. The plots will be scaled to fit on the display regardless of the size requested at plot creation time. If a "/R" is appended to the end of a plot file name, then the plot will be rotated by 90 degrees, which sometimes allows a better fit of the plot to its allowed space. When the plot is finished, the terminal bell will ring to notify the user. Pressing the "return" will then terminate the program. VIEWPLT is particularly useful for displaying contoured Harker sections, as the multiple sections can be simultaneously displayed. It is also useful to screen contoured projections or sections prior to conversion to Postscript, as viewing or printing the PostScript versions takes much longer. ****** FILES ****** INPUT FILES - Generic plot files created by CTOUR
2.38 PLTTEK WRITE-UP PLTTEK is an interactive program to display plots created by CTOUR on terminals supporting TEKTRONIX 4010 emulation. When invoked, the program simply prompts for the name of a plot file, and proceeds to display it on the terminal. After the plot is completed, the terminal bell will ring. Pressing the "return" then terminates the program. Prior to plotting, the program sends an escape sequence appropriate to place DEC VT240 series terminals in 4010 emulation mode. Upon termination an escape sequence to return to native emulation is sent. This sending of escape sequences can be eliminated by appending "/NOEM" to the plot file name, which would be appropriate for a terminal already in 4010 mode. The plot will be scaled to fit on the display regardless of the size requested at plot creation time. If a "/R" is appended to the end of a plot file name, then the plot will be rotated by 90 degrees, which sometimes allows a better fit of the plot to its allowed space. On UNIX systems use of the directory delimiter "/" interferes with interpretation of the "/NOEM" and "/R" switches, thus one should only specify file names of files resident in the current working directory, where no "/" other than for the switches will be present. Note that this type of graphic display is orders of magnitude slower than VIEWPLT, however in some instances it may be the only way to see a plot; as for example, when one is working at home from a "dumb" terminal. Also note that changing terminal emulation modes sometimes alters behavior of terminals. After viewing the plots, it may be necessary to log off and back in again, possibly powering down the terminal in between, or to reissue terminal initialization commands explicitly. ****** FILES ****** INPUT FILES - Generic plot files created by CTOUR
2.39 MKPOST WRITE-UP

MKPOST is a command to convert generic plot files created by CTOUR to PostScript. It accepts one or more command line arguments, which must be names of generic plot files. For each file, a new PostScript version with extension ".pst" is created. The command supports filename expansion options such as use of wildcards. Thus the command

     mkpost *.plt

will convert all generic plot files to PostScript, while using

     mkpost plt001.plt

will convert only one file, creating the PostScript version plt001.pst. The PostScript files can then be examined with a previewer such as psview or xpsview, or printed on a PostScript printer.

MKPOST is actually just a shell script (UNIX) or command procedure (VMS) to enable file name expansion. It simply creates a list of filenames and pipes the list to program POSTPLOT, which actually does the conversions. The program POSTPLOT will automatically scale the plot so it will fit on standard A4 paper. When doing this scaling, it may also rotate the plot by 90 degrees to minimize any required shrinkage. If one insists on an orientation which is inconsistent with this rotation, then the plot size requested in CTOUR must be reduced to the point where no shrinkage is needed at all.

Also, note that on UNIX systems a new shell is spawned, and it is important that the original working directory be maintained or else one will encounter failures with "file not found" type messages. This can happen if one has change directory commands ("cd") in one's .cshrc or .login files. In that case, upon spawning the new shell the working directory is changed and the plot files originally present will not be found.

***** FILES *****

INPUT FILES  - Generic plot files produced by CTOUR
OUTPUT FILES - PostScript versions of the input files
2.40 PSTATS WRITE-UP PSTATS (Phase Statistics) is a program to compute and tabulate mean phase differences between two phase sets as a function of d spacing. The program is interactive and prompts for the parameter file and the names of two phase files. The phase files can be either the long or short forms (as produced from PHASIT, BNDRY, or GREF), or even one of each. Reflections need not be indexed identically in both files, but each file should contain only unique reflections. The statistics enable one to determine how different two phase sets are. When combining experimental phase information with that obtained from a partial structure, it may be useful to monitor these statistics and use them to adjust the damping factor defining relative weights between the two sources of information. ***** FILES ***** INPUT FILES - Binary "phased" files, either in long or short format as produced by PHASIT, BNDRY or GREF.
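The statistic tabulated by PSTATS can be pictured with the short Python sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the two phase sets are assumed to have been matched reflection by reflection already (including any reindexing), phases are in degrees, and the shell limits are chosen by the caller rather than by the program.

```python
def phase_difference(phi1, phi2):
    """Smallest absolute difference between two phases, in degrees (0 to 180)."""
    diff = abs(phi1 - phi2) % 360.0
    return min(diff, 360.0 - diff)

def mean_difference_by_shell(reflections, shell_limits):
    """Tabulate mean phase differences in shells of d spacing.

    reflections  : list of (d, phi1, phi2) for matched reflections
    shell_limits : d spacing boundaries in descending order, e.g. [20, 6, 4, 3]
    Returns a list of (d_max, d_min, count, mean or None) per shell.
    """
    nshell = len(shell_limits) - 1
    sums = [0.0] * nshell
    counts = [0] * nshell
    for d, phi1, phi2 in reflections:
        for i in range(nshell):
            if shell_limits[i + 1] <= d < shell_limits[i]:
                sums[i] += phase_difference(phi1, phi2)
                counts[i] += 1
                break
    return [(shell_limits[i], shell_limits[i + 1], counts[i],
             sums[i] / counts[i] if counts[i] else None)
            for i in range(nshell)]
```

Note that the difference is taken around the phase circle, so phases of 350 and 10 degrees differ by 20 degrees rather than 340.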
2.41 HNDCHK WRITE-UP HNDCHK is an interactive program to examine electron density values at specified locations within a map, usually for the purpose of determining the absolute configuration (hand). The program prompts for the parameter file, name of the map file and for the coordinates to be examined. The map file is read and the minimum and maximum density values along with sigma for the map are listed. Density values are then interpolated both at the specified coordinates and at places related to them by a centre of symmetry, and are listed. In general, one would compute MIR, SIR etc phases, possibly solvent flattened, and then generate a Bijvoet difference Fourier map (with coefficients from MRGBDF) to be used with HNDCHK. One would then examine density values exactly at the input heavy atom positions used in the phasing. If the hand was correct, then large positive peaks should occur at the input sites, whereas if the hand was incorrect larger NEGATIVE peaks should occur at the TRUE heavy atom locations, i.e. at places related to the input (incorrect) positions by a centre of symmetry. ***** FILES ***** INPUT FILE - Standard FSFOUR map, usually computed with the Bijvoet difference Fourier option (MAPTYP=8)
2.42 SLOEXT WRITE-UP SLOEXT is a program to control the rate and range of phase extension when extending phases to higher resolution by solvent flattening, negative density truncation and/or NC symmetry averaging. The program is automatically invoked by the "extnd.sh", "extnda.sh" and "extndavg.sh" scripts, and functions by updating the "extnd.d" or "extnda.d" files used by these scripts periodically during the extension process. The initial and final resolution cutoffs are specified by the user along with the number of map modification/phase combination cycles to be carried out at each resolution increment. The resolution is incremented in steps corresponding to roughly one reciprocal lattice point in the direction of the shortest reciprocal cell axis. If the initial and final resolutions are equal, then there is no gradual extension and only the number of cycles input is carried out. The controlling scripts assume that the input to MAPINV specifies indices out to the highest resolution to be encountered anywhere in the entire process. Likewise, the "extrfl.d" file prepared by MISSNG also should contain reflections out to the highest resolution. INPUT DATA (UNIT 5) RECORD 1 PAMFILE (free format) PAMFILE = Input parameter file specifying cell and symmetry information. RECORD 2 DINIT, DFIN, NC_INC (free format) DINIT = Initial d spacing cutoff (starting value, usually the resolution of the initial phase set) DFIN = Final d spacing cutoff (DFIN must be less than or equal to DINIT. If DFIN = DINIT, then only NC_INC cycles will be performed. Otherwise for EACH RESOLUTION INCREMENT NC_INC cycles will be performed). NC_INC = Number of refinement cycles per resolution increment (between 2 and 25). RECORD 3 CNTFIL (free format) CNTFIL = Name of file controlling phase extension. This should be either "extnd.d" or "extnda.d" (UNIX systems) or "extnd.dat" or "extnda.dat" (VMS systems), depending on whether phase only or phase plus amplitude extension is being done. 
This file is referenced by the "extnd.sh", "extnda.sh" and "extndavg.sh" scripts, or by their VMS counterparts. The file is assumed to exist and contains information as described in the BNDRY write-up (option 3). It will be updated periodically during the run.
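To give a feel for the gradual extension, the following Python sketch lists a possible sequence of d spacing cutoffs between DINIT and DFIN. It is an assumption-laden illustration, not the SLOEXT algorithm: the function name is hypothetical, the cell is assumed to be orthogonal (so the shortest reciprocal axis is the reciprocal of the longest cell edge), the step is taken as exactly one reciprocal lattice point, and the way the final cutoff is reached is a guess.

```python
def extension_schedule(cell_edges, dinit, dfin):
    """Return a list of d spacing cutoffs from dinit down to dfin.

    cell_edges : (a, b, c) in angstroms, orthogonal cell assumed
    dinit      : initial d spacing cutoff
    dfin       : final d spacing cutoff (must not exceed dinit)
    """
    if dfin > dinit:
        raise ValueError("DFIN must be less than or equal to DINIT")
    # One reciprocal lattice point along the shortest reciprocal axis.
    step = 1.0 / max(cell_edges)
    s = 1.0 / dinit
    s_final = 1.0 / dfin
    cutoffs = [dinit]
    while s + step < s_final:
        s += step
        cutoffs.append(1.0 / s)
    if dfin < dinit:
        cutoffs.append(dfin)
    return cutoffs
```

For the sample cell 45.33, 68.33, 79.62 with DINIT = 4. and DFIN = 3.5, this sketch gives cutoffs of roughly 4.0, 3.81, 3.63 and 3.5 angstroms.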
3.00 EXAMPLES This section contains samples of input for various programs and procedures. In general, template files containing these examples are also provided along with the programs on the distribution media. In some cases (for example, solvent levelling), some practical considerations are also discussed.
3.01 ***** SAMPLE INPUT PARAMETER FILE *****

     LOGFILE=seb.log
     LATTICE=P
     45.33 68.33 79.62 90. 90. 90.
     4
     X,Y,Z
     1/2-X,-Y,1/2+Z
     1/2+X,1/2-Y,-Z
     -X,1/2+Y,1/2-Z
3.02 ***** SAMPLE INPUT DECKS FOR PHASIT ***** EXAMPLE I The deck below will compute SIR phases from a single isomorphous replacement derivative data set. The resulting phase file can then be used in the procedure DOALL to carry out Wang's ISIR process, or in MRGDF or MRGBDF to solve new derivatives or look for additional sites. The "difference coefficients" file can be used to compute "observed" and "calculated" difference Pattersons, heavy atom difference maps or heavy atom "double difference" maps to find new sites. The following data is assumed to be in a file called phasit.d seb.pam 0 0 1 0 1 phasit.31 DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA ) monopt.scl pt_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 3 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 20. 0.664 6 PT3 0.5474 0.0523 0.6964 50. 0.450 6 EXAMPLE II The deck below can be used to compute SIRAS phases from isomorphous and anomalous scattering data from a single derivative. The resulting file can then be used directly for map computation; used in the procedure DOALL to carry out solvent flattening/negative density truncation, phase extension etc, starting with (and tying to) the SIRAS phases; or in MRGDF or MRGBDF to solve new derivatives or look for additional sites. The "difference coefficients" files can be used to compute "observed" and "calculated" difference Pattersons, heavy atom difference maps or heavy atom "double difference" maps to find new sites. The following data is assumed to be in a file called phasit.d seb.pam 0 0 2 0 1 phasit.31 DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA ) monopt.scl pt_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 3 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 20. 0.664 6 PT3 0.5474 0.0523 0.6964 50. 0.450 6 DIAMINO DICHLORO PT (DERIVATIVE ANOMALOUS DISPERSION DATA ) monoptano.scl pt_ano_diff.31 4. 6. 2 1. 0. 0. 0. 0. 0. 0. 3 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 20. 0.664 6 PT3 0.5474 0.0523 0.6964 50. 
0.450 6 EXAMPLE III The deck below assumes isomorphous replacement data is available for two derivatives, and 5 passes of phase refinement, each consisting of 3 cycles for each derivative will be done to refine nearly all possible derivative parameters (except B's), i.e. MIR phases will be computed and refined. The resulting file can then be used directly for map computation; used in the procedure DOALL to carry out solvent flattening/negative density truncation, phase extension etc, starting with (and tying to) the MIR phases; or in MRGDF or MRGBDF to solve new derivatives or look for additional sites. The "difference coefficients" file can be used to compute "observed" and "calculated" difference Pattersons, heavy atom difference maps or heavy atom "double difference" maps to find new sites. The following data is assumed to be in a file called phasit.d seb.pam 0 0 2 1 1 phasit.31 DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA ) monopt.scl pt_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 3 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 20. 0.664 6 PT3 0.5474 0.0523 0.6964 50. 0.450 6 HGCL2 ( ISOMORPHOUS REPLACEMENT DATA ) monohg.scl hg_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 2 HG1 0.3639 0.2218 0.1776 20. 1.000 7 HG2 0.4454 0.0939 0.2878 20. 0.800 7 5 0.2 6 2 1 0 1 1 SET 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 SET 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 SET 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 1 1 1 2 SET 2 0 0 0 0 0 0 0 0 0 0 1 1 1 2 SET 2 1 1 1 0 0 1 1 1 1 0 1 1 1 2 SET 2 1 1 1 0 0 1 1 1 1 0 1 1 1 EXAMPLE IV Similar to example III, except that one of the temperature factors is converted to anisotropic and is also refined, with the isotropic equivalent restrained to its original value. The following data is assumed to be in a file called phasit.d seb.pam 0 0 2 1 1 phasit.311 DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA ) monopt.scl pt_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 3 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 -20. 
0.664 6 0. 0. 0. 0. 0. 0. 20. 0.5 PT3 0.5474 0.0523 0.6964 50. 0.450 6 HGCL2 ( ISOMORPHOUS REPLACEMENT DATA ) monohg.scl hg_iso_diff.31 4. 6. 0 1. 0. 0. 0. 0. 0. 0. 2 HG1 0.3639 0.2218 0.1776 20. 1.000 7 HG2 0.4454 0.0939 0.2878 20. 0.800 7 5 0.2 6 2 1 0 1 1 SET 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 SET 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 SET 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 2 SET 2 0 0 0 0 0 0 0 0 0 0 1 1 1 2 SET 2 1 1 1 0 0 1 1 1 1 0 1 1 1 2 SET 2 1 1 1 0 0 1 1 1 1 0 1 1 1 For all examples, PHASIT can be run with the following control information. For UNIX, use the following in a shell script phasit < phasit.d > phasit.l For VMS, use the following in a .COM file $ASSIGN PHASIT.D FOR005 $ASSIGN PHASIT.L FOR006 PHASIT $DEASSIGN FOR005 $DEASSIGN FOR006
3.03 ***** SAMPLE INPUTS FOR PHASING BY SOLVENT LEVELING ***** A complete solvent flattening run can be executed by creating a few small data files, and running the procedure DOALL. This will carry out the complete sequence of protein-solvent boundary determination, solvent flattening, and phase combination steps, in a manner equivalent to that suggested by Wang in his ISIR process, although the initial phases can be SIR, SAS, MIR, MIRAS or any combination generated by PHASIT. It will generate an initial solvent mask, use it for 4 cycles of solvent flattening/ phase combination, create a new mask, use it for 4 cycles, create a third mask, use it for 8 cycles, and, if desired, do additional phase extension cycles, and then possibly phase AND AMPLITUDE extension cycles. A series of files, all given .d extensions (UNIX) or .dat extensions (VMS) should be created containing control information for the forward and inverse Fourier transforms, for each option of the BNDRY program and for RMHEAVY. In general, these are the only files which will have to be changed for a new application, PROVIDED THE FILE NAME CONVENTION IN THE CONTROL FILES IS ADHERED TO. The output from PHASIT should be called phasit.31 and if phase extension is to be done, then the output from MISSNG should be called extrfl.d and the file sloext.d should also be prepared. If phase extension is not desired, then one does not have to run MISSNG and create sloext.d, but the line invoking the "extnd" procedure (@EXTND.COM in DOALL.COM for VMS systems or sh extnd.sh in doall.sh for UNIX systems) should be commented out. The individual program writeups should be consulted for the meaning of the parameters. It is important that the grid spacing selected in the input to FSFOUR be appropriate for the highest resolution data to be used anywhere in the process, including phase extended reflections. A grid spacing of about 1/3 of the smallest d spacing is recommended. 
It is also VERY important that the index range requested in the inputs to MAPINV cover at least a complete asymmetric unit out to the maximum resolution to be used anywhere in the process, including phase extended reflections. The particular asymmetric unit covered need not be identical to that originally input implicitly to PHASIT via the reflection files, but all reflections in the input files should at least have symmetry related counterparts in the MAPINV asymmetric unit. Since index limits in MAPINV are restricted to minimum and maximum values along each reciprocal axis, in high symmetry systems it may be necessary to cover more than an asymmetric unit (this causes no problem). Note also that MAPINV can compute structure factors only in the hemisphere with L non-negative, thus one MUST request an asymmetric unit in this hemisphere. This also creates no problem SINCE ANY REFLECTION CAN ALWAYS BE RELATED TO ONE IN THIS HEMISPHERE BY application of the Friedel symmetry operator, and this is automatically done in the programs. THUS WHEN IN DOUBT, ONE CAN ALWAYS SPECIFY A FULL HEMISPHERE, I.E. A RANGE OF -HMAX,HMAX, -KMAX,KMAX AND 0,LMAX WHICH WILL WORK, but may not be the most efficient way of doing things. For this reason one will NEVER have to reindex the input data, as an appropriate range in MAPINV can ALWAYS BE GIVEN! Example inputs are now given. If the supplied doall and related scripts are to be used without modification, then the filenames in these samples should NOT be changed (except for the parameter file, of course). One need change only the parameter file, solvent content and resolution related parameters, the map periods and index range, and the heavy atom coordinate file. ---- file fft.d (input to FSFOUR, for map calculation)---------------- seb.pam COMPUTE ELECTRON DENSITY MAP 0 48 72 80 1 0 20 0 0 0 0. 
four.ref four.map --- file minv1.d (input to MAPINV, for solvent boundary determination) seb.pam INVERT ELECTRON DENSITY MAP AFTER TRUNCATING NEGATIVES four.map minv.ref 0 0 0 16 0 24 27 0. 0. 1 0 --- file minv2.d (input to MAPINV, for normal map inversion) ----- seb.pam INVERT ELECTRON DENSITY MAP AFTER SOLVENT FLATTENING mod.map minv.ref 0 0 0 16 0 24 27 0. 0. 0 0 --- file rmhv.d (input to RMHEAVY, for removal of heavy atoms ) ---- seb.pam four.map nohv.map 2 2.5 PT1 0.2539 0.1918 0.1376 35. 1.000 6 PT2 0.1754 0.0439 0.4578 20. 0.664 6 --- file bnd0.d (input to BNDRY, option 0, prepare SF for protein- solvent boundary determination )------------------ seb.pam 0 9. minv.ref four.ref --- file bnd1.d (input to BNDRY, option 1, create solvent mask)---- seb.pam 1 four.map mask.map .4 --- file bnd2.d (input to BNDRY, option 2, do solvent flattening and negative density truncation) ---------------- seb.pam 2 four.map mask.map mod.map .086 --- file bnd3.d (input to BNDRY, option 3, combine new phases with original) -------- seb.pam 3 0 0. 1. 0 0 phasit.31 minv.ref newphi.ref --- file extnd.d (input to BNDRY, combine new phases with original, including phase extension ) ------------ seb.pam 3 1 3.5 1. 0 0 phasit.31 minv.ref extrfl.d newphi.ref --- file extnda.d (input to BNDRY, combine new phases with original, including phase AND AMPLITUDE extension) -------- seb.pam 3 2 3.5 1. 0 0 phasit.31 minv.ref extrfl.d newphi.ref --- file sloext.d (controls range and rate of phase extension) -- seb.pam 4. 3.5 8 extnd.d --- file sloext2.d (controls range and rate of phase AND AMPLITUDE extension -------- seb.pam 4. 3.5 8 extnda.d Once the input is prepared, the phasing process can be carried out either by running a series of command procedures as individual steps, or by running a single command procedure which invokes all others. The single procedure, called doall.sh or doall.com follows. 
In the procedures that follow, it is assumed that phase extension will be carried out, and that the additional files "extrfl.d" (prepared by MISSNG) and "sloext.d" (see SLOEXT write-up) are available.
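The "full hemisphere" index range suggested earlier for MAPINV (-HMAX,HMAX, -KMAX,KMAX and 0,LMAX) can be estimated from the cell edges and the resolution limit. The following is only a sketch in Python, not part of PHASES; the function names and the example cell are hypothetical. It relies on the fact that for any cell, |h| cannot exceed a/dmin for a reflection with spacing d >= dmin (and likewise for k and l), so the limits are always sufficient, although they may be generous for non-orthogonal cells.

```python
import math

def full_hemisphere_range(a, b, c, dmin):
    """Return (hmin, hmax, kmin, kmax, lmin, lmax) covering the L >= 0 hemisphere.

    a, b, c are cell edges in angstroms, dmin is the maximum resolution.
    The small epsilon guards against round-off when an edge is an exact
    multiple of dmin.
    """
    hmax = math.floor(a / dmin + 1e-9)
    kmax = math.floor(b / dmin + 1e-9)
    lmax = math.floor(c / dmin + 1e-9)
    return (-hmax, hmax, -kmax, kmax, 0, lmax)

def to_l_nonneg(h, k, l):
    """Map a reflection into the L >= 0 hemisphere with the Friedel operator."""
    if l >= 0:
        return (h, k, l)
    return (-h, -k, -l)

# Hypothetical 50 x 60 x 70 angstrom cell, data to 3.0 angstroms.
print(full_hemisphere_range(50.0, 60.0, 70.0, 3.0))
print(to_l_nonneg(1, -2, -3))
```

As the text notes, the programs apply the Friedel operator automatically; the second function is shown only to make the index bookkeeping explicit.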
3.04 For UNIX, use the following commands in a shell script, called doall.sh

    # COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR PHASING
    # DATA BY SOLVENT LEVELLING
    #
    # COMPUTE THE FIRST SOLVENT MASK
    sh mask1.sh
    #
    # COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
    sh cycle4.sh
    #
    # COMPUTE THE SECOND SOLVENT MASK
    sh mask2.sh
    #
    # COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
    sh cycle8.sh
    #
    # COMPUTE THE THIRD SOLVENT MASK
    sh mask3.sh
    #
    # COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
    sh cycle16.sh
    #
    # DO ADDITIONAL CYCLES OF SLOW PHASE EXTENSION (TO REFLECTIONS WITH
    # NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
    # INITIAL RESOLUTION
    sh extnd.sh
    #
    # IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING
    # DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
    # NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT. TO INVOKE IT, SIMPLY
    # REMOVE THE # FROM THE FOLLOWING LINE
    #sh extnda.sh
    #
    # THATS ALL

For VMS, use the following commands in a command procedure, called DOALL.COM

    $SET NOVERIFY
    $! COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR
    $! PHASING DATA BY SOLVENT LEVELLING
    $!
    $! COMPUTE THE FIRST SOLVENT MASK
    @MASK1.COM
    $!
    $! COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
    @CYCLE4.COM
    $!
    $! COMPUTE THE SECOND SOLVENT MASK
    @MASK2.COM
    $!
    $! COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
    @CYCLE8.COM
    $!
    $! COMPUTE THE THIRD SOLVENT MASK
    @MASK3.COM
    $!
    $! COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
    @CYCLE16.COM
    $!
    $! DO ADDITIONAL CYCLES OF PHASE EXTENSION (TO REFLECTIONS WITH
    $! NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
    $! INITIAL RESOLUTION
    @EXTND.COM
    $!
    $! IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING
    $! DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
    $! NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT. TO INVOKE IT,
    $! SIMPLY REMOVE THE ! FROM THE FOLLOWING LINE
    $! @EXTNDA.COM
    $!
    $! THATS ALL
3.05 EXPECTED OUTPUT FILES

Execution of the "doall" procedure will result in the following files being present. (phasit.31 and phasit.log should be present prior to running "doall.")

    phasit.31      contains original MIR, SIR etc phases from PHASIT
    phasit.l       contains phasit printed output
    mask1.14       contains first solvent mask
    mask1.l        contains mask1 printed output
    phi4cy.31      contains phases after 4 cycles using first mask
    cycle4.l       contains printed output from first 4 cycles
    mask2.14       contains second solvent mask
    mask2.l        contains mask2 printed output
    phi8cy.31      contains phases after 4 cycles using second mask
    cycle8.l       contains printed output from next 4 cycles
    mask3.14       contains third solvent mask
    mask3.l        contains mask3 printed output
    phi16cy.31     contains phases after 8 cycles using third mask
    cycle16.l      contains printed output from next 8 cycles
    phiextnd.31    (if generated) contains phases after 8 cycles using
                   third mask, plus additional cycles of phase extension
                   to known amplitudes
    extnd.l        (if generated) contains printed output from next 12
                   cycles
    phiextnda.31   (if generated) contains phases after 8 cycles using
                   third mask, plus additional cycles of phase extension
                   to known amplitudes, plus additional cycles of phase
                   and amplitude extension
    extnda.l       (if generated) contains printed output from next 12
                   cycles
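After a long "doall" run it can be convenient to check which of the phase and mask files listed above were actually produced. The following Python sketch is not part of PHASES; the function name is hypothetical, and only the non-optional phase and mask files from the listing are checked (printed-output files are left out).

```python
import os
import tempfile

# Phase and mask files named in the listing above, in the order they are produced.
EXPECTED = ["phasit.31", "mask1.14", "phi4cy.31", "mask2.14",
            "phi8cy.31", "mask3.14", "phi16cy.31"]

def missing_outputs(directory, expected=EXPECTED):
    """Return the expected files that are absent from `directory`, in listing order."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]

# Demonstration in a scratch directory: pretend the run stopped after the second mask.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["phasit.31", "mask1.14", "phi4cy.31", "mask2.14"]:
        open(os.path.join(demo_dir, name), "w").close()
    demo_missing = missing_outputs(demo_dir)

print(demo_missing)
```

The first missing file indicates the step at which the procedure stopped, so the corresponding individual script can be rerun rather than the whole "doall" procedure.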
4.00 NATIVE, DIFFERENCE AND "CALCULATED" PATTERSON MAPS In protein crystallography one is generally interested in difference Patterson maps to locate heavy atoms, in which the Fourier coefficients are the squares of the DIFFERENCE in AMPLITUDES between native and derivative data, or between members of a Bijvoet pair. Sometimes however, it is useful to compute native Patterson maps, or to compute "calculated" Patterson maps (generated from intensities computed explicitly from an input atomic model). The native maps may provide information about non-crystallographic symmetry, while the "calculated" maps obtained from a tentative heavy atom structure can be compared with the observed difference Pattersons to see how well the major features are being explained. The latter method is particularly useful in high symmetry systems, where even a small number of heavy atom sites gives rise to many Patterson peaks. Examining the observed and calculated Pattersons side by side (perhaps in VIEWPLT) can then provide confidence in the heavy atom interpretation. DIFFERENCE PATTERSONS - Difference Pattersons (either isomorphous or anomalous) can be computed by two different routes in PHASES. The first approach is to generate a standard "phased" file containing h,k,l,Fo,Fc,Phi, and use it in FSFOUR with the MAPTYP=5 option. Generally programs CMBISO or CMBANO do the initial data preparation, and their output files are then fed to TOPDEL to select the data according to various criteria, screen for and reject outliers, and write the appropriate information to the output file for FSFOUR. The output file will then contain either FPH and FP, or F+ and F- in the amplitude slots, depending on whether isomorphous or anomalous data were input. The second approach, which is useful only after at least one site is found in the derivative, is to use the "difference coefficient" file output from PHASIT in FSFOUR with MAPTYP=6. 
In the isomorphous case, if the input site(s) are correct this should lead to a cleaner map, since the FPH to FP scale factor has been refined, and also because the angular difference between the FP and FPH vectors is compensated for. The FO and FC slots in the file then contain (FPH-FP)obs,corrected and FHcal, respectively. For anomalous data these slots contain (FPH+ - FPH-)obs and (FPH+ - FPH-)calc or their counterparts for native anomalous data.

NATIVE PATTERSONS - Native Patterson maps can be generated in several ways, depending on what information is currently available. In all cases one must prepare a standard input "phased" file containing h,k,l,Fo,Fc,Phi, and request the appropriate option in FSFOUR to create the desired coefficients from the input data. One way to do this is to run CMBISO inputting the native file twice (as both the native and derivative data sets), and then run TOPDEL selecting ALL coefficients to be output (you can still use d and F/sigma cutoffs, but output 100% of the data!). The R factor and all differences will, of course, be zero, but the output file will contain native amplitudes in both the Fo and Fc slots, and thus the native Patterson can be generated by requesting MAPTYP=6 in FSFOUR. Another approach would be to run PHASIT, SF mode with IHLCF=0 and ISIGA=0, using a single "dummy" atom arbitrarily positioned as the model. The output file will then give a bad R factor, but it will contain Fo and Fc in the amplitude slots, and selecting MAPTYP=6 in FSFOUR will again give the desired native Patterson. The first method allows one to use d spacing and F/sigma cutoffs, while the second always uses all of the data. In either case the native Pattersons can be searched for peaks, contoured, displayed, printed etc. with PSRCH, MAPVIEW, CTOUR, MKPOST etc.
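The Patterson coefficients discussed so far can be summarized in a few lines. The Python sketch below is an interpretation of the text, not code taken from FSFOUR: it assumes that MAPTYP=5 squares the difference between the two amplitude slots, and that MAPTYP=6 squares the contents of the Fo slot alone. The function name and the "kind" labels are hypothetical.

```python
def patterson_coefficient(fo, fc, kind):
    """Return a Patterson coefficient from the two amplitude slots of a phased file.

    kind = "difference": (Fo - Fc)**2, e.g. slots holding FPH and FP, or F+ and F-
                         (assumed behaviour of MAPTYP=5).
    kind = "fo_squared": Fo**2, e.g. native amplitudes, or corrected (FPH-FP)obs
                         from a PHASIT difference coefficient file
                         (assumed behaviour of MAPTYP=6).
    """
    if kind == "difference":
        return (fo - fc) ** 2
    if kind == "fo_squared":
        return fo ** 2
    raise ValueError("unknown coefficient kind: " + kind)

# Isomorphous pair FPH=120, FP=100: coefficient of the difference Patterson.
print(patterson_coefficient(120.0, 100.0, "difference"))
# Native amplitude 100 in the Fo slot: coefficient of the native Patterson.
print(patterson_coefficient(100.0, 100.0, "fo_squared"))
```

Because Patterson coefficients are squared amplitudes, no phase is needed, which is why the contents of the Phi slot are irrelevant for these map types.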
"CALCULATED PATTERSONS" - Patterson maps corresponding to an input atomic model are also generated by preparing the normal "phased" file containing h,k,l,Fo,Fc,Phi, and by selecting the appropriate coefficient option (MAPTYP=7) in FSFOUR. In this case it is important that the second amplitude slot truely contains Fc. One way to do this is to run PHASIT, SF mode with IHLCF=0 and ISIGA=0, and to include all of the desired atoms in the model. If a heavy atom model is used as the input, the R factor will be meaningless (since scaling is to the NATIVE amplitudes rather than differences), but the output file would still be appropriate for the "calculated" difference Patterson as the map scale is arbitrary anyway. Another way is to use GREF to prepare the file, by requesting that an output Fourier file be written. In that case the file created can contain the proper DIFFERENCE amplitude in the Fo slot, and the model based Fc in the Fc slot. Then the SAME file could be used in FSFOUR to create both the observed difference Patterson (MAPTYP=6) and the modeled version of it based on the heavy atoms (MAPTYP=7). Once again, the FSFOUR map can be searched for peaks, contoured etc. as any normal map. Finally, the "difference coefficients" file written by PHASIT can be used in FSFOUR with MAPTYP=7 to compute the "calculated" difference Patterson based on the input heavy atom model. The advantage of doing it this way is that the model, and hence the FC's, then would reflect all refined scaling parameters (possibly including anisotropic B's), and also models based solely on anomalous scatterers could be used.
5.00 REFINING HEAVY ATOM PARAMETERS

There are two general ways to refine heavy atom parameters within the PHASES package: refinement against isomorphous or anomalous amplitude differences; or "phase refinement", i.e. by minimizing lack of closure.

Isomorphous/anomalous difference refinement is carried out with the program GREF, and has the advantage that only data from the crystal being refined is used (and the native, in the isomorphous case). It is therefore independent of all other derivatives, and is particularly useful in the case of common sites between multiple derivatives since there can be no "cross talk" or bias. Also, if this refinement is carried out against centric data only, there are few assumptions made about the protein phase and the refinement is usually very reliable. It is nearly always used for the first derivative as no reliable protein phase estimates are available at that time, but it's not a bad idea to do this initially for each derivative. The disadvantage is that refinement of all parameters with centric data may not be possible in some space groups. For example, in P2 the only centric data available are of the type h0l, thus one can not refine ANY y coordinates. In GREF one can have the program automatically include the 25% strongest differences for acentric data along with the centric data to enable refinement of SOME y's, but then one is introducing assumptions about the protein phase which are only approximately valid, thus weakening the refinement. Also, in a space group like P2 the origin is not fixed in the y direction, so even the RELATIVE y coordinates BETWEEN DERIVATIVES can not be refined, even when acentric data ARE included.

For refinement in GREF one would generally start by assigning the major site an occupancy of 1.0, other sites appropriate occupancies and all heavy atoms B values of 15 or 20. Then do a single cycle refining only the scale factor (which you can initially assign any positive value, usually 0.1).
After an estimate of the scale factor is obtained, one can refine coordinates as appropriate, along with the scale factor. One can then refine coordinates, scale factor and occupancies simultaneously, but the occupancy of the MAJOR SITE should ALWAYS be held fixed at 1.0. Also, when polar axes are present, even with sufficient data available for refinement (e.g. acentrics included) the coordinates of the MAJOR SITE along POLAR DIRECTIONS should still NOT be refined, as they are needed to fix the origin in the polar directions. Finally, if the resolution is sufficiently high one can then include B values in the refinement, but if there are indications of instability the B's can usually be held at their initial values without introducing much error. Attempts to simultaneously refine parameters which are not independent (i.e. coordinates for ALL atoms along polar directions, both scale factor AND occupancy when only one atom is input, ALL coordinates of an atom on a special position in the space group) or to refine parameters when there is no data determining that parameter (coordinates of ANY atom along polar direction when using ONLY centric data) will result in a singular matrix being obtained, and an aborted refinement. Parameters obtained from refinement against amplitude differences are generally well suited to initiate subsequent phasing calculations or further "phase refinement" in program PHASIT.

"Phase refinement" is carried out with the program PHASIT, and is ideally suited for refinement with multiple derivatives/data sets although it can also be used with a single derivative. In PHASIT either conventional refinement, or "maximum likelihood" options may be selected, and if only one derivative is used the program will automatically switch to maximum likelihood mode. Phase refinement requires an estimate of the protein phase, which is why it's better suited for the multiple derivative case, since SIR or SAS estimates alone are usually very poor.
The advantages of phase refinement are that in general, all parameters may be refined including native to derivative scaling parameters, and the corresponding weights (expected lack of closure estimates) are also implicitly "refined". Since the origin is fixed by the protein phase estimates, refinement is possible for coordinates along polar directions, and the origin can thus be properly established between derivatives. Phase refinement is however, sensitive to the hand of the heavy atom sets, and it is assumed that all input sets correspond to the SAME origin and hand. A useful procedure is to initially start with all parameters as described above, (after correlating origins and hand between derivatives with cross difference Fouriers) and then refine only the FH scale factor for one cycle. Then refine the FH scale factor along with coordinates, then along with coordinates and occupancies (again always holding the MAJOR site occupancy to 1.0 in each derivative). Then one can include the FPH scale factor along with the other parameters, and finally include the B factors. When refining anomalous scattering data sets one would generally do the same, except that both the FPH and FH scale factors should NOT be refined simultaneously (they can be alternated), and for NATIVE anomalous scattering the FPH scale factor should NEVER be refined! A useful option is to use protein phase estimates obtained from an external source during phase refinement rather than calculated from the current heavy atom parameters and data. Thus one could refine initially as described, then modify the phases by solvent flattening and/or NC symmetry averaging, and then refine the parameters again this time against the modified phases. The new parameters are then used to compute phases to start another round of density modification. This procedure has been helpful in several cases, and usually is particularly good for refining the FPH scale factor. 
For conventional phase refinement in PHASIT one would select a figure of merit cutoff in the range 0.4 to 0.6 and use weights of 1/E**2. For maximum likelihood mode one would select a figure of merit cutoff in the range 0.1 to 0.2 and use unit weights. Most successful protein structure determinations have utilized phase refinement to obtain the final MIR type phases, although refinement against differences is often done first to obtain starting values for the parameters.
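The figure of merit cutoffs and weighting schemes just described can be illustrated as follows. This is only a sketch in Python, not PHASIT code: the function name is hypothetical, E is taken to be the expected lack of closure for a reflection, reflections below the cutoff are assumed to be simply excluded, and the default cutoffs (0.5 and 0.15) are arbitrary values chosen from within the quoted ranges.

```python
def refinement_weights(reflections, mode):
    """Return weights for the reflections that pass the figure of merit cutoff.

    reflections: list of (fom, e) pairs, where e is the expected lack of closure.
    mode: "conventional" uses a cutoff of 0.5 and weights 1/E**2;
          "likelihood" uses a cutoff of 0.15 and unit weights.
    """
    if mode == "conventional":
        cutoff = 0.5
        return [1.0 / (e * e) for fom, e in reflections if fom >= cutoff]
    if mode == "likelihood":
        cutoff = 0.15
        return [1.0 for fom, e in reflections if fom >= cutoff]
    raise ValueError("unknown mode: " + mode)

# Three hypothetical reflections given as (figure of merit, expected lack of closure).
demo = [(0.9, 2.0), (0.3, 4.0), (0.12, 1.0)]
print(refinement_weights(demo, "conventional"))
print(refinement_weights(demo, "likelihood"))
```

The lower cutoff in maximum likelihood mode admits many more reflections, which is consistent with that mode being the automatic choice when only one derivative is available.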
6.00 HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE AND CROSS DIFFERENCE FOURIER MAPS There are several ways to compute heavy atom based difference or cross difference type Fourier maps within the PHASES package. 1) HEAVY ATOM DIFFERENCE or DOUBLE DIFFERENCE FOURIERS. The first approach is to refine heavy atom parameters against isomorphous or anomalous difference AMPLITUDES in program GREF, and request that a Fourier file be written. If this file is used in FSFOUR with MAPTYP=1, then the observed difference Fourier, i.e. that which should reveal all heavy atom sites can be obtained. If the file is used with the MAPTYP=3 option, then a "double difference" map is computed, i.e. the heavy atoms included in the structure factor calculation are subtracted out, so that the map should show only additional sites. The limitations with this approach are that the "observed" amplitudes ABS(FPH-FP) or ABS(F+ - F-) are approximations, since vector differences rather than amplitude differences should be used, and that the heavy atom model may be crude since the FPH to FP scale factor has not been refined, and anisotropic thermal parameters for the heavy atoms can not be used. Also, if used with anomalous data the absolute configuration can not be obtained since absolute values of delta F were used. The second approach is to compute phases and/or refine heavy atom parameters in program PHASIT, and use the "difference coefficients" files it produces in FSFOUR with the MAPTYP=1 or MAPTYP=3 options. If MAPTYP=1 the "observed" difference Fourier showing all heavy atoms will be obtained, however the results should be improvements over those obtained with the previous method. This results from the fact that the FPH to FP scaling parameters can be refined, the heavy atom thermal factors may be refined anisotropically, and phase difference information is used to correct the "observed" amplitudes to account for the fact that the two vectors are not colinear. 
In this case for isomorphous data sets the corrected "observed" differences, calculated heavy atom amplitudes and calculated heavy atom phases are used to compute the map. For anomalous data the "observed" and "calculated" Bijvoet differences are used along with the protein phases shifted by 90 degrees, to give true "Bijvoet difference" or "Bijvoet double difference" maps, so that the absolute configuration is preserved. Again, MAPTYP=1 should show all anomalous scatterers while MAPTYP=3 should have those included in the model subtracted out. These methods use phase information computed only from the heavy atoms or anomalous scatterers, although in the anomalous case all such information is combined to estimate protein phases. A third approach is to combine observed AMPLITUDE differences [ i.e. (FPH-FP) or (F+ - F-) ] directly with estimates of the protein phases to compute difference or Bijvoet difference Fouriers. One would then generate protein phases either by MIR, SIR, BNDRY, or from a model in PHASIT, and combine the phases with observed amplitude differences in programs MRGDF or MRGBDF for isomorphous or anomalous data, respectively. The maps would then be computed in FSFOUR using the MAPTYP=3 or MAPTYP=8 options for difference or Bijvoet difference maps, respectively. The advantage of this approach is that the protein phases themselves may be better, since one can use solvent flattened and/or NC symmetry averaged phases in the synthesis. For the isomorphous case the output coefficients file would then contain indices, FPH, FP, PHI_pro, and for the anomalous case the file would contain indices, F+, F-, PHI_pro. A disadvantage is that one can not "subtract out" the heavy atoms used in the phasing, so that they will also appear in the maps possibly making it more difficult to detect minor sites. 2) CROSS DIFFERENCE FOURIERS. 
This is accomplished similarly to the third option above, except that in MRGDF or MRGBDF a data file corresponding to a new derivative, i.e. one which was never used in phasing, is merged with an existing protein phase file. The cross difference Fourier (or cross Bijvoet difference Fourier) is then obtained in FSFOUR with MAPTYP=3 or MAPTYP=8, respectively. These maps should show all heavy atom or anomalous scatterer sites in the new derivative, which can then be checked against the appropriate difference Patterson. The advantage of doing this, in addition to helping solve the new derivative, is to assure that heavy atom sites in the new derivative correspond to the same origin and hand as those used in the original phasing.
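The coefficients used in the third approach and in the cross difference Fouriers can be written out explicitly. The Python sketch below is illustrative only and is not taken from MRGDF, MRGBDF or FSFOUR; the function names are hypothetical, and the direction of the 90 degree phase shift for Bijvoet difference maps (phase minus 90 degrees) is the usual convention and is an assumption here, since the text states only that the protein phases are shifted by 90 degrees.

```python
import cmath
import math

def difference_coefficient(fph, fp, phi_pro_deg):
    """(FPH - FP) combined with the protein phase, for a difference Fourier."""
    return (fph - fp) * cmath.exp(1j * math.radians(phi_pro_deg))

def bijvoet_difference_coefficient(f_plus, f_minus, phi_pro_deg):
    """(F+ - F-) combined with the protein phase shifted by -90 degrees (assumed sign)."""
    return (f_plus - f_minus) * cmath.exp(1j * math.radians(phi_pro_deg - 90.0))

# Hypothetical reflection: FPH=120, FP=100, protein phase 0 degrees.
print(difference_coefficient(120.0, 100.0, 0.0))
# Hypothetical Bijvoet pair: F+=55, F-=45, protein phase 90 degrees.
print(bijvoet_difference_coefficient(55.0, 45.0, 90.0))
```

Since the signed difference is retained in both cases, a reflection for which the derivative amplitude is smaller than the native one contributes with its phase effectively reversed.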
7.00 CREATING/EDITING SOLVENT MASKS

In most cases adequate solvent masks are prepared as part of the "doall" procedure, which carries out a reciprocal space equivalent of the automated protein-solvent boundary determination method described by Wang with the added modification that density in the immediate vicinity of heavy atoms is ignored during mask construction. Solvent masks, however, can also be created by hand, from coordinates for an input model or by starting with any of these masks and editing them. Solvent masks MUST have a one-to-one correspondence with FSFOUR maps, and thus they also MUST cover one full cell on the same grid used for the map, and be oriented as xz sections. They also must have the structure as described in the "file formats" section. This happens automatically if the masks are constructed by the "doall" procedure, but care must be taken to ensure these features if the masks are created by other means. Pre-existing solvent masks can be examined and/or edited in MAPVIEW, or MAPVIEW can be used to create the masks "from scratch" by hand tracing boundaries in contoured maps. Several options are now described.

*** Examining/editing "normal" (i.e. full cell) solvent masks ***

These masks (named mask1.14, mask2.14 and mask3.14, if created by the "doall" procedure) can be examined in MAPVIEW or MAPVIEW_X by inputting any FSFOUR map with the same grid, specifying that masks will be used, selecting 0 to 0.999 for each of the x, y and z ranges, specifying the xz section orientation and "recovering" the pre-existing mask file. From the menu contoured sections can then be selected and displayed. Clicking the mouse with the cursor in the "show mask" menu area will then display the solvent mask as blue dots on the solvent grid points. One can then use the menu options to scroll through the sections, displaying both contoured density and the solvent mask.
One could also use the "trace mask" menu option as described in the MAPVIEW writeup to edit the mask with the mouse, but at this point it is not desirable to do this as the full cell map is displayed, and one may have to make identical edits in each symmetry related envelope. If this is not done very carefully one could easily destroy the space group symmetry in the mask. A better approach, if editing is to be done, is simply to examine the map and mask to determine the coordinate range which would carve out only one contiguous molecule (asymmetric unit) by following the fractional coordinates as the cursor moves across the screen (displayed in the lower right hand corner). Note that when determining the range one can cross into neighboring cells, although only the one-cell-translated map region is displayed. Once an appropriate range is deduced, write it down and exit MAPVIEW without saving any files. Then run EXTRMAP and EXTRMSK to extract that same range from the FSFOUR map and solvent mask, respectively, to create the corresponding "submaps". This allows one to deal only with a contiguous asymmetric unit, and to select regions spanning cell edges. Now run MAPVIEW again this time inputting the non-fsfour (i.e. submap) and its corresponding mask file. Editing can then be done on the submask. After editing all appropriate sections, use the "MAKEASU" menu option to symmetrize the submask, and scroll through the masks again to confirm that everything is as desired. Once you are happy with it, exit MAPVIEW and when prompted, request that the entire submask region be saved to a file. At this point you have the edited mask covering an asymmetric unit. Run BLDCEL inputting the submap, edited submask and original FSFOUR map to expand the submask (and submap) to a full cell. You can delete the output map file, but the output mask file now corresponds to the edited solvent mask, expanded to a full cell obeying space group symmetry. 
It can now be used for solvent flattening (or examined again in MAPVIEW just as the original mask was to confirm the expansion).

***** Creating solvent masks from a model *****

If atomic coordinates are available from a tentative model, these coordinates can be used to create a solvent mask. To do this one should first prepare a PHASES style file containing the atomic coordinates (possibly from a PDB file via PDB_CDS), and determine the range (in fractional coordinates) which encompasses the model atoms. Then enlarge the range (on each end) slightly to account for the radius to be assigned to each atom. MDLMSK can then be run to create a mask file just encompassing the molecule. When prompted in MDLMSK, the periods (number of grid points along each axis) should be specified EXACTLY as in the input to FSFOUR, to ensure that the maps to be computed later will have the same grid as the mask. The adjusted fractional coordinate range for the model should then be specified along with a mask number (use 1 for pure solvent masks), and a radius of about 1.8 angstroms. In the mask the outer boundary will be appropriate, but there will typically be many small holes in the interior caused by use of a van der Waals size radius. Use of a larger radius could avoid these holes, but would artificially extend the outer boundary. To avoid this one generally uses the smaller radius, and then edits the masks to preserve the outer boundary but fill in the interior holes. This can be done very quickly in MAPVIEW. To do this run MAPVIEW inputting a FSFOUR map (any one will do, as long as the periods are the same as that used in MDLMSK) and request that masks will be used. Then input the same coordinate range as in MDLMSK, request the xz section orientation and "recover" the mask file from MDLMSK.
You can effectively turn off the density display by selecting a high contour level, and scroll through the sections editing each via the "show mask" and "trace mask" options described in the MAPVIEW writeup. Just quickly trace around the already displayed outer boundary to preserve it, and the interior holes will be filled automatically when you are done with each section. When finished, use the "MAKASU" option to symmetrize the mask region. Then exit MAPVIEW and request that both the entire map and mask regions be written to files. You then will have an edited mask file encompassing the model, and the corresponding submap file. The last step is to convert the edited mask to a full cell mask. To do this, run BLDCEL inputting the submap, corresponding edited mask and original FSFOUR map. The output map file can be deleted, but the output mask file will be a full cell version of the edited, model based mask which now also obeys space group symmetry. It can then be used for solvent flattening (for example, replacing the mask3.14 file in the cycle16.sh, extnd.sh or extndavg.sh procedures), and also can be examined in MAPVIEW as described earlier.
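The step of enlarging the fractional coordinate range to allow for the atomic radius, needed before running MDLMSK, amounts to simple arithmetic along each axis. The following Python sketch is not part of PHASES; the function name and the example numbers are hypothetical, and it assumes orthogonal cell axes so that a radius in angstroms converts to a fractional pad by dividing by the cell edge (for a non-orthogonal cell a somewhat larger pad may be needed).

```python
def padded_range(fractional_coords, radius, cell_edge):
    """Return (low, high) fractional limits along one axis, padded by the atomic radius.

    fractional_coords: fractional coordinates of all model atoms along this axis.
    radius: radius assigned to each atom, in angstroms (about 1.8 in the text).
    cell_edge: length of the corresponding cell edge, in angstroms.
    """
    pad = radius / cell_edge
    return (min(fractional_coords) - pad, max(fractional_coords) + pad)

# Hypothetical model spanning x = 0.10 to 0.40 in a cell with a = 60 angstroms.
low, high = padded_range([0.10, 0.25, 0.40], 1.8, 60.0)
print(round(low, 4), round(high, 4))
```

The same calculation is repeated for y and z with the b and c cell edges, and the resulting limits may extend below 0 or above 1 when the molecule spans a cell edge.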
8.00 INCORPORATION OF PARTIAL STRUCTURE INFORMATION

In many cases a significant fraction of the structure can be reliably determined from an electron density map, but some regions in the map are less well defined. In that case it is often useful to incorporate phase information obtained from the partial structure into the phasing process. This can be done in several ways, all of which require running the PHASIT program once (in SF calculation mode, IHLCF=0, ISIGA=0) to generate partial structure phases and amplitudes, and running the BNDRY program once (option 3, with ICMB=0 or 1) to combine the partial structure phase information with prior phase probability distributions cast in terms of Hendrickson-Lattman coefficients. Different strategies can be employed depending on which prior distributions are used, weighting during the phase combination and what is done AFTER the phase combination step. The most common procedures are now described.

In all procedures, first run PHASIT in SF calculation mode using IHLCF=0 and ISIGA=0, and call the output phase file MODEL.31. This file contains the partial structure phase and amplitude information. Now you have some choices.

1) Combine the partial structure information with the original (MIR, SIR etc) probability distributions (usually in file called PHASIT.31 generated by PHASIT, but possibly introduced via the IMPORT program). This can be done with a small control file to run BNDRY, option 3, using ICMB=0 or 1 for either Sim or Sigma_A weighting, respectively, (see BNDRY write-up) during phase combination. Call the output file PHICOMBINED.ORG. This file will contain phase, figure of merit and HL coefficients for the COMBINED data. If the partial structure was large enough, you may be able to use these phases directly to get a good map.

2) If you want to proceed with solvent flattening cycles, just copy the file PHICOMBINED.ORG to PHASIT.31 (first saving the ORIGINAL PHASIT.31 i.e.
no partial structure contributions, in another file). Now you can invoke the default procedure DOALL.COM without changing anything, and at each phase combination step the MODEL+SIR etc distributions will serve as the "anchored" phases with which those newly obtained from solvent flattening will be combined.

3) If you wish instead to combine the partial structure information with distributions obtained AFTER solvent flattening, do the same as in 1), but use the best phases available (usually in file obtained from a previous run called phi16cy.31, phiextnd.31 etc) instead of the original PHASIT.31 file. Call the output file PHICOMBINED.FIN. One could then proceed with solvent flattening cycles as in 2), but usually this is not necessary and the phases in file PHICOMBINED.FIN are used for the final map.

These 3 options (partial structure + MIR etc with no flattening, [partial structure + MIR etc] followed by flattening, and partial structure + flattened MIR) seem to be the most useful, and can all be carried out without tampering with the default control files. One only has to create additional small control files for single runs of PHASIT (SF mode) and BNDRY (option 3). Other options making use of partial structure information are described in the section on "REDUCED BIAS NATIVE, COMBINED AND DIFFERENCE FOURIERS."

Note that in the case where a molecular replacement solution was obtained, one has no MIR like phase probability distributions with which to combine solvent flattened (e.g. map inverted) phases. In that case (or if one has MIR phase information, and simply wants to abandon it), one can use PHASIT in SF calculation mode but with IHLCF=1 and ISIGA=0 or 1. That will create Hendrickson-Lattman coefficients for the partial structure, and these distributions can then be used as the "anchored" phases with which those newly obtained from solvent flattening will be combined.
This may also be useful if one wants to tie noncrystallographic symmetry averaged phases to model phases.

Another option is available involving phase extension. Suppose one has MIR, SIR etc data to only 4.0 angstrom resolution, native data to 3.0 angstrom resolution, and a partial structure available. One could first compute the partial structure phases out to 3.0 angstrom resolution (PHASIT SF mode, IHLCF=0, ISIGA=0) and MIR etc phases out to 4.0 angstrom resolution (PHASIT, phasing mode). Then run MISSNG to get the file "extrfl.d" containing reflections between 4 and 3 angstroms. The three output files could then be combined in a single run of BNDRY (option 3, ICMB=0 or 1, with phase extension requested) to get a hybrid file. The output file would then contain MIR combined with partial structure phases to 4 angstroms, and partial structure phases between 4 and 3 angstroms. This file could then be used for direct calculation of a map or to initiate solvent flattening cycles as described earlier. Yet another variation would be to do the same thing but requesting IHLCF=1 and ISIGA=0 or 1. In that case during solvent flattening iterations the map inverted phases would also be tethered to the partial structure phases for the high resolution data.

Clearly many other options or sequences are available. The key to successful use of the programs in this fashion is understanding that the phase combination program (BNDRY, option 3) merges at least two files, one of which must contain phase information cast in Hendrickson-Lattman coefficients, and the other containing only calculated phases and amplitudes (possibly to higher resolution). If phase extension is also desired, a third file with the additional reflections may contain only indices and amplitudes, but it may also contain phase probability distribution coefficients for some or all of the reflections. The output file always contains the COMBINED information cast in the HL coefficient form.
It is thus suitable for use either for direct map calculations or as an input file for the BNDRY, MISSNG, MRGDF, MRGBDF, RD31 etc programs.
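
As an illustration of why the HL coefficient form is convenient for phase combination: independent phase probability distributions of the form P(phi) ~ exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)) are multiplied simply by adding their coefficients. The Python sketch below shows only that arithmetic. It is not BNDRY code; the function names are hypothetical, and the weight X attached to a calculated phase is taken as a given input (in the programs it would come from the Sim-type or Sigma_A weighting discussed in section 9.00).

```python
import math

def hl_from_phase(phi_deg, x):
    """Unimodal distribution centered on a calculated phase.

    x is an assumed, externally supplied weight (hypothetical input);
    C and D are zero for a unimodal distribution.
    """
    phi = math.radians(phi_deg)
    return (x * math.cos(phi), x * math.sin(phi), 0.0, 0.0)

def combine_hl(first, second):
    """Multiply two independent distributions by adding their HL coefficients."""
    return tuple(a + b for a, b in zip(first, second))

def phase_stats(hl, steps=360):
    """Return (centroid phase in degrees, figure of merit).

    The distribution is integrated numerically on an even phase grid.
    """
    a, b, c, d = hl
    phis = [2.0 * math.pi * k / steps for k in range(steps)]
    expo = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(expo)                       # subtract the maximum to avoid overflow
    weights = [math.exp(e - top) for e in expo]
    total = sum(weights)
    cx = sum(w * math.cos(p) for w, p in zip(weights, phis)) / total
    cy = sum(w * math.sin(p) for w, p in zip(weights, phis)) / total
    return math.degrees(math.atan2(cy, cx)) % 360.0, math.hypot(cx, cy)

# Two sources that agree reinforce each other; two that disagree cancel.
agree = combine_hl(hl_from_phase(30.0, 2.0), hl_from_phase(30.0, 1.0))
clash = combine_hl(hl_from_phase(0.0, 1.0), hl_from_phase(180.0, 1.0))
print(phase_stats(agree))
print(phase_stats(clash)[1])
```

The figure of merit is the length of the centroid vector of the combined distribution, so it rises when the two sources agree and falls toward zero when they contradict each other.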
9.00 REDUCED BIAS NATIVE, COMBINED AND DIFFERENCE FOURIER MAPS

When making use of partial structure information, either obtained from a model via structure factor calculations or from inversion of a density map, the resulting phases are always biased towards the partial structure. Read (Acta Cryst. A42, 140-149, 1986) has shown how this bias can be reduced significantly when using the partial structure phases directly for map calculations, and how to properly weight partial structure derived phase information when combining it with other (e.g. MIR, SIR etc.) phase information. Both procedures require first determining "Sigma_A", which is related to the contributions from "missing" or "incorrect" parts of the structure and varies with resolution, to compute the proper weight (and thus FOM) for the partial structure phases. The procedures described by Read have been implemented as options in both PHASIT and BNDRY, and can be invoked as follows:

(1) COMBINED PHASE MAPS

For simple phase combination the Sigma_A procedure can be invoked in the BNDRY program (option 3), by setting ICMB=1. Sigma_A weighting is then used instead of Bricogne's modification of Sim's weighting scheme during phase combination. This can be used either with model phases or map inverted phases, and thus can be done automatically in the "doall" or "extndavg" procedures.

(2) REDUCED BIAS DIFFERENCE MAPS

These maps are similar to conventional Fo-Fc maps phased with the partial structure, but the coefficients are

   (FOM * FOBS - D * FCALC) * exp(i * phicalc)

where D is derived from the Sigma_A values and phicalc is the phase from the partial structure. The appropriate map can be produced by running PHASIT, SF mode with ISIGA=2, and then requesting a Fo-Fc map in FSFOUR. Note however, that if this option is chosen, FOM*FOBS and D*FCALC will occupy the Fo and Fc slots in the output file, thus other map types requiring pure Fo and/or pure Fc values will be inaccessible.

(3) REDUCED BIAS NATIVE MAPS

These maps are similar to conventional 2Fo-Fc maps phased with the partial structure, but the coefficients are

   (2 * FOM * FOBS - D * FCALC) * exp(i * phicalc)

for acentric reflections, where D is derived from Sigma_A, and

   FOM * FOBS * exp(i * phicalc)

for centric reflections. The appropriate map can be produced by running PHASIT, SF mode setting ISIGA=3, and then requesting a 2Fo-Fc map in FSFOUR. Note however, that if this option is chosen, FOM*FOBS and D*FCALC (acentric) or FOM*FOBS/2 and 0 (centric) will occupy the Fo and Fc slots in the output file, thus other map types requiring pure Fo and/or pure Fc values will be inaccessible.

(4) SIGMA_A WEIGHTED PROBABILITY DISTRIBUTION COEFFICIENTS

Phase probability distribution coefficients and the corresponding FOM based on Sigma_A weighting for structure factors computed entirely from an atomic model can be obtained by running PHASIT, SF mode with ISIGA=1. The file may then be used as the "anchor" phases to which map inverted phases are tethered. It is particularly useful for merging information during phase extension when high resolution phases come from a partial structure and lower resolution phases come from MIR type calculations.
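
The slot bookkeeping in options (2) and (3) can be checked with a small Python sketch. It only reproduces the arithmetic stated above: which values sit in the Fo and Fc slots for ISIGA=2 or 3, and what a later "Fo-Fc" or "2Fo-Fc" request then yields. The function names are hypothetical and this is not PHASIT or FSFOUR code; FOM, D and the amplitudes are assumed to be already known for the reflection. No separate centric rule is given for ISIGA=2, so none is applied here.

```python
import cmath
import math

def output_slots(fom, fobs, d, fcalc, isiga, centric=False):
    """Return the (Fo slot, Fc slot) pair described for ISIGA=2 or ISIGA=3."""
    if isiga == 2:
        return fom * fobs, d * fcalc
    if isiga == 3:
        if centric:
            return fom * fobs / 2.0, 0.0
        return fom * fobs, d * fcalc
    raise ValueError("only ISIGA=2 and ISIGA=3 are covered by this sketch")

def map_coefficient(fo_slot, fc_slot, phicalc_deg, map_type):
    """Complex Fourier coefficient obtained from the two amplitude slots."""
    if map_type == "Fo-Fc":
        amplitude = fo_slot - fc_slot
    elif map_type == "2Fo-Fc":
        amplitude = 2.0 * fo_slot - fc_slot
    else:
        raise ValueError("unknown map type: " + map_type)
    return amplitude * cmath.exp(1j * math.radians(phicalc_deg))

# Hypothetical reflection: FOM=0.8, FOBS=100, D=0.9, FCALC=90, phicalc=0
acentric = map_coefficient(*output_slots(0.8, 100.0, 0.9, 90.0, 3), 0.0, "2Fo-Fc")
centric = map_coefficient(*output_slots(0.8, 100.0, 0.9, 90.0, 3, centric=True), 0.0, "2Fo-Fc")
print(abs(acentric), abs(centric))
```

Because the centric Fo slot holds FOM*FOBS/2 and the Fc slot holds zero, the ordinary 2Fo-Fc combination returns FOM*FOBS for centric reflections, which is the coefficient quoted in option (3).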
10.00 INCORPORATION OF NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING

Whenever there are multiple copies of identical molecules present in the crystallographic asymmetric unit and/or the same molecule is present in multiple crystal forms, one has the opportunity to improve the phases by averaging the corresponding electron density in the related molecules, replacing the density for each molecule with the average, and inverting the "averaged" density map(s) to obtain new structure factor amplitudes and phases. These new amplitudes and phases can then be accepted immediately, but are more frequently combined with the original MIR, SIR etc. phase information in a probabilistic manner, just like those obtained from solvent flattening or from a partial structure. Indeed, solvent flattening and imposition of non-negativity of electron density can be applied in addition to the noncrystallographic symmetry averaging, leading to powerful phasing algorithms. The resulting phases (either alone or combined with MIR, SIR etc.) are typically combined with the observed amplitudes, and the process is cycled until convergence is obtained. The power of the method increases as the number of molecules averaged increases, but averaging over even a dimer is still extremely useful when combined with MIR, SIR etc. data.

Programs required to carry out the steps needed for successful noncrystallographic symmetry averaging are currently included in the PHASES package, and sample control scripts are given (called "extndavg.sh" and "extndavg_mc.sh", for the single and multiple crystal averaging cases, respectively) which replace the "extnd.sh" script in a normal solvent flattening run. The scripts insert the averaging related steps into the normal solvent flattening process, thus the complete multi-cycle task can be carried out by executing them.

Prior to running the scripts however, there are several related tasks to be performed, which include determination of the location, direction and nature (rotational order) of the noncrystallographic symmetry operator(s), and construction of one or more "averaging envelopes" or "averaging masks" delineating the volume(s) occupied by the molecules to be averaged. Initial estimates for the noncrystallographic symmetry operator(s) are usually obtained from rotation/translation functions, which are not included in the PHASES package as they are readily available elsewhere. However, if the operators are specified by 3x3 rotation matrices and 3 element translation vectors (as for example, in the program "O"), then the PHASES program O_TO_SP can be used to convert them to PHASES format. Everything else, including refinement of the operator(s) and construction of the envelope mask(s), is part of PHASES. All map interpolation programs (MAPAVG, SKEW, MAPORTH etc.) utilize powerful 64 point spline algorithms, thus the map grids for averaging need not be any finer than for normal calculations.

Many of the noncrystallographic symmetry averaging routines in PHASES were derived from programs originally written by W. Hendrickson & J. Smith. In most instances they have been heavily modified for use in PHASES, mostly to generalize the algorithms, to optimize the code, and to provide compatibility with the rest of the package.

The general averaging process as implemented in PHASES is described below. For both simplicity and reasons related to computational efficiency, all of the averaging related calculations are best performed on electron density "submaps," which cover only the map region encompassing an asymmetric unit containing the molecules to be averaged.
This "asymmetric unit" need not be complete in the crystallographic sense (that is, it may differ from a true asymmetric unit in volume and have irregular borders), but it must encompass at least the molecules to be averaged, although solvent regions may be omitted. It may also span cell edges, if necessary. Since the standard FSFOUR maps always cover a complete unit cell, the "submaps" (which have a different format) can be created from them via the programs MAPVIEW or EXTRMAP. Indeed, MAPVIEW will almost certainly be needed to determine which region to extract in the first place. All envelope creation, averaging, operator refinement, skewing etc will be done using the submaps. After appropriate regions in the submaps are averaged, program BLDCEL is used to regenerate complete unit cell maps (FSFOUR format) conforming to the space group symmetry, which can then be inverted by MAPINV. Thus MAPVIEW (or EXTRMAP) and BLDCEL serve as the gateways between normal FSFOUR maps and submaps. Note that MAPVIEW can display either type of map (and mask). Descriptions of the inputs required for each of the programs mentioned can be found in the appropriate program write-ups. The keys to successful averaging are to obtain good "envelope" masks which accurately identify the volume(s) in space in which the noncrystallographic symmetry operator(s) is/are valid, and to obtain accurate values for the operators themselves. These tasks always will take one of two routes, depending on the nature of the noncrystallographic symmetry. Within a given crystal, if the NC symmetry is purely rotational with the order of rotation being N-FOLD, where N is a small integer, then the task is simplified since one needs only a single "envelope mask" which encompasses all N of the molecules related by NC symmetry. That is to say the averaging can be done without having to specify where one molecule stops and the next starts. One only needs to know the bounds of the TOTAL AGGREGATION of molecules. 
The procedure (A) below is then adequate to carry out the necessary computations. If an arbitrary rotation angle and/or a translational (e.g. screw-like) shift is involved, the task is more complicated since one then must create a SEPARATE ENVELOPE MASK identifying each molecule. The procedure (B) below is then adequate to carry out the computations. For multiple crystal averaging the same steps and considerations are required, but multiple submaps (one for each crystal form, along with corresponding envelope masks) are used. Details related to multiple crystal averaging are described later.

(A) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH PURE ROTATIONAL SYMMETRY OF ORDER N

1) Start with the best possible map (usually a solvent flattened MIR map, as obtained via the "doall" procedure).

2) Compute a map via "FSFOUR" (default orientation, i.e. NORN=0).

3) Run EXTRMAP (or MAPVIEW) to extract a submap from the FSFOUR map which encompasses at least the dimer, trimer etc. related by KNOWN (at least approximately) noncrystallographic symmetry.

4) If the unit cell is not orthogonal, run MAPORTH to convert the submap to an orthogonal grid (but save the input submap as well).

5) Run LSQROT (using the orthogonal map) to refine the noncrystallographic symmetry axis location and direction. Start with low resolution (~6A map, 2A grid), refining only within a sphere of suitable radius (usually 12-25A) centered about a point on the rotation axis which is near the dimer, trimer etc. center. Then gradually extend the map resolution to about 3A (1A grid) and repeat the refinement. In a 4A map, the correlation coefficient after refinement should be about 0.4 or higher. (Ignore the R factor, it's always very high.)

6) Run SKEW (using the submap from 3) to generate a "skewed" map with the new "b" axis aligned with the noncrystallographic symmetry axis.

7) Run MAPVIEW (using the "skewed" map) to create a mask (via the "trace mask" option) which encompasses only the region to be averaged. This should include the entire dimer, trimer etc. In MAPVIEW, use only a single mask (Mask No. 1). When exiting, save the "skewed" mask file.

8) Run TRNMSK (using both the original submap from 3 and the "skewed" mask from 7) to convert the skewed mask to one corresponding to the default (non-skewed) orientation (its grid will have one-to-one correspondence with the original submap). Save this standard mask.

9) Run MAPVIEW (using the original submap from 3), and "recover" the standard mask file from 8. Then use the "Make Asu" option, and possibly edit masks until only non redundant density associated with the desired dimer, trimer etc. is within the mask. When exiting, save the ENTIRE mask (no subset). It will be used in all future averaging cycles.

Optionally, run LSQROT again, this time using the default mask output from 9 as the basis for refinement (you may have to orthogonalize it), instead of a sphere. If you do this, expect a drop in the correlation coefficient. If the orientation changes significantly, repeat steps 6-9.

Proceed to AVERAGING STEPS.

(B) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH ARBITRARY ROTATIONAL ANGLE AND/OR TRANSLATION

Steps 1-4 are the same as in (A).

5) Run LSQROTGEN (using the orthogonal map) to refine the noncrystallographic symmetry operators relating molecule 1 (arbitrarily selected) to each other molecule. Start with low resolution (~6A map, 2A grid), refining only within spheres of suitable radius (typically 15A) centered on points near the centers of molecule 1 and the target molecule, respectively. Then gradually extend the map resolution to about 3A (1A grid) and repeat the refinement. In a 4A map, the correlation coefficient after refinement should be about 0.4 or higher. (Ignore the R factor, it's always very high.) For N related molecules, there will be N-1 operators to refine.

6) Run MAPVIEW (using the submap from 3) to create SEPARATE envelope masks for EACH MOLECULE to be averaged. Do this by making use of the "set mask no." and "trace mask" options. When exiting, save the mask file, as it now contains separate envelope information for each molecule. Also, remember which mask No. you assigned to which molecule.

7) Run MAPVIEW (using the original submap from 3), and "recover" the standard mask file from 6. Then use the "Make Asu" option, and possibly edit masks until only non redundant density associated with the desired dimer, trimer etc. is within the molecular envelope masks. When exiting, save the ENTIRE mask (no subset). It will be used in all future averaging cycles.

Optionally, run LSQROTGEN again, this time using the default mask output from 7 as the basis for refinement (you may have to orthogonalize it), instead of spheres. If you do this, expect a drop in the correlation coefficient. If the operator(s) change significantly, repeat steps 6-7, otherwise continue.

AVERAGING STEPS

Prior to brute force cycling, run MAPAVG (using the original submap from 3, and the corresponding mask from 9A or 7B) to generate an "averaged" map. If the translation is small (or absent), use "SKEW" to convert it so you can look down the NC symmetry axis. You can then use "MAPVIEW" to view the map, and verify that averaging has indeed been done successfully, that you are in fact looking down the NC symmetry direction, and that the axis goes through the origin. If so, proceed to the averaging cycles. If not, something went wrong earlier. Check program inputs, outputs, polar axis conventions, etc.

At this point refined values of the noncrystallographic symmetry operator(s) are available, along with envelope masks isolating the regions to be averaged within the submap.

1) Create the file "extrmap.d", which will specify what submap region to extract from the FSFOUR map. It MUST correspond EXACTLY to the same region used when creating the envelope masks. (You can read the envelope mask header with RDHEAD if you forgot.) Rename the final mask file "asu.msk". See the EXTRMAP write-up for information.

2) Create the file "mapavg.d", to specify the transformation operator(s) for averaging, and the envelope mask file. See the MAPAVG write-up for information.

3) Create the file "bldcel.d", to specify the file names and options. BLDCEL will take the "averaged" asymmetric unit submap from MAPAVG, and build a complete cell FSFOUR style map from it. See the BLDCEL write-up.

4) Create the file "sloext.d" specifying phase extension information and cycles to be performed (see the SLOEXT write-up). If no phase extension is to be done, make the upper and lower resolution cutoffs identical and specify 16 cycles. Otherwise, specify the resolution cutoffs and cycles per resolution increment, and run MISSNG to create the "extrfl.d" file.

5) Create the file "extnd.d" specifying file names, extension options and I/O type.

6) Verify that the phase files (phasit.31 and phi16cy.31), solvent mask (mask3.14), and data files (bnd2.d, fft.d, minv2.d) from a previous "doall" run are available.

7) Run the procedure "extndavg.sh". It will carry out the cycles of NC symmetry averaging/solvent flattening/phase combination/phase extension steps to combine "averaged" phases with the original MIR phases.

***** CREATING AVERAGING ENVELOPE MASKS FROM A MODEL *****

If coordinates from a tentative model are available, they can also be used to create the averaging envelope masks. The procedure is essentially that described in the CREATING/EDITING SOLVENT MASKS section, with a couple of minor exceptions. First, after the initial mask is constructed in MDLMSK and edited in MAPVIEW as described, one is finished since, unlike solvent masks, there is no need to make "full cell" averaging masks with BLDCEL. Second, if the NC symmetry operation involves arbitrary rotations and/or post rotation translations, then MDLMSK must be run multiple times; once for each NC symmetry related molecule.
In each run a separate file should be written and a different mask number must be used, but each file must cover the same range (which is large enough to cover ALL copies). The particular mask numbers used must be remembered as they will be needed later when specifying which transformation operators are to be used in MAPAVG to relate the molecules. The individual mask files should then be edited and saved as described in the CREATING/EDITING SOLVENT MASKS section. Once edited, the individual mask files must be combined into a single mask file with program MRGMSK. The output file from MRGMSK then can be used for averaging (i.e. as "asu.msk" in the "extndavg.sh" procedure). The output masks from MDLMSK or MRGMSK can be used for averaging as long as a corresponding map region is provided. Thus the input used to create the submaps ("extrmap.d" in the "extndavg.sh" procedure) must specify the same range, and the map must have the same periods. If the "extndavg.sh" procedure is to be used, then the solvent mask must also be created from a map having the same periods. The output masks can be examined/edited in MAPVIEW, again as long as the corresponding map region (either explicitly selected from a FSFOUR map in MAPVIEW, or previously extracted from a FSFOUR map by EXTRMAP and input to MAPVIEW as a non-FSFOUR map) is provided. Once the output masks from MDLMSK or MRGMSK are obtained, they can be used just like any other averaging mask file, i.e. used for operator refinement in LSQROT or LSQROTGEN, used in MAPAVG, manipulated in SKEW, BLDCEL etc.
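
The idea of combining per-molecule masks can be pictured with the short Python sketch below. It is only a conceptual illustration and not the MRGMSK algorithm or the PHASES mask file format: each mask is represented here as a flat list of integers over the same grid range (0 outside the envelope, the mask number inside), and the handling of overlapping envelopes (keep the lowest mask number and report the grid point) is an assumption made for the example.

```python
def merge_masks(masks):
    """Merge equally sized integer masks into a single mask.

    Returns (merged mask, list of grid indices claimed by more than one mask).
    """
    if not masks:
        raise ValueError("at least one mask is required")
    size = len(masks[0])
    if any(len(m) != size for m in masks):
        raise ValueError("all masks must cover the same range")
    merged = [0] * size
    overlaps = []
    for i in range(size):
        claims = sorted({m[i] for m in masks if m[i] != 0})
        if len(claims) > 1:
            overlaps.append(i)      # two envelopes claim this grid point
        if claims:
            merged[i] = claims[0]   # arbitrary choice: lowest mask number wins
    return merged, overlaps

# Two hypothetical one-dimensional masks, using mask numbers 1 and 2
mask_a = [1, 1, 0, 0]
mask_b = [0, 2, 2, 0]
print(merge_masks([mask_a, mask_b]))
```

A non-empty overlap list would suggest that the individual envelopes need further editing before they are used for averaging.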
10.01 AVERAGING WITH MULTIPLE CRYSTALS

All of the submap extraction and mask preparation steps used in single crystal averaging as described earlier must be carried out independently for each crystal, thus multiple submap and corresponding mask files must be created. If a given crystal also contains multiple NC symmetry related copies WITHIN IT, then the operators relating molecule 1 to each of them must also be refined exactly as described in the single crystal case. This will allow both intra and inter crystal averaging to be carried out simultaneously. In addition, the operators relating MOLECULE 1 in CRYSTAL 1 to MOLECULE 1 in EACH OTHER CRYSTAL must also be refined. This can be done in program LSQROTGEN by specifying the appropriate input. Once all of the required operators and envelope masks are obtained, averaging can proceed by specifying the appropriate input to program MAPAVG, and by preparing the required input files for EXTRMAP and BLDCEL for each crystal.

A script file "extndavg_mc.sh" is supplied for multiple crystal averaging in the case where there are two crystals. It can easily be modified to include more crystals (up to 6), and comments are embedded in it explaining where modifications are to be made. The main difference in the procedure for multiple crystal averaging is that all of the normal input files must be duplicated for each crystal, and the standard file names for maps, masks, data files etc. must be modified to uniquely identify the appropriate crystal.

During each cycle of multiple crystal averaging the full cell maps are created and the submaps are extracted independently for each crystal. Then an averaged version (averaged over ALL copies) is created for each submap. For each crystal, the averaged submap is then expanded to its full cell version, solvent flattened, Fourier inverted and the resulting phases and amplitudes combined probabilistically with the appropriate MIR or SIR phases. Thus a new improved map can be obtained for each crystal.

With multiple crystal averaging however, there is currently no facility for slow phase extension, thus the file "sloext.d" is not needed and the number of cycles to be done is hard wired into the "extndavg_mc.sh" script. Phase extension can still be done, but the appropriate cutoffs are supplied only in the "extnd.d" files for each crystal, and are constant for all iterations. One can of course still extend the resolution gradually by repeating the process iteratively with different cutoffs and input files.
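
The per-crystal renaming can be sketched in a few lines of Python, assuming the "_N" convention used by the sample files in section 10.03 (the crystal number is inserted just before the file extension, so that fft.d becomes fft_1.d for crystal 1). The helper names are hypothetical and are not part of PHASES; the sketch only builds names and does not create or check any files.

```python
import os

def crystal_name(standard_name, crystal_number):
    """Insert "_N" just before the extension of a standard file name."""
    stem, extension = os.path.splitext(standard_name)
    return "{0}_{1}{2}".format(stem, crystal_number, extension)

def names_for_crystals(standard_names, number_of_crystals):
    """Map each crystal number to its list of crystal-specific file names."""
    return {
        n: [crystal_name(name, n) for name in standard_names]
        for n in range(1, number_of_crystals + 1)
    }

# A few of the standard control and phase file names used in a single crystal run
standard = ["fft.d", "extrmap.d", "bldcel.d", "extnd.d", "phasit.31", "phi16cy.31"]
for number, names in names_for_crystals(standard, 2).items():
    print(number, names)
```

A list produced this way could be compared against the working directory before a multiple crystal run is started.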
10.02 AVERAGING DIFFERENCE OR 2FO-FC MAPS

One usually does the averaging/solvent flattening iterations on normal electron density maps, but in some cases it may be desirable to average FO-FC or 2FO-FC maps. Examples might be when trying to identify inhibitors, activators etc. soaked into known crystal structures, or when trying to build up density for missing sections of the macromolecule itself. This can be accomplished by proper preparation of the input files, and changing the map type specification in the fft.d input file.

To do this one must ensure that both FO and FC are available on the INITIAL file (called phi16cy.31 in the "extndavg.sh" or "extndavg_mc.sh" scripts) used to create the first map, and on the OUTPUT file ("newphi.ref" or "newphi_N.ref," etc.) produced in each iteration. For the output files this is done by specifying IOTYP=1 in the BNDRY option 3 input ("bnd3.d" or "bnd3_N.d" etc.). The INITIAL file can be obtained from a single run of BNDRY, option 3, again specifying IOTYP=1, or from a run of PHASIT, structure factor mode specifying IHLCF=0 and ISIGA=0, depending on whether the phase information comes from MIR type calculations or from atomic coordinates for a model.

Note however, that the "anchor" phase file (called "phasit.31" or "phasit_N.31" etc.) with which the map inverted phases will be combined MUST contain FM*FO and FO in the amplitude slots along with probability distribution coefficients, as would be the case if the file was created with a normal PHASIT run in protein phasing mode (or structure factor mode if the "long format" output was requested). As long as these files are properly prepared and the appropriate coefficients are selected in the fft input, iterations using map types involving FC's will be obtainable. One must be aware however, that the final output file (phiextndavg.31) will then also have FO and FC in the amplitude slots, and thus can only be used in FSFOUR for straight or difference type Fouriers, and NOT for figure of merit weighted Fouriers.

If one is averaging with molecular replacement derived phase information and has already proceeded as described in the DENSITY MODIFICATION WITH MOLECULAR REPLACEMENT DERIVED PHASE INFORMATION section, i.e. the "doall" procedure has been run using the modified inputs, one need only change the filename "phasit.31" in the "extnd.d" input file to "anchor.31". Then averaging can proceed with the "extndavg.sh" script using all of the preexisting files, once the averaging mask, extrmap.d, mapavg.d, bldcel.d, sloext.d (and possibly extrfl.d) files are created.
10.03 SAMPLE INPUT FILES FOR AVERAGING

***** SAMPLE INPUT FILES FOR AVERAGING WITHIN ONE CRYSTAL *****

Sample input files for the averaging steps follow, along with a listing of the supplied template command files "extndavg.sh" and "extndavg.com". The command files can be used in place of the normal "extnd.sh" or "extnd.com" file in a solvent levelling run. They will perform additional cycles of averaging/solvent flattening/phase combination/extension starting with the phases in file "phi16cy.31", combining the "averaged" phases with MIR, SIR phase information in file "phasit.31", and extending phases to additional amplitudes on file "extrfl.d". They assume that all of the files needed for a normal solvent flattening run (fft.d, bnd2.d, etc.) are available, and that the third mask from a previous run (mask3.14) is still available for solvent flattening. If the template script file is to be used unchanged, then all filenames should be EXACTLY as in the examples (except for the standard parameter file). Only the data relating to submap ranges, resolution limits, number of cycles and the NC operators should be changed. The final phases will be written to file "phiextndavg.31", and printed information to "extndavg.l". The procedure is run simply by entering "sh extndavg.sh" (UNIX) or "@EXTNDAVG.COM" (VMS).

-- Sample input file extrmap.d or extrmap.dat, to extract submap ----
pdc.pam
four.map
asu.map
-.42 .45 -.45 .42 -.08 .56

--- Sample input file mapavg.d, for averaging over pure twofold -----
pdc.pam
1
asu.map
asu.msk
asu.avg
2 1 1
-102.16 83.81 180.0 1.082 -.746 .316 0.0

--- Sample input file bldcel.d, builds complete cell from averaged submap ---
pdc.pam
four.map
avgcell.map
asu.avg
asu.msk
0

--- Sample input file extnd.d, specifying phase combination data and options ---
pdc.pam
3 1 2.75 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref

Note that if one does NOT want to include phase extension, then the first "1" should be changed to zero and the line containing "extrfl.d" should be omitted (see BNDRY write-up).

--- Sample input file sloext.d, controlling no. of averaging cycles and phase extension information ---
pdc.pam
3. 2.75 8
extnd.d

Note that if one does NOT want to include phase extension, then the value of the two resolution limits should be made equal and the 8 changed to 16 to do a total of 16 averaging cycles (see SLOEXT write-up).

************* procedure extndavg.sh ***************
# MODIFIED TO INCLUDE NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extndavg.l
#
# PERFORM THE REFINEMENT/PHASE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extndavg.l
rm four.ref
#
# EXTRACT REGION FROM MAP APPROPRIATE FOR AVERAGING
extrmap < extrmap.d >> extndavg.l
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPE
mapavg < mapavg.d >> extndavg.l
rm asu.map
#
# REBUILD THE COMPLETE UNIT CELL
bldcel < bldcel.d >> extndavg.l
rm four.map asu.avg
mv avgcell.map four.map
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> extndavg.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extndavg.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES
#
bndry < extnd.d >> extndavg.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extndavg.l
#
done
#
mv newphi.ref phiextndavg.31
mv minv.ref allcoef.31
# THATS ALL

************* procedure extndavg.com ***************
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDAVG.L
$ASSIGN EXTNDAVG.L SYS$OUTPUT
$!
$! MODIFIED TO INCLUDE NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING.
$! RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
$!
$COPY PHI16CY.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
$! THIRD MASK
$!
$LOOP3:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE3
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! EXTRACT REGION FROM MAP APPROPRIATE FOR AVERAGING
$ASSIGN EXTRMAP.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPE
$ASSIGN MAPAVG.DAT FOR005
$MAPAVG
$DEASSIGN FOR005
$DELETE ASU.MAP;*
$!
$! REBUILD THE COMPLETE UNIT CELL
$ASSIGN BLDCEL.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$DELETE ASU.AVG;*
$RENAME AVGCELL.MAP FOUR.MAP
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK3
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES
$!
$ASSIGN EXTND.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP3
$!
$DONE3:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTNDAVG.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$!
$! THATS ALL

***** SAMPLE INPUT FILES FOR AVERAGING WITH MULTIPLE CRYSTALS *****

Sample input files for the averaging steps follow, along with a listing of the supplied template command files "extndavg_mc.sh" and "extndavg_mc.com". The command files can be used in place of the normal "extnd.sh" or "extnd.com" file in a solvent levelling run. They will perform 16 cycles of averaging/solvent flattening/phase extension for each crystal starting with the phases in files "phi16cy_1.31" and "phi16cy_2.31" for crystals 1 and 2, respectively, and combining the "averaged" phases with MIR, SIR phase information in files "phasit_1.31" and "phasit_2.31" for crystals 1 and 2, respectively. They also will extend phases to additional amplitudes on files "extrfl_1.d" and "extrfl_2.d" for crystals 1 and 2, respectively. The script assumes that all of the files needed for a normal solvent flattening run (fft.d, bnd2.d, etc.) are available for each crystal, and that the third mask from a previous run (mask3.14) is still available for solvent flattening in each crystal. In order to keep input data and files associated with the proper crystal, the "normal" file names should have an "_N" inserted immediately preceding the extension, i.e. fft_1.d, fft_2.d etc. would replace fft.d for crystals 1 and 2, respectively. If the template script file is to be used unchanged, then all filenames should be EXACTLY as in the examples (except for the standard parameter file). Only the data relating to submap ranges and the NC operators should be changed.
The final phases will be written to files "phiextndavg_1.31" and "phiextndavg_2.31" for crystals 1 and 2, respectively, and printed information will be written to "extndavg_mc.l". The procedure is run simply by entering "sh extndavg_mc.sh" (UNIX) or "@EXTNDAVG_MC.COM" (VMS). For each crystal it assumes that the following files exist, where "N" is replaced by the crystal number, and that the file "mapavg_mc.d" exists to control the averaging.

   phi16cy_N.31   Starting phases, to get first map
   fft_N.d        fft grid info
   extrmap_N.d    submap extraction info
   asu_N.msk      averaging mask
   bldcel_N.d     info for reconstruction of full cell map from submap
   bnd2_N.d       solvent flattening info
   mask_N.map     solvent flattening mask
   minv2_N.d      map inversion info
   extnd_N.d      phase combination info
   phasit_N.31    Anchor phases, to be combined with map inverted phases
   extrfl_N.d     Additional reflections, if phase extension requested.

Additionally, it is assumed that ALL file names (apart from the parameter files) REFERENCED WITHIN the control files above (e.g. files referred to within fft_N.d, extrmap_N.d, bldcel_N.d, bnd2_N.d, minv2_N.d and extnd_N.d) also include the appropriate "_N" insertion modifying the "standard" file names to distinguish data for different crystals. Some examples are given below.
--- Sample input files fft_1.d and fft_2.d for two crystals ---

     pdc1.pam                        pdc2.pam
     COMPUTE DENSITY MAP             COMPUTE DENSITY MAP
     0 144 80 120 1 0 20 0 0 0 0     0 80 128 120 1 0 20 0 0 0 0
     four_1.ref                      four_2.ref
     four_1.map                      four_2.map

--- Sample input files extrmap_1.d and extrmap_2.d ---

     pdc1.pam pdc2.pam four_1.map four_2.map asu_1.map asu_2.map -.42 .49 -.56 .59 -.13 .63 -.62 .77 -.31 .35 -.25 .92

--- Sample input file mapavg_mc.d, for averaging over two copies in crystal 1 and four copies in crystal 2 ---

     pdc1.pam 2 asu_1.map asu_1.msk asu_1.avg 2 1 2 78.140 95.988 179.646 0.796 -0.67 0.239 0.152 asu_2.map asu_2.msk asu_2.avg 4 1 2 3 4 77.369 83.131 179.597 9.935 3.468 2.652 0.194 163.989 115.624 -177.918 -7.586 -5.270 35.599 0.333 184.007 27.036 180.196 1.551 -.575 38.236 0.451 283.568 92.472 -28.468 -5.787 -15.019 0.729 37.247

--- Sample input files bldcel_1.d and bldcel_2.d, builds complete cell from averaged submaps for each crystal ---

     pdc1.pam                        pdc2.pam
     four_1.map                      four_2.map
     avgcell_1.map                   avgcell_2.map
     asu_1.avg                       asu_2.avg
     asu_1.msk                       asu_2.msk
     0                               0

************* procedure extndavg_mc.sh ***************

# SCRIPT FOR NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING IN THE CASE
# WHERE MULTIPLE CRYSTAL FORMS ARE USED
#
# This sample script is appropriate for the case where two crystals
# are to be averaged. It can readily be modified to include more
# crystals, by making additions in the four places as indicated.
#
# The single file "mapavg_mc.d" containing input for the mapavg
# program is assumed to be present to control the multi-crystal map
# averaging process. In addition, a series of files specific to each
# crystal is needed as described below.
#
# For each of the "N" crystals the following files are assumed to
# exist, where the "N" in the file name is to be replaced by the
# crystal number, i.e. 1, 2, 3, etc.
#
# phi16cy_N.31   Starting phases, to get first map
# fft_N.d        fft grid info
# extrmap_N.d    submap extraction info
# asu_N.msk      averaging mask
# bldcel_N.d     info for reconstruction of full cell map
# bnd2_N.d       solvent flattening info
# mask3_N.14     solvent flattening mask
# minv2_N.d      map inversion info
# extnd_N.d      phase combination info
# phasit_N.31    Anchor phases, for combining with inverted phases
# extrfl_N.d     Additional reflections, if phase extension requested
#
# Also, to distinguish data specific for each crystal all file
# names (other than the parameter files) REFERENCED WITHIN the files
# above should also have an "_N" inserted just prior to the file
# extension, where N is the crystal number.
#
# INITIALIZE TEMPORARY FILE NAMES FOR EACH CRYSTAL
#
# INITIALIZATION FOR CRYSTAL 1
cp phi16cy_1.31 newphi_1.ref
ln phasit_1.31 minv_1.ref
#
# INITIALIZATION FOR CRYSTAL 2
cp phi16cy_2.31 newphi_2.ref
ln phasit_2.31 minv_2.ref
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
# PERFORM 16 CYCLES OF PHASE EXTENSION/AVERAGING, USING THIRD
# MASK FOR EACH CRYSTAL
for cycle in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
do
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 1
rm minv_1.ref
mv newphi_1.ref four_1.ref
fsfour < fft_1.d >> extndavg_mc.l
rm four_1.ref
#
extrmap < extrmap_1.d >> extndavg_mc.l
#
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 2
rm minv_2.ref
mv newphi_2.ref four_2.ref
fsfour < fft_2.d >> extndavg_mc.l
rm four_2.ref
#
extrmap < extrmap_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 8 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
#
# HAVE ALL THE NECESSARY MAPS, NOW DO THE AVERAGING. (THIS STEP
# DONE ONLY ONCE, SINCE MAPAVG HANDLES ALL CRYSTALS AT SAME TIME)
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPES
mapavg < mapavg_mc.d >> extndavg_mc.l
#
#
#
# NOW DO THE SOLVENT FLATTENING, INVERSION AND PHASE COMBINATION
# SEPARATELY FOR EACH CRYSTAL
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 1
rm asu_1.map
bldcel < bldcel_1.d >> extndavg_mc.l
rm four_1.map asu_1.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 1
ln mask3_1.14 mask_1.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_1.d >> extndavg_mc.l
rm avgcell_1.map mask_1.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 1
mapinv < minv2_1.d >> extndavg_mc.l
rm mod_1.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 1
bndry < extnd_1.d >> extndavg_mc.l
#
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 2
rm asu_2.map
bldcel < bldcel_2.d >> extndavg_mc.l
rm four_2.map asu_2.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 2
ln mask3_2.14 mask_2.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_2.d >> extndavg_mc.l
rm avgcell_2.map mask_2.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 2
mapinv < minv2_2.d >> extndavg_mc.l
rm mod_2.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 2
bndry < extnd_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 20 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
done
#
# RENAME THE FINAL OUTPUT PHASE FILES FOR EACH CRYSTAL
#
# FOR CRYSTAL 1
mv newphi_1.ref phiextndavg_1.31
mv minv_1.ref allcoef_1.31
#
# FOR CRYSTAL 2
mv newphi_2.ref phiextndavg_2.31
mv minv_2.ref allcoef_2.31
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
# THATS ALL

************* procedure extndavg_mc.com ***************

$SET NOVERIFY
$! MODIFIED FOR NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING IN THE CASE
$! WHERE MULTIPLE CRYSTAL FORMS ARE USED
$!
$! THIS SAMPLE CONTROL FILE IS APPROPRIATE FOR THE CASE WHERE TWO
$! CRYSTALS ARE TO BE AVERAGED. IT CAN READILY BE MODIFIED TO INCLUDE
$! MORE CRYSTALS, BY MAKING ADDITIONS IN THE FOUR PLACES AS INDICATED.
$!
$! THE SINGLE FILE "MAPAVG_MC.DAT" CONTAINING INPUT FOR THE MAPAVG
$! PROGRAM IS ASSUMED TO BE PRESENT TO CONTROL THE MULTI-CRYSTAL MAP
$! AVERAGING PROCESS. IN ADDITION, A SERIES OF FILES SPECIFIC TO EACH
$! CRYSTAL IS NEEDED AS DESCRIBED BELOW.
$!
$! FOR EACH OF THE "N" CRYSTALS THE FOLLOWING FILES ARE ASSUMED TO
$! EXIST, WHERE THE "N" IN THE FILE NAME IS TO BE REPLACED BY THE
$! CRYSTAL NUMBER, I.E. 1, 2, 3, ETC.
$!
$! PHI16CY_N.31   STARTING PHASES, TO GET FIRST MAP
$! FFT_N.D        FFT GRID INFO
$! EXTRMAP_N.D    SUBMAP EXTRACTION INFO
$! ASU_N.MSK      AVERAGING MASK
$! BLDCEL_N.D     INFO FOR RECONSTRUCTION OF FULL CELL MAP
$! BND2_N.D       SOLVENT FLATTENING INFO
$! MASK3_N.14     SOLVENT FLATTENING MASK
$! MINV2_N.D      MAP INVERSION INFO
$! EXTND_N.D      PHASE COMBINATION INFO
$! PHASIT_N.31    ANCHOR PHASES, FOR COMBINING WITH INVERTED PHASES
$! EXTRFL_N.D     ADDITIONAL REFLECTIONS, IF PHASE EXTENSION REQUESTED
$!
$! ALSO, TO DISTINGUISH DATA SPECIFIC FOR EACH CRYSTAL ALL FILE
$! NAMES (OTHER THAN THE PARAMETER FILES) REFERENCED WITHIN THE FILES
$! ABOVE SHOULD ALSO HAVE AN "_N" INSERTED JUST PRIOR TO THE FILE
$! EXTENSION, WHERE N IS THE CRYSTAL NUMBER.
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDAVG_MC.L
$ASSIGN EXTNDAVG_MC.L SYS$OUTPUT
$!
$!
$! INITIALIZATION FOR CRYSTAL 1
$COPY PHI16CY_1.31 NEWPHI_1.REF
$COPY PHASIT_1.31 MINV_1.REF
$COPY MASK3_1.14 MASK_1.MAP
$!
$! INITIALIZATION FOR CRYSTAL 2
$COPY PHI16CY_2.31 NEWPHI_2.REF
$COPY PHASIT_2.31 MINV_2.REF
$COPY MASK3_2.14 MASK_2.MAP
$!
$! REPEAT ABOVE 5 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$!
$!
$! PERFORM 16 CYCLES OF PHASE EXTENSION/AVERAGING, USING THIRD
$! MASK FOR EACH CRYSTAL
$!
$CYCLE = 0
$LOOP:
$CYCLE = CYCLE + 1
$!
$! COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 1
$DELETE MINV_1.REF;*
$RENAME NEWPHI_1.REF FOUR_1.REF
$ASSIGN FFT_1.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR_1.REF;*
$!
$ASSIGN EXTRMAP_1.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 2
$DELETE MINV_2.REF;*
$RENAME NEWPHI_2.REF FOUR_2.REF
$ASSIGN FFT_2.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR_2.REF;*
$!
$ASSIGN EXTRMAP_2.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! REPEAT ABOVE 12 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$!
$!
$! HAVE ALL THE NECESSARY MAPS, NOW DO THE AVERAGING. (THIS STEP
$! DONE ONLY ONCE, SINCE MAPAVG HANDLES ALL CRYSTALS AT SAME TIME)
$!
$! AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPES
$ASSIGN MAPAVG_MC.DAT FOR005
$MAPAVG
$DEASSIGN FOR005
$!
$!
$!
$! NOW DO THE SOLVENT FLATTENING, INVERSION AND PHASE COMBINATION
$! SEPARATELY FOR EACH CRYSTAL
$!
$! REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 1
$DELETE ASU_1.MAP;*
$ASSIGN BLDCEL_1.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR_1.MAP;*
$DELETE ASU_1.AVG;*
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
$! CRYSTAL 1
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2_1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE AVGCELL_1.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 1
$ASSIGN MINV2_1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD_1.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 1
$ASSIGN EXTND_1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$!
$! REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 2
$DELETE ASU_2.MAP;*
$ASSIGN BLDCEL_2.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR_2.MAP;*
$DELETE ASU_2.AVG;*
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
$! CRYSTAL 2
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2_2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE AVGCELL_2.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 2
$ASSIGN MINV2_2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD_2.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 2
$ASSIGN EXTND_2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! REPEAT ABOVE 28 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$IF CYCLE .LT. 16 THEN GOTO LOOP
$!
$!
$! RENAME THE FINAL OUTPUT PHASE FILES FOR EACH CRYSTAL
$!
$! FOR CRYSTAL 1
$DELETE MASK_1.MAP;*
$RENAME NEWPHI_1.REF PHIEXTNDAVG_1.31
$RENAME MINV_1.REF ALLCOEF_1.31
$PURGE ALLCOEF_1.31
$!
$! FOR CRYSTAL 2
$DELETE MASK_2.MAP;*
$RENAME NEWPHI_2.REF PHIEXTNDAVG_2.31
$RENAME MINV_2.REF ALLCOEF_2.31
$PURGE ALLCOEF_2.31
$!
$! REPEAT ABOVE 6 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$! THATS ALL
$DEASSIGN SYS$OUTPUT
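
Because the multi-crystal UNIX script depends on a fixed set of "_N" file names, it can be convenient to confirm that they are all present before starting a run. The following is only a sketch of such a check; the function names "crystal_files" and "check_crystals" are hypothetical and are not part of the PHASES distribution. It tests only the files named in the header of extndavg_mc.sh, not the files referenced within the control files, and it omits extrfl_N.d since that file is needed only when phase extension is requested.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the PHASES distribution: report which of
# the input files expected by extndavg_mc.sh are missing.

# Print the file names required for crystal number $1, one per line.
# (extrfl_N.d is left out: it is needed only for phase extension.)
crystal_files() {
    n=$1
    for stem in phi16cy phasit; do
        echo "${stem}_${n}.31"
    done
    for stem in fft extrmap bldcel bnd2 minv2 extnd; do
        echo "${stem}_${n}.d"
    done
    echo "asu_${n}.msk"
    echo "mask3_${n}.14"
}

# Print every missing file for the listed crystal numbers.
# Returns 1 if anything is missing, 0 otherwise.
check_crystals() {
    missing=0
    # The averaging control file is shared by all crystals.
    if [ ! -r mapavg_mc.d ]; then
        echo "missing: mapavg_mc.d"
        missing=1
    fi
    for n in "$@"; do
        for f in $(crystal_files "$n"); do
            if [ ! -r "$f" ]; then
                echo "missing: $f"
                missing=1
            fi
        done
    done
    return $missing
}
```

With these definitions loaded, a two-crystal run could be guarded with "check_crystals 1 2 && sh extndavg_mc.sh".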
11.00 DENSITY MODIFICATION WITH MOLECULAR REPLACEMENT DERIVED PHASE INFORMATION

When the initial source of phase information is a model derived by molecular replacement techniques, it is still sometimes desirable to improve the phases by solvent flattening and/or NC symmetry averaging. This may be the case when the molecular replacement derived model represents only a fraction of the asymmetric unit contents, and the missing parts of the structure must still be found. In such cases the solvent flattening etc. iterations require creation of phase probability distribution coefficients for the partial structure model, to be used as "anchor" phases in the phase combination step. Also, it is desirable to do the iterations on 2Fo-Fc maps rather than on the normal FOM*Fo maps, since the "missing" parts of the structure will then contribute more to the maps. Thus one must ensure that both Fo and Fc are present on the phase files. This all can be accomplished without modification to the "doall" script by doing the following:

  1) Generate phases from the partial structure in PHASIT, structure factor calculation mode, requesting the "short form" output, i.e. using IHLCF=0, ISIGA=0. This file contains both Fo and Fc, and should be called "phasit.31". (Don't worry that the file does not contain distribution coefficients as a normal "phasit.31" file would; in this case the file will be used only to seed the process by creating the first map.)

  2) Generate the same phases again from the partial structure in PHASIT, structure factor calculation mode, but this time request the "long form" output, i.e. using IHLCF=1, ISIGA=0 or 1. Call this file "anchor.31". It includes probability distribution coefficients and will serve as the "anchor" phases, with which the map inverted phase information will be combined on each iteration.

  3) Modify the fft.d input file to request a 2Fo-Fc map instead of an Fo map.

  4) Modify the bnd3.d (and possibly extnd.d) input files to specify "anchor.31" instead of "phasit.31" as the anchor phase set, and set IOTYP=1 so that both Fo and Fc appear on the output file.

  5) Modify the rmhv.d input file to specify that no heavy atoms, i.e. 0 input atoms, are used.

You can now run the "doall" procedure, and 2Fo-Fc maps will be used for all calculations, including those used during mask construction. Note, however, that the output files (phi4cy.31, phi8cy.31, phi16cy.31 etc.) will now contain Fo and Fc in the amplitude slots instead of the normal fom*fo and fo, thus they can NOT be used to compute figure of merit weighted maps in FSFOUR. They can, however, be used for Fo and difference type maps. The figure of merit is still present in the file, but it will not be applied in FSFOUR.

Finally, if non-crystallographic symmetry averaging is to be performed in addition to solvent flattening, one can continue the process as described in the AVERAGING DIFFERENCE or 2FO-FC MAPS section.
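
The file-level part of this set-up is easy to get wrong (for example, leaving "phasit.31" named as the anchor set in bnd3.d). The sketch below looks only for the presence of the files named in the steps above and for the string "anchor.31" inside bnd3.d. The function name "check_mr_setup" is hypothetical, and since the layout of the control files is not assumed here, the check does not verify IOTYP, the requested map type, or the heavy atom count.

```shell
#!/bin/sh
# Hypothetical pre-flight check, NOT part of the PHASES distribution.
# It tests file presence and one file-name reference only; it does not
# parse the control files.
check_mr_setup() {
    bad=0
    for f in phasit.31 anchor.31 fft.d bnd3.d rmhv.d; do
        if [ ! -r "$f" ]; then
            echo "missing: $f"
            bad=1
        fi
    done
    # bnd3.d should name anchor.31 as the anchor phase set.
    if [ -r bnd3.d ] && ! grep -q 'anchor\.31' bnd3.d; then
        echo "bnd3.d does not mention anchor.31"
        bad=1
    fi
    return $bad
}
```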
12.00 PHASE EXTENSION

Phases (and optionally amplitudes) can be extended, either to higher resolution or to missing reflections within the original resolution limit, by modifying an electron density map created with the known phases, inverting the modified map, and combining the map inverted structure factors with the initial data via option 3 of the BNDRY program.

To extend phases to reflections for which amplitudes are available, a file must first be created using the program MISSNG, which generates a list of reflections (file "extrfl.d") for which at least amplitudes, and possibly phase information, are available. Then the file "sloext.d" must be created (see write-up for SLOEXT) to control the limits and rate of phase extension, and the file "extnd.d" or "extnda.d" must be created to control the phase combination step in BNDRY. Once these files are created, and assuming that the final solvent mask (mask3.14), phase files (phasit.31 & phi16cy.31), and input data files (e.g. fft.d, minv2.d, bnd2.d) from a prior "doall" solvent flattening run are still available, one can execute the "extnd.sh" script to carry out the phase extension iterations. The resolution is gradually extended out to the limit specified in the sloext.d file, with the final phases written to "phiextnd.31". One can do phase AND AMPLITUDE extension similarly, by preparing "extnda.d" and "sloext2.d" files, and executing "extnda.sh".

The phase extension process is even more powerful if density modification in addition to solvent flattening/negative density truncation is included, such as noncrystallographic symmetry averaging. In such cases the script "extndavg.sh" can be used, which requires all of the files utilized by "extnd.sh", plus the "extrmap.d", "mapavg.d", "bldcel.d" and "asu.msk" files needed for NC symmetry averaging (see the noncrystallographic symmetry section of the write-up).

When doing phase extension to higher resolution it is important to compute the map on a grid whose spacing is NO COARSER than one third of the smallest d spacing to be encountered anywhere in the process, and to specify Miller index ranges during map inversion ("minv2.d" file) which satisfy the highest resolution desired. Note that if the extension is substantial, this may require regenerating the solvent mask (and therefore the averaging mask "asu.msk") on a finer grid than was originally used. Best results are obtained when the extension is carried out slowly, with at least 5 iterations per extension step.

Sample scripts and template files are provided for both UNIX systems (as described here) and VMS systems (with the corresponding data files having ".dat" extensions and the control files having ".com" extensions). Samples are also given in the EXAMPLES section.
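
As a worked example of the grid requirement, reading it as "grid spacing no larger than one third of the smallest d spacing", the minimum number of grid divisions along a cell edge of length a is the smallest integer n with a/n <= d_min/3, i.e. n >= 3a/d_min. The sketch below does this arithmetic; the function name "min_grid" and the cell edge used are hypothetical. The FFT program may impose its own restrictions on allowable grid sizes, which are not addressed here, so the result should be treated as a lower bound only.

```shell
#!/bin/sh
# Smallest number of grid divisions along a cell edge such that the grid
# spacing (edge / divisions) does not exceed one third of d_min.
# Usage: min_grid EDGE_IN_ANGSTROMS D_MIN_IN_ANGSTROMS
min_grid() {
    awk -v a="$1" -v d="$2" 'BEGIN {
        x = 3 * a / d
        n = int(x)
        if (n < x) n++      # round up to the next integer
        print n
    }'
}

# Hypothetical cell edge of 100 A, extending phases out to 2.5 A.
min_grid 100 2.5    # prints 120
```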
13.00 MAD PHASING

The PHASES package can be used to determine phase angles from MAD (Multiple wavelength Anomalous Dispersion) data, by treating the data from each wavelength as a native anomalous scattering, isomorphous replacement or derivative anomalous scattering data set, and then combining information from all sets in the conventional manner. This is facilitated by the ability to input scattering factor information to the PHASIT program. One simply adjusts the input scattering factors and data appropriately for the wavelength and data set type desired. For example, consider the case where data has been measured at three wavelengths, and Bijvoet pairs were measured in all sets. A reasonable strategy would be to:

  1) Select one of the data sets to be the "native." For this set we would prefer no Bijvoet signal to be present, so we might pick a wavelength where delta f" is near zero. Note, however, that we can use any wavelength, even if delta f" is large, provided we first AVERAGE both members of the Bijvoet pair for each acentric reflection. Indeed, it is often desirable to choose a wavelength where delta f' is near zero, even though delta f" may then be appreciable, since its effects can be reduced or removed by the averaging. Thus if delta f" is large be sure to include acentric reflections ONLY IF BOTH MEMBERS OF THE BIJVOET PAIR WERE EXPLICITLY MEASURED AND AVERAGED! This is the data set that will actually be phased and eventually used for map calculations, thus it should include the centric reflections as well. All other sets will be scaled to it. Since the REAL part of the anomalous scattering correction is NOT removed by the averaging, we will need to know what delta f' is at this wavelength. Let the real and imaginary components of the anomalous dispersion scattering factor corrections at this wavelength be called delta f'(N) and delta f"(N).

  2) Select another data set at a wavelength D1 which maximizes the magnitude of ( delta f'(D1) - delta f'(N) ). For this set, average the Bijvoet pairs for all acentric reflections as in (1), to remove the contribution from delta f"(D1), and include the centric data as well. This set can then be merged with the "native" in CMBISO to form an "isomorphous" derivative scaled set. It can then be used in PHASIT as an SIR data set, but since the only difference in the scattering between this and the "native" is due to differences in delta f', you must input the appropriate scattering factors. Thus input zeroes for the 9 normal scattering factor coefficients, but input ( delta f'(D1) - delta f'(N) ) for the REAL part, and delta f"(D1) for the IMAGINARY part of the anomalous correction. Be careful of the sign when doing the subtraction. (Note that the delta f"(D1) term will not be used in the SIR calculation, thus it does not have to be input as zero, which you might have expected.)

  3) Select another data set at a wavelength D2 which maximizes delta f"(D2). For this set, DO NOT average the Bijvoet pairs. Simply merge the data with the "native" set (created in (1)) with CMBANO to generate a scaled "anomalous" set. The output data can then be used in PHASIT as a "derivative anomalous scattering" data set, but since the difference in scattering between this and the "native" is due to both the difference in delta f' and the effect of delta f"(D2), you must again adjust the scattering factors accordingly. Input zeroes for the 9 normal scattering factor coefficients, and input ( delta f'(D2) - delta f'(N) ) for the REAL and delta f"(D2) for the IMAGINARY parts of the anomalous scattering correction. Again, be careful of the sign when doing the subtraction. In this case both the real and imaginary components will be used.
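
Since the sign of the subtraction is easy to get wrong, the arithmetic for the REAL and IMAGINARY correction terms can be sketched as follows. The function name "mad_correction" is hypothetical, and the numerical values are invented purely for illustration; real values must come from the scattering factors appropriate to the element and wavelengths actually used.

```shell
#!/bin/sh
# Print the REAL and IMAGINARY anomalous correction terms to enter in PHASIT
# for a data set D that has been scaled against the "native" set N:
#   real = delta f'(D) - delta f'(N),   imag = delta f"(D)
# Usage: mad_correction FPRIME_D FPRIME_N FDOUBLEPRIME_D
mad_correction() {
    awk -v fpd="$1" -v fpn="$2" -v fppd="$3" 'BEGIN {
        printf "real=%.2f imag=%.2f\n", fpd - fpn, fppd
    }'
}

# Hypothetical values, for illustration only:
#   native N : delta f' = -2.50
#   set D1   : delta f' = -9.75, delta f" = 3.00
mad_correction -9.75 -2.50 3.00    # prints real=-7.25 imag=3.00
```

Here the real part is negative, which is the situation in which the scatterer peaks in an isomorphous difference Fourier are expected to be negative.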
The "isomorphous" and/or "anomalous" scaled files prepared in (2) and (3) can initially be used in the normal manner to locate the anomalously scattering atoms from difference or Bijvoet difference Patterson maps (see flowchart section), and possibly for initial heavy atom refinement in GREF. Note, however, that if one is using "isomorphous" data sets as described in (2) above with program MRGDF to compute difference or cross difference Fouriers, and the sign of ( delta f'(D1) - delta f'(N) ) is negative, then peaks in the map corresponding to the isomorphous scatterers should also be negative. In that case one can request that program PSRCH list only negative peaks to check the sites. This is necessary ONLY for ISOMORPHOUS data sets in which the REAL part of the derivative minus native scattering factor is expected to be negative. The same thing holds for "difference" or "double difference" Fouriers computed with the difference files generated by program PHASIT when MAD data sets are used and the file corresponds to an ISOMORPHOUS data set with ( delta f'(D) - delta f'(N) ) negative.

Once the anomalous scatterers have been found, the isomorphous and anomalous scaled data from (2) and (3) can be used simultaneously in PHASIT to compute SIRAS phases, which can then be used for map computations or solvent flattening in the normal manner. One should first carry out phase refinement in PHASIT, refining the heavy atom parameters and scale factors. If the wavelengths were chosen as described above, these two sets should provide the greatest phasing power, since the isomorphous and anomalous signals were maximized in each case.

It is not necessarily all that can be done, however. For example, the same data set (averaged Bijvoet mates) which was utilized in (2) to create an "isomorphous" set can also be processed (without averaging Bijvoet mates) as in (3) to create another "derivative anomalous scattering" set. In that case the appropriate anomalous scattering correction factors would be ( delta f'(D1) - delta f'(N) ) and delta f"(D1). Likewise, the data set (unaveraged Bijvoet mates) used in (3) to get the derivative anomalous set can also be processed (after averaging Bijvoet mates) as in (2) to get another "isomorphous" set. For the new isomorphous set the appropriate anomalous scattering correction factors would be ( delta f'(D2) - delta f'(N) ) and delta f"(D2). Finally, if the original "native" data set was collected at a wavelength where delta f"(N) is appreciable, then it too can be included (without averaging Bijvoet mates) as a "native anomalous scattering" data set. In that case the appropriate anomalous scattering correction factors would be delta f'(N) and delta f"(N). (Note that the real part will not be used in the calculations, and that in PHASIT native anomalous scattering data sets should come last in the input.)

Thus with data at three wavelengths, one can combine up to 5 different sources of phase information in PHASIT: two isomorphous sets, two derivative anomalous scattering sets, and one native anomalous scattering set. Although some of these sets provide essentially the same (redundant) information, the experimental errors will be different in each set, thus inclusion of all of them may still be helpful. It may be useful to try various combinations.

MAD PHASING AT TWO WAVELENGTHS

A procedure similar to that above can be used when anomalous scattering data has been collected at only two wavelengths. In that case a "native" set is selected and processed as in (1), to remove contributions from delta f"(N). The other data set is first processed and merged with the native as in (2) to create an "isomorphous" set, and then processed and merged again (this time NOT averaging Bijvoet mates) as in (3) to create a "derivative anomalous scattering" set. If the "native" set was taken at a wavelength where delta f"(N) is appreciable, then the original "native" data (this time WITHOUT averaging Bijvoet mates) can also be used as a "native anomalous scattering" data set. Thus even with data at only two wavelengths it is possible to obtain and combine phase information from three sources: native anomalous scattering, derivative isomorphous replacement and derivative anomalous scattering.
14.00 VMS USER INFORMATION

Use of the PHASES package on VMS systems is very similar to its use on UNIX systems, except that command files (".com") are used instead of ".sh" shell scripts. In all cases the programs function identically, and all input and output is the same. A command procedure to be executed from each user's login.com file will define all of the programs so that they can be run simply by entering the program name (as in UNIX systems). However, one cannot use the UNIX input and output redirection operators (<, <<, >, >>), so for the non-interactive programs input data must either immediately follow the program name (on subsequent lines), or come from a file which has been "ASSIGNed" to FOR005 (and "DEASSIGNed" upon program completion). Likewise, to direct standard output to a file one must ASSIGN the file to FOR006.

Also, the different byte order and floating point format make it difficult, if not impossible, to use any binary files on other computer systems. This is not normally a problem, since the only binary files which typically need to be transferred are graphics map files for use with programs TOM, O or CHAIN on graphics workstations. The VMS version of program GMAP, which creates these files, contains special code such that the binary map files it produces may be transferred (via ftp, type binary) and used DIRECTLY on the Silicon Graphics or ESV workstations where the graphics programs will be run.

Installing the PHASES package on a VMS system involves four steps. The first three steps should be done only once, whereas the last step must be repeated any time a new user of the package is added to the computer system. The steps are:

  1) Edit the file "set_phases.com", which is present in the parent directory as distributed. Only one line has to be changed, so that the logical name "PHASES_DIR" points to the parent directory where the software resides.

  2) If one is on a DEC or VAX station instead of an ALPHA workstation, one MAY have to edit the file "XSTUFF.OPT" (in the [.src] directory below the parent directory) to point to the directory where the system's X-Window object libraries are located. The supplied version is appropriate for ALPHA workstations, but may need to be changed for other workstations.

  3) From the parent directory, type

          @BUILDIT.COM

     to invoke the compilation and linking.

  4) Have each user of the package insert the line

          $@DISK:[DIRECTORY]SET_PHASES.COM

     in his/her login.com file. Note that the DISK and DIRECTORY should be changed to point to the appropriate PHASES parent directory as in (1).

The installation will result in files being deposited in four subdirectories below the parent directory. The subdirectories contain the program source files, executables, write-up and sample template files. The four directories have the logical names PHASES_SRC, PHASES_EXE, PHASES_DOC and PHASES_TEMPL, respectively. Users may find it desirable to copy the write-up and sample template files to their own directory, thus the commands

          COPY PHASES_DOC:PHASES.WUP *
          COPY PHASES_TEMPL:*.com *
          COPY PHASES_TEMPL:*.DAT *

will probably be useful. It is also recommended that at least one copy of the PHASES.WUP manual be printed, but beware as it is large (roughly 190 pages).

After the installation and execution of the login.com file, any program in the package can be run simply by typing its name. Note that if running in batch mode, one will have to insert the line

          $SET DEFAULT DISK:[DIRECTORY]

at the beginning of every ".com" file, where DISK and DIRECTORY are replaced by the user's working disk and directory, so that the user's input data files can be found.
15.00 For UNIX, the following scripts are invoked by the doall script

--- procedure mask1.sh ---

# COMPUTE ORIGINAL ELECTRON DENSITY MAP
#
ln phasit.31 four.ref
fsfour < fft.d > mask1.l
mv four.map orig.map
rm four.ref
ln orig.map four.map
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
#
rmheavy < rmhv.d >> mask1.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask1.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask1.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask1.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask1.l
mv mask.map mask1.14
rm four.map
# THATS ALL

--- procedure cycle4.sh ---

#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
#
# use .06 for 3A data and .086 for 3.5 and .112 for 4.A
mv orig.map four.map
ln mask1.14 mask.map
bndry < bnd2.d > cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
#
#
# PERFORM 3 MORE CYCLES OF REFINEMENT
#
for cycle in 1 2 3
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle4.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask1.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
done
mv newphi.ref phi4cy.31
mv minv.ref allcoef.31
# THATS ALL

--- procedure mask2.sh ---

#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi4cy.31 four.ref
fsfour < fft.d > mask2.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask2.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask2.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask2.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask2.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask2.l
mv mask.map mask2.14
rm four.map
# THATS ALL

--- procedure cycle8.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 4 CYCLES OF REFINEMENT, USING SECOND MASK
#
for cycle in 1 2 3 4
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle8.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask2.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> cycle8.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle8.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle8.l
done
mv newphi.ref phi8cy.31
mv minv.ref allcoef.31
# THATS ALL

--- procedure mask3.sh ---
#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi8cy.31 four.ref
fsfour < fft.d > mask3.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask3.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask3.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask3.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask3.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask3.l
mv mask.map mask3.14
rm four.map
# THATS ALL

--- procedure cycle16.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 8 CYCLES OF REFINEMENT, USING THIRD MASK
#
for cycle in 1 2 3 4 5 6 7 8
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle16.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> cycle16.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle16.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle16.l
done
mv newphi.ref phi16cy.31
mv minv.ref allcoef.31
# THATS ALL

--- procedure extnd.sh ---
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extnd.l
#
# PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
# THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnd.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> extnd.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnd.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnd.d >> extnd.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extnd.l
#
done
#
mv newphi.ref phiextnd.31
mv minv.ref allcoef.31
# THATS ALL

--- procedure extnda.sh ---
#
# RESUME WHERE WE LEFT OFF
#
cp phiextnd.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext2.d > extnda.l
#
# PERFORM THE PHASE AND AMPLITUDE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnda.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A, .086 for 3.5A and .112 for 4A
bndry < bnd2.d >> extnda.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnda.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnda.d >> extnda.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext2.d >> extnda.l
#
done
#
mv newphi.ref phiextnda.31
mv minv.ref allcoef.31
# THATS ALL
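
The extension procedures above are controlled by the presence of the file EXTND.TMP rather than by a loop counter. The following is only a minimal sketch of that control pattern, runnable without the PHASES programs. The function fake_sloext and the variable max_iter are hypothetical stand-ins introduced for illustration; the real sloext takes its limits from sloext.d, and the contents of the real EXTND.TMP are not assumed here.

```shell
#!/bin/sh
# Sketch only: fake_sloext is a hypothetical stand-in for sloext,
# NOT the real program. It keeps a simple counter in EXTND.TMP.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

max_iter=3     # assumed iteration limit (really set through sloext.d)

fake_sloext() {
    # First call: create the control file with a zero iteration count.
    if [ ! -r EXTND.TMP ]; then
        echo 0 > EXTND.TMP
        return
    fi
    # Later calls: bump the count; delete the file once the limit is hit.
    count=$(cat EXTND.TMP)
    count=$((count + 1))
    if [ "$count" -ge "$max_iter" ]; then
        rm EXTND.TMP
    else
        echo "$count" > EXTND.TMP
    fi
}

iterations=0
fake_sloext                 # creates EXTND.TMP, like the first sloext call
while test -r EXTND.TMP
do
    # fsfour, bndry, mapinv and bndry would run here in the real procedure
    iterations=$((iterations + 1))
    fake_sloext             # adjusts the count, may delete EXTND.TMP
done
echo "iterations run: $iterations"
```

Because the loop test only checks for the file, removing EXTND.TMP by hand should also end the run after the cycle in progress, provided the program that maintains the file does not recreate it.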
16.00 For VMS, the following procedures are invoked by doall.com

--- procedure mask1.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK1.L
$ASSIGN MASK1.L SYS$OUTPUT
$!
$! COMPUTE ORIGINAL ELECTRON DENSITY MAP
$!
$COPY PHASIT.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$COPY FOUR.MAP ORIG.MAP
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$RENAME MASK.MAP MASK1.14
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure cycle4.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE4.L
$ASSIGN CYCLE4.L SYS$OUTPUT
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$RENAME ORIG.MAP FOUR.MAP
$COPY MASK1.14 MASK.MAP
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! PERFORM 3 MORE CYCLES OF REFINEMENT
$!
$CYCLE = 0
$LOOP:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 3 THEN GOTO LOOP
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI4CY.31
$RENAME MINV.REF ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure mask2.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK2.L
$ASSIGN MASK2.L SYS$OUTPUT
$!
$! COMPUTE ORIGINAL ELECTRON DENSITY MAP
$!
$COPY PHI4CY.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$REN MASK.MAP MASK2.14
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure cycle8.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE8.L
$ASSIGN CYCLE8.L SYS$OUTPUT
$!
$! START OVER USING NEW MASK
$!
$COPY PHASIT.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK2.14 MASK.MAP
$!
$! PERFORM 4 CYCLES OF REFINEMENT USING 2ND MASK
$!
$CYCLE = 0
$LOOP1:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK2
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 4 THEN GOTO LOOP1
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI8CY.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure mask3.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK3.L
$ASSIGN MASK3.L SYS$OUTPUT
$!
$! COMPUTE ORIGINAL ELECTRON DENSITY MAP
$!
$COPY PHI8CY.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$REN MASK.MAP MASK3.14
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure cycle16.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE16.L
$ASSIGN CYCLE16.L SYS$OUTPUT
$!
$! START OVER USING NEW MASK
$!
$COPY PHASIT.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! PERFORM 8 CYCLES OF REFINEMENT USING 3RD MASK
$!
$CYCLE = 0
$LOOP2:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 8 THEN GOTO LOOP2
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI16CY.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure extnd.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTND.L
$ASSIGN EXTND.L SYS$OUTPUT
$!
$! RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
$!
$COPY PHI16CY.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
$! THIRD MASK
$!
$LOOP3:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE3
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES INPUT ON UNIT 16
$!
$ASSIGN EXTND.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP3
$!
$DONE3:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTND.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT

--- procedure extnda.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDA.L
$ASSIGN EXTNDA.L SYS$OUTPUT
$!
$! RESUME WHERE WE LEFT OFF
$!
$COPY PHIEXTND.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT2.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
$! THIRD MASK
$!
$LOOP4:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE4
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION,
$! EXTEND PHASING TO ADDITIONAL AMPLITUDES INPUT ON UNIT 16, AND
$! EXTEND PHASES AND AMPLITUDES TO ANY O
$!
$ASSIGN EXTNDA.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT2.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP4
$!
$DONE4:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTNDA.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT