TABLE OF CONTENTS
GENERAL INFORMATION ..................... intro
REFERENCING THE PHASES PACKAGE .......... 0.00
GETTING STARTED ......................... 1.00
Accessing on line documentation ....... 1.01
Template scripts and files ............ 1.02
Flow Charts ........................... 1.03
File Formats .......................... 1.04
PROGRAM WRITEUPS ........................ 2.00
Phasit ................................ 2.01
Bndry ................................. 2.02
Fsfour ................................ 2.03
Mapinv ................................ 2.04
Pamfile ............................... 2.05
Mapview ............................... 2.06
Gmap .................................. 2.07
Missng ................................ 2.08
Mrgdf ................................. 2.09
Mrgbdf ................................ 2.10
Rd31 .................................. 2.11
Mk31b ................................. 2.12
Psrch ................................. 2.13
Cmbiso ................................ 2.14
Cmbano ................................ 2.15
Topdel ................................ 2.16
Gref .................................. 2.17
Import ................................ 2.18
Extrmap ............................... 2.19
Extrmsk ............................... 2.20
Mapavg ................................ 2.21
Maporth ............................... 2.22
Lsqrot ................................ 2.23
Lsqrotgen ............................. 2.24
Skew .................................. 2.25
Bldcel ................................ 2.26
Mdlmsk ................................ 2.27
Mrgmsk ................................ 2.28
Trnmsk ................................ 2.29
Rdhead ................................ 2.30
Precess ............................... 2.31
O_to_SP ............................... 2.32
Xpl_phi ............................... 2.33
Pdb_cds ............................... 2.34
Rmheavy ............................... 2.35
Ctour ................................. 2.36
Viewplt ............................... 2.37
Plttek ................................ 2.38
Mkpost ................................ 2.39
Pstats ................................ 2.40
Hndchk ................................ 2.41
Sloext ................................ 2.42
EXAMPLES ................................ 3.00
Pamfile ............................... 3.01
Initial phasing ....................... 3.02
Solvent levelling ..................... 3.03
Doall scripts ......................... 3.04
Expected output ....................... 3.05
NATIVE, DIFFERENCE AND "CALCULATED"
PATTERSON MAPS .......................... 4.00
REFINING HEAVY ATOM PARAMETERS .......... 5.00
HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE
AND CROSS DIFFERENCE FOURIER MAPS ....... 6.00
CREATING/EDITING SOLVENT MASKS .......... 7.00
INCORPORATION OF PARTIAL STRUCTURES ..... 8.00
REDUCED BIAS NATIVE, COMBINED AND
DIFFERENCE FOURIER MAPS ................. 9.00
INCORPORATION OF NONCRYSTALLOGRAPHIC
SYMMETRY AVERAGING ..................... 10.00
Averaging with Multiple Crystals ..... 10.01
Averaging Difference or 2FO-FC Maps .. 10.02
Sample Input Files for Averaging ..... 10.03
DENSITY MODIFICATION WITH MOLECULAR
REPLACEMENT DERIVED PHASE INFORMATION .. 11.00
PHASE EXTENSION ........................ 12.00
MAD PHASING ............................ 13.00
VMS USER INFORMATION ................... 14.00
UNIX SHELL SCRIPTS ..................... 15.00
VMS COMMAND PROCEDURES ................. 16.00
PHASES
PHASES is a package of computer programs designed to compute phase
angles for diffraction data from macromolecular crystals. The package
is complete in that it contains programs for the following: merging
and scaling of native and derivative data sets; analyzing difference
statistics; computing Patterson and electron density maps; searching
for peaks; refining heavy atoms (or protein domains as rigid groups);
computing phases by MIR (multiple isomorphous replacement), SIR
(single isomorphous replacement), SAS (single wavelength anomalous
scattering), SIRAS (single isomorphous replacement supplemented with
anomalous scattering), MIRAS (multiple isomorphous replacement
supplemented with anomalous scattering) or from atomic coordinates for
an input model; noncrystallographic symmetry averaging; combining
phases from a partial structure with MIR etc phases; computation and
analysis of cross difference or Bijvoet difference Fourier maps; and
phase extension and refinement.
Once an initial set of phases is generated, programs are included
to improve them by carrying out solvent levelling with negative
density truncation and/or combination with model based phase
information and/or averaging over noncrystallographic symmetry.
Solvent levelling is facilitated by the automatic protein-solvent
boundary determination method (Wang, in Methods in Enzymology 115,
1985) which is implemented here entirely in reciprocal space in a much
more efficient manner than in previous programs. If applied to SIR or
SAS starting phases, the programs can also carry out the ISIR or ISAS
phasing procedures described by Wang. The programs are written
in FORTRAN 77 with the exception of a single C interface subroutine
(facilitating use of X-Window graphics in some programs) and are
applicable to any space group without requiring changes by the user.
The program package was written by W. Furey, VA Medical Center and
University of Pittsburgh, Dept. of Crystallography. The package
consists of 5 major programs and many utility programs as follows:
PROGRAM FUNCTION
PHASIT.F Computes MIR, SIR, MODEL etc phases from
input atomic parameters and diffraction
data. Can refine heavy atoms or derivative
scaling parameters in "phase refinement"
mode.
BNDRY.F Computes coefficients for automatic
boundary determination. Determines
protein-solvent boundary mask, flattens
solvent and applies negative density
truncation, combines phases from external
sources (map inversion or from partial
structures) with original phase information,
extends phases to higher resolution.
FSFOUR.F Space group general 3D FFT program for
electron density calculations.
MAPINV.F Space group general 3D FFT program for
structure factor calculations.
* MAPVIEW.F Interactive contouring/map viewing program.
Allows user to view maps and masks, and
trace/edit solvent or averaging masks.
CTOUR.F Creates contoured plots from FSFOUR maps,
either as individual sections, mono or
stereo projections. Plots can be viewed
directly or converted to PostScript.
GMAP.F Extracts region from a FSFOUR map and
creates corresponding maps for the graphics
programs TOM, O or CHAIN. Also can create
skeleton files for TOM or O.
MISSNG.F Selects reflections for phase extension.
MRGDF.F Generates coefficients for isomorphous
difference Fourier or cross Fourier.
MRGBDF.F Generates coefficients for Bijvoet
difference Fourier or cross Fourier.
RD31.F Converts internal binary file to ASCII
for examination and/or editing.
MK31B.F Restores ASCII version of file to binary.
PSRCH.F Searches Fourier map and lists unique peaks.
CMBISO.F Combines native and derivative isomorphous
replacement data into one file and scales
the derivative data to the native.
CMBANO.F Combines native data and derivative
anomalous scattering data into one file
and scales the derivative data to the native.
TOPDEL.F Examines isomorphous/anomalous scattering
differences, identifies and rejects outliers,
prepares file for difference Pattersons.
GREF.F Refines heavy atom parameters against
isomorphous or anomalous scattering
differences; refines protein domains,
substructures etc as rigid groups against
native data.
RMHEAVY.F Temporarily removes density in map from heavy
atoms, to aid in accurate solvent mask
generation.
IMPORT.F Allows user to introduce his own phases and
Hendrickson-Lattman coefficients (computed
by external programs) into the PHASES package
for subsequent calculations. This allows one
to bypass the PHASIT program.
XPL_PHI.F Creates input reflection file for XPLORE
from a PHASES style phased file.
* PRECESS.F Lets one construct and interactively examine
"pseudo" precession or "pseudo" difference
precession photos made from reflection files.
* VIEWPLT.F Displays up to 10 plots created by CTOUR
on workstation or X-Window capable monitor.
PLTTEK.F Displays plots created by CTOUR on terminals
capable of using TEKTRONIX 4010 emulation.
MKPOST Converts plots created by CTOUR to PostScript.
PDB_CDS.F Converts coordinate files between PDB and
PHASES formats, in either direction.
EXTRMAP.F Extracts a region (submap) from the standard
FSFOUR map for use in averaging, skewing etc.
EXTRMSK.F Extracts a region (submap) from the standard
solvent mask for possible editing, skewing
etc.
MAPAVG.F Averages one or more maps to impose non-
crystallographic symmetry.
MAPORTH.F Orthogonalizes non-orthogonal map (and
optionally mask) for use in refinement of
noncrystallographic symmetry operator.
LSQROT.F Refines purely rotational noncrystallographic
symmetry operator against electron density.
LSQROTGEN.F Refines general noncrystallographic symmetry
operator (arbitrary rotational angle, with
translation) against electron density.
SKEW.F Skews a map (and optionally a mask) to
a new, and arbitrarily oriented cell.
BLDCEL.F Rebuilds a complete unit cell map (and
optionally, mask) from an input asymmetric
unit submap (and optionally, mask).
MDLMSK.F Creates a mask from coordinates in an input
atomic model, for use in averaging, NC
symmetry operator refinement or use in
solvent flattening.
MRGMSK.F Merges multiple masks created by MDLMSK
into a single mask.
TRNMSK.F Transforms mask created in a "skewed"
cell back to the normal cell.
HNDCHK.F Interpolates density from a map at
specified sites, usually for the purpose
of determining the proper hand.
SLOEXT.F Controls number of iterations and rate of
phase extension to higher resolution.
RDHEAD.F Dumps header from averaging map (submap) or
mask files for examination.
O_TO_SP.F Extracts spherical polar angles and
axis location for use in PHASES from
rotation matrix/translation vector
produced by program "O".
PSTATS.F Tabulates mean phase difference between
two phase sets as a function of d spacing.
* Two versions are provided for the interactive graphical programs
marked with an asterisk (MAPVIEW, PRECESS and VIEWPLT): one for
Silicon Graphics hardware, which uses GL, and the other (called
MAPVIEW_X, PRECESS_X, VIEWPLT_X), which can run on any system with a
color monitor supporting the X-Window protocol (which also includes
SGI).
Versions of the package are supplied for Silicon Graphics, Sun,
IBM R6000, ESV and DEC ALPHA (both OSF and OPENVMS) workstations,
but it should be easy to port to other machines. It generally will not
be necessary to modify the programs as they are "self adjusting"
and can accommodate various sized problems. However, in the
unlikely event that modifications are necessary, messages indicating
what should be changed will be printed (it will never involve more
than 3 lines of code). Each program is independent and can be run in
standalone mode, and individual write-ups are supplied for each.
Usually, however, a command file is submitted to invoke an entire
sequence of program executions constituting a complete phasing
application. Template command files are given to carry out the entire
phasing process for both UNIX and VMS based computers.
Comments and/or inquiries should be made to:
Dr. William Furey
Biocrystallography Laboratory
PO BOX 12055
VA Medical Center
University Drive C
Pittsburgh, Pa 15240
tel (412) 683-9718 E-Mail fureyw@VMS.CIS.PITT.EDU
The documentation is organized as follows. First, a description of
the necessary files is provided, followed by an overview of the
phasing process. Flow charts then indicate how the programs can be
used to carry out frequently needed tasks. Individual program
write-ups follow, which include descriptions of the computations
performed by the major programs. Sample input files are then given
both for routine phasing and for solvent flattening, followed by
descriptions of procedures for refining heavy atoms, creating/editing
solvent masks, incorporating partial structure information,
noncrystallographic symmetry averaging, phase extension and MAD
phasing. Finally, a listing of all template command procedures for
both UNIX and VMS systems is given.
0.00 REFERENCING THE PHASES PACKAGE
When publishing results obtained from use of the software, a
statement should be included like "all heavy atom refinement, phasing,
solvent flattening, noncrystallographic symmetry averaging, map
calculations etc. (or whatever is appropriate) were carried out with
the PHASES package (Furey & Swaminathan, 1995)." This refers to the
following paper:
"PHASES-95: A Program Package for the Processing and Analysis of
Diffraction Data from Macromolecules", W. Furey & S. Swaminathan,
in MACROMOLECULAR CRYSTALLOGRAPHY, a volume of Methods in Enzymology,
eds. C. Carter & R. Sweet, Academic Press, Orlando, Fl. (1996), in
press.
1.00 GETTING STARTED
The first thing to do is to prepare an input parameter file
specifying the cell constants, symmetry information etc. This file is
referred to as the "standard parameter file" throughout the PHASES
package, and is often called "PAMFIL" generically in specific program
writeups. One should select a name for it which is indicative of the
particular structure being worked on, and rapidly communicates to the
user that it is a parameter file. For example, PDC.PAM might be a good
choice for phasing pyruvate decarboxylase. The main purpose of this
file is to ensure consistency in cell constants, symmetry, lattice
type etc throughout all programs, and to eliminate redundant input of
these parameters by the user. In addition one can optionally specify
the name of a "running log file." If this is done then in addition to
normal output to either the screen or individual log files for each
program, a copy of all printed output is also appended to a single
file, preceded by a time stamp indicating what program was run and
when. Thus one can maintain a complete history of all computations
and results in a single log file.
Each standard parameter file should contain the following
information in the indicated sequence.
LOGFILE=FILENAME               Where FILENAME is the name of the
                               desired "running" log file. If no
                               cumulative log is desired, enter
                               LOGFILE=NULL
                               There must be no spaces immediately
                               preceding or following the "=". Upper
                               or lower case is permitted.

LATTICE=X                      Where "X" is either P, A, B, C, I, F or R.
                               There must be no spaces immediately
                               preceding or following the "=". Upper
                               or lower case is permitted for the word
                               LATTICE, but only UPPER case for the
                               single character symbol.

A, B, C, ALPHA, BETA, GAMMA    Unit cell constants, in angstroms and
                               degrees. Readable in free format, i.e.
                               at least one blank or comma separating
                               entries.

NSYM                           Number of equivalent positions in
                               the space group. Do NOT include
                               additional translations associated
                               with centering conditions for
                               non-primitive lattices, i.e. for
                               space group C2 NSYM=2. (This entry is
                               read in free format.)
The NSYM symmetry operators follow, one operator per line EXACTLY
as indicated in the International Tables for X-Ray Crystallography.
The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral
lattices the HEXAGONAL CELL AND SYMMETRY OPERATORS SHOULD BE USED,
along with the lattice type R.
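As an illustration of what an operator line encodes, the following Python sketch turns an operator string into integer coefficients plus a fractional translation and applies it to a fractional coordinate. It is not part of PHASES; the names parse_symop and apply_symop are hypothetical, and the parser assumes each term is either a bare X, Y or Z (optionally negated) or a numeric translation such as 1/2.

```python
from fractions import Fraction


def parse_symop(text):
    """Parse e.g. '1/2-X,-Y,1/2+Z' into (rows, shifts).

    rows[i] holds the integer coefficients of X, Y, Z for output
    coordinate i; shifts[i] is its fractional translation.
    """
    rows, shifts = [], []
    for part in text.upper().replace(" ", "").split(","):
        coeffs = [0, 0, 0]
        shift = Fraction(0)
        # Give every term an explicit sign, then split on '+'.
        for term in part.replace("-", "+-").split("+"):
            if not term:
                continue
            axis = term.lstrip("-")
            if axis in ("X", "Y", "Z"):
                coeffs["XYZ".index(axis)] += -1 if term.startswith("-") else 1
            else:
                shift += Fraction(term)  # e.g. '1/2' or '-1/2'
        rows.append(coeffs)
        shifts.append(shift)
    if len(rows) != 3:
        raise ValueError("expected three comma-separated components: %r" % text)
    return rows, shifts


def apply_symop(op, xyz):
    """Apply a parsed operator to a fractional coordinate (x, y, z)."""
    rows, shifts = op
    return tuple(
        sum(c * v for c, v in zip(row, xyz)) + t for row, t in zip(rows, shifts)
    )
```

Exact fractions are used here only so that translations such as 1/2 or 1/3 are not rounded; floating point would serve equally well for most purposes.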
The following sample serves as a complete template for a parameter
file, for space group P2(1)2(1)2(1)
LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
Once a suitable parameter file is created, the phasing process can
begin.
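Before running any program it can be useful to check a parameter file against the layout just described. The Python sketch below is an independent helper, not part of PHASES; the function name read_pam is hypothetical, and it assumes that blank lines may be ignored, which the description above does not state.

```python
def read_pam(lines):
    """Parse the text lines of a standard parameter file into a dict."""
    # Assumption: blank lines carry no meaning and can be skipped.
    rows = [ln.strip() for ln in lines if ln.strip()]

    key, sep, logfile = rows[0].partition("=")
    if sep != "=" or key.upper() != "LOGFILE" or logfile != logfile.strip():
        raise ValueError("line 1 must be LOGFILE=name, no spaces around '='")

    key, sep, lattice = rows[1].partition("=")
    # The lattice symbol itself must be a single UPPER case character.
    if sep != "=" or key.upper() != "LATTICE" or lattice not in list("PABCIFR"):
        raise ValueError("line 2 must be LATTICE=X, X one of P,A,B,C,I,F,R")

    # Free format: entries separated by blanks or commas.
    cell = [float(v) for v in rows[2].replace(",", " ").split()]
    if len(cell) != 6:
        raise ValueError("line 3 must hold A, B, C, ALPHA, BETA, GAMMA")

    nsym = int(rows[3].replace(",", " ").split()[0])
    symops = rows[4:4 + nsym]
    if len(symops) != nsym:
        raise ValueError("expected %d symmetry operators" % nsym)
    if symops[0].replace(" ", "").upper() != "X,Y,Z":
        raise ValueError("the first operator must be X,Y,Z")

    return {
        "logfile": None if logfile.upper() == "NULL" else logfile,
        "lattice": lattice,
        "cell": cell,
        "symops": symops,
    }
```

Called on the lines of the template above, this returns lattice "P", the six cell constants and the four operator strings.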
One starts phasing by preparing one or more "scaled" or "merged"
files containing x-ray diffraction data. The files will vary
depending on whether isomorphous replacement or anomalous scattering
data is to be used for phasing. Each file should be ASCII (read in
free format) with all records containing the same type of information.
Each record should contain
H, K, L, FP, Sig(FP), FPH, Sig(FPH) for isomorphous replacement
or
H, K, L, F+, Sig(F+), F-, Sig(F-) for native anomalous scattering
or
H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
for derivative anomalous scattering
where
H, K, L = Miller indices (integers).
FP, FPH = Native and Derivative structure factor amplitudes
F+, F- = Structure factor amplitudes for reflection. F+
corresponds to indices H, K, L, F- to -H,-K,-L.
FPH+, FPH- = Derivative structure factor amplitudes. FPH+ corresponds
to indices H, K, L, FPH- to -H, -K, -L.
Sig(X) = Estimated standard deviation for quantity X.
A separate file should be prepared for each derivative/anomalous
scattering data set. For isomorphous replacement and derivative
anomalous scattering data the FPH values should have already been
properly scaled to the FP values. If more than one data set is to be
used for phasing, then ALL F VALUES SHOULD BE ON THE SAME SCALE.
Indeed, for MIR phasing it is best to keep corresponding FP values
IDENTICAL in each data set. The "scaled/merged" files are usually
prepared by the programs CMBISO or CMBANO and are generally given
filenames ending in ".scl", but they can also be generated externally
by the user. It is always desirable however, to use the ".scl"
ending as some of the programs in PHASES will deduce the file format
from the ending of the filename. Once these files are prepared, they
can be used to create difference Pattersons to identify heavy atom
sites. A control file containing heavy atom parameters (either for
the derivative, or anomalous scatterers) must then be prepared, and
GREF or PHASIT can be run. If PHASIT is simply used to compute
structure factors from a model, then the ".scl" files are not needed,
but reflection and coordinate files must still be supplied. One
can use the phase file output from PHASIT directly to compute an
electron density map with FSFOUR, or the file can be used with
programs BNDRY, FSFOUR and MAPINV to carry out solvent levelling,
negative density truncation, phase extension and phase combination
analogous to Wang's ISIR procedure. If the latter is selected, then
the file output from PHASIT should be named "phasit.31".
In general, programs CMBISO and/or CMBANO are used to prepare all
reflection data files. Then TOPDEL is run to reject outliers and
select data for difference Patterson calculations to be performed by
FSFOUR. The Patterson map can be interactively contoured and examined
in MAPVIEW, searched for peaks in PSRCH, or contoured to generate
hard copies as PostScript files with CTOUR and MKPOST. Once heavy atom
locations are identified, they can be refined by GREF or PHASIT. The
heavy atom parameters and data are then used in PHASIT to compute SIR,
MIR phases etc. Next MISSNG is run followed by the solvent
flattening/negative density truncation/phase extension iterations
carried out by BNDRY and invoked by the procedure DOALL. If more than
one derivative is needed, or one wants to search for additional heavy
atom sites, programs MRGDF and/or MRGBDF can then be used to create
difference or cross difference coefficients, FSFOUR computes the map,
and MAPVIEW, PSRCH or CTOUR are used to identify peaks again. One can
also use the difference coefficients files produced by PHASIT to
compute "double difference" type maps to search for minor sites. The
new heavy atom parameters are then included in PHASIT, and the process
is repeated. This procedure can be cycled over as many derivatives
or data sets as needed. As a final step, it is often useful to hold
the "solvent flattened" phases fixed in PHASIT, and refine the heavy
atom parameters again. This final set of heavy atom parameters is then
used to compute final MIR etc phases in PHASIT, which are then used to
start a final round of solvent flattening. The final map resulting
from these phases can be interactively contoured and examined in
MAPVIEW, converted to graphics map format (e.g. for TOM, O or CHAIN)
and skeletonized by GMAP, or hard copies can be prepared by CTOUR and
MKPOST.
1.01 ACCESSING ON LINE DOCUMENTATION
The complete PHASES manual (what you are reading now) is maintained
online in the file PHASES.WUP. This file generally resides in the top
level of the PHASES directory, which initially is a subdirectory under
"export" (on UNIX systems), but its location may vary depending on how
one installs the software. On OpenVMS systems it can be accessed by
referring to PHASES_DOC:PHASES.WUP (if one installs the software as
described later). It is recommended that each user make a copy of the
manual in his own working directory so it can be examined without
fear of destroying the original. The manual is a simple ASCII text
file and can be examined in the editor of your choice. All program
write-ups begin with the program name followed by a single space and
then by the word "WRITE-UP" (all in uppercase), so that, for example,
to get to the write-up for program FSFOUR one can simply enter an
editor and search for "FSFOUR WRITE-UP" or just "FSFOUR W". This will
position the editor at the appropriate place in the manual. Just be
sure to exit the editor without making any changes. Indeed, it may be
desirable to set the file protection so that it can be read but not
written.
1.02 TEMPLATE SCRIPTS AND FILES
Included with the PHASES distribution are a series of sample control
files (*.sh or *.com files) as well as sample input data (*.d or *.dat
files). As initially distributed, these files reside in the top level
of the PHASES directory (itself a subdirectory under "export" on UNIX
systems, or in PHASES_TEMPL, if installed as suggested in the "VMS
USER INFORMATION" section). The "*.sh" files are UNIX shell scripts
to invoke one or more programs, while the "*.com" files accomplish the
same tasks under OpenVMS. Similarly, the "*.d" and "*.dat" files are
sample data inputs for programs under the UNIX and OpenVMS operating
systems, respectively. Generally the "*.d" and "*.dat" files are
identical. It is suggested that each user copy these files to his
working directory to serve as templates for new applications. This
will minimize the possibility of typing errors, and also serve as an
example for a particular calculation. Indeed, it may be desirable to
open two windows, one editing the template file and the other
positioned to examine the appropriate write-up as described in the
preceding section.
1.03 FLOW CHARTS
               Native file           Derivative file
                    .                       .
                    .                       .
                    v                       v
              ************************************
              *         CMBISO or CMBANO         *
              ************************************
                                .
                                .
                                .
                      "Scaled/Merged" file
                                .
                                .
                                v
                        *****************
                        *    TOPDEL     *
                        *****************
                                .
                                .
                           "Patt" file
                                .
                                .
                                v
                        *****************
                        *    FSFOUR     *
                        *****************
                                .
                                .
                           "Map" file
                                .
                                .
                                v
       ....................................................
       .                        .                         .
       .                        .                         .
       v                        v                         v
****************        ******************          *************
*    PSRCH     *        *    MAPVIEW     *          *   CTOUR   *
****************        ******************          *************
Path for initial processing of a derivative data set, includes
merging and scaling native and derivative data, rejecting outliers,
computing difference Patterson maps and examination.
     "Scaled/Merged" file(s)
                .
                .
                v
         ***************                "Phased" file,
         *   PHASIT    *                from BNDRY after
         ***************                solvent flattening
           .         .                          .
           .         .                          .
           .   "Phased" file                    .
           .         .                          .
           .         .                          .
           .         v                          v
           .         ............................
           .                        .
           .                        .
           .                        v
           .              *********************
           .              *  MRGDF or MRGBDF  *<-- "Scaled/Merged
           .              *********************     file"
   "difference file"                .
           .                        .
           .               "Cross phase" file
           .                        .
           .                        .
           .                        v
           .              *********************
           ..............>*      FSFOUR       *
                          *********************
                                    .
                                    .
                               "Map" file
                                    .
                                    .
                                    v
          ....................................................
          .                         .                        .
          .                         .                        .
          v                         v                        v
   ****************        ******************          *************
   *    PSRCH     *        *    MAPVIEW     *          *   CTOUR   *
   ****************        ******************          *************
Paths for generating and examining "cross difference" Fourier,
"cross Bijvoet difference" Fourier, "double difference" Fourier
or "double Bijvoet difference" Fourier, started either by
generating SIR, MIR etc phases, or using "solvent flattened"
phases.
                Native file
                     .
                     .
                     v
             *****************  "Phased"       ***************
    ---------*    PHASIT     *---------------->*   MISSNG    *
    .        *****************  file .         ***************
    .                                .                .
    .                                .                .
    .                                .        "Extension" file
 Partial                             .                .
structure    *****************       .                .
  file       *    FSFOUR     *<------.                .
    .        *****************       .                .
    .            .       ^           .                .
    .            .       .           .                .
    .            v       .           v                .
    .        *****************************            .
    -------->*           BNDRY           *<------------
             *****************************
                 .       ^
                 .       .
                 v       .
             *****************
             *    MAPINV     *
             *****************
Path for solvent flattening process, as implemented in the "doall"
procedure. Starts with SIR, MIR etc phases and includes mask
generation, solvent flattening and phase combination iterations. The
leftmost and rightmost branches are optional, for inclusion of partial
structure information and phase extension, respectively. The FSFOUR-
BNDRY-MAPINV loop performs the iterations. The PHASIT output
is fed directly to FSFOUR only during the initial pass, to generate
the first map. In all passes it is fed to BNDRY to serve as the
"anchor" phases in the phase combination step.
                     ************             *************
                     *  FSFOUR  *------------>*  EXTRMAP  *
                     ************             *************
                        ^    ^                      .
                        .    .                      .
                        .    .                      v
**************          .    .                *************
*   PHASIT   *----------.    .                *  MAPAVG   *<------
**************          .    .                *************      .
                        .    .                      .            .
                        .    .                      .       "Envelope"
                        .    .                      .          Mask
                        v    .                      v            .
                     ************             *************      .
 "Extension"-------->*  BNDRY   *<------------*  BLDCEL   *<------
    file             ************             *************
                        ^    .
                        .    .
                        .    v
                     ************
                     *  MAPINV  *
                     ************
Path followed during solvent flattening iterations modified to
include noncrystallographic symmetry averaging. The PHASIT output
is fed directly to FSFOUR only during the initial pass, to generate
the first map. In all passes it is fed to BNDRY to serve as the
"anchor" phases in the phase combination step. The "extension" file
is optional, and is used for phase extension only.
1.04 FILE FORMATS
Most of the programs in the PHASES package utilize the same internal
file formats, chosen for combinations of simplicity and efficiency.
The major files used are now described.
1) "Input" files. Entering data initially into the package assumes one
can prepare reflection files either in free format, as XENGEN-like
"MULISTS", or as "SCALEPACK" style files. Thus input structure factor
files can have any of the following record formats.
FREE FORMAT i.e.
h, k, l, F, sig(F) (ASCII, read in free format)
The free format input file is generally assumed in the programs if
the filename ends in ".DAT" or ".dat", and sometimes will be assumed
if no other file type is deduced from the ending of the filename. The
"free format" implies that the values in each record are separated by
at least one blank space or a comma.
or
XENGEN like "MULIST" i.e.
h, k, l, res, F, sig(F), F+, sig(F+), F-, sig(F-), iflag
in format ( 3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 )
The "iflag" status flag is optional. If present, it will be
used to screen for viable anomalous scattering data. If absent,
only values with F+ and F- greater than zero will be used when
anomalous data is needed. MULIST format is generally assumed
within the programs if the filename ends in ".MU" or ".mu".
or
SCALEPACK style files i.e. the file starts with a variable
number of header records, the total number of which is given by
one plus 2 times the number given in the first header record
(format I5). After the header records (usually 3) the data
follows as individual records containing
h, k, l, I+, sig(I+), I-, sig(I-)
in format ( 3I4, 4F8.0 )
Note that unlike the other formats, for SCALEPACK files
INTENSITIES and their standard deviations are given instead
of AMPLITUDES. Also, the files need not contain Bijvoet pair
data as the last two items may be missing, as would be the
case if the data were reduced treating Friedel mates as
equivalent. SCALEPACK format is generally assumed within the
programs if the filename ends in ".SCA" or ".sca".
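The SCALEPACK layout described above (header count taken from the first record, then fixed-width data records whose last two fields may be absent) can be sketched in Python as follows. This is an illustrative reader written for this manual, not code from PHASES; the name read_scalepack is hypothetical, and missing I-, sig(I-) fields are returned as None by choice.

```python
def read_scalepack(lines):
    """Return (h, k, l, I+, sig(I+), I-, sig(I-)) tuples from SCALEPACK text.

    Fields that are absent from a record are returned as None.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    # Total header records = 1 + 2 * (number in first record, format I5).
    n_header = 1 + 2 * int(lines[0][:5])
    records = []
    for ln in lines[n_header:]:
        if not ln.strip():
            continue
        # Data records follow format (3I4, 4F8.0).
        h, k, l = (int(ln[i:i + 4]) for i in (0, 4, 8))
        values = []
        for start in (12, 20, 28, 36):
            field = ln[start:start + 8].strip()
            values.append(float(field) if field else None)
        records.append((h, k, l) + tuple(values))
    return records
```

Note that the values returned are intensities, as stored in the file; no conversion to amplitudes is attempted here.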
2) "Scaled" (and merged) structure factor files.
These files are produced by CMBISO or CMBANO, starting with input
files of type (1). The files are ASCII, with each record containing
h, k, l, FP, sig(FP), FPH, sig(FPH)
or
h, k, l, FP+, sig(FP+), FP-, sig(FP-)
or
h, k, l, FP, sig(FP), FPH+, sig(FPH+), FPH-, sig(FPH-)
in format ( 3I4, 6F10.2)
for either isomorphous, native anomalous or derivative anomalous
data sets, respectively. SCALED files are generally assumed
within the programs if the filename ends in ".SCL" or ".scl".
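For users who generate ".scl" files externally, the record format ( 3I4, 6F10.2 ) can be reproduced as in the Python sketch below. The function names are hypothetical and not part of PHASES. Note that a record with four values may be either isomorphous or native anomalous data; the record itself does not say which.

```python
def format_scl_record(h, k, l, *values):
    """Format one "scaled" record as (3I4, 6F10.2).

    values holds 4 numbers (isomorphous or native anomalous data)
    or 6 numbers (derivative anomalous data).
    """
    if len(values) not in (4, 6):
        raise ValueError("expected 4 or 6 values after h, k, l")
    return "%4d%4d%4d" % (h, k, l) + "".join("%10.2f" % v for v in values)


def parse_scl_record(line):
    """Split a fixed-width "scaled" record back into a tuple."""
    h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
    body = line[12:].rstrip()
    # Each amplitude or sigma occupies a 10-character field.
    values = tuple(float(body[i:i + 10]) for i in range(0, len(body), 10))
    return (h, k, l) + values
```

Values are rounded to two decimal places on output, so a round trip is exact only for numbers already given to that precision.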
3) "Phased" structure factor files. These files are produced mainly
by PHASIT and BNDRY, but can also be generated by other programs.
There are two types of "phased" files, depending on whether or not
probability distributions are available. Both types of files are
BINARY, but can be converted to ASCII by the utility program RD31.
The first type, the normal or "long" format has records containing
h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM
where h, k, l, A_B, C_D and MK are INTEGERS, and the others are REALS.
The Hendrickson-Lattman probability distribution coefficients are
packed two per word, in the A_B and C_D entries according to
A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384
C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384
FO is the observed protein structure factor amplitude, PHIbest
the "best" (centroid) phase in degrees, and FOM the associated figure
of merit. MK is the restricted phase indicator, such that if MK=1 there
are no restrictions. If MK > 1, then the reflection is centric, with
one of the allowed phases given by 15*(MK-1), and the other 180
degrees away from it.
There is also an alternate version of the "long format" phase file,
obtained only from running option 3 of BNDRY with IOTYP=1, which
has FO and FC replacing FOM*FO and FO in the records. This file type
is used ONLY if one wants to do solvent flattening and/or NC symmetry
averaging iterations on DIFFERENCE or 2FO-FC MAPS. Its usage is
explained elsewhere in the documentation.
The second type, or "short" format has records containing
h, k, l, FO, FC, PHI
where FC is the "calculated" structure factor amplitude, as typically
computed from input coordinates for a model in PHASIT, GREF, or output
from the map inversion program MAPINV.
Note that the Fourier program FSFOUR only reads the first six entries
in a record, so that in general, EITHER type of "phased" file can be
used for map calculations. However, some map types might be accessible
with only one of the formats (e.g. difference maps). Both long and
short format PHASED files are generally recognized within the programs
if the filename ends in ".31".
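The packing of the Hendrickson-Lattman coefficients into A_B and C_D, and the meaning of MK, can be illustrated with a short Python sketch of the formulas given above for the "long" format. The function names are hypothetical and the code is not taken from PHASES; it assumes Python's int() is an adequate stand-in for IFIX (both truncate toward zero).

```python
OFFSET = 16384  # added to each scaled coefficient before packing
BASE = 32768    # multiplier separating the two packed halves


def pack_pair(a, b):
    """Pack two coefficients into one integer, as for A_B or C_D."""
    # int() truncates toward zero, like FORTRAN IFIX.
    return (int(a * 100) + OFFSET) * BASE + int(b * 100) + OFFSET


def unpack_pair(word):
    """Recover the two coefficients (to 0.01) from a packed integer."""
    high, low = divmod(word, BASE)
    return (high - OFFSET) / 100.0, (low - OFFSET) / 100.0


def centric_phases(mk):
    """Return the two allowed phases in degrees, or None if MK=1."""
    if mk < 1:
        raise ValueError("MK must be 1 or greater")
    if mk == 1:
        return None  # no phase restriction
    first = 15 * (mk - 1)
    return first, (first + 180) % 360
```

Because each coefficient is scaled by 100 and truncated, values are stored only to the nearest 0.01, and the packing is reversible only while the scaled magnitude stays below 16384.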
4) "Mask" files. These files are binary, with the same record format
applying both to "solvent masks" and "averaging masks." The file
starts with a header record containing
A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with the first 6 values REAL*4, the next 9 INTEGER*4, the lengths in
Angstroms and the angles in degrees.
NX, NY, NZ   = Number of grid points defining one "cell length" along
               the respective axis. Implicitly defines grid spacing as
               del x = A/NX, del y = B/NY and del z = C/NZ.

IXMN, IXMX,
IYMN, IYMX,
IZMN, IZMX   = Minimum, maximum grid index defining map region such
               that x (fractional) = IX * (del x) / A etc.
               There are no restrictions on magnitudes or signs.
The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 BYTE values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
C     Y (outer loop) varies slowest; each record is one row along X.
      DO 30 IY=IYMN,IYMX
      DO 20 IZ=IZMN,IZMX
   20 WRITE(LU)(MSK(IX,IY,IZ),IX=IXMN,IXMX)
   30 CONTINUE
Note that the mask entries are FORTRAN type BYTE (INTEGER*1).
For solvent masks, the entries will either be 0 (protein) or 2
(solvent). For averaging masks only the values 0, 10, 20, 30, 40
etc are meaningful as they indicate the grid point is inside the
primary envelope for molecules 1, 2, etc. The masks can be displayed
with program MAPVIEW, and program RDHEAD can be used to list the
header record.
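The mask layout described above can be read back outside the package. The following is a minimal Python sketch, not part of PHASES. It assumes that each unformatted FORTRAN record is bracketed by 4-byte little-endian length markers, which is a common compiler convention but is not stated in this manual. The names read_record and read_mask are hypothetical.

```python
import struct

def read_record(f):
    """Return the payload of one unformatted sequential record.

    Assumption: 4-byte little-endian length markers precede and follow
    the payload. Other compilers or byte orders need a different format.
    """
    head = f.read(4)
    if len(head) != 4:
        raise EOFError("no more records")
    (nbytes,) = struct.unpack("<i", head)
    payload = f.read(nbytes)
    (tail,) = struct.unpack("<i", f.read(4))
    if tail != nbytes:
        raise ValueError("leading and trailing record markers disagree")
    return payload

def read_mask(f):
    """Read the header and all rows of a mask file from a binary file object."""
    # Header: 6 REAL*4 cell values followed by 9 INTEGER*4 grid values.
    header = struct.unpack("<6f9i", read_record(f))
    cell = header[:6]                      # A, B, C, AL, BE, GA
    nx, ny, nz = header[6:9]
    ixmn, iymn, izmn, ixmx, iymx, izmx = header[9:15]
    nrow = ixmx - ixmn + 1                 # BYTE values per record, along X
    rows = {}
    for iy in range(iymn, iymx + 1):       # Y varies slowest
        for iz in range(izmn, izmx + 1):
            rows[(iy, iz)] = struct.unpack("<%db" % nrow, read_record(f))
    return cell, (nx, ny, nz), (ixmn, iymn, izmn, ixmx, iymx, izmx), rows
```

Called on a file opened in binary mode, read_mask returns the cell, the grid, the index limits and a dictionary of rows keyed by (IY, IZ).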
5) "FSFOUR" maps. These maps are produced by FSFOUR (and BLDCEL).
They are binary, and contain a variable number of header records
followed by the map. The map ALWAYS covers one full cell. See FSFOUR
write-up (and possibly examine the program source) for further
details.
6) "Submaps". Also referred to as "averaging" maps. These map files
are binary, with the same header and record structure as "mask" files,
except that the density values are written as FORTRAN type REAL instead
of mask values. They are usually prepared by MAPVIEW or EXTRMAP, but
can be generated by MAPORTH, SKEW, TRNMSK etc. Note that RDHEAD can
also be used to list the header record.
7) "Extension" file. Used for phase extension, and created by program
MISSNG. This file is ASCII, and contains a list of reflection indices,
Fobs and phase probability distribution coefficients, for reflections
absent on the main "phased" file, but for which native amplitudes and
possibly phase probability distribution coefficients are available.
It is used only for phase extension. The records simply contain
h, k, l, Fobs, A_B, C_D
in format ( 3I4, F10.2, 2I12 ) where the distribution coefficients
are packed as in a normal phased file. If no distribution coefficients
are available the A_B and C_D values are zero.
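As an illustration of the record layout ( 3I4, F10.2, 2I12 ), the following Python sketch writes and reads one extension record using fixed column positions. It is not part of the package, and the function names are hypothetical.

```python
def format_extension_record(h, k, l, fobs, a_b=0, c_d=0):
    """Format one record as (3I4, F10.2, 2I12); A_B and C_D default to zero."""
    return "%4d%4d%4d%10.2f%12d%12d" % (h, k, l, fobs, a_b, c_d)

def parse_extension_record(line):
    """Split one record into h, k, l, Fobs, A_B, C_D by column position."""
    h = int(line[0:4])
    k = int(line[4:8])
    l = int(line[8:12])
    fobs = float(line[12:22])
    a_b = int(line[22:34])   # packed A and B coefficients
    c_d = int(line[34:46])   # packed C and D coefficients
    return h, k, l, fobs, a_b, c_d
```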
2.00 PROGRAM WRITEUPS
This section includes writeups for each individual program. For
the major programs (PHASIT, BNDRY etc), in addition to describing the
required input data, a complete description of how the program works
is included. Often, suggested strategies are provided as well. First
time users should read at least the PHASIT, BNDRY, FSFOUR, MAPINV
and MAPVIEW writeups completely.
2.01 PHASIT WRITE-UP
PHASIT can be run in one of two modes, protein phasing mode or
structure factor calculation mode. Some of the input data is common to
both modes, but other data is needed only for the particular mode
invoked. First, the data that is always needed is described.
INPUT DATA (UNIT 5)
CARD 1 - PAMFIL (free format)
PAMFIL = name of parameter file
containing cell and symmetry
information.
CARD 2 - MODE, NXSCAT (free format)
MODE = 0 for protein phase
calculations.
= 1 for structure factor
calculations.
NXSCAT = number of additional atomic
types for which scattering
factors will be input. Note
that 20 types are already
stored in the program (see
below), thus this is usually
nonzero only for exotic
atoms or wavelengths other
than CU K alpha.
The following block of cards should be included only if NXSCAT > 0
Up to 5 additional atomic types may be input. For each additional
atomic type, include the following 3 records
REC 1 (A(J),J=1,4) (free format)
A(J) = Coefficients for analytical
approximation to scattering
factors, as in Int. Tables,
Vol IV, pages 99-101.
REC 2 (B(J),J=1,4), C (free format)
B(J), C = Coefficients for analytical
approximation to scattering
factors, as in Int. Tables,
Vol IV, pages 99-101.
REC 3 DEL f' , DEL f'' (free format)
DEL f' = real part of anomalous
scattering correction term.
DEL f'' = imaginary part of anomalous
scattering correction term.
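For reference, the four-Gaussian analytical approximation takes the form f0(s) = SUM [ A(J)*exp(-B(J)*s**2) ] + C, with s = sin(theta)/lambda. The Python sketch below evaluates it and adds the anomalous terms as f0 + DEL f' + i*DEL f''. How PHASIT applies these values internally is an assumption here, and the coefficients used in the test are made-up numbers, not tabulated values for any element.

```python
import math

def scattering_factor(a, b, c, s, del_f1=0.0, del_f2=0.0):
    """Evaluate the analytical scattering factor at s = sin(theta)/lambda.

    a, b   : the four A(J) and B(J) coefficients (REC 1 and REC 2)
    c      : the constant term C (REC 2)
    del_f1 : real anomalous correction DEL f' (REC 3)
    del_f2 : imaginary anomalous correction DEL f'' (REC 3)
    """
    f0 = sum(aj * math.exp(-bj * s * s) for aj, bj in zip(a, b)) + c
    return complex(f0 + del_f1, del_f2)
```

At s = 0 the normal part reduces to A(1)+A(2)+A(3)+A(4)+C, which is a quick check on coefficients typed in by hand.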
The appropriate remaining data should be supplied only for the mode
selected.
**** additional input for protein phasing mode (MODE= 0 )****
CARD 3 + 3*NXSCAT - NSETS, NOREF, N (free format)
NSETS = number of data sets
(derivatives) to use in phasing
(max = 30)
NOREF = 0 for protein phase calculation
only.
= 1 for protein phase calculation
plus "phase refinement" of
derivative parameters.
N = minimum number of contributing
data sets for the phase of an
acentric reflection to be output.
CARD 4 +3*NXSCAT - OUTREF (free format)
OUTREF = Name of output reflection file to
contain the final protein phases.
The following block of cards 1-6, must then be repeated for each
data set
1) TITLE = anything (free format)
2) FILEIN = input merged data filename (free format)
3) FILOUT = output difference Fourier filename (free format)
4) DCUT, SIGCUT, ISOFLG, SCLFPH, BOVFPH, SCLFH, ( EC(I),I=1,4 )
(free format)
DCUT = minimum allowed d spacing.
SIGCUT = minimum allowed F/sig value.
ISOFLG = 0 for isomorphous replacement data.
= 1 for native anomalous scattering data.
= 2 for derivative anomalous scattering
data.
SCLFPH = scale factor multiplying FPH (obs)
to scale it to FP (obs). Usually =1.
unless refined in previous run.
BOVFPH = overall thermal factor, applied to
FPH (obs) to scale it to FP (obs).
Applied as exp(BOVFPH*ssthol) * FPH.
Usually = 0. unless refined in
previous run.
SCLFH = scale factor multiplying |FH|(calc)
to scale it to the observed data.
If unknown, input 0. and it will be
computed.
(EC(I),I=1,4) = coefficients for 3 term polynomial,
used to generate "standard" E (lack
of closure, based on intensity)
values as a function of |FP|, and the
minimum allowed value of E. If
unknown, input 0. for each and they
will be computed.
5) NA = (number of heavy atoms/anomalous scatterers with known
positions, free format)
6 etc) ATNAME, X, Y, Z, B, OCC, ITYPE FORMAT(7X,A8,5F10.5,I5)
ATNAME = anything
ITYPE = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20
for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4,
I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3,
respectively. ITYPE = 21 through 20+NXSCAT for the
additional types, in the same order as originally input
by the user.
OCC = Occupancy factor
X,Y,Z = Fractional atomic coordinates
B = Thermal factor.
Note that if B is > 0., then it is assumed to be an
isotropic thermal factor. If B is input as 0., then the
temperature factor is assumed to be anisotropic with the
B11, B22, B33, B12, B13, B23 elements being supplied on
the immediately following record. If B is < 0., then the
temperature factor is assumed to be isotropic with
magnitude = ABS(B), but it will be converted to anisotropic
prior to use in the program.
The following record should be included ONLY if the supplied B
value is less than or equal to 0. for the preceding atom.
6a etc) B11, B22, B33, B12, B13, B23, BRES, SIG FORMAT(8F10.5)
B11, B22, B33, B12, B13, B23 =
Components of anisotropic thermal factor tensor.
If B (previous record) is < 0., then these fields
are irrelevant as the program will compute them
by converting |B| to anisotropic.
BRES = Possible target value for restraining the isotropic
equivalent of the anisotropic temperature factor. If
BRES > 0., then a restraint term of the form
WT*(BRES-BEQ)**2 is included in the least squares
equations.
SIG = Sigma for restraint term, used only if BRES is > 0.
WT is 1/SIG**2. (Suggested value = 0.5)
Include cards 6 (and possibly 6a) for each of the NA atoms.
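The sign convention for B decides how many records each atom occupies in the input deck. The short Python sketch below restates that rule, and can be used to check a deck before running. The function names are hypothetical and the code is not part of the package.

```python
def thermal_factor_mode(b):
    """Describe how the B value on an atom record is interpreted."""
    if b > 0.0:
        return "isotropic"
    if b == 0.0:
        return "anisotropic, elements read from the following record"
    return "isotropic with magnitude ABS(B), converted to anisotropic"

def needs_anisotropic_record(b):
    """True if the atom record must be followed by the B11...SIG record."""
    return b <= 0.0

def count_atom_records(b_values):
    """Total number of records needed for atoms with the given B values."""
    return sum(2 if needs_anisotropic_record(b) else 1 for b in b_values)
```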
***** END OF INPUT, UNLESS HEAVY ATOM REFINEMENT WAS REQUESTED *****
If "phase refinement" was requested (NOREF=1), then include the
following cards.
CARD A) NPASS, FMCUT, NHVCYL, IWT, IEXC, NFIXP, MAXLIK (free format)
NPASS = # of times protein phases are
to be recomputed, i.e. # of
refinement passes. (max=10).
Protein phases are held fixed
during each pass, and updated
at the end of each pass.
FMCUT = Figure of merit cutoff.
Reflections will not be used in
phase refinement if the
associated figure of merit is
< FMCUT.
NHVCYL = # of refinement cycles
to be performed in each pass.
(max=50). Each cycle can refine
heavy atom and/or scaling parameters
for any data set.
IWT = 0 for refinement weights based
on expected lack of closure.
= 1 for refinement weights based
on estimated accuracy of current
protein phase.
= 2 for unit weights.
IEXC = 0 to exclude contribution to
protein phase distribution from
each data set when parameters
for that data set are being
refined.
= 1 to include contributions to
protein phase distribution from
all possible data sets during
refinement.
NFIXP = 0 for normal operation (uses
protein phases based on current
heavy atom data during refinement).
= 1 to read in externally derived
protein phases, and hold them
fixed during heavy atom refinement.
If NFIXP=1, then IEXC is reset to 1,
and IWT is reset to 0 if it was 1.
MAXLIK = 0 for conventional parameter
refinement.
= 1 for "Maximum Likelihood" parameter
refinement.
**** The following card should be included ONLY if NFIXP=1 ****
CARD A' FXDFIL (free format)
FXDFIL = name of file containing the
protein phases to be held fixed
and used during refinement.
The following card set B,C,D must then be repeated for each of the
NHVCYL cycles requested.
CARD B) IVSET (free format)
IVSET = data set number (in order as
originally input) of set for which
derivative parameters are to be
refined.
CARDS C) (IVAR(J),J=1,5 or 10) (free format)
Variable selection information
IVAR(1) = 1 to refine x coordinate, 0 to hold fixed
IVAR(2) = 1 to refine y coordinate, 0 to hold fixed
IVAR(3) = 1 to refine z coordinate, 0 to hold fixed
IVAR(4) = 1 to refine occupancy, 0 to hold fixed
IVAR(5) = 1 to refine B (or B11), 0 to hold fixed
IVAR(6) = 1 to refine B22, 0 to hold fixed
IVAR(7) = 1 to refine B33, 0 to hold fixed
IVAR(8) = 1 to refine B12, 0 to hold fixed
IVAR(9) = 1 to refine B13, 0 to hold fixed
IVAR(10)= 1 to refine B23, 0 to hold fixed
Card C must be repeated for as many atoms as are in the specified data
set. Each card refers to a single atom, in the same order as
originally input. Note that IVAR(6-10) are appropriate only if the
corresponding atom was input with (or converted to) an anisotropic
temperature factor.
CARD D) (IVSCL(I),I=1,3) (free format)
IVSCL(1) = 1 to refine SCLFPH, 0 to hold fixed
IVSCL(2) = 1 to refine BOVFPH, 0 to hold fixed
IVSCL(3) = 1 to refine SCLFH, 0 to hold fixed
Note! For native anomalous scattering data sets, IVSCL(1) and
IVSCL(2) must be 0
**** FILES ****
The input "scaled/merged" reflection files have already been
described. The output protein phase file OUTREF is binary and contains
records with the following:
H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FOM
where
H, K, L = Miller indices (integers)
FMFO = Figure of merit weighted structure factor amplitude
(either FOM * FP or FOM * F+)
FO = Observed structure factor amplitude (either FP or F+)
PHIBEST = Best (centroid) phase, in degrees.
IPRAB, IPRCD = Hendrickson-Lattman coefficients A,B,C,D for the phase
probability distribution used, packed two per word as
IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
MK = Restricted phase indicator. For general reflections
MK=1, for centric reflections MK > 1 and one of the
allowed phase values is (MK-1)*15 degrees (the other
possibility is 180 degrees away).
FOM = Figure of merit associated with PHIBEST and used for
weighting.
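The packing of two coefficients per word, and the meaning of MK, can be illustrated with the following Python sketch. It is not taken from the program source. Python's int() truncates toward zero, as IFIX does, and the function names are hypothetical.

```python
OFFSET = 16384
BASE = 32768

def pack_pair(x, y):
    """Pack two coefficients into one word, as for IPRAB or IPRCD.

    Values are stored to 0.01, truncated. The scaled second value plus
    OFFSET must lie in 0..32767 for the word to unpack correctly.
    """
    return (int(x * 100) + OFFSET) * BASE + int(y * 100) + OFFSET

def unpack_pair(word):
    """Recover the two coefficients (to 0.01) from a packed word."""
    hi, lo = divmod(word, BASE)
    return (hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0

def centric_phases(mk):
    """Return the two allowed phases in degrees, or None if MK = 1."""
    if mk <= 1:
        return None
    phi = (mk - 1) * 15.0
    return phi, (phi + 180.0) % 360.0
```

For example, A = 1.25 and B = -0.5 pack to (125+16384)*32768 + (-50+16384) = 540983246.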
The output files "FILOUT" are "short form" phase files suitable
for computing difference Fouriers, double difference Fouriers, observed
difference Pattersons or "calculated" difference Pattersons for each
data set, via the MAPTYP=1,3,6,7 options, respectively, in FSFOUR. They
can be used to identify more heavy atom sites, to generate difference
Pattersons or to generate "calculated difference Pattersons" from the
input heavy atom model for comparison with the "observed difference
Pattersons". These files actually contain records with
IH,IK,IL,FHobs,FHcalc,PHI_Hcalc
IH,IK,IL,(FP+ - FP-)obs,(FP+ - FP-)calc,(PHI_PRO-90)
IH,IK,IL,(FPH+ - FPH-)obs,(FPH+ - FPH-)calc,(PHI_PRO-90)
for isomorphous, native anomalous and derivative anomalous data sets,
respectively.
If phase refinement is requested (NOREF=1) and protein phases are to
be explicitly input (NFIXP=1), then an additional file FXDFIL with the
same structure as the output phase file above must also be supplied to
provide the protein phase information. If MAXLIK = 0 only the indices,
PHIBEST and FOM will be used. If MAXLIK = 1 the Hendrickson-Lattman
coefficients will also be used.
In protein phasing mode the program expects to read in one or more
"merged" data files, i.e. files with records containing H, K, L, FP,
SFP, FD, SFD for isomorphous replacement data, H, K, L, F+, SF+, F-,
SF- for native anomalous scattering data or H, K, L, FP, SFP, FPH+,
SFPH+, FPH-, SFPH- for derivative anomalous scattering data. It is
assumed that the native and derivative data has already been properly
scaled together (via CMBISO or CMBANO). If more than one data set is
input containing native F values (FP), corresponding FP values are
assumed to be identical (on same scale) in each set, as would be the
case if each derivative set was scaled to the same native set with
CMBISO. It is not necessary for any given reflection to be present in
all sets. If more than one data set is supplied, but a reflection is
present in only one of them, then the resulting output phase for that
reflection will correspond to an SIR (or SAS) calculation rather than
MIR. One can, however, request that acentric reflection phases be
output only if N or more data sets contributed, where N is an input
parameter. Thus an N value of 2 would ensure that output phases are
generated only for cases where the phase ambiguity has been resolved
(in principle). For centric reflections there is no phase ambiguity,
hence the N criterion is not applied. If only one data set is input,
then N should be 1 to ensure that all computed phases (either SIR or
SAS) are output.
NOTE!!!! If both NATIVE anomalous scattering and other types of
data sets are input, THE NATIVE ANOMALOUS SCATTERING SETS SHOULD BE
THE LAST ONES INPUT. If both anomalous and isomorphous data sets are
input then the F and SIG values for the anomalous data should be on
the same scale as the isomorphous data. This will happen automatically
if CMBISO and CMBANO are used to prepare the data files and the same
native set was used as input. If NATIVE anomalous scattering data is
to be used IN ADDITION TO OTHER DATA TYPES, then it is convenient to
also run it through CMBANO to put it on the scale of the other data,
and then edit the output file to strip away the extra FP and Sig(FP)
fields. This is needed to conform to the file format for native
anomalous scattering sets, yet be properly scaled for consistency with
the other data sets.
If only multiple anomalous scattering data sets are input, then F
values for all sets are assumed to be on the same scale, and the heavy
atom parameters should correspond to the same hand, and be consistent
with the input indices.
IT IS ASSUMED THAT WHEN MULTIPLE DATA SETS ARE INPUT, THE ORIGIN
AND HAND ARE CONSISTENT THROUGHOUT ALL DATA SETS.
**** additional input for SF calculation mode (MODE=1) ****
CARD 3 + 3*NXSCAT - INPREF (free format)
INPREF = Name of file containing the
input reflections for which
structure factors will be computed.
CARD 4 + 3*NXSCAT - INPCDS (free format)
INPCDS = Name of file containing the
input atomic coordinates.
CARD 5 + 3*NXSCAT - OUTSF (free format)
OUTSF = Name for output file containing
the calculated structure factors.
CARD 6 + 3*NXSCAT - KRES,(KILRES(I),I=1,KRES) (free format)
KRES = Number of residues to be omitted
from structure factor calculation.
(KILRES(I),I=1,KRES) = residue numbers for the KRES
residues to be omitted.
CARD 7 + 3*NXSCAT - IMODE, IHLCF, ISIGA (free format)
IMODE = 0 if atomic type to be derived from
first character of atom name (see
below)
= 1 if atomic type explicitly input
(see below)
IHLCF = 0 "Short" Fourier output. File
contains Fobs, Fcalc, phase.
= 1 "Full" Fourier output. File
contains FM*Fobs, Fobs, phase,
Hendrickson-Lattman coefs etc.
NOTE! IHLCF is meaningful only when
ISIGA is zero, as the nature
of the output file is determined
for ISIGA > 0 as described below.
ISIGA = 0 If "full" file output is requested
(IHLCF=1), Bricogne's modification
of Sim's weights are to be used to
construct the phase probability
distributions.
= 1 For "Full" file output but with
distributions based on Sigma_A
weights.
= 2 For "short" file output appropriate
for reduced bias difference maps
based on sigma_A weighting (use FO-FC
option in FSFOUR).
= 3 For "short" file output appropriate
for reduced bias native maps based
on sigma_A weighting (use 2FO-FC
option in FSFOUR).
**** FILES ****
INPREF - Input structure factor file. Several types of files can
be used here, and the type of file is deduced from the last part
of the filename. Allowed file types include binary (31 type files,
either long format or short format), any of the "merged" files,
"MULISTS", SCALEPACK style files or files in free format.
If the filename ends with ".31", then a binary style "phased"
file is assumed, which can be the output from a previous PHASIT
or BNDRY run. Either long or short format files can be used, and
the program will figure out which type was input and pick up the
indices and Fobs values appropriately. The records thus would
contain either
h, k, l, FOM*FO, FO, PHIbest, A_B, C_D, MK, FOM (long format)
or
h, k, l, FO, FC, PHI (short format)
Note that previous files output from PHASIT, structure factor mode
with ISIGA > 1 or output in "phasing mode" as a "difference
coefficient file" are NOT appropriate as they do NOT contain FO
explicitly. Similarly, long format files output from BNDRY with
IOTYP=1 are not appropriate as they do not contain FO in the
second amplitude slot.
If the file name ends with ".MU" or ".mu", then it is assumed to be
an ASCII "MULIST" i.e. a file generated by program MAKEMU (in the
XENGEN system) or by program FBSCALE. In that case each record is
assumed to contain
H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag
in format (3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2 ). Only the indices
and F values will be used.
If the filename ends with ".SCA" or ".sca", then an ASCII SCALEPACK
file is assumed. After a variable number of header records (see the
FILE FORMATS section), reflection records follow and contain
H, K, L, I+, sig(I+), I-, sig(I-)
in format (3I4, 4F8.1)
Note the use of intensities rather than F's. The last two items
in each record may be omitted. If present, they would be used
only if I+ was not measured.
If the filename ends with anything other than ".31", ".MU", ".mu",
".SCA" or ".sca", the file is assumed to be ASCII and is read in free
format. The records are assumed to contain
H, K, L, FO
where H, K, L = Miller indices (integers)
FO = Observed structure factor amplitude
Note that this is appropriate for any of the "scaled and merged"
files output by CMBISO or CMBANO, and generic files as well.
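The filename rules above amount to a simple suffix test, sketched here in Python. The function name and the returned descriptions are hypothetical. Only the suffixes themselves come from this write-up.

```python
def input_reflection_type(filename):
    """Classify an INPREF file by the ending of its name."""
    if filename.endswith(".31"):
        return "binary phased file (long or short format)"
    if filename.endswith((".MU", ".mu")):
        return "ASCII MULIST"
    if filename.endswith((".SCA", ".sca")):
        return "ASCII SCALEPACK"
    # Anything else, including mixed-case endings, is read in free format.
    return "ASCII free format (H, K, L, FO)"
```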
INPCDS - Input atomic coordinate file, ASCII with
format ( 1X, A1, 5X, A1, I3, A4, 5F10.5, I5). Each record
should contain
CHN, RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP
where
CHN = single character chain identifier (not used)
RT = single letter amino acid code (not used)
IRES = sequence number (used only if rejecting residues)
ATOM = atom name (used only if IMODE=0)
X,Y,Z = fractional atomic coordinates
B = Isotropic thermal factor
OCC = Occupancy factor
ITYP = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 or 20
for C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4,
I-, Zn+2, Ca+2, Mg+2, Cd+2, U+6, P, Br-, Cl- or Sm+3,
respectively. ITYP = 21 through 20+NXSCAT for the
additional types, in the same order as originally input
by the user. Note that if IMODE=0, then atomic types are
derived from the first character of the atom name, but
only C,N,O,S or Fe will be recognized.
Include one record of this type for each atom
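The fixed column layout of a coordinate record can be illustrated with the Python sketch below, which is not part of the package. Two points are assumptions: that the "first character" of the atom name means the first non-blank character of the A4 field, and that Fe is recognized from a leading "F". The function name is hypothetical.

```python
# Assumed mapping from the first character of the atom name to ITYP
# (C, N, O, S, Fe are types 1 to 5 in the stored list).
TYPE_FROM_FIRST_CHAR = {"C": 1, "N": 2, "O": 3, "S": 4, "F": 5}

def parse_coordinate_record(line, imode):
    """Parse one record in format (1X, A1, 5X, A1, I3, A4, 5F10.5, I5)."""
    line = line.rstrip("\n").ljust(70)
    chn = line[1]                     # chain identifier (not used)
    rt = line[7]                      # one-letter residue code (not used)
    ires = int(line[8:11])            # sequence number
    atom = line[11:15]                # atom name
    x, y, z, b, occ = (float(line[15 + 10 * i:25 + 10 * i]) for i in range(5))
    if imode == 0:
        # Type derived from the atom name; None if not recognized.
        ityp = TYPE_FROM_FIRST_CHAR.get(atom.strip()[:1])
    else:
        ityp = int(line[65:70])       # type explicitly input
    return {"chn": chn, "rt": rt, "ires": ires, "atom": atom,
            "x": x, "y": y, "z": z, "b": b, "occ": occ, "ityp": ityp}
```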
OUTSF - The output structure factor file differs, depending on the
values of IHLCF and ISIGA.
If ISIGA = 0 and IHLCF = 0, the file is binary with each record
containing
H, K, L, FO, FC, PHIcalc
where H, K, L = Miller indices (integers)
FO = Observed structure factor amplitude
FC = Calculated structure factor amplitude (scaled to
input set)
PHIcalc = Calculated phase angle in degrees.
If ISIGA =1 (or ISIGA=0 AND IHLCF = 1) the file is binary with each
record containing
H, K, L, FMFO, FO, PHIcalc, IPRAB, IPRCD, MK, FOM
where
H, K, L = Miller indices (integers)
FMFO = Figure of merit weighted structure factor amplitude
FOM * FO
FO = Observed structure factor amplitude FO
PHIcalc = Calculated phase, in degrees.
IPRAB, IPRCD = Hendrickson-Lattman coefficients A,B,C,D for phase
probability distribution centered on calculated phase,
packed two per word as
IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
MK = Restricted phase indicator. For general reflections
MK=1, for centric reflections MK > 1 and one of the
allowed phase values is (MK-1)*15 degrees (the other
possibility is 180 degrees away).
FOM = Figure of merit associated with PHIcalc and used for
weighting.
Note that this record structure is identical to that produced in
protein phasing mode, although the probability distributions will all
be unimodal.
If ISIGA = 2 the file is binary with each record containing
H, K, L, FOM*FO, D*FC, PHIcalc
with the parameters as previously described, and D is as defined in
Read's Sigma_A procedure. This file is appropriate for reduced bias
DIFFERENCE maps, and should be used in FSFOUR with the FO-FC option.
If ISIGA = 3 the file is binary with each record containing
H, K, L, FOM*FO, D*FC, PHIcalc for acentric reflections
and
H, K, L, FOM*FO/2, 0., PHIcalc for centric reflections
with the parameters as previously described, and D is as defined in
Read's Sigma_A procedure. This file is appropriate for reduced bias
NATIVE maps, and should be used in FSFOUR with the 2FO-FC option.
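The reason for the different centric and acentric records with ISIGA = 3 can be seen from a short arithmetic sketch in Python. It assumes that the 2FO-FC option in FSFOUR simply forms 2*(first amplitude) - (second amplitude) from the two amplitude slots. That is an assumption made for illustration, and the FSFOUR write-up should be consulted for the actual behavior. The function names are hypothetical.

```python
def isiga3_slots(fom, fo, d, fc, centric):
    """Return the two amplitude slots written for ISIGA = 3."""
    if centric:
        return fom * fo / 2.0, 0.0
    return fom * fo, d * fc

def two_fo_minus_fc(slot1, slot2):
    """Assumed action of the 2FO-FC option on the two amplitude slots."""
    return 2.0 * slot1 - slot2
```

Under that assumption an acentric reflection yields the coefficient 2*FOM*FO - D*FC, while a centric reflection yields FOM*FO.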
In structure factor calculation mode, a set of reflection indices
and observed F values are read in from one file (which can be the
output file generated from a previous run of PHASIT or BNDRY). Atomic
coordinates, occupancies and thermal parameters are read in from
another file. Structure factors are then computed for all input
reflections, and a binary output file is written. Records in the
binary file differ depending on which options (IHLCF and ISIGA
parameters) were selected. In one case a "short" form of the phase
file is written, generally containing Fobs, Fcalc and the phase. The
output structure factor file then is identical (in structure) to that
produced by MAPINV, thus it can be used in option 3 of the BNDRY
program to combine phase information from the partial (or complete,
but tentative) structure with other phase information. If combined
with an output from PHASIT (protein phasing mode), then SIR, MIR etc
phases can be combined with those from the model structure. If
combined with an output from BNDRY, then partial structure phases can
be combined with MIR, etc phases AFTER density modification. The file
can also be used directly to compute electron density, difference
density, "residue deleted" maps etc., based on phases and amplitudes
computed from the input model. Provisions are available to omit
various residues from the structure factor calculation, thereby
facilitating use of the file for computation of "residue deleted"
electron density maps.
If other options are selected, after calculating the structure
factors and scaling them to the observed data Hendrickson-Lattman
coefficients are also computed, based either on Bricogne's
modification of Sim's weighting scheme or on Read's Sigma_A procedure.
The output file then can contain FM*Fobs, Fobs, Phi, HL coefficients,
restricted phase indicator and figure of merit. In that case the
output file structure is identical to that produced by BNDRY, or by
PHASIT in protein phasing mode. The file can then also be used to
compute Fourier maps, but conventional DIFFERENCE Fouriers can NOT be
computed since the Fcalcs are not present on the file. It can however,
then be used as the "anchored" phases to which other phase information
can be "tethered", i.e. replace the MIR phases. It can also be input
to MISSNG, so that phase extension can be tethered to the partial
structure phases in subsequent density modification cycles.
By invoking other options the file can contain coefficients
appropriate for "reduced bias" native or difference maps, based on
Randy Read's Sigma_A procedure.
**** PHASIT PROGRAM STRUCTURE ****
In protein phasing mode the following events take place.
For each data set the program will do the following:
1) Read in all reflections and reject those which fail to pass the
supplied d and F/SIG cutoff information.
2) The indices of each accepted reflection are transformed (if needed)
to correspond to a "standard" asymmetric unit, systematic absences are
rejected, and phase restrictions are identified for centric
reflections. If the data set contains anomalous scattering data
centric reflections are rejected. All other reflections are stored.
3) Heavy atom parameters are read in and structure factors are
computed based on the heavy atom positions, using the appropriate
scattering factors for isomorphous or anomalous scattering data,
respectively.
4) A suitable number of reflections are chosen from which difference
magnitudes ABS(FP - FPH), ABS(F+ - F-) or ABS(FPH+ - FPH-) are used to
scale the heavy atom structure factors. For isomorphous replacement
data all reliable centric reflections are used, if any are present. If
there is an insufficient number of centric reflections, the selected
list is augmented by the 25% largest differences for acentric data.
For anomalous scattering data, the 25% largest differences ABS(F+ -
F-) etc are used. If the user input a scale factor, then it is used
instead of the computed value. R factors are then reported after
scaling the heavy atom structure factors.
5) The data is grouped into ranges based on the magnitude of FO or
(F+ + F-)/2, and rms E values (lack of closure) are computed for each
range. All centric data (possibly augmented with acentric data as
described above) are used to determine E values in the isomorphous
replacement case. In the anomalous scattering case only the 25%
strongest differences are used. For centric isomorphous replacement
data the input sig(FP) and sig(FPH) values are used to remove from
the E values the components arising from measurement error, and the
remaining lack of closure value is halved. The components due to
measurement error are then added back. This enables the E values
determined from centric data to be applicable to the acentric data.
A three term polynomial is then fit by least squares to the rms E
values as functions of FO or (F+ + F-)/2. If the user input the
polynomial coefficients, then this step is bypassed.
6) From the scaled heavy atom structure factors, input amplitudes and
computed E values, Hendrickson-Lattman coefficients are computed to
represent the SIR (or SAS) phase probability distributions. For the
centric isomorphous replacement data the E values are first adjusted
to "undo" the downscaling making them appropriate for acentric data.
7) SIR (or SAS) phases are then computed by integrating over the
distributions to yield "best" phases and the associated figure of
merit. Figure of merit statistics are then output, along with an
estimate of the "phasing power" ( FH(calc)/E or 2.*FH"(calc)/E ) as
a function of resolution. Note that for the purpose of phasing power
calculations E values are based on amplitude differences, whereas for
the actual probability distributions E values are based on intensity
differences.
8) The indices, observed and calculated amplitudes, input standard
deviations, Hendrickson-Lattman coefficients, calculated phase
components and restricted phase indicators are output to a scratch
file.
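Step 7 above can be illustrated for a general (acentric) reflection. With Hendrickson-Lattman coefficients A, B, C, D the phase probability is proportional to exp(A*cos(phi) + B*sin(phi) + C*cos(2*phi) + D*sin(2*phi)), and the "best" phase and figure of merit are the direction and length of its centroid. The Python sketch below integrates numerically on a 5 degree grid. The grid step and the function name are assumptions, and the integration scheme actually used in PHASIT is not described here.

```python
import math

def best_phase_and_fom(a, b, c, d, step_deg=5.0):
    """Return (best phase in degrees, figure of merit) for one reflection."""
    n = int(round(360.0 / step_deg))
    phis = [math.radians(i * step_deg) for i in range(n)]
    expo = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2.0 * p) + d * math.sin(2.0 * p) for p in phis]
    shift = max(expo)                 # guards against overflow in exp()
    sum_p = sum_x = sum_y = 0.0
    for p, e in zip(phis, expo):
        w = math.exp(e - shift)       # unnormalized probability
        sum_p += w
        sum_x += w * math.cos(p)
        sum_y += w * math.sin(p)
    phase = math.degrees(math.atan2(sum_y, sum_x)) % 360.0
    fom = math.hypot(sum_x, sum_y) / sum_p
    return phase, fom
```

A symmetric bimodal distribution (for example only C nonzero) gives a figure of merit near zero, which is the numerical expression of the SIR phase ambiguity.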
After repeating the procedures 1-8 for each data set, phase
information from all sets is combined as follows:
9) The scratch files are rewound and read. The first time unique
indices are encountered, they are stored along with FP (or F+), the
restricted phase indicator and the Hendrickson-Lattman coefficients. A
counter is also saved to keep track of the number of data sets
(probability distributions) contributing to each reflection. If the
same reflection is encountered again, the Hendrickson-Lattman
coefficients are added to those already saved and the counter is
incremented.
10) For each unique reflection, the cumulative Hendrickson-Lattman
coefficients are used to generate the combined phase probability
distribution. The distribution is then integrated to yield the "best"
(centroid) phase and associated figure of merit. The computed phase is
then saved, and the number of contributing data sets and restricted
phase indicator are examined. If the reflection is acentric, the number
of data sets contributing to that particular distribution is compared
to N (input value) to decide whether or not to output the reflection.
11) The indices, figure of merit weighted FP (or F+), FP (or F+), best
phase, Hendrickson-Lattman coefficients (for combined distribution),
restricted phase indicator and figure of merit are then output for
each centric reflection and for those acentric reflections passing the
"N" criteria. Figure of merit statistics are then output for the final
phase set. A "difference Fourier coefficient" file is also written
for each data set enabling one to search for additional sites, or to
compare Pattersons "calculated" from the input sites with the
"observed" difference Pattersons. Both difference maps (showing all
heavy atom sites) and "double difference maps" (after subtracting
out the input heavy atoms) can be computed with the same file, as
can the "observed" and "calculated" difference Pattersons.
12) If more than one data set was input, the scratch files are then
rewound and read again to recompute the "phasing power" and "bias" for
each data set. This time however, the phasing power calculations are
based on lack of closure values obtained using the new protein phases.
In theory, for data sets containing only small errors, the phasing
power for each data set should increase relative to its initial value
if the multiple data sets are consistently resolving the phase
ambiguity. Large decreases indicate an inconsistent derivative or lack
of isomorphism beyond a given d spacing, and generally result from
incorrect signs of many isomorphous or anomalous scattering
differences. Usually there will be small decreases observed when
more than 2 or 3 data sets are used. This means that some of the signs
of delta F are inconsistent and is unavoidable with experimental data.
Also, the phasing power is essentially the "signal to noise ratio" for
each data set, thus when it falls below 1.00 the data probably does
more harm than good. A good policy is to truncate each data set at the
resolution where the phasing power falls to about 1.00. The "mean
relative error" M.R.E., defined as (1/N) * SUM ( e(phi)**2 / (2.*E**2) )
where e(phi)**2 is the lack of closure, weighted over all possible
protein phases for each reflection is also output for each data set,
and should be about 0.5 if the E's are properly determined. In
addition, the mean phase "bias" toward heavy atom phases is listed
both as a function of resolution, and overall for each data set. Since
there should be no correlation between true protein and heavy atom
phases, the mean bias should be 90 degrees for each data set. If it
deviates significantly from 90 degrees, one (or possibly more
correlated) data set(s) is/are likely to be dominating the phasing
process, and biasing the results.
13) If more than one data set was input and derivative parameters are
NOT being refined (NOREF=0), the program then starts a second cycle by
updating the E value polynomial coefficients for each set as before,
but this time using probability weighted averages over all possible
protein phase values for each reflection. The updated E values are then
used to recompute Hendrickson-Lattman coefficients for each set. New SIR
or SAS phases are then computed and Figure of merit statistics are
listed for each set separately. The results are then written to new
scratch files. Steps 9-12 are then repeated to produce and evaluate new
combined distributions. Statistics are given as before, but this time
the mean absolute phase shift (in degrees) from the previous cycle is
output as well. Only the results of this final cycle will appear on
the output phase and difference coefficients files. This recycling
procedure generally improves results since phases are based on what are
normally more accurate E values. This is especially true for the
anomalous scattering data sets, since the original E's were estimated
from a small subset of data based on crude (though reasonable)
statistical arguments. The program then terminates.
14) If atomic or scaling parameters ARE being refined (NOREF=1), for
each data set a check is made to determine whether E value polynomial
coefficients have been updated yet for it (as, for example, in a previous
run). If not, new coefficients are determined as in step 13, and new
SIR or SAS phases are computed based on them. If the E coefficients
are updated for ANY set, then all sets are combined again to determine
new protein phases and statistics as before. Once updated polynomial
coefficients are available for each set, and protein phase estimates
have been obtained based on them, refinement of parameters then
proceeds.
The program loops over each set to be refined as follows:
If externally derived protein phases are to be used (NFIXP=1), the
indices, phases, FOM'S (and distribution coefficients if maximum
likelihood refinement is requested) are read in and stored. Otherwise,
protein phases and figures of merit are recomputed using contributions
to the combined phase probability distributions from either ALL data
sets, or from all EXCLUDING the set currently being refined, as
indicated by the user supplied parameter IEXC. For the set being
refined, heavy atom structure factors and derivatives are then
computed, and FPH(calc) (or FPH+(calc), FPH-(calc)) and its
derivatives with respect to the variable parameters are computed,
using the selected protein phases. Contributions to the Cullis and
Kraut R factors are then accumulated. If the current figure of merit
exceeds the input cutoff, the derivatives are included in the buildup
of least squares equations minimizing the weighted lack of closure
with respect to the selected variable parameters. If MAXLIK=0 the
quantity minimized is
SUM [ W*(|FPH|(obs) - |FPH|(calc))**2] for isomorphous or
SUM [ W*((|FPH+|(obs)-|FPH-|(obs)) - (|FPH+|(calc)-|FPH-|(calc)) )**2 ]
for anomalous scattering data sets, respectively, where W is 1./E**2,
1./E'**2 (E' is the RMS E value (based on amplitudes) only for the
contributing data sets), or unity as selected by the user via the
parameter IWT. If MAXLIK=1, instead of computing |FPH|(calc) at the
single value of phi(Protein)=phi(best), the equations above are
modified to include contributions from all possible values of
phi(Protein), with each suitably weighted by the probability associated
with phi(Protein). Thus in the isomorphous case the quantity minimized
becomes
SUM [ W * SUM [ P(i) * (|FPH|(obs) - |FPH|(calc,i))**2 ] ]
where P(i) is the probability of the value of phi(Protein) used in the
calculation of |FPH|(calc,i), and phi(Protein) is stepped over the phase
circle in 5 degree
increments. A similar expression is used in the anomalous case.
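The probability weighted residual for a single reflection can be sketched as follows. This is an illustration only, not the PHASIT code. It assumes that P(i) is obtained from the Hendrickson-Lattman coefficients in the usual way, P(phi) proportional to exp(A*cos(phi) + B*sin(phi) + C*cos(2*phi) + D*sin(2*phi)), and that |FPH|(calc,i) is the modulus of FP*exp(i*phi) + FH. The function and argument names are hypothetical.

```python
import cmath
import math


def hl_probabilities(a, b, c, d, step_deg=5):
    """Return (phases in radians, normalized P(i)) on a grid over the phase circle.

    Assumes the standard Hendrickson-Lattman form of the distribution.
    """
    phis = [math.radians(k) for k in range(0, 360, step_deg)]
    expo = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(expo)                      # subtract the maximum to avoid overflow
    raw = [math.exp(e - top) for e in expo]
    total = sum(raw)
    return phis, [r / total for r in raw]


def weighted_residual(fph_obs, fp_obs, fh_calc, hl, weight=1.0):
    """W * SUM_i P(i) * (|FPH|(obs) - |FPH|(calc,i))**2 for one reflection.

    fh_calc is the complex heavy atom structure factor (an assumption of
    this sketch); hl is the tuple (A, B, C, D).
    """
    phis, probs = hl_probabilities(*hl)
    total = 0.0
    for phi, p in zip(phis, probs):
        fph_calc = abs(fp_obs * cmath.exp(1j * phi) + fh_calc)
        total += p * (fph_obs - fph_calc) ** 2
    return weight * total
```

With a flat distribution (all coefficients zero) every phase contributes equally. With a sharply peaked distribution the sum approaches the single term evaluated at phi(best), which is the MAXLIK=0 case.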
The least squares equations are solved by matrix inversion, and the
parameters are then updated. The following R factors are reported.
R Cullis = SUM | ||FPH|(obs) +/- |FP|(obs)| - |FH|(calc) |
           ------------------------------------------------
                   SUM | |FPH|(obs) +/- |FP|(obs)|
with the sum taken over all centric reflections.
R Kraut = SUM | |FPH|(obs) - |FPH|(calc) |
          ---------------------------------
                   SUM |FPH|(obs)
with the sum taken over all acentric reflections (isomorphous case).
R Kraut = SUM ||FPH+|(obs)-|FPH+|(calc)| + ||FPH-|(obs)-|FPH-|(calc)|
          -----------------------------------------------------------
                       SUM |FPH+|(obs) + |FPH-|(obs)
with the sum taken over all acentric reflections (anomalous case).
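The three R factors above can be written out directly. The sketch below is for illustration and is not taken from the program. The function names are hypothetical, the inputs are plain lists of amplitudes, and the caller is assumed to have already selected the appropriate reflections (centric for R Cullis, acentric for R Kraut). For R Cullis the caller is also assumed to have resolved the +/- choice, supplying ||FPH|(obs) +/- |FP|(obs)| as a single list.

```python
def r_cullis(iso_diff_obs, fh_calc):
    """R Cullis from | |FPH|(obs) +/- |FP|(obs) | and |FH|(calc), centric data."""
    num = sum(abs(d - fh) for d, fh in zip(iso_diff_obs, fh_calc))
    den = sum(iso_diff_obs)
    return num / den


def r_kraut_iso(fph_obs, fph_calc):
    """R Kraut for isomorphous data, acentric reflections."""
    num = sum(abs(o - c) for o, c in zip(fph_obs, fph_calc))
    den = sum(fph_obs)
    return num / den


def r_kraut_ano(fplus_obs, fplus_calc, fminus_obs, fminus_calc):
    """R Kraut for anomalous scattering data, acentric reflections."""
    num = sum(abs(po - pc) + abs(mo - mc)
              for po, pc, mo, mc in zip(fplus_obs, fplus_calc,
                                        fminus_obs, fminus_calc))
    den = sum(po + mo for po, mo in zip(fplus_obs, fminus_obs))
    return num / den
```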
After NHVCYL refinement cycles, the heavy atom structure factors
and R factors are recomputed based on the new parameters. Steps 6-12
are then repeated to generate new protein phases, and the E values are
updated as in step 13. The whole process is repeated for each of the
NPASS passes requested. After each pass, the mean absolute phase shift
over all reflections is output. After the last pass, the protein phase
and difference coefficients files are written, and a new file
NEWPARAMS.INP is created, which is a copy of the original input deck
except that the new heavy atom parameters, scale and E coefficients
replace the original ones. This deck can be used for further refinement
in a subsequent job. Note that within a pass, protein phases are held
fixed (except for possible removal of contributions from the derivative
being refined). They are updated only after the end of each pass, and
even then, only if externally derived phases are NOT being used.
***** NOTES ON PHASE REFINEMENT *****
During phase refinement, one generally excludes contributions to the
protein phase probability distributions from the data set for which
parameters are being refined (IEXC = 0). This is because the assumption
is that the protein phases and heavy atom parameters are independent,
which will not be true if the derivative contributed to the protein
phases. Indeed, it may not be strictly true even if the derivative's
contributions to the protein phases are omitted, if it has heavy atom
sites in common with another derivative that IS contributing. On the
other hand, successful phase refinement of parameters depends on
REASONABLY ACCURATE protein phases being available. This presents a
problem when only a few derivatives are to be used. If protein phase
contributions come from only one derivative (the one not being
refined), then the protein phases are very poorly determined as they
are actually SIR phases. Phase refinement then usually results in
reduction of the FH scale factor and most occupancies. The end result
is a degradation of almost all statistical indicators, but little or no
change in the figure of merit. In this case it may be desirable to
ignore the correlation, and include all contributions to the protein
phase (IEXC=1), which results in stable, although slow refinement. In
that case the expected improvement is usually obtained, but the bias
toward heavy atom phases may be slightly larger than desired. It is
sometimes useful to do this even with 3 or more derivatives.
Also, note that the R Cullis and R Kraut values are dependent on
the current protein phases. Thus if contributions from the set being
refined are excluded, these factors will generally increase as they do
not reflect the final protein phases, but only the phases in use at
the time they were computed. For this reason, it is always desirable
to include all possible contributions (IEXC=1) at least in the last
cycle, just to get the final Cullis and Kraut R factors which
correspond to the MIR phases for publication purposes. The parameter
shifts need not be used.
It is often desirable to read in externally derived protein
phases, and hold them fixed for use in heavy atom parameter refinement.
This could be the case, for example, if the initial parameters are
poorly determined, but a "solvent flattened" and/or "symmetry
averaged" map looks reasonable. In that case, protein phases obtained
from the map (and possibly combined with the original phases) might be
better suited for parameter refinement than the original phases were.
These "EXTERNAL" phases can be input and used during parameter
refinement (NFIXP=1). In that case, the program still computes new
protein phases after each refinement pass for the purpose of updating
statistics, E values and final output, but the phases which were input
are ALWAYS used UNCHANGED during every refinement cycle. The output
phases however, will always correspond to those computed from the
current heavy atom parameters, and can be used to start a new round of
solvent flattening. IT IS STRONGLY SUGGESTED that one always do at
least one round of refinement against solvent flattened phases in
this manner, AND USE THE NEW PARAMETERS TO INITIATE A FINAL ROUND OF
SOLVENT FLATTENING!
An important aspect of phase refinement is that it enables
refinement of the derivative to native scaling parameters. These
parameters should initially be 1. and 0. for SCLFPH and BOVFPH, as
CMBISO or CMBANO has equated the scattering from native and derivative
data sets. While this is adequate for initial heavy atom determination,
it can not be strictly correct as the presence of the additional heavy
atoms MUST increase the scattering for the derivative crystal relative
to that from the native. Thus refinement of the FPH scale factor should
increase it to slightly more than unity, the exact value being limited
by the composition of the native and derivative crystals. If the FPH
scale factor falls below unity, it can not correspond to reality.
There is however, no restriction on the BPH scale factor (which is
actually a delta B, between the native and derivative data sets). Since
the data sets have already been "thermally" scaled (in CMBISO or
CMBANO), refinement of BPH generally results only in small shifts,
which can be positive or negative. Also, note that all changes in the
derivative scaling parameters are TEMPORARILY applied internally in the
program. The input "merged" data files for each set are NOT modified in
any manner, and still correspond to the scaling applied in CMBISO or
CMBANO. The cross-phase Fourier coefficient generating programs MRGDF
and MRGBDF can apply the additional scaling parameters, if desired,
for the purpose of generating difference or cross difference Fouriers
which reflect the new scaling parameters. Also, note that in principle
one can refine both the derivative FPH and FH scale factors
simultaneously, but since they are correlated, in practice this
sometimes leads to poor results. This is particularly the case with
derivative anomalous scattering data. In that case, it may be best to
refine only one of these two parameters in any given cycle, and
alternate refinement of them between cycles. Refinement of the native-
derivative scale factors works best when initiated against FIXED
EXTERNAL PHASES (e.g. solvent flattened and/or NC symmetry averaged).
For maximum likelihood phase refinement one has considerably
more flexibility in the weights and in the figure of merit cutoff.
Since the contributions will be weighted by their probabilities
anyway, one can greatly reduce the figure of merit cutoff, perhaps
even to include all reflections. It might also be useful to then
refine with the "exterior" weights unity (IWT=2) so that the
probabilities will be the only weights applied. During maximum
likelihood refinement there is no need to exclude contributions from
the derivative being refined. Note that maximum likelihood refinement
can also be done with external phases (NFIXP=1). In the program,
although contributions to the matrix (and hence the parameter shifts)
come from all points on the probability distribution, for statistical
purposes the R factors are still reported only while assuming
phi(protein) = phi(best).
In structure factor calculation mode the following events take
place:
1) All atomic parameters are read and checked to ensure that the atom
type is recognized, and that enough storage exists to do the
calculation. If any residues were targeted for rejection, atoms in the
residue have their scattering factors set to zero to effectively
eliminate them from the input list.
2) Each reflection is read in and the corresponding structure factor
is computed based on the atomic parameters input. The indices, FO, FC
and phase are stored, and sums for the least squares calculation of a
scale factor and for computation of the correlation coefficient
between observed and calculated amplitudes are incremented.
3) After all structure factors are generated, the scale factor
relating FO to FC is computed, all FC's are rescaled and an R factor
(based on F) and correlation coefficient are computed.
4) The R factor, correlation coefficient and number of reflections
processed are listed.
5) If both IHLCF=0 and ISIGA=0, the indices, FO, scaled FC and
phase are output for each reflection and the program terminates.
6) If IHLCF=1 and ISIGA=0, the data are sorted, and mean values of
abs(Fo**2 - Fc**2) in various resolution shells are computed. A three
term polynomial is then fit to the delta data as a function of
resolution.
For each reflection, the indices (and phase) are converted, if
needed, to the "standard" asymmetric unit, and the expected value of
abs (Fo**2 - Fc**2) is obtained from the polynomial and is used to
compute Hendrickson-Lattman coefficients for the reflection using
Bricogne's modification of Sim's weighting scheme, i.e.
W = 2 * FO * FC / < | FO**2 - FC**2 | >
                                       sin(theta)/lambda
A = W * COS (Phi calc)
B = W * SIN (Phi calc)
C = 0
D = 0
The distributions are evaluated (to get the figures of merit), and
the indices, Fm*Fo, Fo, Phi, Hendrickson-Lattman coefs, restricted
phase indicator, and Fm are written to the output file. A sum to
compute the mean figure of merit is also updated. The mean figure of
merit is listed, and the program terminates.
7) If ISIGA > 0 the indices and phases are transformed to the
standard asymmetric unit, the data are sorted on resolution, and are
converted to normalized structure factors. Both sigma_A and "D"
values are then computed for each shell as described by Read.
Distribution coefficients are then computed as described above except
that
W = 2. * Sigma_A * Eobs * Ecal / (1. - Sigma_A**2) for acentric data
W = Sigma_A * Eobs * Ecal / (1. - Sigma_A**2) for centric data
The distributions are then evaluated to get the figure of merit, and
coefficients appropriate for conventional electron density, reduced
bias native or reduced bias difference maps are written to the output
file as requested. The mean figure of merit is then reported.
Note that the options (IHLCF=1 and ISIGA=0) or ISIGA=1 are very
useful if one wishes to "solvent flatten" or "average" a map which is
obtained from a model, i.e. a molecular replacement solution, since
it provides an "MIR like" phase file which can be used to "tether"
subsequent phase information to (via BNDRY, option 3), while the
other options are useful for direct examination of maps or to provide
model based phases for phase combination with MIR like information.
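The two weighting schemes used in steps 6 and 7 of the structure factor mode can be sketched as follows. This is a minimal illustration with hypothetical function names, not the PHASIT source. The expected value of abs(Fo**2 - Fc**2) is passed in directly rather than evaluated from the fitted polynomial, and Sigma_A is taken as given rather than estimated per shell.

```python
import math


def sim_weight(fo, fc, expected_abs_diff):
    """W = 2*FO*FC / <|FO**2 - FC**2|> (Bricogne's modification of Sim's scheme).

    expected_abs_diff stands in for the polynomial value at the
    reflection's sin(theta)/lambda.
    """
    return 2.0 * fo * fc / expected_abs_diff


def sigma_a_weight(e_obs, e_calc, sigma_a, centric=False):
    """Sigma_A weight from normalized amplitudes; halved for centric data."""
    w = sigma_a * e_obs * e_calc / (1.0 - sigma_a ** 2)
    return w if centric else 2.0 * w


def hl_from_weight(w, phi_calc_deg):
    """Hendrickson-Lattman coefficients (A, B, C, D) for a unimodal distribution."""
    phi = math.radians(phi_calc_deg)
    return (w * math.cos(phi), w * math.sin(phi), 0.0, 0.0)
```

Because C and D are zero, the resulting distribution has a single peak at the calculated phase, and its sharpness (and hence the figure of merit) is governed entirely by W.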
2.02 BNDRY WRITE-UP
BNDRY is a program to facilitate solvent flattening, negative
density truncation and/or phase combination. If the starting phases
come from a single derivative or single wavelength anomalous
scattering data, then it can be used to carry out Wang's ISIR/ISAS
procedure. If the starting phases come from MIR or any other source,
it can be used for "phase refinement" via solvent flattening and/or
negative density truncation. The program can also be used for phase
extension to higher resolution or to missing reflections within the
input resolution, or for combining partial structure information with
MIR, SIR etc data. BNDRY can be run in one of four modes, with each
mode carrying out a particular function. The mode desired is selected
via an input parameter. The input required, output generated and
tasks performed for each mode are now described.
INPUT CONTROL DATA (UNIT 5)
RECORD 1 PAMFIL (free format)
PAMFIL = Name of file specifying cell parameters,
symmetry information etc.
RECORD 2 IOPT (free format)
IOPT = 0 Convert input Fourier coefficients to those
appropriate for protein-solvent boundary
determination.
= 1 Construct mask map identifying protein and
solvent regions from "smeared" map input.
= 2 Modify electron density map via solvent
leveling and negative density truncation.
= 3 Combine phase information from new source
with input Hendrickson-Lattman representation
of phase probability distributions.
Each run of the program should invoke only one of the options.
Depending on the option called, the following data must also be given.
***** for IOPT=0 *****
RECORD 3 RAD (free format)
RAD = sphere radius, for weighted averaging of density
in boundary determination (typically 2.5-3 times
minimum d spacing)
RECORD 4 INPREF (free format)
INPREF = name of input reflection/phase file. This file
should have been generated by creating a map with
the desired starting phases, setting all density
values (omitting the F000 term) < 0 to 0, and
inverting it.
RECORD 5 OUTREF (free format)
OUTREF = name of output reflection/phase file. A map generated
with this data is "locally averaged", or "smeared",
and is appropriate for protein/solvent boundary
determination.
**** FILES ****
INPREF = (binary file) = Input Fourier coefficients, from inversion of
original electron density after negative
density truncation, as computed by program
MAPINV. One reflection per record
containing H, K, L, FO, FC, PHI where
H, K, L = INTEGERS, FC,PHI=REALS and PHI is
in degrees. FO not used.
OUTREF = (binary file) = Output Fourier coefficients, same form as
INPREF but modified to correspond to
"smeared" map appropriate for boundary
determination.
***** for IOPT=1 *****
RECORD 3 MAPFL1 (free format)
MAPFL1 = Name of input "locally averaged" or "smeared" map
file. This map should have been produced by FSFOUR
from the coefficients in file OUTREF (option 0).
RECORD 4 MSKFIL (free format)
MSKFIL = Name of output mask file.
RECORD 5 PSOL (free format)
PSOL = bulk solvent fraction (by volume, typically 0.3-0.6)
If the bulk solvent fraction is unknown, it can be
estimated reasonably well by the formula
PSOL= 0.97 - (1.22 * Z * Mw)/V
where V is the unit cell volume (cubic angstroms),
Mw is the molecular weight of the protein (Daltons),
and Z is the number of protein molecules of weight
Mw in the unit cell. This assumes a standard value
for protein volume, and that 3% of the solvent is
tightly bound to protein.
**** FILES ****
MAPFL1 = (binary file) = input electron density map ("smeared" map)
as computed by program FSFOUR from
coefficients generated in option 0.
MSKFIL = (binary file) = output mask map file. After one header
record, each subsequent record corresponds
to one input record of the map (as type
BYTE). Values of 2 indicate solvent region,
anything else indicates protein region. Note
that the mask can be displayed/edited by
program MAPVIEW.
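The PSOL estimate given under RECORD 5 is simple to evaluate by hand, but a short sketch is shown here for convenience. The function names are hypothetical. The unit cell volume is computed from the standard triclinic formula, and the numbers in the usage note are made up for illustration and do not describe any real crystal.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms (lengths in angstroms, angles in degrees)."""
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)


def estimate_psol(z, mw, volume):
    """PSOL = 0.97 - (1.22 * Z * Mw) / V, as given for RECORD 5."""
    return 0.97 - (1.22 * z * mw) / volume
```

For example, with a hypothetical orthorhombic cell of 50 x 50 x 100 angstroms (V = 250000), Z = 4 and Mw = 25000, the estimate is 0.97 - 0.488 = 0.482.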
***** for IOPT=2 *****
RECORD 3 MAPFL1 (free format)
MAPFL1 = Name of input electron density map file to be
modified.
RECORD 4 MSKFIL (free format)
MSKFIL = Name of input protein/solvent mask file.
RECORD 5 MAPFL2 (free format)
MAPFL2 = Name of output (modified) electron density map
file.
RECORD 6 SVAL (free format)
SVAL = empirical constant, used to approximate F000/V
(See program description that follows)
Typical values: .060 (3.0 Angstrom data)
.086 (3.5 Angstrom data)
.112 (4.0 Angstrom data)
.250 (6.0 Angstrom data)
**** FILES ****
MAPFL1 = (binary file) = input original (unmodified) electron density
map as generated by program FSFOUR
MSKFIL = (binary file) = input mask map (from run with IOPT=1)
MAPFL2 = (binary file) = output modified electron density map (same
structure as input map)
***** for IOPT=3 *****
RECORD 3 IEXT, DCUT, DAMP, ICMB, IOTYP (free format)
IEXT = 0 For no phase extension.
= 1 To extend phases to additional amplitudes given
on file EXTREF (up to DCUT resolution).
= 2 Same as 1, but phases AND AMPLITUDES also
generated for any other missing reflections up
to DCUT angstrom resolution.
DCUT = d spacing cutoff, in angstroms, for extension.
(note that extension is possible only to the reflection
index range (or to symmetry related reflections)
specified in the input to MAPINV)
DAMP = Damping factor, (in range 0-1.) for weighting
contribution of input probability distribution to
the combined distribution. Usually 1.0, which
applies the true weight. If < 1., will downweight
contribution of original distribution, i.e.
increase relative weight of new (map inversion
or partial structure) distribution to the
combined distribution.
ICMB = 0 To use Bricogne's modification of Sim's weighting
procedure during phase combination.
= 1 To use Read's Sigma_a procedure for weighting
during phase combination.
IOTYP = 0 For normal output, i.e. phase file to contain
FOM*FO and FO in the amplitude slots.
= 1 For modified output, i.e. phase file to contain
FO and FC in the amplitude slots. (This is
used only if one wants to do NC symmetry
averaging on difference or 2FO-FC maps, or
solvent flattening on 2FO-FC maps).
RECORD 4 INPPRB (free format)
INPPRB = Name of input phase probability distribution file.
RECORD 5 INPFC (free format)
INPFC = Name of input calculated structure factor file.
(Obtained from inversion of modified map or computed
from a partial structure).
RECORD 5A EXTREF (free format)
****** include this record ONLY IF IEXT > 0 *****
EXTREF = Name of input file containing additional structure
factor amplitudes (and possibly phase probability distribution
coefficients) for phase extension.
RECORD 6 OUTPRB (free format)
OUTPRB = Name of output phase probability distribution file,
corresponding to combined (and possibly phase
extended) data.
**** FILES ****
INPPRB = (binary file) = Input Fourier and probability distribution
coefficients, one reflection per record
containing H,K,L,FM*FO,FO,PHI,IPRAB,IPRCD,MK,
FOM where H, K, L,IPRAB,IPRCD,MK= integers,
FO,FM*FO,PHI,FOM=real and PHI is in degrees.
This file usually is prepared by program
PHASIT or IMPORT. But if it is a previous
output from BNDRY, then new phase information
can be combined with phases AFTER density
modification instead of with the original
phases. In the latter case IOTYP should have
been set to 0 when the file was originally
created.
INPFC = (binary file) = Input Fourier coefficients (from inversion of
modified map or from partial structure), with
records containing H,K,L,FO,FC,PHI as output
from MAPINV or PHASIT. FO is not used.
EXTREF = (ASCII, free format) = Input reflection file. Each record
should contain H, K, L, FNAT, A_B and
C_D where H, K, L = INTEGERS, FNAT=REAL
and A_B, C_D are INTEGERS. If phase
probability distribution coefficients
are available they are packed two per
word in A_B and C_D as in a normal
"phased" file. If they are not
available A_B and C_D are zero. This
file is needed only if IEXT > 0.
Phases will be extended to these
reflections, subject to DCUT criteria.
This file usually is prepared by
program MISSNG.
OUTPRB = (binary file) = Output Fourier coefficients, after
combination of phase information, with same
structure as on INPPRB, except that the
Hendrickson-Lattman coefficients, phase and
figure of merit correspond to the new phase.
Note that the FO entry will actually contain
FC for reflections which were "amplitude"
extended. If IOTYP=1, then the amplitude
slots in each record will contain FO and FC
instead of FM*FO and FO, enabling different
types of Fourier maps to be computed during
density modification runs, if desired.
WARNING! The IOTYP=1 option is fine if the
output file is to be used ONLY for Fourier
calculations, but it must NOT be used later
in BNDRY or PHASIT as an input file, since
they expect FO to be picked up from the
second amplitude slot!
**** BNDRY PROGRAM STRUCTURE ****
Depending on the value of IOPT, the following events take place.
For IOPT = 0
The input sphere radius RAD is read in along with the current
reflection data and phase information. A unique set of reflections
is selected and the Fourier transform of the weighting function
W = 1-r/RAD with W = 0. for r > RAD is then computed for each
reflection. The transform FS is as follows
A = 4 * pi * RAD * sin(theta)/lambda
FS = 4 * pi * RAD**3 * [ 2 * ( 1 - COS(A) ) - A * SIN(A) ] / A**4
The input structure factor amplitudes are multiplied by FS, and the
modified data is written out. In the original Wang algorithm, the
weighting function was applied by convolution with the electron
density in direct space, after zeroing out negative densities.
Instead, we zero out the negative densities, invert the truncated map
to obtain structure factors, multiply the structure factors by the
transform of the weighting function, and compute a new modified map
from the resulting modified structure factors. This is identical to
the original method except that it is much more efficient,
particularly for large maps and/or large RAD. It also does not require
the time-saving approximation of repeating the procedure twice with
half of the desired RAD as was sometimes needed with the direct space
algorithm. The resulting coefficients will generate a map which is
equivalent to zeroing out negative densities and then taking a
weighted average (with weight W) of all density within RAD angstroms
of each grid point in the input map.
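The transform of the weighting function can be evaluated as in the sketch below, which is an illustration and not the BNDRY source. The function name is hypothetical. The small-A branch is an addition of this sketch: the closed form loses precision as A approaches zero, so a two term series is used there. Its leading term, pi*RAD**3/3, is the integral of W over the sphere.

```python
import math


def smear_transform(rad, stol):
    """FS for W = 1 - r/RAD (W = 0 for r > RAD); stol = sin(theta)/lambda."""
    a = 4.0 * math.pi * rad * stol
    scale = 4.0 * math.pi * rad ** 3
    if a < 0.05:
        # Series of [2(1 - cos A) - A sin A] / A**4 about A = 0:
        # 1/12 - A**2/180 + ...
        return scale * (1.0 / 12.0 - a * a / 180.0)
    return scale * (2.0 * (1.0 - math.cos(a)) - a * math.sin(a)) / a ** 4
```

Each input amplitude is multiplied by the value returned for its sin(theta)/lambda. The multiplier falls off with resolution, so the map computed from the modified coefficients is a smoothed version of the truncated input map.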
For IOPT = 1
The input fractional volume known (or thought) to represent solvent is
read in and converted to the corresponding number of grid points in
the map. The modified electron density map is then read and a
histogram is generated keeping track of how many grid points have
associated electron density of a given value. Starting from the lowest
density value, an electron density threshold RHOCUT is incremented
until the total number of grid points having density less than RHOCUT
equals the number of grid points representing solvent. The modified
electron density map is then rewound and read again, but this time as
each record is read, a "MASK" value is defined for each grid point
depending on whether the corresponding density exceeds RHOCUT. Mask
values of 2 represent the solvent region, anything else the protein
region. The mask map is then output to a file. Note that the mask can
be displayed/edited with program MAPVIEW.
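The threshold selection can be sketched as follows. This illustration replaces the histogram with a sort, which gives the same cutoff for a small in-memory map. It treats the map as a flat list of density values and uses 1 as the protein mask value. The choice of 1 is arbitrary, since the text above specifies only that solvent is marked with 2. The function name is hypothetical.

```python
def solvent_mask(density, psol):
    """Return (RHOCUT, mask) for a flat list of smeared-map density values.

    Grid points with density below RHOCUT are marked 2 (solvent);
    all others are marked 1 (protein, an arbitrary choice in this sketch).
    """
    n_solvent = int(round(psol * len(density)))
    ordered = sorted(density)
    if n_solvent <= 0:
        rhocut = ordered[0]              # nothing falls below the lowest value
    elif n_solvent >= len(ordered):
        rhocut = float("inf")            # everything is solvent
    else:
        rhocut = ordered[n_solvent]      # exactly n_solvent values lie below
    mask = [2 if rho < rhocut else 1 for rho in density]
    return rhocut, mask
```

If several grid points share the density value at the cutoff, the number of solvent points can differ slightly from the requested count.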
For IOPT = 2
An empirical constant SVAL is read in and is used to approximate F000/V
(on scale of input map) from the relationship
   < Rho solvent > + F000/V
  -------------------------- = SVAL
      Rho(Max) + F000/V
An electron density map and a mask map are read in. Using the mask map
to discriminate protein and solvent regions, the mean density in the
solvent region is computed, and the maximum density in the protein
region is determined. From these three quantities F000/V (on scale of
input map) is then estimated from the relationship above. A new
modified map is then constructed such that the electron density at
each grid point is equal to
< Rho solvent > + F000/V if in solvent region.
Maximum of [ (Rho input) + F000/V ] and zero if in protein region.
Thus solvent leveling and negative density truncation are enforced.
This is identical to the procedure in Wang's ISIR programs.
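Solving the SVAL relationship for F000/V gives F000/V = (SVAL * Rho(Max) - < Rho solvent >) / (1 - SVAL). The sketch below applies this and then builds the modified map. It is an illustration with hypothetical names, operating on flat lists rather than on the binary map and mask files, and is not the BNDRY source.

```python
def estimate_f000_over_v(mean_solvent, rho_max, sval):
    """Solve (mean_solvent + x) / (rho_max + x) = SVAL for x = F000/V."""
    return (sval * rho_max - mean_solvent) / (1.0 - sval)


def flatten_map(density, mask, sval):
    """Apply solvent leveling and negative density truncation.

    mask values of 2 mark solvent; anything else marks protein.
    Returns (F000/V estimate, modified density list).
    """
    solvent = [rho for rho, m in zip(density, mask) if m == 2]
    protein = [rho for rho, m in zip(density, mask) if m != 2]
    mean_solvent = sum(solvent) / len(solvent)
    rho_max = max(protein)
    f000v = estimate_f000_over_v(mean_solvent, rho_max, sval)
    modified = []
    for rho, m in zip(density, mask):
        if m == 2:
            modified.append(mean_solvent + f000v)   # level the solvent
        else:
            modified.append(max(rho + f000v, 0.0))  # truncate negative density
    return f000v, modified
```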
For IOPT = 3
Indices, structure factor amplitudes, Hendrickson-Lattman coefficients
and restricted phase indicators are read in and stored for the
original set of phased reflections. If phase extension is requested
then a file containing additional reflections for which amplitudes
(and possibly phase probability distribution coefficients) are
available is also read in and the data stored. Then new computed
structure factors (either from inversion of a modified map or from a
model based calculation) are read in. Indices and phases from the
computed structure factors are transformed (if needed) to the standard
asymmetric unit, systematic absences are rejected, phase restrictions
are determined and the DCUT criterion is imposed (for phase extended
reflections). The new indices are compared with those stored, and if a
match is found the new phases and amplitudes are paired up with the
old. If no match is found and AMPLITUDE extension was requested, the
unpaired reflections are written to a scratch file.
The FC's are then scaled to the FO's by least squares based on the
original phased reflections, and the paired data are then sorted on
resolution and divided into ranges according to sin(theta)/lambda. If
Sim weighting or AMPLITUDE extension is requested the mean
sin(theta)/lambda and mean ABS(FO**2 - FC**2) are computed for each
range, and a three term polynomial in sin(theta)/lambda is fit to the
mean ABS(FO**2-FC**2) by least squares. If Sigma_A weighting is
requested the ranges are then used to determine normalized structure
factor amplitudes from which the Sigma_A values are derived.
For all paired reflections, new Hendrickson-Lattman coefficients for
the map inversion/partial structure data are computed according to
A = W * COS (Phi calc)
B = W * SIN (Phi calc)
C = 0
D = 0
where for Sim weighting
W = 2 * FO * FC / < | FO**2 - FC**2 | >
                                       sin(theta)/lambda
and for Sigma_A weighting
W = 2. * Sigma_A * EO * EC / (1. - Sigma_A**2) for acentric data
W = Sigma_A * EO * EC / (1. -Sigma_A**2) for centric data
Test calculations (on a single model structure) indicated that the
Sigma_A weighting gave slightly worse results when used to combine
solvent flattened and MIR phases, but slightly better results when
combining model based phases with MIR phases. Regardless of weighting,
the new coefficients are combined with any input coefficients.
Prior to phase combination, the original input coefficients are damped
(if desired by the user), to increase the relative contribution of the
newly introduced (map inversion or model based) information. Usually
the damping factor is 1. (no damping), but in cases where phase
combination involves a fairly small (percentage wise) partial
structure fragment, damping the input coefficients can increase the
impact of the partial structure contribution. The combined phase
probability distributions are then evaluated and integrated to yield
"best" (centroid) phases and the associated figure of merit. The
combined phase information is then written to the output file in one
of two user selected formats, and summary statistics are listed which
include R factors, correlation coefficients and mean figures of merit
for original, extended and all reflections. It is important to note
that when phasing with only anomalous scattering data at one
wavelength, phase extension will ALWAYS be required to ensure that
centric reflections get phased. Also, when phase extension is requested
and input reflection data for extension (generated by program MISSNG)
includes probability distributions, then the distributions will
always be combined with those from the map inversion/partial
structure. For extended reflections input without distribution
coefficients, the output phases will correspond exactly to those from
the map inversion/partial structure.
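The combination step for a single acentric reflection can be sketched as follows. This is an illustration under stated assumptions, not the BNDRY source. Damping is assumed to act as a simple multiplier on the input Hendrickson-Lattman coefficients, the combined distribution is assumed to be integrated on a 5 degree grid, and centric reflections (where only two phase values are allowed) are not handled. The function names are hypothetical.

```python
import math


def combine_hl(hl_input, hl_new, damp=1.0):
    """Add new coefficients to the (optionally damped) input coefficients."""
    return tuple(damp * x + y for x, y in zip(hl_input, hl_new))


def centroid_phase(hl, step_deg=5):
    """Return (best phase in degrees, figure of merit) for coefficients (A, B, C, D)."""
    a, b, c, d = hl
    phis = [math.radians(k) for k in range(0, 360, step_deg)]
    expo = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(expo)                      # subtract the maximum to avoid overflow
    w = [math.exp(e - top) for e in expo]
    total = sum(w)
    x = sum(wi * math.cos(p) for wi, p in zip(w, phis)) / total
    y = sum(wi * math.sin(p) for wi, p in zip(w, phis)) / total
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)
```

With DAMP = 1 the two distributions are multiplied at full weight, which is equivalent to adding their coefficients. A smaller DAMP flattens the input distribution so that the new information dominates the centroid.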
If AMPLITUDE extension was requested, the previously written scratch
file is rewound, read and the FC's rescaled (using the previously
determined scale factor). After ensuring that only unique reflections
are selected, these data are also passed to the output file, except
that the figure of merit and HL coefficients are computed using
W = FC * FC / < | FO**2 - FC**2 | >
                                   sin(theta)/lambda
This option is generally used to fill in missing (usually low order)
reflections within the resolution range of the measured data, NOT for
extension to higher resolution. It is usually done only after
convergence is obtained with all other data.
Use of the BNDRY program for density modification, control files and
important considerations are discussed later where sample inputs are
given.
2.03 FSFOUR WRITE-UP
PURPOSE- To calculate three dimensional Fourier transforms (maps)
when given a set of Fourier coefficients and control cards.
This program will calculate maps by using a multivariate variable
radix fast Fourier transform algorithm. The only restrictions are that
the number of grid points along each axis is even, and is a product of
the factors 2, 3, 4, or 5. Each factor can be used more than once.
The program is fully general so that all space groups can be handled.
The input structure factors must fall in the following range:
-NX/2 < h < NX/2
-NY/2 < k < NY/2
-NZ/2 < l < NZ/2
where NX, NY, NZ are the number of grid points along the a, b, and c
axes, respectively. Input structure factors outside the range will be
omitted from the calculation.
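The grid and index restrictions can be checked ahead of time with a few lines of code. The sketch below is an illustration with hypothetical function names, not part of FSFOUR. Since 4 = 2 * 2, testing for the prime factors 2, 3 and 5 covers the factor list above. The search for the next acceptable value simply counts upward and is assumed to match the rounding the program applies to NX, NY and NZ.

```python
def is_acceptable_grid(n):
    """True if n is even and has no prime factors other than 2, 3 and 5."""
    if n < 2 or n % 2 != 0:
        return False
    for f in (2, 3, 5):
        while n % f == 0:
            n //= f
    return n == 1


def next_acceptable_grid(n):
    """Smallest acceptable grid size that is >= n."""
    while not is_acceptable_grid(n):
        n += 1
    return n


def in_index_range(h, k, l, nx, ny, nz):
    """True if (h, k, l) lies strictly inside the range the grid can represent."""
    return (-nx / 2 < h < nx / 2 and
            -ny / 2 < k < ny / 2 and
            -nz / 2 < l < nz / 2)
```

For example, a request for 67 grid points would be raised to 72 (2*2*2*3*3), since 68 and 70 contain the factors 17 and 7.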
INPUT DATA (UNIT 5)
CARD 1 PAMFIL (free format)
PAMFIL = Name of input file containing cell and symmetry
information.
CARD 2 TITLE (free format)
TITLE = anything
CARD 3 NCENT,NX,NY,NZ,MAPTYP,IPRINT,NPIC,NORN,INPF,GSP,DCUT
(free format)
NCENT = 0 for noncentrosymmetric
space groups
= 1 for centrosymmetric
space groups
NX, NY, NZ = number of grid points along the a, b and c
axes, respectively. If an input value is inconsistent
with the factoring scheme, the next largest acceptable
value will be used. If zero, see GSP below.
MAPTYP = Fourier coefficient selection integer
= 1 for FO*exp(i*PHIC)
= 2 for FC*exp(i*PHIC)
= 3 for (FO-FC)*exp(i*PHIC)
= 4 for (2*FO-FC)*exp(i*PHIC)
= 5 for (FO-FC)**2
(difference Pattersons)
= 6 for FO**2
= 7 for FC**2
= 8 for -i*(FH+ - FH-)*exp(i*PHIH+)
(Bijvoet difference Fourier)
= 9 for (3*FO-2*FC)*exp(i*PHIC)
IPRINT = 0 for no printing of map
= 1 for printing of map
NPIC = number of non-hydrogen atoms
in the asymmetric unit. (Not
used within the program but
is passed on to program PSRCH
via the map file. Should not
exceed 140).
NORN = 0 for XZ sections
= 1 for YZ sections
= 2 for XY sections
***** CAUTION *****
If the map file is to be
input to programs PSRCH,
MAPINV, MAPVIEW, GMAP or CTOUR,
NORN must be 0.
INPF = 0 for binary reflection file input.
= 1 for formatted reflection file input.
GSP = Desired grid spacing in angstroms.
Defaults to 1.0, applied only if
NX=NY=NZ=0 to determine number of
grid points along each axis.
DCUT = minimum d spacing cutoff, in
angstroms, for acceptance of
input reflections.
CARD 4 INPREF (free format)
INPREF = Name of input reflection file.
CARD 5 MAPFIL (free format)
MAPFIL = Name of output map file.
CARD 6 LEVEL, (XLIM(I), I=1,3) (free format)
***** this card should be included ONLY if IPRINT is nonzero *****
LEVEL = scan level. Peaks greater than the
scan level will be underlined with **.
If zero, defaults to 100.
XLIM(1), XLIM(2), XLIM(3) = printing limits. The map will be
printed from 0 to XLIM (fractional)
along each axis.
********* NOTES ON THE PROGRAM **********
The input reflection file is terminated by an end of file, and
should contain records with H, K, L, FOBS, FCAL, PHI where the first
three variables are INTEGERS and the remainder REALS. PHI should be in
degrees. The file may either be formatted or binary as indicated by
the parameter INPF. If it is formatted the format is assumed to be
( 3I4, 2F10.2, F7.2). If the input file contains records with
H,K,L,FPH,FP,PHI then MAPTYP=5 can be used to compute isomorphous
difference Pattersons (PHI is not used). If the input records contain
H,K,L,F(H,K,L),F(-H,-K,-L),PHI(H,K,L), then MAPTYP=5 will compute
anomalous difference Pattersons, and MAPTYP=8 can be used to
compute Bijvoet difference Fouriers. Note that if a binary file is
input each record must contain six words even if all of them are not
used in the calculation, i.e. PHI is not needed if MAPTYP=5,6 or 7,
but some value still must be supplied.
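
As an illustration of the formatted record layout (3I4,2F10.2,F7.2),
the following Python sketch writes and reads one reflection record.
It is not part of the PHASES package and the function names are
hypothetical. Unlike a FORTRAN formatted read, this sketch does not
treat blank fields as zero.

```python
def format_reflection(h, k, l, fobs, fcal, phi):
    """Lay out one record as FORMAT(3I4,2F10.2,F7.2), 39 columns wide."""
    return "%4d%4d%4d%10.2f%10.2f%7.2f" % (h, k, l, fobs, fcal, phi)


def parse_reflection(line):
    """Split a fixed-width record into (h, k, l, fobs, fcal, phi)."""
    line = line.rstrip("\n").ljust(39)
    h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
    fobs = float(line[12:22])   # columns 13-22
    fcal = float(line[22:32])   # columns 23-32
    phi = float(line[32:39])    # columns 33-39, degrees
    return h, k, l, fobs, fcal, phi
```

Indices outside the range -999 to 9999, or amplitudes too large for
F10.2, will not fit the fixed-width fields and should be checked
before the file is written.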
The output map file is binary and contains NSYM + 2 header records
followed by the map. If NORN = 0, the map is written such that each
record contains NX density values (one row along x), with NZ
consecutive records constituting each section of constant y, i.e. y is
slowest varying. If NORN = 1, the positions of x and y are
interchanged. If NORN = 2, the positions of y and z are interchanged.
All map values are integers scaled as described below. When NORN=0
the map file is suitable for input to programs PSRCH for locating
peaks, to MAPINV for modification followed by inversion, to MAPVIEW
for interactive contouring and display, to GMAP for conversion to
TOM/O or CHAIN formats and for creation of skeletons, or to CTOUR to
create hard copies of contoured plots. If NORN is nonzero the only
recourse is to print the map within this program.
***** SCALING THE DATA *****
Two scales are used, one for the binary map file and one for the
printed map output, if requested. If the input coefficients were on an
absolute scale, then the absolute electron density is obtained as
follows:
rho (absolute) = 10.*(printed map value)/(V*scale) + F000/V
rho (absolute) = (value on binary map output file)/(V*scale) + F000/V
where V is the unit cell volume and scale is given on the lineprinter
output. F000 is the total number of electrons in the unit cell. Note
that even if F000 is supplied on the input file, it will not be used
in the program, and must be added as indicated above. Also note that
the PRINTED output is limited to two digits per density value, but the
density is NOT rescaled to a maximum of 99. This means that values of
99 merely imply a density of AT LEAST 99.
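
The two conversions above can be written as a short Python sketch.
This is illustrative only; the function and argument names are
hypothetical, and the caller must supply V, the scale from the
lineprinter output, and F000.

```python
def rho_from_binary(value, volume, scale, f000):
    """Absolute density from a value on the binary map file."""
    return value / (volume * scale) + f000 / volume


def rho_from_printed(value, volume, scale, f000):
    """Absolute density from a printed map value.

    A printed value of 99 gives only a lower bound, since the printed
    map is truncated at two digits.
    """
    return 10.0 * value / (volume * scale) + f000 / volume
```

Both functions assume the input coefficients were on an absolute
scale, as stated above.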
***** BIJVOET DIFFERENCE FOURIERS *****
When MAPTYP=8 is selected, and the input reflection file contains
records with H, K, L, F(H,K,L), F(-H,-K,-L), PHI(H,K,L) then a
"Bijvoet difference Fourier" will be computed. In this case the map
consists of only the "imaginary" part of the electron density, and
should show strong positive peaks only at the sites of anomalous
scatterers (if the hand is correct). The multiplication factor -i is
applied only after expansion to a hemisphere to effectively
interchange real and imaginary parts of the density, as the program
would normally only compute the "real" part.
***** FILES *****
INPREF - Input Fourier coefficient file, can be either formatted or
binary as determined by input parameter INPF. Records
should contain h, k, l, Fobs, Fcal, Phi with h,k,l
INTEGERS and Fobs, Fcal, Phi REALS. Phi is in degrees. If
INPF=1, then file should be formatted with
FORMAT(3I4,2F10.2,F7.2)
MAPFIL - Binary map file output. Contains NSYM+2 header records
followed by the map, as described earlier.
2.04 MAPINV WRITE-UP
PURPOSE- To calculate three dimensional Fourier coefficients
(structure factors) when given an electron density
map and control cards.
This program will calculate structure factors by using a
multivariate variable radix fast Fourier transform algorithm to invert
an electron density map. The program is fully general so that all
space groups can be handled. It is assumed that the input map was
prepared by program FSFOUR. Structure factors may be calculated for
reflections in the following range:
h .ge. -NX/2 and h .lt. NX/2
k .ge. -NY/2 and k .lt. NY/2
l .ge. 0 and l .lt. NZ/2
where NX, NY, NZ are the number of grid points along the a, b, and c
axes, respectively, in the input map.
INPUT DATA (UNIT 5)
CARD 1 PAMFIL (free format)
PAMFIL = Name of input file containing cell and symmetry
parameters.
CARD 2 TITLE (free format)
TITLE = anything
CARD 3 MAPFIL (free format)
MAPFIL = Name of input map file.
CARD 4 SFOUT (free format)
SFOUT = Name of output structure factor file.
CARD 5 IPRNT, IPAIR, HMIN, HMAX, KMIN, KMAX, LMAX (free format)
IPRNT = 0 for no printing of structure
factors.
= 1 for printout
IPAIR = 0 for no pairing of calculated
structure factors with observed
data.
= 1 to combine calculated structure
factors with observed data
(supplied on auxiliary file) and
output R factor to the line printer
= 2 same as 1, but a separate file with
the combined data is also written.
HMIN, HMAX, KMIN, KMAX, LMAX = limiting values defining range of
indices for which structure factors
will be calculated
(LMIN is always 0)
CARD 5A AUXINP (free format)
***** This card should be included ONLY if IPAIR > 0 *****
AUXINP = Name of input file containing auxiliary structure
factors for scaling.
CARD 5B AUXOUT (free format)
***** This card should be included ONLY if IPAIR = 2 *****
AUXOUT = Name of output file to contain calculated structure
factors scaled to (and paired with) the auxiliary
structure factors.
CARD 6 SC, F000, IMOD, IRHOMN (free format)
SC = scale factor applied to calculated
structure factors (see below).
If 0. defaults to 1.
F000 = total number of electrons in the unit
cell (see below).
IMOD = 0 for no modification of map prior to
transformation
= 1 to modify map prior to transformation
according to input criterion
=-1 same as 1 but the resulting density
is also squared prior to transformation.
IRHOMN = modification criterion (applied if
IMOD .ne.0). If (rho input + IRHOMN) < 0,
rho will be reset to 0. If F000 is
supplied, IRHOMN is automatically set to
correspond to non-negativity of electron
density.
********** NOTES ON THE PROGRAM **********
The input map file MAPFIL is assumed to have been generated with
program FSFOUR. It is binary, terminated with an end of file, and
after a few header records, contains the electron density map
represented as records (of integers) along x. y is the slowest varying
coordinate.
All calculated structure factors within the index range specified will
be output to file SFOUT. Note that this may include redundant
(symmetry related) as well as systematically absent reflections, if
they fall within the specified index range. The output file is
binary, with records of H,K,L,FCALC,FCALC,PHI and is terminated by an
end of file. H,K and L are INTEGERS whereas FCALC and PHI are REALS.
PHI is in degrees. Note that FCALC is duplicated within each record so
that the file structure is consistent with the input required by
program FSFOUR.
If IPAIR > 0, then in addition to file SFOUT, an input file AUXINP of
observed structure factor amplitudes will be paired with the
corresponding calculated amplitudes and phases, and the combined data
used in one cycle of least squares refinement of a scale factor. The
resulting R factor between observed and calculated amplitudes is then
output to the lineprinter. Note that the input reflection data on file
AUXINP is restricted to the same range of indices as the calculated
data. If input values fall out of bounds they will be ignored.
Therefore, if data were collected with L negative, it will have to be
transformed by symmetry before it can be used successfully on file
AUXINP.
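
One simple way to bring observed amplitudes into the required range is
to replace a reflection with negative L by its Friedel mate. The
Python sketch below is illustrative only and is not part of the PHASES
package; the function names are hypothetical. It assumes that
|F(h,k,l)| = |F(-h,-k,-l)|, i.e. that anomalous differences are being
ignored. Other space group symmetry operators could be used instead.

```python
def to_mapinv_hemisphere(h, k, l):
    """Return indices with l >= 0, using the Friedel mate if needed."""
    if l < 0:
        return -h, -k, -l
    return h, k, l


def in_mapinv_range(h, k, l, nx, ny, nz):
    """True if h, k, l fall in the range MAPINV can calculate."""
    return (-nx / 2 <= h < nx / 2 and
            -ny / 2 <= k < ny / 2 and
            0 <= l < nz / 2)
```

Only the amplitude carries over unchanged under this transformation;
a phase would change sign.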
If IPAIR = 2, the results are identical to those obtained with IPAIR =
1, except that the combined (and rescaled) data is also output to a
separate file AUXOUT. The new file is of the same form as SFOUT, but
with records consisting of H,K,L,FOBS,FCALC,PHICALC for only those
reflections which were input on file AUXINP.
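
The scale factor and R factor calculation can be sketched in Python as
follows. The exact expressions used inside MAPINV are not given in
this write-up, so the conventional definitions are assumed here: a
linear least squares scale k = sum(FO*FC)/sum(FC*FC) applied to the
calculated amplitudes, and R = sum|FO - k*FC| / sum(FO). The function
name is hypothetical.

```python
def scale_and_r(fobs, fcalc):
    """Return (k, R) for paired observed and calculated amplitudes."""
    num = sum(fo * fc for fo, fc in zip(fobs, fcalc))
    den = sum(fc * fc for fc in fcalc)
    k = num / den                      # scales FCALC onto FOBS
    diff = sum(abs(fo - k * fc) for fo, fc in zip(fobs, fcalc))
    return k, diff / sum(fobs)
```

Because the scale enters linearly, a single least squares cycle gives
the final value under these assumptions.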
***** SCALING THE DATA *****
It is often desirable to control the scale of the calculated structure
factors. If the input electron density map was generated from
structure factors which are related to an absolute scale by:
F(input to FSFOUR) = k * F(absolute)
then k should be input for SC to obtain calculated structure factors
on an absolute scale. If SC = 0. (or 1.), then the calculated structure
factors will be on the same scale as those used to generate the map
(unless IMOD = -1, in which case they will be much larger). Note that
this scaling applies only to the output on file SFOUT (and the
lineprinter, if IPRNT .ne.0). If IPAIR .eq. 2, then the calculated
structure factors on file AUXOUT will always be scaled for best
agreement with those supplied on file AUXINP.
***** MODIFYING THE MAP *****
The following applies only if IMOD .ne. 0. Inclusion of F000 will
result in imposing non-negativity of electron density everywhere in
the map prior to inversion, provided SC is reasonably well known. If
SC is unknown, then F000 and SC on card 6 should be zero and IRHOMN
should be input to control the type and degree of modification.
Intelligent use of this parameter would then require knowledge of the
input map values prior to running the job. IRHOMN should be equal to
F000/V on the same scale as the input map. If IMOD = -1, IRHOMN is
first added to each density value, resulting values below zero are set
to zero, and each value is then squared prior to Fourier
transformation. This is equivalent to imposing non-negativity,
followed by one cycle through the tangent formula. Phases can
therefore be tangent formula refined or extended.
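
The modification described above can be sketched in Python as follows.
This is illustrative only and the function name is hypothetical. The
write-up does not state whether, for IMOD = 1, the shifted value or
the original value is kept for points that pass the test; this sketch
assumes the shifted value is kept in both cases.

```python
def modify_density(values, irhomn, imod):
    """Apply the IMOD/IRHOMN modification to a list of map values."""
    if imod == 0:
        return list(values)            # no modification
    # Add IRHOMN, then reset anything below zero to zero.
    shifted = [max(v + irhomn, 0) for v in values]
    if imod == -1:
        return [v * v for v in shifted]  # square before transformation
    return shifted
```

Squaring makes the values much larger, which is why the calculated
structure factors are no longer on the scale of the input map when
IMOD = -1.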
***** FILE REQUIREMENTS *****
MAPFIL - input map file, binary, as output by program FSFOUR
SFOUT - output file with all calculated structure factors,
binary, six word records as described earlier
AUXINP - auxiliary input structure factor file (required only
if IPAIR .ne. 0), binary, six word records in same form
as SFOUT (only H,K,L and FOBS are used)
AUXOUT - output auxiliary structure factor file, binary, six
word records as described earlier.
2.05 PAMFILE WRITE-UP
This is not a program, but rather a description of a "standard
parameter file" which is read by all programs in the PHASES package,
and several auxiliary programs as well. The main purpose of this file
is to ensure consistency in cell constants, symmetry, lattice type, etc.
throughout all programs, and to eliminate redundant input of these
parameters by the user. In addition one can optionally specify the
name of a "running log file." If this is done then in addition to
normal output to either the screen or individual log files for each
program, all printed output is also appended to a single file,
preceded by a time stamp indicating what program was run and when.
Thus one can maintain a complete history of all computations and
results in a single log file. The standard parameter file is often
referred to generically in program write-ups as "PAMFIL." One should
select a name for it which is indicative of the particular structure
being worked on, and rapidly communicates to the user that it is a
parameter file. For example, PDC.PAM might be a good choice for
phasing pyruvate decarboxylase.
Each standard parameter file should contain the following
information in the indicated sequence.
LOGFILE=FILENAME Where FILENAME is the name of the
desired "running" log file. If no
cumulative log is desired, enter
LOGFILE=NULL
There must be no spaces immediately
preceding or following the "=". Upper
or lower case is permitted.
LATTICE=X Where "X" is either P,A,B,C,I,F or R
There must be no spaces immediately
preceding or following the "=". Upper
or lower case is permitted for the word
LATTICE, but only UPPER case for the
single character symbol.
A, B, C, ALPHA, BETA, GAMMA Unit cell constants, in angstroms and
degrees. Readable in free format, i.e.
at least one blank or comma separating
entries.
NSYM Number of equivalent positions in the
space group. Do NOT include
additional translations associated
with centering conditions for
non-primitive lattices, i.e. for
space group C2 NSYM=2. (this entry
read in free format).
The NSYM symmetry operators follow, one operator per line EXACTLY
as indicated in the International Tables for X-Ray Crystallography.
The first operator should ALWAYS be X,Y,Z. Note that for rhombohedral
lattices the HEXAGONAL AXES AND SYMMETRY OPERATORS SHOULD BE USED,
along with the lattice type R.
The following sample serves as a complete template for a parameter
file, for space group P2(1)2(1)2(1)
LOGFILE=seb.rlog
LATTICE=P
45.331 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
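
The layout of the parameter file can be checked with a short script
before it is used. The Python sketch below is illustrative only and is
not part of the PHASES package; the function name and the returned
dictionary keys are hypothetical. It assumes blank lines may be
skipped, which the programs themselves may not allow.

```python
def parse_pamfile(text):
    """Parse the text of a standard parameter file into a dictionary."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    key, _, logfile = lines[0].partition("=")
    if key.upper() != "LOGFILE":
        raise ValueError("line 1 must be LOGFILE=name")
    key, _, lattice = lines[1].partition("=")
    # The lattice symbol itself must be upper case.
    if key.upper() != "LATTICE" or lattice not in (
            "P", "A", "B", "C", "I", "F", "R"):
        raise ValueError("line 2 must be LATTICE=X, X in P,A,B,C,I,F,R")
    # Cell constants are free format: blanks or commas separate entries.
    cell = [float(x) for x in lines[2].replace(",", " ").split()]
    if len(cell) != 6:
        raise ValueError("expected six unit cell constants")
    nsym = int(lines[3].replace(",", " ").split()[0])
    operators = lines[4:4 + nsym]
    if len(operators) != nsym:
        raise ValueError("fewer symmetry operators than NSYM")
    if operators[0].replace(" ", "").upper() != "X,Y,Z":
        raise ValueError("first operator must be X,Y,Z")
    return {
        "logfile": None if logfile.upper() == "NULL" else logfile,
        "lattice": lattice,
        "cell": cell,
        "nsym": nsym,
        "operators": operators,
    }
```

Applied to the template above, this returns lattice "P", the six cell
constants, NSYM = 4 and the four operators in the order given.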
2.06 MAPVIEW WRITE-UP
MAPVIEW is an interactive program to contour and display electron
density map sections, display mask sections and to facilitate
construction of one or more "molecular masks", by allowing one to
interactively "trace out" envelope boundaries with a cursor tied to a
mouse. The selected map (and mask) regions may be output for use in
other programs, or simply displayed on a workstation monitor. The
program is extremely useful for examining ANY map, be it an electron
density, difference density or Patterson map, but it is essential for
creation of molecular masks for use in noncrystallographic symmetry
averaging. The program functions interactively, and must be run on a
workstation with a color monitor. Two program versions are available,
one specific for Silicon Graphics systems which uses the GL, and another
(called mapview_X) which uses X-window graphics. The X-window version
also functions on SGI hardware, but the original mapview runs only
on SGI systems. If running mapview on an SGI, be sure to initiate the
program from a WINTERM window (as opposed to from an XTERM window).
mapview_X can be initiated from either window type. Both programs use
only the left and middle mouse buttons to accept user input (and the
keyboard). When either program is started up, the following sequence
of events takes place.
***** MAP SELECTION *****
The program first prompts the user for the name of a "map" file, and
inquires whether or not it is a "FSFOUR" type file. Usually it will be
a FSFOUR file (which runs over the whole unit cell), but if not, one
can input a map as an "averaging" style file (e.g. xz sections, as
generated by EXTRMAP,MAPORTH,MAPAVG,SKEW or the saved output from an
earlier run of this program).
***** MASK SELECTION *****
The program then prompts to find out whether a mask will be created or
used. If one simply wants to look at contoured maps it won't, but to
create/examine/edit masks for noncrystallographic symmetry averaging
purposes or to look at solvent boundary maps answer yes. If yes, the
program prompts as to whether a previously created mask file should be
used, and if so, for the file name.
***** MAP REGION SELECTION INFORMATION *****
If a "FSFOUR" style map was input, the program then prompts for the
minimum and maximum coordinates in each direction, to define the
region to be displayed. Any values are allowed (including multiple
cells, i.e. the range can span one or more cell edges, as any needed
unit cell translations will be done automatically). The program then
prompts for the desired section orientation, i.e. xz, xy or yz
sections. Note that if a previously created mask file is recovered,
then only xz sections can be used, and the range selected MUST cover
EXACTLY that used when the mask was created. For looking at the
solvent masks created by BNDRY this means one must select the entire
cell, i.e. x,y,z all ranging from 0. to 0.999, although program EXTRMSK
can be used first if a different region must be chosen. If an
"averaging" style map was input this section is bypassed as only xz
sections are possible, and the range will cover precisely that in the
input map.
***** SECTION SELECTION INFORMATION *****
The program will then inform the user as to how many map sections are
present, how they are numbered, and inquire which section should be
displayed initially.
***** CONTOUR LEVEL INFORMATION *****
The program then informs the user what the minimum, maximum and sigma
values for the entire input electron density map are. It then prompts
for minimum, maximum and increment values for contour levels. The
program then contours and displays the requested section, and enters
interactive mode.
*************** I N T E R A C T I V E M O D E ***************
When in interactive mode, a menu is displayed and all subsequent
actions are requested by the user pressing the left mouse button while
the cursor is in the desired menu option area. The following actions
then take place, when the appropriate item is selected.
***** NEXT SECTION OPTION *****
Replaces the display with the contoured map corresponding to the next
(adjacent, higher) map section
***** PREV SECTION OPTION *****
Replaces the display with the contoured map corresponding to the
previous (adjacent, lower) map section
***** NEW SECTION OPTION *****
Prompts user for section to be displayed. It lets the user "jump
around" the map, rather than having to scroll up or down with
multiple "prev" or "next" requests.
***** C LEVEL OPTION *****
Prompts user for new contour level information, then recontours and
displays the currently selected section.
***** NEW DIRECTION OPTION *****
Allows user to reselect display region range and/or map orientation.
Allowed only when FSFOUR style maps are in use.
***** ADD NEXT SECTION OPTION *****
Contours and displays next (adjacent, higher) section, but doesn't
clear old one first, i.e. can be used to create projections, since the
contoured sections accumulate one on top of another on the screen.
Note!! There is no "shifting" as sections are added, thus projections
are strictly true only when viewed down an axis orthogonal to the
section. (When only a small number of sections are accumulated it is
reasonably valid even for nonorthogonal systems, as long as the cell
angles are not extreme).
***** ADD PREV SECTION OPTION *****
Same as "ADD NEXT SECTION" option, but the previous (adjacent, lower)
section is contoured and added. Same considerations as above.
***** SET MASK NO. OPTION *****
Active only when masks are in use. Allows user to select which mask
number (from 1 to 12) is to be used during mask tracing via "TRACE
MASK" option. This allows one to create multiple molecular envelope
masks, usually one for each molecule within the asymmetric unit. The
currently active mask no. will be displayed at the bottom right of the
screen, along with a sample of the color assigned to that mask.
***** EXIT OPTION *****
If the orientation is not such that xz sections are in use, the program
simply terminates. If xz sections are in use, the program prompts as
to whether the map region selected should be saved, and if masks are
in use, whether the masks should be saved. If either maps or masks are
to be saved, the user is informed as to what region is currently
available. If masks are in use and the "MAKE ASU" option had
previously been selected, the user is also informed about the ranges
delineating the "molecule" as determined from the mask. The user is
then prompted as to what region should be saved. The saved region can
only be the entire region currently available, or a subset of it. The
user is then prompted for file names for each saved file. The "saved"
map file is in "AVERAGING" style format, and thus can be read back in
to the program later, or can be input to the averaging programs. The
mask file can also be read back and/or used with the averaging
programs.
***** CLEAR OPTION *****
Clears the screen and redraws the current section, with all parameters
unchanged. Useful if one wants to start over and recreate a mask for a
specific section, if not happy with the original choice. Also, if the
system is very busy, occasionally switching from an open textport back
to interactive mode can leave remnants of the textport on the screen.
This option can then be used to correct the display.
***** SAVE IMAGE OPTION *****
Pressing the left mouse button while in the "SAVE IMAGE" menu area
will save the entire screen contents as an "image" file, with the name
"MAPV_N.RGB", where N is a one or two digit number. Numbers start from
zero and are automatically incremented each time an image is saved. Up
to 100 images can be made in any job. Note!! This function is only
operational in the SGI version.
The following additional menu options are functional only when
masks are in use
***** SHOW MASK OPTION *****
Displays "masks" for current section (if available). Each masked grid
point is displayed as a colored dot superimposed on the contoured
section. Unassigned points (outside all molecular envelopes) are shown
in blue. Points inside molecular envelopes are shown with the color of
the particular envelope mask (with black, i.e. no color for mask 1). If
the "MAKE ASU" option was run, points within molecular envelopes which
are redundant (by crystal symmetry) are shown in red. Points outside
molecular envelopes, but related to those within molecular envelopes
by crystal symmetry are shown in green.
***** MAKE ASU OPTION *****
Should be invoked after all needed mask sections are created. Will
prompt the user for the standard parameter file specifying the
symmetry information. The program will then examine each point within
all molecular envelopes, and check for redundant (by crystallographic
symmetry) entries. Mask values for redundant points will then be
changed. After using this option, the "SHOW MASK" option will flag
redundant points with the color red. If one is happy with the
particular asymmetric unit "retained" (not red), then one can exit as
the averaging programs will recognize the updated mask value as
redundant and ignore it. If however, one feels a symmetry related part
of the redundant area should have been retained instead, then the
masks should be recreated for the offending sections. After examining
all sections, the total number of redundant points, and unique points
within all envelopes is output. After identifying redundant points,
the user is prompted as to whether points related by symmetry to those
within the envelopes should also be identified. If so, the "SHOW MASK"
option will color all grid points related to those within the
envelopes green. This is useful to ensure that all density is
accounted for, and to emphasize intermolecular borders. Note that the
"MAKE ASU" option SHOULD NOT be used with SKEWED maps. (If a mask
is created on a "skewed" map, program TRNMSK can convert it to
correspond to normal sections, so that this option can still be used,
provided the original extracted map used to create the skewed map
is input as well).
***** COPY PREV MASK OPTION *****
Copies mask for previous (adjacent, lower) section to current section,
and displays it. It is automatically saved, but one can recreate it
with the "TRACE MASK" option, if desired.
***** COPY NEXT MASK OPTION *****
Copies mask for next (adjacent, higher) section to current section,
and displays it. It is automatically saved, but one can recreate it
with the "TRACE MASK" option, if desired.
***** TRACE MASK OPTION *****
Invoking this option allows the cursor and mouse buttons to be used to
trace out the boundary for one or more electron density "islands" in
order to isolate individual molecular envelope or averaging
boundaries. Press the LEFT mouse button to invoke the option. Then
move cursor into map region, and press the LEFT mouse button once at a
point on the boundary. Move to new point on boundary and press again.
A blue line will be drawn connecting to the previous point. Continue
selecting points all along the boundary until you connect back to the
initial point. If other "islands" are needed, move the cursor to a
point on the boundary of the next "island", and press the MIDDLE mouse
button. As before, continue selecting points around the new boundary
with the LEFT mouse button until connected to the initial point for
this "island". Repeat as often as needed, starting each new island
with the MIDDLE mouse button. When the section is completed, move the
cursor back to the "TRACE MASK" menu area and press the LEFT button
again to let the program know you are done. The mask section will then
be "filled" such that a blue dot will be placed at each grid point
outside the envelope(s) to show you the mask you created. Note that
such a mask must be created for each section which contains part of
the envelopes. Note also that the mask creation is done for only one
section at a time, thus if multiple sections are displayed (e.g. "ADD
NEXT SEC" or "ADD PREV SEC" options were used), the created mask is
only for the last (most recent) section. This will always correspond
to the section indicated at the top of the plot area. Also be aware
that the mask created for each section is initialized (with respect to
the mask no. currently active) each time the option is invoked
(although the screen display will not be redrawn). Thus once invoked
you must complete the mask for that section. Once the process is
completed (area "filled"), the mask is saved for that section, and can
be displayed at a later time with the "SHOW MASK" option. You can then
use the "SET MASK NO." option to select a different mask, and use
"TRACE MASK" again on the same section. In this way multiple mask
envelopes (with each encompassing a different molecule) can be
created, which will be needed for averaging if the noncrystallographic
symmetry is not purely rotational or the rotational symmetry is not
N-fold where N is an integer.
HINTS!!! When pure rotational noncrystallographic symmetry is present,
the boundary is much easier to determine if the mask is constructed on
a "skewed" map such that sections are orthogonal to the NC symmetry
axis. Also, it is much clearer to see if three or four sections are
accumulated via "ADD NEXT" or "ADD PREV" commands. A useful procedure
therefore is to accumulate 4 sections and create an initial mask, then
select "PREV SECTION" twice, followed by "ADD NEXT SECTION" three times.
This will position you one section higher than the first mask, but
with a projection over 4 sections visible. The same thing can be done
in the reverse direction with the related commands. You can do this
repeatedly as you step through the map. Also, often the density
changes slowly, so you may be able to simply copy the previous or next
section's mask, and use it unchanged. If you do such a copy but then
decide to edit the mask (via TRACE MASK option), the new mask will
overwrite the old, but the visible display will reflect the
superposition of both old and new masks, and will be confusing. To
convince yourself that the new one is in fact used, just select
"CLEAR" followed by "SHOW MASK", to see the actual mask saved. Note
that when pure rotational symmetry (N-fold) is involved, only one mask
is needed which encompasses all related molecules. In that case the
mask itself should generally display the same symmetry as the contents
of the envelope, thus masks over noncrystallographic twofold symmetry
regions should also look like they have twofold symmetry, and only
density which at least approximately obeys the expected NC symmetry
should be encompassed by the mask.
************** FILE FORMATS ***************
FSFOUR style map files are described in the FSFOUR write-up
If a map and/or mask file is generated by the program (or a non-FSFOUR
style map is to be read in), the following formats apply.
***** "AVERAGED" style map file (binary) format *****
RECORD 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in angstroms,
angles in degrees.
NX, NY, NZ = Number of grid points defining one "cell length" along
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 REAL*4 values) along x,
starting at IXMN. y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
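
For readers who wish to inspect such a file outside the package, the
following Python sketch decodes the header and map records from the
raw bytes of the file. It is illustrative only and the function names
are hypothetical. It assumes each FORTRAN unformatted record is framed
by a 4-byte length marker before and after the data, and that the byte
order is little-endian; both depend on the compiler and machine that
wrote the file.

```python
import struct


def _records(data, endian):
    """Yield the payload of each sequential unformatted record."""
    pos = 0
    while pos < len(data):
        (n,) = struct.unpack_from(endian + "i", data, pos)
        payload = data[pos + 4:pos + 4 + n]
        (tail,) = struct.unpack_from(endian + "i", data, pos + 4 + n)
        if tail != n:
            raise ValueError("record markers disagree")
        yield payload
        pos += n + 8


def parse_averaging_map(data, endian="<"):
    """Return (cell, grid, limits, rho) from the bytes of a map file.

    rho maps (IY, IZ) to the row of density values along x, IXMN first.
    """
    recs = _records(data, endian)
    hdr = struct.unpack(endian + "6f9i", next(recs))
    cell, grid = hdr[:6], hdr[6:9]
    ixmn, iymn, izmn, ixmx, iymx, izmx = hdr[9:]
    ncol = ixmx - ixmn + 1
    rho = {}
    for iy in range(iymn, iymx + 1):        # y is slowest varying
        for iz in range(izmn, izmx + 1):
            rho[(iy, iz)] = struct.unpack(endian + "%df" % ncol, next(recs))
    return cell, grid, (ixmn, iymn, izmn, ixmx, iymx, izmx), rho
```

If the file was written on a big-endian machine, pass endian=">"
instead.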
***** all mask files (binary) format *****
Header record identical to that in "AVERAGED" style map file.
Mask records similar to "AVERAGED" style map records except that
the mask values are written as FORTRAN type "BYTE" (INTEGER*1).
Mask values which are 0, 10, 20, 30, 40 etc. indicate that the corresponding
grid point is part of the envelope for masks 1,2,3,4,5 etc,
respectively. All other values indicate the points do not belong
to any mask envelope, and should not be used for refinement or
averaging.
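
The mask value convention can be expressed as a small Python sketch.
This is illustrative only and the function name is hypothetical. The
limit of 12 masks is taken from the "SET MASK NO." option described
earlier and is an assumption as far as the file format is concerned.

```python
def mask_number(value, max_masks=12):
    """Return the mask number (1, 2, 3, ...) for a mask byte, or None.

    Values 0, 10, 20, ... belong to masks 1, 2, 3, ...; any other
    value means the grid point is outside every envelope.
    """
    if value >= 0 and value % 10 == 0:
        number = value // 10 + 1
        if number <= max_masks:
            return number
    return None
```

Note that a value of 0 means the point belongs to mask 1, not that it
is unassigned.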
2.07 GMAP WRITE-UP
Gmap (Graphics Map) can be used to create map and skeleton files
for use with external graphics programs. The program is interactive
and prompts for all information. It reads a standard FSFOUR map and
prompts the user to provide range information (fractional coordinate
limits) for the region to be extracted. Any region is valid, including
positive and negative coordinates, and spanning of unit cells. It
prompts for the names of the input and output map files, and the type
of output file desired. Currently one can output one of two map file
types: TOM/O or CHAIN. TOM/O files can be read by the programs TOM
(IRIS version of FRODO) as well as by O. CHAIN files can be read by
the CHAIN program. If a CHAIN file is requested the user will be
asked if CHAIN is to be run on an SGI or ESV workstation. The standard
deviation for the output map is then given to aid in contour level
selection later in the graphics programs.
The user is then asked if a skeleton should be generated, and if
so, for the base and step level for skeletonisation. Suggested base and
step levels are 1.25*sigma and sigma, respectively. The user is then
prompted for the minimum length to designate main chain skeletons
(typically 10). After skeletonisation, one is prompted for the type
of skeleton output desired. Two types are available: TOM skeleton
files and O skeleton data blocks. In both cases one is prompted for
the output file name. If an O skeleton data block is requested, the
O molecule name (not to exceed 5 characters) is also prompted for.
One can create both TOM and O skeleton output in the same job.
INPUT FILE: Standard FSFOUR map, in the default orientation (NORN=0).
NOTES:
1) Generation of the graphics maps can be highly machine specific.
The current versions of GMAP, if run on an IRIS, SUN, ESV,
IBM R6000, or DEC ALPHA (OSF or OPENVMS) workstation will produce
map files readable by TOM, O or CHAIN on SGI or ESV workstations.
In all cases the map files are directly readable by TOM, O or CHAIN
on the target workstation, but will have to be transferred to
the graphics workstation (by ftp, with type BINARY set) unless the
disks are cross mounted via NFS. At present there is no provision
to create map files for older (non-workstation, eg PS300) graphics
programs.
2) The map files produced are binary, random access "DSN6" like maps,
and should be used directly in the graphics programs without the
need for running "mappage", "vaxmap" or any other formatting
programs.
3) With some versions of TOM, it may be necessary to recompile TOM
on SGI workstations with the f77 flag -old_rl set for the map
files to be used correctly.
4) TOM style skeleton files are binary files which can be used
with the TOM program on an IRIS, even if they are created with
the VAX version of GMAP. If such files are generated on a VAX
they must be transferred to an IRIS (via ftp, with the type
binary flag set) prior to use on the workstation. If GMAP is
run on an IRIS, the TOM style skeleton file can be used directly.
O style skeleton data blocks are ascii files, and can be
created and used interchangeably on all computer systems.
5) Some versions of TOM limit skeleton files to a maximum of 16000
skeleton points. The current program can generate larger files
since O can handle them, but it may be necessary to reduce the
region and/or increase the base until the number of points is
below 16000 if TOM is to be used.
6) The skeletonisation routine is essentially that which originated
in Uppsala, and is simply incorporated in GMAP for convenience.
2.08 MISSNG WRITE-UP
MISSNG is a program to compare reflections in the main phased data
set with those in other files, and output those reflections absent
in the main file but present in the others. The output file thus
contains candidates for phase extension. Usually the main (phased)
data set is the output from PHASIT and contains all reflections which
survived the cutoffs applied. The additional input files should include
the complete native set, from which the "merged" data files input to
PHASIT were created, but can also include other "phased" files (i.e.
files containing phase probability distribution coefficients). In this
way additional reflections for which native amplitudes are available
(and possibly phase probability distribution coefficients) can be
selected for phase extension via option 3 in BNDRY. The program is
interactive, and prompts for the input and output file names, whether
or not an additional "phased" file is to be included (perhaps
obtained from a partial structure via PHASIT, SF mode with IHLCF=1 and
ISIGA=0 or 1), a d spacing cutoff, and the parameter file. The output
file contains all additional reflections currently missing from the
PHASIT file out to DCUT resolution for which native amplitudes are
available. If the additional phased file was included, then for
those reflections the phase distribution coefficients are also output
and they will be used during phase combination in BNDRY. For those
reflections without distribution coefficients, the calculated phase
will be output in BNDRY as there is no phase information to combine
with. The output file is generally called "extrfl.d". First the main
phased file is read. Then if an additional phased file is to be used
it is read and the reflections compared with those in the main file.
The additional reflections are then written out along with the
distribution coefficients. The native file is then read and the
indices are transformed to the standard asymmetric unit, and compared
to all reflections previously encountered. Those reflections not
yet utilized will then also be written to the output file, but will
not contain distribution coefficients.
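The order of operations just described can be sketched in Python as
follows. This is an illustration only, not the MISSNG source. The
names select_missing, to_asu and d_spacing are hypothetical; the
mapping to the standard asymmetric unit and the d spacing calculation
are passed in as functions because they depend on the cell and
symmetry in the parameter file. The sketch assumes the DCUT test is
applied to both the additional phased file and the native file.

```python
def select_missing(main_hkl, native, extra_phased=None, dcut=0.0,
                   to_asu=None, d_spacing=None):
    """Return reflections absent from the main phased file.

    main_hkl     : iterable of (h, k, l) from the main phased file
    native       : iterable of ((h, k, l), F) from the native file
    extra_phased : optional dict {(h, k, l): (FO, (A, B, C, D))}
    """
    to_asu = to_asu or (lambda hkl: hkl)                  # placeholder
    d_spacing = d_spacing or (lambda hkl: float("inf"))   # placeholder

    seen = set(main_hkl)
    output = []

    # Additional phased reflections keep their coefficients.
    for hkl, (fo, coeffs) in (extra_phased or {}).items():
        if hkl not in seen and d_spacing(hkl) >= dcut:
            output.append((hkl, fo, coeffs))
            seen.add(hkl)

    # Native reflections not yet encountered carry no coefficients.
    for hkl, f in native:
        hkl = to_asu(hkl)
        if hkl not in seen and d_spacing(hkl) >= dcut:
            output.append((hkl, f, None))
            seen.add(hkl)

    return output
```

Reflections returned with None in the last slot correspond to output
records without distribution coefficients.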
***** FILES *****
The input main phased file should be a PHASIT style output file (long
format, i.e. includes probability distribution coefficients). Only
the indices are used, so that this file may contain either FM*FO and
FO or FO and FC in the amplitude slots.
If an additional phased file is used, it should also be a PHASIT style
output file (long format), but it MUST contain FM*FO and FO in the
amplitude slots as indices, FO and the distribution coefficients are
to be used. It could thus be generated in PHASIT, SF mode with IHLCF=1
and ISIGA=0 or 1, or in PHASIT, phasing mode.
The input complete native file can be one of three types. If the
filename ends with ".MU" or ".mu", then a XENGEN like MULIST is
assumed. Thus the file should contain records with
H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag
in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ).
The "Iflag" parameter is not used and may be absent.
If the filename ends with ".SCA" or ".sca", then a SCALEPACK
file is assumed. After a variable number of header records
(see the FILE FORMATS section), reflection records follow and
contain
H, K, L, I+, sig(I+), I-, sig(I-)
in format (3I4, 4F8.1)
Note the use of intensities rather than F's. The last two items
in each record may be omitted. If present, they would be used
only if I+ was not measured.
If the file name does not end with ".MU", ".mu", ".SCA" or ".sca"
each record is assumed to contain
H, K, L, F, Sig(F)
and is read in free format, i.e. each item must be separated by at
least one space or a comma. The indices must be INTEGERS and the F and
Sig(F) values REALS.
In all cases the corresponding F values should be identical (or at
least on the same scale) as those input to PHASIT.
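The file type rules above can be summarised with a minimal Python
sketch. The function names are hypothetical, and the free format
parser only approximates FORTRAN list-directed input (items separated
by blanks and/or commas); it is not the reader used by the program.

```python
import re


def native_file_type(filename):
    """Classify a native file by its suffix, as described above."""
    if filename.endswith((".MU", ".mu")):
        return "mulist"
    if filename.endswith((".SCA", ".sca")):
        return "scalepack"
    return "free"


def parse_free_record(line):
    """Parse 'H, K, L, F, Sig(F)' separated by spaces or commas."""
    fields = [item for item in re.split(r"[,\s]+", line.strip()) if item]
    h, k, l = (int(item) for item in fields[:3])
    f, sig_f = (float(item) for item in fields[3:5])
    return h, k, l, f, sig_f


print(native_file_type("native.sca"))
print(parse_free_record("  1 2,3  105.5, 2.1"))
```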
The output file is generally named extrfl.d for compatibility with the
supplied template command procedures. Its records will contain H, K,
L, FNAT, A_B, C_D in format (3I4, F10.2, 2I12) where the distribution
coefficients are packed two per word in A_B and C_D according to
A_B = ( IFIX(A*100) + 16384 )*32768 + IFIX(B*100) + 16384
C_D = ( IFIX(C*100) + 16384 )*32768 + IFIX(D*100) + 16384
If distribution coefficients are not available the A_B and C_D values
are zero.
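The packing formula can be checked with the following Python sketch.
The function names are hypothetical. Python's int() truncates toward
zero and is used here in place of IFIX. The sketch assumes each scaled
coefficient lies between -16384 and 16383 so that it fits in its
15 bit slot.

```python
def pack_pair(first, second):
    """Pack two coefficients into one word, e.g. A and B into A_B."""
    return (int(first * 100) + 16384) * 32768 + int(second * 100) + 16384


def unpack_pair(word):
    """Recover the two coefficients, or None if the word is zero."""
    if word == 0:
        return None            # no distribution coefficients available
    high, low = divmod(word, 32768)
    return (high - 16384) / 100.0, (low - 16384) / 100.0


a_b = pack_pair(1.5, -2.25)
print(a_b, unpack_pair(a_b))
```

Note that a pair of zero coefficients does not pack to zero, since the
16384 offsets are always added. A packed word equal to zero can
therefore be distinguished from coefficients that are present but
equal to zero.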
This file can be used for phase extension in option 3 of BNDRY.
Note that MISSNG should always be used to prepare the file for phase
extension rather than some other program. This is because MISSNG will
ensure that the output indices correspond to the same standard
asymmetric unit as the rest of the program package. If this is not
done, it is possible for redundant (symmetry related) reflections to
creep into the data set.
2.09 MRGDF WRITE-UP
MRGDF is a program to create coefficients for difference or
cross-difference Fourier synthesis calculations, i.e. try to solve a
new derivative from phase information obtained from one or more other
derivatives. It can also be used simply to search for additional heavy
atom sites once initial estimates of protein phases become available,
although the "difference coefficients" file output from PHASIT may
be better suited in this case since one can then also subtract out
the heavy atoms already present in the model, and also generate a
"calculated" Patterson for comparison with the "observed" one. The
program is interactive and prompts for the names of the input and
output files, and a d spacing cutoff. The output file can be used in
FSFOUR with MAPTYP=3 to compute the difference or the cross-difference
Fourier synthesis. If the input derivative file is one of the
"merged" data files originally input to PHASIT, then the coefficients
output can be used to compute a difference Fourier to identify other
heavy atom sites which may have been overlooked. In that case it is
not a "cross-difference" Fourier but a straight difference Fourier.
Program PSRCH can be used to list the strongest peaks in the map. The
program will also prompt the user to supply values for derivative to
native scale and delta B factors, if rescaling is requested. If
utilized, this option enables the user to change the scaling
originally carried out in CMBISO, to reflect the fact that additional
scattering power is present in the derivative data set. In that case
the new scaling parameters should be those determined from PHASIT in
phase refinement mode.
***** FILES *****
The input scaled data file is identical (in form) to the isomorphous
"merged" data file input to PHASIT. Each record should contain
H, K, L, FP, SIG(FP), FPH, SIG(FPH)
with FP and FPH already properly scaled together as in CMBISO. This
file refers to the new derivative which is to be solved, or to a
current derivative for which one wants to search for additional heavy
atom sites. It is read in free format.
The input protein phase file can be one of two types. Usually it will
be the last output file from BNDRY, or an output file from PHASIT (in
protein phasing mode). In general, it should contain the best
available phases. The form of the file would then be identical to that
output from BNDRY or PHASIT (in protein phasing mode). It is also
possible however, to input a protein phase file which contains records
with the "short" reflection file form (only h,k,l,fo,fc,phi) as
generated by GREF or PHASIT (in structure factor calculation mode with
IHLCF=0). In that case there is no figure of merit present, thus
FOM=1.0 is used during generation of the output coefficients. This
would be the case if the protein phases come from a complete (or partial)
protein model based structure factor calculation. The program can
automatically determine which type of file was input. Note however,
that a protein phase file generated by GREF should NOT be used here if
GREF was run using Bijvoet difference magnitudes as "FOBS", as the
phases on the file are then shifted by 90 degrees relative to their
true values. (See GREF writeup).
The output file is binary and is suitable for input to FSFOUR.
Each record contains
H, K, L, FOM*FPH, FOM*FP, PHICALC
where the indices are INTEGERS, the other quantities REALS and PHICALC
is in degrees. The figure of merit and phase come from the phased
file while FPH and FP come from the derivative data file.
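The content of one output record can be sketched in Python as below.
The function name mrgdf_record is hypothetical and the sketch omits
the optional rescaling and the d spacing cutoff. It only shows how the
figure of merit weights the two amplitudes, with FOM=1.0 assumed when
the phase file is of the "short" type.

```python
def mrgdf_record(hkl, fp, fph, phicalc, fom=None):
    """Build (H, K, L, FOM*FPH, FOM*FP, PHICALC) for one reflection.

    fp, fph : scaled amplitudes from the derivative data file
    phicalc : phase in degrees from the phased file
    fom     : figure of merit, or None for a "short" phase file
    """
    if fom is None:
        fom = 1.0          # no figure of merit present on short files
    h, k, l = hkl
    return (h, k, l, fom * fph, fom * fp, phicalc)


print(mrgdf_record((1, 2, 3), fp=100.0, fph=120.0, phicalc=45.0, fom=0.5))
```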
2.10 MRGBDF WRITE-UP
MRGBDF is a program to create coefficients for Bijvoet difference
or cross Bijvoet difference Fourier synthesis calculations, i.e. try
to determine anomalous scatterer locations from phase information
obtained from one or more other derivatives. It can also be used
simply to search for additional anomalous scatterer sites once initial
estimates of protein phases become available, although the
"difference coefficients" file output from PHASIT may be better suited
in this case since one can then also subtract out the heavy atoms
already present in the model, and also generate a "calculated"
Patterson for comparison with the "observed" one. The program is
interactive and prompts for the names of the input and output files,
and a d spacing cutoff. A merged file input may be either a "native
anomalous scattering" or "derivative anomalous scattering" type file
(see PHASIT write-up), and the user will be prompted to identify the
type. The output file can be used in FSFOUR with MAPTYP=8 to compute
the Bijvoet-difference Fourier synthesis. If the input Bijvoet pair
file is one of the "merged" data files originally input to PHASIT,
then the coefficients output can be used to compute a Bijvoet
difference Fourier to identify additional anomalous scatterer sites
which may have been overlooked. In that case it is not a "cross"
Fourier but a straight Bijvoet difference Fourier. If the Bijvoet pair
file input corresponds to a new derivative, then the coefficients can
be used for a "cross" Bijvoet difference Fourier to reveal the
locations of anomalous scatterers in the new derivative. The program
is also useful to determine whether the heavy atom hand designation
agrees with the anomalous scattering data. When everything is
consistent, the maps computed should reveal POSITIVE peaks at the
appropriate anomalous scatterer sites for all Bijvoet pair data
sets. If the hand for the heavy atoms used in phasing is inconsistent,
then the map will reveal NEGATIVE peaks at the true sites, i.e. those
related to the input (incorrect) set BY A CENTRE OF SYMMETRY.
Program PSRCH can be used to list the strongest peaks (both positive
and negative) in the map, and program HNDCHK can be used to aid in
hand determination by examining the density precisely at any arbitrary
location. In general, one could phase the data using both possible
hands and check the results as just described.
If derivative Bijvoet differences are used, the program will prompt
the user to supply values for derivative to native scale and delta B
factors, if rescaling is requested. If utilized, this option enables
the user to change the scaling originally carried out in CMBANO, to
reflect the fact that additional scattering power is present in the
derivative data set. In that case the new scaling parameters should be
those determined from PHASIT in phase refinement mode.
***** FILES *****
The input Bijvoet pair file is identical (in form) to one of the
"merged" data files input to PHASIT. Each record should contain either
H, K, L, F+, SIG(F+), F-, SIG(F-)
or
H, K, L, FP, SIG(FP), FPH+, SIG(FPH+), FPH-, SIG(FPH-)
This file refers to the new derivative which is to be solved, or to a
current data set for which one wants to search for additional
anomalous scatterer sites. It is read in free format.
The input protein phase file can be one of two types. Usually it will
be the last output file from BNDRY, or an output file from PHASIT (in
protein phasing mode). In general, it should contain the best
available phases. The form of the file would then be identical to that
output from BNDRY or PHASIT (in protein phasing mode). It is also
possible however, to input a protein phase file which contains records
with the "short" reflection file form (only h,k,l,fo,fc,phi) as
generated by GREF or PHASIT (in structure factor calculation mode,
IHLCF=0). In that case there is no figure of merit present, thus
FOM=1.0 is used during generation of the output coefficients. This
would be the case if the protein phases come from a complete (or partial)
protein model based structure factor calculation. The program can
automatically determine which type of file was input. Note however,
that a phase file generated by GREF should NOT be used here unless
GREF was used to compute structure factors from a complete protein
model.
The output file is binary and is suitable for input to FSFOUR.
Each record contains
H, K, L, FOM*F+, FOM*F-, PHI+
where the indices are INTEGERS, the other quantities REALS and PHI+ is
in degrees.
PHI+ = PHICALC for + type output file.
PHI+ = -PHICALC for - type output file.
The figure of merit and PHICALC come from the phased file while
F+ and F- come from the Bijvoet pair data file. The Bijvoet difference
Fourier should be computed with coefficients
-i * (FOM*F+ - FOM*F-) * exp (i * PHI+)
where the -i factor is applied after expansion to a hemisphere, and
during the expansion, care is taken to ensure that the differences are
"flipped" if putting the reflection into the desired hemisphere
involves an inversion. This is taken care of automatically in FSFOUR
if MAPTYP=8 is selected.
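For a single reflection the coefficient expression above can be
evaluated with Python complex arithmetic, as in the sketch below. The
function name is hypothetical. The sketch deliberately ignores the
expansion to a hemisphere and the sign "flip" on inversion, which
FSFOUR is stated to handle itself.

```python
import cmath
import math


def bijvoet_coefficient(fom_fplus, fom_fminus, phi_plus_deg):
    """Evaluate -i * (FOM*F+ - FOM*F-) * exp(i * PHI+) for one record."""
    phase_factor = cmath.exp(1j * math.radians(phi_plus_deg))
    return -1j * (fom_fplus - fom_fminus) * phase_factor


print(bijvoet_coefficient(10.0, 8.0, 0.0))
```

Multiplying by -i rotates the phase by -90 degrees, so a difference of
2 with PHI+ = 90 degrees gives a real, positive coefficient.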
2.11 RD31 WRITE-UP
RD31 is a program which reads the binary phase files output from
either PHASIT, BNDRY, GREF, MAPINV, MRGDF, MRGBDF or IMPORT and
converts them to a formatted file. The formatted file can be examined
and edited if desired, or it can be used to interface the current
phase information with other programs. Originally this program was
used to convert the binary file (generated on a Silicon Graphics IRIS
computer) to ASCII so that it can be transferred over an ethernet link
for use on a VAX. The program is interactive and prompts only for the
names of the input and output files. It can read both the "full" style
reflection records (includes probability distribution coefficients,
FOM etc), or the "short" style records produced by MAPINV, GREF, MRGDF
etc, and automatically determine which type was input. Program MK31B
can then be used to recreate the binary file from the output of this
program.
***** FILES *****
The input binary file should be the phased file output from PHASIT (in
protein phasing mode), BNDRY or IMPORT. It can also be the short phase
file output from GREF, MAPINV, MRGDF, MRGBDF or PHASIT (in structure
factor calculation mode, IHLCF=0).
The output ASCII file contains records with
H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FOM
in FORMAT ( 3I4, 2F10.2, F7.2, 2I12, I5, F6.3 )
H,K,L = Miller indices
FMFO = Figure of merit weighted structure factor amplitude
(either FOM * FP or FOM * F+)
FO = Observed structure factor amplitude (either FP or F+)
PHIBEST = Best (centroid) phase, in degrees.
IPRAB, IPRCD = Hendrickson-Lattman coefficients A,B,C,D for the phase
probability distribution used, packed two per word as
IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
MK = Restricted phase indicator. For general reflections
MK=1, for centric reflections MK > 1 and one of the
allowed phase values is (MK-1)*15 degrees (the other
possibility is 180 degrees away).
FOM = Figure of merit associated with PHIBEST and used for
weighting.
If a file with "short" records was input, only the indices, F values
and phase are written in the output records.
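The "full" record layout and the meaning of MK can be illustrated with
the Python sketch below. Both function names are hypothetical. The
format string mirrors FORMAT (3I4, 2F10.2, F7.2, 2I12, I5, F6.3),
although the exact rendering of leading zeros by a given FORTRAN
compiler may differ from Python's.

```python
def allowed_centric_phases(mk):
    """Return the two allowed phases (degrees) or None if unrestricted."""
    if mk <= 1:
        return None                      # general reflection, MK=1
    first = ((mk - 1) * 15) % 360
    return first, (first + 180) % 360


def format_full_record(h, k, l, fmfo, fo, phibest, iprab, iprcd, mk, fom):
    """Lay out one record as (3I4, 2F10.2, F7.2, 2I12, I5, F6.3)."""
    return (f"{h:4d}{k:4d}{l:4d}{fmfo:10.2f}{fo:10.2f}{phibest:7.2f}"
            f"{iprab:12d}{iprcd:12d}{mk:5d}{fom:6.3f}")


print(allowed_centric_phases(7))
print(format_full_record(1, 2, 3, 85.0, 100.0, 90.0,
                         541802271, 536887296, 7, 0.85))
```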
2.12 MK31B WRITE-UP
MK31B reads the ASCII version of the phase file created by RD31 and
creates a binary version. Both "full" or "short" files can be input,
and the program automatically determines which type was provided. See
RD31 section for details on the files. The program is interactive and
prompts only for the names of the input and output files.
2.13 PSRCH WRITE-UP
PSRCH is a program which searches the map computed by FSFOUR and
lists the coordinates and heights for the largest peaks (positive
or negative). Only unique peaks will be listed. It has other options
useful in small molecule crystallographic applications, but for
protein work only the peak list is used, hence the other options are
bypassed and will not be described. This is essentially the peak
search program used in the MULTAN78 direct methods programming
package.
INPUT DATA (UNIT 5)
CARD 1 PAMFIL (free format)
PAMFIL = Name of input file containing cell parameters
and symmetry information (used only to get
running log file name).
CARD 2 MAPFIL (free format)
MAPFIL = Name of map file input.
CARD 3 NPEAK, NEGP (free format)
NPEAK = Number of largest peaks to list (Max=180)
= 0 gives the default of (11*n)/9 where n = number
of independent atoms (excluding H) as input to
FSFOUR
NEGP = 0 to list positive peaks, 1 to list negative
peaks
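The default for NPEAK can be worked out as in the Python sketch below.
The function name is hypothetical, and the sketch assumes that
(11*n)/9 is evaluated with FORTRAN integer arithmetic, i.e. the
quotient is truncated. Whether the default is also limited to the
maximum of 180 is not stated above, so no limit is applied here.

```python
def default_npeak(n_atoms):
    """Default number of peaks when NPEAK=0: (11*n)/9, truncated."""
    return (11 * n_atoms) // 9


print(default_npeak(9))    # 11
print(default_npeak(10))   # 12
```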
***** FILES *****
The input binary map file should be the output file from FSFOUR.
FSFOUR must have been run with the parameter NORN=0 specifying the XZ
section orientation.
2.14 CMBISO WRITE-UP
CMBISO is an interactive program to merge native data with
derivative isomorphous replacement data. It is used to prepare a
"merged" file for PHASIT to phase by the isomorphous replacement
method, to make an input file for TOPDEL to create difference
Patterson coefficients, to make input files for MRGDF for cross
Fouriers and to make input files for GREF for refining heavy atom
sites against isomorphous differences. The program is interactive
and prompts for the names of input and output files, and whether or
not one wants to include additional non-Wilson scaling corrections.
The program matches up all derivative reflection data with the
corresponding native data, scales the derivative data to the native,
and outputs merging R factor statistics and statistics regarding the
isomorphous differences. The reflections need not be indexed
identically in both input files as symmetry information is used to
match up the data. Each input data set however, should contain only
unique reflections. Note that Friedel's law is assumed to hold when
matching up the data. The data sets are initially scaled by computing
a relative Wilson plot, and applying scale and thermal corrections
derived from it. The user is then asked whether additional non-Wilson
scaling should also be done. If it is, then the user is asked whether
anisotropic or local scaling should be done. If anisotropic scaling
is requested, the reciprocal lattice vectors are orthogonalized and
the elements of a symmetric 3x3 scaling tensor are refined by two
cycles of least squares. The anisotropic scaling is then applied to
all of the derivative reflections, and the tensor elements are
printed out. For an isotropic distribution the diagonal elements
should be 1.0 and the off diagonal elements zero. Thus deviations
from these quantities indicate the degree of anisotropy. If local
scaling is requested, then a scale factor for each reflection is
determined by a least squares fit of the F's for all neighboring
reflections within a given sphere radius to the corresponding native
F's, neglecting the central reflection to be scaled. For each
reflection the sphere radius is initially chosen to encompass about
125 reflections, and the derived scale factor is accepted if at least
80 neighbors are found. If needed, the sphere radius will be
incrementally adjusted until either a preset maximum is reached, or
80 neighbors are found. If the maximum is reached, then the scale
factor will still be accepted if 40 neighbors are found. If not,
the program will stop and indicate that the data set is too sparse
for meaningful local scaling. The mean and minimum number of
neighbors used is then listed. For both anisotropic and local
scaling, the minimum and maximum scale factors that were applied are
listed.
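The acceptance rules for the local scaling sphere can be sketched in
Python as follows. This is a hedged illustration rather than the
CMBISO code. The names choose_neighbors and count_within and the
arguments start_radius, max_radius and step are hypothetical, as the
preset maximum and the size of the radius increment are not given
above. Only the thresholds of 80 and 40 neighbors come from the text.

```python
def choose_neighbors(count_within, start_radius, max_radius, step):
    """Return (radius, n_neighbors) accepted for one reflection.

    count_within(radius) must return the number of neighboring
    reflections inside a sphere of that radius.
    """
    radius = start_radius
    while True:
        n = count_within(radius)
        if n >= 80:
            return radius, n              # normal acceptance
        if radius >= max_radius:
            if n >= 40:
                return radius, n          # relaxed acceptance at maximum
            raise ValueError("data set too sparse for local scaling")
        radius = min(radius + step, max_radius)


# Hypothetical density of 10 neighbors per unit of radius.
print(choose_neighbors(lambda r: 10 * r, start_radius=5,
                       max_radius=10, step=1))
```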
***** FILES *****
Each input file must contain records with Miller indices and the
corresponding reflection data values. The files however, can be one
of three types. If the file name ends with ".MU" or ".mu", then it
is assumed to be a "MULIST" i.e. a file generated by program MAKEMU
(in the XENGEN system) or by program FBSCALE. In that case each
record is assumed to contain
H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag
in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). Only the indices, F
and Sig(F) are used. The "Iflag" parameter may be absent.
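A MULIST record in the format quoted above occupies fixed columns, and
a Python sketch of a reader is given below. The function name is
hypothetical. The column positions are derived from the format
(3I4, 1X, F6.4, 6(1X, F8.2), 1X, I2), and the sketch assumes that all
six amplitude fields are filled in, whereas a FORTRAN read would treat
a blank field as zero.

```python
def parse_mulist_record(line):
    """Split one MULIST record into its fixed-width fields."""
    h, k, l = (int(line[start:start + 4]) for start in (0, 4, 8))
    res = float(line[13:19])
    # Six fields of 1X, F8.2: F, Sig(F), F+, Sig(F+), F-, Sig(F-)
    values = [float(line[20 + 9 * i:28 + 9 * i]) for i in range(6)]
    flag_text = line[74:76].strip()
    iflag = int(flag_text) if flag_text else None   # Iflag may be absent
    return (h, k, l, res, *values, iflag)


sample = ("   1   2   3 0.1234"
          "   100.00     2.00   101.00     3.00    99.00     3.00  5")
print(parse_mulist_record(sample))
```

For the CMBISO case only the indices and the first two amplitude
fields (F and Sig(F)) of the returned tuple would be needed.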
If the filename ends with ".SCA" or ".sca", then a SCALEPACK
file is assumed. After a variable number of header records
(see the FILE FORMATS section), reflection records follow and
contain
H, K, L, I+, sig(I+), I-, sig(I-)
in format (3I4, 4F8.1)
Note the use of intensities rather than F's. The last two items
in each record may be omitted. If present, they would be used
only if I+ was not measured.
If the file name does not end in ".MU", ".mu", ".SCA" or ".sca"
each record is assumed to contain
H, K, L, F, Sig(F)
and is read in free format, i.e. each item must be separated by at
least one space or a comma. The indices must be INTEGERS and the F and
Sig(F) values REALS.
The output file contains records with
H, K, L, FP, Sig(FP), FPH, Sig(FPH)
in format ( 3I4, 4F10.2). The last 2 quantities are rescaled to match
the native data set. This file is suitable for input to PHASIT using
the ISOFLG=0 option, or for input to MRGDF to prepare coefficients for
difference or cross difference Fouriers.
2.15 CMBANO WRITE-UP
CMBANO is a program to merge native data with derivative anomalous
scattering data. It is used to prepare a "merged" file for use in
PHASIT to phase by derivative anomalous scattering, to make an input
file for TOPDEL to create Bijvoet difference Patterson coefficients, to
make input files for MRGBDF for Bijvoet difference Fouriers and to get
input files for GREF for refining heavy atom sites against anomalous
scattering differences. The program is interactive and prompts for the
names of the input and output files, and whether or not one wants to
include anisotropic scaling corrections. The program matches up all
derivative Bijvoet pairs with the corresponding native data, and
scales the derivative data to the native. The reflections need not be
indexed identically in both input files as symmetry information is
used to match up the data. Each input data set however, should contain
only unique reflections. The data sets are initially scaled by
computing a relative Wilson plot and applying the scale and thermal
corrections derived from it. The user is then asked whether any
additional non-Wilson scaling should be done. If it is, then the
user is asked whether anisotropic or local scaling should be done.
If anisotropic scaling is requested, the reciprocal lattice vectors
are orthogonalized and the elements of a symmetric 3x3 scaling tensor
are refined by two cycles of least squares. The anisotropic scaling is
then applied to all of the derivative reflections, and the tensor
elements are printed out. For an isotropic distribution the diagonal
elements should be 1.0 and the off diagonal elements zero. Thus
deviations from these quantities indicate the degree of anisotropy. If
local scaling is requested, then a scale factor for each reflection is
determined by a least squares fit of the F's for all neighboring
reflections within a given sphere radius to the corresponding native
F's, neglecting the central reflection to be scaled. For each
reflection the sphere radius is initially chosen to encompass about
125 reflections, and the derived scale factor is accepted if at least
80 neighbors are found. If needed, the sphere radius will be
incrementally adjusted until either a preset maximum is reached, or
80 neighbors are found. If the maximum is reached, then the scale
factor will still be accepted if 40 neighbors are found. If not,
the program will stop and indicate that the data set is too sparse
for meaningful local scaling. The mean and minimum number of
neighbors used is then listed. For both anisotropic and local
scaling, the minimum and maximum scale factors that were applied are
listed. Note that the additional scaling (anisotropic or local), if
invoked, is applied ONLY TO SCALE THE DERIVATIVE DATA TO THE NATIVE,
and NOT to scale F+ to F- within the derivative. Thus one may still
want to apply some type of additional scaling to the F+,F- values
prior to running this program.
***** FILES *****
The native data file must contain Miller indices and the corresponding
reflection data values. The derivative data file must contain Miller
indices and the corresponding reflection data including Bijvoet pairs.
The files however, can be one of three types. If the file name ends
with ".MU" or ".mu", then it is assumed to be a "MULIST" i.e. a file
generated by program MAKEMU (in the XENGEN system) or by program
FBSCALE. In that case each record is assumed to contain
H, K, L, RES, F, Sig(F), F+, Sig(F+), F-, Sig(F-), Iflag
in format (3I4, 1X, F6.4, 6(1X, F8.2) 1X, I2 ). For the native file
only the indices, F and Sig(F) are used. For the derivative file only
the indices, F+, Sig(F+), F-, and Sig(F-) are used. If "Iflag" is
present in the derivative file, then it will be used to screen for
viable anomalous scattering data, and one can consult the XENGEN
documentation for the meaning of the Iflag variable. If it is absent,
then only reflections with both F+ and F- values greater than zero
will be used. (This criterion is also applied even if Iflag is present).
If the filename ends with ".SCA" or ".sca", then a SCALEPACK
file is assumed. After a variable number of header records
(see the FILE FORMATS section), reflection records follow and
contain
H, K, L, I+, sig(I+), I-, sig(I-)
in format (3I4, 4F8.1)
Note the use of intensities rather than F's. The last two items
in each record may be omitted FOR THE NATIVE SET. If present in the
native file they would be used only if I+ was not measured. For the
derivative file in general all quantities should be present in each
record as if either member of the Bijvoet pair is missing the
reflection will not be used. Only reflections with all F's greater
than zero will be used.
If the filename does not end in ".MU", ".mu", ".SCA" or ".sca"
each record is assumed to contain
H, K, L, F, Sig(F)
if it is the native file and
H, K, L, F+, Sig(F+), F-, Sig(F-)
if it is the derivative file. The file is then read in free format
i.e. each item must be separated by at least one space or a comma. The
indices must be INTEGERS and all F and Sig values REALS. Reflections
are accepted only if the F (or F+ AND F-, for derivative data) is/are
greater than zero.
It is absolutely ESSENTIAL that only valid measurements of BOTH
F+ and F- are used by the program, and the screening criteria above
are designed to ensure that. However, depending on the source of the
data it is still possible for invalid reflections to be used. For
example, some data reduction programs output only the mean F and
del ano, and the user then writes a small program to convert this
to the individual F+, F- values. If a del ano of zero is encountered,
it MAY mean that one of the F's (either F+ or F-) WAS NEVER MEASURED!
Yet the conversion program might output this as a Bijvoet pair with
a Bijvoet difference of zero! Likewise some data reduction programs,
even if they output individual F+ and F- values, actually set one
of them equal to the other if only one was measured. This again leads
to erroneous Bijvoet differences. Know what your data reduction
program is doing, and be very wary of any non-centric reflections
with F+ EXACTLY equal to F-, or with del ano equal to zero!
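A pre-check along these lines can be run on the derivative data before
CMBANO, as in the Python sketch below. The function name and the
"suspect" category are hypothetical additions for the user's own
inspection. CMBANO itself is only stated to reject pairs in which
either amplitude is not greater than zero.

```python
def screen_pair(f_plus, f_minus, centric=False):
    """Classify one Bijvoet pair as 'reject', 'suspect' or 'accept'."""
    if f_plus <= 0.0 or f_minus <= 0.0:
        return "reject"        # one member missing or not measured
    if not centric and f_plus == f_minus:
        return "suspect"       # exact equality may hide a copied value
    return "accept"


pairs = [(105.2, 101.7), (88.0, 88.0), (50.3, 0.0)]
for f_plus, f_minus in pairs:
    print(f_plus, f_minus, screen_pair(f_plus, f_minus))
```

Reflections reported as "suspect" are worth tracing back to the data
reduction output before they are allowed into the merged file.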
The output file contains records with
H, K, L, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
in format ( 3I4, 6F10.2). The last 4 quantities are rescaled to match
the native data set. This file is suitable for input to PHASIT using
the ISOFLG=2 option, or for input to MRGBDF to prepare coefficients
for Bijvoet difference or cross Bijvoet difference Fouriers.
2.16 TOPDEL WRITE-UP
TOPDEL is a program to examine/select reflection data based on the
magnitude of isomorphous or anomalous scattering differences. It can
read in any of the "merged" files input to PHASIT and will select from
them data based on user supplied d spacing and F/sigma cutoffs. All
selected data is then sorted according to the magnitudes of the
differences, and the 15 largest differences are displayed along with
the mean and standard deviation for all selected data. Generally one
uses the list to identify (and reject) outliers (reflections with
abnormally large differences) for the purpose of Patterson map (and
possibly phasing) calculations. If outliers are present they usually
will show up as a "break" at the top of the otherwise smoothly
diminishing difference magnitude list. The program is interactive and
prompts for the names of files, type of data file input, and the d
spacing and F/sigma cutoffs. After displaying the list, the user is
prompted as to whether or not a file should be prepared for Patterson
map calculations. If a map is requested the user is prompted for the
output file name, what percentage of the selected data is to be
output, and how many reflections (starting from the top of the sorted
list) are to be rejected. Thus if 60% of the data is requested with
two rejections, the two largest differences are rejected and the
remaining largest differences are output until 60% of the total number
of selected reflections is reached.
Note! If outliers are rejected, the rejection only applies to the
output Fourier file. If you want to reject the outliers from all
subsequent phasing calculations, you must manually edit the outliers
from the "merged" file input to PHASIT.
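The selection of data for the Patterson file can be sketched in Python
as below. The function name is hypothetical and this is not the TOPDEL
source. The sketch assumes that the requested percentage refers to the
total number of selected reflections and that this many reflections
are written, starting immediately after the rejected ones.

```python
def select_for_patterson(records, percent, n_reject):
    """Pick reflections for the Patterson coefficient file.

    records : list of (hkl, f1, f2), e.g. (hkl, FPH, FP) or (hkl, F+, F-)
    percent : percentage of the selected data to output
    n_reject: number of largest differences to discard as outliers
    """
    ranked = sorted(records, key=lambda rec: abs(rec[1] - rec[2]),
                    reverse=True)
    n_output = int(len(ranked) * percent / 100.0)
    return ranked[n_reject:n_reject + n_output]


# Ten hypothetical reflections with differences 10, 9, ..., 1.
data = [((i, 0, 0), 100.0 + i, 100.0) for i in range(1, 11)]
chosen = select_for_patterson(data, percent=60, n_reject=2)
print([abs(f1 - f2) for _, f1, f2 in chosen])
```

With 60% requested and two rejections, the two largest differences are
skipped and the next six are kept.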
***** FILES *****
The program will prompt for the input file name and type of file input.
The allowed types are all ASCII, read in free format and can be any of
the "merged" files accepted by PHASIT, i.e. records of either
h, k, l, FP, Sig(FP), FPH, Sig(FPH)
or
h, k, l, FP+, Sig(FP+), FP-, Sig(FP-)
or
h, k, l, FP, Sig(FP), FPH+, Sig(FPH+), FPH-, Sig(FPH-)
For the first type isomorphous replacement difference magnitudes
ABS(FP-FPH) are selected, displayed and (possibly) output. For the
other types anomalous scattering difference magnitudes are used
instead.
If a Fourier output file for Patterson map calculations is requested
by the user a binary file will be written with 6 words per record. The
first three words are INTEGERS and the other three REALS. The records
will contain
h, k, l, FPH, FP, 0.
or
h, k, l, F+, F-, 0.
depending on whether isomorphous or anomalous data was input.
Note that the output file can be used in FSFOUR to compute
either isomorphous or anomalous difference Patterson maps by setting
MAPTYP=5. Native Patterson maps can also be computed with the file
by setting MAPTYP=6.
2.17 GREF WRITE-UP
GREF is a program for the refinement of rigid groups against x-ray
diffraction data. It can be used to refine entire protein domains,
substructures or even individual atoms. It is space group general and
can refine up to 24 groups with a total of up to 20000 atoms in the
asymmetric unit. Each input atom must belong to a group, but the
atoms can be partitioned into groups in any possible manner. The
program is also used to refine heavy atom positions by defining each
heavy atom to be a "one atom group", and refining only the group
centroid position, occupancy and (possibly) thermal factor (i.e.
omitting refinement of the group orientation parameters). The program can read
any of the "merged" reflection files input to PHASIT (for heavy atom
refinement against isomorphous or anomalous scattering differences) or
a general file for protein group refinement against native data.
Scattering factors can be either the normal type for refinement
against native data or isomorphous differences, or 2*delta f" for
refinement against anomalous scattering differences. One can select
all data, only centric data, or only acentric data for the
calculations. The output refined coordinate file can be used as input
to PHASIT for subsequent phase calculations.
INPUT DATA (UNIT 5)
CARD 1 PAMFIL (free format)
PAMFIL = Name of input file containing cell parameters,
symmetry information etc.
CARD 2 INPCDS (free format)
INPCDS = Name of file containing input atomic coordinates.
CARD 3 INPREF (free format)
INPREF = Name of file containing input reflection data.
CARD 4 NCYCLS,IFOUR,SC,TO,CUTS,CUTMN,CUTMX,IWGHT,NXSCAT (free format)
NCYCLS = # of refinement cycles (= 0 for
a single structure factor calculation)
IFOUR = 0 for no Fourier coefficient output.
= 1 to write final Fourier coefficients
to file.
SC = scale factor, such that Fobs=SC*Fcalc.
TO = Overall isotropic thermal factor. If
zero, then individual thermal factors
for each group must be supplied. If
non-zero, applies to all atoms, and
thermal factors for each group SHOULD
NOT be input.
CUTS = Data selection cutoff. Rejects data
with Fobs < CUTS*Sig(Fobs).
CUTMN = Data selection cutoff. Rejects data
with sin(theta)/lambda < CUTMN.
CUTMX = Data selection cutoff. Rejects data
with sin(theta)/lambda > CUTMX.
IWGHT = Weighting factor indicator.
= 0 for weights of 1./Sig(Fobs)**2
= 1 for unit weights.
NXSCAT = number of additional atomic
types for which scattering
factors will be input. Note
that 20 types are already
stored in the program (see
below), thus this is usually
nonzero only for exotic
atoms or wavelengths other
than CU K alpha.
CARD 4A OUTREF (free format)
***** Include this card ONLY if IFOUR=1 *****
OUTREF = Name of output phased reflection file
CARD 4B OUTCDS (free format)
***** Include this card ONLY if NCYCLS > 0 *****
OUTCDS = Name of output coordinate file
CARDS 4C Extra scattering factors (all free format)
***** Include this set of records ONLY if NXSCAT > 0 *****
Up to 5 additional atomic types may be input. For each additional
atomic type, include the following 3 records
REC 1 (A(J),J=1,4) (free format)
A(J) = Coefficients for analytical
approximation to scattering
factors, as in Int. Tables,
Vol IV, pages 99-101.
REC 2 (B(J),J=1,4) , C (free format)
B(J), C = Coefficients for analytical
approximation to scattering
factors, as in Int. Tables,
Vol IV, pages 99-101.
REC 3 DEL f' , DEL f'' (free format)
DEL f' = real part of anomalous
scattering correction term.
DEL f'' = imaginary part of anomalous
scattering correction term.
CARD 5 IFLTYP,ICLTYP,ISFTYP,MINCEN,IMODE (free format)
These parameters indicate what type of
reflection file is input, what type of
data is to be used, and what scattering
factors are to be used.
IFLTYP = 0 for h,k,l,FO,Sig(FO) input, uses
FOBS=FO and SIG= Sig(FO).
= 1 for h,k,l,FP,Sig(FP),FPH,Sig(FPH)
input, uses FOBS=ABS(FP-FPH) and
SIG=mean Sig.
= 2 for h,k,l,FP+,Sig(FP+),FP-,Sig(FP-)
input, uses FOBS=ABS(FP+ - FP-) and
SIG=mean Sig.
= 3 for h,k,l,FP,Sig(FP),FPH+,Sig(FPH+),
FPH-,Sig(FPH-) input, uses
FOBS=ABS(FPH+ - FPH-) and SIG=
(Sig(FPH+)+Sig(FPH-))/2.
ICLTYP = 0 to use all data types.
= 1 to use only centric data.
= 2 to use only acentric data.
ISFTYP = 0 to use normal scattering factors.
= 1 to use only 2.*delta f" as
scattering factors.
MINCEN = minimum number of centric reflections
to be used without including 25%
strongest differences for acentric
reflections. Applied only if ICLTYP=1
(suggested value=75, but space group
considerations may dictate otherwise,
see NOTES).
IMODE = 0 If atom types derived from first
character in atom name (only C,N,O,S,
Fe recognized).
= 1 If atom type code number explicitly
input for each atom.
CARD 6 (NGP, NAG(I), I=1,NGP) (free format)
NGP = # of groups (each atom must belong to
a group)
NAG(I) = # of atoms in group I. It is assumed
that the first NAG(1) atoms form group
1, the next NAG(2) atoms form group 2
etc.
CARD 7 (OCC(I), I=1,NGP) (free format)
OCC(I) = Occupancy factor for group I.
CARD 8 (BETA(I), I=1,NGP) (free format)
BETA(I) = Isotropic thermal factor for group I.
NOTE! Include this card only if TO
is zero.
FOLLOWING CARDS ARE TO BE INCLUDED ONLY IF REFINING ( NCYCLS > 0 )
CARD 9 ISC,ITO (free format)
ISC = 0 to hold scale factor fixed.
= 1 to refine
ITO = 0 to hold overall thermal factor
fixed.
= 1 to refine. (it can be refined only
if TO > 0 ).
The following card must be repeated for each of the NGP groups.
CARDS 10 ITX,ITY,ITZ,IRX,IRY,IRZ,IBT,IOC (free format)
ITX, ITY, ITZ = 0 to hold group centroid fixed
at respective x, y, or z coordinate.
= 1 to refine group centroid
translation for respective coordinate.
IRX, IRY, IRZ = 0 to hold group orientation angle
fixed with respect to x, y, or z
axis (orthogonal axes).
= 1 to refine group rotation angle
about corresponding axis (all
rotations about group centroid).
IBT = 0 to hold group thermal factor
fixed.
= 1 to refine. (applicable only if
TO is zero, i.e. when group thermal
factors are input on CARD 8.)
IOC = 0 to hold group occupancy factor
fixed.
= 1 to refine.
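The data selection cutoffs and weighting options of CARD 4 can be
illustrated with the short Python sketch below. It is not taken from
GREF itself; the function names are hypothetical, and the only fact
assumed beyond the card descriptions is the standard relation
sin(theta)/lambda = 1/(2d).

```python
def s_from_d(d_spacing):
    """sin(theta)/lambda for a given d spacing in angstroms."""
    return 1.0 / (2.0 * d_spacing)


def accept(fobs, sig, s, cuts, cutmn, cutmx):
    """True if a reflection survives the CUTS, CUTMN and CUTMX cutoffs."""
    if fobs < cuts * sig:          # rejected: Fobs < CUTS*Sig(Fobs)
        return False
    if s < cutmn or s > cutmx:     # rejected: outside resolution range
        return False
    return True


def weight(sig, iwght):
    """Least squares weight: 1/Sig**2 for IWGHT=0, unity for IWGHT=1."""
    return 1.0 if iwght == 1 else 1.0 / sig ** 2
```

For example, CUTMN=0.05 and CUTMX=0.2 keep data between 10 and 2.5
angstroms resolution.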
***** FILES *****
INPREF - INPUT REFLECTION DATA (free format)
This file can have any of 4 types of structures, with the particular
type designated by IFLTYP. All types are read in free format, i.e.
each item must be separated by at least one blank or a comma. For all
types there should be data for one reflection in each record.
For IFLTYP=0 records with h, k, l, FP, Sig(FP) and the data used
as input. Typically native data for protein refinement.
For IFLTYP=1 records with h, k, l, FP, Sig(FP), FPH, Sig(FPH) and
the data used is ABS(FP-FPH), mean Sigma. Typically
isomorphous replacement data for heavy atom refinement.
For IFLTYP=2 records with h, k, l, FP(h,k,l), Sig( FP(h,k,l) ),
FP(-h,-k,-l), Sig( FP(-h,-k,-l) ) and the data used
is ABS( FP(h,k,l) - FP(-h,-k,-l) ), mean Sigma.
Typically native anomalous scattering data for heavy
atom refinement.
For IFLTYP=3 records with h,k,l,FP,Sig(FP),FPH(h,k,l),
Sig( FPH(h,k,l) ),FPH(-h,-k,-l), Sig( FPH(-h,-k,-l) )
and the data used is ABS( FPH(h,k,l) - FPH(-h,-k,-l) )
and (Sig( FPH(h,k,l) ) + Sig( FPH(-h,-k,-l) ) )/2.
Typically derivative anomalous scattering data for
heavy atom refinement.
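The way FOBS and SIG are derived from each record type can be
summarized in code. The following Python sketch is illustrative only:
the function names are hypothetical, and "mean Sigma" is assumed to be
the arithmetic mean of the two sigmas involved (the write-up states
this explicitly only for IFLTYP=3).

```python
def parse_record(line):
    """Split a free format record on blanks and/or commas."""
    items = line.replace(",", " ").split()
    hkl = tuple(int(x) for x in items[:3])
    values = [float(x) for x in items[3:]]
    return hkl, values


def fobs_and_sigma(ifltyp, values):
    """Return (FOBS, SIG) as used in refinement for the given IFLTYP."""
    if ifltyp == 0:                      # FO, Sig(FO)
        fo, sig = values
        return fo, sig
    if ifltyp == 1:                      # FP, Sig(FP), FPH, Sig(FPH)
        fp, sfp, fph, sfph = values
        return abs(fp - fph), (sfp + sfph) / 2.0
    if ifltyp == 2:                      # FP+, Sig(FP+), FP-, Sig(FP-)
        fpp, spp, fpm, spm = values
        return abs(fpp - fpm), (spp + spm) / 2.0
    if ifltyp == 3:                      # FP, Sig(FP), FPH+, Sig, FPH-, Sig
        _fp, _sfp, fphp, sp, fphm, sm = values
        return abs(fphp - fphm), (sp + sm) / 2.0
    raise ValueError("IFLTYP must be 0, 1, 2 or 3")
```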
INPCDS - INPUT ATOMIC PARAMETERS FORMAT (1X,A1,5X,A1,I3,A4,5F10.5,I5)
One atom per record, containing CHN,RTYPE,IRES,ATOM,X,Y,Z,B,OC,ITYP
CHN = Single character chain identifier (not
used)
RTYPE = One letter amino acid code (not used)
IRES = Sequence number (not used)
ATOM = Atom name (used only if IMODE=0)
X, Y, Z = Fractional atomic coordinates
B = Isotropic thermal factor (not used as
TO or BETA values supersede it)
OC = Occupancy factor (not used as OCC
value supersedes it)
ITYP = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
16,17,18,19 or 20 for C,N,O,S,Fe,Pt,
Hg,Au,Pb,Os,I,Zn,Ca,Mg,Cd,U,P,Br,Cl,
or Sm, respectively. ITYP= 21
through 20+NXSCAT for the additional
types, in same order as originally
input. (This field used only if
IMODE=1)
OUTCDS - OUTPUT ATOMIC PARAMETER FILE (same data and format as INPCDS)
Generated only if NCYCLS > 0. Contains the new parameters. If heavy
atom refinement was done this file can be inserted in a PHASIT deck
for phase calculations. It can also be input back into GREF for
further refinement. The format is compatible with the Hendrickson-
Konnert program PROTIN, or with PHASIT's structure factor calculation
mode.
Note that this file may have the same name as INPCDS, although
the original contents will then be destroyed.
OUTREF - OUTPUT FOURIER COEFFICIENTS (BINARY)
Generated following last structure factor calculation only if IFOUR=1.
Contains records of h, k, l, FOBS, FCAL, PHI where the indices
are INTEGERS and the other data REALS. PHI is in degrees. This file
can be used in program FSFOUR for Fourier calculations, and converted
to ASCII by program RD31.
***** NOTES *****
1) Generally when refining heavy atom parameters against isomorphous
differences only centric data is used. In space groups where there is
an insufficient number of centric reflections to refine all needed
parameters, the program will include the 25% strongest differences for
acentric reflections. The input parameter MINCEN determines how many
acceptable centric reflections MUST be found to SUPPRESS the automatic
inclusion of the acentric data. A good value of MINCEN is 75, but in
some space groups it may be necessary to set it to a large
(unobtainable) number to force automatic inclusion of the strongest
differences for acentric data. An example of this would be space group
P2(1) with more than one heavy atom input. In that case there might be
a hundred or more centric reflections, but all will be h,0,l
reflections, thus the y coordinate of the second atom can not be
refined without including some acentric data. Similarly, in space
group P1 only centric reflections should be requested (even though
there aren't any), but MINCEN should be nonzero. This will force ONLY
THE 25% LARGEST ACENTRIC DIFFERENCES to be used. For orthorhombic
space groups there generally are enough centric reflections to refine
all parameters, thus MINCEN=75 is usually sufficient. The MINCEN
parameter is used only if ICLTYP=1, i.e. when only centric reflections
are requested.
2) When refining heavy atom parameters against anomalous scattering
differences, one should use only a certain percentage of the largest
differences. If ICLTYP=2 and ISFTYP=1, then heavy atom refinement
against anomalous scattering differences is assumed and the program
will automatically select the 25% strongest Bijvoet differences for
use in the calculations.
3) If anomalous scattering data is used (ISFTYP=1), then the output
phases on file 31 are not correct as they are 90 degrees less than
their true values. This results from use of scattering factors of
2*delta f" instead of i*(2*delta f"). The computed structure factor
amplitudes however, are correct thus the refinement is still valid.
Note also that the structure factor calculation is insensitive to the
hand of the heavy atoms.
4) Although the refined values of the thermal factors and occupancies
are output on the new parameter file, they are ignored on input as
these parameters are set based on values in the input control file.
Accordingly, one must update the control file if additional cycles are
to resume where previous cycles left off. Only the new positional
parameters are used from the input coordinate file. One must also
update the scale factor in the input control file in a similar manner.
The new values of the scale, thermal and occupancy factors are listed
on the log file.
5) If a group contains only a single atom, then the three orientation
angles can not be refined. If a group contains only two atoms, then
only two of the orientation angles can be refined (it usually doesn't
matter which two, although it may if the inter-atom vector happens to
be parallel to one of the orthogonal axes ). If only one atom is
input, then the scale factor and occupancy factor can not both be
refined as they are identical in that situation.
6) With low resolution data, occupancy and thermal factors are highly
correlated and often can not be refined simultaneously.
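The automatic data selection described in notes 1 and 2 can be
sketched as follows. This Python fragment is an illustration, not the
GREF source: the function names are hypothetical, the test "number of
centric reflections >= MINCEN" is an assumed reading of note 1, and
the 25% count is assumed to be rounded down.

```python
def strongest_quarter(refl):
    """The 25% of reflections with the largest differences."""
    ranked = sorted(refl, key=lambda r: r[0], reverse=True)
    return ranked[:len(ranked) // 4]


def select_for_refinement(refl, icltyp, isftyp, mincen):
    """refl is a list of (fobs, is_centric) pairs after all cutoffs."""
    centric = [r for r in refl if r[1]]
    acentric = [r for r in refl if not r[1]]
    if icltyp == 0:                       # all data
        return list(refl)
    if icltyp == 1:                       # centric data requested
        if len(centric) >= mincen:        # enough: acentric suppressed
            return centric
        return centric + strongest_quarter(acentric)
    if icltyp == 2:                       # acentric data requested
        if isftyp == 1:                   # anomalous: largest 25% only
            return strongest_quarter(acentric)
        return acentric
    raise ValueError("ICLTYP must be 0, 1 or 2")
```

Setting MINCEN to a large (unobtainable) number therefore always adds
the strongest acentric differences, as recommended above for P2(1)
and P1.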
2.18 IMPORT WRITE-UP
IMPORT is a program which allows the user to enter the PHASES
package with externally derived phase information. It is generally
used when one wants to bypass the PHASIT program, i.e. phases and
Hendrickson-Lattman coefficients are computed via some external
program, and one wants to use this phase information within the PHASES
package, often for solvent flattening or phase combination with a
partial structure. The program is interactive and prompts for the
names of input and output files. It then reads the externally prepared
reflection file, converts the indices, phase and Hendrickson-Lattman
coefficients to correspond to the reflection in the "standard" PHASES
asymmetric unit, identifies reflections with restricted phases, and
writes the information in a PHASIT style file. This output file can be
used within the package wherever a PHASIT file could be used, i.e. in
FSFOUR, MISSNG, BNDRY, MRGDF, MRGBDF, RD31 etc.
***** FILES *****
The input file should be an ASCII (formatted) file with each record
containing the following data:
H, K, L, FOBS, FOM, PHI, A, B, C, D
where
H,K,L = Miller indices
FOBS = Native structure factor amplitude
FOM = Figure of merit
PHI = Best (centroid) phase, in degrees
A,B,C,D = Hendrickson-Lattman coefficients for phase probability
distribution
The file is read in free format, i.e. items must be separated by at
least one blank space or comma. The indices are read as Fortran
INTEGERS whereas all other data are read as REALS.
The output PHASIT style binary file contains records with
H, K, L, FMFO, FO, PHIBEST, IPRAB, IPRCD, MK, FM
where
H,K,L = Miller indices
FMFO = Figure of merit weighted structure factor amplitude
(either FOM * FP or FOM * F+)
FO = Observed structure factor amplitude (either FP or F+)
PHIBEST = Best (centroid) phase, in degrees.
IPRAB, IPRCD = Hendrickson-Lattman coefficients A,B,C,D for the phase
probability distribution used, packed two per word as
IPRAB = (IFIX(A*100)+16384)*32768 + IFIX(B*100)+16384 and
IPRCD = (IFIX(C*100)+16384)*32768 + IFIX(D*100)+16384
MK = Restricted phase indicator. For general reflections
MK=1, for centric reflections MK > 1 and one of the
allowed phase values is (MK-1)*15 degrees (the other
possibility is 180 degrees away).
FM = Figure of merit associated with PHIBEST and used for
weighting.
See the PHASIT write-up for more information.
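The packing of the Hendrickson-Lattman coefficients and the meaning of
MK can be made concrete with a small Python sketch. The function names
are hypothetical; the arithmetic follows the formulas given above,
with Python's int() standing in for Fortran IFIX (both truncate toward
zero). Unpacking is lossless only to 0.01 and only while the magnitude
of each scaled coefficient stays below 16384.

```python
OFFSET = 16384
BASE = 32768


def pack_pair(x, y):
    """Pack two coefficients into one word, as for IPRAB or IPRCD."""
    return (int(x * 100) + OFFSET) * BASE + int(y * 100) + OFFSET


def unpack_pair(word):
    """Recover the two coefficients (to 0.01) from a packed word."""
    hi, lo = divmod(word, BASE)
    return (hi - OFFSET) / 100.0, (lo - OFFSET) / 100.0


def centric_phases(mk):
    """Allowed phases in degrees for a centric reflection, else None."""
    if mk <= 1:                 # MK=1 marks a general reflection
        return None
    first = (mk - 1) * 15
    return first % 360, (first + 180) % 360
```

For instance, A=1.5 and B=-2.25 pack to 541802271, and MK=7 gives the
allowed phases 90 and 270 degrees.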
2.19 EXTRMAP WRITE-UP
Program EXTRMAP extracts a region from an input electron density
map prepared by FSFOUR, and writes it to a file in a form suitable for
input to any of the averaging programs (MAPORTH,MAPAVG,SKEW,LSQROT
etc.) The file can also be used in MAPVIEW if the non-fsfour map mode
is specified. This program is generally used to extract a 3D region
from the map which encompasses the dimer, trimer etc to be averaged,
i.e. a crystallographic asymmetric unit. There are no restrictions on
the specified output region, i.e. unit cell edges can be crossed, both
in the positive and negative direction. Note that the same thing can
be done with MAPVIEW in interactive mode, but EXTRMAP is better suited
for incorporation into a batch control file for multiple cycle runs.
INPUT DATA (UNIT 5)
RECORD I PAMFIL (free format)
PAMFIL = Input parameter file, used only to get the
"running log" filename.
RECORD II INPMAP (free format)
INPMAP = Input map file, as generated by FSFOUR
RECORD III OUTMAP (free format)
OUTMAP = Output map file
RECORD IV XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX (free format)
XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX =
Minimum and maximum fractional coordinates
defining the volume to be extracted and output.
***** FILES *****
INPUT MAP FILE (BINARY) - standard FSFOUR output, in default
orientation i.e. NORN=0
OUTPUT MAP FILE (BINARY) - contains extracted region
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 REAL*4 values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
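The bookkeeping implied by the header values can be sketched in
Python. The function names are hypothetical; the arithmetic follows
directly from the record layout and loop order shown above.

```python
def map_layout(ixmn, ixmx, iymn, iymx, izmn, izmx):
    """Return (values per record, number of map records)."""
    row_len = ixmx - ixmn + 1
    n_rows = (iymx - iymn + 1) * (izmx - izmn + 1)
    return row_len, n_rows


def record_index(iy, iz, iymn, izmn, izmx):
    """0-based position, after the header, of the row at (iy, iz).

    Y is the slowest varying index and Z varies within each Y.
    """
    return (iy - iymn) * (izmx - izmn + 1) + (iz - izmn)


def grid_to_fractional(ix, nx):
    """Fractional coordinate of grid index ix: IX*(del x)/A = IX/NX."""
    return ix / nx
```

A region with X from -5 to 20, Y from 0 to 9 and Z from -2 to 2 thus
holds 50 records of 26 values each.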
2.20 EXTRMSK WRITE-UP
Program EXTRMSK extracts a region from an input solvent mask file
(prepared by BNDRY) and writes it to a file in a form suitable for
input to any of the averaging programs (MAPORTH,MAPAVG,SKEW,LSQROT
etc.) This program is needed only if one wants to edit a SOLVENT mask,
and even then only if the desired asymmetric unit volume must span a
cell edge. Note that a similar result can frequently be obtained with
MAPVIEW provided one selects the entire map region, i.e. x,y,z all
going from 0 to .999, and the mask from BNDRY is read in as well. Then
upon exiting from MAPVIEW one can save a subset of the map and mask.
However, if the desired subregion to be extracted does not lie
completely within the bounds of the input map (for example, a cell
edge must be crossed), then this program must be used instead. In
EXTRMSK there are no restrictions on the specified output region, i.e.
unit cell edges can be crossed, both in the positive and negative
direction.
INPUT DATA (UNIT 5)
RECORD I PAMFIL (free format)
PAMFIL = Input parameter file, used only to get the
"running log" filename.
RECORD II INPMSK (free format)
INPMSK = Input mask file, as generated by BNDRY
RECORD III OUTMSK (free format)
OUTMSK= Output mask file
RECORD IV XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX (free format)
XMIN, XMAX, YMIN, YMAX, ZMIN, ZMAX =
Minimum and maximum fractional coordinates
defining the volume to be extracted and output.
***** FILES *****
INPUT (AND OUTPUT) MASK FILES (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 BYTE values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code where the array
MASK is FORTRAN type BYTE:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
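One way to picture how a region crossing a cell edge can be extracted
is to wrap each requested grid index back into the stored cell using
lattice periodicity. The Python sketch below assumes that the input
mask covers exactly one full unit cell (indices 0 to N-1 along each
axis); this is an assumption made for illustration, and the function
name is hypothetical rather than part of EXTRMSK.

```python
def extract_region(cell, nx, ny, nz, limits):
    """Extract rows along X for the requested grid limits.

    cell[ix][iy][iz] holds one full unit cell; limits is
    (IXMN, IXMX, IYMN, IYMX, IZMN, IZMX) and may be negative or
    exceed the cell dimensions.
    """
    ixmn, ixmx, iymn, iymx, izmn, izmx = limits
    rows = []
    for iy in range(iymn, iymx + 1):          # Y slowest varying
        for iz in range(izmn, izmx + 1):
            # The modulo maps any index onto its periodic equivalent.
            rows.append([cell[ix % nx][iy % ny][iz % nz]
                         for ix in range(ixmn, ixmx + 1)])
    return rows
```

The rows are returned in the same order in which the records are
written in the FORTRAN fragment above.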
2.21 MAPAVG WRITE-UP
Program MAPAVG averages electron density map regions according
to noncrystallographic (NC) symmetry. The NC symmetry
related regions may be in the same map, in different maps (crystals)
or both. The program expects the names of all input (unaveraged) map
files, all corresponding mask files, all output (averaged) map files
and the operators defining the noncrystallographic symmetry. The input
map files should be created from FSFOUR maps by running EXTRMAP or
MAPVIEW to extract the map region which encompasses only the dimer,
trimer etc to be averaged for each crystal. For cross-crystal
averaging monomers may be used as well. Each mask map must cover
EXACTLY the same region as its corresponding input map. The mask map
is generally created by MAPVIEW, and possibly transformed by TRNMSK,
although if it is derived from an atomic model it may be created by
MDLMSK. The operators are generally refined by LSQROT or LSQROTGEN
prior to use in averaging. If cross-crystal averaging is done an
additional least squares refinement pass is automatically included
prior to averaging to put the density maps from different crystals
on a common scale. After averaging, however, each output map will be
on the same scale as it was originally input.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Name of input parameter file, used only to get the
"running log" filename.
CARD II NCRYST (free format)
NCRYST = Number of different crystals (maps) to be used.
(maximum = 6)
The following block of cards III-VII must be repeated NCRYST times,
once for each crystal.
CARD III INPMAP (free format)
INPMAP = Name of input (unaveraged) map file for this
crystal.
CARD IV INPMSK (free format)
INPMSK = Name of input mask file for this crystal.
CARD V OUTMAP (free format)
OUTMAP = Name of output (averaged) map file for this
crystal.
CARD VI NMOL, (MSK(j), j=1,NMOL) (free format)
NMOL = Total number of molecules related by
noncrystallographic symmetry WITHIN THIS CRYSTAL
(e.g. 2 for twofold, 3 for threefold etc.; maximum = 12).
Note that it may be one if only cross-crystal
averaging of monomers is used.
MSK(1) = Mask no. identifying envelope mask for molecule 1
in this crystal
MSK(2) = Mask no. identifying envelope mask for molecule 2
in this crystal
.
.
MSK(NMOL) = Mask no. identifying envelope mask for molecule NMOL
in this crystal
Note that the mask numbers should correspond to those used during mask
creation (1-12), and refinement of the operator(s).
The following card must be repeated NMOL-1 times, with each entry
providing the operator which moves molecule 1 to one of the additional
NC related molecules WITHIN THIS CRYSTAL. For example, for a pure
threefold the operator which moves molecule 1 to molecule 2 and the
operator which moves molecule 1 to molecule 3 must both be supplied,
although their parameters will be the same except for CHI. In that case
all three molecules may have the same mask no. Note that if NMOL=1 this
card should NOT be included!
CARD(S) VII PHI, PSI, CHI, OX, OY, OZ, T (free format)
PHI, PSI, CHI =
Spherical polar angles defining direction and rotational order
of noncrystallographic symmetry axis, oriented with respect to
orthogonal frame with X along a, Y along c* cross a, and Z
along X cross Y (i.e. c*).
Psi = angle between NC symmetry axis and +Y axis. Phi = angle
between projection of NC symmetry axis on XZ plane and +X axis.
+Phi = CCW rotation about +Y axis as measured from +X axis.
+Chi = CW rotation about the directed axis, when viewed from
the +axis toward the origin. All angles in degrees.
OX, OY, OZ =
Origin of NC symmetry rotation axis, in angstroms with respect
to the orthogonal axes. The axis passes through this point.
T = Post rotation translational shift (in angstroms) parallel to the
rotation axis.
Note that the transformation operator input is defined as that which
moves molecule 1 to molecule J (both molecules within this crystal,
with J ranging from 2 to NMOL) via
Xj = (Rm) (X1 - Xo) + Xo + T*Rx
where Rm is a 3x3 rotation matrix expressed in terms of the spherical
polar angles, Xj, X1 are 3 element column vectors containing new and
old coordinates, respectively, Xo is a 3 element column vector
containing coordinates of the origin point for the rotation axis, T is
a post rotation translation shift scalar (in angstroms) and Rx is a 3
element column vector containing direction cosines of the rotation
axis.
The translation shift T is for a translation parallel to the
rotation axis (screw like) as translations in any other direction
can be achieved simply by changing the rotation axis origin. An
initial estimate of T can be obtained from two points P1, P2 related
by the NC symmetry from
T = DX cos(PHI)sin(PSI) + DY cos(PSI) -DZ sin(PHI)sin(PSI)
where DX = P2x-P1x, DY = P2y-P1y, DZ = P2z-P1z
and the P's are expressed in the orthogonal axial system.
Note the directionality of the transformation (P1 going to P2 as
opposed to P2 going to P1) affects the sign of T (and CHI).
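The estimate of T is simply the projection of the vector P2-P1 onto
the rotation axis. The Python sketch below illustrates this; the
function names are hypothetical, and the direction cosines of the axis
are read off from the coefficients of DX, DY and DZ in the formula
above.

```python
import math


def axis_direction(phi_deg, psi_deg):
    """Direction cosines (Rx) of the NC axis in the orthogonal frame."""
    phi = math.radians(phi_deg)
    psi = math.radians(psi_deg)
    return (math.cos(phi) * math.sin(psi),
            math.cos(psi),
            -math.sin(phi) * math.sin(psi))


def estimate_t(p1, p2, phi_deg, psi_deg):
    """Initial estimate of T from NC related points P1 and P2.

    Both points are in angstroms in the orthogonal axial system.
    """
    rx = axis_direction(phi_deg, psi_deg)
    return sum((b - a) * r for a, b, r in zip(p1, p2, rx))
```

Swapping P1 and P2 changes the sign of the result, in line with the
remark on directionality above.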
THIS IS THE END OF INPUT UNLESS DOING CROSS-CRYSTAL AVERAGING
**** The following cards should be included ONLY if NCRYST > 1 ****
Cards VIII must be repeated NCRYST -1 times, with each entry
providing the operator which moves molecule 1 in crystal 1 to molecule
1 in crystal 2, molecule 1 in crystal 1 to molecule 1 in crystal 3,
molecule 1 in crystal 1 to molecule 1 in crystal 4 etc.
CARD(S) VIII PHI, PSI, CHI, OX, OY, OZ, T (free format)
PHI, PSI, CHI, OX, OY, OZ, T =
All defined as described above. Note that the operator
is applied to ORTHOGONAL coordinates in crystal 1 to
generate ORTHOGONAL coordinates in the target crystal.
NOTES: Each input mask must coincide exactly with its corresponding
input map.
CROSS-CRYSTAL AVERAGING: If the different crystals contain different
aggregation states of the molecule within their respective asymmetric
units (e.g. monomer in one crystal, dimer in another), then the
crystal with the lowest aggregation state should come first in the
input list, and mask assignments in the other crystals must uniquely
identify molecules of this same size. Thus for example, if a
crystal contained a dimer having pure NC twofold symmetry and it was
the only crystal used, normally a single mask encompassing the
entire dimer would be supplied (mask numbers would be identical for
molecules 1 and 2). If however, in addition to averaging over this
twofold, one also averages with another crystal form containing only
a monomer, then the monomer crystal should come first in the list,
and different mask numbers must be used within the dimer crystal
to distinguish the individual monomers. If all crystal forms contain
the same basic unit (eg dimers, trimers etc), then individual mask
numbers for each monomer are not required, but may still be used as
long as it is done consistently in all crystals.
***** FILES *****
INPUT MAP FILES (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 REAL*4 values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
INPUT MASK FILES (BINARY)
Header record identical to map file.
Mask records similar to normal map records except that
the mask values are written as FORTRAN type "BYTE" (INTEGER*1).
Only grid points with mask values of 0, 10, 20, 30, 40 etc will
be used (i.e. inside envelope masks 1,2,3,4,5 etc, respectively).
OUTPUT MAP FILES (BINARY)
Identical (in structure) to input map file, but contains density
"averaged" over the specified points.
2.22 MAPORTH WRITE-UP
This program orthogonalizes an electron density map for later use
with programs LSQROT or LSQROTGEN. This program is only needed if one
wants to refine the noncrystallographic symmetry operator(s), and even
then only if the unit cell is not orthogonal. One generally extracts a
region from the map encompassing one dimer, trimer etc. with MAPVIEW
or EXTRMAP, and inputs only the extracted map here for
orthogonalization. One can optionally orthogonalize an input mask file
in addition to the map if desired, as may be the case if the mask will
be used to delineate volumes to use in the refinement.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Input file specifying cell parameters, symmetry,
used only to get "running log" file.
CARD II INPMAP (free format)
INPMAP = Input (non-orthogonal) map, from EXTRMAP or
MAPVIEW
CARD III IRANGE (free format)
IRANGE = 0 for normal operation.
= 1 to simply compute orthogonal coordinates for
all points in input map, and output range (in
orthogonal coordinates) which just encompasses
all of the input map. The range can then be used
to determine map parameters for a subsequent run
with IRANGE=0.
******** following CARDS read only if IRANGE = 0 ********
CARD IV OUTMAP (free format)
OUTMAP = Output map file to contain orthogonal map.
CARD V a', b', c' (free format)
a', b', c' = Cell lengths (in angstroms) for the output orthogonal
map. They should be large enough to cover the same
volume as the input map, when it is referenced in the
orthogonal system. New orthogonal cell a',b',c' has a'
along old a, b' along old c* cross old a, c' along a'
cross b' (i.e. old c*)
CARD VI MGX,MGY,MGZ,LXMN,LXMX,LYMN,LYMX,LZMN,LZMX (free format)
MGX, MGY, MGZ =
Number of grid points defining one "cell length" along
the respective orthogonal axis. Implicitly defines grid
spacing as del x = a'/MGX, del y = b'/MGY and del z = c'/MGZ
LXMN, LXMX, LYMN, LYMX, LZMN, LZMX =
Minimum, maximum grid index defining output map region
such that x (fractional) = LX * (del x) / a' etc.
There are no restrictions on magnitudes or signs.
CARD VII IMASK (free format)
IMASK = 0 For no mask input.
= 1 for input mask corresponding to input map. The mask will
also be "orthogonalized" so that it can be used by LSQROT
or LSQROTGEN.
******** following cards read ONLY if IMASK=1 ********
CARD VIII INPMSK (free format)
INPMSK = Input mask file, from MAPVIEW or EXTRMSK
CARD IX OUTMSK (free format)
OUTMSK = Output (orthogonal) mask file
NOTES: If a mask is input, it must coincide exactly with the input
map.
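The range computed with IRANGE=1 can be illustrated with a Python
sketch. It is not the MAPORTH source: the function names are
hypothetical, and the matrix is the usual orthogonalization for the
frame defined on CARD V (X along a, Y along c* cross a, Z along c*).
Because the transformation is linear, only the eight corners of the
input region need to be examined.

```python
import math
from itertools import product


def orthogonalization_matrix(a, b, c, al, be, ga):
    """Fractional to orthogonal (angstrom) matrix, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (al, be, ga))
    sg = math.sin(math.radians(ga))
    vol = a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                + 2.0 * ca * cb * cg)
    return [[a, b * cg, c * cb],
            [0.0, b * sg, c * (ca - cb * cg) / sg],
            [0.0, 0.0, vol / (a * b * sg)]]


def orthogonal_range(cell, grid, limits):
    """(min, max) along X, Y, Z in angstroms for a map region.

    cell = (A, B, C, AL, BE, GA), grid = (NX, NY, NZ),
    limits = (IXMN, IXMX, IYMN, IYMX, IZMN, IZMX).
    """
    m = orthogonalization_matrix(*cell)
    nx, ny, nz = grid
    ixmn, ixmx, iymn, iymx, izmn, izmx = limits
    corners = product((ixmn / nx, ixmx / nx),
                      (iymn / ny, iymx / ny),
                      (izmn / nz, izmx / nz))
    pts = [[sum(m[i][j] * f[j] for j in range(3)) for i in range(3)]
           for f in corners]
    return [(min(p[i] for p in pts), max(p[i] for p in pts))
            for i in range(3)]
```

The resulting ranges suggest values of a', b', c' and of the grid
limits on CARD VI that just enclose the input map.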
******** FILES ********
INPUT MAP FILE (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 REAL*4 values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
INPUT MASK FILE (BINARY)
Header record identical to map file.
Mask records similar to normal map records except that
the mask values are written as FORTRAN type "BYTE" (INTEGER*1).
Only grid points with mask values of 0, 10, 20, 30, 40 etc will
be transformed (i.e. inside envelope masks 1,2,3,4,5 etc,
respectively).
OUTPUT MAP FILE (BINARY)
Identical (in structure) to input map file, but cell, map and
density values correspond to the orthogonal "cell".
OUTPUT MASK FILE (BINARY)
Header record identical to map file.
Mask records similar to normal MASK records, but cell, map and
mask values correspond to the orthogonal "cell".
2.23 LSQROT WRITE-UP
This program refines the orientation and position of a
noncrystallographic symmetry axis by least squares. It is applicable
only to pure rotational noncrystallographic symmetry axes, and the
rotation must be N-FOLD where N is an integer. An input map (which
MUST be orthogonal) is read in along with control information
specifying initial values for the operator, and what area in the map
to consider. The map area considered can be all points within a given
distance from an arbitrary input point, all points within an input
mask, or all points simultaneously satisfying both conditions. The
input map usually encompasses a dimer, trimer etc and was extracted
from a FSFOUR map via programs MAPVIEW or EXTRMAP. The input mask, if
one is used, must correspond precisely to the input map. If the input
map (and mask) does not correspond to an orthogonal system, program
MAPORTH should be used to convert them before they can be used here.
After each cycle of refinement the correlation coefficient is printed
along with the new parameters. One should always start with a low
resolution map (roughly 6 angstrom data, on a 2 angstrom grid) in case
the initial estimates are inaccurate. As the refinement converges
higher resolution data and finer map grids should be used. It is often
sufficient to refine within a sphere of radius 25-35 angstroms
centered on a point on the rotational axis near the center of gravity
of the dimer, trimer etc. This enables refinement of the operator
without the need for an input mask (although one will be needed later
for averaging). Usually correlation coefficients of about 0.4 or
higher (in a 4 angstrom map) indicate the noncrystallographic symmetry
axis is well positioned, and that averaging will be useful.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Input file specifying cell and symmetry
parameters, used only to get "running log" file
CARD II INPMAP (free format)
INPMAP = Input map (orthogonal)
CARD III PHI, PSI, OX, OY, OZ, NFOLD (free format)
PHI, PSI =
Spherical polar angles defining direction of the
noncrystallographic (NC) symmetry axis, oriented with respect
to orthogonal frame with X along a, Y along c* cross
a, and Z along X cross Y (i.e. c*). PSI = angle
between NC symmetry axis and +Y axis. PHI = angle
between projection of NC symmetry axis on XZ plane
and +X axis. +PHI = CCW rotation about +Y axis as
measured from +X axis.
OX, OY, OZ =
Origin of NC symmetry rotation axis, in angstroms
with respect to the orthogonal axes. The axis passes
through this point.
NFOLD = Order of the rotational axis, e.g. 2, 3, 4 etc.
CARD IV NOBS, NCYCLE, ISPHER, IMASK (free format)
NOBS = 2 times number of reflections used to compute map
(used only to compute sigmas)
NCYCLE = Number of refinement cycles
ISPHER = 0 use all points in map
= 1 use only grid points within a specified sphere.
IMASK = 0 for no mask input
= 1 to only use grid points within envelope specified
by input mask. (also subject to ISPHER criteria)
CARD V INPMSK (free format)
******** include this card ONLY if IMASK=1 ********
INPMSK = Input mask file
CARD VI XCEN, YCEN, ZCEN, RAD (free format)
******** include this card ONLY if ISPHER=1 ********
XCEN, YCEN, ZCEN =
Sphere center, in Angstroms, with respect to orthogonal
coordinate system.
RAD = Sphere radius, in Angstroms
CARD VII ( IVAR(I), I=1,5 ) (free format)
Variable selection information
IVAR(1) = 1 to refine PHI, 0 to hold fixed
IVAR(2) = 1 to refine PSI, 0 to hold fixed
IVAR(3) = 1 to refine OX, 0 to hold fixed
IVAR(4) = 1 to refine OY, 0 to hold fixed
IVAR(5) = 1 to refine OZ, 0 to hold fixed
NOTES: Input map must be orthogonal. If the crystal system does not
have orthogonal axes, program MAPORTH must be run to orthogonalize
the map (and mask, if one is to be used).
If a mask is input, it must coincide exactly with the input map.
Normally all parameters are refined, but occasionally one must use
the IVAR selection flags to hold a parameter fixed. An example would
be the case where PSI is close to 0, in which case PHI is then
indeterminate (and irrelevant!). One could then hold PHI fixed to
avoid matrix singularities.
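The way the ISPHER and IMASK flags combine to select grid points can
be sketched as follows. This is a minimal illustration, not the
LSQROT code. It ASSUMES that in an orthogonal map the grid point
(IX,IY,IZ) lies at (IX*A/NX, IY*B/NY, IZ*C/NZ) angstroms, and that
"within" the sphere includes points exactly on its surface. The
function name and the representation of the mask as a set of grid
indices are hypothetical.

```python
import math


def selected_points(cell_lengths, periods, bounds, ispher, imask,
                    center=None, rad=None, inside_mask=None):
    """Return the grid points that pass the ISPHER and IMASK criteria.

    cell_lengths = (A, B, C); periods = (NX, NY, NZ);
    bounds = ((IXMN, IXMX), (IYMN, IYMX), (IZMN, IZMX));
    inside_mask = set of (IX, IY, IZ) lying inside the envelope.
    """
    a, b, c = cell_lengths
    nx, ny, nz = periods
    (ixmn, ixmx), (iymn, iymx), (izmn, izmx) = bounds
    dx, dy, dz = a / nx, b / ny, c / nz       # grid spacing in angstroms
    chosen = []
    for iy in range(iymn, iymx + 1):
        for iz in range(izmn, izmx + 1):
            for ix in range(ixmn, ixmx + 1):
                if ispher == 1:
                    xyz = (ix * dx, iy * dy, iz * dz)
                    if math.dist(xyz, center) > rad:
                        continue              # outside the sphere
                if imask == 1 and (ix, iy, iz) not in inside_mask:
                    continue                  # outside the envelope
                chosen.append((ix, iy, iz))
    return chosen
```

With ISPHER=1 and IMASK=1 a point must pass both tests, which matches
the "also subject to ISPHER criteria" wording on CARD IV.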
******** FILES ********
INPUT MAP FILE (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each
containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at
IXMN. Y is slowest varying, i.e. the file could have been created with
the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
INPUT MASK FILE (BINARY), if needed
Header record identical to map file.
Mask records similar to normal map records except that the mask values
are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with
mask values of 0, 10, 20, 30, 40 etc will be used (i.e. inside
envelope masks 1,2,3,4,5 etc, respectively).
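The correspondence between stored mask values and envelope mask
numbers quoted above (0, 10, 20, ... for masks 1, 2, 3, ...) can be
written as a pair of small helper functions. This is a sketch only;
the function names are hypothetical, and treating every value that is
not a non-negative multiple of 10 as "outside any envelope" is an
assumption, since the write-up does not list the other values.

```python
def envelope_number(stored):
    """Envelope mask number (1, 2, 3, ...) for a stored mask value,
    or None if the value is not one of 0, 10, 20, ... (assumed outside)."""
    if stored >= 0 and stored % 10 == 0:
        return stored // 10 + 1
    return None


def stored_value(envelope):
    """Stored mask value for envelope mask number 1, 2, 3, ..."""
    return (envelope - 1) * 10
```

For example, envelope mask 3 corresponds to a stored value of 20.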
2.24 LSQROTGEN WRITE-UP
This program refines the orientation and position of a general
noncrystallographic symmetry operator by least squares. It is
applicable to any noncrystallographic symmetry transformation,
including arbitrary rotation angles and post rotation translations.
The operator being refined may relate regions of density within the
same crystal or in different crystals, so that cross-crystal averaging
is possible. The input map(s) (which MUST be orthogonal) are read in
along with control information specifying initial values for the
operator, the number of crystals, and what areas in the map(s) to
consider. The map areas considered can be all points within a given
distance from arbitrary input points, all points within input masks,
or all points simultaneously satisfying both conditions.
The input map(s) usually encompass a dimer, trimer etc. and are
extracted from FSFOUR maps via programs MAPVIEW or EXTRMAP. The input
masks, if used, are prepared by MAPVIEW or MDLMSK and must correspond
precisely to the input maps. In the single crystal case if masks are
used, different mask values corresponding to the different molecules
present allow a single mask file to be used (see MAPVIEW). One merely
specifies which molecules (mask numbers) are to be used in refinement.
If the input map (and its corresponding mask) do not correspond to an
orthogonal system, program MAPORTH should be used to convert them
before they can be used for operator refinement here. After each cycle
of refinement the correlation coefficient is printed along with the
new parameters. In the two crystal case a scale factor relating the
density within the appropriate envelopes in each crystal is
automatically refined. In most cases one should start with a low
resolution map (roughly 6 angstrom data, on a 2 angstrom grid) in case
the initial operator parameters are inaccurate. As the refinement
converges higher resolution data and finer map grids should be used.
It is often sufficient to refine considering only density within
spheres of radius 15-25 angstroms, centered on either the aggregate
centroid (if pure rotational symmetry is present), or centered on
each molecule's center of gravity (the general case). This enables one
to refine the operator without the need for an input mask (although a
mask will be needed later for averaging). Usually correlation
coefficients of about 0.4 or higher (in a 4 angstrom map) indicate the
noncrystallographic symmetry axis is well positioned, and that
averaging will be useful.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Input file specifying cell and symmetry
parameters, used only to get "running log" file
CARD II NCYCLE, NCRYST (free format)
NCYCLE = No. of cycles of least squares refinement.
NCRYST = No. of crystals (1 or 2). Normally 1 but for
cross-crystal averaging it should be 2.
CARD III MAPFILE1 (free format)
MAPFILE1 = Name of file containing map for crystal 1.
***** Include card IIIA ONLY if NCRYST=2 *****
CARD IIIA MAPFILE2 (free format)
MAPFILE2 = Name of file containing map for crystal 2.
CARD IV PHI, PSI, CHI, OX, OY, OZ, T (free format)
PHI, PSI, CHI =
Spherical polar angles defining the direction of, and the
rotation angle about, the noncrystallographic symmetry axis,
oriented with respect to orthogonal frame with X along a, Y
along c* cross a, and Z along X cross Y (i.e. c*). PSI = angle
between NC symmetry axis and +Y axis. PHI = angle between
projection of NC symmetry axis on XZ plane and
+X axis. +PHI = CCW rotation about +Y axis as measured
from +X axis. +CHI = CW rotation about the directed axis,
when viewed from the +axis toward the origin.
OX, OY, OZ =
Origin of NC symmetry rotation axis, in angstroms with
respect to the orthogonal axes. The axis passes through
this point.
T = Post rotation translational shift (in angstroms) parallel
to the rotation axis.
Note that the transformation operator refined is defined as that which
moves molecule "A" to molecule "B" via
Xb = (Rm) (Xa - Xo) + Xo + T*Rx
where Rm is a 3x3 rotation matrix expressed in terms of the spherical
polar angles, Xb, Xa are 3 element column vectors containing new and
old coordinates, respectively, Xo is a 3 element column vector
containing coordinates of the origin point for the rotation axis, T is
a post rotation translation shift scalar (screw like, in angstroms)
and Rx is a 3 element column vector containing direction cosines of
the rotation axis. All components of the operator are given in terms
of orthogonal coordinates in Angstroms, and the operator is applied
to (and yields) orthogonal coordinates.
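The operator Xb = (Rm) (Xa - Xo) + Xo + T*Rx can be illustrated with
the short Python sketch below. It is not the LSQROTGEN code. The
direction cosines Rx are taken from the coefficients of the T formula
given in the NOTES for this program. The rotation matrix is built with
the standard axis-angle (Rodrigues) formula, and the sign used for CHI
is MY READING of "+CHI = CW rotation when viewed from the +axis toward
the origin"; it should be checked against program output before being
relied upon. All function names are hypothetical.

```python
import math


def axis_direction(phi_deg, psi_deg):
    """Direction cosines Rx of the rotation axis from PHI and PSI (degrees)."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    return (math.cos(phi) * math.sin(psi),
            math.cos(psi),
            -math.sin(phi) * math.sin(psi))


def rotation_matrix(axis, chi_deg):
    """3x3 matrix for a rotation of CHI degrees about a unit axis.

    Assumption: +CHI is clockwise seen from the tip of the axis, i.e. a
    negative angle in the usual right-handed sense.
    """
    theta = -math.radians(chi_deg)
    ux, uy, uz = axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + ux * ux * v, ux * uy * v - uz * s, ux * uz * v + uy * s],
        [uy * ux * v + uz * s, c + uy * uy * v, uy * uz * v - ux * s],
        [uz * ux * v - uy * s, uz * uy * v + ux * s, c + uz * uz * v],
    ]


def apply_operator(xa, phi_deg, psi_deg, chi_deg, origin, t):
    """Return Xb = Rm (Xa - Xo) + Xo + T*Rx, all in orthogonal angstroms."""
    rx = axis_direction(phi_deg, psi_deg)
    rm = rotation_matrix(rx, chi_deg)
    d = [xa[i] - origin[i] for i in range(3)]
    return tuple(
        sum(rm[i][j] * d[j] for j in range(3)) + origin[i] + t * rx[i]
        for i in range(3)
    )
```

Two properties hold whatever the sign convention for CHI: a point on
the axis is moved only by T along Rx, and the component of Xb - Xa
along Rx equals T for every point.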
For cross-crystal averaging applications molecule "A" is assumed to
reside in crystal 1 and molecule "B" in crystal 2. The ORTHOGONAL
coordinate systems in both crystals are then simply superimposed.
Note that the operator is defined relative to only the (common)
ORTHOGONAL axes, so one need not be concerned about its orientation
relative to each set of CRYSTAL axes.
CARD V ISPHERE_A, MSK_A (free format)
ISPHERE_A = 0 consider all points in map (subject only to
MSK_A criteria below) for molecule A.
= 1 consider only grid points within specified
sphere for molecule A. (also subject to
MSK_A criteria below).
MSK_A = 0 no mask input for molecule A.
= 1 consider only grid points within envelopes
specified by input mask for molecule A. (also
subject to ISPHERE_A criteria)
Note that ISPHERE_A and MSK_A should not BOTH
be 0, although both can be 1 in which case
both criteria are applied for grid point
selection.
***** Include cards VA and VB ONLY if MSK_A = 1 *****
CARD VA MASK_A (free format)
MASK_A = Mask number (from 1-12) identifying points within
molecular envelope for "A". The value should
correspond to that used during mask creation.
CARD VB MASKFILE1 (free format)
MASKFILE1 = Name of file containing mask for crystal 1.
***** Include card VC ONLY if ISPHERE_A = 1 *****
CARD VC XCENA,YCENA,ZCENA,RADA (free format)
XCENA, YCENA, ZCENA =
Sphere center, in Angstroms, with respect to orthogonal
coordinate system, situated in molecule "A".
RADA = Sphere radius, in Angstroms, for molecule "A"
CARD VI ISPHERE_B, MSK_B (free format)
ISPHERE_B = 0 consider all points in map (subject only to
MSK_B criteria below) for molecule B.
= 1 consider only grid points within specified
sphere for molecule B. (also subject to MSK_B
criteria below).
MSK_B = 0 no mask input for molecule B.
= 1 consider only grid points within envelopes
specified by input mask for molecule B. (also
subject to ISPHERE_B criteria)
Note that ISPHERE_B and MSK_B should not BOTH
be 0, although both can be 1 in which case
both criteria are applied for grid point
selection.
***** Include cards VIA and VIB ONLY if MSK_B = 1 *****
CARD VIA MASK_B (free format)
MASK_B = Mask number (from 1-12) identifying points within
molecular envelope for "B". The value should
correspond to that used during mask creation.
CARD VIB MASKFILE2 (free format)
MASKFILE2 = Name of file containing mask for crystal 2.
(may be same as MASKFILE1, if NCRYST=1).
***** Include card VIC ONLY if ISPHERE_B = 1 *****
CARD VIC XCENB,YCENB,ZCENB,RADB (free format)
XCENB, YCENB, ZCENB =
Sphere center, in Angstroms, with respect to orthogonal
coordinate system, situated in molecule "B".
RADB = Sphere radius, in Angstroms, for molecule "B"
CARD VII ( IVAR(I), I=1,7 ) (free format)
Variable selection information
IVAR(1) = 1 to refine PHI, 0 to hold fixed
IVAR(2) = 1 to refine PSI, 0 to hold fixed
IVAR(3) = 1 to refine CHI, 0 to hold fixed
IVAR(4) = 1 to refine OX, 0 to hold fixed
IVAR(5) = 1 to refine OY, 0 to hold fixed
IVAR(6) = 1 to refine OZ, 0 to hold fixed
IVAR(7) = 1 to refine T, 0 to hold fixed
NOTES: Input maps (and masks, if used) must be orthogonal. If the
crystal systems do not have orthogonal axes, program MAPORTH must be
run to orthogonalize the maps (and masks, if used).
If masks are input, they must coincide exactly with their
corresponding maps.
Normally all parameters are refined, but occasionally one must use
the IVAR selection flags to hold one or more parameters fixed.
An example would be the case where PSI is close to 0, in which case
PHI is then indeterminate (and irrelevant!). One could then hold PHI
fixed to avoid matrix singularities. Also, in cases where pure
translations are involved, one could hold all of the angles fixed.
The translation shift T is for a translation parallel to the
rotation axis (screw like) as translations in any other direction
can be achieved simply by changing the rotation axis origin. An
initial estimate of T can be obtained from two points P1, P2
related by the NC symmetry from
T = DX cos(PHI)sin(PSI) + DY cos(PSI) -DZ sin(PHI)sin(PSI)
where DX = P2x-P1x, DY = P2y-P1y, DZ = P2z-P1z
and the P's are expressed in the orthogonal axial system.
Note the directionality of the transformation (P1 going to P2 as
opposed to P2 going to P1) affects the sign of T (and CHI). If
the transformation operator is available as a simple 3x3 matrix
and 1x3 vector WHICH OPERATES ON ORTHOGONAL COORDINATES AS IN THE
PROTEIN DATA BANK FRAMEWORK, then the program O_to_sp can be used
to convert that representation to the spherical polar angles, axis
offset and post rotation translation needed here. An example of
this usage would be to get the matrix and vector from one of the
"lsq" options in the graphics program "O", and use o_to_sp to
convert the information to PHASES style.
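The initial estimate of T described in the notes above is simply the
projection of the vector P1 -> P2 onto the rotation axis. A minimal
Python sketch of that formula follows; the function name is
hypothetical, angles are taken in degrees and coordinates in
orthogonal angstroms.

```python
import math


def estimate_t(p1, p2, phi_deg, psi_deg):
    """T = DX cos(PHI)sin(PSI) + DY cos(PSI) - DZ sin(PHI)sin(PSI)."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    dx, dy, dz = (p2[i] - p1[i] for i in range(3))
    return (dx * math.cos(phi) * math.sin(psi)
            + dy * math.cos(psi)
            - dz * math.sin(phi) * math.sin(psi))
```

Swapping P1 and P2 changes the sign of the result, in line with the
remark on the directionality of the transformation.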
******** FILES ********
INPUT MAP FILES (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with
each containing one row (IXMX-IXMN+1 REAL*4 values) along X,
starting at IXMN. Y is slowest varying, i.e. the file could have
been created with the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
INPUT MASK FILES (BINARY), if needed
Header record identical to map file.
Mask records similar to normal map records except that the mask values
are written as FORTRAN type "BYTE" (INTEGER*1). Only grid points with
mask values of 0, 10, 20, 30, 40 etc will be used (i.e. inside
envelope masks 1,2,3,4,5 etc, respectively).
2.25 SKEW WRITE-UP
Program SKEW converts a "normal" input map (and optionally, a
mask) to a "skewed" cell, such that the new b axis will
correspond to a specified direction. The input map is usually created
by MAPVIEW or EXTRMAP, and the input mask (if any) is usually created
by MAPVIEW or EXTRMSK. This program is generally used if the input
noncrystallographic symmetry operator is purely rotational, with the
rotation angle given by 360/N degrees where N is an integer, i.e. pure
twofolds, threefolds etc. In that case it is much easier to create the
averaging mask (in MAPVIEW) when looking directly down the
noncrystallographic symmetry axis, and convert it back to the standard
orientation (via program TRNMSK) for use in averaging cycles. Thus one
would use MAPVIEW or EXTRMAP to extract a region from the map
encompassing the dimer, trimer etc. to be averaged, skew that
extracted map, input it into MAPVIEW to trace out the mask in the
skewed direction, and then convert the skewed mask from MAPVIEW back
into the standard orientation with TRNMSK. One would then input the
standard (non-skewed) mask into MAPVIEW again, along with the submap
from which the skewed map was originally created, and invoke the
MAKE ASU option. This last step allows one to check for redundant
entries (by CRYSTAL symmetry) within the envelope, and to correct them
prior to saving the final mask for averaging.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Input parameter file containing cell and symmetry
information, used only to get "running log" file
CARD II INPMAP (free format)
INPMAP = Input map file, from MAPVIEW or EXTRMAP
CARD III PHI, PSI, OX, OY, OZ (free format)
PHI, PSI =
Spherical polar angles defining direction of
noncrystallographic symmetry axis, oriented with respect
to orthogonal frame X,Y,Z with X along a, Y along c* cross
a, and Z along X cross Y (i.e. c*). PSI = angle between NC
symmetry axis and +Y axis. PHI = angle between projection
of NC symmetry axis on XZ plane and +X axis. +PHI = CCW
rotation about +Y axis as measured from +X axis.
OX, OY, OZ =
Origin of NC symmetry rotation axis, in angstroms with
respect to the orthogonal axes. The axis passes through
this point.
CARD IV IRANGE, IMASK (free format)
IRANGE = 0 for normal operation (coordinate range for OUTPUT
skewed map will be input to the program by the user)
= 1 to determine coordinate range for OUTPUT map which just
encompasses the input volume, output it, and stop.
IMASK = 0 for normal operation (only skewed map created)
= 1 to additionally create skewed mask, from input mask
****** FOLLOWING CARDS READ ONLY IF IRANGE=0 ******
CARD V OUTMAP (free format)
OUTMAP = Output (skewed) map file
CARD VI CELL,MX,MY,MZ, LXMN,LXMX,LYMN,LYMX,LZMN,LZMX (free format)
CELL = length (in Angstroms) for "cell" parameters in "skewed" cell.
MX, MY, MZ =
Number of grid points defining one "cell length" along the
respective axis in "skewed" cell. Implicitly defines grid
spacing as del x = CELL/MX, del y = CELL/MY and del z = CELL/MZ
LXMN, LXMX, LYMN, LYMX, LZMN, LZMX =
Minimum, maximum grid index defining output map region
such that x (fractional) = LX * (del x) / CELL etc.
There are no restrictions on magnitudes or signs.
****** FOLLOWING CARDS READ ONLY IF IMASK=1 ******
CARD VII INPMSK (free format)
INPMSK = Input mask file (standard orientation)
CARD VIII OUTMSK (free format)
OUTMSK = Output (skewed) mask file.
******** FILES ********
INPUT (AND OUTPUT) MAP FILES (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each
containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at
IXMN. Y is slowest varying, i.e. the file could have been created with
the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
NOTES: If a mask is input, it must coincide exactly with the input
map.
INPUT (AND OUTPUT) MASK FILES (BINARY)
Header record identical to map files.
Mask records similar to normal map records except that the mask values
are written as FORTRAN type "BYTE" (INTEGER*1). Grid points with mask
values of 0, 10, 20, 30, 40 etc correspond to envelope masks
1,2,3,4,5 etc, respectively.
2.26 BLDCEL WRITE-UP
BLDCEL is a program to rebuild an electron density map (and
optionally a mask) covering one complete unit cell, from an input map
(and mask) covering an asymmetric unit. Typically an asymmetric unit
encompassing a dimer, trimer etc is extracted from a FSFOUR map by
EXTRMAP or MAPVIEW, and averaging is done only within that submap in
program MAPAVG. The averaged asymmetric unit map and corresponding
mask are input here, along with the FSFOUR map from which the
asymmetric unit map was extracted. An exact copy of the FSFOUR map is
first created, but then density values at grid points which lie within
the averaging envelopes are replaced by their values from the averaged
map. Averaged density also replaces values at grid points related by
crystal symmetry to the input averaged map. Thus the output map
corresponds to the averaged map, except that it covers one full cell
and conforms to the space group symmetry. Density values at grid points
which were not averaged simply retain their values from the original map. The
format is identical to that produced by FSFOUR, thus this map is
suitable for inversion, peak search etc. If desired, the input
asymmetric unit mask can also be expanded to a full cell mask. There
is never any need to do this for averaging purposes, but it is useful
if one wants to use SOLVENT FLATTENING masks which were edited by
program MAPVIEW. In that case one would extract a region from the
solvent mask (created by BNDRY, option 1) which covers an asymmetric
unit, by using MAPVIEW or EXTRMSK. Edit the extracted solvent mask in
MAPVIEW. Then use the MAKE ASU expansion option in MAPVIEW to create
an edited mask file obeying crystal symmetry. Finally, input the
edited mask file here, and request that it also gets expanded to a
full cell. The output mask file then can be used for solvent
flattening, since it covers a full cell, obeys space group symmetry
and has the same structure as a normal solvent flattening mask.
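The "copy, then replace inside the envelope" step described above can
be sketched in Python as follows. This is a simplified illustration
and not the BLDCEL code. It ASSUMES the full cell map is indexed from
0 to NX-1, NY-1, NZ-1, that submap indices outside that range are
brought back into the cell by lattice periodicity, and it OMITS the
expansion by crystal symmetry that BLDCEL also performs. The function
name and the dictionary representation of the maps are hypothetical.

```python
def rebuild_cell(full_map, periods, averaged, inside):
    """Return a copy of full_map with averaged density inserted.

    full_map: {(IX, IY, IZ): density} covering one full cell
    periods:  (NX, NY, NZ)
    averaged: {(IX, IY, IZ): density} for the asymmetric unit submap
    inside:   set of submap grid points lying within the envelopes
    """
    nx, ny, nz = periods
    out = dict(full_map)                      # exact copy of the input map
    for (ix, iy, iz) in inside:
        # Assumed wrap of submap indices into the full cell range.
        key = (ix % nx, iy % ny, iz % nz)
        out[key] = averaged[(ix, iy, iz)]
    return out
```

Points outside the envelopes keep their original density, as stated in
the description.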
INPUT DATA (UNIT 5)
RECORD 1 PAMFIL (free format)
PAMFIL = Input parameter file specifying cell and symmetry
information.
RECORD 2 INPMAP (free format)
INPMAP = Input map file (unaveraged, full cell, from
FSFOUR)
RECORD 3 OUTMAP (free format)
OUTMAP = Output map file (full cell, averaged)
RECORD 4 INPASU (free format)
INPASU = Input map (averaged, asymmetric unit)
RECORD 5 INPMSK (free format)
INPMSK = Input mask (asymmetric unit)
RECORD 6 NEWMSK (free format)
NEWMSK = 0 for no expansion of mask to full cell
= 1 to also expand input mask to full cell
RECORD 7 OUTMSK (free format)
******* include this record only if NEWMSK=1 *******
OUTMSK = Output mask file (full cell)
***** FILES *****
INPUT MAP FILE "INPMAP" (BINARY)
Standard FSFOUR map, default orientation i.e. NORN=0
OUTPUT MAP FILE "OUTMAP" (BINARY)
Same structure as input file "INPMAP"
INPUT MAP FILE "INPASU" (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The map follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each
containing one row (IXMX-IXMN+1 REAL*4 values) along X, starting at
IXMN. Y is slowest varying, i.e. the file could have been created with
the following FORTRAN code:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(RHO(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
INPUT MASK FILE "INPMSK" (BINARY)
Header record identical to map file INPASU.
Mask records similar to normal map records in INPASU except that the
mask values are written as FORTRAN type "BYTE" (INTEGER*1). Only grid
points with mask values of 0, 10, 20, 30, 40 etc will be used (i.e.
inside envelope masks 1,2,3,4,5 etc, respectively).
OUTPUT MASK FILE OUTMSK (BINARY)
Same structure as input mask file INPMSK, but covers full cell
2.27 MDLMSK WRITE-UP
This program creates a mask file by identifying all points on a map
grid which are within a given distance from any atom in an input
model. It therefore can be used to create model based masks for use in
averaging (or solvent flattening, if the mask is symmetrized in
MAPVIEW and expanded to a full cell in BLDCEL). The program is
interactive and prompts for the names of the standard parameter file,
an input coordinate file, an output mask file, the atomic radius, a
mask number (from 1 to 12) uniquely identifying the molecule and
parameters for the map grid. The output mask file can be viewed in
MAPVIEW, provided one selects a map region identical to the mask
region, and uses the same grid. If multiple molecules are present in
the asymmetric unit, MDLMSK can be run several times over each
molecule separately (but covering a volume which encompasses all
molecules), specifying a unique mask number for each, but covering the
same region and using the same map grid. The separate output files can
then be combined into a single mask file with MRGMSK, which will
retain the identity of each molecular envelope mask. This combined
mask file can then be used for averaging or for refinement of
noncrystallographic symmetry operators in LSQROTGEN. If the
noncrystallographic symmetry is purely rotational with periodicity N
where N is an integer, then only a single mask is needed which
encompasses the entire dimer, trimer etc. In that case the output mask
can be used for refinement in LSQROT.
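The idea of flagging every grid point that lies within a given
distance of any atom can be sketched in Python as below. This is not
the MDLMSK code. To keep the example short it ASSUMES an orthogonal
cell (all angles 90 degrees), ignores symmetry mates and lattice
translations of the atoms, and treats points exactly at the radius as
inside. The function name is hypothetical.

```python
import math


def points_near_atoms(cell_lengths, periods, bounds, atoms_frac, radius):
    """Return the set of grid points within `radius` angstroms of any atom.

    cell_lengths = (A, B, C) for an orthogonal cell (assumption)
    periods      = (NX, NY, NZ)
    bounds       = ((IXMN, IXMX), (IYMN, IYMX), (IZMN, IZMX))
    atoms_frac   = list of fractional (X, Y, Z) atomic coordinates
    """
    a, b, c = cell_lengths
    nx, ny, nz = periods
    (ixmn, ixmx), (iymn, iymx), (izmn, izmx) = bounds
    atoms = [(fx * a, fy * b, fz * c) for fx, fy, fz in atoms_frac]
    inside = set()
    for iy in range(iymn, iymx + 1):
        for iz in range(izmn, izmx + 1):
            for ix in range(ixmn, ixmx + 1):
                point = (ix * a / nx, iy * b / ny, iz * c / nz)
                if any(math.dist(point, atom) <= radius for atom in atoms):
                    inside.add((ix, iy, iz))
    return inside
```

A real mask file would then store, for each flagged point, the value
corresponding to the chosen mask number.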
****** FILES ******
INPUT COORDINATE FILE - ASCII with format ( 7X, A1, I3, A4, 5F10.5, I5)
Each record should contain
RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP
where
RT = single letter amino acid code (not used)
IRES = sequence number (MUST be present and unique for each
residue)
ATOM = atom name (not used)
X, Y, Z = fractional atomic coordinates
B = Isotropic thermal factor (not used)
OCC = Occupancy factor (not used)
ITYP = Atomic type identifier (not used)
Note! This format is identical to that used by PHASIT, in
structure factor calculation mode.
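As an illustration of the record layout ( 7X, A1, I3, A4, 5F10.5, I5),
the Python sketch below slices one coordinate record into its fields.
It is not part of PHASES; the function name is hypothetical. It
ASSUMES every real field contains an explicit decimal point (FORTRAN
would otherwise imply five decimal places) and reads a blank integer
field as 0.

```python
def parse_coordinate_record(line):
    """Split one record into RT, IRES, ATOM, X, Y, Z, B, OCC, ITYP."""
    line = line.rstrip("\n").ljust(70)

    def to_int(field):
        return int(field) if field.strip() else 0   # blank field read as 0

    # Columns 1-7 are skipped (7X); the five F10.5 fields start at column 16.
    x, y, z, b, occ = (float(line[15 + 10 * k:25 + 10 * k]) for k in range(5))
    return {
        "RT": line[7],                 # A1
        "IRES": to_int(line[8:11]),    # I3
        "ATOM": line[11:15].strip(),   # A4
        "X": x, "Y": y, "Z": z,        # fractional coordinates
        "B": b, "OCC": occ,
        "ITYP": to_int(line[65:70]),   # I5
    }
```

Only IRES, X, Y and Z are actually used by MDLMSK according to the
field list above.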
OUTPUT MASK FILE (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in Angstroms,
angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each
containing one row (IXMX-IXMN+1 BYTE values) along X, starting at
IXMN. Y is slowest varying, i.e. the file could have been created with
the following FORTRAN code, where array MASK is defined to be FORTRAN
type BYTE:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
2.28 MRGMSK WRITE-UP
This program is used to merge two different mask files into a
single mask file. It is needed when one wants to create an averaging
mask from a set of input atomic coordinates, and the
noncrystallographic symmetry present is not purely rotational with
order of rotation N where N is an integer, i.e. if arbitrary rotations
and/or post rotation translations are required. In that case MDLMSK
should be run two or more times, with each run generating a mask for a
different molecule in the asymmetric unit, and SPECIFYING A UNIQUE
MASK NUMBER. Program MRGMSK can then combine all of the individual
mask files into one, which can be used for averaging, operator
refinement etc. All of the input masks must cover precisely the same
map volume, and use the same map grid spacing. The output mask will
correspond to this volume and spacing as well. If both input masks
identify the same grid point as being within their envelopes, the
status of that point is changed to indicate non-molecule, since it
is not clear to which molecule it should belong. The number of such
overlapping points is output. The output mask can be examined in
MAPVIEW, in which case the different molecular masks will be shown in
different colors. This program is interactive and prompts for the
input and output mask files, and the standard parameter file.
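The merging rule described above, including the treatment of
overlapping points, can be sketched in Python as follows. This is an
illustration only and not the MRGMSK code. Each mask is represented
here as a dictionary holding only the grid points that lie inside an
envelope, so "non-molecule" simply means absent from the dictionary;
that representation and the function name are hypothetical.

```python
def merge_masks(mask_a, mask_b):
    """Merge two masks given as {(IX, IY, IZ): mask_number}.

    Returns (merged, overlaps). Points claimed by both inputs are left
    out of the merged mask (non-molecule) and counted in overlaps.
    """
    merged = {}
    overlaps = 0
    for point in set(mask_a) | set(mask_b):
        if point in mask_a and point in mask_b:
            overlaps += 1              # ambiguous ownership: drop the point
        elif point in mask_a:
            merged[point] = mask_a[point]
        else:
            merged[point] = mask_b[point]
    return merged, overlaps
```

Both inputs are expected to cover the same map volume on the same
grid, as required by the program.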
******** FILES ********
INPUT (AND OUTPUT) MASK FILES (BINARY)
record 1) A,B,C,AL,BE,GA,NX,NY,NZ,IXMN,IYMN,IZMN,IXMX,IYMX,IZMX
with first 6 values REAL*4, next 9 INTEGER*4, lengths in
Angstroms, angles in degrees.
NX, NY, NZ =
Number of grid points defining one "cell length" along the
respective axis. Implicitly defines grid spacing as
del x = A/NX, del y = B/NY and del z = C/NZ
IXMN, IXMX, IYMN, IYMX, IZMN, IZMX =
Minimum, maximum grid index defining map region such
that x (fractional) = IX * (del x) / A etc.
There are no restrictions on magnitudes or signs.
The mask follows as (IYMX-IYMN+1)*(IZMX-IZMN+1) records, with each
containing one row (IXMX-IXMN+1 BYTE values) along X, starting at
IXMN. Y is slowest varying, i.e. the file could have been created with
the following FORTRAN code, where array MASK is defined to be FORTRAN
type BYTE:
DO 30 IY=IYMN,IYMX
DO 20 IZ=IZMN,IZMX
20 WRITE(LU)(MASK(IX,IY,IZ),IX=IXMN,IXMX)
30 CONTINUE
2.29 TRNMSK WRITE-UP
TRNMSK is a program to transform a mask file constructed in a
"skewed" cell back to its conventional cell. In certain situations
(when the noncrystallographic symmetry is purely rotational), it is
highly advantageous to trace out the mask (via MAPVIEW) in terms of a
cell "skewed" such that the new b axis corresponds to the
noncrystallographic symmetry axis direction. In that case it is
usually obvious where the NC symmetry breaks down, and the envelope
mask for averaging is readily obtained. However, skewing the map is
not necessary for the averaging cycles, thus it is desirable to
transform the "skewed" mask, once created, back to the original cell
for all subsequent calculations. TRNMSK accomplishes this.
INPUT DATA (UNIT 5)
CARD I PAMFIL (free format)
PAMFIL = Input file containing cell parameters and
symmetry, used only to get "running log" file
CARD II INPMAP (free format)
INPMAP = Input map file (corresponds to region extracted
from original map before it was
"skewed")
CARD III INPMSK (free format)
INPMSK = Input mask file (skewed)
CARD IV OUTMSK (free format)
OUTMSK = Output mask file (unskewed), covers same region
as INPMAP
CARD V PHI, PSI, OX, OY, OZ (free format)
PHI, PSI, OX, OY, OZ =
Input spherical polar angles and origin as
originally input to SKEW
******** FILES ********
INPUT MAP - As described for SKEW
INPUT (AND OUTPUT) MASKS - As described for SKEW
2.30 RDHEAD WRITE-UP
This program can be used to print out the header information on
any map or mask file used in the noncrystallographic symmetry
averaging options. Since all of the programs assume that map and mask
files refer to the same structure, cover precisely the same region and
use the same grid, it is important to be sure that this is the case.
In fact, all of the software verifies this at run time anyway, but it
is still useful at times to see exactly what is on the header record, in
order to find out what went wrong if inconsistencies are reported. The
program prompts for the name of the map or mask file, and prints the
header information regarding cell constants, map periods and grid
range covered by the contents of the file. Note that this program can
be used for ANY mask file, including solvent masks created by BNDRY.
It can also be used for ANY submap map file arising from the averaging
related software (MAPVIEW output, EXTRMAP, SKEW, MAPORTH, MAPAVG) but
NOT for FSFOUR maps, which always cover a full cell anyway.
2.31 PRECESS WRITE-UP
PRECESS can be used to construct and display "pseudo" precession
photographs created from input reflection data files. On SGI
hardware either precess or precess_X can be used. On other hardware
only precess_X can be used. Note also that if precess is run on
an SGI workstation, it should be invoked from a WINTERM window and
NOT from an XTERM window. Precess_X can be invoked from either
window. The user can interactively select the zone to display, and
scroll up or down through neighboring zones, selecting information
for any reflection by moving the cursor to it. Several input file
formats are recognized, including any of the "scaled" files used
within PHASES, XENGEN style "MULISTS" or "UREFLS" files, SCALEPACK
style files or a simple free format input file.
Data within the displayed zone are grouped into bins (256 for the
IRIS GL version, and 101 for X-window version) based on intensity,
and displayed with a corresponding gray scale scheme. Alternatively,
a full color display can be used. If requested, a pseudo background
based on the mean sigma's as a function of resolution can be added
to the display, creating a realistic image complete with beam stop
shadow. If a "scaled" file is input, the user is prompted for the
file type and to select either the native intensities or intensity
differences (isomorphous or anomalous, depending on the input file
type) for display.
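The binning step described above can be sketched in Python as follows. This is only an illustration, not the PRECESS source: the linear mapping between a low and a high threshold and the function name `intensity_to_bin` are assumptions. Only the bin counts (256 for the IRIS GL version, 101 for the X-window version) come from the text.

```python
def intensity_to_bin(intensity, low, high, nbins=256):
    """Map an intensity onto a bin index 0..nbins-1.

    Assumption: bins are spaced linearly between the two thresholds.
    Values outside the thresholds are clamped to the first or last bin.
    """
    if high <= low:
        raise ValueError("high threshold must exceed low threshold")
    frac = (intensity - low) / (high - low)
    frac = min(max(frac, 0.0), 1.0)  # clamp to the threshold range
    return min(int(frac * nbins), nbins - 1)
```

Under this assumption, raising or lowering the thresholds (as the "UP" and "DOWN" menu items do) shifts every reflection to a darker or lighter bin without rereading the data.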
The program will first prompt for an input "parameter file".
The program then prompts for a file name, which should be the
name of a data file with a .mu, .urf, .scl, .sca or a .dat extension,
for a resolution cutoff, for the desired zone, whether a pseudo
background is to be included, and whether a color or continuous gray
scale photograph display is desired.
After reading in the data file, a reasonable color scheme is
determined and the desired zone is displayed along with a menu. The
data is selected, when possible, in a manner which preserves anomalous
scattering information and, when possible, the actual measurements
for symmetry related reflections are used. Finally however, rather
than leaving "holes" in the picture, symmetry operations (including
Friedel's relationship) are used to fill in missing data. Thus
the resulting display will conform to the true diffraction symmetry
if all required data were present on the input file, but if some
reflections were missing but their symmetry mates were present, the
intensities for the mates are used.
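The filling strategy described above can be sketched in Python. This is a hedged illustration rather than the PRECESS implementation: the function name `fill_zone`, the representation of symmetry operations as functions on indices, and the use of point group 222 as the example are all assumptions made for the sketch.

```python
# Index transformations for point group 222 (example only).
POINT_GROUP_222 = [
    lambda h, k, l: (h, k, l),
    lambda h, k, l: (-h, -k, l),
    lambda h, k, l: (h, -k, -l),
    lambda h, k, l: (-h, k, -l),
]

def fill_zone(measured, wanted, symops=POINT_GROUP_222):
    """Return {hkl: (intensity, source)} for each wanted index.

    The actual measurement is preferred, then a symmetry mate,
    and only then a Friedel mate. Indices with no mate are omitted.
    """
    filled = {}
    for hkl in wanted:
        mates = [op(*hkl) for op in symops]  # mates[0] is hkl itself
        friedel = [(-h, -k, -l) for h, k, l in mates]
        for source, candidates in (("symmetry", mates), ("friedel", friedel)):
            hit = next((m for m in candidates if m in measured), None)
            if hit is not None:
                label = "measured" if hit == hkl else source
                filled[hkl] = (measured[hit], label)
                break
    return filled
```

Trying the Friedel mates last mirrors the statement that anomalous scattering information is preserved whenever possible.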
Moving the cursor to a menu item and pressing a mouse button will
then carry out the selected option. In most cases any of the mouse
buttons will suffice, but for some items the buttons have different
functions.
The "UP" and "DOWN" menu items change the color map
intensity thresholds, and thus the image intensity scale. For each
of these options the left mouse button makes a slight change, the
middle button a moderate change and the right button a substantial
change.
Pressing any mouse button while in the "EXIT" field terminates the
program.
Pressing the left or right mouse button while in the "ZONE" field will
toggle the next zone index direction (indicated by the arrow).
Pressing the middle mouse button while in this field will read in and
display the next (or previous) zone as desired, using the current
color intensity scheme.
Moving the cursor in the data display area causes the resolution
and intensity at the current cursor position to be displayed. If one
is near a Bragg reflection, however, the indices, integrated intensity
and its standard deviation are displayed along with the resolution.
Pressing any mouse button while in the "NEW DIRECTION" field will
allow the user to select another zonal direction (e.g. hk0 when
h0l was originally chosen), and/or select a new resolution cutoff.
Pressing any mouse button while in the "SAVE IMAGE" menu area will
save the entire screen contents as an "image" file, with the name
"prec_N.rgb", where N is a one or two digit number. Numbers start
from zero and are automatically incremented each time an image is
saved. Up to 100 images can be made in any job. NOTE!! This option
is not yet functional on the X-window version of precess.
For the purpose of photographing the display, it is often desirable
to remove the menu and color map since there is usually too much
contrast variation between them and the frame data for both to be
reliably recorded with the same exposure. The menu display can be
toggled on/off by pressing any mouse button while the cursor is
anywhere to the right of the menu items. Note however, that when the
menu is off all other functions are disabled, thus it must be
toggled back on to restore interactive functionality, and to enable
exiting from the program.
***** FILES *****
The type of input reflection file is deduced from the ending
part of the filename. Recognized endings are:
.MU, .mu, .SCL, .scl, .SCA, .sca, .URF, .urf, .DAT or .dat
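The selection of a file type from the filename ending can be sketched as below. The function name and the descriptive strings are hypothetical. Only the list of endings comes from the text, and the sketch accepts an ending only in the all-upper-case or all-lower-case forms listed there.

```python
import os

# Descriptions are illustrative labels, not PRECESS output.
FORMATS = {
    ".mu": "XENGEN MULIST",
    ".urf": "XENGEN UREFLS",
    ".scl": "PHASES scaled file",
    ".sca": "SCALEPACK",
    ".dat": "free format",
}

def reflection_file_type(filename):
    """Return a format label for a recognized ending, else None."""
    ending = os.path.splitext(filename)[1]
    if ending not in (ending.lower(), ending.upper()):
        return None  # mixed case such as ".Sca" is not in the list
    return FORMATS.get(ending.lower())
```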
XENGEN output will typically be either .MU format (having F's,
NOT I's), or .URF format, although the .URF files can currently
be used only if they were created on a UNIX computer.
Any of the "scaled" file formats accepted by PHASIT can also be
used here, and will be assumed if a ".SCL" or ".scl" ending is
used. If this option is chosen, the user also will be asked
what type of data is in the file, and whether to display the native
intensities or intensity differences (isomorphous or anomalous,
as appropriate for the file type).
If the filename ends with ".SCA" or ".sca", then a SCALEPACK
file is assumed. After a variable number of header records
(see the FILE FORMATS section), reflection records follow and
contain
H, K, L, I+, sig(I+), I-, sig(I-)
in format (3I4, 4F8.1)
Note the use of intensities rather than F's. The last two items
in each record may be omitted. If present, they would be used
only if I+ was not measured.
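A reflection record in the (3I4, 4F8.1) layout above can be read with fixed column slices, as in this Python sketch. The function name is hypothetical, and returning None for omitted or blank fields is a choice made for the sketch. How PRECESS itself flags an unmeasured I+ is not stated here.

```python
def parse_scalepack_record(line):
    """Split one record into indices and up to four real values.

    Columns 1-12 hold H, K, L (three 4-character integers), followed
    by four 8-character fields: I+, sig(I+), I-, sig(I-).
    """
    line = line.rstrip("\n")
    h, k, l = (int(line[start:start + 4]) for start in (0, 4, 8))
    values = []
    for start in range(12, 12 + 4 * 8, 8):
        field = line[start:start + 8].strip()
        values.append(float(field) if field else None)
    i_plus, sig_plus, i_minus, sig_minus = values
    return (h, k, l), i_plus, sig_plus, i_minus, sig_minus
```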
A general, free format file can be used and is assumed if the
file ends in ".DAT" or ".dat", in which case each record must
contain
IH, IK, IL, F, SIG(F)
readable in free format, i.e. at least one blank or a comma
separates the entries.
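For the free format case, a record can be split on blanks or commas, as in the following Python sketch. The function name is hypothetical, and any items after the fifth are simply ignored here.

```python
def parse_free_format_record(line):
    """Return (IH, IK, IL, F, SIG(F)) from a blank or comma separated record."""
    fields = line.replace(",", " ").split()
    if len(fields) < 5:
        raise ValueError("expected IH, IK, IL, F, SIG(F)")
    ih, ik, il = (int(item) for item in fields[:3])
    return ih, ik, il, float(fields[3]), float(fields[4])
```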
2.32 O_TO_SP WRITE-UP
O_TO_SP is an interactive program that extracts, from a 3D transformation
matrix and translation vector, the corresponding spherical polar
angles, axis location and post rotation translation along the
axis direction. Thus, for example, one can obtain an estimate
of the transformation operator via the "O" program, use this
program to convert the NC symmetry operator information to PHASES
format, and use the PHASES routines for operator refinement,
averaging, skewing etc. The user is prompted for the elements of the
transformation matrix and vector, which can be found in the
appropriate O data block. Note that the program is not limited
to transformations from "O". As long as the input transformation
operation was to be applied to orthogonal coordinates which are
orthogonalized as in the PDB, then the output will be valid.
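Part of this decomposition can be sketched in Python. The sketch assumes the operator acts as x' = R x + t on orthogonal coordinates (the matrix layout used by a given "O" data block may be transposed relative to this), and the function name is hypothetical. It recovers the rotation angle, the unit vector along the axis and the translation along that axis. Conversion of the axis to the PHASES spherical polar angles and the axis location are omitted, because their conventions are not given in this write-up.

```python
import math

def rotation_axis_angle(rot, trans):
    """Return (angle in degrees, unit axis, translation along axis).

    rot is a 3x3 rotation matrix as nested lists, trans a 3-vector.
    """
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    cos_a = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    angle = math.acos(cos_a)
    sin_a = math.sin(angle)
    if abs(sin_a) < 1e-8:
        # 0 or 180 degrees: the antisymmetric part vanishes
        raise ValueError("axis needs separate handling for this angle")
    axis = (
        (rot[2][1] - rot[1][2]) / (2.0 * sin_a),
        (rot[0][2] - rot[2][0]) / (2.0 * sin_a),
        (rot[1][0] - rot[0][1]) / (2.0 * sin_a),
    )
    screw = sum(a * t for a, t in zip(axis, trans))
    return math.degrees(angle), axis, screw
```

A purely rotational operator gives a translation along the axis of zero. A nonzero value indicates a screw component.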
2.33 XPL_PHI WRITE-UP
Program to convert the PHASES binary phase file (long format) to
a reflection file suitable for input to the program XPLORE. The
program prompts for input and output file names. All reflections
in the input file are passed to the output file. The output file
will contain the indices, Fobs, phi and the figure of merit. It is
suitable for refinement in XPLORE, with or without phase restraints.
***** FILES *****
INPUT - Binary phase file (long format), as generated from PHASIT
or BNDRY
OUTPUT - ASCII file, containing indices, Fobs, phi (centroid)
and associated figure of merit.
2.34 PDB_CDS WRITE-UP
Interactive program to interchange coordinate files between
PDB and PHASES formats. The user is first prompted to determine if
the conversion will be from an input PDB file to a PHASES file. If
the response is no, the opposite direction is assumed. The user
is then prompted for input and output file names, and whether
or not occupancy and/or thermal factors are to be reset. If either
is to be reset, a prompt for the appropriate new value is given.
The user is then asked what residue range is to be selected for
output, and what chain ID (single character). The chain ID must match
that given in the input file. If no chain ID is present, supply a
blank in response to the prompt. After writing the selected atoms to
the output file, the user is then prompted again for another
range/chain ID. Processing continues until no more ranges/chains
are requested. If a PHASES style output file is requested, it is
suitable for use in programs PHASIT or GREF. The program will
recognize the 20 standard amino acids with their appropriate three
letter codes (PDB format, all upper case) or single letter codes
(PHASES format, upper case). It will also recognize several
additional "residue" types, with the following three letter and one
letter codes, respectively.
SUL U, WAT O, TDP Z, HEM X, CAL B, CAD J, MAG m, ZNC z,
PO4 p, ADE a, CYT c, GUA g, THY t, EXT e
Note that the codes are case sensitive. If a residue type is input
that is not one of the above, it still will be processed but will
be converted to the "extra" type (EXT or e) and a message will be
output. This presents no problem as residue types are never used
anywhere in the PHASES package. However, if a residue type was
converted when going to PHASES coordinates, one will have to
remember to manually convert it back to its original designation
following each run of PDB_CDS regenerating PDB coordinates.
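The residue code conversion can be sketched in Python as follows. The additional codes are taken from the list above. The one letter codes for the 20 standard amino acids are assumed to be the conventional ones, and the function names are hypothetical.

```python
STANDARD = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}
EXTRA = {
    "SUL": "U", "WAT": "O", "TDP": "Z", "HEM": "X", "CAL": "B",
    "CAD": "J", "MAG": "m", "ZNC": "z", "PO4": "p", "ADE": "a",
    "CYT": "c", "GUA": "g", "THY": "t", "EXT": "e",
}
THREE_TO_ONE = {**STANDARD, **EXTRA}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

def three_to_one(residue):
    """Return (one letter code, converted); unknown types become 'e'."""
    if residue in THREE_TO_ONE:
        return THREE_TO_ONE[residue], False
    return "e", True  # caller should report the conversion

def one_to_three(code):
    """Case sensitive lookup; unknown codes become 'EXT'."""
    return ONE_TO_THREE.get(code, "EXT")
```

Because the lookup is case sensitive, "M" (MET) and "m" (MAG) remain distinct, as the note on case sensitivity requires.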
When going from PDB to PHASES coordinates, the atom type must
also be deduced from the atom name. To do this the program will use
the first one or two characters of the atom name, possibly in
combination with the residue type. If the atom name starts with the
letters
C, N, O, S, P, FE, ZN or MG (all upper case)
the appropriate atom type will be recognized. However, if the residue
type is CAL or CAD, then an atom name starting with CA or CD will
be recognized as calcium or cadmium, respectively. If the atom name
is inconsistent with any of the above, it is still processed but the
type will be set to carbon and a message will be output. Note that even
if the atom type is not recognized, you can still use the file within
the PHASES package. You will just have to manually reset the atom type
code number in the output file to an appropriate number, and possibly
input the additional scattering factor information to PHASIT and/or
GREF. PHASIT and GREF always recognize the code numbers 1,2,...20 as
corresponding to
C, N, O, S, Fe+3, Pt+2, Hg+2, Au+3, Pb+2, Os+4, I-, Zn+2, Ca+2, Mg+2,
Cd+2, U+6, P, Br-, Cl- and Sm+3, respectively.
Thus any additional atom types must start with the code number 21.
See the PHASIT and GREF writeups for details on how to input the
scattering factors.
The residue names are never used within the PHASES package.
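The atom type deduction described above can be sketched in Python. The code numbers follow the list of 20 built-in types. The order in which the prefixes are tested and the function name are assumptions of the sketch, not taken from the PDB_CDS source.

```python
# Code numbers 1..20 in the order listed for PHASIT and GREF.
TYPE_CODES = {
    "C": 1, "N": 2, "O": 3, "S": 4, "FE": 5, "PT": 6, "HG": 7,
    "AU": 8, "PB": 9, "OS": 10, "I": 11, "ZN": 12, "CA": 13,
    "MG": 14, "CD": 15, "U": 16, "P": 17, "BR": 18, "CL": 19, "SM": 20,
}

def atom_type_code(atom_name, residue):
    """Return (type code, recognized) for a PDB atom name."""
    name = atom_name.strip()
    # CA and CD mean calcium and cadmium only in CAL and CAD residues.
    if residue == "CAL" and name.startswith("CA"):
        return TYPE_CODES["CA"], True
    if residue == "CAD" and name.startswith("CD"):
        return TYPE_CODES["CD"], True
    for prefix in ("FE", "ZN", "MG"):
        if name.startswith(prefix):
            return TYPE_CODES[prefix], True
    if name[:1] in ("C", "N", "O", "S", "P"):
        return TYPE_CODES[name[0]], True
    return TYPE_CODES["C"], False  # default to carbon, report a message
```

With this ordering an alpha carbon "CA" in an ordinary residue is still treated as carbon, while "CA" in a CAL residue is treated as calcium.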
***** FILES *****
INPUT - either a standard PDB file or a PHASES style coordinate
file
OUTPUT - the inverse of what was input.
2.35 RMHEAVY WRITE-UP
This program is used to temporarily remove electron density near
heavy atom sites from an input map, so that a solvent mask can be
accurately created from the map via the automatic boundary procedure.
The strong density often found near heavy atom sites in initial
MIR or SIR maps can lead to an extension of the protein mask near
the heavy atom sites. Since these sites are nearly always on the
protein surface, the effect is that the protein envelope can
incorrectly extend into the solvent region (and if the envelope is
tight, therefore be depleted elsewhere in the protein region).
A list of heavy atom sites is read in along with a map and a distance
cutoff. The atoms are expanded by space group symmetry, and if
any grid point is within the cutoff distance from any atom in the
expanded list, its electron density value is set to zero. The
modified map is then output. Note that this map is normally used
ONLY for the purpose of SOLVENT MASK GENERATION. The original MIR
or SIR map is still used for all solvent flattening computations.
The procedure DOALL.SH will automatically accomplish this, provided
the file names in the supplied template procedures are adhered to.
If one does not wish to do this, simply comment out the two lines
rmheavy < rmhv.d >> mask1.l
mv nohv.map four.map
in each of the files mask1.sh, mask2.sh and mask3.sh
Typically the input file will contain coordinates of all heavy
metals that were used in the phasing, and a distance cutoff of
about 2.5 angstroms.
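The density removal step can be illustrated with the Python sketch below. It is not the RMHEAVY source. It assumes an orthogonal cell (all angles 90 degrees) so that distances follow directly from fractional differences, and it represents the map as nested lists and the symmetry operations as functions on fractional coordinates. All names are hypothetical.

```python
import itertools

def remove_heavy_density(grid, cell, sites, symops, cutoff):
    """Zero every grid point within cutoff (angstroms) of any atom.

    grid[ix][iy][iz] covers one full cell; cell = (a, b, c).
    Sites are expanded by symops and periodic images are considered.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    expanded = [tuple(c % 1.0 for c in op(*site))
                for site in sites for op in symops]
    for index in itertools.product(*(range(n) for n in dims)):
        point = [i / n for i, n in zip(index, dims)]
        for atom in expanded:
            dist2 = 0.0
            for p, a, edge in zip(point, atom, cell):
                delta = (p - a + 0.5) % 1.0 - 0.5  # nearest periodic image
                dist2 += (delta * edge) ** 2
            if dist2 <= cutoff ** 2:
                ix, iy, iz = index
                grid[ix][iy][iz] = 0.0
                break
    return grid
```

For a non-orthogonal cell the fractional differences would first have to be converted to orthogonal coordinates before computing the distance.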
INPUT DATA (UNIT 5)
RECORD 1 PAMFILE (free format)
PAMFILE = Input parameter file specifying cell and
symmetry information.
RECORD 2 INPMAP (free format)
INPMAP = Input map file (full cell, from FSFOUR)
RECORD 3 OUTMAP (free format)
OUTMAP = Output map file (full cell, as in input)
RECORD 4 NA, RAD (free format)
NA = Number of input heavy atom sites
RAD = Distance cutoff, in angstroms.
the NA input atomic coordinate records now follow
RECORDS 5 ATNAME, X, Y, Z, B, OCC, ITYPE FORMAT(7X,A8,5F10.5,I5)
ATNAME = not used
ITYPE = not used
OCC = not used
X,Y,Z = Fractional atomic coordinates
B = not used
Note that the coordinate format is identical to that used in PHASIT,
thus a copy of the earlier phasing deck can be made and edited for
use here.
***** FILES *****
INPUT MAP - Standard FSFOUR map, covering a full cell
OUTPUT MAP - Same format as input map, but with heavy atom
density removed
2.36 CTOUR WRITE-UP
CTOUR is a program to create contoured plots of electron density
maps which can then be displayed or printed. The program accepts an
input map which is prepared by FSFOUR in the default orientation
(NORN=0), along with limit, direction and contouring information.
The output consists of one or more generic metafiles which can be
converted to the format needed for a given display by the appropriate
driver program, several of which are provided. Multiple plots can be
created within a single run, with each plot consisting of either an
individual map section, a mono projection over multiple sections, or a
stereo projection over multiple sections. Any map region may be
selected and viewed down either a direct cell axis or a reciprocal
cell axis (the latter used for projections). The metafiles created
will have the names plt001.plt, plt002.plt etc and can be viewed
via the driver programs VIEWPLT or VIEWPLT_X (on SGI or X-window
supporting workstations), by PLTTEK (on terminals supporting TEKTRONIX
4010 graphics) or converted to PostScript for subsequent printing
by MKPOST. Note that in general program MAPVIEW (or MAPVIEW_X) would
be preferred to examine contoured plots, since it allows the
interactive selection (and modification) of orientation, region and
contouring intervals. In some instances CTOUR has advantages however,
as it facilitates creation of hard copies for examination away from the
terminal or workstation, for creation of minimaps, and for stereo
plots. CTOUR is very useful for examination of difference Patterson
maps, where for example, all of the Harker sections can be generated
and then displayed simultaneously with the program VIEWPLT (or
VIEWPLT_X). It is recommended that one first examine the plots with
VIEWPLT or VIEWPLT_X before converting to PostScript as this can
be done extremely rapidly, whereas printing and even simply
displaying PostScript files can be much more time consuming.
INPUT DATA (UNIT 5)
CARD I MAPFIL (free format)
MAPFIL = Input map file (from FSFOUR, in default
orientation i.e. NORN=0)
CARD II CMIN,CMAX,CSTEP,IGRID,PSIZE,VDIS,RSCALE (free format)
CMIN, CMAX, CSTEP = Minimum, maximum and increment for contour
levels, on the scale set by RSCALE (see below)
IGRID = 0 To include labels and border on plots
= 1 To include labels, border and grid lines on
plots (facilitates coordinate measurement for
Pattersons)
= 2 To eliminate labels, grid lines and border on
plots
PSIZE = Plot size in inches (usually 10. if hard copy
is to be produced).
VDIS = View distance in inches (usually 30., used
only for stereo plots. Decreasing it increases
the stereo effect).
RSCALE = Sets density scale for contours. If 0., then
density is scaled such that the largest value
in the unit cell is 999. If > 0., then the
density is on an absolute scale (minus the
F000/V term) when the F's used in map creation
are related to an absolute scale by the
factor RSCALE, i.e. when F(abs)=RSCALE*F(input).
Regardless of the choice, the min, max and
sigma for the map on the chosen scale will be
listed on the output, and can be used to set
contour levels for a subsequent run.
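The relationship between CMIN, CMAX, CSTEP and the RSCALE=0 scaling can be sketched in Python. The function names are hypothetical, and whether CTOUR draws a level that falls exactly on CMAX is an assumption of the sketch (it is included here).

```python
import math

def contour_levels(cmin, cmax, cstep):
    """Levels cmin, cmin+cstep, ... not exceeding cmax."""
    if cstep <= 0:
        raise ValueError("CSTEP must be positive")
    count = int(math.floor((cmax - cmin) / cstep + 1e-9)) + 1
    return [cmin + i * cstep for i in range(max(count, 0))]

def scale_to_999(values):
    """RSCALE = 0 behaviour: largest value in the cell becomes 999."""
    peak = max(values)
    return [v * 999.0 / peak for v in values]
```

With the map scaled to a maximum of 999, CMIN=30 and CSTEP=10 correspond to about 3% and 1% of the largest peak, which for a Patterson map is the origin peak.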
**** The following card can be repeated as many times as desired ****
CARDS III NSEC,XMN,XMX,YMN,YMX,ZMN,ZMX,NORN (free format)
NSEC = 0 for individual sections, one plot per
section
= 1 for mono projection, one plot for entire
range
= 2 for stereo projection, one plot for
entire range
XMN, XMX, YMN, YMX, ZMN, ZMX = Minimum, maximum coordinates (fractional)
in a, b and c directions defining map
volume to be contoured.
NORN = 1 view as YZ sections (look down a or a*)
= 2 view as XZ sections (look down b or b*)
= 3 view as XY sections (look down c or c*)
************** EXAMPLES **************
1) The following script will compute a Patterson map and contour
three Harker sections. Three generic plot files (having the
names pltNNN.plt where NNN is a three digit number) will be
created. We start contouring at about 3% of the origin peak
height, which will be scaled to 999. and increase in steps
of 1% of the origin peak. We request that labels and a grid
are included to facilitate coordinate measurement. Finally,
we convert the generic plot files to PostScript. The
corresponding PostScript files will have the names pltNNN.pst.
#compute the difference Patterson map
#
fsfour << eod > fsfour.l
seb.pam
Difference Patterson, 3A
0 48 72 80 5 0 20 0 0 0 0.
patt.ref
patt.map
eod
#
#now contour three Harker sections
#
ctour << eod2 > ctour.l
patt.map
30. 999. 10. 1 10. 30. 0.
0 0.5 0.5 0.0 1.0 0.0 1.0 1
0 0.0 1.0 0.5 0.5 0.0 1.0 2
0 0.0 1.0 0.0 1.0 0.5 0.5 3
eod2
#
#now convert all generic plot files to PostScript
#
mkpost *.plt
#
2) The following script will compute an MIR map and generate
a series of plots. First a small mono projection down the b*
axis is created. Then a minimap is made, contouring
individual sections. We start contouring at one sigma
and increase to the maximum in steps of sigma. Min, max
and sigma values were obtained from the log from a prior
short run which contoured only a single section. Labels
and a border are requested, but no grid lines. Finally,
all generic plot files are converted to PostScript.
#compute the solvent flattened MIR map
#
fsfour << eod > fsfour.l
pdc.pam
PDC MIR MAP, 3A
0 144 80 120 1 0 20 0 0 0 0.
phi16cy.31
mir.map
eod
#
#now contour both a projection and individual sections
#
ctour << eod2 > ctour.l
mir.map
146. 999. 146. 0 10. 30. 0.
1 -.5 .5 -.05 .05 -.5 .5 2
0 -.42 .45 -.45 .42 -.08 .56 2
eod2
#
#now convert all generic plot files to PostScript
#
mkpost *.plt
#
2.37 VIEWPLT WRITE-UP
VIEWPLT is an interactive program to display one or more plots
created by CTOUR on an IRIS workstation. The analogous program
VIEWPLT_X can be used to display the plots on any workstation
supporting the X-Window protocol. As with MAPVIEW and PRECESS,
if VIEWPLT is run on an SGI workstation it should be invoked only
from a WINTERM window, and NOT from an XTERM window. VIEWPLT_X
can be invoked from either window type. When invoked, the program
prompts for the number of plot files to display (max = 10), and
then for each file name. The plots will be scaled to fit on the
display regardless of the size requested at plot creation time.
If a "/R" is appended to the end of a plot file name, then the
plot will be rotated by 90 degrees, which sometimes allows a
better fit of the plot to its allowed space. When the plot is
finished, the terminal bell will ring to notify the user.
Pressing the "return" will then terminate the program. VIEWPLT
is particularly useful for displaying contoured Harker sections,
as the multiple sections can be simultaneously displayed. It is
also useful to screen contoured projections or sections prior to
conversion to PostScript, as viewing or printing the PostScript
versions takes much longer.
****** FILES ******
INPUT FILES - Generic plot files created by CTOUR
2.38 PLTTEK WRITE-UP
PLTTEK is an interactive program to display plots created by
CTOUR on terminals supporting TEKTRONIX 4010 emulation. When
invoked, the program simply prompts for the name of a plot
file, and proceeds to display it on the terminal. After the
plot is completed, the terminal bell will ring. Pressing the
"return" then terminates the program. Prior to plotting, the
program sends an escape sequence appropriate to place DEC
VT240 series terminals in 4010 emulation mode. Upon termination
an escape sequence to return to native emulation is sent. This
sending of escape sequences can be eliminated by appending
"/NOEM" to the plot file name, which would be appropriate for
a terminal already in 4010 mode. The plot will be scaled to fit
on the display regardless of the size requested at plot creation
time. If a "/R" is appended to the end of a plot file name, then
the plot will be rotated by 90 degrees, which sometimes allows a
better fit of the plot to its allowed space. On UNIX systems use
of the directory delimiter "/" interferes with interpretation of
the "/NOEM" and "/R" switches, thus one should only specify file
names of files resident in the current working directory, where
no "/" other than for the switches will be present. Note that this
type of graphic display is orders of magnitude slower than VIEWPLT,
however in some instances it may be the only way to see a plot;
as for example, when one is working at home from a "dumb" terminal.
Also note that changing terminal emulation modes sometimes alters
behavior of terminals. After viewing the plots, it may be necessary
to log off and back in again, possibly powering down the terminal
in between, or to reissue terminal initialization commands
explicitly.
****** FILES ******
INPUT FILES - Generic plot files created by CTOUR
2.39 MKPOST WRITE-UP
MKPOST is a command to convert generic plot files created by
CTOUR to PostScript. It accepts one or more command line arguments,
which must be names of generic plot files. For each file, a new
PostScript version with extension ".pst" is created. The command
supports filename expansion options such as use of wildcards. Thus
the command
mkpost *.plt
will convert all generic plot files to PostScript, while using
mkpost plt001.plt
will convert only one file, creating the PostScript version
plt001.pst. The PostScript files can then be examined with a
previewer such as psview or xpsview, or printed on a PostScript
printer. MKPOST is actually just a shell script (UNIX) or command
procedure (VMS) to enable file name expansion. It simply creates a
list of filenames and pipes the list to program POSTPLOT, which
actually does the conversions. The program POSTPLOT will
automatically scale the plot so it will fit on standard A4 paper.
When doing this scaling, it may also rotate the plot by 90 degrees
to minimize any required shrinkage. If one insists on an orientation
which is inconsistent with this rotation, then the plot size
requested in CTOUR must be reduced to the point where no shrinkage
is needed at all. Also, note that on UNIX systems a new shell is
spawned, and it is important that the original working directory be
maintained or else one will encounter failures with "file not found"
type messages. This can happen if one has change directory ("cd")
commands in the .cshrc or .login files. In that case, upon spawning
the new shell the working directory is changed and the plot files
originally present will not be found.
***** FILES *****
INPUT FILES - Generic plot files produced by CTOUR
OUTPUT FILES - PostScript versions of the input files
2.40 PSTATS WRITE-UP
PSTATS (Phase Statistics) is a program to compute and tabulate
mean phase differences between two phase sets as a function of
d spacing. The program is interactive and prompts for the
parameter file and the names of two phase files. The phase files
can be either the long or short forms (as produced from PHASIT,
BNDRY, or GREF), or even one of each. Reflections need not be
indexed identically in both files, but each file should contain
only unique reflections. The statistics enable one to determine
how different two phase sets are. When combining experimental
phase information with that obtained from a partial structure,
it may be useful to monitor these statistics and use them to
adjust the damping factor defining relative weights between
the two sources of information.
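The statistic tabulated by PSTATS can be illustrated with the Python sketch below. It is not the PSTATS source. It assumes an orthogonal cell for the d spacing calculation and assumes both phase sets have already been brought to the same indexing, and all names are hypothetical.

```python
import math

def phase_difference(phi1, phi2):
    """Smallest absolute difference between two phases (0..180 degrees)."""
    diff = abs(phi1 - phi2) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def mean_phase_difference(set1, set2, cell, shell_edges):
    """Tabulate mean phase difference in shells of d spacing.

    set1, set2: {(h, k, l): phase in degrees}; cell = (a, b, c);
    shell_edges: descending d limits, e.g. [math.inf, 6.0, 3.0].
    Returns a list of (d_max, d_min, count, mean or None).
    """
    shells = list(zip(shell_edges, shell_edges[1:]))
    totals = [[0.0, 0] for _ in shells]
    for hkl in set1.keys() & set2.keys():
        inv_d2 = sum((index / edge) ** 2 for index, edge in zip(hkl, cell))
        if inv_d2 == 0.0:
            continue  # F(000) has no d spacing
        d = 1.0 / math.sqrt(inv_d2)
        for n, (d_max, d_min) in enumerate(shells):
            if d_min < d <= d_max:
                totals[n][0] += phase_difference(set1[hkl], set2[hkl])
                totals[n][1] += 1
                break
    return [(d_max, d_min, count, total / count if count else None)
            for (d_max, d_min), (total, count) in zip(shells, totals)]
```

For two unrelated acentric phase sets the mean difference tends toward 90 degrees, so values well below that indicate agreement.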
***** FILES *****
INPUT FILES - Binary "phased" files, either in long or short
format as produced by PHASIT, BNDRY or GREF.
2.41 HNDCHK WRITE-UP
HNDCHK is an interactive program to examine electron density
values at specified locations within a map, usually for the
purpose of determining the absolute configuration (hand). The
program prompts for the parameter file, name of the map file and
for the coordinates to be examined. The map file is read and the
minimum and maximum density values along with sigma for the map
are listed. Density values are then interpolated both at the
specified coordinates and at places related to them by a centre
of symmetry, and are listed. In general, one would compute MIR,
SIR etc phases, possibly solvent flattened, and then generate a
Bijvoet difference Fourier map (with coefficients from MRGBDF)
to be used with HNDCHK. One would then examine density values
exactly at the input heavy atom positions used in the phasing.
If the hand was correct, then large positive peaks should occur
at the input sites, whereas if the hand was incorrect larger
NEGATIVE peaks should occur at the TRUE heavy atom locations, i.e.
at places related to the input (incorrect) positions by a centre
of symmetry.
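The comparison made by HNDCHK can be sketched in Python. The sketch assumes a map held as nested lists covering one full cell, uses trilinear interpolation (the interpolation scheme actually used by HNDCHK is not stated here), and takes the centre of symmetry at the origin. All names are hypothetical.

```python
import itertools

def interpolate(grid, frac_xyz):
    """Trilinear interpolation on a periodic full-cell grid."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    pos = [(c % 1.0) * n for c, n in zip(frac_xyz, dims)]
    base = [int(p) for p in pos]
    frac = [p - b for p, b in zip(pos, base)]
    value = 0.0
    for corner in itertools.product((0, 1), repeat=3):
        weight = 1.0
        for f, step in zip(frac, corner):
            weight *= f if step else 1.0 - f
        i, j, k = ((b + step) % n for b, step, n in zip(base, corner, dims))
        value += weight * grid[i][j][k]
    return value

def density_at_sites_and_inverses(grid, sites):
    """Return (site, density at site, density at inverted site)."""
    return [(site,
             interpolate(grid, site),
             interpolate(grid, tuple(-c for c in site)))
            for site in sites]
```

A large positive value in the second column suggests the input hand is correct, whereas a larger negative value in the third column suggests the true sites are the inverted ones.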
***** FILES *****
INPUT FILE - Standard FSFOUR map, usually computed with the
Bijvoet difference Fourier option (MAPTYP=8)
2.42 SLOEXT WRITE-UP
SLOEXT is a program to control the rate and range of phase
extension when extending phases to higher resolution by solvent
flattening, negative density truncation and/or NC symmetry
averaging. The program is automatically invoked by the "extnd.sh",
"extnda.sh" and "extndavg.sh" scripts, and functions by updating the
"extnd.d" or "extnda.d" files used by these scripts periodically
during the extension process. The initial and final resolution
cutoffs are specified by the user along with the number of map
modification/phase combination cycles to be carried out at each
resolution increment. The resolution is incremented in steps
corresponding to roughly one reciprocal lattice point in the
direction of the shortest reciprocal cell axis. If the initial and
final resolutions are equal, then there is no gradual extension and
only the number of cycles input is carried out. The controlling
scripts assume that the input to MAPINV specifies indices out to the
highest resolution to be encountered anywhere in the entire process.
Likewise, the "extrfl.d" file prepared by MISSNG also should contain
reflections out to the highest resolution.
INPUT DATA (UNIT 5)
RECORD 1 PAMFILE (free format)
PAMFILE = Input parameter file specifying cell and
symmetry information.
RECORD 2 DINIT, DFIN, NC_INC (free format)
DINIT = Initial d spacing cutoff (starting value,
usually the resolution of the initial phase set)
DFIN = Final d spacing cutoff (DFIN must be less than
or equal to DINIT. If DFIN = DINIT, then only
NC_INC cycles will be performed. Otherwise
for EACH RESOLUTION INCREMENT NC_INC cycles
will be performed).
NC_INC = Number of refinement cycles per resolution
increment (between 2 and 25).
RECORD 3 CNTFIL (free format)
CNTFIL = Name of file controlling phase extension. This
should be either "extnd.d" or "extnda.d" (UNIX
systems) or "extnd.dat" or "extnda.dat" (VMS
systems), depending on whether phase only or
phase plus amplitude extension is being done.
This file is referenced by the "extnd.sh",
"extnda.sh" and "extndavg.sh" scripts, or by
their VMS counterparts. The file is assumed to
exist and contains information as described in
the BNDRY write-up (option 3). It will be updated
periodically during the run.
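The resolution schedule implied by DINIT, DFIN and NC_INC can be sketched in Python. This is an approximation of the behaviour described above, not the SLOEXT source. It assumes an orthogonal cell, so that the shortest reciprocal axis has length 1/max(a, b, c), and it assumes that NC_INC cycles are also run at the starting resolution. The function name is hypothetical.

```python
def extension_schedule(cell, d_init, d_fin, nc_inc):
    """Return a list of (d spacing cutoff, cycles) steps."""
    if d_fin > d_init:
        raise ValueError("DFIN must be less than or equal to DINIT")
    if not 2 <= nc_inc <= 25:
        raise ValueError("NC_INC must be between 2 and 25")
    step = 1.0 / max(cell)  # one reciprocal lattice point (orthogonal cell)
    s, s_fin = 1.0 / d_init, 1.0 / d_fin
    schedule = [(d_init, nc_inc)]
    while s < s_fin - 1e-12:
        s = min(s + step, s_fin)  # do not overshoot the final cutoff
        schedule.append((1.0 / s, nc_inc))
    return schedule
```

For example, with a longest cell edge of 100 angstroms, extending from 4.0 to 3.0 angstroms takes nine increments under these assumptions. When DINIT equals DFIN the schedule contains a single step of NC_INC cycles.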
3.00 EXAMPLES
This section contains samples of input for various programs and
procedures. In general, template files containing these examples
are also provided along with the programs on the distribution media.
In some cases (for example, solvent levelling), some practical
considerations are also discussed.
3.01 ***** SAMPLE INPUT PARAMETER FILE *****
LOGFILE=seb.log
LATTICE=P
45.33 68.33 79.62 90. 90. 90.
4
X,Y,Z
1/2-X,-Y,1/2+Z
1/2+X,1/2-Y,-Z
-X,1/2+Y,1/2-Z
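The symmetry operator lines in the sample above can be turned into usable operators with a short Python sketch. It is based only on the notation visible in this sample (signed X, Y, Z terms and fractional shifts such as 1/2). The function names are hypothetical and the sketch is not the parser used by PHASES.

```python
import re
from fractions import Fraction

def parse_symop(text):
    """Parse e.g. '1/2-X,-Y,1/2+Z' into [(coefficients, shift), ...]."""
    rows = []
    for component in text.upper().replace(" ", "").split(","):
        coeffs, shift = [0, 0, 0], Fraction(0)
        for sign, term in re.findall(r"([+-]?)(\d+/\d+|\d+|[XYZ])", component):
            factor = -1 if sign == "-" else 1
            if term in ("X", "Y", "Z"):
                coeffs["XYZ".index(term)] += factor
            else:
                shift += factor * Fraction(term)
        rows.append((tuple(coeffs), shift))
    return rows

def apply_symop(rows, xyz):
    """Apply a parsed operator to fractional coordinates (result in 0..1)."""
    return tuple((sum(c * v for c, v in zip(coeffs, xyz)) + shift) % 1
                 for coeffs, shift in rows)
```

Applying all four operators of the sample file to a heavy atom site generates its symmetry mates within the cell.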
3.02 ***** SAMPLE INPUT DECKS FOR PHASIT *****
EXAMPLE I
The deck below will compute SIR phases from a single isomorphous
replacement derivative data set. The resulting phase file can
then be used in the procedure DOALL to carry out Wang's ISIR
process, or in MRGDF or MRGBDF to solve new derivatives or look
for additional sites. The "difference coefficients" file can
be used to compute "observed" and "calculated" difference
Pattersons, heavy atom difference maps or heavy atom "double
difference" maps to find new sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
1 0 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
EXAMPLE II
The deck below can be used to compute SIRAS phases from isomorphous and
anomalous scattering data from a single derivative. The resulting file
can then be used directly for map computation; used in the procedure
DOALL to carry out solvent flattening/negative density truncation,
phase extension etc, starting with (and tying to) the SIRAS phases; or
in MRGDF or MRGBDF to solve new derivatives or look for additional
sites. The "difference coefficients" files can be used to compute
"observed" and "calculated" difference Pattersons, heavy atom
difference maps or heavy atom "double difference" maps to find new
sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 0 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
DIAMINO DICHLORO PT (DERIVATIVE ANOMALOUS DISPERSION DATA )
monoptano.scl
pt_ano_diff.31
4. 6. 2 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
EXAMPLE III
The deck below assumes isomorphous replacement data are available for
two derivatives. Five passes of phase refinement, each consisting
of 3 cycles for each derivative, will be done to refine nearly all
possible derivative parameters (except B's), i.e. MIR phases will be
computed and refined. The resulting file can then be used directly for
map computation; used in the procedure DOALL to carry out solvent
flattening/negative density truncation, phase extension etc, starting
with (and tying to) the MIR phases; or in MRGDF or MRGBDF to solve new
derivatives or look for additional sites. The "difference
coefficients" file can be used to compute "observed" and "calculated"
difference Pattersons, heavy atom difference maps or heavy atom
"double difference" maps to find new sites.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 1 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
PT3 0.5474 0.0523 0.6964 50. 0.450 6
HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
monohg.scl
hg_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
2
HG1 0.3639 0.2218 0.1776 20. 1.000 7
HG2 0.4454 0.0939 0.2878 20. 0.800 7
5 0.2 6 2 1 0 1
1 SET 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 0
1 1 1 1 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 0
1 1 1 1 0
1 1 1
2 SET 2
0 0 0 0 0
0 0 0 0 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
EXAMPLE IV
Similar to example III, except that one of the temperature factors is
converted to anisotropic and is also refined, with the isotropic
equivalent restrained to its original value.
The following data is assumed to be in a file called phasit.d
seb.pam
0 0
2 1 1
phasit.31
DIAMINO DICHLORO PT ( ISOMORPHOUS REPLACEMENT DATA )
monopt.scl
pt_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
3
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 -20. 0.664 6
0. 0. 0. 0. 0. 0. 20.
0.5
PT3 0.5474 0.0523 0.6964 50. 0.450 6
HGCL2 ( ISOMORPHOUS REPLACEMENT DATA )
monohg.scl
hg_iso_diff.31
4. 6. 0 1. 0. 0. 0. 0. 0. 0.
2
HG1 0.3639 0.2218 0.1776 20. 1.000 7
HG2 0.4454 0.0939 0.2878 20. 0.800 7
5 0.2 6 2 1 0 1
1 SET 1
0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 1 1 1 1 1 1
1 1 1 1 0
1 1 1
1 SET 1
1 1 1 0 0
1 1 1 1 1 1 1 1 1 1
1 1 1 1 0
1 1 1
2 SET 2
0 0 0 0 0
0 0 0 0 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
2 SET 2
1 1 1 0 0
1 1 1 1 0
1 1 1
For all examples, PHASIT can be run with the following control
information.
For UNIX, use the following in a shell script
phasit < phasit.d > phasit.l
For VMS, use the following in a .COM file
$ASSIGN PHASIT.D FOR005
$ASSIGN PHASIT.L FOR006
PHASIT
$DEASSIGN FOR005
$DEASSIGN FOR006
3.03 ***** SAMPLE INPUTS FOR PHASING BY SOLVENT LEVELING *****
A complete solvent flattening run can be executed by creating a few
small data files, and running the procedure DOALL. This will carry out
the complete sequence of protein-solvent boundary determination,
solvent flattening, and phase combination steps, in a manner
equivalent to that suggested by Wang in his ISIR process, although the
initial phases can be SIR, SAS, MIR, MIRAS or any combination
generated by PHASIT. It will generate an initial solvent mask, use it
for 4 cycles of solvent flattening/phase combination, create a new
mask, use it for 4 cycles, create a third mask, use it for 8 cycles,
and, if desired, do additional phase extension cycles, and then
possibly phase AND AMPLITUDE extension cycles.
A series of files, all given .d extensions (UNIX) or .dat
extensions (VMS) should be created containing control information for
the forward and inverse Fourier transforms, for each option of the
BNDRY program and for RMHEAVY. In general, these are the only files
which will have to be changed for a new application, PROVIDED THE FILE
NAME CONVENTION IN THE CONTROL FILES IS ADHERED TO. The output from
PHASIT should be called phasit.31 and if phase extension is to be done,
then the output from MISSNG should be called extrfl.d and the file
sloext.d should also be prepared. If phase extension is not desired,
then one does not have to run MISSNG and create sloext.d, but
the line invoking the "extnd" procedure (@EXTND.COM in DOALL.COM for
VMS systems or sh extnd.sh in doall.sh for UNIX systems) should be
commented out. The individual program writeups should be consulted
for the meaning of the parameters. It is important that the grid
spacing selected in the input to FSFOUR be appropriate for the highest
resolution data to be used anywhere in the process, including phase
extended reflections. A grid spacing of about 1/3 of the smallest d
spacing is recommended. It is also VERY important that the index range
requested in the inputs to MAPINV cover at least a complete asymmetric
unit out to the maximum resolution to be used anywhere in the process,
including phase extended reflections. The particular asymmetric unit
covered need not be identical to that originally input implicitly to
PHASIT via the reflection files, but all reflections in the input
files should at least have symmetry related counterparts in the MAPINV
asymmetric unit. Since index limits in MAPINV are restricted to
minimum and maximum values along each reciprocal axis, in high
symmetry systems it may be necessary to cover more than an asymmetric
unit (this causes no problem).
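As a rough illustration of the two sizing rules above, the following
Python sketch estimates the smallest number of grid points per cell
edge that gives a spacing of about 1/3 of the smallest d spacing, and
a conservative index limit along each reciprocal axis. The function
names and the example cell are hypothetical and are not part of the
PHASES package. FSFOUR and MAPINV may place further restrictions on
the values they accept, so the program writeups remain the authority.

```python
import math


def grid_periods(cell_edges, d_min, points_per_d=3):
    """Smallest number of grid points per axis with spacing <= d_min / points_per_d."""
    return tuple(math.ceil(points_per_d * edge / d_min) for edge in cell_edges)


def index_limits(cell_edges, d_min):
    """Upper bound on |h|, |k|, |l| for reflections with d >= d_min.

    Since h = a . s and |s| <= 1/d_min, |h| can never exceed a/d_min
    (likewise for k and l), whatever the cell angles are.
    """
    return tuple(math.floor(edge / d_min) for edge in cell_edges)


if __name__ == "__main__":
    cell = (50.0, 75.0, 82.0)  # hypothetical cell edges in angstroms
    d_min = 3.5                # highest resolution used anywhere in the run
    print("grid periods:", grid_periods(cell, d_min))
    print("index limits:", index_limits(cell, d_min))
```

The limits returned by index_limits are upper bounds only. A range
that covers them along each axis may include more than an asymmetric
unit, which the text above notes is harmless.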
Note also that MAPINV can compute structure factors only in the
hemisphere with L non-negative, thus one MUST request an asymmetric
unit in this hemisphere. This also creates no problem SINCE ANY
REFLECTION CAN ALWAYS BE RELATED TO ONE IN THIS HEMISPHERE BY
application of the Friedel symmetry operator, and this is
automatically done in the programs. THUS WHEN IN DOUBT, ONE CAN
ALWAYS SPECIFY A FULL HEMISPHERE, I.E. A RANGE OF -HMAX,HMAX,
-KMAX,KMAX AND 0,LMAX WHICH WILL WORK, but may not be the most
efficient way of doing things. For this reason one will NEVER have
to reindex the input data, as an appropriate range in MAPINV can
ALWAYS BE GIVEN!
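The Friedel relation used above can be sketched in a few lines of
Python. This is only an illustration of the idea, not the code used
inside the programs: the function names are hypothetical, phases are
taken to be in degrees, anomalous scattering is neglected (so that the
Friedel mate has the same amplitude and the negated phase), and the
order of the six limits is chosen for readability rather than taken
from the MAPINV input format.

```python
def to_l_nonnegative(h, k, l, phase_deg):
    """Return the Friedel-equivalent reflection with l >= 0.

    The Friedel mate (-h, -k, -l) has the same amplitude and the
    negated phase, provided anomalous scattering is neglected.
    """
    if l < 0:
        return (-h, -k, -l, (-phase_deg) % 360.0)
    return (h, k, l, phase_deg % 360.0)


def full_hemisphere_range(hmax, kmax, lmax):
    """Limits (hmin, hmax, kmin, kmax, lmin, lmax) covering the l >= 0 hemisphere."""
    return (-hmax, hmax, -kmax, kmax, 0, lmax)


def in_range(h, k, l, limits):
    """True if (h, k, l) lies inside the given index limits."""
    hmin, hmax, kmin, kmax, lmin, lmax = limits
    return hmin <= h <= hmax and kmin <= k <= kmax and lmin <= l <= lmax
```

Any reflection within the resolution limit, once passed through
to_l_nonnegative, falls inside the range returned by
full_hemisphere_range, which is why the full hemisphere is always a
safe (if not the most economical) choice.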
Example inputs are now given. If the supplied doall and related
scripts are to be used without modification, then the filenames in
these samples should NOT be changed (except for the parameter file, of
course). One need change only the parameter file, solvent content and
resolution related parameters, the map periods and index range, and
the heavy atom coordinate file.
---- file fft.d (input to FSFOUR, for map calculation)----------------
seb.pam
COMPUTE ELECTRON DENSITY MAP
0 48 72 80 1 0 20 0 0 0 0.
four.ref
four.map
--- file minv1.d (input to MAPINV, for solvent boundary determination)
seb.pam
INVERT ELECTRON DENSITY MAP AFTER TRUNCATING NEGATIVES
four.map
minv.ref
0 0 0 16 0 24 27
0. 0. 1 0
--- file minv2.d (input to MAPINV, for normal map inversion) -----
seb.pam
INVERT ELECTRON DENSITY MAP AFTER SOLVENT FLATTENING
mod.map
minv.ref
0 0 0 16 0 24 27
0. 0. 0 0
--- file rmhv.d (input to RMHEAVY, for removal of heavy atoms ) ----
seb.pam
four.map
nohv.map
2 2.5
PT1 0.2539 0.1918 0.1376 35. 1.000 6
PT2 0.1754 0.0439 0.4578 20. 0.664 6
--- file bnd0.d (input to BNDRY, option 0, prepare SF for protein-
solvent boundary determination )------------------
seb.pam
0
9.
minv.ref
four.ref
--- file bnd1.d (input to BNDRY, option 1, create solvent mask)----
seb.pam
1
four.map
mask.map
.4
--- file bnd2.d (input to BNDRY, option 2, do solvent flattening and
negative density truncation) ----------------
seb.pam
2
four.map
mask.map
mod.map
.086
--- file bnd3.d (input to BNDRY, option 3, combine new phases with
original) --------
seb.pam
3
0 0. 1. 0 0
phasit.31
minv.ref
newphi.ref
--- file extnd.d (input to BNDRY, combine new phases with original,
including phase extension ) ------------
seb.pam
3
1 3.5 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
--- file extnda.d (input to BNDRY, combine new phases with original,
including phase AND AMPLITUDE extension) --------
seb.pam
3
2 3.5 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
--- file sloext.d (controls range and rate of phase extension) --
seb.pam
4. 3.5 8
extnd.d
--- file sloext2.d (controls range and rate of phase AND AMPLITUDE
extension) --------
seb.pam
4. 3.5 8
extnda.d
Once the input is prepared, the phasing process can be carried out
either by running a series of command procedures as individual steps,
or by running a single command procedure which invokes all others.
The single procedure, called doall.sh or doall.com follows. In the
procedures that follow, it is assumed that phase extension will be
carried out, and that the additional files "extrfl.d" (prepared by
MISSNG) and "sloext.d" (see SLOEXT write-up) are available.
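Because the supplied procedures depend on the file name convention, it
can be convenient to confirm that every control file is in place
before starting a long run. The Python sketch below is one possible
check and is not part of the PHASES distribution. The file lists are
taken from the samples above using the UNIX ".d" names (substitute the
VMS names as appropriate), the function name is hypothetical, and the
parameter file is left out because its name is chosen by the user.

```python
import os

# File names follow the samples above; the parameter file is omitted
# because its name is user-chosen.
CORE_FILES = ["phasit.31", "fft.d", "minv1.d", "minv2.d", "rmhv.d",
              "bnd0.d", "bnd1.d", "bnd2.d", "bnd3.d"]
PHASE_EXTENSION_FILES = ["extrfl.d", "extnd.d", "sloext.d"]
AMPLITUDE_EXTENSION_FILES = ["extrfl.d", "extnda.d", "sloext2.d"]


def missing_inputs(directory=".", phase_extension=True,
                   amplitude_extension=False, exists=os.path.isfile):
    """Return the expected input files that are not present in directory."""
    wanted = list(CORE_FILES)
    if phase_extension:
        wanted += PHASE_EXTENSION_FILES
    if amplitude_extension:
        # extrfl.d is shared by both extension modes; avoid listing it twice.
        wanted += [n for n in AMPLITUDE_EXTENSION_FILES if n not in wanted]
    return [n for n in wanted if not exists(os.path.join(directory, n))]


if __name__ == "__main__":
    for name in missing_inputs():
        print("missing:", name)
```

The exists argument is there only so the check can be exercised
without touching the disk; in normal use the default is sufficient.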
3.04 For UNIX, use the following commands in a shell script,
called doall.sh
# COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR PHASING
# DATA BY SOLVENT LEVELLING
#
# COMPUTE THE FIRST SOLVENT MASK
sh mask1.sh
#
# COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
sh cycle4.sh
#
# COMPUTE THE SECOND SOLVENT MASK
sh mask2.sh
#
# COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
sh cycle8.sh
#
# COMPUTE THE THIRD SOLVENT MASK
sh mask3.sh
#
# COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
sh cycle16.sh
#
# DO ADDITIONAL CYCLES OF SLOW PHASE EXTENSION (TO REFLECTIONS WITH
# NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
# INITIAL RESOLUTION
sh extnd.sh
#
# IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING
# DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
# NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT. TO INVOKE IT, SIMPLY
# REMOVE THE # FROM THE FOLLOWING LINE
#sh extnda.sh
#
# THATS ALL
For VMS, use the following commands in a command procedure, called
DOALL.COM
$SET NOVERIFY
$! COMMAND PROCEDURE TO CARRY OUT THE ENTIRE CYCLING PROCESS FOR
$! PHASING DATA BY SOLVENT LEVELLING
$!
$! COMPUTE THE FIRST SOLVENT MASK
@MASK1.COM
$!
$! COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING FIRST MASK)
@CYCLE4.COM
$!
$! COMPUTE THE SECOND SOLVENT MASK
@MASK2.COM
$!
$! COMPUTE 4 CYCLES OF SOLVENT LEVELLING (USING SECOND MASK)
@CYCLE8.COM
$!
$! COMPUTE THE THIRD SOLVENT MASK
@MASK3.COM
$!
$! COMPUTE 8 CYCLES OF SOLVENT LEVELLING (USING THE THIRD MASK)
@CYCLE16.COM
$!
$! DO ADDITIONAL CYCLES OF PHASE EXTENSION (TO REFLECTIONS WITH
$! NATIVE AMPLITUDES SUPPLIED), EITHER TO HIGHER RESOLUTION OR TO
$! INITIAL RESOLUTION
@EXTND.COM
$!
$! IF DESIRED, DO ADDITIONAL CYCLES OF PHASE EXTENSION (INCLUDING
$! DATA FOR WHICH THERE IS NO AMPLITUDE INFORMATION). THIS OPTION IS
$! NOT ALWAYS DESIRABLE, THUS IT IS COMMENTED OUT. TO INVOKE IT,
$! SIMPLY REMOVE THE ! FROM THE FOLLOWING LINE
$! @EXTNDA.COM
$!
$! THATS ALL
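As listed, doall.sh moves on to the next step even if an earlier step
fails. Where that is not wanted, a small driver can stop at the first
failure. The Python sketch below is one way to do this on UNIX; it is
not part of the PHASES distribution, the function names are
hypothetical, and it assumes that each step script returns a non-zero
exit status when something goes wrong, which should be confirmed for
the scripts actually in use.

```python
import subprocess

# Same order as doall.sh; add "extnda.sh" at the end if phase AND
# AMPLITUDE extension is wanted.
DOALL_STEPS = ["mask1.sh", "cycle4.sh", "mask2.sh", "cycle8.sh",
               "mask3.sh", "cycle16.sh", "extnd.sh"]


def run_script(script):
    """Run one step with sh and return its exit status."""
    return subprocess.run(["sh", script]).returncode


def run_doall(steps=DOALL_STEPS, runner=run_script):
    """Run the steps in order, stopping at the first non-zero exit status.

    Returns the list of steps that completed successfully.
    """
    done = []
    for step in steps:
        if runner(step) != 0:
            break
        done.append(step)
    return done


if __name__ == "__main__":
    finished = run_doall()
    print("completed steps:", ", ".join(finished) or "none")
```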
3.05 EXPECTED OUTPUT FILES
Execution of the "doall" procedure will result in the following
files being present. (phasit.31 and phasit.l should be present prior
to running "doall.")
phasit.31 contains original MIR, SIR etc phases from PHASIT
phasit.l contains phasit printed output
mask1.14 contains first solvent mask
mask1.l contains mask1 printed output
phi4cy.31 contains phases after 4 cycles using first mask
cycle4.l contains printed output from first 4 cycles
mask2.14 contains second solvent mask
mask2.l contains mask2 printed output
phi8cy.31 contains phases after 4 cycles using second mask
cycle8.l contains printed output from next 4 cycles
mask3.14 contains third solvent mask
mask3.l contains mask3 printed output
phi16cy.31 contains phases after 8 cycles using third mask
cycle16.l contains printed output from next 8 cycles
phiextnd.31 (if generated) contains phases after 8 cycles using
third mask, plus additional cycles of phase extension
to known amplitudes.
extnd.l (if generated) contains printed output from next 12
cycles
phiextnda.31 (if generated), contains phases after 8 cycles using
third mask, plus additional cycles of phase extension
to known amplitudes, plus additional cycles of phase
and amplitude extension.
extnda.l (if generated), contains printed output from next 12
cycles.
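The numbers in the file names above are cumulative cycle counts, not
the number of cycles run with each mask (4, then 4 more, then 8 more).
The short Python sketch below makes the bookkeeping explicit. It is an
illustration only: the function name is hypothetical and the log file
names are assumed to follow the cycle4.l and cycle16.l pattern.

```python
from itertools import accumulate

# (mask name, cycles run with that mask), as described for "doall".
SCHEDULE = [("mask1", 4), ("mask2", 4), ("mask3", 8)]


def expected_outputs(schedule=SCHEDULE):
    """Return (mask file, phase file, log file) for each stage.

    Phase and log files are numbered by the running total of cycles.
    """
    totals = accumulate(cycles for _, cycles in schedule)
    return [(mask + ".14", "phi%dcy.31" % total, "cycle%d.l" % total)
            for (mask, _), total in zip(schedule, totals)]


if __name__ == "__main__":
    for row in expected_outputs():
        print("%-10s %-12s %s" % row)
```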
4.00 NATIVE, DIFFERENCE AND "CALCULATED" PATTERSON MAPS
In protein crystallography one is generally interested in difference
Patterson maps to locate heavy atoms, in which the Fourier coefficients
are the squares of the DIFFERENCE in AMPLITUDES between native and
derivative data, or between members of a Bijvoet pair. Sometimes,
however, it is useful to compute native Patterson maps, or to compute
"calculated" Patterson maps (generated from intensities computed
explicitly from an input atomic model). The native maps may provide
information about non-crystallographic symmetry, while the "calculated"
maps obtained from a tentative heavy atom structure can be compared
with the observed difference Pattersons to see how well the major
features are being explained. The latter method is particularly
useful in high symmetry systems, where even a small number of heavy
atom sites gives rise to many Patterson peaks. Examining the observed
and calculated Pattersons side by side (perhaps in VIEWPLT) can then
provide confidence in the heavy atom interpretation.
DIFFERENCE PATTERSONS - Difference Pattersons (either isomorphous or
anomalous) can be computed by two different routes in PHASES. The
first approach is to generate a standard "phased" file containing
h,k,l,Fo,Fc,Phi, and use it in FSFOUR with the MAPTYP=5 option.
Generally programs CMBISO or CMBANO do the initial data preparation,
and their output files are then fed to TOPDEL to select the data
according to various criteria, screen for and reject outliers,
and write the appropriate information to the output file for FSFOUR.
The output file will then contain either FPH and FP, or F+ and F- in
the amplitude slots, depending on whether isomorphous or anomalous
data were input. The second approach, which is useful only after
at least one site is found in the derivative, is to use the
"difference coefficient" file output from PHASIT in FSFOUR with
MAPTYP=6. In the isomorphous case if the input site(s) are correct
this should lead to a cleaner map, since the FPH to FP scale factor
has been refined, and also because the angular difference between
the FP and FPH vectors is compensated for. The FO and FC slots in
the file then contain (FPH-FP)obs,corrected and FHcal, respectively.
For anomalous data these slots contain (FPH+ - FPH-)obs and
(FPH+ - FPH-)calc or their counterparts for native anomalous data.
NATIVE PATTERSONS - Native Patterson maps can be generated in several
ways, depending on what information is currently available. In all
cases one must prepare a standard input "phased" file containing
h,k,l,Fo,Fc,Phi, and request the appropriate option in FSFOUR to create
the desired coefficients from the input data. One way to do this is
to run CMBISO inputting the native file twice (as both the native and
derivative data sets), and then run TOPDEL selecting ALL coefficients
to be output (you can still use d and F/sigma cutoffs, but output
100% of the data!). The R factor and all differences will, of course,
be zero, but the output file will contain native amplitudes in both
the Fo and Fc slots, and thus the native Patterson can be generated
by requesting MAPTYP=6 in FSFOUR. Another approach would be to run
PHASIT, SF mode with IHLCF=0 and ISIGA=0, using a single "dummy" atom
arbitrarily positioned as the model. The output file will then give a
bad R factor, but it will contain Fo and Fc in the amplitude slots,
and selecting MAPTYP=6 in FSFOUR will again give the desired native
Patterson. The first method allows one to use d spacing and F/sigma
cutoffs, while the second always uses all of the data. In either case
the native Pattersons can be searched for peaks, contoured, displayed,
printed etc. with PSRCH, MAPVIEW, CTOUR, MKPOST etc.
"CALCULATED PATTERSONS" - Patterson maps corresponding to an input
atomic model are also generated by preparing the normal "phased" file
containing h,k,l,Fo,Fc,Phi, and by selecting the appropriate
coefficient option (MAPTYP=7) in FSFOUR. In this case it is important
that the second amplitude slot truly contains Fc. One way to do this
is to run PHASIT, SF mode with IHLCF=0 and ISIGA=0, and to include
all of the desired atoms in the model. If a heavy atom model is
used as the input, the R factor will be meaningless (since scaling
is to the NATIVE amplitudes rather than differences), but the output
file would still be appropriate for the "calculated" difference
Patterson as the map scale is arbitrary anyway. Another way is to use
GREF to prepare the file, by requesting that an output Fourier file
be written. In that case the file created can contain the proper
DIFFERENCE amplitude in the Fo slot, and the model based Fc in the
Fc slot. Then the SAME file could be used in FSFOUR to create both
the observed difference Patterson (MAPTYP=6) and the modeled version
of it based on the heavy atoms (MAPTYP=7). Once again, the FSFOUR map
can be searched for peaks, contoured etc. as any normal map. Finally,
the "difference coefficients" file written by PHASIT can be used in
FSFOUR with MAPTYP=7 to compute the "calculated" difference Patterson
based on the input heavy atom model. The advantage of doing it this
way is that the model, and hence the FC's, then would reflect all
refined scaling parameters (possibly including anisotropic B's), and
also models based solely on anomalous scatterers could be used.
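The way the three Patterson options are used in this section suggests
the following relation between the two amplitude slots and the
Patterson coefficient. The Python sketch below records that reading.
It is inferred from the descriptions above and is not taken from the
FSFOUR writeup, which should be consulted for the definitive meaning
of each MAPTYP value; the function name is hypothetical.

```python
def patterson_coefficient(fo, fc, maptyp):
    """Patterson coefficient for one reflection, as inferred from the text.

    maptyp 5: squared difference of the two amplitude slots
              (e.g. FPH and FP, or F+ and F-)
    maptyp 6: square of the first slot (Fo, or an observed difference)
    maptyp 7: square of the second slot (Fc from the model)
    """
    if maptyp == 5:
        return (fo - fc) ** 2
    if maptyp == 6:
        return fo ** 2
    if maptyp == 7:
        return fc ** 2
    raise ValueError("not a Patterson option covered by this sketch: %r" % maptyp)
```

With this reading, a file holding FPH and FP used with MAPTYP=5 and a
file holding the difference (FPH-FP) in the Fo slot used with MAPTYP=6
give the same coefficient, which is consistent with the two routes to
a difference Patterson described above.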
5.00 REFINING HEAVY ATOM PARAMETERS
There are two general ways to refine heavy atom parameters within
the PHASES package: refinement against isomorphous or anomalous
amplitude differences; or "phase refinement", i.e. by minimizing lack
of closure. Isomorphous/anomalous difference refinement is carried
out with the program GREF, and has the advantage that only data from
the crystal being refined is used (and the native, in the isomorphous
case). It is therefore independent of all other derivatives, and is
particularly useful in the case of common sites between multiple
derivatives since there can be no "cross talk" or bias. Also, if this
refinement is carried out against centric data only, there are few
assumptions made about the protein phase and the refinement is
usually very reliable. It is nearly always used for the first
derivative as no reliable protein phase estimates are available at
that time, but it's not a bad idea to do this initially for each
derivative. The disadvantage is that refinement of all parameters with
centric data may not be possible in some space groups. For example,
in P2 the only centric data available are of the type h0l, thus one
can not refine ANY y coordinates. In GREF one can have the program
automatically include the 25% strongest differences for acentric data
along with the centric data to enable refinement of SOME y's, but then
one is introducing assumptions about the protein phase which are only
approximately valid, thus weakening the refinement. Also, in a space
group like P2 the origin is not fixed in the y direction, so even the
RELATIVE y coordinates BETWEEN DERIVATIVES can not be refined, even
when acentric data ARE included. For refinement in GREF one would
generally start by assigning the major site an occupancy of 1.0, other
sites appropriate occupancies and all heavy atoms B values of 15 or
20. Then do a single cycle refining only the scale factor (to which you
can initially assign any positive value, usually 0.1). After an
estimate of the scale factor is obtained, one can refine coordinates
as appropriate, along with the scale factor. One can then refine
coordinates, scale factor and occupancies simultaneously, but the
occupancy of the MAJOR SITE should ALWAYS be held fixed at 1.0. Also,
when polar axes are present even with sufficient data available for
refinement (e.g. acentrics included) the coordinates of the MAJOR SITE
along POLAR DIRECTIONS should still NOT be refined, as they are needed
to fix the origin in the polar directions. Finally, if the resolution
is sufficiently high one can then include B values in the refinement,
but if there are indications of instability the B's can usually be
held at their initial values without introducing much error. Attempts
to simultaneously refine parameters which are not independent (i.e.
coordinates for ALL atoms along polar directions, both scale factor
AND occupancy when only one atom is input, ALL coordinates of an atom
on a special position in the space group) or to refine parameters when
there is no data determining that parameter (coordinates of ANY atom
along a polar direction when using ONLY centric data) will result in
a singular matrix being obtained, and an aborted refinement.
Parameters obtained from refinement against amplitude differences are
generally well suited to initiate subsequent phasing calculations or
further "phase refinement" in program PHASIT.
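The rules above about which parameters may be refined together can be
checked before a refinement is set up. The Python sketch below is one
way to express them. It is a planning aid only and does not read or
write GREF input: the function name, the parameter labels ("x", "y",
"z", "occ", "b") and the data layout are all hypothetical.

```python
def check_gref_flags(atoms, major, polar_axes=(), refine_scale=True,
                     centric_only=False):
    """Return a list of conflicts with the refinement rules in the text.

    atoms maps a site name to the set of parameters refined for it,
    drawn from {"x", "y", "z", "occ", "b"}; major names the major site;
    polar_axes lists the coordinates lying along polar directions.
    """
    problems = []
    if "occ" in atoms[major]:
        problems.append("%s: major site occupancy should stay fixed at 1.0" % major)
    for name, refined in atoms.items():
        for axis in polar_axes:
            if axis not in refined:
                continue
            if name == major:
                problems.append("%s: %s must stay fixed to define the origin" % (name, axis))
            elif centric_only:
                problems.append("%s: %s is not determined by centric data alone" % (name, axis))
    if len(atoms) == 1 and refine_scale and "occ" in atoms[major]:
        problems.append("scale factor and occupancy are not independent for one site")
    return problems


if __name__ == "__main__":
    # Hypothetical P2-like case: y is the polar direction.
    sites = {"PT1": {"x", "y", "z", "occ"}, "PT2": {"x", "y", "z", "occ"}}
    for line in check_gref_flags(sites, "PT1", polar_axes=("y",)):
        print(line)
```

An empty list means only that none of the conflicts named in the text
was found. It does not guarantee a well conditioned refinement, for
example when an atom lies on a special position.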
"Phase refinement" is carried out with the program PHASIT, and is
ideally suited for refinement with multiple derivatives/data sets
although it can also be used with a single derivative. In PHASIT
either conventional refinement, or "maximum likelihood" options may
be selected, and if only one derivative is used the program will
automatically switch to maximum likelihood mode. Phase refinement
requires an estimate of the protein phase, which is why it's better
suited for the multiple derivative case, since SIR or SAS estimates
alone are usually very poor. The advantages of phase refinement are
that in general, all parameters may be refined including native to
derivative scaling parameters, and the corresponding weights (expected
lack of closure estimates) are also implicitly "refined". Since the
origin is fixed by the protein phase estimates, refinement is possible
for coordinates along polar directions, and the origin can thus be
properly established between derivatives. Phase refinement is, however,
sensitive to the hand of the heavy atom sets, and it is assumed that
all input sets correspond to the SAME origin and hand. A useful
procedure is to initially start with all parameters as described
above, (after correlating origins and hand between derivatives with
cross difference Fouriers) and then refine only the FH scale factor
for one cycle. Then refine the FH scale factor along with coordinates,
then along with coordinates and occupancies (again always holding the
MAJOR site occupancy to 1.0 in each derivative). Then one can include
the FPH scale factor along with the other parameters, and finally
include the B factors. When refining anomalous scattering data sets
one would generally do the same, except that both the FPH and FH scale
factors should NOT be refined simultaneously (they can be alternated),
and for NATIVE anomalous scattering the FPH scale factor should NEVER
be refined!
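The staged procedure just described can be written down as an ordered
list of parameter sets, together with a check for the two restrictions
that apply to anomalous scattering data. The Python sketch below is a
planning aid only and does not generate PHASIT input; the labels
("fh_scale", "fph_scale", "xyz", "occ", "b") and the function name are
hypothetical.

```python
# Stages for an isomorphous data set, in the order given in the text.
# The major site occupancy is held at 1.0 throughout.
ISOMORPHOUS_STAGES = [
    {"fh_scale"},
    {"fh_scale", "xyz"},
    {"fh_scale", "xyz", "occ"},
    {"fh_scale", "fph_scale", "xyz", "occ"},
    {"fh_scale", "fph_scale", "xyz", "occ", "b"},
]


def stage_problems(params, kind="isomorphous"):
    """Return conflicts between one stage and the rules for anomalous data.

    kind is "isomorphous", "anomalous" or "native_anomalous".
    """
    problems = []
    anomalous = kind in ("anomalous", "native_anomalous")
    if anomalous and {"fh_scale", "fph_scale"} <= set(params):
        problems.append("FPH and FH scale factors should be alternated, "
                        "not refined together")
    if kind == "native_anomalous" and "fph_scale" in params:
        problems.append("FPH scale factor should never be refined "
                        "for native anomalous data")
    return problems
```

For an anomalous data set, one way to respect the first rule is to
split the fourth stage into two passes, one with each scale factor.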
A useful option is to use protein phase estimates obtained from an
external source during phase refinement rather than calculated from
the current heavy atom parameters and data. Thus one could refine
initially as described, then modify the phases by solvent flattening
and/or NC symmetry averaging, and then refine the parameters again
this time against the modified phases. The new parameters are then
used to compute phases to start another round of density modification.
This procedure has been helpful in several cases, and usually is
particularly good for refining the FPH scale factor. For conventional
phase refinement in PHASIT one would select a figure of merit cutoff
in the range 0.4 to 0.6 and use weights of 1/E**2. For maximum
likelihood mode one would select a figure of merit cutoff in the range
0.1 to 0.2 and use unit weights. Most successful protein structure
determinations have utilized phase refinement to obtain the final
MIR type phases, although refinement against differences is often
done first to obtain starting values for the parameters.
6.00 HEAVY ATOM DIFFERENCE, DOUBLE DIFFERENCE AND
CROSS DIFFERENCE FOURIER MAPS
There are several ways to compute heavy atom based difference
or cross difference type Fourier maps within the PHASES package.
1) HEAVY ATOM DIFFERENCE or DOUBLE DIFFERENCE FOURIERS.
The first approach is to refine heavy atom parameters against
isomorphous or anomalous difference AMPLITUDES in program GREF,
and request that a Fourier file be written. If this file is
used in FSFOUR with MAPTYP=1, then the observed difference
Fourier, i.e. that which should reveal all heavy atom sites
can be obtained. If the file is used with the MAPTYP=3 option,
then a "double difference" map is computed, i.e. the heavy
atoms included in the structure factor calculation are subtracted
out, so that the map should show only additional sites. The
limitations with this approach are that the "observed" amplitudes
ABS(FPH-FP) or ABS(F+ - F-) are approximations, since vector
differences rather than amplitude differences should be used,
and that the heavy atom model may be crude since the FPH to FP
scale factor has not been refined, and anisotropic thermal
parameters for the heavy atoms can not be used. Also, if used
with anomalous data the absolute configuration can not be
obtained since absolute values of delta F were used.
The second approach is to compute phases and/or refine heavy
atom parameters in program PHASIT, and use the "difference
coefficients" files it produces in FSFOUR with the MAPTYP=1
or MAPTYP=3 options. If MAPTYP=1 the "observed" difference
Fourier showing all heavy atoms will be obtained, however the
results should be improvements over those obtained with the
previous method. This results from the fact that the FPH to
FP scaling parameters can be refined, the heavy atom thermal
factors may be refined anisotropically, and phase difference
information is used to correct the "observed" amplitudes to
account for the fact that the two vectors are not collinear.
In this case for isomorphous data sets the corrected
"observed" differences, calculated heavy atom amplitudes and
calculated heavy atom phases are used to compute the map.
For anomalous data the "observed" and "calculated" Bijvoet
differences are used along with the protein phases shifted
by 90 degrees, to give true "Bijvoet difference" or "Bijvoet
double difference" maps, so that the absolute configuration
is preserved. Again, MAPTYP=1 should show all anomalous
scatterers while MAPTYP=3 should have those included in the
model subtracted out. These methods use phase information
computed only from the heavy atoms or anomalous scatterers,
although in the anomalous case all such information is
combined to estimate protein phases.
A third approach is to combine observed AMPLITUDE
differences [ i.e. (FPH-FP) or (F+ - F-) ] directly with
estimates of the protein phases to compute difference
or Bijvoet difference Fouriers. One would then generate
protein phases either by MIR, SIR, BNDRY, or from a model in
PHASIT, and combine the phases with observed amplitude
differences in programs MRGDF or MRGBDF for isomorphous or
anomalous data, respectively. The maps would then be computed
in FSFOUR using the MAPTYP=3 or MAPTYP=8 options for
difference or Bijvoet difference maps, respectively. The
advantage of this approach is that the protein phases
themselves may be better, since one can use solvent flattened
and/or NC symmetry averaged phases in the synthesis. For
the isomorphous case the output coefficients file would then
contain indices, FPH, FP, PHI_pro, and for the anomalous
case the file would contain indices, F+, F-, PHI_pro. A
disadvantage is that one can not "subtract out" the heavy
atoms used in the phasing, so that they will also appear
in the maps possibly making it more difficult to detect
minor sites.
2) CROSS DIFFERENCE FOURIERS.
This is accomplished similarly to the third option above,
except that in MRGDF or MRGBDF a data file corresponding to
a new derivative, i.e. one which was never used in phasing,
is merged with an existing protein phase file. The cross
difference Fourier (or cross Bijvoet difference Fourier)
is then obtained in FSFOUR with MAPTYP=3 or MAPTYP=8,
respectively. These maps should show all heavy atom or
anomalous scatterer sites in the new derivative, which can
then be checked against the appropriate difference
Patterson. The advantage of doing this, in addition to
helping solve the new derivative, is to assure that heavy
atom sites in the new derivative correspond to the same
origin and hand as those used in the original phasing.
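The routes described in this section can be summarized as a small
lookup from the kind of coefficient file and the map wanted to the
MAPTYP value quoted in the text. The Python sketch below is only a
restatement of those statements; the labels and the function name are
hypothetical, and the FSFOUR writeup remains the authority on MAPTYP.

```python
# (coefficient file, map wanted) -> MAPTYP value quoted in this section.
FOURIER_ROUTES = {
    ("GREF Fourier file", "difference"): 1,
    ("GREF Fourier file", "double difference"): 3,
    ("PHASIT difference coefficients", "difference"): 1,
    ("PHASIT difference coefficients", "double difference"): 3,
    ("MRGDF output", "difference"): 3,
    ("MRGBDF output", "Bijvoet difference"): 8,
}


def maptyp_for(source, map_kind):
    """Look up the MAPTYP value for a coefficient file and map type."""
    try:
        return FOURIER_ROUTES[(source, map_kind)]
    except KeyError:
        raise ValueError("combination not described in this section: "
                         "%s / %s" % (source, map_kind))
```

Cross difference and cross Bijvoet difference Fouriers use the MRGDF
and MRGBDF entries, with the data file taken from the new derivative.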
7.00 CREATING/EDITING SOLVENT MASKS
In most cases adequate solvent masks are prepared as part of the
"doall" procedure, which carries out a reciprocal space equivalent
of the automated protein-solvent boundary determination method
described by Wang with the added modification that density in the
immediate vicinity of heavy atoms is ignored during mask construction.
Solvent masks can, however, also be created by hand, from coordinates
for an input model, or by starting with any of these masks and editing
them. Solvent masks MUST have a one-to-one correspondence with FSFOUR
maps, and thus they also MUST cover one full cell on the same grid
used for the map, and be oriented as xz sections. They also must have
the structure as described in the "file formats" section. This happens
automatically if the masks are constructed by the "doall" procedure,
but care must be taken to ensure these features if the masks are
created by other means. Pre-existing solvent masks can be examined
and/or edited in MAPVIEW, or MAPVIEW can be used to create the masks
"from scratch" by hand tracing boundaries in contoured maps. Several
options are now described.
*** Examining/editing "normal" (i.e. full cell) solvent masks ***
These masks (named mask1.14, mask2.14 and mask3.14, if created by
the "doall" procedure) can be examined in MAPVIEW or MAPVIEW_X by
inputting any FSFOUR map with the same grid, specifying that masks
will be used, selecting 0 to 0.999 for each of the x, y and z ranges,
specifying the xz section orientation and "recovering" the
pre-existing mask file. From the menu contoured sections can then be
selected and displayed. Clicking the mouse with the cursor in the
"show mask" menu area will then display the solvent mask as blue dots
on the solvent grid points. One can then use the menu options to
scroll through the sections, displaying both contoured density and
the solvent mask. One could also use the "trace mask" menu option
as described in the MAPVIEW writeup to edit the mask with the mouse,
but at this point it is not desirable to do this as the full cell
map is displayed, and one may have to make identical edits in each
symmetry related envelope. If this is not done very carefully one
could easily destroy the space group symmetry in the mask. A better
approach, if editing is to be done, is simply to examine the map and
mask to determine the coordinate range which would carve out only one
contiguous molecule (asymmetric unit) by following the fractional
coordinates as the cursor moves across the screen (displayed in the
lower right hand corner). Note that when determining the range one can
cross into neighboring cells, although only the one-cell-translated
map region is displayed. Once an appropriate range is deduced, write
it down and exit MAPVIEW without saving any files. Then run EXTRMAP
and EXTRMSK to extract that same range from the FSFOUR map and solvent
mask, respectively, to create the corresponding "submaps". This
allows one to deal only with a contiguous asymmetric unit, and to
select regions spanning cell edges. Now run MAPVIEW again this time
inputting the non-FSFOUR map (i.e. the submap) and its corresponding mask
file. Editing can then be done on the submask. After editing all
appropriate sections, use the "MAKEASU" menu option to symmetrize the
submask, and scroll through the masks again to confirm that everything
is as desired. Once you are happy with it, exit MAPVIEW and when
prompted, request that the entire submask region be saved to a file.
At this point you have the edited mask covering an asymmetric unit.
Run BLDCEL inputting the submap, edited submask and original FSFOUR
map to expand the submask (and submap) to a full cell. You can delete
the output map file, but the output mask file now corresponds to the
edited solvent mask, expanded to a full cell obeying space group
symmetry. It can now be used for solvent flattening (or examined
again in MAPVIEW just as the original mask was to confirm the
expansion).
***** Creating solvent masks from a model *****
If atomic coordinates are available from a tentative model, these
coordinates can be used to create a solvent mask. To do this one
should first prepare a PHASES style file containing the atomic
coordinates (possibly from a PDB file via PDB_CDS), and determine the
range (in fractional coordinates) which encompasses the model atoms.
Then enlarge the range (on each end) slightly to account for the
radius to be assigned to each atom. MDLMSK can then be run to create
a mask file just encompassing the molecule. When prompted in MDLMSK,
the periods (number of grid points along each axis) should be
specified EXACTLY as in the input to FSFOUR, to ensure that the maps
to be computed later will have the same grid as the mask. The adjusted
fractional coordinate range for the model should then be specified
along with a mask number (use 1 for pure solvent masks), and a radius
of about 1.8 angstroms. In the mask the outer boundary will be
appropriate, but there will typically be many small holes in the
interior caused by use of a van der Waals sized radius. Use of a
larger radius could avoid these holes, but would artificially extend
the outer boundary. To avoid this one generally uses the smaller
radius, and then edits the masks to preserve the outer boundary but
fill in the interior holes. This can be done very quickly in MAPVIEW.
To do this run MAPVIEW inputting a FSFOUR map (any one will do, as
long as the periods are the same as those used in MDLMSK) and request
that masks be used. Then input the same coordinate range as in
MDLMSK, request the xz section orientation and "recover" the mask
file from MDLMSK. You can effectively turn off the density display
by selecting a high contour level, and scroll through the sections
editing each via the "show mask" and "trace mask" options described in
the MAPVIEW writeup. Just quickly trace around the already displayed
outer boundary to preserve it, and the interior holes will be filled
automatically when you are done with each section. When finished, use
the "Make Asu" option to symmetrize the mask region. Then exit MAPVIEW
and request that both the entire map and mask regions be written to
files. You then will have an edited mask file encompassing the model,
and the corresponding submap file. The last step is to convert the
edited mask to a full cell mask. To do this, run BLDCEL inputting the
submap, corresponding edited mask and original FSFOUR map. The output
map file can be deleted, but the output mask file will be a full cell
version of the edited, model based mask which now also obeys space
group symmetry. It can then be used for solvent flattening (for
example, replacing the mask3.14 file in the cycle16.sh, extnd.sh or
extndavg.sh procedures), and also can be examined in MAPVIEW as
described earlier.
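
The range determination described above (find the fractional limits
of the model, then enlarge each end to allow for the atom radius) can
be sketched as follows. This is only an illustration and not part of
PHASES: the function name is hypothetical, the coordinates are assumed
to be already fractional, and the padding of radius/cell length is
exact only for orthogonal cells, so for other cells it should be
treated as a minimum.

```python
import math


def padded_fractional_range(frac_coords, cell_lengths, radius=1.8):
    """Return [(low, high), ...] per axis enclosing all atoms plus a margin.

    frac_coords  : iterable of (x, y, z) fractional coordinates
    cell_lengths : (a, b, c) cell edges in angstroms
    radius       : atom radius in angstroms used to pad each end
    """
    coords = list(frac_coords)
    if not coords:
        raise ValueError("no atoms supplied")
    ranges = []
    for axis, length in enumerate(cell_lengths):
        values = [atom[axis] for atom in coords]
        pad = radius / length  # radius expressed as a fraction of the cell edge
        # Round outward to two decimals so the range is never made smaller.
        low = math.floor((min(values) - pad) * 100) / 100
        high = math.ceil((max(values) + pad) * 100) / 100
        ranges.append((low, high))
    return ranges


# Hypothetical three-atom "model" in a 60 x 90 x 45 angstrom cell.
atoms = [(0.10, 0.20, 0.30), (0.35, 0.05, 0.52), (0.22, 0.41, 0.44)]
limits = padded_fractional_range(atoms, cell_lengths=(60.0, 90.0, 45.0))
for name, (low, high) in zip("xyz", limits):
    print(f"{name}: {low:.2f} to {high:.2f}")
```

The printed limits are the values one would then type in when MDLMSK
(and later MAPVIEW) asks for the coordinate range.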
8.00 INCORPORATION OF PARTIAL STRUCTURE INFORMATION
In many cases a significant fraction of the structure can be
reliably determined from an electron density map, but some regions in
the map are less well defined. In that case it is often useful to
incorporate phase information obtained from the partial structure into
the phasing process. This can be done in several ways, all of which
require running the PHASIT program once (in SF calculation mode,
IHLCF=0, ISIGA=0) to generate partial structure phases and
amplitudes, and running the BNDRY program once (option 3, with
ICMB=0 or 1) to combine the partial structure phase information with
prior phase probability distributions cast in terms of Hendrickson-
Lattman coefficients. Different strategies can be employed depending
on which prior distributions are used, weighting during the phase
combination and what is done AFTER the phase combination step. The
most common procedures are now described.
In all procedures, first run PHASIT in SF calculation mode using
IHLCF=0 and ISIGA=0, and call the output phase file MODEL.31.
This file contains the partial structure phase and amplitude
information.
Now you have some choices.
1) Combine the partial structure information with the original (MIR,
SIR etc) probability distributions (usually in file called PHASIT.31
generated by PHASIT, but possibly introduced via the IMPORT program).
This can be done with a small control file to run BNDRY, option 3,
using ICMB=0 or 1 for either Sim or Sigma_A weighting, respectively,
(see BNDRY write-up) during phase combination. Call the output file
PHICOMBINED.ORG. This file will contain phase, figure of merit and
HL coefficients for the COMBINED data. If the partial structure was
large enough, you may be able to use these phases directly to get a
good map.
2) If you want to proceed with solvent flattening cycles, just copy
the file PHICOMBINED.ORG to PHASIT.31 (first saving the ORIGINAL
PHASIT.31 i.e. no partial structure contributions, in another file).
Now you can invoke the default procedure DOALL.COM without changing
anything, and at each phase combination step the MODEL+SIR etc
distributions will serve as the "anchored" phases with which those
newly obtained from solvent flattening will be combined.
3) If you wish instead to combine the partial structure information
with distributions obtained AFTER solvent flattening, do the same as
in 1), but use the best phases available (usually in file obtained
from a previous run called phi16cy.31, phiextnd.31 etc) instead of the
original PHASIT.31 file. Call the output file PHICOMBINED.FIN. One
could then proceed with solvent flattening cycles as in 2), but
usually this is not necessary and the phases in file PHICOMBINED.FIN
are used for the final map.
These 3 options (partial structure + MIR etc with no flattening,
[partial structure + MIR etc] followed by flattening, and partial
structure + flattened MIR) seem to be the most useful, and can all be
carried out without tampering with the default control files. One only
has to create additional small control files for single runs of PHASIT
(SF mode) and BNDRY (option 3). Other options making use of partial
structure information are described in the section on "REDUCED BIAS
NATIVE, COMBINED AND DIFFERENCE FOURIER MAPS."
Note that in the case where a molecular replacement solution was
obtained, then one has no MIR like phase probability distributions to
combine solvent flattened (e.g. map inverted) phases with. In that
case (or if one has MIR phase information, and simply wants to abandon
it), one can use PHASIT in SF calculation mode but with IHLCF=1 and
ISIGA=0 or 1. That will create Hendrickson-Lattman coefficients for
the partial structure, and these distributions can then be used as the
"anchored" phases with which those newly obtained from solvent
flattening will be combined. This may also be useful if one wants to
tie noncrystallographic symmetry averaged phases to model phases.
Another option is available involving phase extension. Suppose one has
MIR, SIR etc data to only 4.0 angstrom resolution, native data to 3.0
angstrom resolution, and a partial structure available. One could
first compute the partial structure phases out to 3.0 angstrom
resolution (PHASIT SF mode, IHLCF=0, ISIGA=0) and MIR etc phases out
to 4.0 angstrom resolution (PHASIT, phasing mode). Then run MISSNG to
get the file "extrfl.d" containing reflections between 4 and 3
angstroms. The three output files could then be combined in a single
run of BNDRY (option 3, ICMB=0 or 1, with phase extension requested)
to get a hybrid file. The output file would then contain MIR combined
with partial structure phases to 4 angstroms, and partial structure
phases between 4 and 3 angstroms. This file could then be used for
direct calculation of a map or to initiate solvent flattening cycles
as described earlier. Yet another variation would be to do the same
thing but requesting IHLCF=1 and ISIGA=0 or 1. In that case during
solvent flattening iterations the map inverted phases would also be
tethered to the partial structure phases for the high resolution
data.
Clearly many other options or sequences are available. The key to
successful use of the programs in this fashion is understanding that
the phase combination program (BNDRY, option 3) merges at least two
files, one of which must contain phase information cast in
Hendrickson-Lattman coefficients, and the other containing only
calculated phases and amplitudes (possibly to higher resolution). If
phase extension is also desired, a third file with the additional
reflections may contain only indices and amplitudes, but it may also
contain phase probability distribution coefficients for some or all
of the reflections. The output file always contains the COMBINED
information cast in the HL coefficient form. It is thus suitable for
use either for direct map calculations or as an input file for the
BNDRY, MISSNG, MRGDF, MRGBDF, RD31 etc programs.
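
The idea behind combining distributions cast as Hendrickson-Lattman
(HL) coefficients can be illustrated with a short sketch. Multiplying
two phase probability distributions of the form
exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)) is the same
as adding their coefficients, and the centroid of the resulting
distribution gives the phase and figure of merit. The code below is
not taken from BNDRY: the function names are hypothetical and the Sim
or Sigma_A weighting that BNDRY applies to the partial structure
information is deliberately left out.

```python
import math


def combine_hl(first, second):
    """Add two (A, B, C, D) sets; equivalent to multiplying the distributions."""
    return tuple(p + q for p, q in zip(first, second))


def best_phase_and_fom(hl, steps=360):
    """Return (centroid phase in degrees, figure of merit) for one reflection."""
    a, b, c, d = hl
    phis = [2.0 * math.pi * k / steps for k in range(steps)]
    exponents = [
        a * math.cos(p) + b * math.sin(p)
        + c * math.cos(2.0 * p) + d * math.sin(2.0 * p)
        for p in phis
    ]
    top = max(exponents)  # subtract the maximum to avoid overflow in exp()
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    x = sum(w * math.cos(p) for w, p in zip(weights, phis)) / total
    y = sum(w * math.sin(p) for w, p in zip(weights, phis)) / total
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)


# Made-up coefficients for a single reflection.
prior = (1.0, 0.0, 0.0, 0.0)    # e.g. from MIR or SIR phasing
partial = (0.0, 2.0, 0.0, 0.0)  # e.g. from a partial structure
combined = combine_hl(prior, partial)
phase, fom = best_phase_and_fom(combined)
print(f"combined phase {phase:.1f} degrees, figure of merit {fom:.2f}")
```

Because the coefficients simply add, the combined distribution is
sharper than either input whenever the two sources agree, which is why
the output file can serve directly as the "anchor" for later cycles.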
9.00 REDUCED BIAS NATIVE, COMBINED AND DIFFERENCE FOURIER MAPS
When making use of partial structure information, either obtained
from a model via structure factor calculations or from inversion of
a density map, the resulting phases are always biased towards the
partial structure. Read (Acta Cryst. A42, 140-149, 1986) has shown
how this bias can be reduced significantly when using the partial
structure phases directly for map calculations, and how to properly
weight partial structure derived phase information when combining
it with other (e.g. MIR, SIR etc.) phase information. Both procedures
require first determining "Sigma_A", which is related to the
contributions from "missing" or "incorrect" parts of the structure
and varies with resolution, to compute the proper weight (and thus
FOM) for the partial structure phases. The procedures described by
Read have been implemented as options in both PHASIT and BNDRY,
and can be invoked as follows:
(1) COMBINED PHASE MAPS
For simple phase combination the Sigma_A procedure can be invoked in
the BNDRY program (option 3), by setting ICMB=1. Sigma_A weighting is
then used instead of Bricogne's modification of Sim's weighting
scheme during phase combination. This can be used either with model
phases or map inverted phases, and thus can be done automatically
in the "doall" or "extndavg" procedures.
(2) REDUCED BIAS DIFFERENCE MAPS
These maps are similar to conventional Fo-Fc maps phased with the
partial structure, but the coefficients are
(FOM * FOBS - D * FCALC) * exp(i * phicalc)
where D is derived from the Sigma_A values and phicalc is the phase
from the partial structure. The appropriate map can be produced by
running PHASIT, SF mode with ISIGA=2, and then requesting a Fo-Fc map
in FSFOUR. Note however, that if chosen, FOM*FOBS and D*FCALC will
occupy the Fo and Fc slots in the output file, thus other map types
requiring pure Fo and/or pure Fc values will be inaccessible.
(3) REDUCED BIAS NATIVE MAPS
These maps are similar to conventional 2Fo-Fc maps phased with the
partial structure, but the coefficients are
(2 * FOM * FOBS - D * FCALC) * exp(i * phicalc)
for acentric reflections where D is derived from Sigma_A and
FOM * FOBS * exp(i * phicalc)
for centric reflections.
The appropriate map can be produced by running PHASIT, SF mode setting
ISIGA=3, and then requesting a 2Fo-Fc map in FSFOUR. Note however,
that if chosen, FOM*FOBS and D*FCALC (acentric) or FOM*FOBS/2 and 0
(centric) will occupy the Fo and Fc slots in the output file, thus
other map types requiring pure Fo and/or pure Fc values will be
inaccessible.
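
As a worked illustration of the coefficients in (2) and (3), the
sketch below evaluates them for a single reflection. It is not PHASIT
or FSFOUR code: the function and argument names are hypothetical, the
values of FOM and D are simply supplied by the caller rather than
derived from Sigma_A, and the difference coefficient is applied to
centric and acentric reflections alike because the text above gives
only one form for it.

```python
import cmath
import math


def reduced_bias_coefficient(fobs, fcalc, phicalc_deg, fom, d,
                             centric=False, native=True):
    """Return the complex map coefficient for one reflection.

    native=True  : reduced bias "2Fo-Fc" type coefficient
    native=False : reduced bias "Fo-Fc" type coefficient
    """
    if native:
        if centric:
            amplitude = fom * fobs
        else:
            amplitude = 2.0 * fom * fobs - d * fcalc
    else:
        amplitude = fom * fobs - d * fcalc
    # The amplitude may be negative; the complex product handles the sign.
    return amplitude * cmath.exp(1j * math.radians(phicalc_deg))


# Made-up values for one acentric reflection.
native_coef = reduced_bias_coefficient(100.0, 80.0, 90.0, fom=0.8, d=0.9)
diff_coef = reduced_bias_coefficient(100.0, 80.0, 90.0, fom=0.8, d=0.9,
                                     native=False)
print(abs(native_coef), abs(diff_coef))
```

With these numbers the native amplitude is 2(0.8)(100) - (0.9)(80) = 88
and the difference amplitude is (0.8)(100) - (0.9)(80) = 8, both
carrying the partial structure phase.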
(4) SIGMA_A WEIGHTED PROBABILITY DISTRIBUTION COEFFICIENTS
Phase probability distribution coefficients and corresponding FOM
based on Sigma_A weighting for structure factors computed entirely
from an atomic model can be obtained by running PHASIT, SF mode with
ISIGA=1. The file may then be used as the "anchor" phases to which map
inverted phases are tethered. It is particularly useful for merging
information during phase extension when high resolution phases come
from a partial structure and lower resolution phases come from MIR
type calculations.
10.00 INCORPORATION OF NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING
Whenever there are multiple copies of identical molecules present
in the crystallographic asymmetric unit and/or the same molecule is
present in multiple crystal forms, one has the opportunity to
improve the phases by averaging the corresponding electron density in
the related molecules, replacing the density for each molecule with
the average, and inverting the "averaged" density map(s) to obtain new
structure factor amplitudes and phases. These new amplitudes and
phases can then be accepted immediately, but are more frequently
combined with the original MIR, SIR etc phase information in a
probabilistic manner, just like those obtained from solvent flattening
or from a partial structure. Indeed, solvent flattening and imposition
of non-negativity of electron density can be applied in addition to
the noncrystallographic symmetry averaging, leading to powerful
phasing algorithms. The resulting phases (either alone or combined
with MIR, SIR etc), are typically combined with the observed
amplitudes, and the process is cycled until convergence is obtained.
The power of the method increases as the number of molecules averaged
increases, but averaging over even a dimer is still extremely useful
when combined with MIR, SIR data, etc. Programs required to carry out
the steps needed for successful noncrystallographic symmetry averaging
are currently included in the PHASES package, and sample control
scripts are given (called "extndavg.sh" and "extndavg_mc.sh", for the
single and multiple crystal averaging cases, respectively) which
replace the "extnd.sh" script in a normal solvent flattening run. The
scripts insert the averaging related steps into the normal solvent
flattening process, thus the complete multi-cycle task can be carried
out by executing them. Prior to running the scripts however, there are
several related tasks to be performed, which include determination of
the location, direction and nature (rotational order) of the
noncrystallographic symmetry operator(s), and construction of one or
more "averaging envelopes" or "averaging masks" delineating the
volume(s) occupied by the molecules to be averaged. Initial estimates
for the noncrystallographic symmetry operator(s) are usually obtained
from rotation/translation functions which are not included in the
PHASES package as they are readily available elsewhere, however if the
operators are specified by 3x3 rotation matrices and 3 element
translation vectors (as for example, in the program "O"), then the
PHASES program O_TO_SP can be used to convert them to PHASES format.
Everything else, including refinement of the operator(s) and
construction of the envelope mask(s) is part of PHASES. All map
interpolation programs (MAPAVG, SKEW, MAPORTH etc) utilize powerful 64
point spline algorithms, thus the map grids for averaging need not be
any finer than for normal calculations. Many of the noncrystallographic
symmetry averaging routines in PHASES were derived from programs
originally written by W. Hendrickson & J. Smith. In most instances they
have been heavily modified for use in PHASES, mostly to generalize the
algorithms, to optimize the code, and to provide compatibility with the
rest of the package. The general averaging process as implemented in
PHASES is described below.
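
The core of the averaging step (replace the density of each related
molecule with the average over all copies) can be shown in a few
lines. This is a conceptual sketch only, with hypothetical names: it
assumes the density of every copy has already been sampled at
corresponding points, whereas MAPAVG obtains those values itself by
applying the operators and interpolating in the submap.

```python
def average_related_density(copies):
    """Average density over NC symmetry related copies.

    copies[m][k] is the density of molecule m at the k-th corresponding
    point. Returns a new list of copies, each holding the averaged values.
    """
    if not copies:
        raise ValueError("no copies supplied")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same number of points")
    count = len(copies)
    averaged = [sum(values) / count for values in zip(*copies)]
    # Every molecule receives its own copy of the averaged density.
    return [list(averaged) for _ in copies]


# Two noisy samplings of the same (made-up) density profile.
dimer = [[1.0, 3.0, 0.5], [3.0, 5.0, 1.5]]
print(average_related_density(dimer))
```

Uncorrelated errors in the individual copies tend to cancel in the
average, which is why the power of the method grows with the number of
molecules averaged.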
For both simplicity and reasons related to computational
efficiency, all of the averaging related calculations are best
performed on electron density "submaps," which cover only the map
region encompassing an asymmetric unit containing the molecules to be
averaged. This "asymmetric unit" need not be complete in the
crystallographic sense (that is, it may differ from a true asymmetric
unit in volume and have irregular borders), but it must encompass at
least the molecules to be averaged, although solvent regions may be
omitted. It may also span cell edges, if necessary. Since the standard
FSFOUR maps always cover a complete unit cell, the "submaps" (which
have a different format) can be created from them via the programs
MAPVIEW or EXTRMAP. Indeed, MAPVIEW will almost certainly be needed to
determine which region to extract in the first place. All envelope
creation, averaging, operator refinement, skewing etc will be done
using the submaps. After appropriate regions in the submaps are
averaged, program BLDCEL is used to regenerate complete unit cell maps
(FSFOUR format) conforming to the space group symmetry, which can then
be inverted by MAPINV. Thus MAPVIEW (or EXTRMAP) and BLDCEL serve as
the gateways between normal FSFOUR maps and submaps. Note that MAPVIEW
can display either type of map (and mask). Descriptions of the inputs
required for each of the programs mentioned can be found in the
appropriate program write-ups.
The keys to successful averaging are to obtain good "envelope"
masks which accurately identify the volume(s) in space in which the
noncrystallographic symmetry operator(s) is/are valid, and to obtain
accurate values for the operators themselves. These tasks always will
take one of two routes, depending on the nature of the
noncrystallographic symmetry. Within a given crystal, if the NC
symmetry is purely rotational with the order of rotation being N-FOLD,
where N is a small integer, then the task is simplified since one
needs only a single "envelope mask" which encompasses all N of the
molecules related by NC symmetry. That is to say the averaging can be
done without having to specify where one molecule stops and the next
starts. One only needs to know the bounds of the TOTAL AGGREGATION of
molecules. The procedure (A) below is then adequate to carry out the
necessary computations. If an arbitrary rotation angle and/or a
translational (e.g. screw-like) shift is involved, the task is more
complicated since one then must create a SEPARATE ENVELOPE MASK
identifying each molecule. The procedure (B) below is then adequate
to carry out the computations. For multiple crystal averaging the same
steps and considerations are required, but multiple submaps (one for
each crystal form, along with corresponding envelope masks) are used.
Details related to multiple crystal averaging are described later.
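
To make the pure N-fold case concrete, the sketch below generates the
N positions related by a proper rotation axis. It is not PHASES code
and does not use the PHASES operator format: the axis is given as a
direction vector plus a point it passes through, coordinates are
assumed to be orthogonal angstroms, and all names are hypothetical.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rotation matrix (Rodrigues formula) about a direction vector."""
    ux, uy, uz = axis
    norm = math.sqrt(ux * ux + uy * uy + uz * uz)
    ux, uy, uz = ux / norm, uy / norm, uz / norm
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [
        [t * ux * ux + c, t * ux * uy - s * uz, t * ux * uz + s * uy],
        [t * ux * uy + s * uz, t * uy * uy + c, t * uy * uz - s * ux],
        [t * ux * uz - s * uy, t * uy * uz + s * ux, t * uz * uz + c],
    ]


def apply_operator(matrix, point, center):
    """Rotate a point about an axis that passes through 'center'."""
    shifted = [p - c for p, c in zip(point, center)]
    rotated = [sum(matrix[i][j] * shifted[j] for j in range(3))
               for i in range(3)]
    return [r + c for r, c in zip(rotated, center)]


def ncs_copies(point, axis, center, order):
    """Return the 'order' positions generated by an N-fold rotation axis."""
    matrix = rotation_matrix(axis, 360.0 / order)
    copies = [list(point)]
    for _ in range(order - 1):
        copies.append(apply_operator(matrix, copies[-1], center))
    return copies


# Twofold axis parallel to z passing through (10, 10, 0).
for position in ncs_copies((12.0, 10.0, 5.0), (0.0, 0.0, 1.0),
                           (10.0, 10.0, 0.0), order=2):
    print([round(value, 3) for value in position])
```

Because applying the operator N times returns every point to where it
started, the set of N molecules maps onto itself, which is the reason
a single envelope around the whole aggregate is sufficient in case (A).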
(A) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH PURE
ROTATIONAL SYMMETRY OF ORDER N
1) start with best possible map (usually solvent flattened MIR map, as
obtained via the "doall" procedure).
2) compute a map via "FSFOUR" (default orientation, i.e. NORN=0)
3) run EXTRMAP (or MAPVIEW) to extract a submap from the FSFOUR map
which encompasses at least the dimer, trimer etc, related by KNOWN
(at least approximately) noncrystallographic symmetry.
4) if the unit cell is not orthogonal, run MAPORTH to convert the
submap to an orthogonal grid (but save the input submap as well)
5) run LSQROT (using orthogonal map), to refine the noncrystallographic
symmetry axis location and direction. Start with low resolution
(~6A map, 2A grid) refining only within a sphere of suitable radius
(usually 12-25A), centered about a point on the rotation axis which
is near the dimer, trimer etc center. Then gradually extend the map
resolution to about 3A (1A grid) and repeat the refinement. In a 4A
map, the correlation coefficient after refinement should be about
0.4 or higher. (Ignore the R factor, it's always very high).
6) run SKEW (using the submap from 3), to generate "skewed" map
with new "b" axis aligned with noncrystallographic symmetry axis.
7) run MAPVIEW (using "skewed" map) to create mask (via "trace mask"
option) which encompasses only the region to be averaged. This
should include the entire dimer, trimer etc. In MAPVIEW, use only a
single mask (Mask No. 1). When exiting, save the "skewed" mask file.
8) run TRNMSK (using both the original submap from 3, and "skewed"
mask from 7) to convert the skewed mask to one corresponding to the
default (non-skewed) orientation (its grid will have one-to-one
correspondence with the original submap). Save this standard mask.
9) run MAPVIEW (using the original submap from 3), and "recover" the
standard mask file from 8. Then use "Make Asu" option, and possibly
edit masks until only non redundant density associated with the
desired dimer, trimer etc is within the mask. When exiting, save the
ENTIRE mask (no subset). It will be used in all future averaging
cycles.
Optionally, run LSQROT again this time using the default mask
output from 9 as basis for refinement (you may have to orthogonalize
it), instead of a sphere. If you do this, expect a drop in the
correlation coefficient. If the orientation changes significantly,
repeat steps 6-9.
Proceed to AVERAGING STEPS
(B) MASK PREPARATION STEPS AND OPERATOR REFINEMENT WITH ARBITRARY
ROTATIONAL ANGLE AND/OR TRANSLATION
Steps 1-4 same as in (A)
5) Run LSQROTGEN (using orthogonal map), to refine the
noncrystallographic symmetry operators relating molecule 1
(arbitrarily selected) to each other molecule. Start with low
resolution (~6A map, 2A grid) refining only within spheres of
suitable radius (typically 15A) centered on points near the centers
of molecule 1 and the target molecule, respectively. Then gradually
extend the map resolution to about 3A (1A grid) and repeat the
refinement. In a 4A map, the correlation coefficient after
refinement should be about 0.4 or higher. (Ignore the R factor, it's
always very high). For N related molecules, there will be N-1
operators to refine.
6) Run MAPVIEW (using the submap from 3) to create SEPARATE envelope
masks for EACH MOLECULE to be averaged. Do this by making use of
the "set mask no." and "trace mask" options. When exiting, save
the mask file, as it now contains separate envelope information
for each molecule. Also, remember which mask No. you assigned to
which molecule.
7) Run MAPVIEW (using original submap from 3), and "recover" the
standard mask file from 6. Then use "Make Asu" option, and possibly
edit masks until only non redundant density associated with the
desired dimer, trimer etc is within molecular envelope masks. When
exiting, save the ENTIRE mask (no subset). It will be used in all
future averaging cycles.
Optionally, run LSQROTGEN again this time using the default mask
output from 7 as basis for refinement (you may have to orthogonalize
it), instead of spheres. If you do this, expect a drop in the
correlation coefficient. If the operator(s) change significantly,
repeat steps 6-7, otherwise continue.
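
The correlation coefficient quoted in step 5 of both procedures
measures how well the density of one molecule matches the density at
the operator-related positions. A minimal sketch of the conventional
linear correlation coefficient is given below. Whether LSQROT and
LSQROTGEN use exactly this definition is an assumption, and the
function name is hypothetical.

```python
import math


def correlation_coefficient(rho_a, rho_b):
    """Linear correlation between two equally long lists of density values."""
    if len(rho_a) != len(rho_b):
        raise ValueError("density lists must have the same length")
    if len(rho_a) < 2:
        raise ValueError("at least two points are needed")
    mean_a = sum(rho_a) / len(rho_a)
    mean_b = sum(rho_b) / len(rho_b)
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a completely flat map has no defined correlation")
    return covariance / math.sqrt(var_a * var_b)


# Made-up density samples for molecule 1 and its NC symmetry mate.
molecule_1 = [0.2, 1.4, 0.9, 0.1, 1.8]
molecule_2 = [0.4, 1.1, 1.0, 0.3, 1.5]
print(f"{correlation_coefficient(molecule_1, molecule_2):.3f}")
```

Unlike an R factor, this quantity is insensitive to the overall scale
and offset of the two sets of density values, which is one reason it is
the more useful indicator here.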
AVERAGING STEPS
Prior to brute force cycling, run MAPAVG (using the original submap
from 3, and the corresponding mask from 9A or 7B) to generate an
"averaged" map. If the translation is small (or absent) use "SKEW" to
convert it so you can look down the NC symmetry axis. You can then use
"MAPVIEW" to view the map, and verify that averaging has indeed been
done successfully, that you are in fact looking down the NC symmetry
direction, and the axis goes through the origin. If so, proceed to
averaging cycles. If not, something went wrong earlier. Check program
inputs, outputs, polar axis conventions, etc.
At this point refined values of the noncrystallographic symmetry
operator(s) are available, along with envelope masks isolating the
regions to be averaged within the submap.
1) create the file "extrmap.d", which will specify what submap region
to extract from the FSFOUR map. It MUST correspond EXACTLY to the
same region used when creating the envelope masks. (You can read
the envelope mask header with RDHEAD if you forgot). Rename the
final mask file "asu.msk". See EXTRMAP write-up for information.
2) create the file "mapavg.d", to specify the transformation
operator(s) for averaging, and the envelope mask file. See MAPAVG
write-up for information.
3) create the file "bldcel.d", to specify the file names and options.
BLDCEL will take the "averaged" asymmetric unit submap from mapavg,
and build a complete cell FSFOUR style map from it. See BLDCEL
write-up.
4) Create the file "sloext.d" specifying phase extension information
and cycles to be performed (see SLOEXT write-up). If no phase
extension is to be done, make the upper and lower resolution
cutoffs identical and specify 16 cycles. Otherwise, specify the
resolution cutoffs and cycles per resolution increment, and run
MISSNG to create the "extrfl.d" file.
5) Create the file "extnd.d" specifying file names, extension options
and I/O type.
6) Verify that the phase files (phasit.31 and phi16cy.31), solvent
mask (mask3.14), and data files (bnd2.d, fft.d, minv2.d) from a
previous "doall" run are available.
7) Run the procedure "extndavg.sh". It will carry out the cycles of
NC symmetry averaging/solvent flattening/phase combination/phase
extension steps to combine "averaged" phases with the original MIR
phases.
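
Since the script expects a fixed set of file names, it can save time
to confirm that everything listed in steps 1-6 is present before
starting. The helper below is a convenience sketch and not part of
PHASES. The file list is taken from the steps above, the function name
is hypothetical, and the standard parameter file is not checked
because its name differs from project to project.

```python
import os

# File names as given in steps 1-6 above.
REQUIRED_FILES = [
    "extrmap.d", "mapavg.d", "bldcel.d", "sloext.d", "extnd.d",
    "asu.msk", "phasit.31", "phi16cy.31", "mask3.14",
    "bnd2.d", "fft.d", "minv2.d",
]


def missing_inputs(directory=".", phase_extension=True):
    """Return the names of required files not found in 'directory'."""
    names = list(REQUIRED_FILES)
    if phase_extension:
        names.append("extrfl.d")  # produced by MISSNG, see step 4
    return [name for name in names
            if not os.path.isfile(os.path.join(directory, name))]


absent = missing_inputs(".")
if absent:
    print("missing before running extndavg.sh:", ", ".join(absent))
else:
    print("all expected input files are present")
```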
***** CREATING AVERAGING ENVELOPE MASKS FROM A MODEL *****
If coordinates from a tentative model are available, they can also
be used to create the averaging envelope masks. The procedure is
essentially that described in the CREATING/EDITING SOLVENT MASKS
section, with a couple of minor exceptions. First, after the initial
mask is constructed in MDLMSK and edited in MAPVIEW as described,
one is finished since unlike solvent masks, there is no need to make
"full cell" averaging masks with BLDCEL. Second, if the NC symmetry
operation involves arbitrary rotations and/or post rotation
translations, then MDLMSK must be run multiple times; once for each
NC symmetry related molecule. In each run a separate file should be
written and a different mask number must be used, but each file must
cover the same range (which is large enough to cover ALL copies). The
particular mask numbers used must be remembered as they will be needed
later when specifying which transformation operators are to be used in
MAPAVG to relate the molecules. The individual mask files should then
be edited and saved as described in the CREATING/EDITING SOLVENT MASKS
section. Once edited, the individual mask files must be combined into
a single mask file with program MRGMSK. The output file from MRGMSK
then can be used for averaging (i.e. as "asu.msk" in the "extndavg.sh"
procedure).
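
What combining the individual masks amounts to can be sketched as
follows. This is a conceptual illustration only and makes no claim
about the MRGMSK file format or algorithm: each mask is represented as
a flat list of integers over the same grid, the value 0 is ASSUMED to
mean "outside every envelope", and a grid point claimed by two
different mask numbers is treated here as an error.

```python
def merge_masks(masks):
    """Merge per-molecule masks (equal-length integer lists) into one."""
    if not masks:
        raise ValueError("no masks supplied")
    size = len(masks[0])
    if any(len(mask) != size for mask in masks):
        raise ValueError("all masks must cover the same range and grid")
    merged = [0] * size
    for mask in masks:
        for index, value in enumerate(mask):
            if value == 0:
                continue  # point lies outside this molecule's envelope
            if merged[index] not in (0, value):
                raise ValueError(
                    f"grid point {index} claimed by masks "
                    f"{merged[index]} and {value}"
                )
            merged[index] = value
    return merged


# Two tiny made-up masks: molecule 1 uses mask number 1, molecule 2 uses 2.
mask_1 = [1, 1, 0, 0, 0, 0]
mask_2 = [0, 0, 0, 2, 2, 0]
print(merge_masks([mask_1, mask_2]))
```

The requirement that every MDLMSK run cover the same range with a
different mask number is what makes this kind of point-by-point merge
possible.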
The output masks from MDLMSK or MRGMSK can be used for averaging
as long as a corresponding map region is provided. Thus the input used
to create the submaps ("extrmap.d" in the "extndavg.sh" procedure)
must specify the same range, and the map must have the same periods.
If the "extndavg.sh" procedure is to be used, then the solvent mask
must also be created from a map having the same periods. The output
masks can be examined/edited in MAPVIEW, again as long as the
corresponding map region (either explicitly selected from a FSFOUR map
in MAPVIEW, or previously extracted from a FSFOUR map by EXTRMAP and
input to MAPVIEW as a non-FSFOUR map) is provided. Once the output
masks from MDLMSK or MRGMSK are obtained, they can be used just like
any other averaging mask file, i.e. used for operator refinement in
LSQROT or LSQROTGEN, used in MAPAVG, manipulated in SKEW, BLDCEL etc.
10.01 AVERAGING WITH MULTIPLE CRYSTALS
All of the submap extraction and mask preparation steps used in
single crystal averaging as described earlier must be carried out
independently for each crystal, thus multiple submap and corresponding
mask files must be created. If a given crystal also contains multiple
NC symmetry related copies WITHIN IT, then the operators relating
molecule 1 to each of them must also be refined exactly as described
in the single crystal case. This will allow both intra and inter
crystal averaging to be carried out simultaneously. In addition, the
operators relating MOLECULE 1 in CRYSTAL 1 to MOLECULE 1 in EACH
OTHER CRYSTAL must also be refined. This can be done in program
LSQROTGEN by specifying the appropriate input. Once all of the
required operators and envelope masks are obtained, averaging can
proceed by specifying the appropriate input to program MAPAVG, and
by preparing the required input files for EXTRMAP and BLDCEL for
each crystal. A script file "extndavg_mc.sh" is supplied for multiple
crystal averaging in the case where there are two crystals. It
can easily be modified to include more crystals (up to 6), and
comments are embedded in it explaining where modifications are to
be made. The main difference in the procedure for multiple crystal
averaging is that all of the normal input files must be duplicated for
each crystal, and the standard file names for maps, masks, data files
etc must be modified to uniquely identify the appropriate crystal.
During each cycle of multiple crystal averaging the full cell maps are
created and the submaps are extracted independently for each crystal.
Then an averaged version (averaged over ALL copies) is created for
each submap. For each crystal, the averaged submap is then expanded to
its full cell version, solvent flattened, Fourier inverted and the
resulting phases and amplitudes combined probabilistically with the
appropriate MIR or SIR phases. Thus a new improved map can be obtained
for each crystal. With multiple crystal averaging however, there is
currently no facility for slow phase extension, thus the file
"sloext.d" is not needed and the number of cycles to be done is hard
wired into the "extndavg_mc.sh" script. Phase extension however, still
can be done. It's just that the appropriate cutoffs are supplied only
in the "extnd.d" files for each crystal, and are constant for all
iterations. One can of course, still extend the resolution gradually
by repeating the process iteratively with different cutoffs and
input files.
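
Duplicating the file names for each crystal is easy to get wrong by
hand, so a small helper can be used to list them. The sketch below is
not part of PHASES. It ASSUMES the crystal number is inserted before
the extension (the pattern seen in names such as "phasit_N.31" and
"newphi_N.ref"); the names actually expected are those written in the
"extndavg_mc.sh" script, which should be checked before relying on
this.

```python
import os

MAX_CRYSTALS = 6  # limit quoted above for the supplied script


def per_crystal_name(filename, crystal):
    """Insert '_<crystal>' before the file extension."""
    stem, extension = os.path.splitext(filename)
    return f"{stem}_{crystal}{extension}"


def names_for_crystals(filenames, count):
    """Return {crystal number: [file names]} for crystals 1..count."""
    if not 1 <= count <= MAX_CRYSTALS:
        raise ValueError(f"crystal count must be 1 to {MAX_CRYSTALS}")
    return {
        crystal: [per_crystal_name(name, crystal) for name in filenames]
        for crystal in range(1, count + 1)
    }


# Standard single-crystal names used as the starting point.
standard = ["phasit.31", "newphi.ref", "bnd3.d"]
for crystal, names in names_for_crystals(standard, 2).items():
    print(crystal, names)
```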
10.02 AVERAGING DIFFERENCE OR 2FO-FC MAPS
One usually does the averaging/solvent flattening iterations on
normal electron density maps, but in some cases it may be desirable
to average FO-FC or 2FO-FC maps. Examples might be when trying to
identify inhibitors, activators etc. soaked into known crystal
structures, or when trying to build up density for missing sections
of the macromolecule itself. This can be accomplished by proper
preparation of the input files, and changing the map type
specification in the fft.d input file. To do this one must assure
that both FO and FC are available on the INITIAL file (called
phi16cy.31 in the "extndavg.sh" or "extndavg_mc.sh" scripts) used to
create the first map, and on the OUTPUT file ("newphi.ref" or
"newphi_N.ref," etc.) produced in each iteration. For the output
files this is done by specifying IOTYP=1 in the BNDRY option 3 input
("bnd3.d" or "bnd3_N.d" etc.). For the INITIAL file, one could
obtain it from a single run of BNDRY, option 3, again specifying
IOTYP=1, or from a run of PHASIT, structure factor mode specifying
IHLCF=0 and ISIGA=0, depending on whether the phase information comes
from MIR type calculations or from atomic coordinates for a model.
Note however, that the "anchor" phase file (called "phasit.31" or
"phasit_N.31" etc.) which the map inverted phases will be combined
with MUST contain FM*FO and FO in the amplitude slots along with
probability distribution coefficients, as would be the case if the
file was created with a normal PHASIT run in protein phasing mode (or
structure factor mode if the "long format" output was requested). As
long as these files are properly prepared and the appropriate
coefficients are selected in the fft input, iterations using map
types involving FC's will be obtainable. One must be aware however,
that the final output file (phiextndavg.31) will then also have
FO and FC in the amplitude slots, and thus can only be used in
FSFOUR for straight or difference type Fouriers, and NOT for
figure of merit weighted Fouriers.
If one is averaging with molecular replacement derived phase
information and has already proceeded as described in the DENSITY
MODIFICATION WITH MOLECULAR REPLACEMENT DERIVED PHASE INFORMATION
section, i.e. the "doall" procedure has been run using the modified
inputs, one need only to change the filename "phasit.31" in the
"extnd.d" input file to "anchor.31". Then averaging can proceed with
the "extndavg.sh" script using all of the preexisting files, once the
averaging mask, extrmap.d, mapavg.d, bldcel.d, sloext.d (and possibly
extrfl.d) files are created.
10.03 SAMPLE INPUT FILES FOR AVERAGING
***** SAMPLE INPUT FILES FOR AVERAGING WITHIN ONE CRYSTAL *****
Sample input files for the averaging steps follow, along with a
listing of the supplied template command files "extndavg.sh" and
"extndavg.com". The command files can be used in place of the normal
"extnd.sh" or "extnd.com" file in a solvent levelling run. They will
perform additional cycles of averaging/solvent flattening/phase
combination/extension starting with the phases in file "phi16cy.31",
combining the "averaged" phases with MIR, SIR phase information in
file "phasit.31", and extending phases to additional amplitudes on
file "extrfl.d". They assume that all of the files needed for a
normal solvent flattening run (fft.d, bnd2.d, etc) are available, and
that the third mask from a previous run (mask3.14) is still available
for solvent flattening. If the template script file is to be used
unchanged, then all filenames should be EXACTLY as in the examples
(except for the standard parameter file). Only the data relating to
submap ranges, resolution limits, number of cycles and the NC
operators should be changed. The final phases will be written to file
"phiextndavg.31", and printed information to "extndavg.l". The
procedure is run simply by entering "sh extndavg.sh" (UNIX) or
"@EXTNDAVG.COM" (VMS).
-- Sample input file extrmap.d or extrmap.dat, to extract submap ----
pdc.pam
four.map
asu.map
-.42 .45 -.45 .42 -.08 .56
--- Sample input file mapavg.d, for averaging over pure twofold -----
pdc.pam
1
asu.map
asu.msk
asu.avg
2 1 1
-102.16 83.81 180.0 1.082 -.746 .316 0.0
--- Sample input file bldcel.d, builds complete cell from averaged
submap ---
pdc.pam
four.map
avgcell.map
asu.avg
asu.msk
0
--- Sample input file extnd.d, specifying phase combination
data and options ---
pdc.pam
3
1 2.75 1. 0 0
phasit.31
minv.ref
extrfl.d
newphi.ref
Note that if one does NOT want to include phase extension, then the
first "1" should be changed to zero and the line containing
"extrfl.d" should be omitted (see BNDRY write-up).
--- Sample input file sloext.d, controlling no. of averaging cycles
and phase extension information ---
pdc.pam
3. 2.75 8
extnd.d
Note that if one does NOT want to include phase extension, then the
value of the two resolution limits should be made equal and the 8
changed to 16 to do a total of 16 averaging cycles (see SLOEXT
write-up).
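The edit to "extnd.d" described in the first note above can be sketched
as follows. This is an illustration only: the function name no_extension
is hypothetical (not part of the PHASES package), and it assumes the
layout of the sample file, with the phase extension flag as the first
item on the third line and "extrfl.d" on a line of its own. The
corresponding "sloext.d" changes are not attempted here (see SLOEXT
write-up).

```sh
#!/bin/sh
# Hypothetical helper (not part of PHASES): print a copy of an
# extnd.d-style file with phase extension switched off, i.e. the leading
# "1" on line 3 changed to "0" and the "extrfl.d" line dropped.
no_extension() {
    awk 'NR == 3 && $1 == "1" { sub(/1/, "0") }
         $1 != "extrfl.d"     { print }' "$1"
}

# Example use (commented out so the block has no side effects):
# no_extension extnd.d > extnd_noext.d
```

Applied to the sample "extnd.d" above, the third line becomes
"0 2.75 1. 0 0" and the remaining lines are unchanged.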
************* procedure extndavg.sh ***************
# MODIFIED TO INCLUDE NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extndavg.l
#
# PERFORM THE REFINEMENT/PHASE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extndavg.l
rm four.ref
#
# EXTRACT REGION FROM MAP APPROPRIATE FOR AVERAGING
extrmap < extrmap.d >> extndavg.l
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPE
mapavg < mapavg.d >> extndavg.l
rm asu.map
#
# REBUILD THE COMPLETE UNIT CELL
bldcel < bldcel.d >> extndavg.l
rm four.map asu.avg
mv avgcell.map four.map
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> extndavg.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extndavg.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES
#
bndry < extnd.d >> extndavg.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extndavg.l
#
done
#
mv newphi.ref phiextndavg.31
mv minv.ref allcoef.31
# THATS ALL
************* procedure extndavg.com ***************
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDAVG.L
$ASSIGN EXTNDAVG.L SYS$OUTPUT
$!
$! MODIFIED TO INCLUDE NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING.
$! RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
$!
$COPY PHI16CY.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
$! THIRD MASK
$!
$LOOP3:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE3
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! EXTRACT REGION FROM MAP APPROPRIATE FOR AVERAGING
$ASSIGN EXTRMAP.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPE
$ASSIGN MAPAVG.DAT FOR005
$MAPAVG
$DEASSIGN FOR005
$DELETE ASU.MAP;*
$!
$! REBUILD THE COMPLETE UNIT CELL
$ASSIGN BLDCEL.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$DELETE ASU.AVG;*
$RENAME AVGCELL.MAP FOUR.MAP
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK3
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES
$!
$ASSIGN EXTND.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP3
$!
$DONE3:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTNDAVG.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$!
$! THATS ALL
***** SAMPLE INPUT FILES FOR AVERAGING WITH MULTIPLE CRYSTALS *****
Sample input files for the averaging steps follow, along with a
listing of the supplied template command files "extndavg_mc.sh" and
"extndavg_mc.com". The command files can be used in place of the
normal "extnd.sh" or "extnd.com" file in a solvent levelling run.
They will perform 16 cycles of averaging/solvent flattening/phase
extension for each crystal starting with the phases in files
"phi16cy_1.31" and "phi16cy_2.31" for crystals 1 and 2, respectively,
and combining the "averaged" phases with MIR, SIR phase information
in files "phasit_1.31" and "phasit_2.31" for crystals 1 and 2,
respectively. They will also extend phases to additional amplitudes on
files "extrfl_1.d" and "extrfl_2.d" for crystals 1 and 2,
respectively. The scripts assume that all of the files needed for a
normal solvent flattening run (fft.d, bnd2.d, etc) are available
for each crystal, and that the third mask from a previous run
(mask3.14) is still available for solvent flattening in each crystal.
In order to keep input data and files associated with the proper
crystal, the "normal" file names should have an "_N" inserted
immediately preceding the extension, i.e. fft_1.d, fft_2.d etc would
replace fft.d for crystals 1 and 2, respectively. If the template
script file is to be used unchanged, then all filenames should be
EXACTLY as in the examples (except for the standard parameter file).
Only the data relating to submap ranges and the NC operators should be
changed. The final phases will be written to files "phiextndavg_1.31"
and "phiextndavg_2.31" for crystals 1 and 2, respectively, and printed
information will be written to "extndavg_mc.l". The procedure is run
simply by entering "sh extndavg_mc.sh" (UNIX) or "@EXTNDAVG_MC.COM"
(VMS). For each crystal it assumes that the following files exist
where "N" is replaced by the crystal number, and that the file
"mapavg_mc.d" exists to control the averaging.
phi16cy_N.31   Starting phases, to get first map
fft_N.d        fft grid info
extrmap_N.d    submap extraction info
asu_N.msk      averaging mask
bldcel_N.d     info for reconstruction of full cell map from submap
bnd2_N.d       solvent flattening info
mask3_N.14     solvent flattening mask
minv2_N.d      map inversion info
extnd_N.d      phase combination info
phasit_N.31    Anchor phases, to be combined with map inverted phases
extrfl_N.d     Additional reflections, if phase extension requested.
Additionally, it is assumed that ALL file names (apart from the
parameter files) REFERENCED WITHIN the control files above (e.g.
files referred to within fft_N.d, extrmap_N.d, bldcel_N.d, bnd2_N.d,
minv2_N.d and extnd_N.d) also include the appropriate "_N" insertion
modifying the "standard" file names to distinguish data for different
crystals. Some examples are given below.
-- Sample input files fft_1.d and fft_2.d for two crystals ---
pdc1.pam                        pdc2.pam
COMPUTE DENSITY MAP             COMPUTE DENSITY MAP
0 144 80 120 1 0 20 0 0 0 0     0 80 128 120 1 0 20 0 0 0 0
four_1.ref                      four_2.ref
four_1.map                      four_2.map
-- Sample input files extrmap_1.d and extrmap_2.d ----
pdc1.pam                        pdc2.pam
four_1.map                      four_2.map
asu_1.map                       asu_2.map
-.42 .49 -.56 .59 -.13 .63      -.62 .77 -.31 .35 -.25 .92
--- Sample input file mapavg_mc.d, for averaging over two copies
in crystal 1 and four copies in crystal 2 ---
pdc1.pam
2
asu_1.map
asu_1.msk
asu_1.avg
2 1 2
78.140 95.988 179.646 0.796 -0.67 0.239 0.152
asu_2.map
asu_2.msk
asu_2.avg
4 1 2 3 4
77.369 83.131 179.597 9.935 3.468 2.652 0.194
163.989 115.624 -177.918 -7.586 -5.270 35.599 0.333
184.007 27.036 180.196 1.551 -.575 38.236 0.451
283.568 92.472 -28.468 -5.787 -15.019 0.729 37.247
--- Sample input files bldcel_1.d and bldcel_2.d, builds complete
cell from averaged submaps for each crystal ---
pdc1.pam                        pdc2.pam
four_1.map                      four_2.map
avgcell_1.map                   avgcell_2.map
asu_1.avg                       asu_2.avg
asu_1.msk                       asu_2.msk
0                               0
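The "_N" naming convention used in the samples above can be generated
mechanically. The sketch below is illustrative only: crystal_name and
crystal_files are hypothetical helper names (not part of the PHASES
package), and the list of base names is the one given in the comments of
the template script that follows ("extrfl.d" is left out, since it is
needed only when phase extension is requested).

```sh
#!/bin/sh
# Hypothetical helper (not part of PHASES): insert "_N" just before the
# file extension, e.g. "fft.d" and crystal 2 give "fft_2.d".
crystal_name() {
    name=$1
    n=$2
    base=${name%.*}     # part before the last "."
    ext=${name##*.}     # part after the last "."
    echo "${base}_${n}.${ext}"
}

# Hypothetical helper: list the per-crystal files expected by
# extndavg_mc.sh for crystal number $1, one name per line.
crystal_files() {
    for f in phi16cy.31 fft.d extrmap.d asu.msk bldcel.d bnd2.d \
             mask3.14 minv2.d extnd.d phasit.31
    do
        crystal_name "$f" "$1"
    done
}

# Example use (commented out so the block has no side effects):
# for f in `crystal_files 1` `crystal_files 2`; do test -r "$f" || echo "missing: $f"; done
```

The helpers assume every base name contains a "." separating name and
extension, which holds for all of the files listed.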
************* procedure extndavg_mc.sh ***************
# SCRIPT FOR NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING IN THE CASE
# WHERE MULTIPLE CRYSTAL FORMS ARE USED
#
# This sample script is appropriate for the case where two crystals
# are to be averaged. It can readily be modified to include more
# crystals, by making additions in the four places as indicated.
#
# The single file "mapavg_mc.d" containing input for the mapavg
# program is assumed to be present to control the multi-crystal map
# averaging process. In addition, a series of files specific to each
# crystal is needed as described below.
#
# For each of the "N" crystals the following files are assumed to
# exist, where the "N" in the file name is to be replaced by the
# crystal number, i.e. 1, 2, 3, etc.
#
# phi16cy_N.31 Starting phases, to get first map
# fft_N.d fft grid info
# extrmap_N.d submap extraction info
# asu_N.msk averaging mask
# bldcel_N.d info for reconstruction of full cell map
# bnd2_N.d solvent flattening info
# mask3_N.14 solvent flattening mask
# minv2_N.d map inversion info
# extnd_N.d phase combination info
# phasit_N.31 Anchor phases, for combining with inverted phases
# extrfl_N.d Additional reflections, if phase extension requested
#
# Also, to distinguish data specific for each crystal all file
# names (other than the parameter files) REFERENCED WITHIN the files
# above should also have an "_N" inserted just prior to the file
# extension, where N is the crystal number.
#
# INITIALIZE TEMPORARY FILE NAMES FOR EACH CRYSTAL
#
# INITIALIZATION FOR CRYSTAL 1
cp phi16cy_1.31 newphi_1.ref
ln phasit_1.31 minv_1.ref
#
# INITIALIZATION FOR CRYSTAL 2
cp phi16cy_2.31 newphi_2.ref
ln phasit_2.31 minv_2.ref
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
# PERFORM 16 CYCLES OF PHASE EXTENSION/AVERAGING, USING THIRD
# MASK FOR EACH CRYSTAL
for cycle
in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
do
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 1
rm minv_1.ref
mv newphi_1.ref four_1.ref
fsfour < fft_1.d >> extndavg_mc.l
rm four_1.ref
#
extrmap < extrmap_1.d >> extndavg_mc.l
#
#
# COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 2
rm minv_2.ref
mv newphi_2.ref four_2.ref
fsfour < fft_2.d >> extndavg_mc.l
rm four_2.ref
#
extrmap < extrmap_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 8 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
#
#
# HAVE ALL THE NECESSARY MAPS, NOW DO THE AVERAGING. (THIS STEP
# DONE ONLY ONCE, SINCE MAPAVG HANDLES ALL CRYSTALS AT SAME TIME)
#
# AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPES
mapavg < mapavg_mc.d >> extndavg_mc.l
#
#
#
# NOW DO THE SOLVENT FLATTENING, INVERSION AND PHASE COMBINATION
# SEPARATELY FOR EACH CRYSTAL
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 1
rm asu_1.map
bldcel < bldcel_1.d >> extndavg_mc.l
rm four_1.map asu_1.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 1
ln mask3_1.14 mask_1.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_1.d >> extndavg_mc.l
rm avgcell_1.map mask_1.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 1
mapinv < minv2_1.d >> extndavg_mc.l
rm mod_1.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 1
bndry < extnd_1.d >> extndavg_mc.l
#
#
# REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 2
rm asu_2.map
bldcel < bldcel_2.d >> extndavg_mc.l
rm four_2.map asu_2.avg
#
# MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
# CRYSTAL 2
ln mask3_2.14 mask_2.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2_2.d >> extndavg_mc.l
rm avgcell_2.map mask_2.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 2
mapinv < minv2_2.d >> extndavg_mc.l
rm mod_2.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
# AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 2
bndry < extnd_2.d >> extndavg_mc.l
#
# REPEAT ABOVE 20 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
done
#
# RENAME THE FINAL OUTPUT PHASE FILES FOR EACH CRYSTAL
#
# FOR CRYSTAL 1
mv newphi_1.ref phiextndavg_1.31
mv minv_1.ref allcoef_1.31
#
# FOR CRYSTAL 2
mv newphi_2.ref phiextndavg_2.31
mv minv_2.ref allcoef_2.31
#
# REPEAT ABOVE 4 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
# ADJUSTING THE FILE NAMES ACCORDINGLY
#
# THATS ALL
************* procedure extndavg_mc.com ***************
$SET NOVERIFY
$! MODIFIED FOR NONCRYSTALLOGRAPHIC SYMMETRY AVERAGING IN THE CASE
$! WHERE MULTIPLE CRYSTAL FORMS ARE USED
$!
$! THIS SAMPLE CONTROL FILE IS APPROPRIATE FOR THE CASE WHERE TWO
$! CRYSTALS ARE TO BE AVERAGED. IT CAN READILY BE MODIFIED TO INCLUDE
$! MORE CRYSTALS, BY MAKING ADDITIONS IN THE FOUR PLACES AS INDICATED.
$!
$! THE SINGLE FILE "MAPAVG_MC.DAT" CONTAINING INPUT FOR THE MAPAVG
$! PROGRAM IS ASSUMED TO BE PRESENT TO CONTROL THE MULTI-CRYSTAL MAP
$! AVERAGING PROCESS. IN ADDITION, A SERIES OF FILES SPECIFIC TO EACH
$! CRYSTAL IS NEEDED AS DESCRIBED BELOW.
$!
$! FOR EACH OF THE "N" CRYSTALS THE FOLLOWING FILES ARE ASSUMED TO
$! EXIST, WHERE THE "N" IN THE FILE NAME IS TO BE REPLACED BY THE
$! CRYSTAL NUMBER, I.E. 1, 2, 3, ETC.
$!
$! PHI16CY_N.31 STARTING PHASES, TO GET FIRST MAP
$! FFT_N.DAT FFT GRID INFO
$! EXTRMAP_N.DAT SUBMAP EXTRACTION INFO
$! ASU_N.MSK AVERAGING MASK
$! BLDCEL_N.DAT INFO FOR RECONSTRUCTION OF FULL CELL MAP
$! BND2_N.DAT SOLVENT FLATTENING INFO
$! MASK3_N.14 SOLVENT FLATTENING MASK
$! MINV2_N.DAT MAP INVERSION INFO
$! EXTND_N.DAT PHASE COMBINATION INFO
$! PHASIT_N.31 ANCHOR PHASES, FOR COMBINING WITH INVERTED PHASES
$! EXTRFL_N.D ADDITIONAL REFLECTIONS, IF PHASE EXTENSION REQUESTED
$!
$! ALSO, TO DISTINGUISH DATA SPECIFIC FOR EACH CRYSTAL ALL FILE
$! NAMES (OTHER THAN THE PARAMETER FILES) REFERENCED WITHIN THE FILES
$! ABOVE SHOULD ALSO HAVE AN "_N" INSERTED JUST PRIOR TO THE FILE
$! EXTENSION, WHERE N IS THE CRYSTAL NUMBER.
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDAVG_MC.L
$ASSIGN EXTNDAVG_MC.L SYS$OUTPUT
$!
$!
$! INITIALIZATION FOR CRYSTAL 1
$COPY PHI16CY_1.31 NEWPHI_1.REF
$COPY PHASIT_1.31 MINV_1.REF
$COPY MASK3_1.14 MASK_1.MAP
$!
$! INITIALIZATION FOR CRYSTAL 2
$COPY PHI16CY_2.31 NEWPHI_2.REF
$COPY PHASIT_2.31 MINV_2.REF
$COPY MASK3_2.14 MASK_2.MAP
$!
$! REPEAT ABOVE 5 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$!
$!
$! PERFORM 16 CYCLES OF PHASE EXTENSION/AVERAGING, USING THIRD
$! MASK FOR EACH CRYSTAL
$!
$CYCLE = 0
$LOOP:
$CYCLE = CYCLE + 1
$!
$! COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 1
$DELETE MINV_1.REF;*
$RENAME NEWPHI_1.REF FOUR_1.REF
$ASSIGN FFT_1.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR_1.REF;*
$!
$ASSIGN EXTRMAP_1.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! COMPUTE THE FULL CELL MAP AND EXTRACT SUBMAP FOR CRYSTAL 2
$DELETE MINV_2.REF;*
$RENAME NEWPHI_2.REF FOUR_2.REF
$ASSIGN FFT_2.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR_2.REF;*
$!
$ASSIGN EXTRMAP_2.DAT FOR005
$EXTRMAP
$DEASSIGN FOR005
$!
$! REPEAT ABOVE 12 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$!
$!
$! HAVE ALL THE NECESSARY MAPS, NOW DO THE AVERAGING. (THIS STEP
$! DONE ONLY ONCE, SINCE MAPAVG HANDLES ALL CRYSTALS AT SAME TIME)
$!
$! AVERAGE THE ELECTRON DENSITY, WITHIN THE MOLECULAR ENVELOPES
$ASSIGN MAPAVG_MC.DAT FOR005
$MAPAVG
$DEASSIGN FOR005
$!
$!
$!
$! NOW DO THE SOLVENT FLATTENING, INVERSION AND PHASE COMBINATION
$! SEPARATELY FOR EACH CRYSTAL
$!
$! REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 1
$DELETE ASU_1.MAP;*
$ASSIGN BLDCEL_1.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR_1.MAP;*
$DELETE ASU_1.AVG;*
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
$! CRYSTAL 1
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2_1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE AVGCELL_1.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 1
$ASSIGN MINV2_1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD_1.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 1
$ASSIGN EXTND_1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$!
$! REBUILD THE COMPLETE UNIT CELL FOR CRYSTAL 2
$DELETE ASU_2.MAP;*
$ASSIGN BLDCEL_2.DAT FOR005
$BLDCEL
$DEASSIGN FOR005
$DELETE FOUR_2.MAP;*
$DELETE ASU_2.AVG;*
$!
$! MODIFY AVERAGED ELECTRON DENSITY MAP ACCORDING TO MASK FOR
$! CRYSTAL 2
$! use .06 for 3A,.086 for 3.5 and .112 for 4A
$ASSIGN BND2_2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE AVGCELL_2.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS FOR CRYSTAL 2
$ASSIGN MINV2_2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD_2.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES FOR CRYSTAL 2
$ASSIGN EXTND_2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! REPEAT ABOVE 28 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$IF CYCLE .LT. 16 THEN GOTO LOOP
$!
$!
$! RENAME THE FINAL OUTPUT PHASE FILES FOR EACH CRYSTAL
$!
$! FOR CRYSTAL 1
$DELETE MASK_1.MAP;*
$RENAME NEWPHI_1.REF PHIEXTNDAVG_1.31
$RENAME MINV_1.REF ALLCOEF_1.31
$PURGE ALLCOEF_1.31
$!
$! FOR CRYSTAL 2
$DELETE MASK_2.MAP;*
$RENAME NEWPHI_2.REF PHIEXTNDAVG_2.31
$RENAME MINV_2.REF ALLCOEF_2.31
$PURGE ALLCOEF_2.31
$!
$! REPEAT ABOVE 6 LINES FOR EACH ADDITIONAL CRYSTAL, IF ANY,
$! ADJUSTING THE FILE NAMES ACCORDINGLY
$!
$! THATS ALL
$DEASSIGN SYS$OUTPUT
11.00 DENSITY MODIFICATION WITH MOLECULAR REPLACEMENT
DERIVED PHASE INFORMATION
When the initial source of phase information is from a model
derived by molecular replacement techniques, it is still sometimes
desirable to improve the phases by solvent flattening and/or NC
symmetry averaging. This may be the case when the molecular replacement
derived model represents only a fraction of the asymmetric unit
contents, and the missing parts of the structure must still be found.
In such cases the solvent flattening etc. iterations require creation
of phase probability distribution coefficients for the partial
structure model to be used as "anchor" phases in the phase combination
step. Also, it is desirable to do the iterations on 2FO-FC maps rather
than on the normal FOM*FO maps, since the "missing" parts of the
structure will then contribute more to the maps. Thus one must assure
that both Fo and Fc are present on the phase files. This all can be
accomplished without modification to the "doall" script by doing the
following:
1) Generate phases from the partial structure in PHASIT, structure
factor calculation mode, requesting the "short form" output, i.e.
using IHLCF=0, ISIGA=0. This file contains both Fo and Fc, and should
be called "phasit.31". (Don't worry that the file does not contain
distribution coefficients as a normal "phasit.31" file would; in this
case the file will be used only to seed the process by creating the
first map.)
2) Generate the same phases again from the partial structure in
PHASIT, structure factor calculation mode, but this time request the
"long form" output, i.e. using IHLCF=1, ISIGA=0 or 1. Call this file
"anchor.31". It includes probability distribution coefficients and
will serve as the "anchor" phases, which map inverted phase
information will be combined with on each iteration.
3) Modify the fft.d input file to request a 2Fo-Fc map instead of
an Fo map.
4) Modify the bnd3.d (and possibly extnd.d) input files to specify
"anchor.31" to be used instead of "phasit.31" for the anchor phase
set, and set IOTYP=1 so that both Fo and Fc appear on the output file.
5) Modify the rmhv.d input file to specify that no heavy atoms,
i.e. 0 input atoms, are used.
You can now run the "doall" procedure, and 2Fo-Fc maps will be used
for all calculations including those used during mask construction.
Note however, that the output files (phi4cy.31, phi8cy.31, phi16cy.31
etc.) will now contain Fo and Fc in the amplitude slots instead of
the normal fom*fo and fo, thus they can NOT be used to compute figure
of merit weighted maps in FSFOUR. They can however, be used for Fo
and difference type maps. The figure of merit is still present in
the file, but it will not be applied in FSFOUR.
Finally, if non-crystallographic symmetry averaging is to be
performed in addition to solvent flattening one can continue the
process as described in the AVERAGING DIFFERENCE or 2FO-FC MAPS
section.
12.00 PHASE EXTENSION
Phases (and optionally amplitudes) can be extended, either to
higher resolution or to missing reflections within the original
resolution limit, by modifying an electron density map created with
the known phases, inverting the modified map and combining the map
inverted structure factors with the initial data via option 3 of the
BNDRY program. To extend phases to reflections for which amplitudes
are available, a file must first be created using the program MISSNG,
which generates a list of reflections (file "extrfl.d") for which at
least amplitudes are available, and possibly phase information. Then
the file "sloext.d" must be created (see write-up for SLOEXT) to
control the limits and rate of phase extension, and the file "extnd.d"
or "extnda.d" must be created to control the phase combination step in
BNDRY. Once these files are created, and assuming that the final
solvent mask (mask3.14), phase files (phasit.31 & phi16cy.31), and
input data files (e.g. fft.d, minv2.d, bnd2.d) from a prior "doall"
solvent flattening run are still available, one can execute the
"extnd.sh" script to carry out the phase extension iterations. The
resolution is gradually extended out to the limit specified in the
sloext.d file, with the final phases written to "phiextnd.31". One can
do phase AND AMPLITUDE extension similarly, by preparing "extnda.d"
and "sloext2.d" files, and executing "extnda.sh".
The phase extension process is even more powerful if density
modification in addition to solvent flattening/negative density
truncation is included, such as noncrystallographic symmetry averaging.
In such cases the script "extndavg.sh" can be used, which requires all
of the files utilized by "extnd.sh", plus the "extrmap.d", "mapavg.d",
"bldcel.d" and "asu.msk" files needed for NC symmetry averaging (see
the noncrystallographic symmetry section of the write-up).
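Because the template scripts delete and rename files as they run, it may
be worth confirming that the required files are present before starting.
The following is a minimal sketch, not part of the PHASES package: the
function name check_files is hypothetical, and the file list is assembled
from the descriptions above (the standard parameter file and any files
named inside the control files are not checked).

```sh
#!/bin/sh
# Hypothetical helper (not part of PHASES): report any unreadable files.
# Prints "missing: NAME" for each one and returns non-zero if any is absent.
check_files() {
    missing=0
    for f in "$@"
    do
        if test ! -r "$f"
        then
            echo "missing: $f"
            missing=1
        fi
    done
    return $missing
}

# Files assumed by extndavg.sh according to the text above.
NEEDED="mask3.14 phasit.31 phi16cy.31 fft.d minv2.d bnd2.d \
sloext.d extnd.d extrfl.d extrmap.d mapavg.d bldcel.d asu.msk"

# Example use (commented out so the block has no side effects):
# check_files $NEEDED && sh extndavg.sh
```

With the example line enabled, the averaging script is started only if
every listed file is readable.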
When doing phase extension to higher resolution it is important to
compute the map on a grid whose spacing is NO COARSER THAN one third
of the smallest d spacing to be encountered anywhere in the process,
and to specify
Miller index ranges during map inversion ("minv2.d" file) which
satisfy the highest resolution desired. Note that if the extension
is substantial, this may require regenerating the solvent mask
(and therefore the averaging mask "asu.msk") on a finer grid than
was originally used. Best results are obtained when the extension
is carried out slowly, with at least 5 iterations per extension
step. Sample scripts and template files are provided for both UNIX
systems (as described here), and VMS systems (with the corresponding
data files having ".dat" extensions and the control files having
".com" extensions). Samples are also given in the EXAMPLES section.
13.00 MAD PHASING
The PHASES package can be used to determine phase angles from MAD
(Multiple wavelength Anomalous Dispersion) data, by treating the data
from each wavelength as a native anomalous scattering, isomorphous
replacement or derivative anomalous scattering data set, and then
combining information from all sets in the conventional manner. This
is facilitated by the ability to input scattering factor information
to the PHASIT program. One simply adjusts the input scattering
factors and data appropriately for the wavelength and data set type
desired. For example, consider the case where data has been measured
at three wavelengths, and Bijvoet pairs were measured in all sets. A
reasonable strategy would be to:
1) Select one of the data sets to be the "native." For this set we
would prefer no Bijvoet signal to be present, so we might pick
a wavelength where delta f" is near zero. Note however, that we can
use any wavelength, even if delta f" is large, provided we first
AVERAGE both members of the Bijvoet pair for each acentric reflection.
Indeed, it is often desirable to choose a wavelength where delta f'
is near zero, allowing delta f" to be appreciable, since its effects
can be reduced or removed by the averaging. Thus if delta f" is large
be sure to include acentric reflections ONLY IF BOTH MEMBERS OF THE
BIJVOET PAIR WERE EXPLICITLY MEASURED AND AVERAGED! This is the data
set that will actually be phased and eventually used for map
calculations, thus it should include the centric reflections as well.
All other sets will be scaled to it. Since the REAL part of the
anomalous scattering correction is NOT removed by the averaging, we
will need to know what delta f' is at this wavelength. Let the real
and imaginary components of the anomalous dispersion scattering
factor corrections at this wavelength be called delta f'(N) and
delta f"(N).
2) Select another data set at a wavelength D1 which maximizes the
magnitude of ( delta f'(D1) - delta f'(N) ). For this set, average
the Bijvoet pairs for all acentric reflections as in (1), to remove
the contribution from delta f"(D1), and include the centric data
as well. This set can then be merged with the "native" in CMBISO
to form an "isomorphous" derivative scaled set. It can then be used
in PHASIT as an SIR data set, but since the only difference in the
scattering between this and the "native" is due to differences in
delta f', you must input the appropriate scattering factors. Thus
input zeroes for the 9 normal scattering factor coefficients, but
input ( delta f'(D1) - delta f'(N) ) for the REAL part, and
delta f"(D1) for the IMAGINARY part of the anomalous correction.
Be careful of the sign when doing the subtraction. (Note that the
delta f"(D1) term will not be used in the SIR calculation, thus it
does not have to be input as zero which you might have expected).
3) Select another data set at a wavelength D2 which maximizes
delta f"(D2). For this set, DO NOT average the Bijvoet pairs. Simply
merge the data with the "native" set (created in (1)) with CMBANO to
generate a scaled "anomalous" set. The output data can then be used
in PHASIT as a "derivative anomalous scattering" data set, but since
the difference in scattering between this and the "native" is due to
both the difference in delta f' and the effect of delta f"(D2),
you must again adjust the scattering factors accordingly. Input zeros
for the 9 normal scattering factor coefficients, and input
( delta f'(D2) - delta f'(N) ) for the REAL and delta f"(D2) for the
IMAGINARY parts of the anomalous scattering correction. Again, be
careful of the sign when doing the subtraction. In this case both
the real and imaginary components will be used.
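The scattering factor bookkeeping in (2) and (3) reduces to a single
subtraction whose sign is easy to get wrong. The sketch below is
illustrative only: mad_correction is a hypothetical helper name (not part
of the PHASES package), and the numerical values in the usage comment are
invented for the example rather than taken from any real element or
wavelength.

```sh
#!/bin/sh
# Hypothetical helper (not part of PHASES): anomalous correction terms to
# enter for a MAD "derivative" set measured at wavelength D, relative to
# the "native" wavelength N.
# Arguments: delta f'(D), delta f'(N), delta f"(D).
# Prints:    REAL part = delta f'(D) - delta f'(N), then
#            IMAGINARY part = delta f"(D).
mad_correction() {
    awk -v fpd="$1" -v fpn="$2" -v fppd="$3" 'BEGIN {
        printf "%.2f %.2f\n", fpd - fpn, fppd
    }'
}

# Example with invented values (not real scattering factors):
# mad_correction -9.5 -2.0 3.0     prints "-7.50 3.00"
```

In the invented example the REAL part is negative because delta f'(D) is
more negative than delta f'(N); the 9 normal scattering factor
coefficients would still be entered as zeros.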
The "isomorphous" and/or "anomalous" scaled files prepared in (2)
and (3) can initially be used in the normal manner to locate the
anomalously scattering atoms from difference or Bijvoet difference
Patterson maps (see flowchart section), and possibly for initial
heavy atom refinement in GREF. Note however, that if one is using
"isomorphous" data sets as described in (2) above with program
MRGDF to compute difference or cross difference Fouriers and the
sign of ( delta f'(D1) - delta f'(N) ) is negative, then peaks in
the map corresponding to the isomorphous scatterers should also be
negative. In that case one can then request that program PSRCH list
only negative peaks to check the sites. This is necessary ONLY for
ISOMORPHOUS data sets in which the REAL part of the derivative
minus native scattering factor is expected to be negative. The
same thing holds for "difference" or "double difference" Fouriers
computed with the difference files generated by program PHASIT
when MAD data sets are used and the file corresponds to an
ISOMORPHOUS data set with (delta f'(D) - delta f'(N)) negative.
Once the anomalous scatterers have been found, the isomorphous
and anomalous scaled data from (2) and (3) can be used simultaneously
in PHASIT to compute SIRAS phases, which can then be used for map
computations or solvent flattening in the normal manner. One should
first carry out phase refinement in PHASIT, refining the heavy atom
parameters and scale factors. If the wavelengths were chosen as
described above these two sets should provide the greatest phasing
power, since the isomorphous and anomalous signals were maximized
in each case. It is not necessarily all that can be done however.
For example, the same data set (averaged Bijvoet mates) which was
utilized in (2) to create an "isomorphous" set, can also be
processed (without averaging Bijvoet mates) as in (3) to create
another "derivative anomalous scattering" set. In that case the
appropriate anomalous scattering correction factors would be
( delta f'(D1) - delta f'(N) ) and delta f"(D1). Likewise, the
data set (unaveraged Bijvoet mates) used in (3) to get the
derivative anomalous set, can also be processed (after averaging
Bijvoet mates) as in (2) to get another "isomorphous" set. For the
new isomorphous set the appropriate anomalous scattering correction
factors would be ( delta f'(D2) - delta f'(N) ) and delta f"(D2).
Finally, if the original "native" data set was collected at a
wavelength where delta f"(N) is appreciable, then it too can be
included (without averaging Bijvoet mates) as a "native anomalous
scattering" data set. In that case the appropriate anomalous
scattering correction factors would be delta f'(N) and delta f"(N).
(Note that the real part will not be used in the calculations, and
in PHASIT native anomalous scattering data sets should come last in
the input). Thus with data at three wavelengths, one can combine up
to 5 different sources of phase information in PHASIT: two
isomorphous sets, two derivative anomalous scattering sets, and one
native anomalous scattering set. Although some of these sets provide
essentially the same (redundant) information, the experimental errors
will be different in each set, thus inclusion of all of them may still
be helpful. It may be useful to try various combinations.
MAD PHASING AT TWO WAVELENGTHS
A procedure similar to that above can be used when anomalous
scattering data has been collected only at two wavelengths. In
that case a "native" set is selected and processed as in (1), to
remove contributions from delta f"(N). The other data set is
first processed and merged with the native as in (2) to create an
"isomorphous" set, and then processed and merged again (this time
NOT averaging Bijvoet mates) as in (3) to create a "derivative
anomalous scattering" set. If the "native" set was taken at a
wavelength where delta f"(N) is appreciable, then the original
"native" data (this time WITHOUT averaging Bijvoet mates) can also
be used as a "native anomalous scattering" data set. Thus even
with data at only two wavelengths it is possible to obtain and
combine phase information from three sources: native anomalous
scattering, derivative isomorphous replacement, and derivative
anomalous scattering.
14.00 VMS USER INFORMATION
Use of the PHASES package on VMS systems is very similar to its use
on UNIX systems, except that ".com" command files are used instead of
".sh" shell scripts. In all cases the programs function identically,
and all input and output is the same. A command procedure to be
executed from each user's login.com file will define all of the programs
so that they can be run simply by entering the program name (as in
UNIX systems). However, one cannot use the UNIX input and output
redirection operators (<, <<, >, >>), so that for the non-interactive
programs input data must either immediately follow the program name
(on subsequent lines), or come from a file which has been "ASSIGNed"
to FOR005 (and "DEASSIGNed" upon program completion). Likewise, to
direct standard output to a file one must ASSIGN the file to FOR006.
Also, the different byte order and floating point format make it
difficult, if not impossible, to use any binary files on other
computer systems. This is not normally a problem since the only binary
files which typically need to be transferred are graphics map files for
use with programs TOM, O or CHAIN on graphics workstations. The VMS
version of program GMAP which creates these files contains special
code within it such that the binary map files it produces may be
transferred (via ftp, type binary) and used DIRECTLY on the Silicon
Graphics or ESV workstations where the graphics programs will be run.
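The correspondence between UNIX redirection and the VMS logical names
FOR005 and FOR006 described above can be illustrated with a small
sketch. The shell function below only prints the DCL lines that would
stand in for "program < input > output"; the name dcl_run is a
hypothetical helper and is not part of the PHASES package. Note that
the VMS procedures listed later in this manual send printed output to a
file by assigning SYS$OUTPUT instead of FOR006.

```shell
#!/bin/sh
# Sketch: print the DCL lines corresponding to the UNIX command
#     program < input > output
# dcl_run is a HYPOTHETICAL helper name, used only for illustration.

# dcl_run PROGRAM INPUT_FILE [OUTPUT_FILE]
dcl_run() {
    prog=$1
    input=$2
    output=$3
    if [ -n "$output" ]; then
        # standard output is unit 6, i.e. logical name FOR006
        printf '%s\n' "\$ASSIGN $output FOR006"
    fi
    # standard input is unit 5, i.e. logical name FOR005
    printf '%s\n' "\$ASSIGN $input FOR005"
    printf '%s\n' "\$$prog"
    printf '%s\n' "\$DEASSIGN FOR005"
    if [ -n "$output" ]; then
        printf '%s\n' "\$DEASSIGN FOR006"
    fi
}

# UNIX:  fsfour < fft.d > mask1.l
dcl_run FSFOUR FFT.DAT MASK1.L
```

When the third argument is omitted, only the FOR005 lines are printed
and output goes wherever it is currently directed.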
Installing the PHASES package on a VMS system will involve four
steps. The first three steps should be done only once, whereas
the last step must be repeated any time a new user of the package
is added to the computer system. The steps are:
1) Edit the file "set_phases.com", which is present in the parent
directory as distributed. Only one line has to be changed, so that
the logical name "PHASES_DIR" points to the parent directory where
the software resides.
2) If one is on a DEC or VAX station instead of an ALPHA workstation,
one MAY have to edit the file "XSTUFF.OPT" (in the [.src] directory
below the parent directory) to point to the directory where the
system's X-Window object libraries are located. The supplied version
is appropriate for ALPHA workstations, but may need to be changed
for other workstations.
3) From the parent directory, type @BUILDIT.COM to invoke the
compilation and linking.
4) Have each user of the package insert the line
$@DISK:[DIRECTORY]SET_PHASES.COM
in his/her login.com file. Note that the DISK and DIRECTORY should
be changed to point to the appropriate PHASES parent directory as
in (1).
The installation will result in files being deposited in four
subdirectories below the parent directory. The subdirectories contain
the program source files, executables, write-up and sample template
files. The four directories have the logical names PHASES_SRC,
PHASES_EXE, PHASES_DOC and PHASES_TEMPL, respectively. Users may find
it desirable to copy the write-up and sample template files to their
own directory, thus the commands
COPY PHASES_DOC:PHASES.WUP *
COPY PHASES_TEMPL:*.com *
COPY PHASES_TEMPL:*.DAT *
will probably be useful. It is also recommended that at least one copy
of the PHASES.WUP manual be printed, but beware as it is large
(roughly 190 pages).
After the installation and execution of the login.com file, any
program in the package can be run simply by typing its name. Note
that if running in batch mode, one will have to insert the line
$SET DEFAULT DISK:[DIRECTORY]
at the beginning of every ".com" file, where DISK and DIRECTORY are
replaced by the disk and directory in which the user's input data
files are to be found.
15.00 For UNIX, the following scripts are invoked by the doall script
--- procedure mask1.sh ---
# COMPUTE ORIGINAL ELECTRON DENSITY MAP
#
ln phasit.31 four.ref
fsfour < fft.d > mask1.l
mv four.map orig.map
rm four.ref
ln orig.map four.map
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
#
rmheavy < rmhv.d >> mask1.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask1.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask1.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask1.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask1.l
mv mask.map mask1.14
rm four.map
# THATS ALL
--- procedure cycle4.sh ---
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
#
# use .06 for 3A data, .086 for 3.5A, and .112 for 4A data
mv orig.map four.map
ln mask1.14 mask.map
bndry < bnd2.d > cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
#
#
# PERFORM 3 MORE CYCLES OF REFINEMENT
#
for cycle
in 1 2 3
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle4.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask1.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle4.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle4.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle4.l
done
mv newphi.ref phi4cy.31
mv minv.ref allcoef.31
# THATS ALL
--- procedure mask2.sh ---
#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi4cy.31 four.ref
fsfour < fft.d > mask2.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask2.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask2.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask2.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask2.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask2.l
mv mask.map mask2.14
rm four.map
# THATS ALL
--- procedure cycle8.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 4 CYCLES OF REFINEMENT, USING SECOND MASK
#
for cycle
in 1 2 3 4
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle8.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask2.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle8.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle8.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle8.l
done
mv newphi.ref phi8cy.31
mv minv.ref allcoef.31
# THATS ALL
--- procedure mask3.sh ---
#
# COMPUTE ELECTRON DENSITY MAP
#
ln phi8cy.31 four.ref
fsfour < fft.d > mask3.l
rm four.ref
#
# REMOVE HEAVY ATOM PEAKS FROM MAP
rmheavy < rmhv.d >> mask3.l
mv nohv.map four.map
#
# INVERT MAP AFTER TRUNCATING DENSITY < 0
#
mapinv < minv1.d >> mask3.l
rm four.map
#
# MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
#
bndry < bnd0.d >> mask3.l
rm minv.ref
#
# COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
#
fsfour < fft.d >> mask3.l
rm four.ref
#
# DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
#
bndry < bnd1.d >> mask3.l
mv mask.map mask3.14
rm four.map
# THATS ALL
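The three mask procedures above differ only in the phase file they
start from, the names of the log and mask files, and the extra step in
mask1.sh that saves the original map as orig.map. A minimal sketch of a
single parametrized routine follows. The names make_mask, run and
DRY_RUN are hypothetical and are not part of the PHASES package. With
DRY_RUN=1 (the default in this sketch) the commands are only printed,
so the listing can be inspected without the PHASES programs installed.

```shell
#!/bin/sh
# Sketch: one routine in place of mask1.sh, mask2.sh and mask3.sh.
# make_mask, run and DRY_RUN are HYPOTHETICAL names for illustration.

DRY_RUN=${DRY_RUN:-1}

# run 'COMMAND' : print the command; execute it unless DRY_RUN=1
run() {
    printf '%s\n' "$1"
    if [ "$DRY_RUN" != 1 ]; then
        eval "$1"
    fi
}

# make_mask N PHASE_FILE : build maskN.14 from PHASE_FILE, log in maskN.l
make_mask() {
    n=$1
    phases=$2
    log=mask$n.l
    run "ln $phases four.ref"
    run "fsfour < fft.d > $log"          # compute electron density map
    run "rm four.ref"
    if [ "$n" = 1 ]; then
        run "mv four.map orig.map"       # first mask only: keep original map
        run "ln orig.map four.map"
    fi
    run "rmheavy < rmhv.d >> $log"       # remove heavy atom peaks
    run "mv nohv.map four.map"
    run "mapinv < minv1.d >> $log"       # invert truncated map
    run "rm four.map"
    run "bndry < bnd0.d >> $log"         # apply weighting function transform
    run "rm minv.ref"
    run "fsfour < fft.d >> $log"         # compute "smeared" map
    run "rm four.ref"
    run "bndry < bnd1.d >> $log"         # determine protein-solvent mask
    run "mv mask.map mask$n.14"
    run "rm four.map"
}

make_mask 1 phasit.31
```

Under these assumptions, "make_mask 2 phi4cy.31" and
"make_mask 3 phi8cy.31" would reproduce the command sequences of
mask2.sh and mask3.sh respectively.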
--- procedure cycle16.sh ---
#
# START OVER USING NEW MASK
#
cp phasit.31 newphi.ref
ln phasit.31 minv.ref
#
#
# PERFORM 8 CYCLES OF REFINEMENT, USING THIRD MASK
#
for cycle
in 1 2 3 4 5 6 7 8
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> cycle16.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> cycle16.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> cycle16.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < bnd3.d >> cycle16.l
done
mv newphi.ref phi16cy.31
mv minv.ref allcoef.31
# THATS ALL
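The cycle8.sh and cycle16.sh procedures above are identical apart from
the mask file, the number of cycles, and the log and output file names.
The sketch below expresses that loop once, with those four items as
arguments. The names refine, run and DRY_RUN are hypothetical and not
part of the PHASES package; with DRY_RUN=1 (the default here) the
commands are printed and not executed. cycle4.sh is not covered, since
it starts from orig.map before entering its loop.

```shell
#!/bin/sh
# Sketch: the refinement loop of cycle8.sh / cycle16.sh with parameters.
# refine, run and DRY_RUN are HYPOTHETICAL names for illustration.

DRY_RUN=${DRY_RUN:-1}

# run 'COMMAND' : print the command; execute it unless DRY_RUN=1
run() {
    printf '%s\n' "$1"
    if [ "$DRY_RUN" != 1 ]; then
        eval "$1"
    fi
}

# refine MASK_FILE NCYCLES LOG_FILE OUTPUT_PHASE_FILE
refine() {
    mask=$1
    ncyc=$2
    log=$3
    out=$4
    # start over from the original phases
    run "cp phasit.31 newphi.ref"
    run "ln phasit.31 minv.ref"
    i=1
    while [ "$i" -le "$ncyc" ]
    do
        run "rm minv.ref"
        run "mv newphi.ref four.ref"
        run "fsfour < fft.d >> $log"     # map from combined phases
        run "rm four.ref"
        run "ln $mask mask.map"
        run "bndry < bnd2.d >> $log"     # modify map according to mask
        run "rm four.map mask.map"
        run "mapinv < minv2.d >> $log"   # invert modified map
        run "rm mod.map"
        run "bndry < bnd3.d >> $log"     # combine phase information
        i=$((i + 1))
    done
    run "mv newphi.ref $out"
    run "mv minv.ref allcoef.31"
}

refine mask2.14 4 cycle8.l phi8cy.31
```

Under these assumptions, "refine mask3.14 8 cycle16.l phi16cy.31" would
print the command sequence of cycle16.sh.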
--- procedure extnd.sh ---
#
# RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
#
cp phi16cy.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext.d > extnd.l
#
# PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
# THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnd.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> extnd.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnd.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnd.d >> extnd.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext.d >> extnd.l
#
done
#
mv newphi.ref phiextnd.31
mv minv.ref allcoef.31
# THATS ALL
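The extnd.sh procedure above is controlled by the existence of the file
EXTND.TMP and not by a loop counter. The sketch below shows that
control pattern in isolation. The function fake_sloext is a
hypothetical stand-in for sloext: it simply stores a countdown in
EXTND.TMP and deletes the file when the countdown is used up. The
contents of the real control file and the real stopping rule are not
reproduced here. The helper run_extension is likewise hypothetical and
works in a temporary directory.

```shell
#!/bin/sh
# Sketch of the sentinel-file loop used by extnd.sh.
# fake_sloext and run_extension are HYPOTHETICAL stand-ins.

# fake_sloext N : first call creates EXTND.TMP holding N; each later call
# decrements the stored count and removes the file when it reaches zero
fake_sloext() {
    if [ ! -r EXTND.TMP ]; then
        echo "$1" > EXTND.TMP
        return 0
    fi
    left=$(( $(cat EXTND.TMP) - 1 ))
    if [ "$left" -le 0 ]; then
        rm EXTND.TMP                 # removing the file ends the loop
    else
        echo "$left" > EXTND.TMP
    fi
}

# run_extension N : run the loop in a scratch directory and print the
# number of iterations actually performed
run_extension() {
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        fake_sloext "$1"             # create the control file
        n=0
        while
            test -r EXTND.TMP
        do
            n=$((n + 1))             # fsfour, bndry, mapinv, bndry go here
            fake_sloext "$1"         # adjust count, maybe delete the file
        done
        echo "$n"
    )
    rm -rf "$workdir"
}

run_extension 3
```

Because the loop condition is tested only at the top of each pass, the
control program must be run once before the loop to create the file and
once at the end of every pass to update or delete it, exactly as in the
procedure above.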
--- procedure extnda.sh ---
#
# RESUME WHERE WE LEFT OFF
#
cp phiextnd.31 newphi.ref
ln phasit.31 minv.ref
#
# CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
sloext < sloext2.d > extnda.l
#
# PERFORM THE PHASE AND AMPLITUDE EXTENSION ITERATIONS USING
# THE THIRD MASK
#
# DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
# AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
# OF ITERATIONS IS REACHED)
while
test -r EXTND.TMP
do
#
# COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
#
rm minv.ref
mv newphi.ref four.ref
fsfour < fft.d >> extnda.l
rm four.ref
#
# MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK
#
ln mask3.14 mask.map
# use .06 for 3A,.086 for 3.5 and .112 for 4A
bndry < bnd2.d >> extnda.l
rm four.map mask.map
#
# INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
#
mapinv < minv2.d >> extnda.l
rm mod.map
#
# COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
#
bndry < extnda.d >> extnda.l
#
# ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
# DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
# THE LOOP)
sloext < sloext2.d >> extnda.l
#
done
#
mv newphi.ref phiextnda.31
mv minv.ref allcoef.31
# THATS ALL
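The procedures above remove and rename files without first checking
that their inputs exist, so a missing file may only show up part way
through a run. A minimal sketch of a check that could be placed at the
top of a procedure follows. The function name require_files is
hypothetical and not part of the PHASES package, and the file list in
the comment contains only the files named explicitly in extnda.sh; any
other files the programs open are not covered.

```shell
#!/bin/sh
# Sketch: verify that input files are readable before a run starts.
# require_files is a HYPOTHETICAL helper name, used for illustration.

# require_files FILE... : report each unreadable file on standard error;
# return 0 if all are readable, 1 otherwise
require_files() {
    missing=0
    for f in "$@"
    do
        if [ ! -r "$f" ]; then
            echo "missing input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use at the top of extnda.sh (assumed placement):
#   require_files phiextnd.31 phasit.31 mask3.14 fft.d bnd2.d \
#                 minv2.d extnda.d sloext2.d || exit 1
```

All missing files are reported in one pass, so the list can be
corrected at once before the run is repeated.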
16.00 For VMS, the following procedures are invoked by doall.com
--- procedure mask1.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK1.L
$ASSIGN MASK1.L SYS$OUTPUT
$!
$! COMPUTE ORIGINAL ELECTRON DENSITY MAP
$!
$COPY PHASIT.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$COPY FOUR.MAP ORIG.MAP
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$RENAME MASK.MAP MASK1.14
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure cycle4.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE4.L
$ASSIGN CYCLE4.L SYS$OUTPUT
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$RENAME ORIG.MAP FOUR.MAP
$COPY MASK1.14 MASK.MAP
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! PERFORM 3 MORE CYCLES OF REFINEMENT
$!
$CYCLE = 0
$LOOP:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK1
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 3 THEN GOTO LOOP
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI4CY.31
$RENAME MINV.REF ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure mask2.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK2.L
$ASSIGN MASK2.L SYS$OUTPUT
$!
$! COMPUTE ELECTRON DENSITY MAP
$!
$COPY PHI4CY.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$REN MASK.MAP MASK2.14
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure cycle8.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE8.L
$ASSIGN CYCLE8.L SYS$OUTPUT
$!
$! START OVER USING NEW MASK
$!
$COPY PHASIT.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK2.14 MASK.MAP
$!
$! PERFORM 4 CYCLES OF REFINEMENT USING 2ND MASK
$!
$CYCLE = 0
$LOOP1:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK2
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 4 THEN GOTO LOOP1
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI8CY.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure mask3.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO MASK3.L
$ASSIGN MASK3.L SYS$OUTPUT
$!
$! COMPUTE ELECTRON DENSITY MAP
$!
$COPY PHI8CY.31 FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! REMOVE HEAVY ATOM PEAKS FROM MAP
$ASSIGN RMHV.DAT FOR005
$RMHEAVY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$RENAME NOHV.MAP FOUR.MAP
$!
$! INVERT MAP AFTER TRUNCATING DENSITY < 0
$!
$ASSIGN MINV1.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! MULTIPLY FOURIER COEFFICIENTS BY TRANSFORM OF WEIGHTING FUNCTION
$!
$ASSIGN BND0.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE MINV.REF;*
$!
$! COMPUTE "SMEARED" MAP FROM MODIFIED COEFFICIENTS
$!
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! DETERMINE PROTEIN-SOLVENT BOUNDARY MASK FROM "SMEARED" MAP
$!
$ASSIGN BND1.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$REN MASK.MAP MASK3.14
$DELETE FOUR.MAP;*
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure cycle16.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO CYCLE16.L
$ASSIGN CYCLE16.L SYS$OUTPUT
$!
$! START OVER USING NEW MASK
$!
$COPY PHASIT.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! PERFORM 8 CYCLES OF REFINEMENT USING 3RD MASK
$!
$CYCLE = 0
$LOOP2:
$CYCLE = CYCLE + 1
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$!
$ASSIGN BND3.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$IF CYCLE .LT. 8 THEN GOTO LOOP2
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHI16CY.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure extnd.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTND.L
$ASSIGN EXTND.L SYS$OUTPUT
$!
$! RESUME WHERE WE LEFT OFF AFTER FIRST 16 CYCLES
$!
$COPY PHI16CY.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE EXTENSION/REFINEMENT ITERATIONS USING THE
$! THIRD MASK
$!
$LOOP3:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE3
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION
$! AND EXTEND PHASING TO ADDITIONAL AMPLITUDES INPUT ON UNIT 16
$!
$ASSIGN EXTND.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP3
$!
$DONE3:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTND.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT
--- procedure extnda.com ---
$SET NOVERIFY
$!
$! SEND ALL PRINTED OUTPUT TO EXTNDA.L
$ASSIGN EXTNDA.L SYS$OUTPUT
$!
$! RESUME WHERE WE LEFT OFF
$!
$COPY PHIEXTND.31 NEWPHI.REF
$COPY PHASIT.31 MINV.REF
$COPY MASK3.14 MASK.MAP
$!
$! CREATE THE TEMPORARY FILE "EXTND.TMP" CONTROLLING EXTENSION
$ASSIGN SLOEXT2.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$! PERFORM THE PHASE AND AMPLITUDE EXTENSION ITERATIONS USING
$! THE THIRD MASK
$!
$LOOP4:
$!
$! DO ITERATIONS AS LONG AS THE FILE EXTND.TMP EXISTS (IT WILL
$! AUTOMATICALLY BE DELETED WHEN THE SPECIFIED RESOLUTION/NUMBER
$! OF ITERATIONS IS REACHED)
$FILESPEC=F$SEARCH("EXTND.TMP")
$IF FILESPEC .EQS. "" THEN GOTO DONE4
$!
$! COMPUTE ELECTRON DENSITY MAP FROM COMBINED PHASE INFORMATION
$!
$DELETE MINV.REF;*
$RENAME NEWPHI.REF FOUR.REF
$ASSIGN FFT.DAT FOR005
$FSFOUR
$DEASSIGN FOR005
$DELETE FOUR.REF;*
$!
$! MODIFY ELECTRON DENSITY MAP ACCORDING TO MASK3
$!
$! USE .06 FOR 3A DATA, .086 FOR 3.5A, AND .112 FOR 4A DATA
$!
$ASSIGN BND2.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$DELETE FOUR.MAP;*
$!
$! INVERT MODIFIED MAP TO OBTAIN STRUCTURE FACTORS
$!
$ASSIGN MINV2.DAT FOR005
$MAPINV
$DEASSIGN FOR005
$DELETE MOD.MAP;*
$!
$! COMBINE PRIOR PHASE INFORMATION WITH THAT FROM MAP INVERSION,
$! EXTEND PHASING TO ADDITIONAL AMPLITUDES INPUT ON UNIT 16, AND
$! EXTEND PHASES AND AMPLITUDES TO ANY O
$!
$ASSIGN EXTNDA.DAT FOR005
$BNDRY
$DEASSIGN FOR005
$!
$! ADJUST ITERATION COUNT AND/OR RESOLUTION CUTOFF (THIS WILL ALSO
$! DELETE THE "EXTND.TMP" FILE AT THE APPROPRIATE TIME, THUS EXITING
$! THE LOOP)
$ASSIGN SLOEXT2.DAT FOR005
$SLOEXT
$DEASSIGN FOR005
$!
$GOTO LOOP4
$!
$DONE4:
$!
$DELETE MASK.MAP;*
$RENAME NEWPHI.REF PHIEXTNDA.31
$RENAME MINV.REF ALLCOEF.31
$PURGE ALLCOEF.31
$!
$DEASSIGN SYS$OUTPUT
$! THATS ALL
$EXIT