Fundamentals of Sequence Analysis, 1998-1999
Problem set 1:  Computing basics

If you get stuck, refer to the OpenVMS and GCG resources on the 
class home page.

Problem group 1.  Logging in

(If you do not already have an account, click here and
fill out the form; your account will be ready for use
within a day, excluding weekends.)

Use a communications program on your PC or Mac, set to
emulate a VT100 or higher, and log onto the SAF machine
SEQAXP (for TELNET or RLOGIN use the machine's full network
name; otherwise just use SEQAXP). 

The information that rolls by on the computer screen is
important - you should always read it.  This is true even
for the text you see when you log on.  Test your
understanding of what the login text said by answering the
following questions: 

1A.  What version of the Genbank database is available locally?
1B.  What does the GCG program "DIVERGE" do?
1C.  A disk block is 512 bytes.  How much disk space do you
       have available in bytes, and how many bytes can you
       put on disk before you run out of space? 

Problem group 2.  Commands

SEQAXP uses the OpenVMS operating system, which you will
interact with through command lines.  That is, you will
tell the computer what to do by typing in commands,
as opposed to selecting options from menus as on a Macintosh
or Windows machine.  The general form of an OpenVMS command is:

  $ verb/qualifier parameter/qualifier

  $            prompt (the computer puts this on your screen 
               to indicate that it is ready for another command;
               you type in everything else.  In general, commands
               are not case sensitive.)
  verb         the action to perform, such as COPY
  /qualifier   modifies the action of a verb or parameter; most
               qualifiers go on the verb, some on the parameter
  parameter(s) the object(s) of the verb; parameters are separated
               by spaces from the verb and from each other.
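
As an illustration of that general form, here are a few commands
with qualifiers and parameters.  The file names and the queue name
LAB_LASER are made up for the example; text after a "!" is a comment
and is ignored by OpenVMS.

```dcl
$ DIRECTORY/SIZE *.TXT             ! verb, one qualifier, one parameter
$ COPY/LOG NOTES.TXT NOTES.BAK     ! verb, one qualifier, two parameters
$ PRINT/QUEUE=LAB_LASER NOTES.TXT  ! this qualifier takes a value
```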

The most important verb is probably HELP - it puts you into
the online help system.  At the bottom of the help page you
will see the list of help libraries that have been added to
the local system; access these by preceding the name with an
"@".

What do these commands do (go ahead and type them in, 
nothing bad will happen)?

2E.   $ HELP @LOCAL -
        GENER OPEN -
2G.   Try pushing the up/down/right/left arrows on the keyboard
2H.   $ mytype :== type/page
      $ mytype

Problem group 3.  Directories and files

Like most operating systems, OpenVMS stores data on disks in
files which are arranged hierarchically in directories. 

Here are the more common directory-related commands.
(Go ahead and issue these commands, but leave off the 
description at the right of each line!)

  $ show default             Give the current directory name
  $ create/dir  [.subdir]    Create a subdirectory
  $ set default [.subdir]    Move into it
  $ set default [-]          Move up from it
  $ set default SYS$LOGIN    Move to your home directory

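All files have names of this form:

  node::device:[directory]filename.type;version

Any of these fields that you don't specify take a default
value; for instance, a bare filename.type with nothing else
means the highest numbered version of this file that is in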
your default directory.  If you modify a file a higher
numbered version will be created - this means that if you
mess up, the original is still around.  If you delete a file
you will need to specify which version or just use a
trailing ";" to mean "the most recent one". 
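
To make the version rules concrete, here is a sketch using a
hypothetical file NOTES.TXT that has been edited a few times (the
name and the version numbers are assumptions for the example):

```dcl
$ DIRECTORY NOTES.TXT;*    ! list every version of the file
$ TYPE NOTES.TXT           ! no version given: the highest one is used
$ TYPE NOTES.TXT;1         ! explicitly ask for version 1
$ DELETE NOTES.TXT;1       ! DELETE insists on a version...
$ DELETE NOTES.TXT;        ! ...or a trailing ";" for the most recent
```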

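When doing sequence analysis 99.9% of the files that you
will use will consist of plain text, and so the most common
operations that you will perform (and the names of the
commands to do them) are: 

  TYPE         Display a file on the screen
  EDIT         Modify a file
  COPY         Make a copy of a file
  RENAME       Change the name of a file
  PRINT        Send a file to a printer
  DELETE       Remove a file
  PURGE        Remove old versions of files
  DIRECTORY    List the files in a directory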

Most of these are self-explanatory.  PURGE is a form of delete 
that removes only lower numbered versions of files.  
DIRECTORY tells you what files are in a directory.

You can use wildcards to match parts of filenames: "*" 
matches any string of characters, and "%" matches any single
character.
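
A few wildcard patterns, as a sketch (the file names in the
comments are hypothetical):

```dcl
$ DIRECTORY *.SEQ          ! every file of type .SEQ
$ DIRECTORY GENE%.PEP      ! GENE1.PEP, GENEA.PEP, but not GENE10.PEP
$ DIRECTORY *.*;*          ! every version of every file
$ PURGE *.TXT              ! drop old versions of all .TXT files
```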

3A.  How many files are in your directory and how much
     space do they occupy?

3B.  Print jobs can be directed to local laser printers.  
     Issue the command:  $ SHOW QUEUE *
     What is the name of the queue that goes to your local
     laser printer?
     (If you don't see one, and you have a networked printer,
     request that one be set up for you.)

3C.  Do you have a LOGIN.COM file in your home directory?
     (The commands in this file run automatically when
     you log in, to configure your process.)  If not,
     rename (from 2A, above) and edit
     it (see 2D, above) to reflect the appropriate print
     queue for your lab.  Invoke it with the command:
       $ @login
     then verify that print jobs come out on your printer
     with the command:
       $ print     
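
     A LOGIN.COM is an ordinary command procedure: one DCL
     command per line, each starting with "$".  The lines below
     are only a sketch of one common way to pick a default print
     queue; LAB_LASER is a hypothetical queue name, and the
     LOGIN.COM you are given may set the queue differently.

```dcl
$ ! LOGIN.COM - a minimal sketch; LAB_LASER is a hypothetical queue name
$ DEFINE SYS$PRINT LAB_LASER    ! make PRINT use this queue by default
$ MYTYPE :== TYPE/PAGE          ! define a personal command shortcut
```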

3D.  What command do you use to clean out old versions of
     files that are in your directory?  Try it now, did it

Problem group 4.  File protections

Files have access protections that can be set differently
for each of four levels of users:  System, Owner, Group,
and World.  System is the operating system or the system
operator, Owner is the owner of the files, Group is other
users in the same group (as in, the same lab), and World
is everybody else.  Each level can be granted Read, Write,
Execute, and Delete access.  In general, the default
protection is correct; it is equivalent to:
  $ set file/protection=(s:rwed,o:rwed,g:re,w) filename
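
The same command can tighten or loosen access on a single file.
A sketch, assuming two hypothetical files; listing a level with
no letters after it (like the "w" above) removes all access for
that level:

```dcl
$ SET FILE/PROTECTION=(G,W) PRIVATE.TXT       ! no access for Group or World
$ SET FILE/PROTECTION=(G:RE,W:RE) SHARED.TXT  ! Group and World may read
```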

Some files are protected more strictly; for instance, MAIL files
are meant to be manipulated only from within MAIL utilities, so
their protection is set so that you can't modify them by
accident (and don't override this!).  Directories default to
a protection that prevents them from being deleted, so if you
ever do need to delete a subdirectory, you must first change
its protection with a command similar to that above (after
removing all files within it). 
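
For example, removing a subdirectory might look like the
following.  This is a sketch that assumes a hypothetical
subdirectory [.SUBDIR] of your current directory; a directory is
itself stored as a file named SUBDIR.DIR in its parent directory.

```dcl
$ DELETE [.SUBDIR]*.*;*                      ! remove all files inside it
$ SET FILE/PROTECTION=(O:RWED) SUBDIR.DIR    ! let the Owner delete it
$ DELETE SUBDIR.DIR;1                        ! delete the directory file
```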

4A.  What is the protection on the files in your directory?
     (Hint, HELP DIR)
4B.  What happens when you try to read a file that you don't
     have access to?  Try:  $ COPY [-.MATHOG] []
4C.  What do you think will happen when you block access
     to a file from the SYSTEM account?  Daily backup
     tapes are made of all user files from the SYSTEM account.
     If the user disk fails, and the files are restored from
     tape onto a replacement drive,  will a file that was
     protected from SYSTEM read access be restored to 
     your directory?

Problem group 5.  Data transfer

At some point you will all need to move files back and forth
between SEQAXP and your own computers.  The simplest method
is to use FTP.  One FTP program for Macintoshes is called
Fetch.  There is an assortment of FTP utilities for Windows
machines too, for instance WSFTP.  There are many other
possibilities - the "publish" feature in many programs will
do FTP, but don't use it unless you can specify BINARY or
ASCII type.

There are a couple of key points to remember when moving
data to/from SEQAXP.

 Text files should go in ASCII mode, and if they
     originate on a PC or Macintosh they should consist
     of a series of small (<200 character) lines.

 Binary files should go in BINARY mode.

 Respect the naming convention on SEQAXP (see problem group 3,
     above) or the file will come over, but the name will be 
     mangled, perhaps to the point where you don't
     recognize it.  (In FETCH, turn off the option that
     automatically appends .txt or .bin!)

5A.  Use FTP on your PC or Macintosh. Copy
     from your account to your PC/Mac, then back to seqaxp.
     (Remember, this is a text file.)  Call the new copy
     "".  Did the transfer work correctly? 
     Look for subtle errors with this command: 

       $ DIFF

5B.  Login to seqaxp, create a subdirectory called [.KILLME],
     and copy your file into it.  Repeat the transfer
     as in 5A, but this time against the file in the new
     subdirectory.  Again, check that the transfer did
     not change the file's content.  Now remove any files
     in the [.KILLME] directory, and then delete the 
     directory itself.

5C.  There are numerous ways to mess up file transfers: 
     sending ASCII as BINARY, or vice versa, or sending
     files with lines that are too long.  If a file that
     you have loaded on SEQAXP misbehaves you can analyze
     it to see what is wrong.  Issue the command:

       $ ANALYZE/RMS_FILE

     and look at the RMS FILE ATTRIBUTES section.  Why
     might this file cause problems for some programs?

Some computers in the division have Pathworks installed. 
This commercial software provides a DECNET transport on
Macintoshes.  If you have it there will be a program around
called NCP.  Do this next part only if you see this program
on your computer. 

5D.  DECNET allows most OpenVMS commands to function over
     the net. This can be very convenient for moving text
     files to/from a Macintosh.  (A version for PCs also
     exists, but I don't believe that anybody here uses it.)
     Let's assume that your hard disk is called "BIGDISK"
     on the machine "MACNAME", and that you have run the NCP
     program on your Macintosh and configured it to allow
     proxy connections from your SEQAXP account.  Try this
     from your SEQAXP account: 

     $ copy desktop:

     If DECNET is working, you should see a file called
     "" on your desktop.  What command would you
     use *on SEQAXP* to view the contents of that file?
     (Hint, DESKTOP has been defined as a logical name.)
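
     A logical name is a short alias that OpenVMS translates
     into a device or directory before it acts on a file
     specification.  As a sketch, assuming a hypothetical disk
     DISK$USER, user directory [SMITH], and logical name PROJ:

```dcl
$ DEFINE PROJ DISK$USER:[SMITH.PROJECT]   ! create the logical name
$ SHOW LOGICAL PROJ                       ! see what it translates to
$ DIRECTORY PROJ:                         ! use it in place of the long form
```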

Problem group 6.  GCG basics

The GCG software by and large behaves like other OpenVMS
software.  However, there are a few gotchas, and these are
documented in the GCG beginner's FAQ, which
you should refer to in order to answer these questions. 

6A.  Configure your graphics device appropriately (for most
     terminal emulators that is some form of Tektronix
     emulation).  Issue the command:  $ SHOWPLOT
     What do you see?

6B.  What are the command line options for REFORMAT?

6C.  Use REFORMAT to put into GCG format the following PROTEIN
     sequence:  AAAGCTCTTGGGTTTT
     (Hint, put that sequence into a file, and then run
     REFORMAT on it).  Now look at the resulting file (TYPE it).
     Does the line with a ".." indicate that this is
     protein?  Figure out the correct operation to make this
     sequence into a GCG protein sequence file.  (Note, get
     in the habit of naming GCG protein sequence files
     whatever.pep, and GCG nucleic acid sequence files
     whatever.seq - that way you can tell which is which at
     a glance.)

6D.  Use the GCG program SEQED to edit this sequence - put a
     P on the end and change the first A to an S.  What
     happens if you leave the program with a QUIT, and what
     happens when you leave with an EXIT?  (Look at the
     edited file to see).