Fundamentals of Sequence Analysis, 1995-1996
Problem set 5:  Assembling sequences.

If you get stuck, refer to the OpenVMS and GCG resources on the 
class home page.

References:
 
 See the GCG and EGCG manuals.


Problem group 1.  Dealing with ABI sequences

Create a subdirectory and set your default directory to it.  Issue the
command:

  $ copy class:*example*.* []

You will now see two files with ugly names.

1A.  How do you fix these names?


Configure your terminal for GCG graphics.  Issue these commands:

  $ abiprintout/infile=ABI_EXAMPLE_M13F.ABI;1/begin=180/end=200
  $ abiprintout/infile=ABI_EXAMPLE_M13F.ABI;1/begin=180/end=200/pnt=500

1B.  How do the two plots differ?

1C.  What two commands could you use to get the sequence into GCG format?
     What two commands could you use to extract a subsequence (i.e., to trim
     off the cruddy sequence on the ends)?
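
As a rough illustration of what extracting a subsequence means, here is a
small Python sketch. It is not a GCG command. It assumes that begin and end
positions, like those given to abiprintout above, count bases from 1 and
include both endpoints. The function name trim and the example read are made
up for illustration.

```python
def trim(sequence, begin, end):
    """Return bases begin..end (1-based, inclusive) of sequence."""
    if not 1 <= begin <= end <= len(sequence):
        raise ValueError("need 1 <= begin <= end <= sequence length")
    # Python slices are 0-based and exclude the end index,
    # so shift begin down by one and leave end as it is.
    return sequence[begin - 1:end]


if __name__ == "__main__":
    read = "NNNNACGTACGTNNNN"   # made-up read with unusable ends
    print(trim(read, 5, 12))    # keeps bases 5 through 12
```

Choosing begin and end still takes a look at the trace; the sketch only shows
the coordinate bookkeeping.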

Problem group 2.  Sequence assembly

(When you are done, remember to delete the files created during this exercise!)

Create a sequencing project (assuming that Bluescript was the only vector 
used) and use gelenter to put these files into it:

   class:test*.seq
   class:bad*.seq
   class:rest*.seq

Assemble it.

2A.  What was wrong with bad0002.seq?
     What was wrong with bad0001.seq?

2B.  Is there a difference in the quality of the TEST* and REST* sequences?

2C.  Ignoring the BAD0001 contig, there are two contigs covering 1164 and 
     1569 bases (the pieces were shotgunned from a 3000-base fragment).
     Should you continue with shotgun sequencing for this insert?
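
The arithmetic behind 2C can be sketched in a few lines of Python. This is an
illustration only. It assumes that the two contigs do not overlap each other
and that they contain only insert sequence (no vector), so their lengths can
simply be added. The function name coverage_summary is hypothetical.

```python
def coverage_summary(contig_lengths, insert_length):
    """Summarize how much of an insert a set of contigs covers.

    Assumes the contigs do not overlap one another and contain only
    insert sequence, so their lengths can simply be added.
    """
    covered = sum(contig_lengths)
    uncovered = insert_length - covered
    fraction = covered / insert_length
    return covered, uncovered, fraction


if __name__ == "__main__":
    covered, uncovered, fraction = coverage_summary([1164, 1569], 3000)
    print(f"covered:   {covered} bases")
    print(f"uncovered: {uncovered} bases")
    print(f"fraction:  {fraction:.1%}")
```

Under those assumptions the numbers give the size of the gap left to close.
Whether more random reads are an efficient way to close it is the judgment
that 2C asks for.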

2D.  Is there anything else you might want to do?