Fundamentals of Sequence Analysis, 1995-1996
Problem set 5:  Assembling sequences.

If you get stuck, refer to the OpenVMS and GCG resources on the 
class home page.

See also the GCG and EGCG manuals.

Problem group 1.  Dealing with ABI sequences

Create a subdirectory and set your default directory to it.  Issue this
command:

  $ copy class:*example*.* []

You will now see two files with ugly names.

1A.  How do you fix these names?

Configure your terminal for GCG graphics.  Issue these commands:

  $ abiprintout/infile=ABI_EXAMPLE_M13F.ABI;1/begin=180/end=200
  $ abiprintout/infile=ABI_EXAMPLE_M13F.ABI;1/begin=180/end=200/pnt=500

1B.  How do the two plots differ?

1C.  What two commands could be used to get the sequence into GCG format?
     What two commands could you use to extract a subsequence (i.e., trim off 
     the cruddy sequence on the ends)?

Problem group 2.  Sequence assembly

(When you are done, remember to delete the files created during this exercise!)

Create a sequencing project (assuming that Bluescript was the only vector 
used) and use gelenter to add these files to it:


Assemble it.

2A.  What was wrong with bad0002.seq?
     What was wrong with bad0001.seq?

2B.  Is there a difference in the quality of the TEST* and REST* sequences?

2C.  Ignoring the BAD0001 contig, there are two contigs covering 1164 and 
     1569 bases (the pieces were shotgunned from a 3000-base fragment).
     Should you continue with shotgun sequencing for this insert?
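
     One way to start on 2C is to work out how much of the insert the two
     contigs already account for.  The following is only a minimal sketch in
     Python, assuming that the two contigs do not overlap each other and that
     both lie entirely within the 3000-base fragment (neither is stated in
     the problem).  The variable names are illustrative.

```python
# Sizes taken from problem 2C; the names are illustrative only.
insert_length = 3000            # size of the shotgunned fragment, in bases
contig_lengths = [1164, 1569]   # the two contigs, ignoring BAD0001

# Assumption: the contigs do not overlap and contain no vector sequence,
# so their lengths can simply be added together.
covered = sum(contig_lengths)
uncovered = insert_length - covered
fraction_covered = covered / insert_length

print(f"covered:   {covered} bases")
print(f"uncovered: {uncovered} bases")
print(f"fraction:  {fraction_covered:.1%}")
```

     Under those assumptions roughly nine tenths of the insert is already in
     contigs.  Whether the remainder is better closed with more random clones
     or with some other approach is the judgment the question asks for.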

2D.  Anything else you might want to do?