Fundamentals of Sequence Analysis, 1995-1996
Problem set 8:  Formatting data for publication.

If you get stuck, refer to the OpenVMS and GCG resources on the 
class home page.

  See the GCG and EGCG documentation.

Problem group 1.  Plasmid Maps

1A.  Produce a map of pBR322 (GB_SY:Synpbr322) showing all restriction
     sites that are present only once and all major features.
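
     As a rough illustration of what "present only once" means (this
     is not the GCG procedure), the Python sketch below counts
     occurrences of a few recognition sequences in a circular molecule
     and keeps the enzymes that cut exactly once.  The toy sequence and
     the names count_sites and single_cutters are made up for this
     example.  The three recognition sites shown are palindromic, so
     searching one strand is enough for them.

```python
def count_sites(seq, site, circular=True):
    """Count occurrences (overlaps allowed) of a recognition site in seq."""
    seq = seq.upper()
    site = site.upper()
    # For a circular molecule, append the first len(site)-1 bases so that
    # sites spanning the origin are also found.
    text = seq + seq[:len(site) - 1] if circular else seq
    count = 0
    start = text.find(site)
    while start != -1:
        count += 1
        start = text.find(site, start + 1)
    return count


def single_cutters(seq, enzymes, circular=True):
    """Return the sorted names of enzymes whose site occurs exactly once."""
    return sorted(name for name, site in enzymes.items()
                  if count_sites(seq, site, circular) == 1)


# Standard recognition sequences for three common enzymes.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Hypothetical 30-base "plasmid" used only to exercise the functions.
TOY_PLASMID = "TTCAAGCTTACGGATCCAAGGATCCTTGAA"

if __name__ == "__main__":
    print(single_cutters(TOY_PLASMID, ENZYMES))
```

     In the toy sequence the EcoRI site spans the origin, so it is
     only found when the molecule is treated as circular.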

Problem group 2.  Multiple sequence formatting

Align all Troponin C entries in SwissProt.

2A.  Format the .MSF file using Pretty showing the consensus and differences.
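
     To make "consensus and differences" concrete, here is a minimal
     Python sketch, assuming a simple majority rule for the consensus.
     How Pretty itself decides the consensus should be checked in the
     GCG documentation; the rule, the "*" and "." symbols, and the toy
     alignment below are assumptions chosen for illustration.

```python
from collections import Counter


def consensus(aligned):
    """Majority consensus of equal-length aligned sequences.

    A column gets the most common residue if more than half of the
    sequences share it, otherwise "*" (an arbitrary placeholder).
    """
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count * 2 > len(column) else "*")
    return "".join(out)


def differences(aligned, cons):
    """Replace residues that agree with the consensus by "."."""
    return ["".join("." if a == c else a for a, c in zip(seq, cons))
            for seq in aligned]


# Hypothetical three-sequence alignment, not real Troponin C data.
TOY_ALIGNMENT = ["MDDIYKA", "MDDIFKA", "MEDIYKA"]

if __name__ == "__main__":
    cons = consensus(TOY_ALIGNMENT)
    for line in differences(TOY_ALIGNMENT, cons):
        print(line)
    print(cons)
```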

2B.  Format the .MSF file using PrettyPlot.  Make the consensus Black, 
     identity Green, similarity Blue, and differences Red.  Also, turn
     off the boxes around similar residues.

2C.  Format the .MSF file using PrettyBox.  Show a consensus, using
     output lines of 50 characters with no "block" spacing on the
     line.  Otherwise, use the default settings.

Problem group 3.  Single sequence formatting

3A.  Format GB_IN:Dmhish1 for publication showing:

     1.  Protein translation under the DNA (3 letter form)
     2.  Forward sequence only (no reverse sequence)
     3.  Dots every 10 bases, above the DNA
     4.  Number at the ends of the dot line
     5.  Two blank lines between each group of lines
     6.  50 bases per line
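
     The layout requested above can be sketched in Python as follows.
     This is only an illustration of the target format, not the GCG
     solution.  The function name format_sequence, the 7-column left
     margin, the use of "End" for stop codons and "Xxx" for ambiguous
     codons, and translating from the first base are all assumptions.
     Because 50 is not a multiple of 3, a three-letter code may be
     split across two lines in this sketch.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
THREE_LETTER = {"A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
                "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
                "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
                "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
                "*": "End"}


def format_sequence(dna, width=50, frame_start=0):
    """Lay out the forward strand with a dot ruler above and a
    three-letter translation below, `width` bases per line."""
    dna = dna.upper()
    # Build a translation track the same length as the DNA, so each
    # three-letter code sits directly under its codon.
    track = [" "] * len(dna)
    for i in range(frame_start, len(dna) - 2, 3):
        one_letter = CODON_TABLE.get(dna[i:i + 3], "")
        track[i:i + 3] = THREE_LETTER.get(one_letter, "Xxx")
    protein = "".join(track)

    groups = []
    for start in range(0, len(dna), width):
        chunk = dna[start:start + width]
        end = start + len(chunk)
        # A dot above every 10th base, counted from the start of the sequence.
        dots = "".join("." if (start + n + 1) % 10 == 0 else " "
                       for n in range(len(chunk)))
        ruler = f"{start + 1:>6} {dots} {end}"   # numbers at both ends
        groups.append("\n".join([ruler,
                                 " " * 7 + chunk,
                                 " " * 7 + protein[start:end].rstrip()]))
    # Two blank lines between each group of lines.
    return "\n\n\n".join(groups)


if __name__ == "__main__":
    print(format_sequence("ATGGCC" * 10))
```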