Fundamentals of Sequence Analysis, 1998-1999
Answer set 4:  Tools for Molecular Biology I.

If you get stuck, refer to the OpenVMS and GCG resources on the 
class home page.

References:
 
 See documentation in the programs.

Problem group 1.  Mapping

You have cloned a new insert into your favorite vector.  Following single
and double digestions with enzymes Chmp and Bite, you find the following 
mobilities (in cm from the gel origin):

   Chmp cm 2.14    7.36    8.94
   Bite cm 3.62    5.36    6.90
   Chmp + Bite
        cm 5.36    6.90    7.36    8.30    8.94
   Controls
        MW 2200    4300    5700    8600
        cm 8.25    5.48    4.32    2.62

1A.  What are the molecular weights of the various pieces?
1B.  What is the restriction map?
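
One common way to approach 1A is to fit a standard curve to the control
lane and read the unknown bands off it.  The sketch below assumes that
log10(MW) falls roughly linearly with mobility over this range, which is
an assumption about the gel and not something stated in the problem.  The
helper names (fit_log_linear, estimate_size, digest_sizes) are made up for
this illustration and are not part of GCG.  Note that the bands at 2.14,
8.30 and 8.94 cm lie outside the range of the standards, so those
estimates are extrapolations.

```python
import math

# Size standards from the "Controls" lane: (molecular weight, mobility in cm).
STANDARDS = [(2200, 8.25), (4300, 5.48), (5700, 4.32), (8600, 2.62)]

# Observed mobilities (cm) for each digest, copied from the problem.
DIGESTS = {
    "Chmp": [2.14, 7.36, 8.94],
    "Bite": [3.62, 5.36, 6.90],
    "Chmp + Bite": [5.36, 6.90, 7.36, 8.30, 8.94],
}


def fit_log_linear(standards):
    """Least-squares fit of log10(MW) = slope * cm + intercept."""
    xs = [cm for _, cm in standards]
    ys = [math.log10(mw) for mw, _ in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


SLOPE, INTERCEPT = fit_log_linear(STANDARDS)


def estimate_size(cm):
    """Estimate the molecular weight of a band from its mobility in cm."""
    return 10 ** (SLOPE * cm + INTERCEPT)


def digest_sizes(digests=DIGESTS):
    """Return {digest name: [estimated sizes]} in the order given."""
    return {name: [round(estimate_size(cm)) for cm in cms]
            for name, cms in digests.items()}


if __name__ == "__main__":
    for name, sizes in digest_sizes().items():
        print(name, sizes, "total:", sum(sizes))
```

A useful sanity check for 1B is that the fragment totals of the three
digests should agree with each other; bands that appear unchanged in the
double digest were not cut by the second enzyme.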


Problem group 2.  Silent translation sites

You have cloned an mRNA for the NBLPRZ gene.  The sequence is in the file
"class:nblprz.seq".  You need to make a gene fusion construct taking as
much of the NBLPRZ gene as possible and attaching it to a stub protein.
The clone for that stub protein happens to have a HindIII site in a convenient
place, the first base of which begins in the third base position of the 
codons in the open reading frame.  The NBLPRZ protein begins with the Met at
position 57.

2A.  If you engineer a translationally silent HindIII site into NBLPRZ, how
     much of it can you put into the fusion? 


2B.   How much better could we do if the convenient site on the stub 
      protein were an XbaI site (T'CTAG_A), the first base of which began
      on the first base of a codon in the open reading frame?
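
The search behind 2A and 2B can be sketched in a few lines.  The idea is
to slide along the reading frame one codon at a time, lay the recognition
site down at the required codon position, let the bases outside the site
vary freely, and keep the position if any such variant still translates
to the original amino acids.  This is only an illustration of the idea,
not the GCG program; the function names are hypothetical, the standard
genetic code is assumed, and the input is assumed to be the coding
sequence starting at the ATG (base 57 of the file) in frame.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate whole codons of an upper-case ACGT string."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_sites(cds, site, phase):
    """Return 0-based offsets in `cds` where `site` could be made silently.

    phase is the codon position (0, 1 or 2) on which the first base of
    the site must fall: 2 for the HindIII case in 2A, 0 for XbaI in 2B.
    """
    cds = cds.upper()
    site = site.upper()
    span = phase + len(site)
    window = 3 * ((span + 2) // 3)      # whole codons covering the site
    trailing = window - span            # free bases after the site
    hits = []
    for start in range(0, len(cds) - window + 1, 3):
        original = translate(cds[start:start + window])
        # Bases outside the site may change as long as the protein does not.
        if any(translate("".join(lead) + site + "".join(tail)) == original
               for lead in product(BASES, repeat=phase)
               for tail in product(BASES, repeat=trailing)):
            hits.append(start + phase)
    return hits


if __name__ == "__main__":
    toy = "ATGGCTTCCTTCTAA"                 # Met Ala Ser Phe stop
    print(silent_sites(toy, "AAGCTT", 2))   # HindIII, third codon position
```

The last offset returned is the site closest to the 3' end, which is the
one that keeps the most protein.  To convert an offset to a position in
the file, add 57 (the 1-based position of the first base of the Met
codon).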


Problem group 3.  Primers

3A.  Design a set of primers to amplify at least bases 100 - 900 of
     CLASS:NBLPRZ.SEQ from genomic DNA (assume that it consists of a single
     exon.) 
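
As a rough illustration of what a primer pair for 3A looks like, the
sketch below takes a fixed-length forward primer from the bases just
upstream of the region and a reverse primer as the reverse complement of
the bases just downstream.  It estimates melting temperature with the
Wallace rule (2 degrees per A or T, 4 degrees per G or C), which is only
a crude guide for short oligos.  The function names and the default
length of 20 are assumptions for this example; it does not check for
hairpins, primer dimers or secondary priming sites.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an ACGT string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def flanking_primers(seq, first, last, length=20):
    """Pick primers lying wholly outside bases first..last (1-based)."""
    seq = seq.upper()
    if first - 1 < length or last + length > len(seq):
        raise ValueError("not enough flanking sequence for primers")
    # Forward primer: the `length` bases ending just before `first`.
    forward = seq[first - 1 - length:first - 1]
    # Reverse primer: reverse complement of the bases just after `last`.
    reverse = reverse_complement(seq[last:last + length])
    return forward, reverse


if __name__ == "__main__":
    toy = "AACCGGTTAC" + "TTTTT" + "GATTACAGGC"
    fwd, rev = flanking_primers(toy, 11, 15, length=10)
    print(fwd, wallace_tm(fwd))
    print(rev, wallace_tm(rev))
```

For the problem as stated, the call would be flanking_primers(sequence,
100, 900), which takes bases 80-99 and the reverse complement of bases
901-920.  In practice the two primers should have similar melting
temperatures, so the positions and lengths usually need adjusting.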

Problem group 4.  Finding genes

4A.  What are the predicted exons for the Drosophila gene in
     CLASS:UNKNOWN.SEQ?  (Use genefinder; the file is already in the
     correct format for that program.)  Identify the gene.  Do the
     predicted exons agree with the documented ones?


This information is provided for anybody off site who is working through
this course.  Here is the sequence for CLASS:NBLPRZ.SEQ

NBLPRZ

 Nblprz.Seq  Length: 1177  January 31, 1996 10:54  Type: N  Check: 8643  ..

       1  ATTAGGCGCT ATACGTCGGT ATACGGATCA TCGGTATCCG TGTATCGATC
      51  GATGCAATGG CTTCCCCGAC CTCCCCGAAA GTTTTCCCGC TGTCCCTGTG
     101  CTCCACCCAG CCGGACGGTA ACGTTGTTAT CGCTTGCCTG GTTCAGGGTT
     151  TCTTCCCGCA GGAACCGCTG TCCGTTACCT GGTCCGAATC CGGTCAGGGT
     201  GTTACCGCTC GTAACTTCCC GCCGTCCCAG GACGCTTCCG GTGACCTGTA
     251  CACCACCTCC TCCCAGCTGA CCCTGCCGGC TACCCAGTGC CTGGCTGGTA
     301  AATCCGTTAC CTGCCACGTT AAACACTACA CCAACCCGTC CCAGGACGTT
     351  ACCGTTCCGT GCCCGGTTCC GTCCACCCCG CCGACCCCGT CCCCGTCCAC
     401  CCCGCCGACC CCGTCCCCGT CCTGCTGCCA CCCGCGTCTG TCCCTGCACC
     451  GTCCGGCTCT GGAAGACCTG CTGCTGGGTT CCGAAGCTAA CCTGACCTGC
     501  ACCCTGACCG GTCTGCGTGA CGCTTCCGGT GTTACCTTCA CCTGGACCCC
     551  GTCCTCCGGT AAATCCGCTG TTCAGGGTCC GCCGGAACGT GACCTGTGCG
     601  GTTGCTACTC CGTTTCCTCC GTTCTGCCGG GTTGCGCTGA ACCGTGGAAC
     651  CACGGTAAAA CCTTCACCTG CACCGCTGCT TACCCGGAAT CCAAAACCCC
     701  GCTGACCGCT ACCCTGTCCA AATCCGGTAA CACCTTCCGT CCGGAAGTTC
     751  ACCTGCTGCC GCCGCCGTCC GAAGAACTGG CTCTGAACGA ACTGGTTACC
     801  CTGACCTGCC TGGCTCGTGG TTTCTCCCCG AAAGACGTTC TGGTTCGTTG
     851  GCTGCAGGGT TCCCAGGAAC TGCCGCGTGA AAAATACCTG ACCTGGGCTT
     901  CCCGTCAGGA ACCGTCCCAG GGTACCACCA CCTTCGCTGT TACCTCCATC
     951  CTGCGTGTTG CTGCTGAAGA CTGGAAAAAA GGTGACACCT TCTCCTGCAT
    1001  GGTTGGTCAC GAAGCTCTGC CGCTGGCTTT CACCCAGAAA ACCATCGACC
    1051  GTCTGGCTGG TAAACCGACC CACGTTAACG TTTCCGTTGT TATGGCTGAA
    1101  GTTGACGGTA CCTGCTACTA AATCGTAGTC GTATATATGC TAGTCGATGC
    1151  TATATGCTAG CTGATTTCGC GATTCTA