Fundamentals of Sequence Analysis, 1998-1999

Answer set 4: Tools for Molecular Biology I

If you get stuck, refer to the OpenVMS and GCG resources on the class home page.

References: See the documentation in the programs.

Problem group 1. Mapping

You have cloned a new insert into your favorite vector. Following single and double digestions with the enzymes Chmp and Bite, you find the following mobilities (in cm from the gel origin):

Chmp          cm   2.14  7.36  8.94
Bite          cm   3.62  5.36  6.90
Chmp + Bite   cm   5.36  6.90  7.36  8.30  8.94

Controls      MW   2200  4300  5700  8600
              cm   8.25  5.48  4.32  2.62

1A. What are the molecular weights of the various pieces?

1B. What is the restriction map?

Problem group 2. Silent translation sites

You have cloned an mRNA for the NBLPRZ gene. The sequence is in the file "class:nblprz.seq". You need to make a gene fusion construct that takes as much of the NBLPRZ gene as possible and attaches it to a stub protein. The clone for that stub protein happens to have a HindIII site in a convenient place, the first base of which falls on the third base position of a codon in the open reading frame. The NBLPRZ protein begins with the Met at position 57.

2A. If you engineer a translationally silent HindIII site into NBLPRZ, how much of it can you put into the fusion?

2B. How much better could we do if the convenient site on the stub protein were an XbaI site (T'CTAG_A), the first base of which fell on the first base of a codon in the open reading frame?

Problem group 3. Primers

3A. Design a set of primers to amplify at least bases 100 - 900 of CLASS:NBLPRZ.SEQ from genomic DNA (assume that it consists of a single exon).

Problem group 4. Finding genes

4A. What are the predicted exons for the Drosophila gene in CLASS:UNKNOWN.SEQ? (Use genefinder; the file is already in the correct format for that program.) Identify the gene. Do the predicted exons agree with the documented ones?
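For problem 1A, one common way to turn gel mobilities into fragment sizes is a semi-log standard curve: fit log10(MW) against distance migrated for the control bands, then read the unknown bands off that line. The sketch below is not part of the original handout. It assumes the relationship is close to linear over this range, and the names fit_semilog, estimate_size, DIGESTS and SIZES are made up for illustration.

```python
import math

# Control ladder from problem group 1: (molecular weight, mobility in cm).
CONTROLS = [(2200, 8.25), (4300, 5.48), (5700, 4.32), (8600, 2.62)]

# Band mobilities (cm from the gel origin) for each digest.
DIGESTS = {
    "Chmp": [2.14, 7.36, 8.94],
    "Bite": [3.62, 5.36, 6.90],
    "Chmp + Bite": [5.36, 6.90, 7.36, 8.30, 8.94],
}


def fit_semilog(controls):
    """Least-squares fit of log10(MW) = slope * cm + intercept."""
    xs = [cm for _, cm in controls]
    ys = [math.log10(mw) for mw, _ in controls]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


SLOPE, INTERCEPT = fit_semilog(CONTROLS)


def estimate_size(cm):
    """Estimated molecular weight of a band that ran `cm` centimetres."""
    return 10 ** (SLOPE * cm + INTERCEPT)


# Estimated fragment sizes for every digest, in the order the bands are listed.
SIZES = {name: [estimate_size(cm) for cm in bands]
         for name, bands in DIGESTS.items()}

for name, sizes in SIZES.items():
    print(name, [round(s) for s in sizes], "total", round(sum(sizes)))
```

A useful sanity check is that the fragment totals of the three digests come out roughly equal, since each digest cuts the same construct. Note that the 2.14 cm band lies slightly beyond the largest control (2.62 cm), so its size is an extrapolation and should be treated with more caution than the others.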
This information is provided for anybody off site who is working through this course. Here is the sequence for CLASS:NBLPRZ.SEQ:

NBLPRZ

Nblprz.Seq  Length: 1177  January 31, 1996 10:54  Type: N  Check: 8643  ..

       1  ATTAGGCGCT ATACGTCGGT ATACGGATCA TCGGTATCCG TGTATCGATC
      51  GATGCAATGG CTTCCCCGAC CTCCCCGAAA GTTTTCCCGC TGTCCCTGTG
     101  CTCCACCCAG CCGGACGGTA ACGTTGTTAT CGCTTGCCTG GTTCAGGGTT
     151  TCTTCCCGCA GGAACCGCTG TCCGTTACCT GGTCCGAATC CGGTCAGGGT
     201  GTTACCGCTC GTAACTTCCC GCCGTCCCAG GACGCTTCCG GTGACCTGTA
     251  CACCACCTCC TCCCAGCTGA CCCTGCCGGC TACCCAGTGC CTGGCTGGTA
     301  AATCCGTTAC CTGCCACGTT AAACACTACA CCAACCCGTC CCAGGACGTT
     351  ACCGTTCCGT GCCCGGTTCC GTCCACCCCG CCGACCCCGT CCCCGTCCAC
     401  CCCGCCGACC CCGTCCCCGT CCTGCTGCCA CCCGCGTCTG TCCCTGCACC
     451  GTCCGGCTCT GGAAGACCTG CTGCTGGGTT CCGAAGCTAA CCTGACCTGC
     501  ACCCTGACCG GTCTGCGTGA CGCTTCCGGT GTTACCTTCA CCTGGACCCC
     551  GTCCTCCGGT AAATCCGCTG TTCAGGGTCC GCCGGAACGT GACCTGTGCG
     601  GTTGCTACTC CGTTTCCTCC GTTCTGCCGG GTTGCGCTGA ACCGTGGAAC
     651  CACGGTAAAA CCTTCACCTG CACCGCTGCT TACCCGGAAT CCAAAACCCC
     701  GCTGACCGCT ACCCTGTCCA AATCCGGTAA CACCTTCCGT CCGGAAGTTC
     751  ACCTGCTGCC GCCGCCGTCC GAAGAACTGG CTCTGAACGA ACTGGTTACC
     801  CTGACCTGCC TGGCTCGTGG TTTCTCCCCG AAAGACGTTC TGGTTCGTTG
     851  GCTGCAGGGT TCCCAGGAAC TGCCGCGTGA AAAATACCTG ACCTGGGCTT
     901  CCCGTCAGGA ACCGTCCCAG GGTACCACCA CCTTCGCTGT TACCTCCATC
     951  CTGCGTGTTG CTGCTGAAGA CTGGAAAAAA GGTGACACCT TCTCCTGCAT
    1001  GGTTGGTCAC GAAGCTCTGC CGCTGGCTTT CACCCAGAAA ACCATCGACC
    1051  GTCTGGCTGG TAAACCGACC CACGTTAACG TTTCCGTTGT TATGGCTGAA
    1101  GTTGACGGTA CCTGCTACTA AATCGTAGTC GTATATATGC TAGTCGATGC
    1151  TATATGCTAG CTGATTTCGC GATTCTA
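For problem group 2, off-site readers without the GCG programs can test by hand whether a restriction site can be introduced silently: overwrite a window of the reading frame with the site and check that the affected codons still translate to the same amino acids. The sketch below is not part of the original handout. It uses the standard genetic code and, to stay short, only the first 100 bases of the listing above; translate and silent_sites are illustrative names, and the full 1177-base sequence would have to be substituted to answer 2A and 2B.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Bases 1-100 of CLASS:NBLPRZ.SEQ only (an excerpt, not the whole gene).
EXCERPT = ("ATTAGGCGCTATACGTCGGTATACGGATCATCGGTATCCGTGTATCGATC"
           "GATGCAATGGCTTCCCCGACCTCCCCGAAAGTTTTCCCGCTGTCCCTGTG")
ORF_START = 57  # 1-based position of the A of the initiating ATG


def translate(dna):
    """Translate complete codons of `dna`, ignoring any trailing 1-2 bases."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_sites(seq, orf_start, site, codon_pos):
    """1-based positions in `seq` where `site` can replace the existing bases
    without changing the protein, with the first base of the site falling on
    codon position `codon_pos` (1, 2 or 3) of the reading frame."""
    orf = seq[orf_start - 1:]
    protein = translate(orf)
    if "*" in protein:                      # stop at the first stop codon
        orf = orf[:3 * protein.index("*")]
    hits = []
    for start in range(codon_pos - 1, len(orf) - len(site) + 1, 3):
        first = start - start % 3                        # start of first codon hit
        last = ((start + len(site) - 1) // 3 + 1) * 3    # end of last codon hit
        if last > len(orf):
            break
        original = orf[first:last]
        mutated = orf[first:start] + site + orf[start + len(site):last]
        if translate(mutated) == translate(original):
            hits.append(orf_start + start)
    return hits


print(translate(EXCERPT[ORF_START - 1:]))
print(silent_sites(EXCERPT, ORF_START, "AAGCTT", 3))   # HindIII, codon position 3
print(silent_sites(EXCERPT, ORF_START, "TCTAGA", 1))   # XbaI, codon position 1
```

Because the goal is to keep as much of NBLPRZ as possible, the position of interest on the full sequence is the last one returned for each enzyme, not the first.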