Fundamentals of Sequence Analysis, 1998-1999

Problem set 7: Phylogenetic analysis

If you get stuck, refer to the OpenVMS and GCG resources on the class home page.

References: See the GCG and PHYLIP documentation.

Reviews: Felsenstein, J. (1988) Phylogenies from Molecular Sequences: Inference and Reliability. Ann. Rev. of Genetics 22:521-565.

Problem group 1. Artificial data

Create two sets of data (STAR and FORK) as for Problem set 2. For the STAR set, start with PIR1:A1HU and produce sequences that differ from it by 50, 100, 150, 200, 250, 300, 350, and 400 substitutions (no indels). That is, A1HU->A50, A1HU->A100, A1HU->A150, etc. For the FORK set, again start with PIR1:A1HU, but this time have each sequence in the series differ from the preceding one by 50 substitutions. That is, A1HU->A50, A50->A100, A100->A150, and so forth.

Align each set separately using PILEUP. Set the gap penalty so that no gaps are introduced. Save the tree that results from each run.

1A. Use the PARSIMONY method to derive a phylogenetic tree for each set. How do the trees produced by PILEUP compare with the parsimony trees?

1B. Use the FITCH, NEIGHBOR joining, and UPGMA methods to derive phylogenetic trees.

Problem group 2. Real data

2A. The phylogenetic relationships between humans and the great apes have been a topic of great debate. Settle the question to your own satisfaction using the protein sequences in SWISS-PROT.
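
The STAR and FORK constructions in Problem group 1 can be illustrated with a short Python sketch. This is not the procedure from Problem set 2 (which is not reproduced here); it is a minimal illustration under stated assumptions. The function names (`mutate`, `make_star`, `make_fork`, `hamming`) are hypothetical, the root is a random placeholder rather than the real PIR1:A1HU sequence, and each substitution is assumed to hit a uniformly random position, so the same site may be hit more than once.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, n_subs, rng):
    """Apply n_subs substitution events at random positions (no indels).

    Assumption: positions are drawn with replacement, so a site can be
    substituted more than once.
    """
    residues = list(seq)
    for _ in range(n_subs):
        pos = rng.randrange(len(residues))
        # Always replace with a residue different from the current one.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)

def make_star(root, steps, rng):
    """STAR: every sequence is derived directly from the root."""
    return {f"A{n}": mutate(root, n, rng) for n in steps}

def make_fork(root, steps, rng):
    """FORK: each sequence is derived from the preceding one."""
    result = {}
    current, previous = root, 0
    for n in steps:
        current = mutate(current, n - previous, rng)
        result[f"A{n}"] = current
        previous = n
    return result

def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(0)
# Placeholder root of arbitrary length; substitute the real A1HU sequence here.
root = "".join(rng.choice(AMINO_ACIDS) for _ in range(500))
steps = range(50, 401, 50)

star = make_star(root, steps, rng)
fork = make_fork(root, steps, rng)

for name in star:
    print(name, hamming(root, star[name]), hamming(root, fork[name]))
```

Because repeated hits at one site are allowed in this sketch, the observed number of differences from the root can be smaller than the number of substitutions applied, which is worth keeping in mind when comparing branch lengths across the tree-building methods.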