SAPS. Version of March 24, 1993.
Date run: Wed Mar 24 16:12:22 1993

File: testpro
ID HMCU_DROME STANDARD; PRT; 2175 AA.
DE HOMEOBOX PROTEIN CUT.

number of residues: 2175

   1 MQPTLPQAAG TADMDLTAVQ SINDWFFKKE QIYLLAQFWQ QRATLAEKEV NTLKEQLSTG
  61 NPDSNLNSEN SDTAAAAATA AAVAAVVAGA TATNDIEDEQ QQQLQQTASG GILESDSDKL
 121 LNSSIVAAAI TLQQQNGSNL LANTNTPSPS PPLLSAEQQQ QLQSSLQQSG GVGGACLNPK
 181 LFFNHAQQMM MMEAAAAAAA AALQQQQQQQ SPLHSPANEV AIPTEQPAAT VATGAAAAAA
 241 AAATPIATGN VKSGSTTSNA NHTNSNNSHQ DEEELDDEEE DEEEDEDEDD EEENASMQSN
 301 ADDMELDAQQ ETRTEPSATT QQQHQQQDTE DLEENKDAGE ASLNVSNNHN TTDSNNSCSR
 361 KNNNGGNESE QHVASSAEDD DCANNNTNTS NNNNTSNTAT SNTNNNNNNN SSSGNSEKRK
 421 KKNNNNNNGQ PAVLLAAKDK EIKALLDELQ RLRAQEQTHL VQIQRLEEHL EVKRQHIIRL
 481 EARLDKQQIN EALAEATALS AAASTNNNNN SQSSDNNKKL NTAAERPMDA SSNADLPEST
 541 KAPVPAEDDE EDEDQAMLVD SEEAEDKPED SHHDDDEDED EDREAVNATT TDSNELKIKK
 601 EQHSPLDLNV LSPNSAIAAA AAAAAAAACA NDPNKFQALL IERTKALAAE ALKNGASDAL
 661 SEDAHHQQQQ HHQQQHQHQQ QHHQQQHLHQ QHHHHLQQQP NSGSNSNPAS NDHHHGHHLH
 721 GHGLLHPSSA HHLHHQTTES NSNSSTPTAA GNNNGSNNSS SNTNANSTAQ LAASLASTLN
 781 GTKSLMQEDS NGLAAVAMAA HAQHAAALGP GFLPGLPAFQ FAAAQVAAGG DGRGHYRFAD
 841 SELQLPPGAS MAGRLGESLI PKGDPMEAKL QEMLRYNMDK YANQALDTLH ISRRVRELLS
 901 VHNIGQRLFA KYILGLSQGT VSELLSKPKP WDKLTEKGRD SYRKMHAWAC DDNAVMLLKS
 961 LIPKKDSGLP QYAGRGAGGA GGDDSMSEDR IAHILSEASS LMKQSSVAQH REQERRSHGG
1021 EDSHSNEDSK SPPQSCTSPF FKVENQLKQH QHLNPEQAAA QQREREREQR EREQQQRLRH
1081 DDQDKMARLY QELIARTPRE TAFPSFLFSP SLFGGAAGMP GAASNAFPAM ADENMRHVFE
1141 REIAKLQQHQ QQQQAAQAQA QFPNFSSLMA LQQQVLNGAQ DLSLAAAAAK DIKLNGQRSS
1201 LEHSAGSSSC SKDGERDDAY PSSLHGRKSE GGGTPAPPAP PSGPGTGAGA PPTAAPPTGG
1261 ASSNSAAPSP LSNSILPPAL SSQGEEFAAT ASPLQRMASI TNSLITQPPV TPHHSTPQRP
1321 TKAVLPPITQ QQFDMFNNLN TEDIVRRVKE ALSQYSISQR LFGESVLGLS QGSVSDLLAR
1381 PKPWHMLTQK GREPFIRMKM FLEDENAVHK LVASQYKIAP EKLMRTGSYS GSPQMPQGLA
1441 SKMQAASLPM QKMMSELKLQ EPAQAQHLMQ QMQAAAMSAA MQQQQVAQAQ QQAQQAQQAQ
1501 QHLQQQAQQH LQQQQHLAQQ QHPHQQHHQA AAAAAALHHQ SMLLTSPGLP PQHAISLPPS
1561 AGGAQPGGPG GNQGSSNPSN SEKKPMLMPV HGTNAMRSLH QHMSPTVYEM AALTQDLDTH
1621 DITTKIKEAL LANNIGQKIF GEAVLGLSQG SVSELLSKPK PWHMLSIKGR EPFIRMQLWL
1681 SDANNVERLQ LLKNERREAS KRRRSTGPNQ QDNSSDTSSN DTNDFYTSSP GPGSVGSGVG
1741 GAPPSKKQRV LFSEEQKEAL RLAFALDPYP NVGTIEFLAN ELGLATRTIT NWFHNHRMRL
1801 KQQVPHGPAG QDNPIPSRES TSATPFDPVQ FRILLQQRLL ELHKERMGMS GAPIPYPPYF
1861 AAAAILGRSL AGIPGAAAAA GAAAAAAAVG ASGGDELQAL NQAFKEQMSG LDLSMPTLKR
1921 ERSDDYQDDL ELEGGGHNLS DNESLEGQEP EDKTTDYEKV LHKSALAAAA AYMSNAVRSS
1981 RRKPAAPQWV NPAGAVTNPS AVVAAVAAAA AAAADNERII NGVCVMQASE YGRDDTDSNK
2041 PTDGGNDSDH EHAQLEIDQR FMEPEVHIKQ EEDDDEEQSG SVNLDNEDNA TSEQKLKVIN
2101 EEKLRMVRVR RLSSTGGGSS EEMPAPLAPP PPPPAASSSI VSGESTTSSS SSSNTSSSTP
2161 AVTTAAATAA AGWNY

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: DROME.q)

A+ :294(13.5%); C  :  8( 0.4%); D  :114( 5.2%); E  :150( 6.9%); F  : 39( 1.8%)
G  :127( 5.8%); H  : 85( 3.9%); I  : 54( 2.5%); K  : 88( 4.0%); L  :182( 8.4%)
M  : 58( 2.7%); N  :144( 6.6%); P  :129( 5.9%); Q  :203( 9.3%); R  : 83( 3.8%)
S  :215( 9.9%); T  :104( 4.8%); V  : 66( 3.0%); W  : 10( 0.5%); Y- : 22( 1.0%)

KR     :  171 (  7.9%);   ED     :  264 ( 12.1%);   AGP    :  550 ( 25.3%);
KRED   :  435 ( 20.0%);   KR-ED  :  -93 ( -4.3%);   FIKMNY :  405 ( 18.6%);
LVIFM  :  399 ( 18.3%);   ST     :  319 ( 14.7%).
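The per-residue counts and the group totals in the compositional analysis (for example KR, ED and KR-ED) can be reproduced with a few lines of Python. The following is a minimal sketch, not the SAPS implementation: the function names `composition`, `group_count` and `net_charge` are hypothetical, and the flagging of extremes relative to DROME.q (the `+` and `-` marks) is not attempted.

```python
from collections import Counter

def composition(seq):
    """Return {residue: (count, percent)} for a one-letter protein sequence."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, 100.0 * c / n) for aa, c in sorted(counts.items())}

def group_count(seq, group):
    """Count residues belonging to a group such as 'KR', 'ED' or 'AGP'."""
    return sum(1 for aa in seq if aa in group)

def net_charge(seq):
    """KR minus ED, matching the KR-ED entry of the report."""
    return group_count(seq, "KR") - group_count(seq, "ED")

# Residues 271-293 of the sequence above (the first negative charge cluster).
fragment = "DEEELDDEEEDEEEDEDEDDEEE"
comp = composition(fragment)
print(comp["E"], comp["D"])   # counts and percentages for E and D
print(net_charge(fragment))   # net charge of the fragment
```

On this fragment the sketch gives 14 E (60.9%) and 8 D (34.8%), the same figures the report lists for the cluster at 271 to 293.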
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

   1 0000000000 00-0-00000 000-000++- 0000000000 0+0000-+-0 000+-00000
  61 00-00000-0 0-00000000 0000000000 0000-0---0 0000000000 000-0-0-+0
 121 0000000000 0000000000 0000000000 000000-000 0000000000 000000000+
 181 0000000000 00-0000000 0000000000 00000000-0 0000-00000 0000000000
 241 0000000000 0+00000000 0000000000 ----0----- ---------- ---0000000
 301 0--0-0-000 -0+0-00000 0000000-0- -0--0+-00- 0000000000 00-000000+
 361 +000000-0- 0000000--- -000000000 0000000000 0000000000 000000-+++
 421 ++00000000 0000000+-+ -0+000--00 +0+00-0000 0000+0--00 -0++0000+0
 481 -0+0-+0000 -000-00000 0000000000 0000-00++0 0000-+00-0 0000-00-00
 541 +00000---- ----00000- 0--0--+0-- 000------- --+-000000 0-00-0+0++
 601 -00000-000 0000000000 0000000000 0-00+00000 0-+0+0000- 00+0000-00
 661 0--0000000 0000000000 0000000000 0000000000 0000000000 0-00000000
 721 0000000000 00000000-0 0000000000 0000000000 0000000000 0000000000
 781 00+0000--0 0000000000 0000000000 0000000000 0000000000 -0+000+00-
 841 0-00000000 000+00-000 0+0-00-0+0 0-00+000-+ 000000-000 00++0+-000
 901 000000+000 +000000000 00-000+0+0 0-+00-+0+- 00++000000 --000000+0
 961 000++-0000 0000+00000 00--000--+ 000000-000 00+0000000 +-0-++0000
1021 --0000--0+ 0000000000 0+0-000+00 00000-0000 00+-+-+-0+ -+-000+0+0
1081 --0-+00+00 0-000+00+- 0000000000 0000000000 0000000000 0--00+000-
1141 +-00+00000 0000000000 0000000000 0000000000 -00000000+ -0+0000+00
1201 0-00000000 0+-0-+--00 000000++0- 0000000000 0000000000 0000000000
1261 0000000000 0000000000 0000--0000 00000+0000 0000000000 00000000+0
1321 0+00000000 000-000000 0--00++0+- 000000000+ 000-000000 00000-000+
1381 0+0000000+ 0+-000+0+0 00---0000+ 000000+000 -+00+00000 0000000000
1441 0+00000000 0+000-0+00 -000000000 0000000000 0000000000 0000000000
1501 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000
1561 0000000000 0000000000 0-++000000 000000+000 00000000-0 00000-0-00
1621 -000+0+-00 0000000+00 0-00000000 000-000+0+ 0000000+0+ -000+00000
1681 0-0000-+00 00+0-++-00 ++++000000 0-000-0000 -00-000000 0000000000
1741 00000++0+0 000--0+-00 +00000-000 00000-0000 -00000+000 000000+0+0
1801 +000000000 0-00000+-0 000000-000 0+00000+00 -00+-+0000 0000000000
1861 0000000+00 0000000000 0000000000 0000--0000 0000+-0000 0-000000++
1921 -+0--00--0 -0-0000000 -0-00-00-0 --+00-0-+0 00+0000000 0000000+00
1981 +++0000000 0000000000 0000000000 0000-0-+00 000000000- 00+--0-00+
2041 00-000-0-0 -0000-0-0+ 00-0-000+0 -------000 0000-0--00 00-0+0+000
2101 --+0+00+0+ +000000000 --00000000 0000000000 000-000000 0000000000
2161 0000000000 00000

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none

Negative charge clusters (cmin = 12/30 or 16/45 or 19/60):

1) From 271 to 293: DEEELDDEEEDEEEDEDEDDEEE
                    ----0------------------
quartile: 1; size: 23, +count: 0, -count: 22, 0count: 1; t-value: 12.26 *
E: 14 (60.9%); D: 8 (34.8%);

2) From 547 to 582: EDDEEDEDQAMLVDSEEAEDKPEDSHHDDDEDEDED
                    --------00000-0--0--+0--000---------
quartile: 2; size: 36, +count: 1, -count: 24, 0count: 11; t-value: 9.51 *
E: 11 (30.6%); D: 13 (36.1%);

3) From 1924 to 1952: DDYQDDLELEGGGHNLSDNESLEGQEPED
                      --00--0-0-0000000-0-00-00-0--
quartile: 4; size: 29, +count: 0, -count: 12, 0count: 17; t-value: 4.82
L: 4 (13.8%); G: 4 (13.8%); E: 6 (20.7%); D: 6 (20.7%);

4)*From 2034 to 2088: DDTDSNKPTDGGNDSDHEHAQLEIDQRFMEPEVHIKQEEDDDEEQSGSVNLDNED
                      --0-00+00-000-0-0-0000-0-0+00-0-000+0-------0000000-0--
quartile: 4; size: 55, +count: 3, -count: 21, 0count: 31; t-value: 4.68
E: 9 (16.4%); D: 12 (21.8%);

Mixed charge clusters (cmin = 16/30 or 22/45 or 27/60):

1) From 1063 to 1085: REREREQREREQQQRLRHDDQDK
                      +-+-+-0+-+-000+0+0--0-+
quartile: 2; size: 23, +count: 8, -count: 8, 0count: 7; t-value: 5.94 *
E: 5 (21.7%); D: 3 (13.0%); R: 7 (30.4%); Q: 5 (21.7%);

2)*From 2033 to 2111: see sequence above
                      see sequence above
quartile: 4; size: 79, +count: 11, -count: 24, 0count: 44; t-value: 5.40 *
E: 12 (15.2%); D: 12 (15.2%);

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.

______________________________________
High scoring negative charge segments:

score=  2.00 frequency= 0.121 ( ED )
score=  0.00 frequency= 0.000 ( BZX )
score= -1.00 frequency= 0.800 ( LAGSVTIPNFQYHMCW )
score= -2.00 frequency= 0.079 ( KR )
Expected score/letter: -0.714
M_0.01= 13.36; M_0.05= 11.39

1) From 271 to 293: length= 23, score=43.00 **
 271 DEEELDDEEE DEEEDEDEDD EEE
E: 14(60.9%); D: 8(34.8%);

2) From 547 to 582: length= 36, score=35.00 **
 547 EDDEEDEDQA MLVDSEEAED KPEDSHHDDD EDEDED
E: 11(30.6%); D: 13(36.1%);

___________________________________
High scoring mixed charge segments:

score=  1.00 frequency= 0.200 ( KEDR )
score=  0.00 frequency= 0.000 ( BZX )
score= -1.00 frequency= 0.800 ( LAGSVTIPNFQYHMCW )
Expected score/letter: -0.600
M_0.01= 8.29; M_0.05= 7.11

1) From 271 to 293: length= 23, score=21.00 **
 271 DEEELDDEEE DEEEDEDEDD EEE
E: 14(60.9%); D: 8(34.8%);

2) From 547 to 584: length= 38, score=16.00 ** (pocket at 555 to 559: length= 5, score=-5.00)
 547 EDDEEDED |QA MLV| DSEEAED KPEDSHHDDD EDEDEDRE
E: 12(31.6%); D: 13(34.2%);

3) From 1063 to 1082: length= 20, score= 8.00 * (pocket at 1074 to 1076: length= 3, score=-3.00)
1063 REREREQRER E |QQQ| RLRHDD
E: 5(25.0%); D: 2(10.0%); R: 7(35.0%); Q: 4(20.0%);

________________________________
High scoring uncharged segments:

score=  1.00 frequency= 0.800 ( LAGSVTIPNFQYHMCW )
score=  0.00 frequency= 0.000 ( BZX )
score= -8.00 frequency= 0.200 ( KEDR )
Expected score/letter: -0.800
M_0.01= 57.49; M_0.05= 47.95

1) From 120 to 270: length=151, score=97.00 **
 120 LLNSSIVAAA ITLQQQNGSN LLANTNTPSP SPPLLSAEQQ QQLQSSLQQS
 170 GGVGGACLNP KLFFNHAQQM MMMEAAAAAA AAALQQQQQQ QSPLHSPANE
 220 VAIPTEQPAA TVATGAAAAA AAAATPIATG NVKSGSTTSN ANHTNSNNSH
 270 Q
A: 32(21.2%); S: 16(10.6%); Q: 21(13.9%);

2) From 664 to 830: length=167, score=122.00 **
 664 AHHQQQQHHQ QQHQHQQQHH QQQHLHQQHH HHLQQQPNSG SNSNPASNDH
 714 HHGHHLHGHG LLHPSSAHHL HHQTTESNSN SSTPTAAGNN NGSNNSSSNT
 764 NANSTAQLAA SLASTLNGTK SLMQEDSNGL AAVAMAAHAQ HAAALGPGFL
 814 PGLPAFQFAA AQVAAGG
A: 25(15.0%); S: 19(11.4%); Q: 25(15.0%); H: 28(16.8%);

3) From 1231 to 1318: length= 88, score=61.00 ** (pocket at 1285 to 1286: length= 2, score=-16.00)
1231 GGGTPAPPAP PSGPGTGAGA PPTAAPPTGG ASSNSAAPSP LSNSILPPAL
1281 SSQG |EE| FAAT ASPLQRMASI TNSLITQPPV TPHHSTPQ
A: 14(15.9%); G: 10(11.4%); S: 13(14.8%); T: 9(10.2%); P: 19(21.6%);

4) From 1462 to 1581: length=120, score=120.00 **
1462 PAQAQHLMQQ MQAAAMSAAM QQQQVAQAQQ QAQQAQQAQQ HLQQQAQQHL
1512 QQQQHLAQQQ HPHQQHHQAA AAAAALHHQS MLLTSPGLPP QHAISLPPSA
1562 GGAQPGGPGG NQGSSNPSNS
A: 24(20.0%); Q: 38(31.7%);

C. CHARGE RUNS AND PATTERNS.

pattern   (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)|
lmin0      5 |   6 |   7 |  48 |  10 |  11 |  14 |  12 |  14 |  17 |
lmin1      6 |   7 |   9 |  57 |  12 |  14 |  17 |  14 |  17 |  21 |
lmin2      7 |   9 |  11 |  64 |  13 |  15 |  19 |  16 |  19 |  23 |

(0)  59(1,0,0); at  120- 179: see sequence above
    (1. quartile)
    L: 10 (16.7%); A: 6 (10.0%); S: 9 (15.0%); N: 6 (10.0%); Q: 10 (16.7%); ST: 12 (20.0%);

(-)  22(1,0,0); at  271- 293: DEEELDDEEEDEEEDEDEDDEEE
    (1. quartile)             ----0------------------

(+)   5(0,0,0); at  418- 422: KRKKK
    (1. quartile)             +++++

(-)   8(0,0,0); at  547- 554: EDDEEDED
    (2. quartile)             --------

(-)   9(0,0,0); at  574- 582: DDDEDEDED
    (2. quartile)             ---------

(0)  74(1,0,0); at  664- 738: see sequence above
    (2. quartile)
    Q: 20 (26.7%); H: 26 (34.7%);

(0)  73(2,0,0); at  713- 787: see sequence above
    (2. quartile)
    L: 8 (10.7%); A: 8 (10.7%); S: 14 (18.7%); T: 8 (10.7%); N: 11 (14.7%); H: 12 (16.0%); ST: 22 (29.3%);

(*)  10(1,0,0); at 1063-1073: REREREQRERE
    (2. quartile)             +-+-+-0+-+-

(0)  54(0,0,0); at 1231-1284: see sequence above
    (3. quartile)
    A: 10 (18.5%); G: 10 (18.5%); S: 9 (16.7%); P: 14 (25.9%); ST: 13 (24.1%);

(0) 122(1,0,0); at 1459-1581: see sequence above
    (3. quartile)
    A: 24 (19.5%); Q: 39 (31.7%);

(-)   7(0,0,0); at 2071-2077: EEDDDEE
    (4. quartile)             -------

Run count statistics:

+ runs >=  3:  3, at 418; 1701; 1981;
- runs >=  4:  6, at 271; 276; 378; 547; 574; 2071;
* runs >=  5:  6, at 276; 417; 547; 574; 1063; 2071;
0 runs >= 32:  8, at 120; 382; 664; 740; 790; 1146; 1231; 1462;

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

__________________________________
High scoring hydrophobic segments:

2.00 (LVIFM) 1.00 (AGYCW) 0.00 (BZX) -2.00 (PH) -4.00 (STNQ) -8.00 (KEDR)
Expected score/letter: -2.443
M_0.01= 20.26; M_0.05= 17.35

1) From 792 to 830: length= 39, score=19.00 * (pocket at 801 to 804: length= 4, score=-7.00)
 792 GLAAVAMAA |H AQH| AAALGPG FLPGLPAFQF AAAQVAAGG
L: 4(10.3%); A: 15(38.5%); G: 6(15.4%);

2) From 1870 to 1891: length= 22, score=22.00 **
1870 LAGIPGAAAA AGAAAAAAAV GA
A: 14(63.6%); G: 4(18.2%);

____________________________________
High scoring transmembrane segments:

5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST) -6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)
Expected score/letter: -4.673
M_0.01= 47.40; M_0.05= 40.40; M_0.30= 32.07

1) From 74 to 90: length= 17, score=39.00
  74 AAAAATAAAV AAVVAGA
A: 12(70.6%); V: 3(17.6%);

2) From 1860 to 1894: length= 35, score=54.00 ** (pocket at 1868 to 1869: length= 2, score=-18.00)
1860 FAAAAILG |RS| LAGIPGAAAA AGAAAAAAAV GASGG
A: 18(51.4%); G: 7(20.0%);

3) From 2000 to 2014: length= 15, score=35.00
2000 SAVVAAVAAA AAAAA
A: 11(73.3%); V: 3(20.0%);

2. SPACINGS OF C.

H2N-175-C-181-C-23-C-246-C-320-C-85-C-173-C-813-C-151-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 6

Aligned matching blocks:

[  99- 105] EQQQQLQ
[ 157- 163] EQQQQLQ
______________________________
[ 284- 289] EDEDED
[ 577- 582] EDEDED
______________________________
[ 403- 413] TNNNNNNNSSS
[ 505- 514] TNNNNNSQSS

with superset:
[ 389- 394] TSNNNN
[ 403- 408] TNNNNN
[ 505- 510] TNNNNN
______________________________
[ 667- 690] QQ_QQHHQQQHQHQQQHHQQQHLHQ
[1505-1529] QQAQQHLQQQQHLAQQQHPHQQHHQ

with superset:
[ 204- 209] QQQQQQ
[ 322- 327] QQHQQQ
[ 669- 674] QQHHQQ
[ 674- 679] QQHQHQ
[ 680- 685] QQHHQQ
[1147-1152] QQHQQQ
[1500-1505] QQHLQQ
[1508-1513] QQHLQQ

and:
[ 669- 680] QQH_HQQQHQHQQ
[ 674- 685] QQHQHQQQH_HQQ
[ 680- 691] QQH_HQQQHLHQQ
[1508-1520] QQHLQQQQHLAQQ

and:
[ 669- 687] QQH_HQQQHQHQQQHHQQQH
[ 674- 692] QQH_QHQQQHHQQQHLHQQH
[1508-1527] QQHLQQQQHLAQQQHPHQQH
______________________________
[         ]--------[ 914- 931]
[1362-1398]-( -32)-[1367-1384]
[1640-1676]-( -32)-[1645-1662]

[1362-1398] FGESVLGLSQGSVSDLLARPKPWHMLTQKGREPFIRM
[1640-1676] FGEAVLGLSQGSVSELLSKPKPWHMLSIKGREPFIRM

[ 914- 931] LGLSQGTVSELLSKPKPW
[1367-1384] LGLSQGSVSDLLARPKPW
[1645-1662] LGLSQGSVSELLSKPKPW
______________________________
[1152-1161] QQQAAQA_QAQ
[1490-1500] QQQAQQAQQAQ

with superset:
[ 101- 106] QQQLQQ
[ 204- 209] QQQQQQ
[ 321- 326] QQQHQQ
[1152-1157] QQQAAQ
[1490-1495] QQQAQQ
[1504-1509] QQQAQQ
______________________________
Simple tandem repeats:

[ 668- 676] Q__QQHHQQQH
[ 677- 687] QHQQQHHQQQH
[ 688- 693] LHQ_QHH

[1490-1503] QQQAQQAQQAQQHL
[1504-1517] QQQAQQHLQQQQHL
[1518-1526] AQ__QQHPHQQ

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)

Repeat core block length: 10

Aligned matching blocks:

[ 576- 587] -------+-s_in
[2071-2083] -------nosoin

with superset:
[ 286- 295] --------ns
[ 547- 556] --------ns
[ 576- 585] -------+-s
[2071-2080] -------nos
______________________________
[1359-1398] n+iis-oiisionsoio-iis+p+pahmion+s+-pii+m
[1637-1676] n+iis-siisionsoio-iio+p+pahmioi+s+-pii+m

with superset:
[ 913- 930] iisionsoio-iio+p+p
[1366-1383] iisionsoio-iis+p+p
[1644-1661] iisionsoio-iio+p+p

--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets: 247 (Expected range: 108--178) high

   1 .......AA. .......... .....FFKK. ...LL....Q Q......... ..........
  61 .......... ...AAAAA.A AA.AAVV... .........Q QQQ.QQ...G G........L
 121 L.SS..AAA. ..QQQ....L L......... PPLL...QQQ Q..SS.QQ.G G.GG......
 181 .FF...QQMM MM.AAAAAAA AA.QQQQQQQ .......... .......AA. ....AAAAAA
 241 AAA....... .....TT... .....NN... .EEE.DDEEE .EEE....DD EEE.......
 301 .DD.....QQ ........TT QQQ.QQQ... ..EE...... ......NN.. TT..NN....
 361 .NNNGG.... ....SS..DD D..NNN.... NNNN...... ...NNNNNNN SSS......K
 421 KKNNNNNN.. ...LLAA... ....LL.... .......... ......EE.. ......II..
 481 ......QQ.. .......... AAA..NNNNN ..SS.NNKK. ..AA...... SS........
 541 .......DDE E......... .EE....... .HHDDD.... ........TT T.......KK
 601 .......... .......AAA AAAAAAAA.. ........LL .......AA. ..........
 661 ....HHQQQQ HHQQQ...QQ QHHQQQ...Q QHHHH.QQQ. .......... ..HHH.HH..
 721 ...LL..SS. HH.HH.TT.. ...SS...AA .NNN..NNSS S......... .AA.......
 781 .......... ...AA...AA ....AAA... .......... .AAA..AAGG ..........
 841 .....PP... .......... .......... .......... .......... ..RR...LL.
 901 .......... .......... ...LL..... .......... .......... DD....LL..
 961 ...KK..... .......GG. GGDD...... ........SS ....SS.... ....RR..GG
1021 .......... .PP......F F......... .......AAA QQ........ ...QQQ....
1081 DD........ .......... .......... ...GGAA... .AA....... ..........
1141 ......QQ.Q QQQQAA.... .....SS... .QQQ...... ....AAAAA. ........SS
1201 ......SSS. ......DD.. .SS....... GGG...PP.P P......... PP.AAPP.GG
1261 .SS..AA... ......PP.. SS..EE.AA. .......... .......PP. ..HH......
1321 .....PP..Q QQ....NN.. .....RR... .......... .......... ......LL..
1381 .......... .......... .......... .......... .......... ..........
1441 ....AA.... ..MM...... .........Q Q..AAA..AA .QQQQ....Q QQ.QQ.QQ.Q
1501 Q..QQQ.QQ. .QQQQ...QQ Q...QQHH.A AAAAAA.HH. ..LL.....P P......PP.
1561 .GG...GG.G G...SS.... ..KK...... .......... .......... AA........
1621 ..TT.....L L.NN...... .......... ....LL.... .......... ..........
1681 ...NN..... LL...RR... .RRR.....Q Q..SS..SS. .......SS. .........G
1741 G.PP.KK... ...EE..... .......... .......... .......... ..........
1801 .QQ....... .......... .......... ...LLQQ.LL .......... ......PP..
1861 AAAA...... .....AAAAA .AAAAAAA.. ..GG...... .......... ..........
1921 ...DD..DD. ...GGG.... .......... ...TT..... ......AAAA A.......SS
1981 RR..AA.... .......... .VVAA.AAAA AAAA....II .......... ...DD.....
2041 ...GG..... .......... .......... EEDDDEE... .......... ..........
2101 EE.......R R.SS.GGGSS EE......PP PPPPAASSS. .....TTSSS SSS..SSS..
2161 ..TTAAA.AA A....

2. Histogram of spacings between consecutive amino acid multiplets:
(1-5) 148 (6-10) 49 (11-20) 33 (>=21) 18

3. Clusters of amino acid multiplets (cmin = 17/30 or 22/45 or 27/60): none

4. Significant specific amino acid multiplet counts:

Letter  Count   %    Observed (Critical number)

G       127    5.8   19 (18) at
         110 (l= 2)   170 (l= 2)   173 (l= 2)   365 (l= 2)   829 (l= 2)
         978 (l= 2)   981 (l= 2)  1019 (l= 2)  1114 (l= 2)  1231 (l= 3)
        1259 (l= 2)  1562 (l= 2)  1567 (l= 2)  1570 (l= 2)  1740 (l= 2)
        1893 (l= 2)  1934 (l= 3)  2044 (l= 2)  2116 (l= 3)

Q       203    9.3   38 (34) at
          40 (l= 2)   100 (l= 4)   105 (l= 2)   133 (l= 3)   158 (l= 4)
         167 (l= 2)   187 (l= 2)   204 (l= 7)   309 (l= 2)   321 (l= 3)
         325 (l= 3)   487 (l= 2)   667 (l= 4)   673 (l= 3)   679 (l= 3)
         684 (l= 3)   690 (l= 2)   697 (l= 3)  1061 (l= 2)  1074 (l= 3)
        1147 (l= 2)  1150 (l= 5)  1172 (l= 3)  1330 (l= 3)  1470 (l= 2)
        1482 (l= 4)  1490 (l= 3)  1494 (l= 2)  1497 (l= 2)  1500 (l= 2)
        1504 (l= 3)  1508 (l= 2)  1512 (l= 4)  1519 (l= 3)  1525 (l= 2)
        1710 (l= 2)  1802 (l= 2)  1836 (l= 2)

H        85    3.9   12 (11) at
         572 (l= 2)   665 (l= 2)   671 (l= 2)   682 (l= 2)   692 (l= 4)
         713 (l= 3)   717 (l= 2)   731 (l= 2)   734 (l= 2)  1313 (l= 2)
        1527 (l= 2)  1538 (l= 2)

5. Long amino acid multiplets (>= 5; Letter/Length/Position):
A/5/74 A/9/194 Q/7/204 A/9/235 N/7/404 N/6/423 N/5/506 A/11/618 Q/5/1150
A/5/1185 A/7/1530 A/5/1876 A/7/1882 A/5/1967 A/8/2007 P/6/2129 S/6/2148

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 56 (Expected range: 21-- 60)
19 +plets (f+: 7.9%), 37 -plets (f-: 12.1%)
Total number of charge altplets: 49 (Critical number: 62)

2. Histogram of spacings between consecutive charge multiplets:
(1-5) 13 (6-10) 8 (11-20) 6 (>=21) 30

3. Long charge multiplets (>= 5; Letter/Length/Position):
-/18/276 +/5/418 -/8/547 -/9/574 -/7/2071

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 6)

Location   Period  Element     Copies  Core    Errors
  74-  85       1  A               10     5    2
  74-  93       2  A.               9     6 !  1
 100- 106       1  Q                6     4    1
 189- 192       1  M                4     4    0
 194- 202       1  A                9     9 !  0
 204- 210       1  Q                7     7 !  0
 235- 243       1  A                9     9 !  0
 274- 289       4  E.ED             4     4    /0/./1/0/
 321- 328       2  Q.               4     4    0
 344- 367       6  N.....           4     4    0
 388- 432       9  N........        5     5    0
 391- 394       1  N                4     4    0
 402- 411       2  N.               5     5    0
 423- 428       1  N                6     6 !  0
 506- 510       1  N                5     5    0
 566- 585       4  D...             5     5    0
 616- 631       2  A.               8     8 !  0
 622- 653       8  A.......         4     4    0
 667- 670       1  Q                4     4    0
 669- 688       4  Q...             5     5    0
 670- 704       7  Q......          5     5    0
 692- 695       1  H                4     4    0
 714- 737       6  H.....           4     4    0
 754- 769       4  N...             4     4    0
1062-1089       7  QR.R...          4     4    /0/1/./1/./././
1147-1154       1  Q                7     5    1
1185-1189       1  A                5     5    0
1235-1246       3  P..              4     4    0
1243-1250       2  G.               4     4    0
1301-1325       5  T....            5     5    0
1464-1517       6  Q..Q..           8     6 !  /1/././3/././
1482-1485       1  Q                4     4    0
1482-1523       3  Q..             12     7 !  2
1482-1515       2  Q.              13     4    4
1497-1532       4  Q...             8     5    1
1512-1515       1  Q                4     4    0
1530-1536       1  A                7     7 !  0
1729-1760       8  S.......         4     4    0
1861-1864       1  A                4     4    0
1876-1888       1  A               12     7 !  1
1965-1972       2  A.               4     4    0
2004-2014       1  A               10     8 !  1
2129-2134       1  P                6     6 !  0
2148-2153       1  S                6     6 !  0

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 7)
and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)

Location   Period  Element     Copies  Core    Errors
 271- 293       1  -               22    18 !  1
 272- 311       5  --...            7     5    /1/2/./././
 328- 342       3  -0.              5     5    0
 418- 422       1  +                5     5    0
 547- 554       1  -                8     8 !  0
 550- 585       4  -...             8     6    1
 574- 584       1  *               11    11 !  0
1063-1073       1  *               10     6    1
2071-2077       1  -                7     7 !  0

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location   (Quartile)  Spacing     Rank        P-value  Interpretation
  33- 836  (1.)        Y( 803)Y    1 of  23    0.0009   large 1. maximal spacing
  42- 313  (1.)        R( 271)R    1 of  84    0.0011   large 1. maximal spacing
 118- 271  (1.)        D( 153)D    2 of 115    0.0001   large 2. maximal spacing
 183- 636  (1.)        F( 453)F    1 of  40    0.0040   large 1. maximal spacing
 429- 655  (1.)        G( 226)G    1 of 128    0.0001   large 1. maximal spacing
 643- 833  (2.)        R( 190)R    2 of  84    0.0003   large 2. maximal spacing
 653- 783  (2.)        +( 130)+    1 of 172    0.0029   large 1. maximal spacing
1230-1285  (3.)        *(  55)*    2 of 436    0.0000   large 2. maximal spacing
1401-1640  (3.)        F( 239)F    2 of  40    0.0415   large 2. maximal spacing
1404-1616  (3.)        D( 212)D    1 of 115    0.0007   large 1. maximal spacing
1406-1572  (3.)        N( 166)N    1 of 145    0.0010   large 1. maximal spacing
1429-1608  (3.)        Y( 179)Y    2 of  23    0.9787   small 2. maximal spacing
1438-1548  (3.)        G( 110)G    2 of 128    0.0066   large 2. maximal spacing
1458-1583  (3.)        +( 125)+    2 of 172    0.0000   large 2. maximal spacing
1461-1582  (3.)        *( 121)*    1 of 436    0.0000   large 1. maximal spacing
1461-1582  (3.)        -( 121)-    1 of 265    0.0000   large 1. maximal spacing
1813-1901  (4.)        N(  88)N    2 of 145    0.0318   large 2. maximal spacing
1958-2015  (4.)        -(  57)-    2 of 265    0.0084   large 2. maximal spacing
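
Each row of the spacing analysis gives the positions of two consecutive occurrences of a residue (or charge class) and the distance between them; for example Y( 803)Y at 33- 836 is 836 - 33 = 803. The Python sketch below finds the largest such gaps for a chosen set of letters. It is an illustration only: the function name `spacings` is hypothetical, only gaps between consecutive occurrences are considered, and the rank denominators and P-values that SAPS reports are not reproduced. The denominators in the table (23 spacings for 22 Y residues) suggest that SAPS also counts the two terminal gaps, which this sketch leaves out.

```python
def spacings(seq, letters):
    """Return (start, end, gap) for consecutive occurrences of any letter in
    `letters`, using 1-based positions, sorted from largest gap to smallest."""
    positions = [i for i, aa in enumerate(seq, start=1) if aa in letters]
    gaps = [(a, b, b - a) for a, b in zip(positions, positions[1:])]
    return sorted(gaps, key=lambda g: g[2], reverse=True)

# Toy sequence (not taken from the report): Y occurs at positions 1, 5 and 7.
toy = "YAAAYAY"
for start, end, gap in spacings(toy, "Y"):
    print(f"{start:4d}-{end:4d}  Y({gap:4d})Y")
```

Passing a class such as "KR" or "ED" instead of a single letter gives the gaps for the + and - charge classes in the same way.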