Created sample texts in English ("The Mysterious Affair at Styles")
and Portuguese (Rober M. Rosi's master's thesis), with punctuation
removed and all letters folded to lowercase.

Extracted letter-in-context statistics for "t" and "d", "p" and "b"
in those texts. Results posted to the Voynich list, in reply to
Jacques Guy.

Extracted comparative statistics for \c/ strings beginning
with \cs/ and \c/:

    cat bio-j-jsa-gut.wds \
      | sed \
          -e 's/cs/S/g' \
          -e 's/[ql][gj]/H/g' \
          -e 's/cg/8/g' \
          -e 's/cy/9/g' \
          -e 's/ci/a/g' \
      | sed -e 's/^/_/g' -e 's/$/_/g' \
      | compare-contexts -rctx 1 '_cc[cS]*' '_Sc[cS]*' > .foo

      192 0.33 _ccc8           219 0.38 _Scc8
       97 0.17 _cccH            80 0.14 _SccH
       67 0.11 _ccc9            71 0.12 _Scc9
       52 0.09 _ccccH           56 0.10 _ScccH
       31 0.05 _cccc9           38 0.07 _Sccc8
       25 0.04 _ccco            30 0.05 _Scco
       22 0.04 _cc8             23 0.04 _Sccc9
       20 0.03 _cccc8           19 0.03 _Sco
       17 0.03 _cco              9 0.02 _Sc9
       17 0.03 _ccH              7 0.01 _Sca
       13 0.02 _cca              6 0.01 _Scca
       11 0.02 _ccca             6 0.01 _ScH
        8 0.01 _cccS_            6 0.01 _Sc8
        6 0.01 _cc9              3 0.01 _Sccco
        3 0.01 _ccccS_           1 0.00 _SccccH
        1 0.00 _cccco            1 0.00 _ScccS_
        1 0.00 _ccccco           1 0.00 _SccS_
        1 0.00 _cccca            1 0.00 _Sc_
        1 0.00 _cccSa        ----- ---- ----
        1 0.00 _cccS9          577 1.00 TOT
    ----- ---- ----
      586 1.00 TOT

Trying yet another variant encoding, "huc":

    cat bio-m-jsa.evt \
      | jsa2huc \
      | make-consensus-interlin \
      > bio-x-huc.evt

    cat bio-x-huc.evt \
      | egrep '^<.*;J> ' \
      | sed \
          -e 's/{[^}]*}//g' \
      > bio-j-huc.evt

    extract-words-from-interlin \
      -chars "mnrfcgiaoAeHP" \
      bio-j-huc.evt \
      bio-j-huc

jsa2huc
-------------------------------------------------
#! /n/gnu/bin/sed -f
# Recoding superanalytic to ad-hoc encoding:
/^[^#]/s/ij/f/g
/^[^#]/s/ix/e/g
/^[^#]/s/cy/X/g
/^[^#]/s/ci/X/g
/^[^#]/s/iiiiu/m/g
/^[^#]/s/iiiu/m/g
/^[^#]/s/iiu/n/g
/^[^#]/s/iis/v/g
/^[^#]/s/is/r/g
/^[^#]/s/X/ci/g
/^[^#]/s/cs/c/g
/^[^#]/s/qo/A/g
/^[^#]/s/qj/H/g
/^[^#]/s/qg/P/g
/^[^#]/s/lj/H/g
/^[^#]/s/lg/P/g
-------------------------------------------------

     lines   words     bytes file
    ------ ------- --------- ------------
      7098    7098     48799 bio-j-huc.wds
      1705    1705     14931 bio-j-huc.dic
      4742    4742     33342 bio-j-huc-gut.wds
       737     737      5720 bio-j-huc-gut.dic
       892     892      2815 bio-j-huc-fun.wds
        42      42       296 bio-j-huc-fun.dic
      1464    1464     12642 bio-j-huc-bad.wds
       926     926      8915 bio-j-huc-bad.dic

In the three tables below, each row is the first symbol of a pair
and each column the second. The unlabeled row and column stand for
the word boundary, and "." marks an entry that is zero or rounds
to zero.

Digraph counts:

                      m     n     r     f     c     g     i     o     A     e     H     P   TOT
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
            .     .     .    81     .  1917     .     .   991  1243   284   158    68  4742
    m     350     .     .     .     .     3     .     .     .     .     .     .     .   353
    n     110     .     .     .     .     .     .     .     1     .     .     .     .   111
    r     415     .     .     .     .   106     .     .    37     .     .     .     .   558
    f      36     .     .     .     .     .     .     .     1     .     .     .     .    37
    c      47     .     .     .     .  7238  2147  4186   248     .    15   377    32 14290
    g      49     .     .     1     .  2051     .     .    37     .     5     4     .  2147
    i    2862   341   109   290    33    52     .     2     3     2   423    69     2  4188
    o      14    12     1   177     4    37     .     .     .     4   765   478    42  1534
    A       7     .     1     7     .    30     .     .     1     1   133  1047    24  1251
    e     840     .     .     2     .   489     .     .   104     1     4   185    10  1635
    H      10     .     .     .     .  2227     .     .    75     .     6     .     .  2318
    P       2     .     .     .     .   140     .     .    36     .     .     .     .   178
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT  4742   353   111   558    37 14290  2147  4188  1534  1251  1635  2318   178 33342

Next-symbol probability (× 99):

                      m     n     r     f     c     g     i     o     A     e     H     P   TOT
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
            .     .     .     2     .    40     .     .    21    26     6     3     1    99
    m      98     .     .     .     .     1     .     .     .     .     .     .     .    99
    n      98     .     .     .     .     .     .     .     1     .     .     .     .    99
    r      74     .     .     .     .    19     .     .     7     .     .     .     .    99
    f      96     .     .     .     .     .     .     .     3     .     .     .     .    99
    c       .     .     .     .     .    50    15    29     2     .     .     3     .    99
    g       2     .     .     .     .    95     .     .     2     .     .     .     .    99
    i      68     8     3     7     1     1     .     .     .     .    10     2     .    99
    o       1     1     .    11     .     2     .     .     .     .    49    31     3    99
    A       1     .     .     1     .     2     .     .     .     .    11    83     2    99
    e      51     .     .     .     .    30     .     .     6     .     .    11     1    99
    H       .     .     .     .     .    95     .     .     3     .     .     .     .    99
    P       1     .     .     .     .    78     .     .    20     .     .     .     .    99
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT    14     1     0     2     0    42     6    12     5     4     5     7     1 33342

Previous-symbol probability (× 99):

                      m     n     r     f     c     g     i     o     A     e     H     P   TOT
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
            .     .     .    14     .    13     .     .    64    98    17     7    38    14
    m       7     .     .     .     .     .     .     .     .     .     .     .     .     1
    n       2     .     .     .     .     .     .     .     .     .     .     .     .     0
    r       9     .     .     .     .     1     .     .     2     .     .     .     .     2
    f       1     .     .     .     .     .     .     .     .     .     .     .     .     0
    c       1     .     .     .     .    50    99    99    16     .     1    16    18    42
    g       1     .     .     .     .    14     .     .     2     .     .     .     .     6
    i      60    96    97    51    88     .     .     .     .     .    26     3     1    12
    o       .     3     1    31    11     .     .     .     .     .    46    20    23     5
    A       .     .     1     1     .     .     .     .     .     .     8    45    13     4
    e      18     .     .     .     .     3     .     .     7     .     .     8     6     5
    H       .     .     .     .     .    15     .     .     5     .     .     .     .     7
    P       .     .     .     .     .     1     .     .     2     .     .     .     .     1
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT    99    99    99    99    99    99    99    99    99    99    99    99    99 33342
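
For reference, here is a minimal Python sketch of how digraph tables like the ones above could be tabulated from a word list. It is not the tool that produced them. The function names and the "_" boundary marker are hypothetical, and rounding to the nearest integer is an assumption, though it agrees with entries such as 81/4742 × 99 ≈ 1.69 appearing as 2. Each word is padded with a boundary on both sides, so a word of n symbols contributes n+1 pairs. That matches the grand total of 33342 pairs for the 4742 words and 33342 bytes of bio-j-huc-gut.wds.

```python
from collections import Counter

BOUNDARY = "_"  # hypothetical marker for the word boundary (the unlabeled row/column)

def digraph_counts(words):
    """Count adjacent symbol pairs, padding every word with a boundary on each side."""
    counts = Counter()
    for word in words:
        padded = BOUNDARY + word + BOUNDARY
        for first, second in zip(padded, padded[1:]):
            counts[(first, second)] += 1
    return counts

def next_symbol_scaled(counts, scale=99):
    """Scale each row (fixed first symbol) so that its entries sum to about `scale`."""
    row_totals = Counter()
    for (first, _), n in counts.items():
        row_totals[first] += n
    # Rounding to nearest is an assumption about how the tables were produced.
    return {pair: round(scale * n / row_totals[pair[0]]) for pair, n in counts.items()}

def previous_symbol_scaled(counts, scale=99):
    """Scale each column (fixed second symbol) so that its entries sum to about `scale`."""
    col_totals = Counter()
    for (_, second), n in counts.items():
        col_totals[second] += n
    return {pair: round(scale * n / col_totals[pair[1]]) for pair, n in counts.items()}

if __name__ == "__main__":
    # Toy word list; in practice the words would be read from a .wds file, one per line.
    sample = ["cco", "co"]
    counts = digraph_counts(sample)
    for pair, n in sorted(counts.items()):
        print(pair, n, next_symbol_scaled(counts)[pair], previous_symbol_scaled(counts)[pair])
```

A scale of 99 rather than 100 keeps every entry within two digits, which is presumably why the tables use it.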