In a previous version of the glyph statistics, which treated (qoe) (qor) (oe) (or) as prefix+(e), I noticed a curious coincidence: the counts for (bg) and (g) were almost equal to those of (o) and (qo), respectively. Is there a correlation? Intra-word or inter-word? (Currier had noticed something like it, but he didn't say the numbers were so close!) Let's check:

    cat .voyn.glp \
      | sed \
          -e 's:([_/=]*)::g' \
          -e 's/([b]*g)([^)]*)/@&@/g' \
      | tr '@' '\012' \
      | egrep '^\([b]*g\)' \
      | sort | uniq -c | expand \
      | sort +0.0 -0.7nr \
      | compute-freqs \
      > .voyn-bg.frq

    cat .voyn.glp \
      | sed \
          -e 's:([_/=]*)::g' \
          -e 's/([^)]*)([q]*o[er]*)/@&@/g' \
      | tr '@' '\012' \
      | egrep '\([q]*o[er]*\)$' \
      | sort | uniq -c | expand \
      | sort +0.0 -0.7nr \
      | compute-freqs \
      > .voyn-qo.frq

    compare-freqs \
        .voyn-bg.frq \
        .voyn-qo.frq \
      > .voyn-bg-qo.cmp

    ##      TOTAL .voyn-bg.fr .voyn-qo.fr WORD
    # ----------- ----------- ----------- -----------
       1390 0.204   693 0.205   697 0.202 (bg)(qo)
        802 0.118   401 0.119   401 0.116 (g)(qo)
        396 0.058   197 0.058   199 0.058 (bg)(o)
        272 0.040   135 0.040   137 0.040 (g)(o)
        193 0.028    96 0.028    97 0.028 (bg)(oe)
        184 0.027    92 0.027    92 0.027 (bg)(qoe)
        171 0.025    85 0.025    86 0.025 (g)(oe)
        166 0.024   166 0.049     0 0.000 (g)(e)
        143 0.021   143 0.042     0 0.000 (bg)(e)
        118 0.017   118 0.035     0 0.000 (g)(b)
        114 0.017    57 0.017    57 0.017 (g)(qoe)
        104 0.015   104 0.031     0 0.000 (bg)(b)
         91 0.013    91 0.027     0 0.000 (bg)(tc)
         90 0.013     0 0.000    90 0.026 (z)(oe)
         81 0.012     0 0.000    81 0.023 (an)(o)
         78 0.011     0 0.000    78 0.023 (an)(oe)
         68 0.010     0 0.000    68 0.020 (ar)(oe)
         67 0.010    67 0.020     0 0.000 (bg)(sc)
         63 0.009    63 0.019     0 0.000 (g)(r)
         60 0.009     0 0.000    60 0.017 (b)(oe)
         59 0.009     0 0.000    59 0.017 (am)(o)
         59 0.009    59 0.017     0 0.000 (g)(tc)
         56 0.008    56 0.017     0 0.000 (bg)(r)
         55 0.008     0 0.000    55 0.016 (tc)(oe)
         54 0.008     0 0.000    54 0.016 (am)(oe)
         53 0.008     0 0.000    53 0.015 (ar)(o)
         53 0.008     0 0.000    53 0.015 (d)(oe)
         53 0.008    53 0.016     0 0.000 (g)(z)
         50 0.007    50 0.015     0 0.000 (g)(sc)
             ....        ....        .... ....
So the correlation is not inter-word. Perhaps intra-word?

    cat .voyn.glp \
      | sed \
          -e 's:([_/=]*)::g' \
          -e 's/([b]*g)/&@/g' \
          -e 's/([q]*o[er]*)/@&/g' \
      | tr '@' '\012' \
      | egrep '(^\([q]*o[er]*\))|(\([b]*g\)$)' \
      | sort | uniq -c | expand \
      | sort +0.0 -0.7nr \
      | compute-freqs \
      > .voyn-qo-bg.frq

      290 0.056 (oe)
      165 0.032 (qo)(dc)(bg)
      153 0.029 (qo)(dcc)(bg)
      124 0.024 (or)
       93 0.018 (g)
       88 0.017 (qo)(dcc)(g)
       78 0.015 (oe)(tc)(bg)
       67 0.013 (qo)(d)(g)
       64 0.012 (qo)(d)(an)
       63 0.012 (tc)(bg)
       57 0.011 (sc)(bg)
       55 0.011 (oe)(sc)(bg)
       52 0.010 (e)(tc)(bg)
       52 0.010 (o)(hc)(bg)
       52 0.010 (oe)(g)
       47 0.009 (o)(dc)(bg)
       47 0.009 (qo)(hc)(bg)
       45 0.009 (qo)(d)
       43 0.008 (bg)
       42 0.008 (qo)(dc)(g)
       41 0.008 (qo)(hcc)(bg)
       39 0.007 (o)(dcc)(bg)
       39 0.007 (qo)(d)(ae)
       36 0.007 (qoe)
       35 0.007 (oe)(bg)
       35 0.007 (oe)(tc)(g)
       33 0.006 (oe)(dc)(bg)
       30 0.006 (qo)(h)(g)
       29 0.006 (oe)(dcc)(bg)
       29 0.006 (sc)(g)
       28 0.005 (oe)(dcc)(g)
       26 0.005 (e)(sc)(bg)
       26 0.005 (qo)(d)(am)
       25 0.005 (hc)(bg)
       25 0.005 (o)(hcc)(bg)
       ....      ....

So the correlation does not seem to be intra-word either. Who knows.
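For anyone re-running the first check on a current system, here is a minimal sketch of the "what follows (g) or (bg)" count using present-day option syntax (`grep -E` in place of `egrep`, `sort -k` in place of the obsolete `+pos -pos` keys). It is not the original pipeline: the function name `follows_g` and the toy input are invented for illustration, the input format (glyph groups written as `(x)`, with `(_)`, `(/)`, `(=)` as separators) is an assumption read off the sed patterns above, and the `compute-freqs` step is left out because that helper is not shown here.

```shell
#!/bin/sh
# Sketch only: count which glyph group follows (g) or (bg).
# Assumption: glyph groups are written "(x)", and "(_)", "(/)", "(=)"
# are separators that can simply be deleted, as the notes' sed suggests.
follows_g() {
  sed -e 's:([_/=]*)::g' \
      -e 's/([b]*g)([^)]*)/@&@/g' \
  | tr '@' '\n' \
  | grep -E '^\(b*g\)' \
  | sort | uniq -c \
  | sort -k1,1nr -k2,2   # highest count first, ties broken by the pair text
}

# Toy input, made up for illustration (not Voynich data).
printf '%s\n' \
  '(qo)(dc)(bg)(_)(qo)(d)(g)(_)(o)(r)' \
  '(tc)(bg)(_)(qo)(e)' \
  | follows_g
```

On the toy input this lists (bg)(qo) with count 2 and (g)(o) with count 1. The mirror-image count (what precedes (o) or (qo)) would follow the same shape, using the second sed expression and the end-anchored pattern from the notes.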