Redoing the statistics. Should I correct the 2/R "mistakes"?
Let's not do it for now. But I will combine H+D, P+F, S+T:

Tetragram frequencies around line breaks, ignoring spaces:

  cat .voyn.fsg \
    | tr -d ' /=' \
    | sed -e 's/^\(..\).*\(..\)$/\1\2/g' \
    | tr -s '\012' ':' \
    | enum-ngraphs -v n=5 \
    | egrep -v '\*' \
    | egrep '^..:..$' \
    > .voyn-nl-2-2.grm

  cat .voyn-nl-2-2.grm \
    | sort | uniq -c | expand \
    | compute-freqs \
    > .voyn-nl-2-2.frq

Tetragram frequencies around blanks (spaces and line breaks):

  cat .voyn.fsg \
    | tr -d '/=' \
    | tr -s ' \012' '__' \
    | enum-ngraphs -v n=7 \
    | egrep -v '\*' \
    | egrep '^..._...$' \
    | sed \
        -e 's/^\(...\)_\(...\)$/\1:\2/g' \
        -e 's/_//g' \
        -e 's/^.*\(..\):\(..\).*$/\1:\2/g' \
    > .voyn-sp-2-2.grm

  cat .voyn-sp-2-2.grm \
    | sort | uniq -c | expand \
    | compute-freqs \
    > .voyn-sp-2-2.frq

Comparisons:

  compare-freqs \
      .voyn-tt-2-2.frq \
      .voyn-nl-2-2.frq \
    | compute-count-ratio \
    | sort +0.0 -0.2r +4 -5nr \
    > .voyn-tt-nl-2-2.cmp

  compare-freqs \
      .voyn-sp-2-2.frq \
      .voyn-nl-2-2.frq \
    | compute-count-ratio \
    | sort +0.0 -0.2r +4 -5nr \
    > .voyn-sp-nl-2-2.cmp
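
For reference, here is a rough sketch of what the line-break pipeline above does, using only standard tools. The function name linebreak_tetragrams is made up for this illustration, and the awk step is only a guess at what enum-ngraphs -v n=5 does (emit every 5-character window of its input, one per line); that behavior is inferred from the egrep filters that follow it, not from the tool itself. The toy input is invented text, not transcription data.

```shell
# Hypothetical stand-in for the line-break tetragram pipeline.
# Assumption: "enum-ngraphs -v n=5" lists every 5-character window
# of its input; the awk loop below emulates just that.
linebreak_tetragrams() {
  tr -d ' /=' \
    | sed -e 's/^\(..\).*\(..\)$/\1\2/' \
    | tr -s '\012' ':' \
    | awk '{ for (i = 1; i + 4 <= length($0); i++) print substr($0, i, 5) }' \
    | grep -v '\*' \
    | grep '^..:..$'
}

# Each line is reduced to its first two and last two characters, the
# lines are joined with ":", and only the windows with ":" in the
# middle survive: last two characters of one line, first two of the next.
printf 'abcd efgh\nijkl mnop\nqrst uvwx\n' | linebreak_tetragrams
```

On the toy input this prints gh:ij and op:qr, one per line: the end of each line paired with the start of the next. Feeding that through sort | uniq -c gives the raw counts that compute-freqs consumes.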