I prepared a WWW page with the word pair tables above.

I also prepared analogous tables for the English and Portuguese texts:

  cat engl.txt | sed -e 's@$@ //@g' | tr ' ' '\012' | egrep '.' > engl2.wds

  cat engl2-keys.dic
    // the a an and of in on at to for with as up but i you he she
    it is was had be my his her him me that mrs john cynthia inglethorp

  cat engl2.wds | tr '[A-Z]' '[a-z]' | head -4661 \
    | enum-word-pairs \
    | count-diword-freqs -v keyfile=engl2-keys.dic \
    > .baz

  cat port.txt | sed -e 's@$@ //@g' | tr ' ' '\012' | egrep '.' > port2.wds

  cat port2-keys.dic
    // a da na o do no ao as das os dos um uma cada de em por para
    e ou como que é ser não são aresta face arestas faces complexo
    vértices celular

  cat port2.wds | sed -e 's/^x$/???/g' | head -7000 \
    | enum-word-pairs \
    | count-diword-freqs -v keyfile=port2-keys.dic \
    > .baz

The results were posted on my Voynich WWW page.

Decided to recompute the tables, adding the left and right probabilities.

  cat .wds \
    | sed -e '/?/s/^.*$/???/g' \
    | enum-word-pairs \
    | count-diword-freqs -v keyfile=.keys \
    > .baz
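
As an illustration of the tokenization step used above, here is a minimal sketch on a two-line sample. The function names `tokenize` and `pairs` are hypothetical; `pairs` is only a guess at what `enum-word-pairs` does (print each pair of adjacent tokens), since that script is not shown in these notes.

```shell
# Hypothetical helper: same pipeline as in the notes, reading stdin.
# Appends " //" to every line (an end-of-line marker), puts one token
# per line, and drops empty lines left by repeated blanks.
tokenize() {
  sed -e 's@$@ //@' | tr ' ' '\012' | grep -E '.'
}

# Assumed stand-in for enum-word-pairs (NOT the original script):
# prints each pair of consecutive tokens on one line.
pairs() {
  awk 'NR > 1 { print prev, $0 } { prev = $0 }'
}

printf '%s\n' 'the cat sat' 'on the mat' | tokenize | pairs
# prints:
#   the cat
#   cat sat
#   sat //
#   // on
#   on the
#   the mat
#   mat //
```

Note that the `//` marker takes part in the pairs, so a word at the start or end of a line is paired with `//` rather than with a word from the neighboring line.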