While preparing the new label location maps (Note-010.html), I got curious about the colocates of some words. Let's start with "daiin", which is very common and almost equally frequent in both languages:

  compare-word-colocates \
    '\bdaiin\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
     84 daiin /                    19 / daiin                    19 / daiin
     46 / daiin                     5 daiin /                     9 daiin /
     30 chol daiin                  4 - daiin                     7 daiin chey
     23 daiin =                     4 daiin chedy                 6 daiin ol
     12 - daiin                     3 chckhy daiin                5 daiin chedy
     11 daiin cthy                  3 chedy daiin                 5 shey daiin
     10 daiin daiin                 3 daiin or                    4 daiin daiin
     10 shol daiin                  3 daiin otal                  4 daiin shedy
      9 chor daiin                  2 ar daiin                    4 daiin shey
      8 daiin -                     2 daiin chcthy                4 qokal daiin

It is reassuring that in both languages "daiin" likes line-start and line-end positions. However, it is curious that in language B "daiin" favors line-starts, while in language A it prefers line-ends.

Let's modify the code so that it ignores line breaks. Let's also map 't' to 'k', final [ao] to y, initial y or qy to o or qo:

  compare-word-colocates \
    '\bd[ao]iin\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
     30 chol daiin                  4 chckhy daiin                7 daiin chey
     23 daiin =                     4 daiin chedy                 6 daiin ol
     15 daiin ckhy                  4 daiin okal                  5 daiin chedy
     13 daiin daiin                 3 ar daiin                    5 daiin okedy
     11 daiin qokchy                3 chedy daiin                 5 qokal daiin
     10 shol daiin                  3 daiin okaiin                5 qoky daiin
      9 chor daiin                  3 daiin or                    5 shey daiin
      9 okol daiin                  3 daiin shody                 4 daiin daiin
      8 chy daiin                   3 okaiin daiin                4 daiin okaiin
      8 ckhy daiin                  2 daiin chckhy                4 daiin shedy

Perhaps some of A's chol, shol, ckhy correspond to B's chey, shey, chedy, shedy.
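As a rough illustration of the kind of counting behind these listings, here is a minimal Python sketch. It is not the actual compare-word-colocates script: the function names normalize and colocates are hypothetical, the word list is assumed to be already loaded in reading order with line breaks dropped, and the sample sequence is a made-up toy, not a quotation from the text. The normalization is only my reading of the three merges described above.

```python
import re
from collections import Counter

def normalize(word):
    """Apply the spelling merges described in the note (assumed reading)."""
    word = word.replace("t", "k")        # map 't' to 'k'
    word = re.sub(r"[ao]$", "y", word)   # final a or o becomes y
    word = re.sub(r"^qy", "qo", word)    # initial qy becomes qo
    word = re.sub(r"^y", "o", word)      # initial y becomes o
    return word

def colocates(words, pattern):
    """Count adjacent pairs in which at least one word matches pattern."""
    rx = re.compile(pattern)
    counts = Counter()
    for left, right in zip(words, words[1:]):
        if rx.search(left) or rx.search(right):
            counts[(left, right)] += 1   # each adjacent pair is counted once
    return counts

# Toy word sequence (hypothetical), normalized before counting.
sample = "chol daiin cthy daiin daiin otal".split()
pairs = colocates([normalize(w) for w in sample], r"\bd[ao]iin\b")
for (left, right), n in pairs.most_common():
    print(n, left, right)
```

Whether the real script counts a pair such as "daiin daiin" once or twice is not stated in the note; the sketch counts it once.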
Let's try with "okaiin":

  compare-word-colocates \
    '\b[q]*[oy]k[ao]iin\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      7 daiin okaiin                5 okaiin okaiin              22 shedy qokaiin
      5 chol okaiin                 4 okaiin =                   17 chedy qokaiin
      4 okaiin =                    3 daiin okaiin               17 qokaiin chedy
      4 okaiin ckhy                 3 okaiin chckhy              13 qokaiin shedy
      4 okaiin okaiin               3 okaiin daiin               12 qokaiin ol
      4 or okaiin                   3 okaiin okar                10 shey qokaiin
      3 okaiin daiin                2 aiin okaiin                 9 chey qokaiin
      3 okaiin s                    2 chckhy okaiin               9 okaiin shedy
      2 ckhor okaiin                2 chdy qokaiin                9 qokaiin checkhy
      2 ckhy qokaiin                2 dain okaiin                 8 qokaiin chckhy

It may be that A's chol is B's chedy/shedy. Another word that is common in both languages is "okal":

  compare-word-colocates \
    '\b[q]*[oy][tk][oa][l]\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      9 okol daiin                  4 daiin okal                 15 qokal chedy
      8 qokol daiin                 3 chdy okal                  12 qokal shedy
      7 okol chol                   3 okal dar                    9 chedy qokal
      5 daiin qokol                 2 aiin okal                   9 qokeedy qokal
      3 ckhor okol                  2 chckhy okal                 9 shedy qokal
      3 dain okol                   2 okaiin okal                 7 qokal dar
      3 okal chol                   2 okal chedy                  7 qokedy qokal
      3 okol dol                    2 okal chody                  6 okal chedy
      3 shor okol                   2 okal dam                    5 qokal daiin
      2 chody okol                  2 okal okair                  5 qokal dy

Here are the counts without the k/t and o/y fixes:

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      6 otol chol                   3 daiin otal                 11 qokal chedy
      5 qokol daiin                 3 okal dar                    9 qokal shedy
      4 daiin qotol                 2 aiin okal                   8 shedy qokal
      4 otol daiin                  2 chckhy okal                 6 chedy qokal
      3 okol daiin                  2 chdy ykal                   6 qokeedy qokal
      3 qotol daiin                 2 okal chedy                  5 qokal daiin
      2 cho qokol                   2 okal okair                  4 okal chedy
      2 cthor otol                  2 okal shdy                   4 qokal dar
      2 daiin otal                  2 qokol chedy                 4 qotal chedy
      2 odaiin okal                 1 chcfhol okal                3 chedy qotal

So it seems that A uses otol/okol where B uses okal/otal. It is tempting to identify A's chol with B's chedy/shedy.

Let's try with "okar", which is also distributed fairly uniformly:

  compare-word-colocates \
    '\b[q]*[oy][tk][oa]r\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      3 daiin qokor                 6 qokar okar                  6 qokar shedy
      3 okor chor                   5 okar chdy                   6 qokeedy qokar
      3 qokor chor                  4 okar ar                     5 chedy qokar
      2 dain qokor                  4 okar or                     5 qokar ol
      2 dy qokor                    3 ar okar                     4 chckhy okar
      2 okor chey                   3 okaiin okar                 4 okar okedy
      2 oky okor                    3 okar chedy                  4 okar ol
      2 qokchy qokor                3 okar okedy                  4 okar shedy
      2 qokor chol                  3 okar ol                     4 shey qokar
      2 qokor daiin                 3 qokar chckhy                3 okar chedy

Again, where A uses "or", B uses "ar". Perhaps A's qokchy is B's qokeedy? Another fairly uniform word is "qokeey":

  compare-word-colocates \
    '\b[q]*[oy][kt][cse][eh][yo]\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
     11 daiin qokchy                3 chedy okeey                 8 qokeey qokedy
      8 okchy kchy                  2 keedy okeey                 6 qokeedy qokeey
      7 daiin okchy                 2 okchy okar                  6 shedy qokeey
      7 qokchy qokchy               2 okeey daiin                 4 chedy qokeey
      5 ckhy okchy                  2 okeey dar                   4 qokeey okeey
      5 okchy daiin                 2 r okeey                     4 qokeey raiin
      5 okeey daiin                 1 alfshe? okshy               3 dar qokeey
      5 qokchy daiin                1 arar okeey                  3 okeey qol
      5 qokchy kchy                 1 chees okeey                 3 qokeey daiin
      4 qokchy qoky                 1 chek qokchy                 3 qokeey qokaiin

Here are the counts without the k/t and y/o fixes:

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      4 daiin qokchy                2 chedy okeey                 6 qokeey qokedy
      4 daiin qotchy                2 r yteey                     6 shedy qokeey
      4 qotchy qokchy               1 alfshe? okshy               4 chedy qokeey
      3 cthy otchy                  1 arar oteey                  3 dar qokeey
      3 okeey daiin                 1 chedy ykeey                 3 qokeey daiin
      3 qotchy daiin                1 chees oteey                 3 qokeey qokaiin
      3 qoteey daiin                1 chek qokchy                 3 qokeey raiin
      2 aiin qotchy                 1 cheody okeey                2 oteey qol
      2 choty qokchy                1 chfalas qokeey              2 pchedy qokeey
      2 daiin otchy                 1 cthy qokeey                 2 qokedy qokeey

Note the near-repetition "qotchy qokchy" in A, and "qokeey qokedy" or "qokedy qokeey" in B.

Now "otam", also fairly uniform:

  compare-word-colocates \
    '\b[q]*[oy][tk][ao][mjg]\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      1 char okam                   2 chdam qokam                 1 chcphey qokam
      1 chol qokom                  2 daiin okam                  1 chedy qokam
      1 ckham okom                  2 qokar okam                  1 lchey qokam
      1 ckhor okam                  1 aiin okam                   1 okam olaiin
      1 dar okom                    1 akedy okam                  1 okar okam
      1 kal okam                    1 ar okam                     1 qokam chedy
      1 kchody qokam                1 chdar okam                  1 qokam okal
      1 okam =                      1 chdy okam                   1 qokam qokaiin
      1 okam chckh                  1 checkhy okam                1 qokam s
      1 okam chol                   1 chekeedy okam               1 qokam sol

Can't say much... Next is "chey", also very uniform:

  compare-word-colocates \
    '\b[q]*[cse][eh]ey\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      5 chey kchy                   3 chey =                     10 shey qokaiin
      3 cheor chey                  2 dar chey                    9 chey qokaiin
      3 chey keey                   2 qoky chey                   7 daiin chey
      3 dar shey                    2 shey daiin                  7 qol chey
      3 dy shey                     2 shey qokaiin                6 qokaiin shey
      2 chey dam                    1 ar shey                     5 chey qokeedy
      2 chey kchol                  1 chdain shey                 5 ol shey
      2 chey keor                   1 chdar shey                  5 shey daiin
      2 chey kor                    1 che?dy chey                 5 shey qokedy
      2 chey kshey                  1 chedy chey                  5 shey qoky

Not clear...
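The relation between the listings with and without the k/t fix can be checked by hand: merging spelling variants simply adds their counts. The sketch below does this for a few rows of the unmerged hea-f-eva.wds column of the "okal" listing in this note. The helper names merge_counts and k_for_t are hypothetical, and k_for_t covers only the k/t part of the fixes (the o/y part is left out to keep the example short).

```python
from collections import Counter

def merge_counts(raw_counts, canon):
    """Re-key pair counts by a canonical spelling and add up the variants."""
    merged = Counter()
    for (left, right), n in raw_counts.items():
        merged[(canon(left), canon(right))] += n
    return merged

def k_for_t(word):
    """Hypothetical stand-in for the k/t fix: treat every 't' as 'k'."""
    return word.replace("t", "k")

# Rows from the unmerged hea-f-eva.wds column of the "okal" listing.
raw_hea = {
    ("otol", "chol"): 6,
    ("qokol", "daiin"): 5,
    ("daiin", "qotol"): 4,
    ("otol", "daiin"): 4,
    ("okol", "daiin"): 3,
    ("qotol", "daiin"): 3,
}

merged = merge_counts(raw_hea, k_for_t)
print(merged[("qokol", "daiin")])  # 5 + 3 = 8
print(merged[("okol", "daiin")])   # 4 + 3 = 7
```

The 8 for "qokol daiin" agrees with the merged listing. The 7 for "okol daiin" falls short of the 9 shown there; presumably the remainder comes from variants too rare to appear in the unmerged top ten, or from the o/y merge that this sketch omits. In general, sums taken from truncated top-ten lists are only lower bounds.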
The word "chckhey" is also fairly uniform:

  compare-word-colocates \
    '\b[cse][eh][ce][kt][he]ey\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      1 chain chckhey               1 ???in shckhey               2 daiin chckhey
      1 chckhey chor                1 chckhey =                   2 qokaiin chckhey
      1 chckhey daiin               1 chckhey choky               1 chckhey cheor
      1 chckhey okaiin              1 chckhey okchdy              1 chckhey dar
      1 chckhey okshy               1 chckhey or                  1 chckhey kedy
      1 chckhey ol                  1 dair shckhey                1 chckhey lchey
      1 chckhey qod                 1 kodaiin shckhey             1 chckhey ldy
      1 chkaiin shckhey             1 odain chckhey               1 chckhey qokeedy
      1 ckhol chckhey               1 okain chckhey               1 chckhey qokeeol
      1 ckhy chckhey                1 okam chckhey                1 chckhey saiin

The uses of this word are too scattered for us to say anything useful. Another uniform word is "yshey":

  compare-word-colocates \
    '\b[q]*[oy][cse][eh]ey\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      1 chey qoeeey                 1 chdy ochey                  1 chealy oshey
      1 chydaiin ochey              1 dy ochey                    1 cheedy oshey
      1 ckhar ochey                 1 kchodain oeeey              1 dy ochey
      1 ckhy ochey                  1 lor ochey                   1 lcheey qochey
      1 daiin ochey                 1 ochey dar                   1 lor oshey
      1 dy ochey                    1 ochey kamar                 1 ochey kal
      1 ochey chol                  1 ochey oly                   1 ochey qokain
      1 ochey ckhos                 1 oeeey okaiin                1 okar oshey
      1 ochey kchokchy                                            1 ols oshey
      1 ochey kchos                                               1 oroly ochey

Finally, let's try "or":

  compare-word-colocates \
    '\b[oya]r\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      5 ckhy or                     6 or aiin                     8 or shedy
      4 or okaiin                   4 okar ar                     4 or aiin
      3 ar al                       4 okar or                     3 or al
      3 chol or                     3 ar daiin                    2 chedy or
      3 or chol                     3 ar okar                     2 chekar or
      3 or chor                     3 daiin or                    2 dal or
      2 daiin or                    3 dar ar                      2 dar ar
      2 dol or                      3 kor or                      2 or chey
      2 okaiin or                   3 or ar                       2 or sheey
      2 or aiin                     2 ar aiin                     2 or shey

Again, it seems that A's chol is B's chedy. Also A's okaiin seems to be B's aiin.

Now a few random words:

  compare-word-colocates \
    '\b[cs]ho[rl]\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
     30 chol daiin                  3 chol kar                    2 qokaiin chol
     20 chol chol                   2 or chol                     2 qokol chol
     10 shol daiin                  1 arakaiin shol               2 shol kedy
      9 chor daiin                  1 chdaiin chol                1 ?chor or
      8 chol ckhol                  1 ches chol                   1 chcphey chol
      8 chol shol                   1 chkaiin chol                1 chey chol
      8 chor chol                   1 chol alaiin                 1 chol ar
      7 chol ckhy                   1 chol chckhy                 1 chol chedcheydaiin
      7 okol chol                   1 chol chky                   1 chol cheky
      6 chol chor                   1 chol dar                    1 chol chy

Note the curious numeric coincidence in the first file.

  compare-word-colocates \
    '\b[cse][he]edy\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
                                    4 daiin chedy                22 shedy qokaiin
                                    3 chedy daiin                18 qol chedy
                                    3 chedy okedy                17 chedy qokaiin
                                    3 chedy okeey                17 qokaiin chedy
                                    3 okar chedy                 17 shedy qokedy
                                    3 shedy qokedy               15 chedy qol
                                    2 chedy chckhy               15 ol shedy
                                    2 chedy dal                  15 qokal chedy
                                    2 chedy dar                  15 shedy qokeedy
                                    2 chedy kedy                 13 ol chedy

  compare-word-colocates \
    '\b[ce][tk][eh][ao][rl]\b' \
    hea-f-eva.wds heb-f-eva.wds bio-f-eva.wds

  count hea-f-eva.wds           count heb-f-eva.wds           count bio-f-eva.wds
  ----- ----------------------  ----- ----------------------  ----- ----------------------
      8 chol ckhol                  1 aiin ckhar                  1 ckhal saiin
      8 daiin ckhor                 1 ckhar od                    1 ckhol chedy
      7 daiin ckhol                 1 ckhol ol                    1 ckhol skar
      6 ckhol chol                  1 okaiin ckhol                1 ckhor chey
      5 ckhol daiin                                               1 ckhor olchdy
      3 chor ckhol                                                1 daiin ckhal
      3 chor ckhor                                                1 iin ckhor
      3 ckhol dy                                                  1 olshey ckhor
      3 ckhor chol                                                1 qokal ckhol
      3 ckhor okol                                                1 rkaiin ckhol
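For readers who want to reproduce the side-by-side layout of these listings, here is one possible sketch. It assumes the pair counts for each file are already available as dictionaries; the names top_pairs and side_by_side are hypothetical. The ordering rule (count descending, ties alphabetical) is inferred from the listings in this note, not taken from the real script. Columns that run out of pairs are left blank.

```python
from itertools import zip_longest

def top_pairs(counts, n=10):
    """Return the n most frequent pairs, ties broken alphabetically."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(count, " ".join(pair)) for pair, count in ranked[:n]]

def side_by_side(columns, n=10):
    """Format several per-file count tables as aligned text columns."""
    tops = [top_pairs(counts, n) for counts in columns]
    lines = []
    for row in zip_longest(*tops):           # short columns are padded with None
        cells = []
        for cell in row:
            if cell is None:
                cells.append(" " * 28)       # blank cell: 5 + 1 + 22 characters
            else:
                count, pair = cell
                cells.append(f"{count:5d} {pair:<22}")
        lines.append("  ".join(cells).rstrip())
    return lines

# A few rows from the last listing of this note (hea and heb columns only).
hea = {("chol", "ckhol"): 8, ("daiin", "ckhor"): 8, ("daiin", "ckhol"): 7}
heb = {("aiin", "ckhar"): 1, ("ckhar", "od"): 1}
lines = side_by_side([hea, heb])
print("\n".join(lines))
```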