I stole some text in pinyin from http://www-personal.umich.edu/~wbaxter/,
  cleaned it up a bit, and saved it to chin-mch.txt.
  
  This is a bad sample: in the first statistics I ran, "zhong1 guo2"
  (China) came out near the top. That's because half the sample is a
  Voice of America semi-political speech...
  
  So I removed all but one of the occurrences of "zhong1 guo2" from the sample.

  Let's run some statistics.  First, words overall:
  
    cat chin-mch.txt \
      | tr ' ' '\012' \
      | grep '.' \
      | sort | uniq -c | expand \
      | sort +0 -1nr \
      | compute-freqs \
      > .chin.frq

    count freqy word 
    ----- ----- -----------
      244 0.065 de
      118 0.031 shi4
       78 0.021 ren2
       62 0.016 you3
       61 0.016 ta1
       55 0.015 xue2
       54 0.014 wen2
       50 0.013 shi2
       50 0.013 zai4
       42 0.011 guo2
       41 0.011 yi2
       40 0.011 yi4
       37 0.010 le
       35 0.009 ge
       34 0.009 shuo1
       33 0.009 bu4
      ... ..... .....
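
  A rough Python sketch of the same tally, for readers without the
  compute-freqs script.  It assumes that compute-freqs just appends
  count/total as a second column (that script is not shown here); the
  name token_freqs and the toy sample are made up for illustration.

```python
from collections import Counter

def token_freqs(text):
    """Return (count, relative frequency, token) rows, most frequent first."""
    # split() breaks on any whitespace and drops empty tokens; the shell
    # pipeline splits on spaces and filters empty lines with grep '.'.
    tokens = text.split()
    counts = Counter(tokens)
    total = sum(counts.values())
    # Descending count, ties broken alphabetically for a stable listing.
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(n, n / total, tok) for tok, n in rows]

if __name__ == "__main__":
    # Hypothetical toy input, not taken from chin-mch.txt.
    sample = "zhong1 guo2 de ren2 shi4 zhong1 guo2 ren2 de de"
    print("count freqy word")
    print("----- ----- -----------")
    for n, f, tok in token_freqs(sample):
        print(f"{n:5d} {f:5.3f} {tok}")
```

  Sorting inside the function replaces the "sort | uniq -c | sort"
  sandwich; the tie-breaking rule is my choice, not something the
  original pipeline guarantees.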

  Now, without tones:
  
    cat chin-mch.txt \
      | tr ' ' '\012' \
      | tr -d '0-9' \
      | grep '.' \
      | sort | uniq -c | expand \
      | sort +0 -1nr \
      | compute-freqs \
      > .chin-notone.frq  
      
    count freqy word 
    ----- ----- -----------
      245 0.065 de
      223 0.059 shi
      129 0.034 yi
       93 0.025 ren
       71 0.019 you
       65 0.017 bu
       61 0.016 ta
       60 0.016 guo
       58 0.015 wen
       55 0.015 xue
       55 0.015 zi
       50 0.013 zai
       47 0.012 ji
       44 0.012 yu
       44 0.012 zhi
       43 0.011 ge
       40 0.011 mei

  Now for the initial consonant:
  
    cat chin-mch.txt \
      | tr ' ' '\012' \
      | tr -d '0-9' \
      | sed -e 's/[aeiouü].*$//g' \
      | grep '.' \
      | sort | uniq -c | expand \
      | sort +0 -1nr \
      | compute-freqs \
      > .chin-initial.frq   
      
    count freqy word 
    ----- ----- -----------
      473 0.126 d
      402 0.107 y
      364 0.097 sh
      215 0.057 j
      209 0.056 x
      198 0.053 h
      197 0.053 zh
      178 0.047 g
      173 0.046 l
      166 0.044 z
      157 0.042 b
      138 0.037 w
      130 0.035 r
      130 0.035 t
       91 0.024 f
       90 0.024 m
       89 0.024 n
       89 0.024 q
       75 0.020 k
       74 0.020 ch

  Now for the final (vowels plus terminators):
  
    cat chin-mch.txt \
      | tr ' ' '\012' \
      | tr -d '0-9' \
      | sed -e 's/^[^aeiouü]*//g' \
      | grep '.' \
      | sort | uniq -c | expand \
      | sort +0 -1nr \
      | compute-freqs \
      > .chin-final.frq  
      
    count freqy word 
    ----- ----- -----------
      654 0.173 i
      434 0.115 e
      311 0.082 u
      238 0.063 en
      179 0.047 ai
      168 0.045 uo
      145 0.038 ou
      130 0.034 a
      126 0.033 an
      123 0.033 ing
      122 0.032 ong
      118 0.031 ei
      113 0.030 ian
      109 0.029 eng
      102 0.027 ao
       98 0.026 ang
       73 0.019 ui
       67 0.018 ue
       59 0.016 iao
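
  The three transformations above (dropping tones, keeping the initial,
  keeping the final) can be sketched in Python as follows.  This is only
  an illustration of what the tr and sed expressions do; the function
  names are hypothetical, and "y" and "w" count as initials simply
  because they are not in the vowel class.

```python
import re

VOWELS = "aeiouü"  # same character class as in the sed expressions

def strip_tone(syllable):
    """Drop tone digits, like tr -d '0-9'."""
    return re.sub(r"[0-9]", "", syllable)

def split_syllable(syllable):
    """Split a pinyin syllable at its first vowel into (initial, final)."""
    s = strip_tone(syllable)
    m = re.search(f"[{VOWELS}]", s)
    if m is None:
        # No vowel at all: everything is "initial", the final is empty.
        return s, ""
    return s[:m.start()], s[m.start():]

if __name__ == "__main__":
    for syl in ["zhong1", "guo2", "shi4", "xue2", "ai4"]:
        print(syl, split_syllable(syl))
```

  Note that the shell pipelines discard empty results with grep '.',
  so a syllable with no initial (such as "ai") contributes nothing to
  the table of initials.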

  Changing subject again, I have been looking at the differences between 
  languages A and B, particularly the tail (midfix+suffix) distribution.
  They really look like different languages.  Even taking into account
  possible letter confusion, there seems to be no simple correspondence 
  between the tails of one and those of the other.
  
  Just to be sure, let's try to recompute the tail distributions after collapsing 
  everything that could be equivalent:
  
    t,k ---------> t
    p,f ---------> p
    r,s ---------> e
    ei ----------> o
    o,a,y -------> o
    ch,sh -------> ee
    cth,ckh -----> tee
    cph,cfh -----> pee
    iiii,iii,ii -> i
    
    foreach lang ( a b )
      cat Note-009/he${lang}-f.factored \
        | sed \
            -e 's/sh/ee/g'   \
            -e 's/ch/ee/g'   \
            -e 's/s/e/g'     \
            -e 's/r/e/g'     \
            -e 's/k/t/g'     \
            -e 's/f/p/g'     \
            -e 's/cth/tee/g' \
            -e 's/ckh/tee/g' \
            -e 's/cph/pee/g' \
            -e 's/cfh/pee/g' \
            -e 's/ei/o/g'    \
            -e 's/a/o/g'     \
            -e 's/y/o/g'     \
            -e 's/iiii/i/g'  \
            -e 's/iii/i/g'   \
            -e 's/ii/i/g'    \
        > .he${lang}-f-ere.factored

      cat .he${lang}-f-ere.factored \
        | grep -e '- -' \
        | gawk '/./ {print $2}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-ere-midfs-all.frq

      cat .he${lang}-f-ere.factored \
        | grep -e '- -' \
        | gawk '/./ {print $3}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-ere-suffs-all.frq

      cat .he${lang}-f-ere.factored \
        | grep -e '- -' \
        | gawk '/./ {print ($2 $3)}' \
        | sed -e 's/--//g' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-ere-tails-all.frq
    end
    dicio-wc .he{a,b}-f-ere-{midf,suff,tail}s-all.frq
  
     lines   words     bytes file        
    ------ ------- --------- ------------
       179     358      2964 .hea-f-ere-midfs-all.frq
       131     262      1815 .hea-f-ere-suffs-all.frq
       655    1310     10600 .hea-f-ere-tails-all.frq
       133     266      2169 .heb-f-ere-midfs-all.frq
        82     164      1118 .heb-f-ere-suffs-all.frq
       420     840      6716 .heb-f-ere-tails-all.frq
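
  The collapsing step is just an ordered list of global substitutions,
  and the order matters: since "k" becomes "t" before the "cth" rule
  runs, "ckh" is already "cth" by then.  A minimal Python sketch of the
  same rule list (the names RULES and collapse are hypothetical, and the
  sample words are only illustrative inputs):

```python
# Substitutions in the same order as the sed script above.
RULES = [
    ("sh", "ee"), ("ch", "ee"),
    ("s", "e"), ("r", "e"),
    ("k", "t"), ("f", "p"),
    ("cth", "tee"), ("ckh", "tee"),
    ("cph", "pee"), ("cfh", "pee"),
    ("ei", "o"), ("a", "o"), ("y", "o"),
    ("iiii", "i"), ("iii", "i"), ("ii", "i"),
]

def collapse(word, rules=RULES):
    """Apply each substitution to the whole word, in order (like sed s/x/y/g)."""
    for old, new in rules:
        word = word.replace(old, new)
    return word

if __name__ == "__main__":
    for w in ["cthy", "ckhy", "chedy", "shol", "daiin"]:
        print(w, "->", collapse(w))
```

  With these rules "cthy" and "ckhy" both come out as "teeo", and
  "daiin" comes out as "doin".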
  
  
    foreach elem ( midf suff tail )
      foreach lang ( A.a B.b )
        set file = "he${lang:e}-f-ere-${elem}s-all"
        echo "${file}.frq -> ${file}.fmt"
        cat .${file}.frq \
          | compute-freqs \
          | gawk '\
                BEGIN {\
                  printf "by Friedman\nlanguage '"${lang:r}"'\n"; \
                  printf "freq pc '"${elem}"'ix\n---- -- ------------------\n";} \
                /./   {printf "%4d %2d %s\n",$1,int($2*100+0.5),$3; t+=$1;} \
                END   {printf "---- -- ------------------\n%4d 99 TOTAL\n",t;} \
              ' \
          > .${file}.fmt
      end
    end

    foreach elem ( midf suff tail )
      set tfiles = ( )
      foreach lang ( a b )
        set file = "he${lang}-f-ere-${elem}s-all"
        set tfiles = ( ${tfiles} .${file}.fmt )
      end
      pr -m -t -i' '1 -w 54  ${tfiles} \
        | expand \
        > .herbal-f-ere-${elem}-cmp.txt
    end
    dicio-wc .herbal-f-ere-{midf,suff,tail}-cmp.txt
    
  Here are the results.  Midfixes:

    by Friedman                by Friedman
    language A                 language B
    freq pc midfix             freq pc midfix
    ---- -- ------------------ ---- -- ------------------
    1595 27 -ee-                590 24 -t-
    1313 22 -tee-               293 12 -tee-
     913 15 -t-                 279 12 -eee-
     419  7 -eee-               274 11 -te-
     241  4 -teee-              269 11 -ee-
     191  3 -pee-                95  4 -teee-
     155  3 -te-                 66  3 -eetee-
     132  2 -eeot-               60  3 -eeet-
     100  2 -eeee-               52  2 -pee-
     100  2 -eeotee-             49  2 -eeee-
      99  2 -eetee-              48  2 -p-
      60  1 -p-                  45  2 -peee-
      57  1 -eet-                25  1 -eeetee-
      46  1 -peee-               24  1 -eet-
      40  1 -teeee-              15  1 -eeete-
      24  0 -eeet-               14  1 -eeteee-
    .... .. .......            .... .. .....
    ---- -- ------------------ ---- -- ------------------
    5967 99 TOTAL              2431 99 TOTAL   
    
  Tails: 

    by Friedman                by Friedman
    language A                 language B
    freq pc tailix             freq pc tailix
    ---- -- ------------------ ---- -- ------------------
     579 10 -teeo               153  6 -toe
     395  7 -eeo                150  6 -tedo
     370  6 -eeol               135  6 -teedo
     337  6 -eeoe               131  5 -toin
     226  4 -teeoe              118  5 -eeedo
     212  4 -teeol               92  4 -teeo
     197  3 -to                  88  4 -tol
     189  3 -tol                 87  4 -eedo
     178  3 -toin                65  3 -to
     167  3 -eeeo                52  2 -eeteeo
     156  3 -teeeo               51  2 -eeeo
     119  2 -toe                 41  2 -teeedo
      96  2 -eeoin               39  2 -teeeo
      91  2 -eeeoe               31  1 -teo
    
  Hmm, it seems that language A does not use "d" in the suffixes
  very much.  Perhaps if we delete "d" we will get a better resemblance:
  
    foreach lang ( a b )
      cat Note-009/he${lang}-f.factored \
        | sed \
            -e 's/d//g'      \
            -e 's/sh/ee/g'   \
            -e 's/ch/ee/g'   \
            -e 's/s/e/g'     \
            -e 's/r/e/g'     \
            -e 's/k/t/g'     \
            -e 's/f/p/g'     \
            -e 's/cth/tee/g' \
            -e 's/ckh/tee/g' \
            -e 's/cph/pee/g' \
            -e 's/cfh/pee/g' \
            -e 's/ei/o/g'    \
            -e 's/a/o/g'     \
            -e 's/y/o/g'     \
            -e 's/iiii/i/g'  \
            -e 's/iii/i/g'   \
            -e 's/ii/i/g'    \
        > .he${lang}-f-erf.factored
    end

    foreach lang ( a b )
      cat .he${lang}-f-erf.factored \
        | grep -e '- -' \
        | gawk '/./ {print $2}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erf-midfs-all.frq

      cat .he${lang}-f-erf.factored \
        | grep -e '- -' \
        | gawk '/./ {print $3}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erf-suffs-all.frq

      cat .he${lang}-f-erf.factored \
        | grep -e '- -' \
        | gawk '/./ {print ($2 $3)}' \
        | sed -e 's/--//g' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erf-tails-all.frq
    end
    dicio-wc .he{a,b}-f-erf-{midf,suff,tail}s-all.frq
  
     lines   words     bytes file        
    ------ ------- --------- ------------
       162     324      2666 .hea-f-erf-midfs-all.frq
        85     170      1159 .hea-f-erf-suffs-all.frq
       535    1070      8572 .hea-f-erf-tails-all.frq
       125     250      2028 .heb-f-erf-midfs-all.frq
        54     108       722 .heb-f-erf-suffs-all.frq
       329     658      5186 .heb-f-erf-tails-all.frq
  
    foreach elem ( midf suff tail )
      foreach lang ( A.a B.b )
        set file = "he${lang:e}-f-erf-${elem}s-all"
        echo "${file}.frq -> ${file}.fmt"
        cat .${file}.frq \
          | compute-freqs \
          | gawk '\
                BEGIN {\
                  printf "by Friedman\nlanguage '"${lang:r}"'\n"; \
                  printf "freq pc '"${elem}"'ix\n---- -- ------------------\n";} \
                /./   {printf "%4d %2d %s\n",$1,int($2*100+0.5),$3; t+=$1;} \
                END   {printf "---- -- ------------------\n%4d 99 TOTAL\n",t;} \
              ' \
          > .${file}.fmt
      end
    end

    foreach elem ( midf suff tail )
      set tfiles = ( )
      foreach lang ( a b )
        set file = "he${lang}-f-erf-${elem}s-all"
        set tfiles = ( ${tfiles} .${file}.fmt )
      end
      pr -m -t -i' '1 -w 54  ${tfiles} \
        | expand \
        > .herbal-f-erf-${elem}-cmp.txt
    end
    dicio-wc .herbal-f-erf-{midf,suff,tail}-cmp.txt
  
     lines   words     bytes file        
    ------ ------- --------- ------------
       168     893      6707 .herbal-f-erf-midf-cmp.txt
        91     449      3316 .herbal-f-erf-suff-cmp.txt
       541    2624     20105 .herbal-f-erf-tail-cmp.txt

    by Friedman                by Friedman
    language A                 language B
    freq pc tailix             freq pc tailix
    ---- -- ------------------ ---- -- ------------------
     611 10 -teeo               231 10 -teeo
     422  7 -eeo                185  8 -teo
     374  6 -eeol               172  7 -eeeo
     338  6 -eeoe               153  6 -toe
     228  4 -teeoe              131  5 -toin
     218  4 -teeol              116  5 -eeo
     216  4 -to                  90  4 -tol
     195  3 -tol                 84  4 -teeeo
     180  3 -toin                69  3 -to
     172  3 -eeeo                59  2 -eeteeo
     161  3 -teeeo               36  2 -eeeeo
     120  2 -toe                 34  1 -eeoe
      98  2 -eeoin               34  1 -eeol
      91  2 -eeeoe               30  1 -peeeo
      79  1 -eeoteeo             30  1 -tom
      76  1 -eeoo                29  1 -eeeto
      70  1 -eeteeo              29  1 -eeoo
      69  1 -teeoo               29  1 -peeo
      64  1 -eeeeo               26  1 -eeeoe
      62  1 -eeoto               26  1 -teoo

  Good news: at least we got the top entry to match.
  Now what else can we do?  We could map "teeoe" and
  "teeol" to "teo", but that seems a bit ad hoc...
  
  Let's try again.  Let's compare the frequencies of 
  "k" and "t", "sh" and "ch" in each language:
  
    foreach lang ( a b )
      cat Note-009/he${lang}-f.factored \
        | sed \
            -e 's/[- .:]//g' \
            -e 's/ch/C/' \
            -e 's/sh/S/' \
            -e 's/$/\./' \
        | count-digraph-freqs \
            -v pad="." \
            -v showentropy=0 \
            -v chars=".CSaoeilmnrchtpkfsqjdvxyg"
    end
    
  Language A:

    Digraph counts:

           TT     .     C     S     a     o     e     i     l     m     n     r     c     h     t     p     k     f     s     q     d     y
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
      n  1376  1341     1     .     1     7     .     .     3     2     .     1     1     .     1     .     .     .     2     .     8     8
      m   265   261     .     .     1     3     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .
      r  1569  1302    32     6    59    59     1     5     4     3     .     2     4     .     .     .     1     .     2     .    16    73
      l  1720  1310    35    14    11    88     7     .     .     3     .     2     6     .     6     1     7     1    40     1   120    68
      y  3189  2543    91    16     5    16     3     .     3     1     .     2    10     .   183    21   186     2    16     2    89     .
      s   669   316    51     5   104   107    16     .     .     2     .     .     4    12     .     1     3     .     1     .     .    47
      d  2234   160   145    51  1109   175    24     .    14     5     .     2    10     .     3     3     5     .    11     .     7   510
      k  1650    24   377    75   223   252   258     .     .     1     .     .    43   226     .     .     .     .     3     .     2   166
      t  1790    17   423    57   161   273   124     .     1     .     .     .    28   522     1     1     .     .     4     1     5   172
      p   324     7   117    11     9    35     .     .     .     .     .     .    16   101     1     .     .     .     .     .     3    24
      f   106     9    28     5     6    14     .     .     .     .     .     .     2    30     .     .     .     .     .     .     4     8
      .  7812     .  1507   745    79  1145    26    12    41    16     3    57   615     .   267    95   352    33   356   708  1266   489
      c  1001     .     .     .     .     .     .     .     .     .     .     .     .   122   522   101   226    30     .     .     .     .
      o  5711   410    59    24    74    18    60   101  1325    91     7   993   141     .   726    83   742    35   117     4   641    60
      a  2318    43     4     .     1     4     .  1305   311   131    54   412     3     .     4     2     7     2    10     .    19     6
      i  2601     2     1     2     .     1     3  1173     4     6  1300    83     3     .     1     .    14     .     3     .     5     .
      e  1958    32    11     1   118   529   475     1     1     3    12     4    12     .    34    11    37     1    79     .    10   587
      h  1013    10     4     1    93   335   177     1     2     .     .     2     .     .     1     .     1     .     9     .     4   373
      S  1016    15     5     .    47   525   233     .     3     .     .     .    21     .     6     .    13     1     4     .     6   137
      C  2892    10     .     3   217  1427   549     3     7     1     .     9    80     .    32     5    51     1    12     .    28   457
      q   716     .     1     .     .   698     2     .     1     .     .     .     2     .     2     .     5     .     .     .     1     4
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT 41930  7812  2892  1016  2318  5711  1958  2601  1720   265  1376  1569  1001  1013  1790   324  1650   106   669   716  2234  3189

    Next-symbol probability (× 99):

        TT  .  C  S  a  o  e  i  l  m  n  r  c  h  t  p  k  f  s  q  d  y
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
      . 99  . 19  9  1 15  .  .  1  .  .  1  8  .  3  1  4  .  5  9 16  6
      C 99  .  .  .  7 49 19  .  .  .  .  .  3  .  1  .  2  .  .  .  1 16
      S 99  1  .  .  5 51 23  .  .  .  .  .  2  .  1  .  1  .  .  .  1 13
      a 99  2  .  .  .  .  . 56 13  6  2 18  .  .  .  .  .  .  .  .  1  .
      o 99  7  1  .  1  .  1  2 23  2  . 17  2  . 13  1 13  1  2  . 11  1
      e 99  2  1  .  6 27 24  .  .  .  1  .  1  .  2  1  2  .  4  .  1 30
      i 99  .  .  .  .  .  . 45  .  . 49  3  .  .  .  .  1  .  .  .  .  .
      l 99 75  2  1  1  5  .  .  .  .  .  .  .  .  .  .  .  .  2  .  7  4
      m 99 98  .  .  .  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      n 99 96  .  .  .  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  1  1
      r 99 82  2  .  4  4  .  .  .  .  .  .  .  .  .  .  .  .  .  .  1  5
      c 99  .  .  .  .  .  .  .  .  .  .  .  . 12 52 10 22  3  .  .  .  .
      h 99  1  .  .  9 33 17  .  .  .  .  .  .  .  .  .  .  .  1  .  . 36
      t 99  1 23  3  9 15  7  .  .  .  .  .  2 29  .  .  .  .  .  .  . 10
      p 99  2 36  3  3 11  .  .  .  .  .  .  5 31  .  .  .  .  .  .  1  7
      k 99  1 23  5 13 15 15  .  .  .  .  .  3 14  .  .  .  .  .  .  . 10
      f 99  8 26  5  6 13  .  .  .  .  .  .  2 28  .  .  .  .  .  .  4  7
      s 99 47  8  1 15 16  2  .  .  .  .  .  1  2  .  .  .  .  .  .  .  7
      q 99  .  .  .  . 97  .  .  .  .  .  .  .  .  .  .  1  .  .  .  .  1
      d 99  7  6  2 49  8  1  .  1  .  .  .  .  .  .  .  .  .  .  .  . 23
      y 99 79  3  .  .  .  .  .  .  .  .  .  .  .  6  1  6  .  .  .  3  .
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    TOT 99 18  7  2  5 13  5  6  4  1  3  4  2  2  4  1  4  0  2  2  5  8

    Previous-symbol probability (× 99):

        TT  .  C  S  a  o  e  i  l  m  n  r  c  h  t  p  k  f  s  q  d  y
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
      . 18  . 52 73  3 20  1  .  2  6  .  4 61  . 15 29 21 31 53 98 56 15
      C  7  .  .  .  9 25 28  .  .  .  .  1  8  .  2  2  3  1  2  .  1 14
      S  2  .  .  .  2  9 12  .  .  .  .  .  2  .  .  .  1  1  1  .  .  4
      a  5  1  .  .  .  .  . 50 18 49  4 26  .  .  .  1  .  2  1  .  1  .
      o 13  5  2  2  3  .  3  4 76 34  1 63 14  . 40 25 45 33 17  1 28  2
      e  5  .  .  .  5  9 24  .  .  1  1  .  1  .  2  3  2  1 12  .  . 18
      i  6  .  .  .  .  .  . 45  .  2 94  5  .  .  .  .  1  .  .  .  .  .
      l  4 17  1  1  .  2  .  .  .  1  .  .  1  .  .  .  .  1  6  .  5  2
      m  1  3  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      n  3 17  .  .  .  .  .  .  .  1  .  .  .  .  .  .  .  .  .  .  .  .
      r  4 17  1  1  3  1  .  .  .  1  .  .  .  .  .  .  .  .  .  .  1  2
      c  2  .  .  .  .  .  .  .  .  .  .  .  . 12 29 31 14 28  .  .  .  .
      h  2  .  .  .  4  6  9  .  .  .  .  .  .  .  .  .  .  .  1  .  . 12
      t  4  . 14  6  7  5  6  .  .  .  .  .  3 51  .  .  .  .  1  .  .  5
      p  1  .  4  1  .  1  .  .  .  .  .  .  2 10  .  .  .  .  .  .  .  1
      k  4  . 13  7 10  4 13  .  .  .  .  .  4 22  .  .  .  .  .  .  .  5
      f  0  .  1  .  .  .  .  .  .  .  .  .  .  3  .  .  .  .  .  .  .  .
      s  2  4  2  .  4  2  1  .  .  1  .  .  .  1  .  .  .  .  .  .  .  1
      q  2  .  .  .  . 12  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      d  5  2  5  5 47  3  1  .  1  2  .  .  1  .  .  1  .  .  2  .  . 16
      y  8 32  3  2  .  .  .  .  .  .  .  .  1  . 10  6 11  2  2  .  4  .
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    TOT 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99

  Language B:

    Digraph counts:

           TT     .     C     S     a     o     e     i     l     m     n     r     c     h     t     p     k     f     s     q     d     x     y
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
      n   513   500     1     .     .     1     .     .     .     .     .     4     .     .     .     .     .     .     1     1     1     .     4
      y  1754  1473    18     8     1     6     2     .     5     4     .     2     1     .    75    17   110     8     4     .    20     .     .
      m   113   107     .     .     2     .     .     .     .     1     .     .     .     .     .     .     .     .     .     .     3     .     .
      r   670   532     6     2    77    16     2     4     1     .     .     .     1     .     .     .     .     .     .     .     8     .    21
      l   612   315    35    15    36    34     2     .     1     .     .     4     3     .     8     .    55     4    10     1    52     .    37
      s   191    94     6     .    60     7     5     .     1     .     .     .     .     6     .     2     1     .     1     .     2     .     6
      d  1477    71    19    13   421    38    12     1     6     .     .     1     2     .     .     1     2     .     2     .     .     .   888
      f    86     5    33     1    20     7     2     .     .     .     .     .     2     9     .     .     .     .     1     .     1     .     5
      p   142     5    65     9    22    12     1     .     .     .     .     .     5    12     .     .     .     .     .     .     4     .     7
      .  3223     .   540   256   171   760    14     7    49     5     .    21    42     .   137    53   163    21    75   330   341     2   236
      x     4     1     .     .     .     3     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .
      c   219     .     .     .     .     .     .     .     .     .     .     .     .    23    63    12   112     9     .     .     .     .     .
      o  1695    51     5     2    16     8    21     4   297     3     1   174    35     .   216    35   517    28    40     .   230     1    11
      a  1368    17     .     1     .     1     .   569   245    99     4   398     4     .     1     1    10     1     7     .     8     .     2
      i  1051     .     .     .     .     2     .   464     1     .   508    64     3     .     .     1     6     .     .     .     2     .     .
      k  1106    20    94    21   374    49   330     .     1     .     .     .     4   112     .     .     .     .     4     .     3     .    94
      t   530     5    73    18   128    53   145     .     .     .     .     .     5    63     .     .     .     .     .     .     .     .    40
      h   225     3     1     1     5    10    65     1     .     .     .     .     1     .     .     .     .     1     .     .    26     1   110
      S   350     2     .     .     5    50   206     .     .     .     .     .    12     .     .     .     6     .     3     .    44     .    22
      C   909     2     .     1    19    93   406     1     2     1     .     1    71     .     9     3    25     1     6     .   212     .    56
      e  1497    20    13     2    11   219   279     .     3     .     .     1    27     .    21    17    99    13    37     .   520     .   215
      q   332     .     .     .     .   326     5     .     .     .     .     .     1     .     .     .     .     .     .     .     .     .     .
        ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    TOT 18067  3223   909   350  1368  1695  1497  1051   612   113   513   670   219   225   530   142  1106    86   191   332  1477     4  1754

    Next-symbol probability (× 99):

        TT  .  C  S  a  o  e  i  l  m  n  r  c  h  t  p  k  f  s  q  d  x  y
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
      . 99  . 17  8  5 23  .  .  2  .  .  1  1  .  4  2  5  1  2 10 10  .  7
      C 99  .  .  .  2 10 44  .  .  .  .  .  8  .  1  .  3  .  1  . 23  .  6
      S 99  1  .  .  1 14 58  .  .  .  .  .  3  .  .  .  2  .  1  . 12  .  6
      a 99  1  .  .  .  .  . 41 18  7  . 29  .  .  .  .  1  .  1  .  1  .  .
      o 99  3  .  .  1  .  1  . 17  .  . 10  2  . 13  2 30  2  2  . 13  .  1
      e 99  1  1  .  1 14 18  .  .  .  .  .  2  .  1  1  7  1  2  . 34  . 14
      i 99  .  .  .  .  .  . 44  .  . 48  6  .  .  .  .  1  .  .  .  .  .  .
      l 99 51  6  2  6  6  .  .  .  .  .  1  .  .  1  .  9  1  2  .  8  .  6
      m 99 94  .  .  2  .  .  .  .  1  .  .  .  .  .  .  .  .  .  .  3  .  .
      n 99 96  .  .  .  .  .  .  .  .  .  1  .  .  .  .  .  .  .  .  .  .  1
      r 99 79  1  . 11  2  .  1  .  .  .  .  .  .  .  .  .  .  .  .  1  .  3
      c 99  .  .  .  .  .  .  .  .  .  .  .  . 10 28  5 51  4  .  .  .  .  .
      h 99  1  .  .  2  4 29  .  .  .  .  .  .  .  .  .  .  .  .  . 11  . 48
      t 99  1 14  3 24 10 27  .  .  .  .  .  1 12  .  .  .  .  .  .  .  .  7
      p 99  3 45  6 15  8  1  .  .  .  .  .  3  8  .  .  .  .  .  .  3  .  5
      k 99  2  8  2 33  4 30  .  .  .  .  .  . 10  .  .  .  .  .  .  .  .  8
      f 99  6 38  1 23  8  2  .  .  .  .  .  2 10  .  .  .  .  1  .  1  .  6
      s 99 49  3  . 31  4  3  .  1  .  .  .  .  3  .  1  1  .  1  .  1  .  3
      q 99  .  .  .  . 97  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      d 99  5  1  1 28  3  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . 60
      x 99 25  .  .  . 74  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      y 99 83  1  .  .  .  .  .  .  .  .  .  .  .  4  1  6  .  .  .  1  .  .
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    TOT 99 18  5  2  7  9  8  6  3  1  3  4  1  1  3  1  6  0  1  2  8  0 10

    Previous-symbol probability (× 99):

        TT  .  C  S  a  o  e  i  l  m  n  r  c  h  t  p  k  f  s  q  d  x  y
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
      . 18  . 59 72 12 44  1  1  8  4  .  3 19  . 26 37 15 24 39 98 23 50 13
      C  5  .  .  .  1  5 27  .  .  1  .  . 32  .  2  2  2  1  3  . 14  .  3
      S  2  .  .  .  .  3 14  .  .  .  .  .  5  .  .  .  1  .  2  .  3  .  1
      a  7  1  .  .  .  .  . 54 40 87  1 59  2  .  .  1  1  1  4  .  1  .  .
      o  9  2  1  1  1  .  1  . 48  3  . 26 16  . 40 24 46 32 21  . 15 25  1
      e  8  1  1  1  1 13 18  .  .  .  .  . 12  .  4 12  9 15 19  . 35  . 12
      i  6  .  .  .  .  .  . 44  .  . 98  9  1  .  .  1  1  .  .  .  .  .  .
      l  3 10  4  4  3  2  .  .  .  .  .  1  1  .  1  .  5  5  5  .  3  .  2
      m  1  3  .  .  .  .  .  .  .  1  .  .  .  .  .  .  .  .  .  .  .  .  .
      n  3 15  .  .  .  .  .  .  .  .  .  1  .  .  .  .  .  .  1  .  .  .  .
      r  4 16  1  1  6  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  1  .  1
      c  1  .  .  .  .  .  .  .  .  .  .  .  . 10 12  8 10 10  .  .  .  .  .
      h  1  .  .  .  .  1  4  .  .  .  .  .  .  .  .  .  .  1  .  .  2 25  6
      t  3  .  8  5  9  3 10  .  .  .  .  .  2 28  .  .  .  .  .  .  .  .  2
      p  1  .  7  3  2  1  .  .  .  .  .  .  2  5  .  .  .  .  .  .  .  .  .
      k  6  1 10  6 27  3 22  .  .  .  .  .  2 49  .  .  .  .  2  .  .  .  5
      f  0  .  4  .  1  .  .  .  .  .  .  .  1  4  .  .  .  .  1  .  .  .  .
      s  1  3  1  .  4  .  .  .  .  .  .  .  .  3  .  1  .  .  1  .  .  .  .
      q  2  .  .  .  . 19  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      d  8  2  2  4 30  2  1  .  1  .  .  .  1  .  .  1  .  .  1  .  .  . 50
      x  0  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      y 10 45  2  2  .  .  .  .  1  4  .  .  .  . 14 12 10  9  2  .  1  .  .
        -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    TOT 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99
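
  For readers without count-digraph-freqs, the tables above can be
  approximated with the Python sketch below.  It is an assumption-laden
  stand-in, not that script: it expects words already reduced to one
  character per symbol (for example "C" for ch and "S" for sh, as in the
  sed step), pads each word with "." on both sides, and returns plain
  fractions rather than the values scaled to 99 shown in the tables.

```python
from collections import Counter

def digraph_counts(words, pad="."):
    """Count adjacent symbol pairs, with each word padded at both ends."""
    counts = Counter()
    for w in words:
        padded = pad + w + pad
        counts.update(zip(padded, padded[1:]))
    return counts

def next_symbol_probs(counts):
    """Return P(next | current) for every observed pair (current, next)."""
    row_totals = Counter()
    for (cur, _), n in counts.items():
        row_totals[cur] += n
    return {(cur, nxt): n / row_totals[cur] for (cur, nxt), n in counts.items()}

if __name__ == "__main__":
    # Hypothetical toy input: "Col" stands for "chol", "Sol" for "shol".
    counts = digraph_counts(["Col", "Sol", "Cor"])
    for pair, p in sorted(next_symbol_probs(counts).items()):
        print(pair, round(p, 3))
```

  The previous-symbol table is the same computation with the roles of
  the two symbols exchanged, that is, normalizing by column totals
  instead of row totals.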

  
  The relative frequencies of "t" and "k", "sh" and "ch" are as follows:

    Language A:  t = 1790 k = 1650  ratio t/k = 1.085
    Language B:  t =  530 k = 1106  ratio t/k = 0.479

    Language A:  S = 1016 C = 2892  ratio S/C = 0.351
    Language B:  S =  350 C =  909  ratio S/C = 0.385
    
  So it seems we must collapse t and k, otherwise it will be very hard to 
  find a correspondence between the two languages. 
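
  The ratios quoted above come straight from the TOT rows of the two
  digraph-count tables.  A small Python check, with the totals copied by
  hand from those tables (the dictionary layout and the function name
  are mine):

```python
# Symbol totals copied from the TOT rows of the digraph-count tables.
totals = {
    "A": {"t": 1790, "k": 1650, "S": 1016, "C": 2892},
    "B": {"t": 530, "k": 1106, "S": 350, "C": 909},
}

def ratio(lang, num, den):
    """Ratio of two symbol totals within one language."""
    return totals[lang][num] / totals[lang][den]

if __name__ == "__main__":
    for lang in ("A", "B"):
        print(f"Language {lang}:"
              f"  t/k = {ratio(lang, 't', 'k'):.3f}"
              f"  S/C = {ratio(lang, 'S', 'C'):.3f}")
```

  The t/k ratio drops by more than half from A to B, while S/C stays
  within about 10% of its value.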
  
  We could keep ch and sh distinct, but their next-symbol
  probabilities are so similar that it seems silly to distinguish
  them.

  Just to be sure, let's compare the sh and ch contexts in the two
  languages:
  
    foreach lang ( a b )
      foreach f ( sh.ch ch.sh )
        cat Note-009/he${lang}-f.factored \
          | sed \
              -e 's/[- .:]//g' \
              -e 's/k/t/' \
              -e 's/p/f/' \
              -e 's/ckh/K/' \
              -e 's/cph/P/' \
              -e 's/'"${f:r}"'/@/' \
              -e 's/'"${f:e}"'/~/' \
          | grep '@' \
          | sort | uniq -c | expand \
          | sort +0 -1nr \
          | compute-freqs \
          > .tmp-he${lang}-${f:r}.frq
      end
    end
    dicio-wc .tmp-he{a,b}-{sh,ch}.frq
    
     lines   words     bytes file        
    ------ ------- --------- ------------
       322     966      6438 .tmp-hea-sh.frq
       860    2580     17698 .tmp-hea-ch.frq
       173     519      3466 .tmp-heb-sh.frq
       389    1167      7964 .tmp-heb-ch.frq

    Language A                          Language B                       
   ----------------------------------  ----------------------------------     
    contexts of sh    contexts of ch    contexts of sh   contexts of ch  
   ----------------- ----------------  ---------------- -----------------         
    98 0.096 @ol     221 0.076 @ol      35 0.100 @edy    60 0.066 @edy   
    92 0.091 @o      150 0.052 @or      16 0.046 @dy     51 0.056 @dy    
    63 0.062 @or      95 0.033 @y       11 0.031 @ol     43 0.047 @cthy  
    61 0.060 @y       88 0.030 qot@y    10 0.029 @ey     22 0.024 @ety   
    39 0.038 @ey      55 0.019 @ey      10 0.029 @y      22 0.024 t@dy   
    23 0.023 @ody     53 0.018 t@y       9 0.026 @ody    20 0.022 qot@dy 
    19 0.019 @eey     44 0.015 ot@y      8 0.023 @eedy   15 0.017 @ey    
    15 0.015 @eol     37 0.013 @oty      8 0.023 @eody   13 0.014 @ol    
    14 0.014 @aiin    36 0.012 t@or      7 0.020 @eo     12 0.013 @ecthy 
    14 0.014 @e       34 0.012 @aiin     6 0.017 @ety    12 0.013 @ody   
    14 0.014 @odaiin  33 0.011 @eor      6 0.017 @or     12 0.013 ot@dy  
    12 0.012 @eor     32 0.011 ot@ol     5 0.014 @ed     11 0.012 @eody  
    11 0.011 t@o      31 0.011 @ody      5 0.014 @eey    10 0.011 @y     
    10 0.010 @eo      30 0.010 t@ol      5 0.014 @eol     9 0.010 @ty    
    10 0.010 @octhy   30 0.010 yt@y      5 0.014 d@edy    9 0.010 t@edy  
    10 0.010 ot@y     29 0.010 @cthy     5 0.014 t@dy     7 0.008 @daiin 
     9 0.009 @cthy    29 0.010 @o        4 0.011 @cthey   7 0.008 @eol   
 
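  Obviously "sh" and "ch" are very different: in language A, for example,
  "@o" makes up 9.1% of the sh contexts but only 1.0% of the ch contexts.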
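
  The context extraction itself is simple: mark the first occurrence of
  the target group in each word with "@", mark the first occurrence of
  the other group with "~", and tally the words that contain "@".  Here
  is a minimal Python sketch of that idea; it leaves out the k/t and
  p/f folding done by the sed step, and the function name and sample
  words are hypothetical.

```python
from collections import Counter

def contexts(words, target, other):
    """Tally word patterns with the first `target` replaced by '@' and the
    first `other` replaced by '~', keeping only words that contain `target`."""
    tally = Counter()
    for w in words:
        # Like sed s/x/@/ without the g flag: only the first match is replaced.
        marked = w.replace(target, "@", 1).replace(other, "~", 1)
        if "@" in marked:
            tally[marked] += 1
    return tally.most_common()

if __name__ == "__main__":
    sample = ["chol", "chol", "shol", "otchy", "chedy"]
    for pattern, n in contexts(sample, "ch", "sh"):
        print(f"{n:4d} {pattern}")
```

  Calling it a second time with the two arguments swapped gives the
  contexts of the other group, which is what the inner foreach loop
  does with "sh.ch" and "ch.sh".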
  
  Just to make double sure, we can play the same game with t and k:
  
    foreach lang ( a b )
      foreach f ( t.k k.t )
        cat Note-009/he${lang}-f.factored \
          | sed \
              -e 's/[- .:]//g' \
              -e 's/p/f/' \
              -e 's/'"${f:r}"'/@/' \
              -e 's/'"${f:e}"'/~/' \
          | grep '@' \
          | sort | uniq -c | expand \
          | sort +0 -1nr \
          | compute-freqs \
          > .tmp-he${lang}-${f:r}.frq
      end
    end
    dicio-wc .tmp-he{a,b}-{t,k}.frq
    pr -m -t -i' '1 -w 104 .tmp-he{a,b}-{t,k}.frq \
      | expand \
      > .tmp-t-k-cmp.txt

     lines   words     bytes file        
    ------ ------- --------- ------------
       642    1926     13627 .tmp-hea-t.frq
       683    2049     14438 .tmp-hea-k.frq
       271     813      5693 .tmp-heb-t.frq
       437    1311      9223 .tmp-heb-k.frq


     Language A                                          Language B                                
    ------------------------------------------------    ---------------------------------------------   
     contexts of t             contexts of k             contexts of t             contexts of k   
    ---------------------     ----------------------    --------------------      -------------------       
     96 0.054 c@hy             39 0.024 qo@chy           18 0.034 o@edy            41 0.037 qo@edy
     51 0.029 c@hol            33 0.020 o@y              16 0.030 o@ar             35 0.032 chc@hy
     49 0.028 qo@chy           29 0.018 @chy             14 0.027 o@al             35 0.032 o@aiin
     42 0.024 c@hor            28 0.017 c@hy             13 0.025 o@aiin           29 0.026 o@ar
     38 0.021 o@y              27 0.017 o@aiin           12 0.023 o@y              27 0.025 qo@ar
     34 0.019 c@hey            25 0.015 qo@y             12 0.023 qo@edy           25 0.023 o@edy
     29 0.016 o@chy            22 0.014 qo@ol            11 0.021 y@edy            23 0.021 o@al
     28 0.016 o@ol             21 0.013 o@ol             10 0.019 @edy             20 0.018 qo@aiin
     28 0.016 qo@y             20 0.012 @chor             9 0.017 @ar              19 0.017 che@y
     27 0.015 o@aiin           19 0.012 @chol             8 0.015 chc@hy           18 0.016 @ar
     24 0.014 @chy             18 0.011 @aiin             7 0.013 @chdy            17 0.015 o@y
     24 0.014 o@chol           18 0.011 @ol               7 0.013 o@am             16 0.015 o@eedy
     20 0.011 c@ho             18 0.011 y@chy             7 0.013 o@chdy           15 0.014 @chdy
     20 0.011 cho@y            17 0.010 cho@y             7 0.013 o@eol            15 0.014 qo@chdy
     18 0.010 qo@ol            16 0.010 qo@aiin           7 0.013 qo@ar            15 0.014 y@ar
     17 0.010 @ol              15 0.009 c@hol             7 0.013 y@eedy           14 0.013 @edy
    
  Hm, there is some resemblance, but not as much as I would like.
  Perhaps it will get better if I delete the [oqy] prefixes and
  replace cth, ckh by tch, kch:

    foreach lang ( a b )
      foreach f ( t.k k.t )
        cat Note-009/he${lang}-f.factored \
          | sed \
              -e 's/[- .:]//g' \
              -e 's/p/f/' \
              -e 's/^[qoy]*//' \
              -e 's/c\([tkpf]\)h/\1ch/' \
              -e 's/'"${f:r}"'/@/' \
              -e 's/'"${f:e}"'/~/' \
          | grep '@' \
          | sort | uniq -c | expand \
          | sort +0 -1nr \
          | compute-freqs \
          > .tmp-he${lang}-${f:r}.frq
      end
    end
    dicio-wc .tmp-he{a,b}-{t,k}.frq 
    pr -m -t -i' '1 -w 104 .tmp-he{a,b}-{t,k}.frq \
      | expand \
      > .tmp-t-k-cmp.txt
    
     lines   words     bytes file        
    ------ ------- --------- ------------
       446    1338      9376 .tmp-hea-t.frq
       453    1359      9489 .tmp-hea-k.frq
       205     615      4279 .tmp-heb-t.frq
       318     954      6630 .tmp-heb-k.frq

    
     Language A                                          Language B                                
    ------------------------------------------------    ---------------------------------------------   
     contexts of t             contexts of k             contexts of t             contexts of k   
    ---------------------     ----------------------    --------------------      -------------------       
     218 0.123 @chy            137 0.084 @chy             51 0.097 @edy             89 0.081 @ar
     109 0.061 @chol            80 0.049 @y               35 0.067 @ar              89 0.081 @edy
      99 0.056 @chor            75 0.046 @aiin            26 0.049 @chdy            77 0.070 @aiin
      82 0.046 @y               70 0.043 @ol              23 0.044 @y               44 0.040 @al
      73 0.041 @ol              62 0.038 @chol            20 0.038 @aiin            43 0.039 @eedy
      70 0.039 @chey            61 0.037 @chor            19 0.036 @al              41 0.037 @chdy
      63 0.036 @aiin            50 0.031 @chey            16 0.030 @chedy           35 0.032 ch@chy
      47 0.027 @or              49 0.030 @eey             13 0.025 @eedy            30 0.027 @y
      40 0.023 @cho             36 0.022 @or              12 0.023 @eey             25 0.023 @chy
      29 0.016 @chody           30 0.018 @cho             11 0.021 @am              25 0.023 @eey
      26 0.015 cho@chy          25 0.015 @eol             11 0.021 @chy             19 0.017 @eody
      23 0.013 @char            23 0.014 @al              10 0.019 @chey            19 0.017 che@y
      20 0.011 @eey             21 0.013 ch@chy           10 0.019 @eol             18 0.016 @ain
      20 0.011 cho@y            20 0.012 @ey              10 0.019 @ody             18 0.016 @am
      19 0.011 @chaiin          20 0.012 cho@chy           9 0.017 @or              14 0.013 @ol
      17 0.010 @al              19 0.012 @shy              8 0.015 @ol              13 0.012 @chedy
      17 0.010 ch@chy           18 0.011 @chody            8 0.015 ch@chy           11 0.010 @ey

  Not perfect, but convincing enough...
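
  For anyone without csh at hand, here is a minimal POSIX sh sketch
  of the context tally used in the second t/k run above. It is an
  approximation, not the script that produced the tables: the
  function name "contexts" is made up, it reads one word per line
  rather than the .factored file, and a small awk step stands in
  for compute-freqs, whose output format ("count freq word") is
  assumed from the listings.

```sh
# contexts TARGET OTHER
# Reads one word per line; prints "count freq context" for each
# context of TARGET, most frequent first.
contexts() {
  sed -e 's/[- .:]//g' \
      -e 's/p/f/' \
      -e 's/^[qoy]*//' \
      -e 's/c\([tkpf]\)h/\1ch/' \
      -e "s/$1/@/" \
      -e "s/$2/~/" \
    | grep '@' \
    | LC_ALL=C sort | uniq -c \
    | awk '{ n[NR] = $1; w[NR] = $2; t += $1 }
           END { for (i = 1; i <= NR; i++) printf "%d %.3f %s\n", n[i], n[i] / t, w[i] }' \
    | LC_ALL=C sort -k1,1nr -k3,3
}

# Toy input (a made-up word list, not manuscript counts):
printf '%s\n' qotchy otchy cthy tol kchy okol chotchy | contexts t k
# 3 0.600 @chy
# 1 0.200 @ol
# 1 0.200 cho@chy
```

  As in the sed commands above, only the first occurrence of each
  glyph is marked, and words that contain only the other glyph are
  dropped by the grep.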
  
  OK, let's try again to equalize the tail distributions:
  
    foreach lang ( a b )
      cat Note-009/he${lang}-f.factored \
        | sed \
            -e 's/d//g'      \
            -e 's/k/t/g'     \
            -e 's/f/p/g'     \
            -e 's/cth/tch/g' \
            -e 's/ckh/tch/g' \
            -e 's/cph/pch/g' \
            -e 's/cfh/pch/g' \
            -e 's/ei/a/g'    \
            -e 's/a/o/g'    \
        > .he${lang}-f-erg.factored
    end
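
  What this sed script does is easier to see on a few whole words.
  A minimal sh sketch: the function name "equalize" is made up, and
  the four words are only illustrations (the real input is the
  .factored files).

```sh
# Same substitutions, in the same order, as the sed command above.
equalize() {
  sed -e 's/d//g'      \
      -e 's/k/t/g'     \
      -e 's/f/p/g'     \
      -e 's/cth/tch/g' \
      -e 's/ckh/tch/g' \
      -e 's/cph/pch/g' \
      -e 's/cfh/pch/g' \
      -e 's/ei/a/g'    \
      -e 's/a/o/g'
}

printf '%s\n' daiin qokeedy chckhy cfhol | equalize
# oiin
# qoteey
# chtchy
# pchol
```

  Since k -> t and f -> p come first, the ckh and cfh rules never
  find anything left to match, and ei -> a followed by a -> o sends
  ei to o as well.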

    foreach lang ( a b )
      cat .he${lang}-f-erg.factored \
        | gawk '/./ {print ($1 $2 $3)}' \
        | sed -e 's/--//g' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-words-all.frq

      cat .he${lang}-f-erg.factored \
        | grep -v -e '- -' \
        | gawk '/./ {print $1}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-unifs-all.frq

      cat .he${lang}-f-erg.factored \
        | grep -e '- -' \
        | gawk '/./ {print $1}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-prefs-all.frq

      cat .he${lang}-f-erg.factored \
        | grep -e '- -' \
        | gawk '/./ {print $2}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-midfs-all.frq

      cat .he${lang}-f-erg.factored \
        | grep -e '- -' \
        | gawk '/./ {print $3}' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-suffs-all.frq

      cat .he${lang}-f-erg.factored \
        | grep -e '- -' \
        | gawk '/./ {print ($2 $3)}' \
        | sed -e 's/--//g' \
        | sort | uniq -c | expand | sort +0 -1nr \
        > .he${lang}-f-erg-tails-all.frq
    end
    dicio-wc .he{a,b}-f-erg-{word,unif,pref,midf,suff,tail}s-all.frq

     lines   words     bytes file        
    ------ ------- --------- ------------
      1563    3126     23155 .hea-f-erg-words-all.frq
       193     386      2524 .hea-f-erg-unifs-all.frq
        47      94       600 .hea-f-erg-prefs-all.frq
       286     572      4647 .hea-f-erg-midfs-all.frq
       126     252      1732 .hea-f-erg-suffs-all.frq
       888    1776     14121 .hea-f-erg-tails-all.frq
       880    1760     12786 .heb-f-erg-words-all.frq
       132     264      1735 .heb-f-erg-unifs-all.frq
        28      56       345 .heb-f-erg-prefs-all.frq
       193     386      3105 .heb-f-erg-midfs-all.frq
        76     152      1018 .heb-f-erg-suffs-all.frq
       506    1012      7898 .heb-f-erg-tails-all.frq
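
  The six pipelines above differ only in which fields they keep.
  A compact sh sketch of the same tallies, assuming (from the grep
  and gawk patterns) that a factored line is either a lone unifix
  or three fields "prefix- -midfix- -suffix". The function name
  "elements" and the sample lines are made up.

```sh
# elements KIND, with KIND one of: word unif pref midf suff tail
# Reads factored lines on stdin; prints "count element", most frequent first.
elements() {
  case "$1" in
    word) awk '/./ { print $1 $2 $3 }' | sed -e 's/--//g' ;;
    unif) grep -v -e '- -' | awk '/./ { print $1 }' ;;
    pref) grep -e '- -' | awk '/./ { print $1 }' ;;
    midf) grep -e '- -' | awk '/./ { print $2 }' ;;
    suff) grep -e '- -' | awk '/./ { print $3 }' ;;
    tail) grep -e '- -' | awk '/./ { print $2 $3 }' | sed -e 's/--//g' ;;
  esac \
    | LC_ALL=C sort | uniq -c \
    | awk '{ print $1, $2 }' \
    | LC_ALL=C sort -k1,1nr -k2,2
}

# Hypothetical sample in the assumed layout.
sample() {
  printf '%s\n' 'qo- -tch- -y' '- -ch- -ol' 'o- -tch- -y' 'oiin' '- -t- -'
}

sample | elements tail
# 2 -tchy
# 1 -chol
# 1 -t
```

  An empty prefix or suffix shows up as a bare "-", which is why
  "-" heads the prefix table below.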

    foreach elem ( word unif pref midf suff tail )
      foreach lang ( A.a B.b )
        set file = "he${lang:e}-f-erg-${elem}s-all"
        echo "${file}.frq -> ${file}.fmt"
        cat .${file}.frq \
          | compute-freqs \
          | gawk '\
                BEGIN {\
                  printf "by Friedman\nlanguage '"${lang:r}"'\n"; \
                  printf "freq pc '"${elem}"'ix\n---- -- ------------------\n";} \
                /./   {printf "%4d %2d %s\n",$1,int($2*100+0.5),$3; t+=$1;} \
                END   {printf "---- -- ------------------\n%4d 99 TOTAL\n",t;} \
              ' \
          > .${file}.fmt
      end
    end

    foreach elem ( word unif pref midf suff tail )
      set tfiles = ( )
      foreach lang ( a b )
        set file = "he${lang}-f-erg-${elem}s-all"
        set tfiles = ( ${tfiles} .${file}.fmt )
      end
      pr -m -t -i' '1 -w 54  ${tfiles} \
        | expand \
        > .herbal-f-erg-${elem}-cmp.txt
    end
    dicio-wc .herbal-f-erg-{word,unif,pref,midf,suff,tail}-cmp.txt

     lines   words     bytes file        
    ------ ------- --------- ------------
      1569    7361     55938 .herbal-f-erg-word-cmp.txt
       199    1007      7275 .herbal-f-erg-unif-cmp.txt
        53     257      1901 .herbal-f-erg-pref-cmp.txt
       292    1469     11188 .herbal-f-erg-midf-cmp.txt
       132     638      4738 .herbal-f-erg-suff-cmp.txt
       894    4214     32524 .herbal-f-erg-tail-cmp.txt

  With these transformations, the prefixes still look much the same
  in both languages (same top four, in the same order):
  
    by Friedman                by Friedman
    language A                 language B
    freq pc prefix             freq pc prefix
    ---- -- ------------------ ---- -- ------------------
    3857 65 -                  1269 52 -
     825 14 o-                  504 21 o-
     607 10 qo-                 300 12 qo-
     440  7 y-                  227  9 y-
      56  1 s-                   66  3 ol-
      42  1 ol-                  26  1 l-
      21  0 so-                   6  0 s-
      16  0 l-                    5  0 o:i-
      13  0 or-                   4  0 or-
      12  0 r-                    3  0 lo-
      11  0 oy-                   2  0 olo-
       7  0 o:i-                  2  0 qol-
       6  0 yo-                   2  0 r-
       5  0 os-                   1  0 lol-
       4  0 ro-                   1  0 lqo-
       4  0 sol-                  1  0 o:ii-
       4  0 sy-                   1  0 o:n-
       3  0 lo-                   1  0 oo-
       2  0 ls-                   1  0 oro-
       2  0 o:in-                 1  0 orol-
       2  0 oo-                   1  0 oy-
       2  0 oro-                  1  0 so:i-
       2  0 qoo:i-                1  0 sol-
  
  The suffixes are close enough:
  
    by Friedman                by Friedman
    language A                 language B
    freq pc suffix             freq pc suffix
    ---- -- ------------------ ---- -- ------------------
    1853 31 -y                 1173 48 -y
    1028 17 -ol                 250 10 -or
     881 15 -or                 202  8 -ol
     456  8 -o                  179  7 -oiin
     370  6 -oiin               122  5 -oy
     266  5 -oy                  96  4 -
     136  2 -                    73  3 -o
     130  2 -om                  47  2 -om
      96  2 -ooiin               34  1 -os
      84  1 -oly                 33  1 -oin
      77  1 -s                   31  1 -oly
      76  1 -os                  31  1 -s
      53  1 -oin                 20  1 -oir
      44  1 -ory                 11  1 -ooiin
      40  1 -oor                 10  0 -oor
      35  1 -on                   9  0 -ory
      28  1 -ool                  6  0 -orom
      12  0 -n                    6  0 -oror
      12  0 -ols                  6  0 -yy
      12  0 -yy                   5  0 -ool

  The midfixes are still very different:
  
    by Friedman                by Friedman
    language A                 language B
    freq pc midfix             freq pc midfix
    ---- -- ------------------ ---- -- ------------------
    1090 18 -tch-               590 24 -t-
    1045 18 -ch-                274 11 -te-
     913 15 -t-                 172  7 -ch-
     526  9 -sh-                163  7 -che-
     251  4 -che-               141  6 -tch-
     191  3 -tche-              135  6 -tee-
     181  3 -pch-               110  5 -she-
     155  3 -te-                 79  3 -sh-
     142  2 -she-                64  3 -tche-
     131  2 -tee-                57  2 -chtch-
      96  2 -chot-               48  2 -chet-
      93  2 -tsh-                48  2 -p-
      69  1 -chtch-              46  2 -pch-
      61  1 -chotch-             39  2 -pche-
      60  1 -p-                  24  1 -shee-
      58  1 -chee-               23  1 -chee-
      50  1 -cht-                19  1 -cht-
      43  1 -pche-               18  1 -ee-
      36  1 -shee-               18  1 -tsh-
      30  1 -tchee-              18  1 -tshe-
      25  0 -eee-                16  1 -chetch-
  
  And the tails, oh my:
  
    by Friedman                by Friedman
    language A                 language B
    freq pc tailix             freq pc tailix
    ---- -- ------------------ ---- -- ------------------
     379  6 -tchy               171  7 -tey
     269  5 -chol               149  6 -tor
     232  4 -chor               113  5 -tchy
     195  3 -tol                108  4 -toiin
     192  3 -tchol               99  4 -teey
     191  3 -tchor               97  4 -chey
     182  3 -ty                  89  4 -tol
     163  3 -toiin               73  3 -chy
     154  3 -chy                 63  3 -ty
     121  2 -tchey               58  2 -shey
     116  2 -sho                 52  2 -tchey
     114  2 -tor                 49  2 -chtchy
     104  2 -shol                30  1 -shy
      95  2 -tcho                30  1 -tom
      88  2 -chey                28  1 -pchey
      83  1 -shy                 25  1 -pchy
      82  1 -shor                25  1 -teoy
      75  1 -teey                23  1 -chety
      61  1 -cho                 23  1 -toin
      58  1 -cheor               22  1 -toly
      58  1 -choiin              21  1 -teol
      53  1 -tchoy               20  1 -chol
      47  1 -chotchy             18  1 -toy
      45  1 -choy                17  1 -sheey
      45  1 -shey                16  1 -chetchy
      44  1 -teol                16  1 -chor
      43  1 -chtchy              16  1 -choy
      42  1 -pchy                14  1 -teo

  The unifixes are rather OK, I think, except for the 
  inversion between "oiin" and "or":
  
    by Friedman                by Friedman
    language A                 language B
    freq pc unifix             freq pc unifix
    ---- -- ------------------ ---- -- ------------------
     441 24 oiin                149 19 or
     175 10 or                  126 16 oiin
     145  8 ol                   75 10 ol
     107  6 y                    35  4 y
      88  5 s                    25  3 soiin
      77  4 oin                  22  3 om
      55  3 om                   18  2 oroiin
      40  2 soiin                17  2 oly
      31  2 ooiin                16  2 oy
      30  2 sor                  13  2 oloiin
      28  2 oir                  12  2 oin
      28  2 sol                  12  2 olor
      25  1 o                    12  2 s
      25  1 sy                   10  1 ory
      20  1 qooiin                9  1 ooiin
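
  All these comparisons are by eye. One possible way to put a
  number on them is the overlap of the two distributions: the sum,
  over all elements, of the smaller of the two relative
  frequencies (1 for identical lists, 0 for disjoint ones). This is
  only a sketch and has not been run on the files above; the
  function name "overlap" is made up, and the input is assumed to
  be two "count element" lists like the -all.frq files.

```sh
# overlap FILE_A FILE_B
# Each file holds "count element" lines; prints the overlap as a
# number between 0.000 and 1.000.
overlap() {
  awk '
    FNR == 1   { nfile++ }
    NF < 2     { next }
    nfile == 1 { a[$2] = $1; ta += $1 }
    nfile == 2 { b[$2] = $1; tb += $1 }
    END {
      if (ta == 0 || tb == 0) { print "0.000"; exit }
      for (w in a) if (w in b) {
        fa = a[w] / ta; fb = b[w] / tb
        s += (fa < fb) ? fa : fb
      }
      printf "%.3f\n", s
    }
  ' "$1" "$2"
}

# Intended use (not run here):
#   overlap .hea-f-erg-prefs-all.frq .heb-f-erg-prefs-all.frq
```

  The measure ignores how similar two different elements are, so it
  can only supplement the tables, not replace them.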
  
  The words as a whole are rather different:
  
    by Friedman                by Friedman
    language A                 language B
    freq pc wordix             freq pc wordix
    ---- -- ------------------ ---- -- ------------------
     441  6 oiin                149  5 or
     247  3 chol                126  4 oiin
     201  3 chor                 80  3 chey
     182  2 tchy                 75  2 ol
     175  2 or                   64  2 chy
     145  2 ol                   56  2 qotey
     126  2 chy                  54  2 otey
     108  1 sho                  54  2 otor
     107  1 tchol                53  2 shey
     107  1 y                    49  2 otoiin
     104  1 tchor                48  2 chtchy
     101  1 qotchy               44  1 otol
     100  1 shol                 43  1 tchy
      88  1 s                    35  1 qotor
      79  1 otol                 35  1 y
      79  1 shor                 33  1 tor
      77  1 oin                  31  1 oteey
      77  1 oty                  30  1 oty