let's recompute the frequency of each glyph, excluding garbage,
  without classification:
  
    cat .voyn.glp \
      | sed \
          -e 's:([/ _=]*):@:g' \
          -e 's/)(/)@(/g' \
      | tr '@' '\n' \
      | egrep '.' \
      | grep -v '\*' \
      | sort | uniq -c | expand \
      | sort -k1.9,1.99 \
      | compute-freqs \
      > .voyn-glyphs.frq
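
  As a rough illustration of what the splitting stage of this pipeline
  does, here is a minimal sketch run on two made-up input lines.  The
  sample lines and the helper name split_glyphs are hypothetical, and the
  layout of .voyn.glp (glyphs in parentheses, with separators such as
  (_), (/) and (=)) is only inferred from the sed patterns:

```shell
# Hypothetical helper: split a glyph-parsed line into one glyph per line,
# dropping separators and any glyph that contains '*' (taken here to mark garbage).
split_glyphs() {
  sed -e 's:([/ _=]*):@:g' -e 's/)(/)@(/g' \
    | tr '@' '\n' \
    | grep '.' \
    | grep -v '\*'
}

# Two invented sample lines, NOT taken from the real file.
printf '%s\n' '(qo)(d)(ae)(_)(t)(oe)(=)' '(qo)(*)(g)(_)(d)(ar)(/)' \
  | split_glyphs \
  | sort | uniq -c
```

  On these two lines, (qo) and (d) should each be counted twice, the
  other glyphs once, and the (*) entry should be dropped.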
      
  There are 763 lines, and a total of 17561 glyphs by the current parsing.
  Hence the average number of glyphs per line is 23.0.

  If we believe that most Bio lines are just paragraph continuation
  lines, then the line-initial and line-final "glyphs" should reflect
  the beginnings and endings of true words.  If we take the number of
  words per line in .voyn.fsg (9.44) as basically correct, then a
  glyph that is strictly word-final in the VMs language should have
  1/9.44 of its occurrences in line-final position.
  
  The probability of some occurrence of a glyph being word-end
  can be estimated as Pwe = 9.44*NLE/NT, where NLE is the number
  of occurrences as line-end and NT the total occurrences. 
  The probability Pws of it being word-start is estimated the same way
  from the number NLS of occurrences as line-start.
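
  The estimate can be sketched as a small shell helper.  The name
  boundary_prob is hypothetical and is not one of the tools used in
  these notes; the default of 9.44 words per line is the figure quoted
  above for .voyn.fsg:

```shell
# Hypothetical helper: estimate Pwe (or Pws) for one glyph.
#   $1 = NLE or NLS  (occurrences at line-end or line-start)
#   $2 = NT          (total occurrences of the glyph)
#   $3 = words per line (optional, defaults to 9.44)
boundary_prob() {
  awk -v n="$1" -v t="$2" -v w="${3:-9.44}" \
    'BEGIN { if (t > 0) printf "%.2f\n", w * n / t; else print "NA" }'
}

boundary_prob 218 1726   # (g) as line-end:    prints 1.19
boundary_prob 169 1436   # (qo) as line-start: prints 1.11
```

  Note that this is a ratio estimate, not a true probability, so it can
  exceed 1 when a glyph is over-represented at line boundaries.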

  Let's compute the frequencies of glyphs at line-end and word-end 
  (where the word division is taken from the transcription file):
      
    cat .voyn.glp \
      | sed \
          -e 's:([_/=]*)::g' \
          -e 's:([^)]*)$:-@&@:g' \
      | tr '@' '\012' \
      | egrep -v -e '-$' \
      | egrep '.' \
      | egrep -v '\*' \
      | sort | uniq -c | expand \
      | sort -k1.1,1.7nr \
      | compute-freqs \
      > .voyn-line-fin-g.frq
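
  To see how the line-final glyph is isolated, here is a minimal sketch
  of the first stages of the pipeline above, applied to two made-up
  lines.  The helper name line_final_glyph and the sample lines are
  hypothetical; the input layout is again only inferred from the sed
  patterns:

```shell
# Hypothetical helper: print only the last glyph of each input line.
# Separators are deleted first; then the final "(...)" group is split off
# and the remainder of the line, which now ends in '-', is discarded.
line_final_glyph() {
  sed -e 's:([_/=]*)::g' -e 's:([^)]*)$:-@&@:g' \
    | tr '@' '\n' \
    | grep -v -e '-$' \
    | grep '.' \
    | grep -v '\*'
}

# Two invented sample lines, NOT taken from the real file.
printf '%s\n' '(qo)(d)(ae)(_)(t)(oe)(=)' '(sc)(bg)(_)(d)(g)' \
  | line_final_glyph
```

  For these inputs the sketch should print (oe) and then (g), one per
  line.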

    cat .voyn.glp \
      | sed \
          -e 's:([/=]*)::g' \
          -e 's:(_[(_)]*)$:-:g' \
          -e 's:\(([^)]*)\)(_):-@\1@:g' \
      | tr '@' '\012' \
      | egrep -v -e '-$' \
      | egrep -v '\*' \
      | egrep '.' \
      | egrep -v '^\(_\)' \
      | sort | uniq -c | expand \
      | sort -k1.1,1.7nr \
      | compute-freqs \
      > .voyn-word-fin-g.frq
      
    compare-freqs \
        .voyn-glyphs.frq \
        .voyn-line-fin-g.frq \
        .voyn-word-fin-g.frq \
      | gawk '/[0-9]\.[0-9]/ {printf "%-63s %5.2f\n", $0, (9.44*$5/$3)}' \
      > .voyn-fin.cmp

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
         43 0.002     40 0.053      2 0.000  (ak)        8.78
         10 0.001      8 0.011      1 0.000  (K?)        7.55
         12 0.001      8 0.011      3 0.001  (6?)        6.29
          2 0.000      1 0.001      1 0.000  (IIIK?)     4.72
          2 0.000      1 0.001      1 0.000  (IIK?)      4.72
       1726 0.098    218 0.287   1308 0.230  (g)         1.19
        286 0.016     30 0.039    178 0.031  (or)        0.99

        557 0.032     54 0.071    343 0.060  (ae)        0.92
       2055 0.117    184 0.242   1818 0.320  (bg)        0.85
         23 0.001      2 0.003     20 0.004  (M?)        0.82
        414 0.024     31 0.041    374 0.066  (am)        0.71

        405 0.023     25 0.033    313 0.055  (ar)        0.58
        495 0.028     29 0.038    451 0.079  (an)        0.55
         17 0.001      1 0.001      7 0.001  (qor)       0.56
       1152 0.066     57 0.075    490 0.086  (oe)        0.47
        193 0.011      9 0.012     81 0.014  (qoe)       0.44

        186 0.011      8 0.011     39 0.007  (r)         0.41
        456 0.026     18 0.024     34 0.006  (e)         0.37
        365 0.021     11 0.014     57 0.010  (z)         0.28
        685 0.039     12 0.016     61 0.011  (b)         0.17

         30 0.002      0 0.000      5 0.001  (4?)        0.00
         10 0.001      0 0.000      0 0.000  (I?)        0.00
         10 0.001      0 0.000      0 0.000  (II?)       0.00
          2 0.000      0 0.000      2 0.000  (IIL?)      0.00
          6 0.000      0 0.000      1 0.000  (L?)        0.00
          7 0.000      0 0.000      6 0.001  (N?)        0.00
         28 0.002      0 0.000     24 0.004  (air?)      0.00
       1141 0.065      0 0.000     14 0.002  (d)         0.00
        412 0.023      0 0.000      0 0.000  (dc)        0.00
        440 0.025      0 0.000      1 0.000  (dcc)       0.00
        147 0.008      0 0.000      1 0.000  (dz)        0.00
         52 0.003      0 0.000      0 0.000  (dzc)       0.00
         32 0.002      0 0.000      1 0.000  (f)         0.00
          3 0.000      0 0.000      0 0.000  (fz)        0.00
          1 0.000      0 0.000      0 0.000  (fzc)       0.00
        513 0.029      0 0.000      4 0.001  (h)         0.00
        210 0.012      0 0.000      0 0.000  (hc)        0.00
        129 0.007      0 0.000      0 0.000  (hcc)       0.00
         91 0.005      1 0.001      0 0.000  (hz)        0.10
         30 0.002      0 0.000      0 0.000  (hzc)       0.00
        885 0.050      8 0.011     16 0.003  (o)         0.09
        195 0.011      0 0.000      4 0.001  (p)         0.00
         12 0.001      0 0.000      0 0.000  (pz)        0.00
          9 0.001      0 0.000      0 0.000  (pzc)       0.00
       1436 0.082      1 0.001     10 0.002  (qo)        0.01
        212 0.012      0 0.000      4 0.001  (s)         0.00
        716 0.041      1 0.001      0 0.000  (sc)        0.01
        150 0.009      0 0.000      1 0.000  (scc)       0.00
        400 0.023      0 0.000      1 0.000  (t)         0.00
        927 0.053      0 0.000      1 0.000  (tc)        0.00
        126 0.007      0 0.000      1 0.000  (tcc)       0.00

  
  The statistics for line-ends roughly agree with those of word-ends.

  By the Pwe criterion, the glyphs that are almost certain to be
  word-ends are
  
       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
       1726 0.098    218 0.287   1308 0.230  (g)         1.19
        286 0.016     30 0.039    178 0.031  (or)        0.99

  The following glyphs are anomalous in that they occur at line-end
  (rarely, except for (ak)) but hardly ever at other word-ends.  Thus
  they may be abbreviations, continuation signs, or truncated words.

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
         43 0.002     40 0.053      2 0.000  (ak)        8.78
         12 0.001      8 0.011      3 0.001  (6?)        6.29
         10 0.001      8 0.011      1 0.000  (K?)        7.55
 
  The following glyphs appear to be very likely, but not certain,
  word-ends (about 75-90% chance):

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
       2055 0.117    184 0.242   1818 0.320  (bg)        0.85
        557 0.032     54 0.071    343 0.060  (ae)        0.92
        414 0.024     31 0.041    374 0.066  (am)        0.71

  The following glyphs appear to be likely, but not certain,
  word-ends (about 50% chance):

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
        405 0.023     25 0.033    313 0.055  (ar)        0.58
        495 0.028     29 0.038    451 0.079  (an)        0.55
       1152 0.066     57 0.075    490 0.086  (oe)        0.47
        193 0.011      9 0.012     81 0.014  (qoe)       0.44
         17 0.001      1 0.001      7 0.001  (qor)       0.56

  The following glyphs MAY occur at end-of-word, but appear to
  occur with higher frequency in non-word-final position:

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
        685 0.039     12 0.016     61 0.011  (b)         0.17
        365 0.021     11 0.014     57 0.010  (z)         0.28
        456 0.026     18 0.024     34 0.006  (e)         0.37
        186 0.011      8 0.011     39 0.007  (r)         0.41

  Numerically, (air?) would not seem to be a valid word-end,
  but structurally it is like (ar), and its occurrences 
  suggest it should be allowed:

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
         28 0.002      0 0.000     24 0.004  (air?)      0.00


  Finally, these glyphs do NOT seem to be valid word-ends:

       tot occurs    line ends    word ends  glyph        Pwe
       ----------   ----------   ----------  ---------  -----
         30 0.002      0 0.000      5 0.001  (4?)        0.00

        885 0.050      8 0.011     16 0.003  (o)         0.09
       1436 0.082      1 0.001     10 0.002  (qo)        0.01

       1141 0.065      0 0.000     14 0.002  (d)         0.00
        412 0.023      0 0.000      0 0.000  (dc)        0.00
        440 0.025      0 0.000      1 0.000  (dcc)       0.00
        147 0.008      0 0.000      1 0.000  (dz)        0.00
         52 0.003      0 0.000      0 0.000  (dzc)       0.00

        513 0.029      0 0.000      4 0.001  (h)         0.00
        210 0.012      0 0.000      0 0.000  (hc)        0.00
        129 0.007      0 0.000      0 0.000  (hcc)       0.00
         91 0.005      1 0.001      0 0.000  (hz)        0.10
         30 0.002      0 0.000      0 0.000  (hzc)       0.00

        212 0.012      0 0.000      4 0.001  (s)         0.00
        716 0.041      1 0.001      0 0.000  (sc)        0.01
        150 0.009      0 0.000      1 0.000  (scc)       0.00

        400 0.023      0 0.000      1 0.000  (t)         0.00
        927 0.053      0 0.000      1 0.000  (tc)        0.00
        126 0.007      0 0.000      1 0.000  (tcc)       0.00

         38 0.002      2 0.003      0 0.000  (c?)        0.50
         48 0.003      0 0.000      0 0.000  (cc?)       0.00
         29 0.002      0 0.000      0 0.000  (ccc?)      0.00

         32 0.002      0 0.000      1 0.000  (f)         0.00
          3 0.000      0 0.000      0 0.000  (fz)        0.00
          1 0.000      0 0.000      0 0.000  (fzc)       0.00

        195 0.011      0 0.000      4 0.001  (p)         0.00
         12 0.001      0 0.000      0 0.000  (pz)        0.00
          9 0.001      0 0.000      0 0.000  (pzc)       0.00
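
  The grouping used in the tables above can be roughly mimicked by
  thresholding Pwe.  This is only a sketch: the cutoffs and the tier
  labels below are assumptions read off the tables, not values stated
  in these notes, and the actual grouping also uses structural
  judgment (for example (air?) and (c?) are placed by hand):

```shell
# Hypothetical helper: map a Pwe value to a rough tier.
# All cutoffs are assumed for illustration only.
pwe_tier() {
  awk -v p="$1" 'BEGIN {
    if      (p >= 2.00) print "anomalous";       # far above 1, e.g. (ak)
    else if (p >= 0.95) print "almost certain";  # e.g. (g), (or)
    else if (p >= 0.70) print "very likely";     # e.g. (bg), (ae), (am)
    else if (p >= 0.43) print "likely";          # e.g. (ar), (an), (oe)
    else if (p >= 0.15) print "possible";        # e.g. (b), (z), (e), (r)
    else                print "unlikely";        # e.g. (o), (qo), (d)
  }'
}

pwe_tier 1.19   # (g)
pwe_tier 0.85   # (bg)
pwe_tier 0.28   # (z)
pwe_tier 0.01   # (qo)
```

  Glyphs with very few occurrences would need separate handling, since
  a single line-end occurrence can move their Pwe by a large amount.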
  
  Let's now look at the frequencies of line-starts and word-starts:

    cat .voyn.glp \
      | sed \
          -e 's/(_)//g' \
          -e 's:^([^)]*):@&@-:g' \
      | tr '@' '\012' \
      | egrep -v '^-' \
      | egrep -v '\*' \
      | egrep '.' \
      | egrep -v '^\(//\)$' \
      | sort | uniq -c | expand \
      | sort -k1.1,1.7nr \
      | compute-freqs \
      > .voyn-line-ini-g.frq

    cat .voyn.glp \
      | sed \
          -e 's:([/=]*)::g' \
          -e 's:^(_):-:g' \
          -e 's:(_)\(([^)]*)\):@\1@-:g' \
      | tr '@' '\012' \
      | egrep -v '^-' \
      | egrep -v '\*' \
      | egrep -v '^\(_\)$' \
      | egrep '.' \
      | sort | uniq -c | expand \
      | sort -k1.1,1.7nr \
      | compute-freqs \
      > .voyn-word-ini-g.frq

    compare-freqs \
        .voyn-glyphs.frq \
        .voyn-line-ini-g.frq \
        .voyn-word-ini-g.frq \
      | gawk '/[0-9]\.[0-9]/ {printf "%-63s %5.2f\n", $0, (9.44*$5/$3)}' \
      > .voyn-ini.cmp

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        365 0.021    138 0.181    140 0.025  (z)         3.57
        195 0.011     71 0.093     22 0.004  (p)         3.44
         17 0.001      5 0.007     11 0.002  (qor)       2.78
         30 0.002      6 0.008     22 0.004  (4?)        1.89
        685 0.039    103 0.135    350 0.062  (b)         1.42
       1436 0.082    169 0.221   1251 0.220  (qo)        1.11

        193 0.011     18 0.024    172 0.030  (qoe)       0.88
        513 0.029     44 0.058     50 0.009  (h)         0.81

         32 0.002      2 0.003      4 0.001  (f)         0.59

        456 0.026     26 0.034    340 0.060  (e)         0.54

        885 0.050     32 0.042    678 0.119  (o)         0.34
       1726 0.098     59 0.077     79 0.014  (g)         0.32
        212 0.012      7 0.009    129 0.023  (s)         0.31
        186 0.011      4 0.005    130 0.023  (r)         0.20
        716 0.041     15 0.020    445 0.078  (sc)        0.20
       1152 0.066     23 0.030    543 0.096  (oe)        0.19
        400 0.023      5 0.007    205 0.036  (t)         0.12
        927 0.053     11 0.014    457 0.080  (tc)        0.11

         38 0.002      2 0.003     11 0.002  (c?)        0.50
         48 0.003      1 0.001      9 0.002  (cc?)       0.20
         29 0.002      0 0.000      3 0.001  (ccc?)      0.00

        210 0.012      3 0.004     27 0.005  (hc)        0.13
        129 0.007      3 0.004      9 0.002  (hcc)       0.22
         30 0.002      1 0.001      6 0.001  (hzc)       0.31
         12 0.001      1 0.001      4 0.001  (pz)        0.79
         91 0.005      1 0.001      9 0.002  (hz)        0.10
        286 0.016      3 0.004     96 0.017  (or)        0.10
        147 0.008      1 0.001      3 0.001  (dz)        0.06
        126 0.007      1 0.001     84 0.015  (tcc)       0.07
        150 0.009      1 0.001     96 0.017  (scc)       0.06
       1141 0.065      5 0.007     67 0.012  (d)         0.04
        412 0.023      1 0.001     15 0.003  (dc)        0.02
          1 0.000      0 0.000      0 0.000  (fzc)       0.00
          2 0.000      0 0.000      0 0.000  (IIIK?)     0.00
          2 0.000      0 0.000      0 0.000  (IIK?)      0.00
          2 0.000      0 0.000      0 0.000  (IIL?)      0.00
          3 0.000      0 0.000      2 0.000  (fz)        0.00
          6 0.000      0 0.000      2 0.000  (L?)        0.00
          7 0.000      0 0.000      0 0.000  (N?)        0.00
          9 0.001      0 0.000      2 0.000  (pzc)       0.00
         10 0.001      0 0.000      0 0.000  (II?)       0.00
         10 0.001      0 0.000      0 0.000  (K?)        0.00
         10 0.001      0 0.000      1 0.000  (I?)        0.00
         12 0.001      0 0.000      1 0.000  (6?)        0.00
         23 0.001      0 0.000      0 0.000  (M?)        0.00
         28 0.002      0 0.000      1 0.000  (air?)      0.00
         43 0.002      0 0.000      5 0.001  (ak)        0.00
         52 0.003      0 0.000      3 0.001  (dzc)       0.00
        405 0.023      0 0.000     28 0.005  (ar)        0.00
        414 0.024      0 0.000     36 0.006  (am)        0.00
        440 0.025      0 0.000     14 0.002  (dcc)       0.00
        495 0.028      0 0.000     17 0.003  (an)        0.00
        557 0.032      0 0.000     39 0.007  (ae)        0.00
       2055 0.117      1 0.001     62 0.011  (bg)        0.00


  On the other hand, there are noticeable discrepancies between the
  statistics of line-starts and word-starts.  The following glyphs
  can be taken as sure word-starts:
                              
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        365 0.021    138 0.181    140 0.025  (z)         3.57
        685 0.039    103 0.135    350 0.062  (b)         1.42

         30 0.002      6 0.008     22 0.004  (4?)        1.89
       1436 0.082    169 0.221   1251 0.220  (qo)        1.11

  The following glyphs are word-starters with high probability:

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
         17 0.001      5 0.007     11 0.002  (qor)       2.78
        193 0.011     18 0.024    172 0.030  (qoe)       0.88

  These are word-starters with fair probability (~50%):

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        456 0.026     26 0.034    340 0.060  (e)         0.54
       1726 0.098     59 0.077     79 0.014  (g)         0.32

  The (o) glyph is structurally like (qo) and hence should 
  be a word-start with high probability. However its frequency
  as line-start is rather low.  Perhaps it is not word-start
  after all.

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        885 0.050     32 0.042    678 0.119  (o)         0.34

  These MAY occur as word-starts, but appear to be fairly common
  in mid-word as well:

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        186 0.011      4 0.005    130 0.023  (r)         0.20
       1152 0.066     23 0.030    543 0.096  (oe)        0.19
        286 0.016      3 0.004     96 0.017  (or)        0.10
        212 0.012      7 0.009    129 0.023  (s)         0.31
        716 0.041     15 0.020    445 0.078  (sc)        0.20
        400 0.023      5 0.007    205 0.036  (t)         0.12
        927 0.053     11 0.014    457 0.080  (tc)        0.11

  It would appear that glyph (d) is unpopular as a line-start, whereas
  (h) is a word-start with high probability (~80%). However, if (d) and
  (h) are equivalent, the difference can easily be explained as a
  calligraphic effect. In that case the estimated 
  probability of (d)/(h) being word-start drops to 0.28, which is 
  more reasonable.  The same can be said about the other gallows,
  except that they are less probable as word-starts.  
  
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        513 0.029     44 0.058     50 0.009  (h)         0.81
       1141 0.065      5 0.007     67 0.012  (d)         0.04
       1654 0.094     49 0.065    117 0.021  ([dh])      0.28

        210 0.012      3 0.004     27 0.005  (hc)        0.13
        412 0.023      1 0.001     15 0.003  (dc)        0.02
        622 0.035      4 0.005     42 0.008  ([dh]c)     0.06

        129 0.007      3 0.004      9 0.002  (hcc)       0.22
        440 0.025      0 0.000     14 0.002  (dcc)       0.00
        569 0.032      4 0.005     23 0.004  ([dh]cc)    0.07

         91 0.005      1 0.001      9 0.002  (hz)        0.10
        147 0.008      1 0.001      3 0.001  (dz)        0.06
        238 0.013      2 0.002     12 0.003  ([dh]z)     0.08

         30 0.002      1 0.001      6 0.001  (hzc)       0.31
         52 0.003      0 0.000      3 0.001  (dzc)       0.00
         82 0.005      1 0.001      9 0.001  ([dh]zc)    0.12
         
  Thus, an HD-gallows MAY be a word-start, but is not likely to be.
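
  The pooled figures in the ([dh]...) rows follow from adding the
  line-start counts and the total counts of the two glyphs before
  applying the same 9.44*NLS/NT estimate.  A minimal sketch, with the
  hypothetical helper name pooled_pws:

```shell
# Hypothetical helper: Pws for two glyphs treated as one.
#   $1,$2 = NLS and NT of the first glyph
#   $3,$4 = NLS and NT of the second glyph
# 9.44 is the words-per-line figure used throughout these notes.
pooled_pws() {
  awk -v a="$1" -v b="$2" -v c="$3" -v d="$4" 'BEGIN {
    nt = b + d
    if (nt > 0) printf "%.2f\n", 9.44 * (a + c) / nt; else print "NA"
  }'
}

pooled_pws 44 513 5 1141   # (h) + (d):   prints 0.28
pooled_pws 3 210 1 412     # (hc) + (dc): prints 0.06
```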
  
  The gallows (p) and (f) can be taken as probable word-starts.  The
  (p) sign is in fact anomalous, in that it occurs more often at
  line-start than at (internal) word-start.  This observation is
  consistent with the theory that (p) and (f) are ornate "capitals"
  and hence common as paragraph-starts.
   
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        195 0.011     71 0.093     22 0.004  (p)         3.44
         32 0.002      2 0.003      4 0.001  (f)         0.59
        227 0.013     73 0.096     26 0.005  ([fp])      3.04
 
  The other FP-gallows are possibly word-starts, but are so rare that
  we should not trust them:

       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
         12 0.001      1 0.001      4 0.001  (pz)        0.79
          3 0.000      0 0.000      2 0.000  (fz)        0.00
         15 0.001      1 0.001      6 0.001  ([fp]z)     0.63

          1 0.000      0 0.000      0 0.000  (fzc)       0.00
          9 0.001      0 0.000      2 0.000  (pzc)       0.00
         10 0.001      0 0.000      2 0.000  ([fp]zc)    0.00
  
  The unattached groups (c?), (cc?), and (ccc?) have a small 
  probability of being word-starts, but since they are 
  likely to be errors, we should not trust them.
  
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
         38 0.002      2 0.003     11 0.002  (c?)        0.50
         48 0.003      1 0.001      9 0.002  (cc?)       0.20
         29 0.002      0 0.000      3 0.001  (ccc?)      0.00


  Finally, these glyphs seem NOT to be valid word-starts:
  
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
        150 0.009      1 0.001     96 0.017  (scc)       0.06
        126 0.007      1 0.001     84 0.015  (tcc)       0.07

         43 0.002      0 0.000      5 0.001  (ak)        0.00
        405 0.023      0 0.000     28 0.005  (ar)        0.00
        414 0.024      0 0.000     36 0.006  (am)        0.00
        495 0.028      0 0.000     17 0.003  (an)        0.00
        557 0.032      0 0.000     39 0.007  (ae)        0.00

       2055 0.117      1 0.001     62 0.011  (bg)        0.00

          2 0.000      0 0.000      0 0.000  (IIIK?)     0.00
          2 0.000      0 0.000      0 0.000  (IIK?)      0.00
          2 0.000      0 0.000      0 0.000  (IIL?)      0.00
          6 0.000      0 0.000      2 0.000  (L?)        0.00
          7 0.000      0 0.000      0 0.000  (N?)        0.00
         10 0.001      0 0.000      0 0.000  (II?)       0.00
         10 0.001      0 0.000      0 0.000  (K?)        0.00
         10 0.001      0 0.000      1 0.000  (I?)        0.00
         12 0.001      0 0.000      1 0.000  (6?)        0.00
         23 0.001      0 0.000      0 0.000  (M?)        0.00
         28 0.002      0 0.000      1 0.000  (air?)      0.00

  By the way, note that the following glyphs appear to occur often at
  word-start, IF we trust the word divisions as transcribed by
  FSG/Currier.  However, if we consider only line-starts, their
  frequency drops considerably.  Hence they are probably false
  word-starts:
  
       tot occurs   line start   word start  glyph        Pws
       ----------   ----------   ----------  ---------  -----
       2055 0.117      1 0.001     62 0.011  (bg)        0.00
       1141 0.065      5 0.007     67 0.012  (d)         0.04
        286 0.016      3 0.004     96 0.017  (or)        0.10
        186 0.011      4 0.005    130 0.023  (r)         0.20
        212 0.012      7 0.009    129 0.023  (s)         0.31
        150 0.009      1 0.001     96 0.017  (scc)       0.06
        400 0.023      5 0.007    205 0.036  (t)         0.12
        126 0.007      1 0.001     84 0.015  (tcc)       0.07