Note 034

Hacking at the Voynich manuscript - Side notes
034 The "key-like sequences" collected

INTRODUCTION

  Here and there in the VMs one finds sequences of very short "words",
  each "word" having either a single EVA letter, or a group of EVA
  letters that tend to occur together.  Those sequences are traditionally
  called "key-like sequences", although there is as yet no proof that
  they have anything to do with cryptography.

  Those sequences are nonetheless valuable hints about the
  "true" Voynichese alphabet.  The aim of this note is to
  collect them into a single document, in a convenient format.
  
LOCATIONS

  The pages where key-like sequences have been noted are:
  
    f1r   alphabets (Latin, Voynichese, Latin) down right margin.
    
    f49v  digits and Voynichese letters down left margin.
    
    f57v  three rings of Voynichese letters, words, and weirdos.
    
    f66r  column of letters down middle of page.
    
    f69r  letters around central star.
    
    f75v  vertical `word' at upper right corner.
    
    f76r  vertical `title' down left margin.

SPECIAL SYMBOLS

  The key-like sequences include several Voynichese glyphs that are
  otherwise rare ("weirdos") or look like isolated parts of EVA
  characters.  For brevity we write them as &X where X is a single
  capital letter, according to this table:
    
    &X  EVA   description
    --  ----  ------------------------------------------------------
    &C  ???   squarish <c>; <I> with bent ligature.
    &D  &140  (or &163)  right half of <o>
    &G  &171  gallows with three loops.
    &H  &170  like <h> with top detached.
    &I  I     first half of <ih>.
    &J  ???   like <c> plus two stacked <o>s joined by a short bar.
    &K  ???   similar to <&G>, with an extra loop (or separate letter). 
    &L  &169  like <Lo^> with <o> above the <L>.
    &R  &192  like <r> with the <i> almost horizontal, almost a <2>
    &T  &195  upside down lambda with serif.
    &U  ???   rounded <v>, or top half of <o> or <y>.
    &Y  &172  upside down lambda without serif, or <r> with short plume.

PAGE 1r

  Jim Reeds writes [15 Jul 94]: "The erased key on f1r is discussed
  by Brumbaugh. It seems to have 3 vertical columns of letters. The
  leftmost is the ordinary alphabet, lower case italic hand, a
  through z. I could not check for the presence of every letter (I'm
  not sure about j, for instance) but a, b, c, ... o, p, q, r, s,
  ... y, z are pretty clear. Next to those are very spotty frags of
  Voynich letters. I could make out [EVA] <d> next to a, <r> next to c,
  <g> next to y, and one of the gallows letters somewhere near the
  q, r, s range. [...] The 3d column seems to be 1 off from the
  first: italic minuscules, r next to s, and so on. More is visible
  in UV shots than Petersen shows."

PAGE f49v

  This page looks like an ordinary Herbal page, but has an
  extraordinary amount of text (26 lines), and a column of "letters"
  running down the left margin. Here they are:

    f  o  r  y  e  &D k  s  
    -- -- -- -- -- -- -- -- 
    01 02 03 04 05 06 07 08

    p  o  &R y  e  &D    s  
    -- -- -- -- -- --    -- 
    09 10 11 12 13 14    15

    p  o  &R y  e  &D
    -- -- -- -- -- --       
    16 17 18 19 20 21 

    d  y  e  k  y 
    -- -- -- -- --          
    22 23 24 25 26  

  The line breaks above are meant to highlight the near-periodicity of
  the first 21 lines. In the original page, each character is clearly
  aligned with one text line, with a single word space between them.
  The sequence follows the slightly tilted and bent outline of the
  text, and its letters are spaced like the text lines, including a
  wider gap between positions 13 and 14, which matches an extra-wide
  paragraph break.
  
  The alignment of the sequence letters with the text lines is
  puzzling because the line breaks do not seem to be significant: the
  text is formatted as paragraphs, and the right margin seems defined
  by the irregular outline of the plant. On the other hand the "period"
  of the sequence gets shorter as the lines get wider; so perhaps
  these symbols are counting words, or letters?
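
  The shrinking "period" can be made explicit with a small script.
  What follows is only a sketch: it assumes that each period starts
  at <f>, <p>, or <d>, which is merely how the listing above was
  broken into rows. The names F49V and split_before are made up for
  this illustration.

```python
# Left-margin sequence of f49v, as transcribed above (26 items).
# "&D" and "&R" are this note's codes for two unusual glyphs.
F49V = "f o r y e &D k s p o &R y e &D s p o &R y e &D d y e k y".split()


def split_before(seq, starters):
    """Start a new row whenever an item belonging to `starters` is met."""
    rows = []
    for item in seq:
        if item in starters or not rows:
            rows.append([])
        rows[-1].append(item)
    return rows


# Assumption: <f>, <p> and <d> mark the start of each "period".
rows = split_before(F49V, {"f", "p", "d"})
lengths = [len(row) for row in rows]

for row in rows:
    print(len(row), " ".join(row))
```

  With the transcription above, the rows come out with 8, 7, 6, and 5
  items, and the third row equals the second one minus its final <s>.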
  
  To the left of items 02--06 there are Western digits "1" through
  "5".  It is not obvious whether they are in the same hand as
  the sequence and/or the main text.
  
  The unusual characters <&R> and <&D> and the deformed shape of
  some letters like <k> and <y> suggest that both the "key-like
  sequence" and the digits may be later additions.  Perhaps the reason
  why the digit sequence starts on line 2 is that the scribbler
  thought that <r> was the digit "2", and tried to line them up.
  That may also be the reason why he used <&R> instead of <r>
  further down the sequence.

PAGE f57v

  Page f57v contains a B-language paragraph and a circular diagram
  with four rings of text, surrounding four figures and some phrases
  at odd angles.
  
  The third ring of text, from the center out (f57v:3) contains a list
  of 17 isolated "letters", repeated 4 times with slight variations.
  An extra-wide gap at 10:30 and an isolated word at 11:00 outside the
  diagram strongly suggest the sequence starts there.

    o  l  j  r  v  x  k  m  f  &L t  r  &H &G y  &I &Y
    o  l  d  r  v  x  k  m  f  &L t  r  &H &G y  c  &Y
    o  l  d  r  v  x  k  m  p  &L t  r  &H &G y  c  &Y
    o  l  d  r  v  x  k  m  p  &L t  r  &H &G y  c  &Y 
    -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17
    18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
    35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
    52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68  

  As Dennis Mardle observed, these variations suggest that the pairs
  <j> and <d>, <f> and <p>, and <&I> and <c> are equivalent (or at
  least closely related).  On the other hand, the consistent
  occurrence of <k> and <t> in their respective places argues for them
  being distinct.
  
  Note that character 16 and its periodic repeats (33, 50, and 67)
  have a well-developed ligature to the right, so the reading
  <&I>/<c> seems more correct than <i>/<e>.
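
  The comparison of the four repetitions is easy to mechanize. The
  sketch below reports every position of the 17-item list where the
  repetitions disagree. The readings are those of the listing above;
  the names RING3 and variants are my own.

```python
# The four repetitions of the 17-item list in ring f57v:3,
# as transcribed above ("&X" are this note's codes for rare glyphs).
RING3 = [
    "o l j r v x k m f &L t r &H &G y &I &Y",
    "o l d r v x k m f &L t r &H &G y c &Y",
    "o l d r v x k m p &L t r &H &G y c &Y",
    "o l d r v x k m p &L t r &H &G y c &Y",
]

rows = [line.split() for line in RING3]

# For each position (1-based), collect the distinct readings and keep
# only the positions where the repetitions do not all agree.
variants = {
    pos: sorted(set(column))
    for pos, column in enumerate(zip(*rows), start=1)
    if len(set(column)) > 1
}

for pos, readings in variants.items():
    print(pos, " / ".join(f"<{r}>" for r in readings))
```

  For the transcription above it reports positions 3 (<d>/<j>),
  9 (<f>/<p>), and 16 (<&I>/<c>), and no others.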

  The second-innermost ring of text (f57v:2) has a few "letters"
  interspersed among normal words:

    daiin otey ofchey shes o d okchod l o  
                           - -        - - 

    lkeeol dkedar o f aros r y
                  - -      - -                       

    chedaiin k chety x docodal v,o tchor sh
             -       -         - -       --

    tedar dal &T daiin aiin otal daro v
              --                      -

  The innermost ring (f57v:1) contains mostly single-letter words with
  a few ordinary-looking words. Some spaces are questionable, but a
  case can be made for 34 items in all:

    x       r       l       o       v       l       
    ------- ------- ------- ------- ------- -------
    01      02      03      04      05      06     
  
    r       m       aiin    d       &H      &C      
    ------- ------- ------- ------- ------- -------
    07      08      09      10      11      12     

    f       r       y       l       k       x      
    ------- ------- ------- ------- ------- -------
    13      14      15      16      17      18     

    l       r       &K,ar   o       r       &U        
    ------- ------- ------- ------- ------- -------
    19      20      21      22      23      24     

    t       l       s       d       y       d,ar          
    ------- ------- ------- ------- ------- -------
    25      26      27      28      29      30     

    teodar  otadal  sheky   otchody
    ------- ------- ------- -------
    31      32      33      34
    
PAGE f66r

  Page f66r is on the same face of the same bifolio as f57v. It
  contains a column with 15 "labels" (normal-length words), another
  column with 34 "letters", and a third column with 32 lines of
  "B"-language text grouped into several paragraphs.
  
  Here are the letters in the middle column:

    y   o   s   sh  y   d   o   f   *   x   air d  
    ---+--- ---+--- ---+--- ---+--- ---+--- ---+--- 
    01  02  03  04  05  06  07  08  09  10  11  12 

    sh  y   f   f   y   o   d   r   f   c   &T  x  
    ---+--- ---+--- ---+--- ---+--- ---+--- ---+--- 
    13  14  15  16  17  18  19  20  21  22  23  24 

    t   o   &T  l   r   t   o   x   p   d  
    ---+--- ---+--- ---+--- ---+--- ---+---
    25  26  27  28  29  30  31  32  33  34 

  [ Can anyone provide a reading for #09? ]
  
  The <sh> in position 13 looks more like <&I'h>.
  
  The two <f>s in positions 15 and 16 are different: the horizontal
  stroke that crosses the leg is straight in #15, but ends with a
  <c>-like hook in #16. The corresponding words on the left column
  also show this difference. In that case, the difference may be
  explained by crowding -- the hook can be drawn only when the <f> is
  word-initial. But this explanation does not apply to the isolated
  letters. Hmmm...
  
  Note again that entry 22 is definitely a <c> and not an <e>.

  The three columns begin and end more or less in sync (one label for
  every two letters and two lines of text) but get out of alignment
  around the middle. Note that 34 = 2 × 17. One possibility is that
  the scribe lost two words (and with them the alignment), and that
  column 1 originally had 17 words, one for each of the 17 pairs of
  "letters" in column 2.
  
  Should the middle column be read as 17 pairs, as suggested by the
  initial alignments with the left column? Some of the pairs do seem
  to have some logic, especially if we note that page f66r lies at the
  boundary between languages A and B:
  
    pair  1: <y> and <o> - occur in similar contexts and may be equivalent.
    
    pair  2: <s> and <sh> - have similar shapes and distributions.
    
    pair  3: <y> and <d> - are dissimilar, but <dy> is a 
             characteristic ending in language B.
             
    pair  4: <o> and <f> - the <o> often occurs before <f>.

    pair  5: <*> and <x> - ?
                                                                         
    pair  6: <air> and <d> - compare <daiin>, the most common word of language A.
                                                                         
    pair  7: <sh> and <y> - <shy> is also a common combination.          
                                                                         
    pair  8: <f> and <f> - contrasting straight and hooked arm?          
                                                                         
    pair  9: <y> and <o> - again! see above.                             
                                                                         
    pair 10: <d> and <s> - somewhat similar shapes and distributions.    
                                                                         
    pair 11: <f> and <c> - the <c> often precedes <f>.
                                                                         
    pair 12: <&T> and <x> - two weirdos.                                 
                                                                         
    pair 13: <t> and <o> - often together.                               
                                                                         
    pair 14: <&T> and <l> - ?                                            
                                                                         
    pair 15: <r> and <t> - ?                                             
                                                                         
    pair 16: <o> and <x> - ?                                             
                                                                         
    pair 17: <p> and <d> - the former may substitute for the latter? 
    
  My guess is that the topic of this page is the Voynichese language 
  itself, perhaps an explanation of the differences between "languages"
  A and B (or a change in the spelling system).

PAGE f69r

  This "cosmological" page displays a circular diagram, at whose
  center is a disk divided into six unequal parts by the rays of a
  star. Each sector is labeled with one(?) Voynichese letter.
  
  Clockwise from 01:45 (the presumed "start" of the surrounding
  diagram) the letters are
  
     l   s   &J  y   d   o
     --  --  --  --  --  --
     01  02  03  04  05  06
     
  The <&J> has some topological resemblance to EVA <cd>, but the shape
  and angles do not seem to match.
  
PAGE f75v

  On the upper right corner of this page there is a key-like sequence
  with five letters, all but the last one aligned with the text lines.
  Here they are:
  
    s  l  l  o  r 
    -- -- -- -- --
    01 02 03 04 05
    
  The whole sequence lies above the final letter (<r> or <s>) of the
  5th text line. It can be argued that the sequence includes that
  letter too.
  
PAGE f76r

  This page has no drawings, only four paragraphs of text. The first
  paragraph takes half a page (29 lines), and has a column of
  "letters" along its left margin. The letters are irregularly spaced
  along the column, but seem to be aligned with the main text lines.
  Here they are:
  
    s  d  q  s  o  l  k  r  s
    -- -- -- -- -- -- -- -- --
    01 02 03 04 05 06 07 08 09

CONCLUSIONS

  The key-like sequences should give us clues about the nature of the
  Voynichese alphabet, especially about the grouping of strokes into
  characters and the equivalence of character shapes.
  
  There are two caveats we should keep in mind, however.
  
  First, some sequences may not be original, but scribblings by some
  later owner (or library patron) who tried to decipher the
  manuscript.
  
  This warning is particularly apt for those sequences that have
  parallel sequences of western letters or digits, such as the
  sequence on f1r and the vertical sequence on f49v. The latter is
  dubious also because it uses characters that are not found
  elsewhere, not even on other key sequences, and its symbols are
  somewhat ill-proportioned.
  
  On the other hand, it is somewhat unlikely that someone would
  scribble a "key" on a potentially valuable book, before he knew
  whether the key worked or not. And the warning hardly applies to the
  sequences on f57v, f66r, and f69r, which are obviously part of the
  original layout.

  Second, the symbols used in the sequences may not all be letters. In
  modern texts one finds plenty of non-letter symbols, often mixed
  with letters, in certain special contexts: digits, punctuation,
  footnote daggers and stars, item bullets, paragraph and section
  signs, copyright marks, mathematical symbols, etc. Obviously the
  key-like sequences above are all "special contexts" where such
  symbols are expected to occur.
  
  In particular, with regard to the 4×17 sequence of f57v, it is hard
  to believe that the <&L> ("Lo^") symbol (which occurs nowhere else
  in the text) is a letter of the alphabet.
  
  Keeping these caveats in mind, what can we learn from the key-like
  sequences?
  
  First, the key sequences suggest that the following symbols are
  complete single letters:

          confirming sequences
          -----------------------------------------------
    symb   49v  57v:3 57v:2 57v:1  66r   69r   75v   76r 
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <o>    XXX   XXX   XXX   XXX   XXX   XXX   XXX   XXX 
    <y>    XXX   XXX   XXX   XXX   XXX   XXX    -     -  
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <r>    XXX   XXX   XXX   XXX   XXX    -    XXX   XXX 
    <s>    XXX    -     -    XXX   XXX   XXX   XXX   XXX
    <l>     -    XXX   XXX   XXX   XXX   XXX   XXX   XXX
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <d>    XXX   XXX   XXX   XXX   XXX   XXX    -    XXX
    <m>     -    XXX    -    XXX    -     -     -     -
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <c>     -    XXX    -     -    XXX    -     -     -  
    <&I>    -    XXX    -     -     -     -     -     -  
    <e>    XXX    -     -     -     -     -     -     -  
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <f>    XXX   XXX   XXX   XXX   XXX    -     -     -        
    <p>    XXX   XXX    -     -    XXX    -     -     -       
    <k>    XXX   XXX   XXX   XXX    -     -     -    XXX 
    <t>     -    XXX    -    XXX   XXX    -     -     -       
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <sh>    -     -    XXX    -    XXX    -     -     -       
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <q>     -     -     -     -     -     -     -    XXX 
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <v>     -    XXX   XXX   XXX    -     -     -     -       
    <x>     -    XXX   XXX   XXX   XXX    -     -     -       
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <aiin>  -     -     -    XXX    -     -     -     -      
    <air>   -     -     -     -    XXX    -     -     -             
    ----- ----- ----- ----- ----- ----- ----- ----- -----
    <&C>    -     -     -    XXX    -     -     -     -  
    <&G>    -    XXX    -     -     -     -     -     -  
    <&H>    -    XXX    -    XXX    -     -     -     -       
    <&J>    -     -     -     -     -    XXX    -     -  
    <&K>    -     -     -    XXX    -     -     -     -       
    <&L>    -    XXX    -     -     -     -     -     -  
    <&T>    -     -    XXX    -    XXX    -     -     -
    <&Y>    -    XXX    -     -     -     -     -     -
    <&U>    -     -     -    XXX    -     -     -     -      
    ----- ----- ----- ----- ----- ----- ----- ----- -----
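
  A table of this kind can be cross-checked mechanically against the
  transcriptions. The sketch below is one way to do it, under some
  assumptions of mine: only the isolated "letters" of each sequence
  are entered, the unread item of f66r is left out, and the compound
  items <&K,ar> and <d,ar> of f57v:1 are reduced to their first glyph.
  The names SEQUENCES and occurrence_table are hypothetical.

```python
# Isolated "letters" of each sequence, as transcribed in this note.
# Assumptions of this sketch: ordinary words and the unread item "*"
# are omitted; "&K,ar" and "d,ar" are reduced to "&K" and "d";
# for f57v:3 only the distinct symbols of the four repetitions are listed.
SEQUENCES = {
    "49v": "f o r y e &D k s p o &R y e &D s p o &R y e &D d y e k y",
    "57v:3": "o l j d r v x k m f p &L t r &H &G y &I c &Y",
    "57v:2": "o d l o o f r y k x v o sh &T v",
    "57v:1": "x r l o v l r m aiin d &H &C f r y l k x "
             "l r &K o r &U t l s d y d",
    "66r": "y o s sh y d o f x air d sh y f f y o d r f c &T x "
           "t o &T l r t o x p d",
    "69r": "l s &J y d o",
    "75v": "s l l o r",
    "76r": "s d q s o l k r s",
}


def occurrence_table(sequences):
    """Map each symbol to the list of sequences in which it occurs."""
    table = {}
    for name, text in sequences.items():
        for symbol in set(text.split()):
            table.setdefault(symbol, []).append(name)
    return table


table = occurrence_table(SEQUENCES)

# One line per symbol, followed by the sequences that contain it.
for symbol in sorted(table):
    print(f"<{symbol}>".ljust(7), " ".join(table[symbol]))
```

  Symbols such as <a>, <i>, and <n> never show up as keys of the
  resulting table, since they do not occur as isolated items in any of
  the transcriptions.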
    
  The following EVA letters and proposed letters are conspicuously absent
  from the sequences:
  
    <a> <b> <g> <h> <i> <n> <u> <z>
    
    <ch> <ee> <eee>
    
    <cth> <ckh> <cph> <cfh>
    
    <ith> <ikh> <iph> <ifh>
    
    <ct>  <ck>  <cp>  <cf> 
    
  Note in particular that <aiin> and <air> are used as single letters.
  On the other hand the sequences do not give any direct evidence for
  the distinction between <aiin> and its variants <ain>, <aiiin>,
  <oiin>, etc.

  The absence of <a> in these sequences supports the claim that <a> is
  simply a calligraphic variant of <o> or <y>. 
  
  On the other hand the sequences strongly suggest that <o> and <y>
  are distinct letters.

  Note that <e> too is absent, except for the highly suspect sequence
  f49v; this sequence also includes <&D> (the right half of <o>), not
  used anywhere else. This suggests that <e> is not an independent
  letter, but rather a modifier of a nearby letter, as suggested by
  the OKOKOKO analysis.
      
  The letter <c> is distinct from <e>. Its occurrences on f57v:3 (the
  4×17 sequence) and f66r show a well-marked ligature, which would be
  superfluous if <c> were just a calligraphic variant of <e>.
    
  The letters <&I> and <c> are probably equivalent. This is directly
  supported by their occurrence in equivalent spots of the 4×17
  sequence, and indirectly by the fact that other sequences that
  include <c> do not include <&I>.
  
  The absence of the `platform gallows' (<cth> and variations)
  confirms the longstanding suspicion that they are contractions
  of the plain gallows with other characters, possibly <ch>.
  
  It seems that <m> is distinct from <d> (and <j>), as attested by
  f57v:3. On the other hand, f57v:3 suggests that <d> and <j> are the
  same letter.
    
  The distinction between <r> and <s> is supported by their clearly
  drawn samples in the sequences of f66r, f75v, and f76r.
      
  Sequence f49v, for whatever it is worth, suggests that <r> and <&R>
  are the same.
    
  The letters <f> and <p> may be equivalent.  This is suggested by their 
  occurrence in corresponding spots of the f49v and f57v:3 sequences.
  However, their separate occurrences on page f66r weakly suggest
  that they are distinct.
      
  On the other hand, f49v and f57v:3 suggest that <f> and <p> are distinct
  from <t> and <k>.

Last edited on 1999-02-01 04:52:00 by stolfi