Hacking at the Voynich manuscript - Side notes
034 The "key-like sequences" collected
INTRODUCTION
Here and there in the VMs one finds sequences of very short "words",
each "word" having either a single EVA letter, or a group of EVA
letters that tend to occur together. Those sequences are traditionally
called "key-like sequences", although there is as yet no proof that
they have anything to do with cryptography.
Those sequences are nonetheless valuable hints about the
"true" Voynichese alphabet. The aim of this note is to
collect them into a single document, in a convenient format.
LOCATIONS
The pages where key-like sequences have been noted are:
f1r alphabets (Latin, Voynichese, Latin) down right margin.
f49v digits and Voynichese letters down left margin.
f57v three rings of Voynichese letters, words, and weirdos.
f66r column of letters down middle of page.
f69r letters around central star.
f75v vertical `word' at upper right corner.
f76r vertical `title' down left margin.
SPECIAL SYMBOLS
The key-like sequences include several Voynichese glyphs that are
otherwise rare ("weirdos") or look like isolated parts of EVA
characters. For brevity we write them as &X where X is a single
capital letter, according to this table:
&X EVA description
-- ---- ------------------------------------------------------
&C ??? squarish <c>; <I> with bent ligature.
&D &140 (or &163) right half of <o>
&G &171 gallows with three loops.
&H &170 like <h> with top detached.
&I I first half of <ih>.
&J ??? like <c> plus two stacked <o>s joined by a short bar.
&K ??? similar to <&G>, with an extra loop (or separate letter).
&L &169 like <Lo^> with <o> above the <L>.
&R &192 like <r> with the <i> almost horizontal, almost a <2>
&T &195 upside down lambda with serif.
&U ??? rounded <v>, or top half of <o> or <y>.
&Y &172 upside down lambda without serif, or <r> with short plume.
PAGE f1r
Jim Reeds writes [15 Jul 94]: "The erased key on f1r is discussed
by Brumbaugh. It seems to have 3 vertical columns of letters. The
leftmost is the ordinary alphabet, lower case italic hand, a
through z. I could not check for the presence of every letter (I'm
not sure about j, for instance) but a, b, c, ... o, p, q, r, s,
... y, z are pretty clear. Next to those are very spotty frags of
Voynich letters. I could make out [EVA] <d> next to a, <r> next to c,
<g> next to y, and one of the gallows letters somewhere near the
q, r, s range. [...] The 3d column seems to be 1 off from the
first: italic minuscules, r next to s, and so on. More is visible
in UV shots than Petersen shows."
PAGE f49v
This page looks like an ordinary Herbal page, but has an
extraordinary amount of text (26 lines), and a column of "letters"
running down the left margin. Here they are:
f o r y e &D k s
-- -- -- -- -- -- -- --
01 02 03 04 05 06 07 08
p o &R y e &D s
-- -- -- -- -- -- --
09 10 11 12 13 14 15
p o &R y e &D
-- -- -- -- -- --
16 17 18 19 20 21
d y e k y
-- -- -- -- --
22 23 24 25 26
The line breaks above are meant to highlight the near-periodicity of
the first 21 lines. In the original page, each character is clearly
aligned with one text line, with a single word space between them.
The sequence follows the slightly tilted and bent outline of the
text, and its letters are spaced like the text lines, including a
wider gap between positions 13 and 14, matching an extra-wide
paragraph break.
The alignment of the sequence letters with the text lines is
puzzling because the line breaks do not seem to be significant: the
text is formatted as paragraphs, and the right margin seems defined
by the irregular outline of the plant. On the other hand the "period"
of the sequence gets shorter as the lines get wider; so perhaps
these symbols are counting words, or letters?
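The near-periodicity can be checked mechanically. What follows is a
minimal Python sketch, assuming the reading of the f49v column given
above; the names F49V and occurrence_gaps are illustrative only. For
each repeated symbol it lists the distances between successive
occurrences.

```python
from collections import defaultdict

# Reading of the f49v margin column as transcribed in this note
# (positions 01-26); "&D" and "&R" are the special symbols defined above.
F49V = "f o r y e &D k s p o &R y e &D s p o &R y e &D d y e k y".split()

def occurrence_gaps(seq):
    """Map each repeated symbol to the gaps between its occurrences."""
    positions = defaultdict(list)
    for index, symbol in enumerate(seq, start=1):
        positions[symbol].append(index)
    return {
        symbol: [b - a for a, b in zip(pos, pos[1:])]
        for symbol, pos in positions.items()
        if len(pos) > 1
    }

gaps = occurrence_gaps(F49V)
for symbol in ("o", "y", "e", "&D"):
    print(symbol, gaps[symbol])
# o [8, 7]
# y [8, 7, 4, 3]
# e [8, 7, 4]
# &D [8, 7]
```

With this reading, <o> and <&D> each recur after 8 and then 7
positions, which is the shrinking "period" discussed above.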
To the left of items 02--06 there are Western digits "1" through
"5". It is not obvious whether they are in the same hand as
the sequence and/or the main text.
The unusual characters <&R> and <&D> and the deformed shape of
some letters like <k> and <y> suggest that both the "key-like
sequence" and the digits may be later additions. Perhaps the reason
why the digit sequence starts on line 2 is that the scribbler
thought that <r> was the digit "2", and tried to line them up.
That may also be the reason why he used <&R> instead of <r>
further down the sequence.
PAGE f57v
Page f57v contains a B-language paragraph and a circular diagram
with four rings of text, surrounding four figures and some phrases
at odd angles.
The third ring of text, from the center out (f57v:3) contains a list
of 17 isolated "letters", repeated 4 times with slight variations.
An extra-wide gap at 10:30 and an isolated word at 11:00 outside the
diagram strongly suggest the sequence starts there.
o l j r v x k m f &L t r &H &G y &I &Y
o l d r v x k m f &L t r &H &G y c &Y
o l d r v x k m p &L t r &H &G y c &Y
o l d r v x k m p &L t r &H &G y c &Y
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
As Dennis Mardle observed, these variations suggest that the pairs
<j> and <d>, <f> and <p>, and <&I> and <c> are equivalent (or at
least closely related). On the other hand, the consistent
occurrence of <k> and <t> in their respective places argues for them
being distinct.
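The comparison of the four repetitions can be sketched in a few lines
of Python. This is only an illustration, assuming the reading of
f57v:3 given above; RING3 and varying_columns are made-up names. It
reports the positions (01 to 17) at which the four repetitions do not
all agree.

```python
# The four repetitions of the 17-symbol sequence on f57v:3,
# as transcribed in this note.
RING3 = [
    "o l j r v x k m f &L t r &H &G y &I &Y".split(),
    "o l d r v x k m f &L t r &H &G y c &Y".split(),
    "o l d r v x k m p &L t r &H &G y c &Y".split(),
    "o l d r v x k m p &L t r &H &G y c &Y".split(),
]

def varying_columns(rows):
    """Return {position: distinct symbols} for positions that vary."""
    result = {}
    for position, symbols in enumerate(zip(*rows), start=1):
        distinct = list(dict.fromkeys(symbols))  # keeps first-seen order
        if len(distinct) > 1:
            result[position] = distinct
    return result

print(varying_columns(RING3))
# {3: ['j', 'd'], 9: ['f', 'p'], 16: ['&I', 'c']}
```

Only positions 03, 09 and 16 vary; positions 07 (<k>) and 11 (<t>)
are the same in all four repetitions.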
Note that character 16 and its counterparts in the other three
repetitions have a well-developed ligature to the right, so the
reading <&I>/<c> seems more correct than <i>/<e>.
The second-innermost ring of text (f57v:2) has a few "letters"
interspersed among normal words:
daiin otey ofchey shes o d okchod l o
- - - -
lkeeol dkedar o f aros r y
- - - -
chedaiin k chety x docodal v,o tchor sh
- - - - --
tedar dal &T daiin aiin otal daro v
-- -
The innermost ring (f57v:1) contains mostly single-letter words with
a few ordinary-looking words. Some spaces are questionable, but a
case can be made for 34 items in all:
x r l o v l
------- ------- ------- ------- ------- -------
01 02 03 04 05 06
r m aiin d &H &C
------- ------- ------- ------- ------- -------
07 08 09 10 11 12
f r y l k x
------- ------- ------- ------- ------- -------
13 14 15 16 17 18
l r &K,ar o r &U
------- ------- ------- ------- ------- -------
19 20 21 22 23 24
t l s d y d,ar
------- ------- ------- ------- ------- -------
25 26 27 28 29 30
teodar otadal sheky otchody
------- ------- ------- -------
31 32 33 34
PAGE f66r
Page f66r is on the same face of the same bifolio as f57v. It
contains a column with 15 "labels" (normal-length words), another
column with 34 "letters", and a third column with 32 lines of
"B"-language text grouped into several paragraphs.
Here are the letters in the middle column:
y o s sh y d o f * x air d
---+--- ---+--- ---+--- ---+--- ---+--- ---+---
01 02 03 04 05 06 07 08 09 10 11 12
sh y f f y o d r f c &T x
---+--- ---+--- ---+--- ---+--- ---+--- ---+---
13 14 15 16 17 18 19 20 21 22 23 24
t o &T l r t o x p d
---+--- ---+--- ---+--- ---+--- ---+---
25 26 27 28 29 30 31 32 33 34
[ Can anyone provide a reading for #09? ]
The <sh> in position 13 looks more like <&I'h>.
The two <f>s in positions 15 and 16 are different: the horizontal
stroke that crosses the leg is straight in #15, but ends with a
<c>-like hook in #16. The corresponding words on the left column
also show this difference. In that case, the difference may be
explained by crowding -- the hook can be drawn only when the <f> is
word-initial. But this explanation does not apply to the isolated
letters. Hmmm...
Note again that entry 22 is definitely a <c> and not an <e>.
The three columns begin and end more or less in sync (one label for
every two letters and two lines of text) but get out of alignment
around the middle. Note that 34 = 2 × 17. One possibility is that
the scribe lost two words (and with them the alignment), and that
column 1 originally had 17 words, one for each of the 17 pairs of
"letters" in column 2.
Should the middle column be read as 17 pairs, as suggested by the
initial alignments with the left column? Some of the pairs do seem
to have some logic, especially if we note that page f66r lies at the
boundary between languages A and B:
pair 1: <y> and <o> - occur in similar contexts and may be equivalent.
pair 2: <s> and <sh> - have similar shapes and distributions.
pair 3: <y> and <d> - are dissimilar, but <dy> is a
characteristic ending in language B.
pair 4: <o> and <f> - the <o> often occurs before <f>.
pair 5: <*> and <x> - ?
pair 6: <air> and <d> - compare <daiin>, the most common word of language A.
pair 7: <sh> and <y> - <shy> is also a common combination.
pair 8: <f> and <f> - contrasting straight and hooked arm?
pair 9: <y> and <o> - again! see above.
pair 10: <d> and <s> - somewhat similar shapes and distributions.
pair 11: <f> and <c> - the <c> often precedes <f>.
pair 12: <&T> and <x> - two weirdos.
pair 13: <t> and <o> - often together.
pair 14: <&T> and <l> - ?
pair 15: <r> and <t> - ?
pair 16: <o> and <x> - ?
pair 17: <p> and <d> - the former may substitute for the latter?
My guess is that the topic of this page is the Voynichese language
itself, perhaps an explanation of the differences between "languages"
A and B (or a change in the spelling system).
PAGE f69r
This "cosmological" page displays a circular diagram, at whose
center is a disk divided into six unequal parts by the rays of a
star. Each sector is labeled with one(?) Voynichese letter.
Clockwise from 01:45 (the presumed "start" of the surrounding
diagram) the letters are
l s &J y d o
-- -- -- -- -- --
01 02 03 04 05 06
The <&J> has some topological resemblance to EVA <cd>, but the shape
and angles do not seem to match.
PAGE f75v
On the upper right corner of this page there is a key-like sequence
with five letters, all but the last one aligned with the text lines.
Here they are:
s l l o r
-- -- -- -- --
01 02 03 04 05
The whole sequence lies above the final letter (<r> or <s>) of the
5th text line. It can be argued that the sequence includes that
letter too.
PAGE f76r
This page has no drawings, only four paragraphs of text. The first
paragraph takes half a page (29 lines), and has a column of
"letters" along its left margin. The letters are irregularly spaced
along the column, but seem to be aligned with the main text lines.
Here they are:
s d q s o l k r s
-- -- -- -- -- -- -- -- --
01 02 03 04 05 06 07 08 09
CONCLUSIONS
The key-like sequences should give us clues about the nature of the
Voynichese alphabet, especially about the grouping of strokes into
characters and the equivalence of character shapes.
There are two caveats we should keep in mind, however.
First, some sequences may not be original, but scribblings by some
later owner (or library patron) who tried to decipher the
manuscript.
This warning is particularly apt for those sequences that have
parallel sequences of western letters or digits, such as the
sequence on f1r and the vertical sequence on f49v. The latter is
dubious also because it uses characters that are not found
elsewhere, not even on other key sequences, and its symbols are
somewhat ill-proportioned.
On the other hand, it is somewhat unlikely that someone would
scribble a "key" on a potentially valuable book, before he knew
whether the key worked or not. And the warning hardly applies to the
sequences on f57v, f66r, and f69r, which are obviously part of the
original layout.
Second, the symbols used in the sequences may not all be letters. In
modern texts one finds plenty of non-letter symbols, often mixed
with letters, in certain special contexts: digits, punctuation,
footnote daggers and stars, item bullets, paragraph and section
signs, copyright marks, mathematical symbols, etc. Obviously the
key-like sequences above are all "special contexts" where such
symbols are expected to occur.
In particular, with regard to the 4×17 sequence of f57v, it is hard
to believe that the <&L> ("Lo^") symbol (which occurs nowhere else
in the text) is a letter of the alphabet.
Keeping these caveats in mind, what can we learn from the key-like
sequences?
First, the key sequences suggest that the following symbols are
complete single letters:
confirming sequences
-----------------------------------------------
symb 49v 57v:3 57v:2 57v:1 66r 69r 75v 76r
----- ----- ----- ----- ----- ----- ----- ----- -----
<o> XXX XXX XXX XXX XXX XXX XXX XXX
<y> XXX XXX XXX XXX XXX XXX - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<r> XXX XXX XXX XXX XXX - XXX XXX
<s> XXX - - XXX XXX XXX XXX XXX
<l> - XXX XXX XXX XXX XXX XXX XXX
----- ----- ----- ----- ----- ----- ----- ----- -----
<d> XXX XXX XXX XXX XXX XXX - XXX
<m> - XXX - XXX - - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<c> - XXX - - XXX - - -
<&I> - XXX - - - - - -
<e> XXX - - - - - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<f> XXX XXX XXX XXX XXX - - -
<p> XXX XXX - - XXX - - -
<k> XXX XXX XXX XXX - - - XXX
<t> - XXX - XXX XXX - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<sh> - - XXX - XXX - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<q> - - - - - - - XXX
----- ----- ----- ----- ----- ----- ----- ----- -----
<v> - XXX XXX XXX - - - -
<x> - XXX XXX XXX XXX - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<aiin> - - - XXX - - - -
<air> - - - - XXX - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
<&C> - - - XXX - - - -
<&G> - XXX - - - - - -
<&H> - XXX - XXX - - - -
<&J> - - - - - XXX - -
<&K> - - - XXX - - - -
<&L> - XXX - - - - - -
<&T> - - - - XXX - - -
<&Y> - XXX XXX - - - - -
<&U> - - - XXX - - - -
----- ----- ----- ----- ----- ----- ----- ----- -----
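Entries of this kind can be cross-checked against the transcriptions
with a short script. The following Python sketch is only an
illustration: it assumes the readings given in this note, the names
SEQUENCES and sequences_containing are made up, and the two inner
rings of f57v are left out because they mix isolated letters with
ordinary words.

```python
# Readings of six key-like sequences as transcribed in this note.
# "*" marks the unread symbol at position 09 of f66r.
SEQUENCES = {
    "49v": "f o r y e &D k s p o &R y e &D s p o &R y e &D d y e k y",
    "57v:3": (
        "o l j r v x k m f &L t r &H &G y &I &Y "
        "o l d r v x k m f &L t r &H &G y c &Y "
        "o l d r v x k m p &L t r &H &G y c &Y "
        "o l d r v x k m p &L t r &H &G y c &Y"
    ),
    "66r": (
        "y o s sh y d o f * x air d sh y f f y o d r f c &T x "
        "t o &T l r t o x p d"
    ),
    "69r": "l s &J y d o",
    "75v": "s l l o r",
    "76r": "s d q s o l k r s",
}

def sequences_containing(symbol):
    """List the sequences in which the symbol stands as a separate item."""
    return [name for name, text in SEQUENCES.items() if symbol in text.split()]

print(sequences_containing("q"))   # ['76r']
print(sequences_containing("c"))   # ['57v:3', '66r']
print(sequences_containing("&I"))  # ['57v:3']
```

Under these readings <q> turns up only in 76r, and <c> in 57v:3 and
66r, while <&I> is confined to 57v:3.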
The following EVA letters and proposed letters are conspicuously absent
from the sequences:
<a> <b> <g> <h> <i> <n> <u> <z>
<ch> <ee> <eee>
<cth> <ckh> <cph> <cfh>
<ith> <ikh> <iph> <ifh>
<ct> <ck> <cp> <cf>
Note in particular that <aiin> and <air> are used as single letters.
On the other hand the sequences do not give any direct evidence for
the distinction between <aiin> and its variants <ain>, <aiiin>,
<oiin>, etc.
The absence of <a> in these sequences supports the claim that <a> is
simply a calligraphic variant of <o> or <y>.
On the other hand the sequences strongly suggest that <o> and <y>
are distinct letters.
Note that <e> too is absent, except for the highly suspect sequence
f49v; this sequence also includes <&D> (the right half of <o>), not
used anywhere else. This suggests that <e> is not an independent
letter, but rather a modifier of a nearby letter, as suggested by
the OKOKOKO analysis.
The letter <c> is distinct from <e>. Its occurrences on f57v:3 (the
4×17 sequence) and f66r show a well-marked ligature, which would be
superfluous if <c> were just a calligraphic variant of <e>.
The letters <&I> and <c> are probably equivalent. This is directly
supported by their occurrence in equivalent spots of the 4×17
sequence, and indirectly by the fact that other sequences that
include <c> do not include <&I>.
The absence of the `platform gallows' (<cth> and variations)
confirms the longstanding suspicion that they are contractions
of the plain gallows with other characters, possibly <ch>.
It seems that <m> is distinct from <d> (and <j>), as attested by
f57v:3. On the other hand, f57v:3 suggests that <d> and <j> are the
same letter.
The distinction between <r> and <s> is supported by their clearly
drawn samples in the sequences of f66r, f75v, and f76r.
Sequence f49v, for whatever it is worth, suggests that <r> and <&R>
are the same.
The letters <f> and <p> may be equivalent. This is suggested by their
occurrence in corresponding spots of the f49v and f57v:3 sequences.
However, their separate occurrences on page f66r weakly suggest
that they are distinct.
On the other hand, f49v and f57v:3 suggest that <f> and <p> are distinct
from <t> and <k>.
Last edited on 1999-02-01 04:52:00 by stolfi