Hacking at the Voynich manuscript - Side notes
034 The "key-like sequences" collected

Last edited on 1999-12-04 20:49:43 by stolfi

INTRODUCTION

  Here and there in the VMs one finds sequences of very short "words", each "word" having either a single EVA letter, or a group of EVA letters that tend to occur together.

  Those sequences are traditionally called "key-like sequences", although there is yet no proof that they have anything to do with cryptography.

  Those sequences are nonetheless valuable hints about the "true" Voynichese alphabet. The aim of this note is to collect them into a single document, in convenient format.

  This note was updated on 05 Apr 1999, taking into account Rene Zandbergen's Beinecke trip report.

LOCATIONS

  The pages where key-like sequences have been noted are:

    f1r   alphabets (Latin, Voynichese, Latin) down right margin.
    f49v  digits and Voynichese letters down left margin.
    f57v  three rings of Voynichese letters, words, and weirdos.
    f66r  column of letters down middle of page.
    f69r  letters around central star.
    f75v  vertical `word' at upper right corner.
    f76r  vertical `title' down left margin.

SPECIAL SYMBOLS

  The key-like sequences include several Voynichese glyphs that are otherwise rare ("weirdos") or look like isolated parts of EVA characters. For brevity we write them as &X where X is a single capital letter, according to this table:

    &X  EVA             description
    --  --------------  ------------------------------------------------------
    &A  ???             large `picnic table' () with left-facing "feet".
    &B  ???             large `picnic table' () with right-facing "feet".
    &D  &140 (or &163)  right half of
    &E  ???             subtype of with hooked horizontal arm.
    &F  ???             subtype of with straight horizontal arm.
    &G  &171            a `gallows' with three loops.
    &H  &170            like with top detached.
    &I  I               first half of .
    &J  ???             like with raised by its own height.
    &K  ???             similar to <&G>, with an extra loop (or separate letter).
    &L  &169            like with above the .
    &R  &192            like with the almost horizontal, almost a <2>
    &S  I"h             like with a straight left leg.
    &P  ???             subtype of <p> with hooked horizontal arm.
    &Q  ???             subtype of <p> with straight horizontal arm.
    &T  &195            upside down lambda with serif.
    &Y  &172            upside down lambda without serif, or with short plume.

PAGE f1r

  Jim Reeds writes [15 Jul 1994]:

    "The erased key on f1r is discussed by Brumbaugh. It seems to have 3 vertical columns of letters. The leftmost is the ordinary alphabet, lower case italic hand, a through z. I could not check for the presence of every letter (I'm not sure about j, for instance) but a, b, c, ... o, p, q, r, s, ... y, z are pretty clear. Next to those are very spotty frags of Voynich letters. I could make out [EVA] next to a, next to c, next to y, and one of the gallows letters somewhere near the q, r, s range. [...] The 3d column seems to be 1 off from the first: italic minuscules, r next to s, and so on. More is visible in UV shots than Petersen shows."

  Rene Zandbergen adds [04 Apr 1999]:

    "Next to 'a' is [EVA] . Next to 'c' is . is below 'h'. Of the roman alphabet, one can see a-f, h, o-r, u-v and z (more like zeta). Of the shifted alphabet two columns to the right I could only see 'p' and 'q' (aligned with 'o' and 'p' respectively)."

    ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
    rht ROM  a b c d e f g h i j k l m n o p q r s t u v w x y z -
    mid EVA  * d * r * * * * y * * * * * * * * * * * * * * * * g *
    lft ROM  - a b c d e f g h i j k l m n o p q r s t u v w x y z
    ? ? ? ? ? ? ? ? ? ?

  The question marks indicate Roman letters that were not confirmed by Jim or Rene.

  It is not clear when these sequences were written or erased.

PAGE f49v

  This page looks like an ordinary Herbal page, but has an extraordinary amount of text (26 lines), and a column of "letters" running down the left margin. Here they are:

    &E  o   r   y   e   &D  k   s
    --  --  --  --  --  --  --  --
    01  02  03  04  05  06  07  08

    &P  o   &R  y   e   &D  s
    --  --  --  --  --  --  --
    09  10  11  12  13  14  15

    &P  o   &R  y   e   &D
    --  --  --  --  --  --
    16  17  18  19  20  21

    d   y   e   k   y
    --  --  --  --  --
    22  23  24  25  26

  The line breaks above are meant to highlight the near-periodicity of the first 21 lines.
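  As a rough check of that near-periodicity, here is a minimal Python sketch that compares consecutive rows of the f49v column position by position. It assumes the transcription exactly as given above; the names ROWS and column_mismatches are mine, chosen only for illustration.

```python
# f49v left-margin column, one string per row as laid out in this note.
# This is the note's transcription, taken on trust; it is not re-read
# from the manuscript.
ROWS = [
    "&E o r y e &D k s",   # lines 01-08
    "&P o &R y e &D s",    # lines 09-15
    "&P o &R y e &D",      # lines 16-21
    "d y e k y",           # lines 22-26
]


def column_mismatches(a, b):
    """Return the 1-based positions where rows a and b differ,
    looking only at the positions both rows have."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


rows = [r.split() for r in ROWS]
lengths = [len(r) for r in rows]
mismatches = [column_mismatches(rows[i], rows[i + 1])
              for i in range(len(rows) - 1)]

print("row lengths:", lengths)
for i, m in enumerate(mismatches, start=1):
    print(f"rows {i} and {i + 1} differ at positions {m}")
```

  In this transcription the second and third rows agree in all six positions they share, while the first and second differ at positions 1, 3 and 7. The last row does not follow the pattern at all, which is why only the first 21 lines are called near-periodic.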
  In the original page, each character is clearly aligned with one text line, with a single word space between them. The sequence follows the slightly tilted and bent outline of the text, and its letters are spaced like the text lines, including a wider gap between positions 13 and 14, matching an extra-wide paragraph break.

  The alignment of the sequence letters with the text lines is puzzling because the line breaks do not seem to be significant: the text is formatted as paragraphs, and the right margin seems defined by the irregular outline of the plant. On the other hand the "period" of the sequence gets shorter as the lines get wider; so perhaps these symbols are counting words, or letters?

  To the left of items 02--06 there are Western digits "1" through "5". It is not obvious whether they are in the same hand as the sequence and/or the main text. However Rene [04 Apr 1999] reports that all three columns seem to be similar in ink and pen to the main text.

  The unusual characters <&R> and <&D> and the deformed shape of some letters like and suggest that both the "key-like sequence" and the digits may be later additions.

  Perhaps the reason why the digit sequence starts on line 2 is that the scribbler thought that was the digit "2", and tried to line them up. That may also be the reason why he used <&R> instead of further down the sequence.

PAGE f57v

  Page f57v contains a B-language paragraph and a circular diagram with four rings of text, surrounding four figures and some phrases at odd angles.

  The third ring of text from the center out (f57v:3) contains a list of 17 isolated "letters", repeated 4 times with slight variations. An extra-wide gap at 10:30 and an isolated word at 11:00 outside the diagram strongly suggest the sequence starts there.

    o  l  j  r  v  x  k  m  &E &L t  r  &H &G y  &I &Y
    o  l  d  r  v  x  k  m  &E &L t  r  &H &G y  c  &Y
    o  l  d  r  v  x  k  m  &P &L t  r  &H &G y  c  &Y
    o  l  d  r  v  x  k  m  &P &L t  r  &H &G y  c  &Y
    -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17
    18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
    35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
    52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68

  As Dennis Mardle observed, these variations suggest that the pairs and , and <p>, and <&I> and are equivalent (or at least closely related). On the other hand, the consistent occurrence of and in their respective places argues for them being distinct.

  Note that character 16 and its counterparts in the other periods have a well-developed ligature to the right, so the reading <&I>/ seems more correct than /.

  The second-innermost ring of text (f57v:2) has a few "letters" interspersed among normal words:

    daiin otey ofchey shes o d okchod l o
                           - -        - -

    lkeeol dkedar o &F aros r y
                  - --      - -

    chedaiin k chety x docodal v,o tchor sh
             -       -         - -       --

    tedar dal &T daiin aiin otal daro v
              --                      -

  The innermost ring (f57v:1) contains mostly single-letter words with a few ordinary-looking words. Some spaces are questionable, but a case can be made for 34 items in all:

    x       r       l       o       v       l
    ------- ------- ------- ------- ------- -------
    01      02      03      04      05      06

    r       m       aiin    d       &H      c
    ------- ------- ------- ------- ------- -------
    07      08      09      10      11      12

    &E      r       y       l       k       x
    ------- ------- ------- ------- ------- -------
    13      14      15      16      17      18

    l       r       &K,ar   o       r       y
    ------- ------- ------- ------- ------- -------
    19      20      21      22      23      24

    t       l       s       d       y       d,ar
    ------- ------- ------- ------- ------- -------
    25      26      27      28      29      30

    teodar  otadal  sheky   otchody
    ------- ------- ------- -------
    31      32      33      34

PAGE f66r

  Page f66r is on the same face of the same bifolio as f57v. It contains a column with 15 "labels" (normal-length words), another column with 34 "letters", and a third column with 32 lines of "B"-language text grouped into several paragraphs. Here are the letters in the middle column:

    y   o   s   sh  y   d   o   &F  *   x   air d
    ---+--- ---+--- ---+--- ---+--- ---+--- ---+---
    01  02  03  04  05  06  07  08  09  10  11  12

    &S  y   &F  &E  y   o   d   r   &F  c   &T  &A
    ---+--- ---+--- ---+--- ---+--- ---+--- ---+---
    13  14  15  16  17  18  19  20  21  22  23  24

    t   o   &T  l   r   t   o   &B  &Q  d
    ---+--- ---+--- ---+--- ---+--- ---+---
    25  26  27  28  29  30  31  32  33  34

  [ Can anyone provide a reading for #09?
  ]

  The two s in positions 15 and 16 are different: the horizontal stroke that crosses the leg is straight in #15 = <&F>, but ends with a -like hook in #16 = <&E>. The corresponding words on the left column also seem to contrast <&E> with <&F>. Here, the difference may be merely due to crowding -- the hook can be drawn only when the is word-initial. But this explanation does not apply to the isolated letters. Hmmm...

  Note again that entry 22 is definitely a and not an .

  Entries #24 and #32 could be instances of the `picnic table' . However the `serifs' on the legs make them resemble pairs of walking legs: #24 = <&A> is left-going, #32 = <&B> is right-going.

  The three columns begin and end more or less in sync (one label for every two letters and two lines of text) but get out of alignment around the middle. Note that 34 = 2 × 17. One possibility is that two words (and the alignment) were lost by the scribe when copying from a draft or original which had 17 words in column 1.

  Should the middle column be read as 17 pairs, as suggested by the initial alignments with the left column? Some of the pairs do seem to have some logic, especially if we note that page f66r lies at the boundary between languages A and B:

    pair 1: and - occur in similar contexts and may be equivalent
    pair 2: and - have similar shapes and distributions.
    pair 3: and - are dissimilar, but is a characteristic ending in language B.
    pair 4: <*> and - ?
    pair 5: and - the often occurs before .
    pair 6: and - is the most common word of language A.
    pair 7: and - looks like , and is a common combination.
    pair 8: and - contrasting straight and hooked arm?
    pair 9: and - again! see above.
    pair 10: and - somewhat similar shapes and distributions.
    pair 11: and - the often precedes
    pair 12: <&T> and - two weirdos.
    pair 13: and - often together.
    pair 14: <&T> and - ?
    pair 15: and - ?
    pair 16: and - ?
    pair 17: <p> and - the former may substitute for the latter?

  My guess is that the topic of this page is the Voynichese language itself, perhaps an explanation of the differences between "languages" A and B (or a change in the spelling system).

PAGE f69r

  This "cosmological" page displays a circular diagram, at whose center is a disk divided into six unequal parts by the rays of a star. Each sector is labeled with one(?) Voynichese letter. Clockwise from 01:45 (the presumed "start" of the surrounding diagram) the letters are

    l   s   &J  y   d   o
    --  --  --  --  --  --
    01  02  03  04  05  06

  The <&J> looks very much like EVA or , except that the two letters are not properly aligned.

  If we start reading at 11:30, and read <&J> as we get , which sounds like a valid Voynichese word.

PAGE f75v

  On the upper right corner of this page there is a key-like sequence with five letters, all but the last one aligned with the text lines. Here they are:

    s   l   l   o   r
    --  --  --  --  --
    01  02  03  04  05

  The whole sequence lies above the final letter ( or ) of the 5th text line. It can be argued that the sequence includes that letter too.

PAGE f76r

  This page has no drawings, only four paragraphs of text. The first paragraph takes half a page (29 lines), and has a column of "letters" along its left margin. The letters are irregularly spaced along the column, but seem to be aligned with the main text lines. Here they are:

    s   d   q   s   o   l   k   r   s
    --  --  --  --  --  --  --  --  --
    01  02  03  04  05  06  07  08  09

CONCLUSIONS

  The key-like sequences should give us clues about the nature of the Voynichese alphabet, especially about the grouping of strokes into characters and the equivalence of character shapes. There are two caveats we should keep in mind, however.

  First, some sequences may not be original, but scribblings by some later owner (or library patron) who tried to decipher the manuscript.
  This warning is particularly apt for those sequences that have parallel sequences of Western letters or digits, such as the sequence on f1r and the vertical sequence on f49v. The latter is dubious also because it uses characters that are not found elsewhere, not even on other key sequences, and its symbols are somewhat ill-proportioned.

  On the other hand, it is somewhat unlikely that someone would scribble a "key" on a potentially valuable book before he knew whether the key worked or not. And the warning hardly applies to the sequences on f57v, f66r, and f69r, which are obviously part of the original layout.

  Second, the symbols used in the sequences may not all be letters. In modern texts one finds plenty of non-letter symbols, often mixed with letters, in certain special contexts: digits, punctuation, footnote daggers and stars, item bullets, paragraph and section signs, copyright marks, mathematical symbols, etc. Obviously the key-like sequences above are all "special contexts" where such symbols are expected to occur. In particular, with regard to the 4×17 sequence of f57v, it is hard to believe that the <&L> ("Lo^") symbol (which occurs nowhere else in the text) is a letter of the alphabet.

  Keeping these caveats in mind, what can we learn from the key-like sequences?
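  One mechanical way to start on that question is to tabulate where the four periods of the f57v:3 sequence disagree with each other. The Python sketch below is only an illustration, and it assumes that the transcription of f57v:3 given earlier in this note is right; PERIODS and varying_positions are names I made up for the purpose.

```python
# f57v:3 as transcribed in this note: four periods of 17 items each.
# The readings are this note's, taken on trust.
PERIODS = [
    "o l j r v x k m &E &L t r &H &G y &I &Y",
    "o l d r v x k m &E &L t r &H &G y c &Y",
    "o l d r v x k m &P &L t r &H &G y c &Y",
    "o l d r v x k m &P &L t r &H &G y c &Y",
]


def varying_positions(periods):
    """Map each 1-based position whose symbol is not the same in all
    periods to the list of symbols seen there, in period order."""
    rows = [p.split() for p in periods]
    if len({len(r) for r in rows}) != 1:
        raise ValueError("periods must all have the same length")
    result = {}
    for pos, column in enumerate(zip(*rows), start=1):
        if len(set(column)) > 1:
            result[pos] = list(column)
    return result


variants = varying_positions(PERIODS)
for pos, symbols in variants.items():
    print(f"position {pos:2d}: {' '.join(symbols)}")
```

  With this transcription only positions 3, 9 and 16 vary; the other fourteen positions carry the same symbol in all four periods. Whether a variation means "equivalent letters" or just a scribal slip is of course not something the tabulation can decide.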
  First, the key sequences suggest that the following symbols are complete single letters:

                          confirming sequences
           -----------------------------------------------
    symb   49v   57v:3 57v:2 57v:1 66r   69r   75v   76r
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           XXX   XXX   XXX   XXX   XXX   XXX   XXX   XXX
           XXX   XXX   XXX   XXX   XXX   XXX   -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           XXX   XXX   XXX   XXX   XXX   -     XXX   XXX
           -     -     -     XXX   XXX   XXX   XXX   XXX
           -     XXX   XXX   XXX   XXX   -     XXX   XXX
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           XXX   -     XXX   XXX   XXX   XXX   -     XXX
           -     -     -     XXX   -     -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           -     XXX   -     XXX   XXX   -     -     -
    <&I>   -     XXX   -     -     -     -     -     -
           XXX   -     -     -     -     -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
    <&E>   XXX   XXX   -     XXX   XXX   -     -     -
    <&F>   -     -     XXX   -     XXX   -     -     -
    <&P>   XXX   XXX   -     -     -     -     -     -
    <&Q>   -     -     -     -     XXX   -     -     -
           XXX   XXX   XXX   XXX   -     -     -     XXX
           -     XXX   -     XXX   XXX   -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           -     -     XXX   -     XXX   -     -     -
    <&S>   -     -     -     -     XXX   -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           -     -     -     -     -     -     -     XXX
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           -     XXX   XXX   XXX   -     -     -     -
           -     XXX   XXX   XXX   XXX   -     -     -
    <&A>   -     -     -     -     XXX   -     -     -
    <&B>   -     -     -     -     XXX   -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
           -     -     -     XXX   -     -     -     -
           -     -     -     -     XXX   -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----
    <&G>   -     XXX   -     -     -     -     -     -
    <&H>   -     XXX   -     XXX   -     -     -     -
    <&J>   -     -     -     -     -     XXX   -     -
    <&K>   -     -     -     XXX   -     -     -     -
    <&L>   -     XXX   -     -     -     -     -     -
    <&T>   -     -     -     -     XXX   -     -     -
    <&Y>   -     XXX   XXX   -     -     -     -     -
    -----  ----- ----- ----- ----- ----- ----- ----- -----

  The following EVA letters and proposed letters are conspicuously absent from the sequences:

  Note in particular that and are used as single letters. On the other hand the sequences do not give any direct evidence for the distinction between and its variants , , , etc.

  The absence of in these sequences supports the claim that is simply a calligraphic variant of or . On the other hand the sequences strongly suggest that and are distinct letters.

  Note that too is absent, except for the highly suspect sequence f49v; this sequence also includes <&D> (the right half of ), not used anywhere else. This suggests that is not an independent letter, but rather a modifier of a nearby letter, as suggested by the OKOKOKO analysis.

  The letter is distinct from . Its occurrences on f57v:3 (the 4×17 sequence) and f66r show a well-marked ligature, which would be superfluous if were just a calligraphic variant of .

  The letters <&I> and are probably equivalent. This is directly supported by their occurrence in equivalent spots of the 4×17 sequence, and indirectly by the fact that other sequences that include do not include <&I>. On the other hand, f66r seems to be contrasting with <&S> = <&I"h>.

  The absence of the `platform gallows' ( and variations) confirms the longstanding suspicion that they are contractions of the plain gallows with other characters, possibly .

  It seems that is distinct from (and ), as attested by f57v:3. On the other hand, f57v:3 suggests that and are the same letter.

  The distinction between and is supported by their clearly drawn samples in the sequences of f66r, f75v, and f76r.

  Sequence f49v, for whatever it is worth, suggests that and <&R> are the same.

  The letters and <p> may be equivalent. This is suggested by their occurrence in corresponding spots of the f49v and f57v:3 sequences. However, their separate occurrences on page f66r weakly suggest that they are distinct. On the other hand, f49v and f57v:3 suggest that and <p> are distinct from and .

  Sequence f66r suggests that there may be two distinct variants of , <&E> and <&F>. Possibly <p> too comprises two distinct letters, <&P> and <&Q>. This may be part of the explanation for the rarity of and sequences in the text: perhaps <&F> means , and <&Q> means .

CHANGE LOG

  1999-01-31 Version 1 created and posted.

  1999-02-01 Version 2 created.
    * re-read <&C> as [G. Landini].
    * re-read <&U> as [G. Landini].
    * fixed some typos in the text.

  1999-12-04 Version 3 created
    * refined into <&E> and <&F>
    * refined <p> into <&P> and <&Q>
    * refined with feet into <&A> and <&B>
    * reread second on f66r as <&S> =