Notes on the Voynich Manuscript - Part 10 [1992 January 29]
-----------------------------------------

Comments and Suggestions

These comments are not very well ordered, but with the start of the new term I'm barely able to keep up with reading the Voynich stuff, much less writing any myself. So here are a few observations I don't want to forget about.

1. Letter correlations.

Currier, and now we, have found correlations between the final "letter" of a "word" (if letters and words are what they are) and the initial letter of the next word. Granted. However, I have some problems with the interpretation. Currier claims he knows of no language with this feature; I think he's very wrong.

First, note that in many languages (Welsh for instance) the phoneme (sound) at the start of a word is modified by the previous word. Some systems of writing reflect this change (modern Welsh I believe does), and some do not.

Secondly, note that in some languages there are grammatical rules that lead to such correlations. In English the chain of causation runs from right to left (a possibility Currier overlooks): "a" changes to "an" before a vowel, and possessives in "-y" change to "-ine" ("my" becoming "mine", as in older usage). Likewise, both French and Italian elide heavily, and some writing systems reflect this.

Finally, I struggled through enough Dante at one time to know that in Italian poetry of the time, endings and beginnings were highly correlated, not because of orthography or grammar, but because of euphony.

So, yes, the statistical patterns exist, and they are real, but I have two problems: (a) are they unusual - would we not find the same with known European languages? - and (b) can we reason back from the effect to the cause, given that there might be several causes, at very different levels of language?

I am similarly sceptical of preferred initial letters. After all, in Latin "qu" is never final, and "x" is never initial, and I'm sure parallels can be found in many languages. So this isn't unusual.
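
As a rough illustration of how the final/initial correlation discussed above might be measured, here is a minimal Python sketch. It is not Currier's procedure, nor anything used by the group; it simply computes the mutual information between the last letter of each word and the first letter of the next. The function names are hypothetical, and it assumes a transcription in which words are separated by whitespace and every "letter" is a single character.

```python
import random
from collections import Counter
from math import log2


def boundary_pairs(text):
    """Return (final letter, next initial letter) for each pair of adjacent words."""
    words = text.split()
    return [(a[-1], b[0]) for a, b in zip(words, words[1:])]


def mutual_information(pairs):
    """Mutual information, in bits, between the two members of each pair."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    finals = Counter(f for f, _ in pairs)
    initials = Counter(i for _, i in pairs)
    mi = 0.0
    for (f, i), c in joint.items():
        # p(f,i) * log2( p(f,i) / (p(f) * p(i)) ), written with raw counts
        mi += (c / n) * log2(c * n / (finals[f] * initials[i]))
    return mi


def shuffled_baseline(text, trials=100, seed=0):
    """Average mutual information after shuffling word order.

    Assumption: shuffling destroys any real dependence between neighbouring
    words, so what remains is the bias of the estimate on a finite sample.
    """
    rng = random.Random(seed)
    words = text.split()
    total = 0.0
    for _ in range(trials):
        rng.shuffle(words)
        total += mutual_information(boundary_pairs(" ".join(words)))
    return total / trials
```

The same two functions could be run unchanged on a passage of Latin, Italian or Welsh, which is one way of asking question (a): if a known language scores about as far above its shuffled baseline as the Voynich text does, the effect is not unusual. The measurement says nothing about question (b), the cause.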

Preferred paragraph-initial letters, frankly, impress me even less. In Euclid's "Elements", for example, paragraphs usually begin with "Axiom", "Theorem", "Corollary", "Lemma", ... in other words a very small set. The peculiar letter pattern is a consequence of the peculiar word pattern, and that is a consequence of the style of the author. Nothing can be deduced about the language from this effect.

[Note: well, I forgot about "xyster" and a couple of other rare Latin words, but I think the point stands.]

2. Need for Comparison

This brings me to the second point: the need for comparisons with known texts in known languages. Every experiment done on the Voynich MS should be repeated on, say, Magna Carta, the Romaunt of the Rose (separating the dialects), the Inferno, the Chanson de Roland - well, you get the idea - a chunk of contemporary (we hope) European documents.

To give one example, we can test explicitly Mr Guy's views about how dialects differ by applying his reasoning to the Romaunt of the Rose (Part 1 in a Southern dialect, Part 2 in a Northern, the original spelling preserved), and seeing whether they work.

Yes, this is going to be laborious, but I think it's far better to have a statistic that means something, than to have one about whose meaning we must then speculate.

3. Voynich Lines

I do not believe the MS is poetry, and I do not believe the line is a functional unit. This is because all I have seen looks like prose, and the lines seem to have been continued to a margin. Especially on the pages with a big drawing, the scribe has filled however much space the drawing left, line after line. Finally, the paragraphs end with short lines, just as do paragraphs of modern prose, and they can't be stanzas with a short final line (as in Keats' "Ode to Psyche", for instance), because the paragraphs are of arbitrary length and show no pattern of lines I can discern.

[Note: this completely overlooked the possibility that the text was poetry with the verses run together. Question everything!]

However, I agree that a useful hypothesis for the statistical properties is one that assumes elision or syllable fusion of some kind.

4. Letter Construction

Currier's observations on how the letters were written can, I think, be explained fairly simply: the scribes were writing with quill pens. It is always easier to make a downward stroke than an upward with such a pen, which is why "o" is written as two half-moons, and why most rising serifs in Voynich script are deeply curved.

This raises the possibility that the script was designed to be so written, in which case we should be prepared for many contracted forms. As an illustration, the Egyptian hieratic script was designed to be written with a reed pen; it consists almost entirely of contractions, each "letter" the fusion of two hieroglyphic symbols, and sometimes those symbols just the first syllable and last consonant of the word in question. It's fiendishly hard to read; I never bothered to take the time, and anyway all modern books use standard hieroglyphic fonts.

5. Words, words, words

As I see it, the single biggest problem we have today is deciding whether the spaces do or do not divide words. If they do, then many of Mr Guy's proposed symbol equivalences or alternations don't seem to work; if they don't, then most of my work on prefixes and suffixes is wrong.

Let me say, I'm not convinced by either Mr Guy's reasoning or mine on this point. What I did was pose the hypothesis that the words were words, look for confirming evidence, find it abundantly, and claim the answer. That's desperately unscientific, and I should have known better. The correct approach is to frame all reasonable alternative hypotheses, and look not for confirming but for discriminating evidence: evidence that favours some hypotheses over others, and ideally one over all.

Here is my shot at the hypotheses:

  S1/W1: The spaces divide words
  S2:    The spaces do not divide words, but are inserted according to
         some other rule ("always after 9")
  S3:    The spaces are inserted at random, just to confuse
  W2:    Word divisions are indicated in the text by some other means
  W3:    Word divisions are not indicated in the text

We can test the 'S' hypotheses by concentrating on spaces, and the 'W' hypotheses by ignoring spaces. I think we already have enough evidence to refute S3.

On this issue, I believe our best source of additional evidence is the isolated "words" attached to plants, jars, nymphs &c. If most of them recur in the text as words, and if they show initial and final letter frequencies similar to those of the "words" in the text, then that is strong evidence against S2. Our next best source would be paragraph-initial word beginnings; I don't think we should use paragraph endings since it is possible they are padded with gibberish.

6. Gleams of Hope (1)

I've now read the Voynich Newsletters of 1978 February and November, and 1980 January. They make me still more certain that the Brumbaugh decipherment is wrong. His transcription errors in particular worry me - there are too many of them; they are too big; and by now the decode matrix must be so embedded in his unconscious that it could be making those errors precisely to construct readable Latin.

But that's not my point. Look at the 1978 November letter, and please compare page 7 (Brumbaugh transcription) with p 6 (Voynich original). How long has Prof Brumbaugh been working on this? How familiar must he be with the script, and how many thousands of lines must he have transcribed? And yet, even making allowance for a hasty hand and a poor pen, that script is still very unpolished. Now compare the beautiful, elegant, stylish and almost effortless penmanship of the original.
This is not a scientific argument, but an aesthetic one, yet to me it leaps off the page: this penmanship is the craft of long and loving practice. And that says two things: first, that the Voynich MS was written to be read, not decoded, and was indeed to its readers the equivalent of plain text; and secondly, that this work is not unique, it is the only survivor of an entire corpus: for how else could that long practice have been acquired?

7. Gleams of Hope (2)

And one gleam is Mr Guy, whose long, exciting and valiant notes have, I think, elucidated, if not yet the mystery itself, at least a great deal more of its attributes.

But the whole team, I think, brings to bear on this problem two new things. First, we have collectively a very diverse body of knowledge, far more than the earlier teams, and we are less likely to imprison ourselves in fixed ideas. But secondly, most of us can do what they couldn't - we can use the machines as fast and effective tools for the testing of hypotheses, exactly as Ms D'Imperio advised. They took weeks to create one computer program: specification, design, flowcharting, data preparation, testing, begging machine time - I am not surprised they achieved so little with the computers. Today, on the other hand, I can have an idea, throw together some recycled code to test it, entrust the result to my old and familiar compiler, and have the numbers in twenty minutes. Then all I have to worry about is the insight - but our predecessors seemed never to have the time for that.

[Note: alas, soon after I wrote the above, my beloved machine, with its old and familiar compiler, was taken away, and with it much of my ability to write programs very, very rapidly. I was unable to port the compiler to the new machine, for reasons not here germane.]

Robert