Notes on the Voynich Manuscript - Part 22 [1995 January 9]
-----------------------------------------

The Nature of the Voynich Manuscript, Revisited

1. A Preliminary Observation

Look at the MS. It was written fluently and at speed. This is proven, in my view, by the elegance of the hands, and by the length of text between successive refills of the pen, which can be inferred from the density of the ink. It is written in a script that I have argued elsewhere was devised explicitly to promote a fast, cursive hand.

If the text contains meaning, then, that meaning could be encoded very quickly, or at least copied very quickly. And I believe the copyist was not transcribing opaque cypher text, but understood the meaning, and the evidence for this is the vanishingly small number of seeming transcription errors we have found in the corpus. Contrast, for instance, Brumbaugh's numerous transcription errors, which, by the way, are to me clear proof that his alleged decipherment is bogus.

[Note: The author's eloquence is here obstructing the search for truth. It is worth reminding the reader that there is a much simpler explanation for the complete absence of corrections from the MS: that there is no meaning to correct, merely lucrative gibberish to be generated as rapidly as possible.]

2. Possible Cyphers

So, if the VMS is a cypher text, the cypher must be very simple. It must be readable virtually at sight. Of the set of "dense" cyphers - those where most of the encoded text is signal - I think that rules out anything more complex than a "gold bug" substitution cypher. Not to mention, of course, that nothing more complex was even known in the fourteenth century.

[Note: why "fourteenth" century? Because the author's reasoning processes are faulty, and cause him here to assume as true what he had previously only conjectured. That's how hypotheses get out of control.]
Indeed, even that is too difficult for most "secret" communications, especially those of occult or secret societies, who are a desperately verbose bunch and therefore tend to adopt simple cyphers. One obvious example is the Caesar cypher, based not on a random substitution but on a simple cyclic shift of the alphabet. Others are the Masonic cyphers, most of which are based on a rectangular grid, populated by letters, with each cypher symbol a glyph designating a part of the grid. You can learn such a cypher in half an hour, and become fluent in it in an afternoon. The point being, of course, that if this were the nature of the text, we could have cracked it long ago.

That leaves, then, the "sparse" cyphers, where the trick is to submerge the signal in noise. Typically, a Trithemian cypher. But they are harder to use than commonly imagined, and I invite you to test this. Here is my encoding rule: a word beginning with a vowel stands for digit "0"; a word beginning with a consonant stands for digit "1". If you please, encode the first hundred digits of the binary representation of pi in a plain English cover text. G'wan, I dare yaz!

[And, indeed, one member of Team Voynich did exactly what I'd dared him to do. It would be useful to add a link to that text, because I think it merits further analysis.]

You can't do it. You run out of words too soon, and get repetitious, and the entropy of the text soon increases beyond the point of plausibility.

So, use a dictionary and a random-number generator? Sorry: that generates a text in which all words are equally likely, so "the" will occur with the same frequency as "nephelococcygia". Well, sort the dictionary in descending order of frequency and use a generator that yields the integers in a negative-binomial distribution. I could do that, though I doubt John Dee could - but the result would lack any of the grammatical regularities of natural language.
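For anyone inclined to take up the dare, the vowel/consonant rule stated above can be sketched in a few lines of Python, if only to check an attempt mechanically. This is an illustrative sketch and nothing more: the function names, the tokenizing pattern, and the decision to count "y" as a consonant are assumptions made here, not part of the rule as originally given.

```python
import re

# Assumption: only a, e, i, o, u count as vowels; "y" is treated as a consonant.
VOWELS = set("aeiou")


def word_to_bit(word):
    """Return "0" for a word beginning with a vowel, "1" for a consonant."""
    return "0" if word[0].lower() in VOWELS else "1"


def decode(cover_text):
    """Recover the hidden bit string from a plain English cover text."""
    # Assumption: a word is a run of letters, optionally containing apostrophes.
    words = re.findall(r"[A-Za-z][A-Za-z']*", cover_text)
    return "".join(word_to_bit(w) for w in words)


def encodes(cover_text, bits):
    """Check whether a candidate cover text carries exactly the given bits."""
    return decode(cover_text) == bits


print(decode("Call me Ishmael"))  # prints 110
```

Decoding is the trivial direction. The difficulty lies entirely in composing a cover text that still reads like natural English, and `encodes` can only confirm that an attempt carries the intended bits.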
No: there is exactly one way to generate a plausible cover text for a Trithemian cypher, and that is to start from a real text. Take a book, any book, and enter it by some rule, and find the first word after the entry point that meets the constraints. The simplest rule, of course, is to go through the book front to back. So, to generate 50 pages of the VMS, take a few thousand bits of pi, take a copy of "Moby Dick", and if the message begins "11,0..." then the cover text begins "Call me Ishmael". Wrap up the result in a gold bug cypher to make assurance double sure.

But, friends, we could reverse that entire process. First, we could break the second cypher, and recover the mangled cover text. Then, I bet, we could *identify* that text, since any book accessible to a mediaeval monk or renaissance con man would be accessible to us. And finally, by setting the original text alongside the mangled copy, we would have a good chance of determining the rule by which it was mangled, and that breaks the supposedly unbreakable part of the code.

[Note: and here the author's faulty reasoning begins to devour its own tail. Why would you want a "plausible cover text", anyway? To hide the fact that this is a cypher. But first, that was very rarely a major concern. And secondly, the major concern was that the cypher should not be broken, so why choose a cover text in a way that makes the cypher easier to break? However, the conclusion in my next paragraph is perhaps less dubious than the reasoning that led to it.]

Repeated attempts at decipherment have failed because there is no cypher to break. The opacity of the text was not part of its authors' intention; it is due to a false assumption we are making. For the life of me, I cannot find that assumption.

3. Is it Gibberish?

No, it isn't.
We have applied to this demon-haunted document the best and most powerful quantitative tests of twentieth-century linguistics, and they all tell us the same thing: there is meaning in the MS; *it is language*.

[Note: again, faulty reasoning. Most of the above overpraised "tests" were devised to determine the characteristics of a text already known to be language. They were not devised to separate language from gibberish. It remains therefore an open question how much of the results are due to the text of the MS, and how much are due to the underlying assumptions, which might force the algorithms to find patterns even when none exists.]

As little as a hundred years ago, nobody could have constructed a gibberish text that would have fooled our investigative apparatus; a fortiori, nobody before us in the history of Western Europe could have done it. Indeed, I believe there is only one group in existence with the expertise and resources to forge the MS: Team Voynich; us. And we didn't do it, officer, honest!

4. The Roebuck in the Thicket

Or, the secret that eludes discovery, in Robert Graves' metaphor. Or, perhaps better, Borges' story about Ibn Rushd's commentary on Aristotle's Poetics, which could not be written because its author had no understanding of the concept of the theatre.

What are we missing? Or, more likely, what are we assuming that causes us to look at the text in exactly the wrong way? That we see as in a glass, darkly? (I Corinthians xiii:12 - "Blepomen gar arti di' esoptrou en ainigmati" - why do I still have the crazy notion that the base language of the MS is Greek?)

Again, I suggest we list our assumptions. And I've here reiterated my primary assumption, along with the reasoning that leads to it. If you can overturn that assumption, please do so.

Robert Firth

[Note: In retrospect, I think that this would have been exactly the wrong way to proceed. What we were short of was not assumptions, hypotheses, or ingenuity.
Or criticism of said assumptions and hypotheses. What we needed most of all was FACTS. And we still don't have them, even about such simple things as the age of the vellum, the chemical nature of the pigments, any possible candidates to match the illustrations... Yes, it's fun to be M. Dupin, or Mycroft Holmes, and deduce the truth by pure intellect, without stirring from one's chair. But it doesn't work. My thanks to those members of Team Voynich who did some real leg work.]