Last edited on 2005-05-23 02:11:54 by stolfi

Voynich Manuscript stuff

Who else is out there?

Thanks to Dennis Stallings for many corrections and additions.

Who else is not out there?

What are my feelings about the manuscript?

What is new in here?

Text retouching: [15 jul 04] [updated 22 jul 04] The text on page f1r shows clear evidence of extensive retouching -- and that may hold for many other pages as well.

The third big red weirdo of f1r: [03 jul 04] It is definitely not a big digit "2". Could it be...?

Zbigniew Banasik's theory: [23 may 04] Is the VMS written in the old Manchu language?

The weirdos of f1r: [26 feb 02] [updated 03 mar 2002] Two big red symbols on page f1r are seen from a different angle.

Chinese Theory redux!: [18 jan 02] The Voynichese word length distribution is surprisingly similar to that of Vietnamese and other East Asian languages.

On the VMS Word Length Distribution: [23 dec 00] The distribution of word lengths in the VMS lexicon is quite peculiar, and fits a surprisingly simple formula.

Special words: [07 feb 98] [rebuilt 10 jul 00] A word-by-word colorized edition of the Vms electronic text (a "best pick" of the Reeds/Landini interlinear file, converted to EVA and slightly corrected). Each Vms page is presented as a separate HTML page. Color is used to highlight "special" words that occur on the page much more often than expected by chance.

A grammar for Voynichese words: [14-jun-00] The latest version of my crust-mantle-core paradigm.

Pictures from an Expedition: [29 May 00] Some pictures of the recent Voynich Club Expedition to Prague, sort of.

The reds of f67r: [15 dec 1999] Are the red colors on pages f67r1 and f67r2 done with the same ink?

Slides from my VMS talk [15 dec 1999] given last July at the Brazilian Mathematics Colloquium in Rio.

VMS Word Coloring Service: [16 Feb 99] A WWW tool that lets you highlight your favorite VMS words in various colors.

The month names: [16 Feb 99] A pencil reproduction of the month names from the "zodiac" pages f70v2--f73v.

The `key-like sequences' [01 Feb 99] collected and discussed.

A full concordance of the VMS [16 Jan 99] based on the interlinear file release 1.6e6 (see below).

Landini's interlinear in EVA, version 16e6: [28 Dec 98] here you can get the interlinear file (interln16.evt) started by Gabriel Landini and Jim Reeds, converted by me to the EVA alphabet, and augmented by John Grove and myself. This new version includes Takeshi Takahashi's new, nearly complete transcription, and is now fully synchronized for your convenience and profit.

The "michiton" text: [07 Nov 98] A pencil rendering of the text on page f116v (the "michiton oladabas" page), for those who haven't got a better reproduction.

Where are the bits? [redone 13 Jul 98][12 Jul 98] Colorized pages of various texts in various languages, including even some VMs pages, that show how much each letter contributes to the conditional entropy hk.

Page scatter plots: [03 Jul 98] based on page-by-page frequencies of words and OKOKOKO elements. Shows conspicuous clustering of the pages into sections, and the relationship between sections.

OKOKOKO, or The fine structure of Voynichese words: [30 mar 98] Another proposal for the basic building blocks of Voynich words. Includes several data files and Unix scripts of possible interest. [Largely superseded by the VMS word grammar]

Enhanced text images: [29 mar 98] Images of Voynichese text, clipped from Jacques Guy's gallery, with enhanced contrast and EVA transcription. Includes all pages posted in GIF form by Ron Carter to this date.

A big list of labels and titles: [01 feb 98] [updated 20 jul 98] A slightly expanded version of John Grove's label collection, with full location codes, in HTML and machine-readable formats.

The Name of the Sunflower: [27 jan 98] An attempt to identify the names of plants by their patterns of occurrence in the herbal pages. Also some colorized Voynich text showing intriguing patterns.

The sunflower story: [17 jan 98] Detailed comparison of real sunflowers (and other related plants), with the thing on page f33v.

Beer bellies: [17 jan 98] A "proof by example" that the prominent tummies seen in some of the "Nymphs" do not mean they are pregnant.

Word occurrence maps in EVA: [30 Dec 97] A set of tables showing where each word or phrase occurs in the text. These maps are somewhat similar to my label occurrence maps of [23 Oct 97], but different in several key details: they use EVA, pay attention to word spaces, and list most of the words and phrases in the Vms (not just labels).

The Generalized Chinese Theory: [24 Nov 97] I think my recent prefix-midfix-suffix analysis of Voynich words justifies having a second look at Jacques Guy's "Chinese" theory. (This is an update of my note of [21 Nov 97].)

Plots of A×B counts for word components: [22 Nov 97] Inspired by G.Landini's paper, I made some plots comparing the frequencies of prefixes, midfixes, and suffixes in Currier languages A and B. The page also includes plots of unifixes (all-soft words), and word tails (midfix-suffix combinations).

Plots of Rayman's character counts: [20 Nov 97] At Rene's suggestion, I plotted Rayman's counts of distinct characters per page, for each page in the herbal section.

Comparing languages A and B at the sub-word level: [12 Nov 97] This note compares Currier's "language A" and "language B" subsets of the "herbal" section, in terms of the prefix-midfix-suffix paradigm (below).

The prefix-midfix-suffix paradigm: [12 Nov 97] This note describes a simple but surprisingly effective lexical paradigm for the structure of Voynich "words". If we divide the EVA letters in two sets, "soft" and "hard", we can parse almost every word as either a string of "soft" letters, or a single string of "hard" letters with a soft prefix and a soft suffix; and each component seems to be drawn from a rather small repertoire.
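The parsing rule just described can be sketched in a few lines of Python. This is only an illustration, not the script actually used for the analysis: the function name `parse_word` is made up, and the "hard" letter set below is a placeholder assumption, since this page does not list which EVA letters belong to each set.

```python
# Hypothetical placeholder: NOT the actual "hard" set of the paradigm,
# which is not given on this page. Every other letter counts as "soft".
HARD = set("ktpf")

def parse_word(word, hard_letters=HARD):
    """Parse a word under the prefix-midfix-suffix paradigm.

    Returns ("soft", word) for an all-soft word,
    ("hard", prefix, midfix, suffix) for a word with exactly one
    contiguous run of hard letters, and None if the word does not fit.
    """
    hard_positions = [i for i, ch in enumerate(word) if ch in hard_letters]
    if not hard_positions:
        return ("soft", word)
    first, last = hard_positions[0], hard_positions[-1]
    # The hard letters must form a single unbroken string.
    if last - first + 1 != len(hard_positions):
        return None
    return ("hard", word[:first], word[first:last + 1], word[last + 1:])
```

With the placeholder set, `parse_word("qokedy")` splits into the soft prefix "qo", the hard midfix "k", and the soft suffix "edy", while a word with two separate hard runs is reported as unparseable (None).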

Label occurrence maps: [23 Oct 97] Here you will find a set of maps and tables showing how the occurrences and near-occurrences of figure labels are distributed in the running text, across the whole manuscript. (This is a much bigger but disjoint version of the f77v label search mentioned below.) [Largely superseded by the VMS word grammar]

Top-down structural analysis of the Vms: [06 Oct 97] Here is the beginning of a top-down analysis of the Voynich manuscript, focusing mainly on the book itself (as opposed to the historical context and possible authorship). Comments and contributions are desperately welcome...

References to f77v labels in the biological text: [11 Sep 97] The figures on page f77v have labels; I have checked whether those labels occur in the main text of the biological section, with weakly positive results.

The Oresme manuscripts: [07 Sep 97] Here you will find a few samples of 14th century "technical" manuscripts, in abbreviated Latin, with expanded transcriptions and translations. Although not directly related to the Voynich manuscript, they provide some context as to the use of abbreviation, handwriting styles, scribal variations, etc.

Word occurrence map for the biological section: [16 Aug 97] This map shows the places where each word is mentioned in the Biological section.

Word pair table [10 Aug 97] Here you will find a table with counts of the occurrences of each pair of consecutive words in the Biological section.
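For readers who want to build a similar table from their own transcription, here is a minimal Python sketch of counting consecutive word pairs. It rests on assumptions not stated on this page: that the input is a list of text lines with words separated by whitespace, and that pairs are not counted across line breaks. The function name `count_word_pairs` is made up for the example.

```python
from collections import Counter

def count_word_pairs(lines):
    """Count each pair of consecutive words within each line.

    Assumption: words are whitespace-separated, and the last word of
    one line is NOT paired with the first word of the next line.
    """
    pairs = Counter()
    for line in lines:
        words = line.split()
        # zip(words, words[1:]) yields (w1, w2), (w2, w3), ...
        pairs.update(zip(words, words[1:]))
    return pairs

# Tiny made-up sample, just to show the shape of the result.
sample = ["daiin chedy daiin chedy", "chedy daiin"]
table = count_word_pairs(sample)
```

The result is a `Counter` keyed by (first word, second word) tuples, so `table.most_common()` lists the pairs from most to least frequent.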

What is old in here?

[01 Aug 97] I once posted a prefix-suffix decomposition for most of the words in the Biological section, using an "error forgiving" alphabet. This item is now obsolete in view of the prefix-midfix-suffix decomposition listed above.

What data am I using?

Practically all my analysis has been based on G. Landini's interlinear compilation of several existing machine transcriptions. Jim Reeds deserves the credit for having edited most of those transcriptions and made them available to the public.

What software am I using?

I have been using standard Unix and GNU tools (grep, gawk, sed, tr, sort, uniq, etc.). I use the C-shell (csh) from within Stallman's GNU Emacs editor; its "rectangle editing" features are wonderful for rearranging and reformatting the output tables.

When I run some program, I usually record the shell command together with its output and other comments, in a "notebook" file. Besides being good "scientific" (ahem!) practice, the notebook file makes it trivial to re-run the program later on different data, or after improving the scripts. You may browse my notebooks, old and new, if you like, but be warned: many of the things recorded there are bogus, dead ends, or just plain stupid. In the new notebooks you will also find many of the (Unix) scripts that I have used. Usage instructions are often included as comments at the beginning of each script.