Hacking at the Voynich manuscript - Side notes
065 Ink separation experiments

Last edited on 2004-10-30 11:41:42 by stolfi

ENVIRONMENT SETUP

    ln -s ${HOME}/programs/c/IMG/ppminksep pgmdir
    ln -s ../../work
    set path = ( tools ../tools ../../tools ../../../tools ${path} )

RUNNING THE TESTS

    set runs = ( rt-f102v1-1 )

    foreach run ( ${runs} )
      echo $run
      set mkf = "../../test.make"
      ( cd images/${run} && make MAKEFILE=${mkf} -f ${mkf} all )
    end

A DETAILED ANALYSIS

  We analyze an extract from f102v1, comprising the bottom container,
  the adjacent plant, and some of the adjacent text. The image is
  rt-102v1-1.tif, extracted from 1006253.sid with sid-view and then
  smoothed with "gimp" (two rounds of Gaussian blur, radius 1.0 pixels).

  Vellum and Dirt
  ---------------

  First we analyze "images/rt-f102v1-1-bg", which consists of
  rectangular bits of background clipped from various places in
  rt-102v1-1.tif and pasted together with "gimp".

  This image can be explained very nicely as a random affine mixture
  of three components:

    Vellum   214 209 202 / 255
    Dirt     210 197 174 / 255
    Shadow   001 000 001 / 255

  These three colors lie exactly on the plane 2*R - 3*G + 1*B = 3, so
  the whole set of image colors should lie on or near that plane.
  More precisely, the plane is cr*R + cg*G + cb*B + ct = 0, where

    do-adjust-plane 2 -3 1 -3 img

      o = 207.7 199.7 186.9 / 255
      u = +0.5659 -0.7885 +0.2408   uC = +005.1
      v = +0.5174 +0.5671 +0.6408   vC = +340.5
      w = -0.6418 -0.2381 +0.7289   wC = -044.6
      sdev = 0.5986
      pwh = 255.3 254.6 255.1 / 255
      pbk = 002.9 -04.0 001.2 / 255
      hue = 255.0 176.5 000.0 / 255
      hue = 000.0 071.4 255.0 / 255

    set cr = "+565.9"
    set cg = "-788.5"
    set cb = "+240.8"
    set ct = "-5112.4"

  The pixel deviations from this plane have a normal distribution with
  zero mean and SD = 0.5986.

  This plane misses the diagonal of the RGB cube by about one unit in
  the G direction.
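  As a quick check of the plane equation quoted above, the following
  Python sketch computes the plane through the three listed component
  colors with a cross product. The function names are illustrative
  only; they are not part of the ppminksep tools, and this is not
  necessarily how do-adjust-plane works.

```python
from math import gcd

def sub(a, b):
    """Component-wise difference a - b of two RGB triples."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two RGB triples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through(p, q, r):
    """Return (normal, offset) with integer normal n such that n.x = offset
    for the plane through the integer points p, q, r."""
    n = cross(sub(q, p), sub(r, p))
    g = gcd(gcd(abs(n[0]), abs(n[1])), abs(n[2]))
    if n[0] < 0:          # normalize the sign so the R coefficient is positive
        g = -g
    n = tuple(c // g for c in n)
    return n, dot(n, p)

vellum = (214, 209, 202)
dirt = (210, 197, 174)
shadow = (1, 0, 1)

normal, offset = plane_through(vellum, dirt, shadow)
print(normal, offset)                      # plane is normal . (R,G,B) = offset

# Direction of the Vellum-Dirt line, reduced to lowest terms.
d = sub(vellum, dirt)
g = gcd(gcd(abs(d[0]), abs(d[1])), abs(d[2]))
vd_direction = tuple(c // g for c in d)
print(vd_direction)
```

  With the listed integer colors the normal comes out as (2, -3, 1)
  with offset 3, and any gray (t, t, t) shifted by one unit in G to
  (t, t-1, t) satisfies the same equation, which is the sense in which
  the plane misses the diagonal by one unit in G.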
  Projecting the colors Vellum, Dirt, and Shadow onto this plane, and
  clipping to the unit cube, gives, very nearly,

    # Project x onto the plane through o with unit normal u:
    #   x' = x - ((x - o).u)*u   (shown here for x = Vellum)
    echo ' \
      ur = 0.5659; ug = -0.7885; ub = 0.2408; \
      or = 207.7; og = 199.7; ob = 186.9; \
      xr = 214; xg = 209; xb = 202; \
      s = (xr-or)*ur + (xg-og)*ug + (xb-ob)*ub; \
      xr-s*ur; xg-s*ug; xb-s*ub \
    ' | bc -lq

    ProjVellum = 214.1 208.9 202.0 / 255
    ProjDirt   = 209.8 197.3 173.9 / 255
    ProjShadow = 006.7 000.0 005.3 / 255

  "Vellum" would be the color of clean vellum under maximum
  illumination, i.e. on the lightward side of bumps.

  The "Dirt" component is actually "maximally dirty vellum", since
  even the dirtiest pixels in this image apparently have a substantial
  "Vellum" component.

  The set of colors from this image has a very sharp "bright
  envelope", a straight edge along the Vellum-Dirt line. The direction
  of this line is surprisingly close to the simple vector u = (1 3 7).
  Extending this line at the Vellum end as far as the most neutral
  color, and at the Dirt end as far as the RGB cube boundary, gives
  approximately

    PureVellum 216 215 216 / 255
    PureDirt   185 122 000 / 255

  The "dark envelope" is a little less sharp, but still fairly
  straight and fairly parallel to the bright envelope. It corresponds
  to a mixture of Shadow with maximum weight 0.07. Because this
  maximum weight is so small, it is not possible to tell whether the
  mixture is additive or multiplicative, nor to determine the precise
  coordinates of the Shadow component.

  The maximally saturated color on this plane is

    255.0 176.5 000.0 / 255

  The Dirt component image (img-ca.ppm) has a lumpy texture with
  dominant wavelength about 15-20 pixels. Individual patches have very
  different texture intensities, correlating with their visual
  dirtiness. Conversely, the Vellum image (img-bg.ppm) is stronger and
  smoother over the visually cleaner patches.

  The Shadow image (img-sh.ppm) has a fine-grained and more random
  texture (dominant wavelength 5-7 pixels).
  Unlike the Dirt image, its intensity (and the strength of the
  texture) is fairly uniform over all patches, independently of their
  visual dirtiness.

  The residuals transverse to the Vellum-Dirt-Shadow plane, expressed
  as a fourth component Green = 178 254 167 / 255, have maximum weight
  0.02. The image of this component (img-cb.ppm) is similar to the
  Dirt image, but much fainter, suggesting that most of the residual
  may be due to a slight error in the Dirt coordinates.

  It should be noted that, since the distance between the Vellum and
  Dirt points is relatively small (0.11), one cannot tell whether the
  Vellum-Dirt mixture is additive (as it would be if the dirt
  consisted of opaque particles) or multiplicative (as would be
  expected from a transparent colored film). For the same reason, one
  cannot tell whether the input image has undergone any gamma encoding
  or non-linear brightness/contrast adjustments.

  Brown ink text
  --------------

  Next we analyze "images/rt-f102v1-1-nk", which consists of isolated
  brown ink text.

  The text was extracted with the help of a mask created with "gimp".
  First an area containing only brown text and background vellum was
  clipped from "images/rt-102v1-1.tif" and saved as "full.ppm". Then
  that image was converted to grayscale and thresholded so as to
  eliminate both the very dark tail and the lighter background,
  leaving only the characters. Two of the characters (EVA "t" and "r")
  were erased by hand because they appeared to have been retouched or
  were otherwise anomalous. The resulting image was thresholded again
  (to remove fuzzy edges in brushed-out areas) and saved as
  "mask.pgm".
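  The band-threshold step described above can be sketched in Python
  as follows. This is only an illustration: the thresholds `lo` and
  `hi` are hypothetical values (the notes do not record the ones used
  in "gimp"), and the plain channel mean used for grayscale is an
  assumption that may differ from gimp's conversion.

```python
def gray(rgb):
    """Grayscale value of an (R, G, B) pixel; plain channel mean (assumption)."""
    r, g, b = rgb
    return (r + g + b) / 3.0

def band_mask(pixels, lo, hi):
    """Return a 0/1 mask that keeps pixels whose gray value lies in [lo, hi].

    Pixels darker than lo (the very dark tail) and lighter than hi
    (the background vellum) are both masked out.
    """
    return [[1 if lo <= gray(p) <= hi else 0 for p in row] for row in pixels]

# Tiny made-up example row: light background, mid-brown ink, very dark pixel.
sample = [[(210, 208, 210), (150, 110, 70), (40, 30, 20)]]
mask = band_mask(sample, lo=60, hi=190)   # hypothetical thresholds
print(mask)
```

  Only the middle pixel of the sample row falls inside the band, so
  only it is kept by the mask.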
  Then we executed

    pnmxarith -multiply full.ppm mask.pgm > img.ppm
    convert img.ppm img.tif

  The pixels in this image form an elongated, mostly flat cloud,
  roughly bounded by the colors

    Vellum 210 208 210 / 255
    Ink    108 067 022 / 255
    Dirt   222 206 185 / 255

  This cloud has a sharp straight edge at the lighter end, and another
  one at the dark end, but those are artifacts of the mask
  construction procedure (which selected only pixels in a certain
  range of brightnesses).

  There is no sign of bending in the cloud, even though it spans a
  range of brightness from about 0.50 to about 0.75.

  The mean plane of the cloud (adjusted visually) is
  cr*R + cg*G + cb*B + ct = 0, where

    set cr = "41"; set cg = "-70"; set cb = "31"; set ct = "-405"

    do-adjust-plane ${cr} ${cg} ${cb} ${ct} img

      o = 185.4 167.8 147.6 / 255
      u = +0.4801 -0.8046 +0.3496   uC = +005.6
      v = +0.4716 +0.5727 +0.6705   vC = +282.5
      w = -0.7397 -0.1570 +0.6544   wC = -066.9
      sdev = 0.9538
      bg = 223.9 223.9 223.9 / 255
      pwh = 254.6 255.6 254.7 / 255
      pbk = 002.7 -04.5 002.0 / 255
      hue = 255.0 145.2 000.0 / 255
      hue = 000.0 103.8 255.0 / 255

    set cr = "+480.1"
    set cg = "-804.6"
    set cb = "+349.6"
    set ct = "-5623.1"

  The pixel deviations from this plane have a normal distribution with
  SD = 0.9098 (the listing above gives sdev = 0.9538), i.e. about
  twice the quantization error. This fact is consistent with (though
  not specific to) an opaque particulate ink and no gamma encoding.

  Projecting the manually fitted corners above onto this plane gives

    # Project x onto the plane through o with unit normal u:
    #   x' = x - ((x - o).u)*u   (shown here for x = Vellum)
    echo ' \
      ur = 0.4801; ug = -0.8046; ub = 0.3496; \
      or = 185.4; og = 167.8; ob = 147.6; \
      xr = 210; xg = 208; xb = 210; \
      s = (xr-or)*ur + (xg-og)*ug + (xb-ob)*ub; \
      xr-s*ur; xg-s*ug; xb-s*ub \
    ' | bc -lq

    ProjVellum = 209.4 209.0 209.6 / 255
    ProjInk    = 108.0 067.0 022.0 / 255
    ProjDirt   = 222.0 205.9 185.0 / 255

  The angle between the normal of this plane and that of the
  background-only patch quilt ("images/rt-f102v1-1-bg") is very nearly
  8 degrees.
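  The angle quoted above can be checked from the two unit normals "u"
  printed by do-adjust-plane for the background quilt and for the
  brown ink sample. The following Python sketch does so; the function
  name `angle_deg` is illustrative and not part of the tools used in
  these notes.

```python
import math

def angle_deg(a, b):
    """Angle in degrees between 3-vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    c = max(-1.0, min(1.0, dot / (na * nb)))
    return math.degrees(math.acos(c))

u_bg = (0.5659, -0.7885, 0.2408)    # normal of the background-only plane
u_ink = (0.4801, -0.8046, 0.3496)   # normal of the brown ink plane

angle = angle_deg(u_bg, u_ink)
print("%.2f degrees" % angle)
```

  Dividing by the vector norms makes the result insensitive to the
  four-digit rounding of the printed normals.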
  To look for the intersection between these two planes, we intersect
  the lines {o + t*v} and {o + t*w} of the background sampler with the
  ink sampler plane:

    Background sampler:

      o = 207.7 199.7 186.9 / 255
      v = +0.5174 +0.5671 +0.6408   vC = +340.5
      w = -0.6418 -0.2381 +0.7289   wC = -044.6

    Brown ink sampler:

      o = 185.4 167.8 147.6 / 255
      u = +0.4801 -0.8046 +0.3496   uC = +005.6

  Vellum, Dirt, and Ink
  ---------------------

  Next we analyze "images/rt-f102v1-1-tx", which consists of dirty
  vellum background plus brown-ink text.

  We try to combine the information obtained in the two preceding
  analyses. Namely, we use a tetrahedron in which two of the faces are
  the two planes above:

     2*R -  3*G +  1*B -   3 = 0
    40*R - 61*G + 24*B - 756 = 0

  Good points should be the intersections of the lines Ink -> Vellum
  and Ink -> Dirt with the first plane, namely

    CommonVellum 220 222 228 / 255
    CommonDirt   255 246 232 / 255

  To these points we add extra points from the two analyses, namely

    Ink    108 067 022 / 255
    Shadow 001 000 001 / 255
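  As a cross-check on the CommonVellum and CommonDirt values above,
  the following Python sketch intersects the lines Ink -> Vellum and
  Ink -> Dirt (using the manually fitted corners of the brown ink
  analysis) with the plane 2*R - 3*G + 1*B = 3, and stops early if a
  channel would exceed 255. The function names are illustrative, and
  the clipping rule is an inference from the listed numbers, not
  something stated in these notes.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def extend_to_plane(p0, p1, n, d, hi=255.0):
    """Follow the line p0 -> p1 until it meets the plane n.x = d.

    If some channel would exceed `hi` before the plane is reached, stop
    at the cube face instead (assumption; only the upper bound is handled).
    """
    v = sub(p1, p0)
    denom = dot(n, v)
    if denom == 0:
        raise ValueError("line is parallel to the plane")
    t = (d - dot(n, p0)) / denom
    for c0, dc in zip(p0, v):
        if dc > 0:
            t = min(t, (hi - c0) / dc)
    return tuple(c0 + t * dc for c0, dc in zip(p0, v))

normal, offset = (2, -3, 1), 3          # background plane 2R - 3G + B = 3
ink = (108, 67, 22)
vellum = (210, 208, 210)                # corners from the brown ink analysis
dirt = (222, 206, 185)

common_vellum = tuple(round(c) for c in extend_to_plane(ink, vellum, normal, offset))
common_dirt = tuple(round(c) for c in extend_to_plane(ink, dirt, normal, offset))
print(common_vellum, common_dirt)
```

  Under these assumptions the Ink -> Vellum line reaches the plane
  inside the cube, while the Ink -> Dirt line would reach it at
  R of about 257, so it is stopped at the R = 255 face; the rounded
  results agree with the two values listed above.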