Exercise 2

ATTENTION - changed the PCA part - 21/3

Due date: 25/3

The zip file train17.zip contains a collection of PGM images of handwritten digits 1 and 7. Each image has 64x64 pixels in the PGM format, where each pixel has value 0 or 1. Each image file has a name in the format X_yyy.BMP.inv.pgm, where X is the digit represented in the image.

The file test17.zip contains test images in the same format.

PGM files start with 3 header lines:
P2
64 64
1
These lines are not relevant to us. They are followed by 64x64 pixel values separated by a blank or a line break. For us, these 64x64 pixels are the attributes/dimensions of the data. The class of each file is the digit given in its file name.
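One possible way to load the images into R is sketched below. It assumes the zip file has been extracted into a directory and that every file has exactly the 3 header lines described above. The helper names read_pgm and load_digits are hypothetical, not part of the exercise.

```r
# Hypothetical helper: read one PGM file into an integer vector of 4096 pixels.
read_pgm <- function(path) {
  # skip = 3 drops the header lines (P2, "64 64", "1");
  # scan() splits the remaining values on blanks and line breaks.
  pixels <- scan(path, what = integer(), skip = 3, quiet = TRUE)
  stopifnot(length(pixels) == 64 * 64)
  pixels
}

# Hypothetical helper: build the data matrix and the class labels
# for all PGM files in a directory (one row per image).
load_digits <- function(dir) {
  files <- list.files(dir, pattern = "\\.pgm$", full.names = TRUE)
  data <- t(vapply(files, read_pgm, integer(64 * 64)))
  rownames(data) <- NULL
  # The class is the first character of the file name (X in X_yyy.BMP.inv.pgm).
  labels <- factor(substr(basename(files), 1, 1))
  list(data = data, labels = labels)
}
```

For example, if train17.zip was extracted into a directory named train17 (an assumed name), load_digits("train17")$data would be a matrix with one row per image and 4096 columns.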

How to use PCA in R:
n is the number of dimensions to keep

pca <- prcomp(train)
newtrain <- pca$x[, 1:n]
newtest <- scale(test, pca$center, pca$scale) %*% pca$rotation[, 1:n]
The first line computes the PCA. The second returns the training data transformed into the new, reduced PCA dimensions - you should have done something similar in the first exercise. The third line uses the training PCA to transform the test data.
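The three lines above can be tried on small synthetic data, as in the sketch below. The random matrices are stand-ins for the real images, and n = 3 is an arbitrary choice for illustration. The sketch also checks that the manual projection of the test data agrees with R's predict() method for prcomp objects.

```r
set.seed(1)
# Synthetic stand-ins for the real data: rows are images, columns are pixels.
train <- matrix(rbinom(20 * 16, 1, 0.5), nrow = 20)
test  <- matrix(rbinom(5 * 16, 1, 0.5), nrow = 5)
n <- 3  # number of dimensions to keep (arbitrary here)

pca <- prcomp(train)
newtrain <- pca$x[, 1:n]
newtest <- scale(test, pca$center, pca$scale) %*% pca$rotation[, 1:n]

# predict() applies the same centering and rotation to new data,
# so its first n columns should match the manual projection.
newtest2 <- predict(pca, test)[, 1:n]
stopifnot(isTRUE(all.equal(unname(newtest), unname(newtest2),
                           check.attributes = FALSE)))
```

Note that the test data is centered with the training means (pca$center), not with its own means, so both sets end up in the same coordinate system.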


Last modified: Mon Mar 17 11:28:06 BRT 2014