Program the multiplicative NMF algorithm, and use it first for computing a factorization of the whole training set, with 40 columns in W, say. Experiment with the maximum number of iterations, so that you are sure that the algorithm has converged or is close to convergence. Look at the columns of W using ima2 (see below). Also illustrate, using ima2, for a number of digits in the training set their respective approximation using the NMF basis.
Using a training set, compute the NMF of each class matrix. Use 5-20 basis vectors and classify unknown test digits according to how well they can be represented in terms of the respective bases (use the relative residual vector in the least squares problem as a measure). Compare your classification results with those obtained using the SVD.
The test data are available at the URL http://www.mai.liu.se/~laeld/matrix-methods/computer-assignments/digits/. The following files are provided:
dzip.mat and azip.mat: the first is a vector that holds the digits (the number) and the second is an array of dimension 256 x 1707 that holds the training images. The images are vectors of dimension 256, that have been constructed from 16 x 16 images.
dtest.mat and testzip.mat hold the test data.
The data are a subset of the US Postal Service Database, and we downloaded them from the web page of the book The Elements of Statistical Learning, Hastie, Tibshirani and Friedman (2001). Springer-Verlag.