CS63 Fall 2002
Lab 9: Digit recognition with neural networks
Due: Friday, December 13 by noon
One very successful application of neural networks has been in the
recognition of zip codes on hand-addressed envelopes. We will be
trying a simpler version of this task for today's lab. Instead of
trying to recognize five-digit numbers, we will try to recognize
single digits encoded as 6x6 images. We will be using the
xtlearn software described in the final section of your
reading packet.
-
I have provided the neural network files needed to run the following
examples we discussed in class: or, and, and
xor. Copy the contents of the following directory to
your home directory:
/home/meeden/Public/cs63/lab9-nnets
Try running experiments with each of these problems and be sure you
understand how to use the xtlearn software to train and test
the networks.
-
We need to generate an appropriate training set. Take the
blank template below, which is six rows by six columns, and create your
own version of the digits 0-9. Use the full extent of the space and
center the digits in the space as much as possible.
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
For example, here is one possible way to represent the digit one:
0 0 0 1 0 0
0 0 1 1 0 0
0 0 0 1 0 0
0 0 0 1 0 0
0 0 0 1 0 0
1 1 1 1 1 1
Add your versions to the top of the file digits.data (in the
digits subdirectory) just below the first two lines. Make
sure that you separate each digit with a blank line. Go to the top of
this file and change the number of patterns to 240. Also go to the top
of the file
digits.teach and add another 10 lines with 1-10 on them, and
update the number of patterns to 240.
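If you would rather not hand-edit the grids, here is a small, optional
Python sketch (not part of the provided software) that prints a set of
6x6 grids in the blank-line-separated layout shown above, ready to
paste into digits.data, along with the ten lines to append to
digits.teach. Only the digit one is filled in, using the example grid
from above; the rest are up to you.

# format_digits.py -- optional helper, not part of the lab files.
# Prints 6x6 digit grids in the blank-line-separated layout used above,
# plus the ten lines (1-10) to append to digits.teach.
digits = {
    1: [[0, 0, 0, 1, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1, 1]],
    # grids for 0 and 2-9 go here
}

for digit in sorted(digits):                 # patterns for digits.data
    for row in digits[digit]:
        print(" ".join(str(v) for v in row))
    print()                                  # blank line between digits

for i in range(1, 11):                       # lines to append to digits.teach
    print(i)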
-
According to Plunkett and Elman in their book of exercises written for
the xtlearn software, for certain kinds of problems we should
use the cross-entropy error measure rather than the default mean
squared error measure: "The cross-entropy has been used as an
alternative to squared error. Cross-entropy can be used as an error
measure when a network's output nodes can be thought of as
representing independent hypotheses (e.g. each node stands for a
different concept), and the node activations can be understood as
representing the probability (or confidence) that each hypothesis
might be true. In that case, the output vector represents a
probability distribution, and the error
measure--cross-entropy--indicates the distance between what the
network believes this distribution should be, and what the teacher
says it should be. There is a practical reason to use cross-entropy
as well. It may be more useful in problems in which the targets are 0
and 1 (though the output obviously may assume values in between)."
Our problem is exactly of this type, so when training, select the
"Use and log X-entropy" option under the training options.
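To see the difference concretely, here is a small Python illustration
of the two error measures for a single pattern (my own sketch, not
something xtlearn reports), using the usual definitions: summed
squared error and cross-entropy over the output nodes. Notice how much
more heavily cross-entropy punishes a confident wrong answer.

import math

def squared_error(targets, outputs):
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))

def cross_entropy(targets, outputs):
    # -sum of t*log(o) + (1-t)*log(1-o), for independent 0/1 targets
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for t, o in zip(targets, outputs))

targets = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]      # the digit coded by node 3
outputs = [0.01, 0.01, 0.05, 0.01, 0.01,      # node 3 nearly off,
           0.01, 0.95, 0.01, 0.01, 0.01]      # node 7 confidently (wrongly) on

print("squared error :", round(squared_error(targets, outputs), 3))  # about 1.8
print("cross-entropy :", round(cross_entropy(targets, outputs), 3))  # about 6.1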
-
Use the basic feedforward network I provided in the digits
subdirectory with 36 input units, 10 hidden units, and 10 output units
(one for each digit). Run at least 5 trials. What percentage of the
digits are recognized on average? Are some of the sets of digits
easier or harder than others?
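One reasonable way to score a trial is a winner-take-all rule: count a
digit as recognized when its output node is the most active one. Below
is a Python sketch of that scoring. It does not read xtlearn's files
for you (I am not assuming anything about their exact format), so you
would copy the output activations and correct answers in yourself; it
also assumes output node i codes digit i.

def percent_recognized(output_vectors, correct_digits):
    # output_vectors: one list of 10 activations per test pattern
    # correct_digits: the digit (0-9) each pattern actually represents
    hits = 0
    for outputs, digit in zip(output_vectors, correct_digits):
        winner = outputs.index(max(outputs))    # most active output node
        if winner == digit:
            hits += 1
    return 100.0 * hits / len(correct_digits)

# Tiny made-up example: two patterns, only the first recognized correctly.
example_outputs = [[0.9, 0.1, 0.0, 0.1, 0.0, 0.0, 0.1, 0.0, 0.0, 0.2],
                   [0.1, 0.2, 0.1, 0.0, 0.0, 0.8, 0.0, 0.1, 0.0, 0.0]]
example_answers = [0, 3]
print(percent_recognized(example_outputs, example_answers))    # 50.0

Average the percentage over your five (or more) trials to answer the
question above.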
-
Create a new network which attempts to provide feature detectors. For
example, have one set of hidden units only pay attention to the top
row of the input, another set only to the next row, and so on. You
could also have hidden units which only pay attention to one column,
or one corner. So rather than having the hidden layer connect to
every input node, you are allowing certain units to focus on certain
subsets of the input. Run at least 5 trials with this new
architecture. How does it compare to the first set of experiments?
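If you would rather not type out every connection by hand, here is a
rough Python sketch that prints row-based feature-detector connections
in the style of the CONNECTIONS: section of the .cf files. It assumes
the 36 inputs are numbered i1-i36 row by row and that the hidden units
are numbered starting at 1, so check those assumptions against the .cf
file I provided before using what it prints.

# Hypothetical helper: give each pair of hidden units one row of the input.
ROWS, COLS = 6, 6
HIDDEN_PER_ROW = 2                     # two feature detectors per input row
for row in range(ROWS):
    first_input = row * COLS + 1
    last_input = first_input + COLS - 1
    first_hidden = row * HIDDEN_PER_ROW + 1
    last_hidden = first_hidden + HIDDEN_PER_ROW - 1
    print("%d-%d from i%d-i%d" % (first_hidden, last_hidden,
                                  first_input, last_input))
# prints, for example:  1-2 from i1-i6
#                       3-4 from i7-i12   ... and so on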
-
Notice I have included a program called cluster in the
lab9-nnets directory. To use this program to do a cluster
analysis, create a file of hidden activations and an associated file
of labels for those patterns, and then do:
% cluster file.hidden file.label
To create a file of hidden activations, you need to include a line of
the following form in the .cf file under the SPECIAL: section:
selected = startNode-endNode
Then, after the network has learned, go to the Test menu and
choose Probe selected nodes. By default, the values will be
output to the screen. But you can choose to have them sent to a file
under the Testing options. Remove the comments at the top of
the file, so that the first line is a set of hidden node activations.
Finally create a file of labels, with one label per line, describing
the input associated with each set of hidden node activations.
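Here is one way to generate the label file in Python, assuming you
probe on the entire training set so that the hidden activations come
out in the same order as the patterns in digits.data (ten digits per
set, 24 sets for 240 patterns). The file name and label format below
are just placeholders; use whatever makes the cluster output easiest
for you to read.

# Hypothetical helper: one label per pattern, in digits.data order.
NUM_SETS = 24                          # 240 patterns / 10 digits per set
with open("file.label", "w") as labels:
    for which_set in range(1, NUM_SETS + 1):
        for digit in range(10):
            labels.write("%d-set%d\n" % (digit, which_set))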
-
Summarize your results in tabular form. Include a picture of your new
network's architecture. Try doing a cluster analysis of each
network. Discuss your results in detail. Which architecture seems to
perform the best? Why do you think this is the case? Create a web
page with your write up. Send me an email letting me know the URL.