CS81: Lab 5

To begin, run update81 to copy the starting point files into your home directory (cs81/labs/5/).

Introduction

The Resource Allocating Vector Quantizer (RAVQ) was developed by Fredrik Linaker and Lars Niklasson. Given a data set of vectors, the RAVQ generates a set of model vectors that are meant to represent typical categories of vectors within the data set. Resource allocating in this case means that the number of categories is not fixed, but is dynamically determined as the unsupervised learning proceeds.

The RAVQ consists of three main parts:

Input buffer
Moving average of vectors in the input buffer
Set of model vectors

When the RAVQ begins, the buffer is initialized by filling it with the first n inputs, where n is the buffer size. Once the buffer is full, each new input is added to the buffer and the oldest input is deleted, keeping the size of the buffer at n. A moving average vector is then calculated by averaging all the inputs currently in the buffer. Next the RAVQ determines whether the current moving average is a member of an existing category or qualifies as a new model vector. To qualify, the moving average must meet two criteria: it must be a good representation of the inputs in the buffer, and it must be unique enough when compared to the existing set of model vectors.
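The buffer and moving average described above can be sketched in a few lines of Python. This is only an illustration, not the pyrobot implementation; the function name moving_averages and the sample data are made up for this example.

```python
from collections import deque

def moving_averages(inputs, buffer_size):
    """Yield one moving-average vector per step once the buffer is full.

    Hypothetical helper for illustration; not part of pyrobot.
    """
    # A deque with maxlen drops the oldest input automatically on append.
    buffer = deque(maxlen=buffer_size)
    for vec in inputs:
        buffer.append(list(vec))
        if len(buffer) == buffer_size:
            dim = len(vec)
            # Average each component over all vectors currently in the buffer.
            yield [sum(v[i] for v in buffer) / buffer_size for i in range(dim)]

# Four 2-dimensional inputs and a buffer of size 2 give three averages.
stream = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
averages = list(moving_averages(stream, buffer_size=2))
print(averages)
```

Note that no average is produced until the buffer has received its first n inputs, which matches the initialization step described above.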

There are three key RAVQ parameters that must be set before learning begins. Each of these parameters will affect the number of categories that will be created.

Buffer size
This value determines the number of input vectors that are stored in the buffer. The size should reflect the probable rate of change within the environment. A small buffer size may lead to spurious categories that are based on noise. A large buffer size will cause the moving average to be quite stable and may lead to very few categories.
Epsilon
This value determines how close the moving average must be to the input buffer vectors to qualify as a possible new model vector. A small epsilon means that the vectors that make up the moving average must be nearly identical in order to be considered as a potential category. A large epsilon means that the vectors that make up the moving average could be quite dissimilar and still be considered as a potential category.
Delta
This value determines how different a moving average vector must be from all existing model vectors to justify creating a new model vector. A small delta means that the moving average need not be very different from existing model vectors in order to create a new category. A large delta means that the moving average must be significantly different from the existing model vectors to create a new category.
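
To make the roles of epsilon and delta concrete, here is a simplified sketch of the two criteria. It assumes Euclidean distance, measures "closeness to the input buffer" as the mean distance from the buffered inputs to the moving average, and checks the two conditions separately. These are assumptions made for illustration; the actual test in ravq.py may measure or combine the distances differently, so read updateModelVectors for the real rule. All function names here are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def qualifies_as_model_vector(buffer, model_vectors, epsilon, delta):
    """Return True if the buffer's moving average passes both criteria."""
    dim = len(buffer[0])
    average = [sum(v[i] for v in buffer) / len(buffer) for i in range(dim)]

    # Criterion 1 (epsilon): the average must represent the buffer well,
    # i.e. the buffered inputs must lie close to it on average.
    mean_error = sum(distance(v, average) for v in buffer) / len(buffer)
    if mean_error > epsilon:
        return False

    # Criterion 2 (delta): the average must be far enough from every
    # existing model vector. With no model vectors this is trivially true.
    return all(distance(average, m) > delta for m in model_vectors)

stable = [[0.1, 0.1], [0.1, 0.1]]   # inputs agree with each other
noisy = [[0.0, 0.0], [1.0, 1.0]]    # inputs disagree strongly

print(qualifies_as_model_vector(stable, [], 0.3, 0.6))
print(qualifies_as_model_vector(noisy, [], 0.3, 0.6))
print(qualifies_as_model_vector(stable, [[0.1, 0.1]], 0.3, 0.6))
```

Under these assumptions, raising epsilon lets noisier buffers pass the first test, and lowering delta lets averages that resemble existing model vectors pass the second, so both changes tend to increase the number of categories.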

In this lab we will observe a simulated Pioneer robot that is wall following in a two-room world, similar to the experiment described in the paper "Sensory flow segmentation using a resource allocating vector quantizer" by Linaker and Niklasson. On each time step the robot will check its sonar sensors, determine a motor action, and then create a vector of size 10 that combines the 8 front sonar values and the 2 motor commands (translate and rotate). These values are scaled to the range [0,1] and then passed into the RAVQ.
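
As a rough picture of how such an input vector might be assembled, consider the sketch below. The maximum sonar range, the element order (sonar values first, then translate and rotate), and the function name are all assumptions made for this example; check basicRAVQ.py for what the brain actually does.

```python
# Hypothetical maximum sonar reading used for scaling; the value used by
# the simulator may differ.
MAX_SONAR_RANGE = 3.0

def make_input_vector(sonar, translate, rotate):
    """Combine 8 sonar readings and 2 motor commands into one vector in [0,1]."""
    # Scale each distance by the assumed maximum and clip to [0, 1].
    scaled_sonar = [min(max(s / MAX_SONAR_RANGE, 0.0), 1.0) for s in sonar]
    # Motor commands start in [-1, 1]; shift and halve to reach [0, 1].
    scaled_motors = [(translate + 1.0) / 2.0, (rotate + 1.0) / 2.0]
    return scaled_sonar + scaled_motors

# Example: open space on the left (sensors 0-4), a wall close on the right
# (sensors 5-7), full speed ahead with a full-strength turn one way.
readings = [3.0, 3.0, 3.0, 3.0, 3.0, 0.6, 0.3, 0.3]
vector = make_input_vector(readings, translate=1.0, rotate=-1.0)
print(len(vector), vector)
```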

The RAVQ will be trying to learn appropriate categories for this environment and will report any time the current category changes. After a period of learning, the robot will stop and print out the current set of RAVQ categories. Then we will analyze the categories it found, trying to describe them verbally. Next we will reposition the robot back to the starting point and observe its behavior again. We will also try modifying the RAVQ parameters to gain a better understanding of how these settings affect the quantity and type of categories that are formed.

Finding categories with RAVQ

To start the categorization process, type:
```
python basicRAVQ.py &
```
This will open up a pyrobot window. Before you begin, use the mouse to grab the lower right corner of the pyrobot window and drag it down to make it bigger. Then press the Run button. This will run the robot for 325 steps. It will print each step number and in addition show you when the current category found by the RAVQ has changed. At the end it will print out all of the RAVQ's current categories.

Press the Stop button. For each category generated by the RAVQ, write down a verbal description of what it represents. Remember that the RAVQ is using the 8 front sonar sensor values (labeled 0-7 in the diagram below) and 2 motor commands. The sonar sensors reflect distances to an obstacle, so small values indicate that an obstacle is quite close and large values indicate that there is open space in front of that particular sensor. Because the robot is wall following on its right side, you would expect the values associated with sensors 5-7 to be quite low through most of the experiment. The motor commands were originally in the range [-1, 1]. To return a scaled motor value to this range, multiply it by 2 and then subtract 1.
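
When writing your descriptions, it may help to split each printed model vector into its parts and undo the motor scaling. The helper below is a hypothetical convenience, not part of pyrobot, and it assumes the vector lists the 8 sonar values first, followed by translate and then rotate.

```python
def describe_model_vector(vec):
    """Split a 10-element model vector and unscale its motor commands."""
    sonar = vec[:8]
    # Undo the [0, 1] scaling: multiply by 2, then subtract 1.
    translate, rotate = (2.0 * m - 1.0 for m in vec[8:10])
    return {"sonar": sonar, "translate": translate, "rotate": rotate}

# A made-up model vector: mid-range sonar readings, moving forward at half
# speed with a half-strength turn.
example = [0.5] * 8 + [0.75, 0.25]
parts = describe_model_vector(example)
print(parts)
```

For this made-up vector the motor values 0.75 and 0.25 map back to 0.5 and -0.5.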

Move the robot back to its approximate starting position. Be sure to set its heading to be towards the bottom of the screen. At the command line of the pyrobot window type:
```
self.counter = 0
```
This will reset the counter. Now press the Run button again. Whenever a category is reported, stop the robot and compare its current situation to your verbal description of that category. Do they seem to coincide? The RAVQ may find additional categories in the second circuit around the environment.

Now let's try to modify the parameter settings and see how the number of categories changes. In the pyrobot window, click on the brain's filename: basicRAVQ.py. This will bring up an edit window containing the program. At the top of the file find the code where the main parameters are set:
```
self.bufferSize = 7
self.epsilon = 0.3
self.delta = 0.6
```
Change the parameter settings here. Then save the file in the edit window. Go back to the pyrobot window and press the Reload Brain button. Then press the Run button to see the results. Only change one parameter at a time, keeping the others at their initial values. Record the number of model vectors created in each case.

Based on your findings from experimenting with the parameters, change all three parameters to try to get as many model vectors as possible.

Now try setting all three parameters so that you get an appropriate number of categories for this environment. While doing this, read the RAVQ implementation to verify exactly when a new model vector is created. The code is located at /usr/local/pyrobot/brain/ravq.py. Look at the method updateModelVectors and be sure you understand the criteria that must be satisfied for model vector creation. Several instance variables are crucial to this process. To see their values during the experiment, go to the setup method in the basicRAVQ.py file and change self.ravq.verbosity to 2. Then re-run the experiment, stopping whenever a model vector is created, and check the values. You may also want to stop at locations where you expected a model vector to be created but none was. How would you modify the parameters so that a model vector would be created in these situations?

CS81 Lab5: Unsupervised categorization