To begin, run update81 to copy the starting point files
into your home directory (cs81/labs/4/).
Intelligent Adaptive Curiosity (IAC) was developed by Pierre-Yves Oudeyer, Frederic Kaplan, and Verena Hafner with the goal of providing a robot with an intrinsic motivation system that pushes it to focus on situations that maximize its learning progress. IAC contains a memory that is sub-divided into sensorimotor regions. Each region contains:
In the original description, IAC's memory is organized hierarchically as a tree. Initially all exemplars are members of a single region. When the region grows too large it is split into two new regions. The split is made based on a feature of the SM(t) component of the exemplars such that the sum of the variances of the S(t+1) component, weighted by the number of exemplars in that new region, is minimized. This process continues recursively on any region that grows too large.
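The split criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the code in iac.py: the function name `best_split`, the representation of an exemplar as an `(sm, s_next)` pair, the use of a scalar S(t+1), and the choice of midpoints between observed values as candidate thresholds are all assumptions made for the example.

```python
from statistics import pvariance

def best_split(exemplars):
    """Find the SM(t) feature and threshold that minimize the weighted variance.

    exemplars: list of (sm, s_next) pairs, where sm is a tuple of floats
    (the SM(t) vector) and s_next is a float (a scalar S(t+1); an assumption).
    Returns (feature_index, threshold, cost), or None if no split is possible.
    """
    best = None
    n_features = len(exemplars[0][0])
    for f in range(n_features):
        # Candidate thresholds: midpoints between consecutive distinct values.
        values = sorted({sm[f] for sm, _ in exemplars})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [s for sm, s in exemplars if sm[f] <= threshold]
            right = [s for sm, s in exemplars if sm[f] > threshold]
            # Sum of the S(t+1) variances, each weighted by its region's size.
            cost = len(left) * pvariance(left) + len(right) * pvariance(right)
            if best is None or cost < best[2]:
                best = (f, threshold, cost)
    return best
```

For example, if the second feature alone determines the outcome, the sketch splits on that feature:

```python
exemplars = [((0.1, 0.0), 1.0), ((0.9, 0.0), 1.0),
             ((0.2, 1.0), 5.0), ((0.8, 1.0), 5.0)]
print(best_split(exemplars))
```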
The main processing loop of IAC works as follows:
In their paper "Intrinsic motivation systems for autonomous mental development", Oudeyer, Kaplan, and Hafner describe an experiment that is designed to test the effectiveness of IAC's goal of maximizing learning progress. One part of the domain is easy to learn, one part of the domain is more complex, and another part is unlearnable. We should expect to see IAC first focus on the easy portion, then focus on the more complex portion, while ignoring the unlearnable portion.
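To see why this ordering is expected, it helps to look at how learning progress might be measured. The sketch below treats learning progress as the drop in mean prediction error between an earlier window and the most recent window. The function name `learning_progress` and the `window` and `delay` values are assumptions chosen for illustration; they are not taken from iac.py.

```python
def learning_progress(errors, window=25, delay=15):
    """Estimate learning progress from a region's history of prediction errors.

    Compares the mean error over the most recent `window` errors with the
    mean error over a window of the same size that ended `delay` steps
    earlier. A positive value means the error is shrinking.
    (`window` and `delay` are illustrative values, not the lab's settings.)
    """
    if len(errors) < window + delay:
        return 0.0  # not enough history yet
    recent = errors[-window:]
    earlier = errors[-(window + delay):-delay]
    return sum(earlier) / len(earlier) - sum(recent) / len(recent)
```

Under this measure, a region whose error keeps falling reports positive progress, while a region whose error stays flat (as in the unlearnable part of the domain) reports progress near zero and so is not favored.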
The robot is placed in an environment with a smart toy. In their experiment, the robot could command its left and right wheels and also emit a frequency that makes the toy move. When the frequency is:
The robot then senses its distance from the toy. In their experiment the robot has no way of accurately predicting what the new distance to the toy will be when it emits the middle frequency. If the toy is in front of the robot and the robot moves forward, the new distance to the toy will be smaller. However, if the toy is behind the robot and the robot moves forward, the new distance will be larger. The robot only knows the current distance to the toy and not its direction.
We will modify the setup slightly to make the second frequency more predictable. The robot will only be able to translate without rotation. The toy will always start at a random position in front of the robot. If the robot moves forward, it will always decrease the distance to the toy. Similarly, if the robot moves backward, it will always increase the distance to the toy.
In this new setup, SM(t) consists of three items:
The paper describes an experiment that lasts for 5000 steps. In
the first 250 steps, the robot emits all three frequencies equally.
This is when the initial region first splits, and the robot begins
focusing on the third frequency. From then until about time step
3000, the robot emits the third frequency (the easiest one to predict)
about 90% of the time. From that point on it focuses on emitting the
second frequency (the more complex one to predict) about 85% of the
time. It never spends more than 10% of the time on the first
frequency, which is unlearnable.
Let's try to reproduce similar results with our simplified version of the experiment. Our modification has made the second frequency even more predictable, so it should only improve the results. Because the full experiment can take quite a while to run, let's start with a shorter version that will last only 1000 steps. To start the experiment do: python iac.py
Near the end of the iac.py file, an IACBrain is constructed and passed a number of arguments including the maximum region size, the motor vector size, the sensor vector size, the maximum number of steps in the experiment, and the probability of a random action. When applying IAC to other problems, you will need to modify these arguments for your domain.
Running iac.py will generate a number of data files recording the percentage of time the robot emits each of the three frequencies during every 50 time steps. These data files are updated continuously and can be viewed while the experiment is running. The following will produce a graph similar to Figure 4 on page 274 of the IAC paper:
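The per-window statistic described here can be sketched as follows. This is an illustrative helper only: the name `frequency_fractions`, the integer labels 1 to 3 for the three frequencies, and the in-memory list of choices are assumptions, and the sketch does not reproduce the format of the data files that iac.py writes.

```python
from collections import Counter

def frequency_fractions(choices, window=50, labels=(1, 2, 3)):
    """Fraction of steps spent on each frequency in consecutive windows.

    choices: list with one frequency label per time step (hypothetical
    labels 1, 2, 3). Returns one row per complete window; each row holds
    the fractions for `labels`, in order. A trailing partial window is
    ignored.
    """
    rows = []
    for start in range(0, len(choices) - window + 1, window):
        counts = Counter(choices[start:start + window])
        rows.append([counts[label] / window for label in labels])
    return rows
```

Each row sums to 1 when every choice is one of the listed labels, which matches the 0 to 1 range used on the y axis of the graph.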
xgraph -ly 0,1 -P *.data
Every time the memory is split, the current memory configuration is written to a log file. You can view this file as IAC runs to monitor how the memory is changing.
After the experiment is completed, additional data files will be written summarizing the mean error rate of each region over time. The following will produce a graph similar to Figure 5 on page 274 of the IAC paper:
xgraph *.err
If there are too many error graphs, you may want to instead view them in smaller groups, such as just R0-R9:
xgraph R?.err
How do your initial results differ from what is described in the
paper? On what feature and value is the first memory split made?
Let's split up into groups and review the implementation of IAC to verify that we have implemented all aspects of the system correctly: