Use the github server to get the starting point notebook for this lab.
Novelty search is a method of searching for a variety of solutions rather than trying to optimize a particular goal. To do this, novelty search maintains an archive of behaviors, comparing current behaviors against the saved ones to determine whether a current behavior is novel enough to add to the archive. A current behavior's novelty is defined as the average distance to its k nearest neighbors in the archive.
We will be representing a behavior as a list of points, where points may be of some arbitrary dimension, and the list may be of some arbitrary length. For example, consider the coverage task that we worked on in the previous lab. One way to represent the behavior of a robot in this task is to record the new grid locations it visits in order. The length of this particular behavior space will be the total number of grid locations, which for a 15x15 grid is 225. The length of each point in this behavior space will be 2, representing the (x, y) location of each grid square.
constructor: Create class variables to store the archive as an initially empty list, the k (an integer) to be used in calculating nearest neighbors, the limit (an integer) for the maximum size of the archive, the threshold (a float) for how novel an example has to be before it will be added the archive, and the max_dist (a number) for the maximum possible distance between any two behaviors in the archive.
point_dist: Returns the Euclidean distance between points p1 and p2, which are assumed to be lists of equal length. To calculate distance, sum up the squared differences between each component of the points. Return the square root of this sum. This method should work for points of any length.
behavior_dist: Returns the sum of the point distances between behaviors b1 and b2, which are assumed to be lists of points of equal length. This method should work for behaviors of any length.
k_nearest_dist: For the given behavior, returns the sum of the behavior distances to its k nearest neighbors in the archive.
sparseness: For the given behavior, returns the normalized average distance (based on the max_dist) to its k nearest neighbors in the archive. If the archive is empty, then return 1.0. If the archive has less than k behaviors, the use the archive's current size to compute the average distance. The sparseness of a behavior will represent its novelty.
check_archive: This method checks whether the given behavior is novel enough to add to the archive and returns its sparseness. The parameter other_info can be used to store additional information along with the behavior in the archive. This will be helpful when doing analyses of the evolutionary run, once it completes. The archive is represented as a list of tuples, where each tuple contains a behavior, its sparseness, and the its other associated information. The behavior at the front of the archive is the oldest, and the behavior at the end of the archive is the newest. When the size of the archive is less than k, then add the given behavior to the archive. Otherwise check if the sparseness of the given behavior is greater than the threshold, and if so, add the behavior to the archive. After adding a behavior, test if the archive size has exceeded the limit, and if so, remove the oldest behavior from the archive.
plot_behaviors and plot_archive: You will implement these methods later.
Once you complete your first pass at the implementation, use git to add, commit, and push your changes.
import random
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
class NoveltySearch(object):
def __init__(self, k, limit, threshold, max_dist):
pass
def point_dist(self, p1, p2):
pass
def behavior_dist(self, b1, b2):
pass
def k_nearest_dist(self, behavior):
pass
def sparseness(self, behavior):
pass
def check_archive(self, behavior, other_info=None):
pass
def plot_growth(self):
pass
def plot_behaviors(self, max_x=1.0, max_y=1.0):
pass
Test your implementation by creating some sample behaviors where all points are two dimensions and have coordinates in the range [0, 1], and all behaviors contain four points. Determine the maximum possible distance between such points. You'll also need to determine a good threshold based on your test behaviors.
def unit_tests():
pass
unit_tests()
Ensuring that the outcome of a machine learning process is correct can be tricky because it may incorporate randomness that leads to different results for each run. It is often the case that you will need to implement additional functionality to assist in analyzing the results. We will use matplotlib's pyplot to create several figures to help us monitor the search.
Once you complete the analysis tools, use git to add, commit, and push your notebook.
In order to gain more intuition about how novelty search operates, you will explore how various parameters settings affect the contents of the archive. Let's focus on the threshold parameter since it is crucial in determining when a behavior is novel enough to be added to the archive.
Complete the function random_2d_behavior that returns a behavior of the given length as a list of 2d points in the range [0,1].
Next, complete the function randomized_tests. Create an instance of NoveltySearch with a k of 3 and a limit of 10. Let's use behaviors of length 3. You'll need to determine the max_dist for these behaviors, and you will experiment with the threshold setting. Loop 500 times, generating a random behavior and calling the check_archive method. After the loop, call the plot_growth and plot_behaviors methods to see the results.
Recall that the sparseness values are normalized, so the threshold setting should be a value in the range [0,1]. Smaller thresholds should lead to the archive reaching the limit in size more quickly. For high enough thresholds, the archive may never reach its size limit. What threshold value(s) seem to strike the best balance of using the capacity of the archive at a reasonable pace? You should run multiple experiments at each threshold setting. Include a markdown table showing your results.
def random_2d_behavior(length):
pass
def randomized_tests():
pass
randomized_tests()
Be sure to save this notebook to the repository.