CS35 Homework #6:

Binary Search Trees

Due 11:59pm Monday, 5 November 2007

For this assignment, you will be adding some additional methods to various tree classes and comparing binary search trees to a red-black balanced binary search tree. You should begin by copying the files from cs35/labs/Oct-22-Mon to your cs35/homework/06 directory. You will be adding code to these files, but it is important that you do not change the interfaces for existing functions unless explicitly told to do so. For this assignment, you may work with a partner to discuss how to implement operations and analyze their big-O runtime, but you should submit your own code.

Height
Recall that the height of node v in a tree is the length of the longest path from v to an external node. The height of an external node is 0, and the height an any internal node is one more than the maximum height of its children nodes. The height of a tree is the height of the root of the tree. We will adopt the convention that the height of an empty tree and a tree containing only a root node is zero.

Add a method height() that returns the height of a tree to the most generic class/interface that could support this method. Possible options include the Tree interface, the BinaryTree interface, the BinarySearchTree class or the RBTree class. If you decide that the height() method should belong as a part of an interface, implement the method in the most general class that implements that interface. We have already written a height method in-class and distributed it in update35. Just ensure that you have this code in the right place.

In your README answer the following questions:

Predecessor and Successor
Given a binary tree T and a key k, the predecessor of k in T is the entry with the largest key in T smaller than k. Similarly, the successor of k in T is the entry with the smallest key in T larger than k. Note that the predecessor or successor of k may exist in T, even if no entry in T has a key k. Furthermore, a predecessor or successor may not exists in T for a given key.

Add methods Entry<K, V> pred(K key) and Entry<K, V> succ(K key) that return a predecessor or successor of a given key k. Add these methods to the most generic class/interface that could support this method. If you decide the methods should belong as a part of an interface, implement the method in the most general class that implements that interface. If the predecessor or successor does not exist for a particular key, you should return a null value. Keep in mind that elements with no internal children may have predecessors or successors even though they cannot be found using the method we discussed in class. You implementation should run in O(h) time, where h is the height of the tree.

In your README answer the following questions:

Range find
Write a method Iterable<Entry<K,V>> findRange(K key_min, K key_max) that returns an Iterable list of all Entry objects in a tree whose keys k are in the range key_min <= k < key_max. The method findAll(K key) might be helpful to understand how to return an Iterable object. Your implementation should run in O(h+s) time where h is the height of the tree, and s is the number of elements returned. Such a method is called output sensitive since the running time depends not only on the input size, but also the size of the output.

In your README answer the following questions:

ZIP Codes
Design a simple class that can store ZIP code records. ZIP code data can be found in the file /usr/local/doc/zipcodes.txt. Each record is one single comma separated line with the following fields: ZIP code, latitude, longitude, city name, county name, state abbreviation, and population. Many cities do not have accurate population information and list the population as 0. You class should store a single record, not a list of records.

Define the following comparison classes that implement the Comparator interface for ZIP codes:

Write a parser that reads in records from the the zipcodes.txt file and creates a list of ZIP code records. Write test code that tests each of your comparison classes and sorts the list by different criteria. You should print out a few records each list to show they are sorted appropriately.

Experiments
Write a short, simple test class that demonstrates that your methods work. Try to devise good comprehensive test cases. You may use any key, value types you like for these test cases.

Additionally, perform the following tests with the ZIP code data:

In your README, summarize the results of your experiments. Does using a red-black tree result in a better balanced search tree? Why or why not?

Submitting
Along with your Java source code, you should hand in a README file, and your Makefile if you used one. These files will be imported automatically via handin35.