CS35: Homework #6

For this assignment, you will be adding some additional methods to various tree classes and comparing binary search trees to a red-black balanced binary search tree. You should begin by copying the files from cs35/labs/Oct-22-Mon to your cs35/homework/06 directory. You will be adding code to these files, but it is important that you do not change the interfaces for existing functions unless explicitly told to do so. For this assignment, you may work with a partner to discuss how to implement operations and analyze their big-O runtime, but you should submit your own code.

Height

Recall that the height of node v in a tree is the length of the longest path from v to an external node. The height of an external node is 0, and the height an any internal node is one more than the maximum height of its children nodes. The height of a tree is the height of the root of the tree. We will adopt the convention that the height of an empty tree and a tree containing only a root node is zero.

Add a method height() that returns the height of a tree to the most generic class/interface that could support this method. Possible options include the Tree interface, the BinaryTree interface, the BinarySearchTree class or the RBTree class. If you decide that the height() method should belong as a part of an interface, implement the method in the most general class that implements that interface. We have already written a height method in-class and distributed it in update35. Just ensure that you have this code in the right place.

In your README answer the following questions:

Where did you define the method height()?
Where did you implement the method height()?
Give an argument that the run time of your implementation is O(n)
Briefly describe a way to implement height() such that it runs in O(1) time. How could you update the value of the height in O(h) time where h is the height of the tree? You do not need to implement this or give a detailed explanation, but highlight the basic idea, and discuss why updating the height might be tricky.

Predecessor and Successor

Given a binary tree T and a key k, the predecessor of k in T is the entry with the largest key in T smaller than k. Similarly, the successor of k in T is the entry with the smallest key in T larger than k. Note that the predecessor or successor of k may exist in T, even if no entry in T has a key k. Furthermore, a predecessor or successor may not exists in T for a given key.

Add methods Entry<K, V> pred(K key) and Entry<K, V> succ(K key) that return a predecessor or successor of a given key k. Add these methods to the most generic class/interface that could support this method. If you decide the methods should belong as a part of an interface, implement the method in the most general class that implements that interface. If the predecessor or successor does not exist for a particular key, you should return a null value. Keep in mind that elements with no internal children may have predecessors or successors even though they cannot be found using the method we discussed in class. You implementation should run in O(h) time, where h is the height of the tree.

In your README answer the following questions:

Where did you define/implement your methods?
Give an argument that the run time of your implementation is O(h)

Range find

Write a method Iterable<Entry<K,V>> findRange(K key_min, K key_max) that returns an Iterable list of all Entry objects in a tree whose keys k are in the range key_min <= k < key_max. The method findAll(K key) might be helpful to understand how to return an Iterable object. Your implementation should run in O(h+s) time where h is the height of the tree, and s is the number of elements returned. Such a method is called output sensitive since the running time depends not only on the input size, but also the size of the output.

In your README answer the following questions:

Where did you define/implement your findRange?
Give an argument that the run time of your implementation is O(h+s)

ZIP Codes

Design a simple class that can store ZIP code records. ZIP code data can be found in the file /usr/local/doc/zipcodes.txt. Each record is one single comma separated line with the following fields: ZIP code, latitude, longitude, city name, county name, state abbreviation, and population. Many cities do not have accurate population information and list the population as 0. You class should store a single record, not a list of records.

Define the following comparison classes that implement the Comparator interface for ZIP codes:

A comparison class that orders by increasing ZIP code.
A class that orders records alphabetically by city name and breaks ties by ZIP.
A class that orders records by increasing latitude and breaks ties by ZIP

Write a parser that reads in records from the the zipcodes.txt file and creates a list of ZIP code records. Write test code that tests each of your comparison classes and sorts the list by different criteria. You should print out a few records each list to show they are sorted appropriately.

Experiments

Write a short, simple test class that demonstrates that your methods work. Try to devise good comprehensive test cases. You may use any key, value types you like for these test cases.

Additionally, perform the following tests with the ZIP code data:

Sort the records by ZIP code and insert these records sequentially into both a standard binary search tree and a RB-tree. Use ZIP codes as the key for both trees. For each tree, measure the total build time and the final height of the tree.
Sort the records by city name and insert these records sequentially into both a standard binary search tree and a RB-tree. Again, use ZIP codes as the key for both trees. For each tree, measure the total build time and the final height of the tree.

In your README, summarize the results of your experiments. Does using a red-black tree result in a better balanced search tree? Why or why not?

Submitting

Along with your Java source code, you should hand in a README file, and your Makefile if you used one. These files will be imported automatically via handin35.

CS35 Homework #6:

Binary Search Trees