Your work for this lab consists of three parts:
As usual, you can get the initial files for this lab by running update35. The new files are listed here, and the files that you will need to edit are highlighted in blue:
$ cd cs35/labs/10 $ cp ../09/binaryHeap* library/You may need to patch up your include statements to get it to compile.
You should start by completing and testing the HashTable class. Our HashTable uses an array of vectors to store the data within the hash table, using chaining to resolve collisions. Note that this is not a vector of vectors. table is a standard C++ array, where each index contains an actual vector object (not a pointer). You should be able to draw this out as a memory diagram.
Each vector stores key-value pairs; for keys of type K and values of type V each key-value pair therefore has type Pair<K,V>.
You must implement public and private methods for the HashTable interface. We have provided the initial start in hashTable-inl.h, but feel free to divide the implementation into several files. We have provided you the implementation of insert as well as two hash functions so that your hash table can be utilized on either ints or strings.
You must test each HashTable function you implement using the testHash program. We have provided the data members you should keep track of and the implementation of insert. You should first read and understand these portions. We have also provided two sample hash functions, one for ints and one for strings. You may modify these as you see fit.
Note that we have defined a variable MAX_ALPHA. Load factor is generally noted as the letter alpha, and thus the highest load factor allowed is specified using MAX_ALPHA. Also note the odd declaration for the constructor:
HashTable(int capacity = 53, double maxLoadFactor = 0.8);This says that the user can specify two parameters, capacity and maxLoadFactor when constructing a HashTable. If the user omits these arguments, the default values are 53 and 0.8 respectively. This is a general technique for specifying optional parameters in C++.
Lastly, do not forget to use exceptions if a function was not called properly. For example, if there are no items in the table, which functions should be called? If a search process fails, which functions can handle that properly and which should throw an exception?
Scrabble is a popular board game based on a players ability to generate words from a given set of letters (called tiles). While there are more details needed to learn the game, all that you need to know for this assignment is that a player may only play actual words. That is, a legal play is is a word that exists in an English dictionary. Second, an important skill is being able to generate a list of possible plays given a finite set of letters in no particular order.
Your goal for the main program is to help a user learn about possible word plays. (Of course, this tool will NEVER be used to cheat in a real game. Scout's honor :) ). Your program will greet the user and provide a simple interface with four options:
The two main technical parts to implement are a spell checker and an anagram solver. You will solve both problems using dictionaries. There are not too many other details, we want to encourage creativity for this lab.
To solve both problems, you will need to load a dictionary of known words. You will need to load the words in the file /usr/local/doc/sowpods.txt. You will need to construct dictionaries to solve each problem. The first dictionary will index all known words, and is simply a problem of verifying if a word is in the data structure or not.
For the anagram solver, you will need to store a dictionary of anagrams and the associated list of words that can be created from it. This is slightly more complicated. One tip to consider is that anagrams are not order specific, so it may be best to sort all of the letters first. For example the letters c,a,t form the words cat, act, and tac. If the user had given me the letters in a different order, e.g. t,c,a the answered would be the same. So it is probably best to store your sequence of letters in sorted order and then sort the letters given by the user to search the dictionary.
There are a few ways to generate all legal plays. Perhaps the easiest way is to generate a list of all possible subsets of letters, and then use the anagram solver on each subset of letters. Implement the anagram solver first. Warning: Make sure to avoid printing out the same legal plays multiple times. One way to do this is to build up a dictionary of legal plays as you iterate over subsets. You can print out your dictionary after you finish iterating.
$./scrabbleCheat Welcome to Scrabble Solver Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3):1 Enter the word you wish to play: quilt Congratulations, you have found a legal play! Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 1 Enter the word you wish to play: QuILt Congratulations, you have found a legal play! Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 1 Enter the word you wish to play: blerg Sorry, that is not a legal play Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 2 Enter the letters you wish to play: tca Your legal plays are: act cat Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 2 Enter the letters you wish to play: ACT Your legal plays are: act cat Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 2 Enter the letters you wish to play: bl Sorry, no anagram exists Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 3 Enter the letters you wish to play: table All possible plays: blate bleat ablet table belt blet blat abet bate beat beta tael tale teal tela late leat able albe bael bale blae bet bel alt lat ate eat eta tae tea ale lea elt let tel bat tab alb bal lab at ta et te ab ba ae ea be al la el Menu: (0) Exit (1) See if a play is legal (2) Find all legal anagrams for a set of tiles (3) Find all legal plays for a set of tiles Select your choice (0-3): 0 Thanks for Playing Scrabble Solver!
The Standard Template Library (STL) contains many templated implementations of data structures covered in this course, plus many beyond. In this lab, you will use the vector class which is most closely related to the List interface and uses something akin to the ArrayList implementation. You can read about its usage here.
In testHash.cpp or a separate file, explore some of the parameters of your HashTable implementation. As discussed in class, the performance of HashTables depend on several factors (e.g., initial choice of capacity, hash function, load factor, etc.). Pick one or more of these factors as the independent variable in your experimental design. Use reportStats to gather information about several possible dependent variables (i.e., the y-axis) including the number of key-comparisons, the percentage of non-zero buckets in the table, the maximum size of any list, the average size of non-zero buckets, etc. This is by no means an exhaustive set of possible dependent and independent variables.
Once you explore some possible pairings of variables, produce graphs that show a couple interesting relationships. This can be done with one or two graphs. For example, you may have one graph showing the effect of initial capacity (x-axis) on comparisons (y-axis) with a curve for two different hash functions. Write a short paragraph describing how you generated the data (e.g., "inserted 1,000,000 values into a HashTable, varying the maximum load factor each time"). Write another sentence or two summarizing your findings.
This portion can be done with your lab partner. You will hand in your solution
in during class on
Tuesday, November 27. We suggest
modifying testHash.cpp. When you hand in the
normal part of the lab, be sure that testHash.cpp shows the original hash
table tests. You can encapsulate your experiments in additional functions,
not be deleting the original tests.
In class, we discussed how dictionaries are like arrays but with generic indexing. The subscript operator in arrays (e.g., C++ standard arrays, Python lists) is very similar to the get operation. In fact, let us overload the subscript operator for CircularArrayLists. First, let us recall how get(int i) is defined:
template <typename T> T CircularArrayList<T>::get(int i) { if (i < 0 || i >= size) { throw std::runtime_error("Attempted to get out of bounds."); } return values[(headPos+i)%capacity]; }Hopefully that is nothing new! Now let us see how the indexing operator is defined for this class:
template <typename T> T& CircularArrayList<T>::operator[](int i) { if (i < 0 || i >= size) { throw std::runtime_error("Attempted to index out of bounds"); } return values[(headPos+i)%capacity]; }
Notice that this looks almost exactly the same. The main difference is in the return value. get returns a value of type T. Just as we pass-by-value in C++, by default we return-by-value. That means that the value returned is a copy of the original item stored in the CircularArrayList.
The subscript operator is used to get values, and thus the get return-by-value would be suitable. But we also wand to use indexing to modify or set values e.g., list[i] = 7. To do this, we need to have access to the actual piece of memory where the data is stored. So, we return a data type of T&. This tells C++ not to return a copy, but to return the actual chunk of memory where the data is stored so that it can be modified.
Try defining the indexing operator for Dictionaries. Unleash the power of polymorphism!
Once you are satisfied with your program, hand it in by typing handin35 at the unix prompt.
You may run handin35 as many times as you like, and only the most recent submission will be recorded. This is useful if you realize after handing in some programs that you'd like to make a few more changes to them.