git@github.swarthmore.edu:cs35-f21/lab08-<your-teamname>
Due Wednesday, November 24th at 11:59pm.
The main goal for this lab is to gain experience working with hash tables. To that end, you will:
Implement two new versions of the Dictionary
ADT:
A LinearDictionary
for a simpler dictionary implementation for small sets of elements.
A HashTable
implementation that uses separate chaining (each chain
will itself be a LinearDictionary
).
Apply your hash table implementation to create a simple program to assist Scrabble players find legal plays.
As usual, you can select your partner using Teammaker.
The URL for your repository will be:
git@github.swarthmore.edu:cs35-f21/lab08-<your-teamname>
Your starting code can be found in the appropriate repository for your team. The following is a description of the repository’s initial contents; bolded files are those which you must change in completing the lab.
adts/
: As with your last lab, the ADT interface declarations are stored here.
Makefile
: the build instructions for this lab.
linearDictionary.h/-inl.h
: The LinearDictionary
class you will be implementing to store collisions within your HashTable
.
hashTable.h/-inl.h
: the HashTable
class implementation files.
hashFunctions.h/cpp
: Simple hash functions for integer and string
keys, similar to hashes we saw in class.
tests.cpp
: Unit test wrapper that will call provided tests for
all three classes you will define. The tests for the classes are in testLinearDictionary.cpp
, testHashTable.cpp
, and testScrabbleAssistant.cpp
.
manualTests.cpp
: A "sandbox" test program that you can
use while you develop your program.
main.cpp
: a simple Scrabble Assistant application.
scrabbleAssistant.h/cpp
: a C++ class encapsulating the
functionality of the Scrabble assistant.
We recommend that you implement a LinearDictionary
as a vector of key-value pairs. All of the operations should run in \(O(n)\) time. You may be thinking: “Why would I want such an inefficient implementation of the Dictionary
ADT?” The answer is that, for very small dictionaries, simpler implementations perform much better as they have a smaller constant overhead.
We will use the LinearDictionary
to represent the chain of collisions in our HashTable
. Because we expect each chain to be relatively short, the LinearDictionary
provides a nice abstraction that will be efficient enough for our purposes.
Incrementally develop and test your linear dictionary as you implement using manualTests.cpp
. We recommend starting with your constructor, insert
, and either get
or contains
. Once you are satisfied with your dictionary’s ability to add and find items, finish implementing the remaining methods one at a time.
Unit tests have been provided for this lab; once you have completed your LinearDictionary
implementation, the appropriate unit tests should pass.
$ make tests
$ ./tests linearDictionary
Success: 11 tests passed.
Test time: 0.06 seconds.
Implementing the various methods will require you to manipulate the underlying vector you are using to store key-value pairs.
For the most part, this is straightforward (e.g., using vector::size()
to
get the size of the LinearDictionary
, or vector::push_back()
to insert). The
exception to this is implementing LinearDictionary::remove()
, which requires removing
an element in the middle of the vector. The vector
class provides a means to remove an element from the end of
a list (vector::pop_back()
) but doesn’t give a simple interface for removing
elements e.g. from an arbitrary index. In linearDictionary.h/inl.h
we
have fully implemented a helper method,
removeFromVector()
to make this task simpler by specifying a vector. The
method takes the vector
you wish to delete from and the index for locating
the element to remove. For example:
vector<string> myVec = {"a","b","c","d"};
//myVec is now ["a","b","c","d"]
removeFromVector(myVec,2); //removes the element at index 2 -> "c"
//myVec is now ["a","b","d"]
It may be useful to you in this lab to concatenate one vector to another.
Here’s an example of how to do this. The vector a
contains the integers 10 and 20.
The vector b
contains the integers 30 and 40. You can insert all the elements of b
to the end of vector a
as shown below:
vector<int> a = {10, 20};
vector<int> b = {30, 40};
// concatenate b onto the end of a
a.insert(a.end(), b.begin(), b.end());
Once your LinearDictionary
implementation is complete, you can begin
implementing a chaining hash table as we discussed in class. The
figure at right shows an example of a chaining hash table with size 3,
capacity 4, and storing integer keys with string values.
Your implementation will be in the files hashTable.h
and
hashTable-inl.h
. Here are some important details about this class:
We recommend that you use a dynamically allocated array of statically-allocated LinearDictionary
objects to represent the hash table.
The HashTable
class has two constructors. One of the constructors allows the user to specify a maximum load factor; the other should use a sensible default value (e.g. 0.75). In either case, select a reasonable default table capacity, (e.g., capacity=5
).
All operations on your HashTable (other than getKeys
and getItems
) must run in \(O(1)\) average time.
The HashTable must maintain a reasonable load factor in order to operate efficiently. You will create a private helper method called expandCapacity
to assist with increasing the table’s capacity when the load factor becomes too high.
Note that hashing functions have been provided for you in hashFunctions.h
and hashFunctions.cpp
for keys of type int
and string
.
Incrementally develop and test your hash table in a similar manner as LinearDictionary
e.g., write tests along the way using manualTests.cpp
Start with your constructor, insert
, and get
to test that you can insert and find items. Then complete your implementation by implementing and testing one method at a time.
Once you are able to pass all of your own tests in manualTests.cpp
then try our unit tests:
$ make tests
$ ./tests hashTable
Success: 14 tests passed.
Test time: 0.50 seconds.
Once you have a working hash table, you can start writing your Scrabble
Assistant. We’ll design the application in two parts: a
ScrabbleAssistant
class and a separate main
program. The
main program (which we’ve written for you) provides a simple menu
interface and makes calls to the ScrabbleAssistant
, which encapsulates
all of the Scrabble Assistant features. The ScrabbleAssistant
class
should load an English-language dictionary (the main program allows
the user to specify a dictionary on the command line, but uses /usr/share/dict/words
as a default) into a hash
table and support the following functions:
spellCheck
: given a word, this function should return true if and
only if the word is in the dictionary. Implementing this should be
straightforward and should run in \(O(1)\) time.
anagramsOf
: given a string of letters, this function returns a
vector of all anagrams (i.e., permutations of these letters) which
represent valid words. For example the letters "tca" will return "cat" and "act" as they contain exactly the same letters. This function should also take constant
time in terms of the size of the dictionary (it is allowed to
depend on the length of the string).
findWords
: given a string of letters, this function returns a vector
of all legal words that the player could spell with the given letters.
You’re welcome to include any data members you need for the
ScrabbleAssistant
, as well as any private helper methods you’d
like.
As you incrementally implement your solution, test your
ScrabbleAssistant
using the manualTests.cpp
file. Once you are
convinced that your implementation is working, then try our unit
tests:
$ make tests
$ ./tests scrabbleAssistant
Success: 4 tests passed.
Test time: 0.88 seconds.
Your ScrabbleAssistant
constructor should use a HashTable
data
member to store all of the legal words provided in the vector
parameter called words
. Remember that a Dictionary
ADT stores keys
with their associated values. In this application, we only care about
the keys, which are the legal words in a real-world dictionary. It will
take some work to insert all of these words in the beginning, but after
that we should be able to look up any word in constant time.
Create your constructor and then implement the spellCheck
method.
In order to make anagramsOf
run in constant time, it will be helpful
to maintain an additional data member in the ScrabbleAssistant
constructor. We can also keep a HashTable
of anagrams. First,
write a helper function to sort the letters in a
string. (e.g. sortString("cat")
should return "act"). You don’t
have to do this yourself; instead, use the std::sort function provided
in the algorithm library.
#include <algorithm>
string sortLetters(string s) {
sort(s.begin(), s.end());
return s;
}
Then, create a HashTable
that maps each sorted string to a vector of
all words that sort to that string. For each word in the dictionary:
sort the letters of the word to form the key
if the sorted letters are not already a key in the dictionary, add it with a vector containing just the current word.
if the key exists, then update the vector to include the word
For example, the search "tca" will sort to "act". Our dictionary
should return a vector containing all legal words with the letters
"act" (i.e., anagramDictionary.get("act")
should return ["act",
"cat"]
). Be sure to write your own tests to verify that anagrams are
being constructed properly. If no anagrams exist, your function
should return an empty vector (not throw a runtime_error
).
Optional improvement: Now that you’ve created a second HashTable
to
store all legal anagrams of words, think about how you could use this
one table to also implement spellcheck
, still in constant time. If
you’d like to, you can eliminate the legal words HashTable
and just
implement all methods using the anagrams table.
This anagram dictionary will quickly provide you all possible legal words
that use the whole set of letters in the search. In Scrabble, however,
you are not required to play all letters in your hand. To account for
this, you’ll also need to evaluate all legal plays using fewer
letters as well in the function findWords
. Implementing findWords
will require some clever design.
One way to implement findWords
is to break down the task into the
following subtasks:
Find the power set of the given string
The power set is the set of all subsets of the given letters. For
example, "abcd" should return all (set) combinations of size 4
("abcd"), size 3 ("abc", "abd", "acd", "bcd"), size 2 ("ab", "ac",
"ad", "bc", "bd"," cd"), and size 1 ("a", "b", "c", "d"). We have
provided this for you in the method stringPowerSet
.
Get all anagrams of each string in the power set (using
anagramsOf
)
Accumulate all the anagrams in a single vector
Remove Duplicates: When you use the stringPowerSet
method,
you will likely find at some point that you have a vector of words
that contain several duplicates. It might be helpful to create a
removeDuplicates
function that eliminates any extra copies of a
word. There are several ways of accomplishing this:
Use a double-for loop that compares all pairs of words in the vector, looking for pairs that are the same. This will run in \(O(n^2)\) time.
First, sort the vector of words. Then, make a linear scan over the list, eliminating adjacent duplicates. This will take \(O(n log n)\) time using an efficient sorting algorithm.
Create a hash table with words as keys. Then, insert each word in the hash table assuming it isn’t already there. This should take \(O(1)\) time per word, or \(O(n)\) overall time.
Once the above code is complete, your ScrabbleAssistant
will
be complete!
For this lab, your program is required to run without memory errors or leaks. Because we are using large data structures in this lab, valgrind will run very slowly. We recommend that you run valgrind on each section of the lab separately as shown below. However, each test may still run quite slowly, so be patient.
Test the LinearDictionary
: valgrind ./tests linearDictionary
Test the HashTable
: valgrind ./tests hashTable
Test the ScrabbleAssistant
: valgrind ./tests scrabbleAssistant
You are required to observe good coding practices. You have seen these requirements many times, and should know them by now. If you’re unsure, check previous labs where the style requirements were stated explicitly.
When you are finished, you should have
A completed implementation of LinearDictionary
and the helper methods
A completed implementation of HashTable
and the helper methods
A completed implementation of ScrabbleAssistant
No memory errors
No memory leaks
Code which conforms to the style requirements