For this program you may work with one partner (you may not work in groups larger than two).
If you work with a partner:
Run update21, if you haven't already, to create the cs21/labs/08 directory. Then cd into your cs21/labs/08 directory and create the Python programs for lab 08 there (handin21 looks for your lab 08 assignments in your cs21/labs/08 directory):
$ update21
$ cd cs21/labs/08
$ vim spellchecker.py
With the labs/08/ directory we are giving you an example ignorefile and document file you can use to test your program. You should test your program on other documents and ignore files as well.
As you create your own documents and ignore files for testing, make sure that there is a blank line at the very end.
For this assignment you will implement an automated spell checking program that uses both searching and sorting. You should continue to practice using top-down design, and focus on breaking up the problem into appropriate functions. Also be sure to test your program incrementally.
You will use a dictionary of words that is available in a file on our system to determine whether each word in the document is a valid word. However, some valid words like Swarthmore will not appear in this dictionary. Therefore you will need another file of words, created by you, that should also be considered correct.
The goal of your program is to report all misspelled words found in the document. For example, suppose you had the following document called letter:
Dear Sam,
How are you? I'm fine, but I mis seeing you more oftn. Hope allis well with your family, especialy your brother. I'll be stoping by next Satuday when I get home from Swarthmore.
Love,
Alice
And suppose that you have a file named ignore with the following contents (Alice and Sam are in the system dictionary file):
Swarthmore
WWW
Running the spell checker on this document should result in the following interaction with the user:
Enter filename of document to spell check: letter
Enter filename of list of words to ignore: ignore
The following errors were found by the spell checker:
mis
oftn
allis
especialy
stoping
Satuday
Focus initially on checking whether the words in the document are found in the standard dictionary.
Implement and test a BinarySearch function. The best way to do this is to write it yourself using the algorithm we discussed in class, and then test it by passing in a sorted list of values. If you have problems, you can refer to p. 428 of the textbook, which lists the Binary Search code.
Implement and test this functionality before moving on to the next step.
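As a reference point for testing, here is a minimal sketch of binary search over a sorted list of strings. The function name binary_search, its parameter order, and the sample word list are assumptions made for illustration; they are not taken from the textbook or from class.

```python
def binary_search(target, sorted_items):
    """Return True if target is in sorted_items, which must already be sorted."""
    low = 0
    high = len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return True
        elif sorted_items[mid] < target:
            low = mid + 1      # target can only be in the upper half
        else:
            high = mid - 1     # target can only be in the lower half
    return False

# Small hand-made sorted list (hypothetical test data).
words = ["apple", "banana", "cherry", "grape", "melon"]
found_cherry = binary_search("cherry", words)   # True
found_kiwi = binary_search("kiwi", words)       # False
print(found_cherry)
print(found_kiwi)
```

Note that the search is only correct if the list is sorted in the same order that Python's < comparison uses for strings, so it is worth testing with both words that are present and words that are absent.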
Once you have the basic spell checking functionality implemented from above, you can add the ability to ignore other words not found in the dictionary. Many spell checking programs use this ignore list idea. The ignore list often contains proper nouns (e.g. Chopp) and well-known acronyms (e.g. WWW) that do not appear in a dictionary but are not misspellings.
The lab starting point code includes an example ignore file that you can add entries to. Be sure to keep the line break after the very last line in this file (adding new words in the middle of the file avoids changing it).
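The ignore list check can be sketched as follows. This is an illustration under stated assumptions, not the required design: the function name find_errors and the tiny in-memory lists are hypothetical, and Python's in operator (a linear search) stands in for the binary search the lab asks you to write.

```python
def find_errors(doc_words, dictionary, ignore_words):
    """Return the words in doc_words found in neither list, in document order."""
    errors = []
    for word in doc_words:
        # "in" is a linear search, used here only to keep the sketch short.
        if word not in dictionary and word not in ignore_words:
            errors.append(word)
    return errors

dictionary = ["Alice", "Sam", "from", "home"]   # tiny stand-in for the system dictionary
ignore_words = ["Swarthmore", "WWW"]            # contents of the example ignore file
doc_words = ["Sam", "home", "from", "Swarthmore", "Satuday"]

errors = find_errors(doc_words, dictionary, ignore_words)
print(errors)   # ['Satuday']
```

A word is reported only when both lookups fail, so adding a word to the ignore file never affects how dictionary words are handled.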
[ 11, 8, 13, 6, 5 ]
  ^   ^                compare and swap
[ 8, 11, 13, 6, 5 ]
     ^   ^             compare
[ 8, 11, 13, 6, 5 ]
         ^   ^         compare and swap
[ 8, 11, 6, 13, 5 ]
            ^   ^      compare and swap
[ 8, 11, 6, 5, 13 ]
13 has bubbled up to the top and is in its final sorted position
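The trace above shows one pass of the sort commonly called bubble sort. The sketch below reproduces that pass and then repeats it until the list is sorted. The function names bubble_pass and bubble_sort are assumptions chosen for illustration.

```python
def bubble_pass(items, end):
    """Compare neighbors from index 0 up to end, swapping any out-of-order pair."""
    for i in range(end):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]

def bubble_sort(items):
    """Sort items in place by running passes over a shrinking unsorted prefix."""
    for end in range(len(items) - 1, 0, -1):
        bubble_pass(items, end)

values = [11, 8, 13, 6, 5]
bubble_pass(values, len(values) - 1)
after_one_pass = list(values)   # [8, 11, 6, 5, 13], matching the trace
bubble_sort(values)
print(after_one_pass)
print(values)                   # [5, 6, 8, 11, 13]
```

Each pass places the largest remaining value in its final position, which is why the next pass can stop one index earlier.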
The sample output below uses the document and ignore files that you got with update21.
$ python spellchecker.py
Enter filename of document to spell check: document
The following errors were found by the spell checker:
Swarthmore
WWW
Swarthmore's
students'
department:
Danner
Horner
Kelemen
Knerr
Lia
Meeden
Newhall
Rothera
Turnbull
Wicentowski
$ python spellchecker.py
Enter filename of document to spell check: document
Enter filename of list of words to ignore: ignorefile
The following errors were found by the spell checker:
Swarthmore's
students'
department:
Lia
Note: These are not required parts of the assignment.
Once you have the full functionality implemented and tested, you can try adding some additional features. Here are some suggestions:
These features should not be added by using the ignore file, but should be added to the spell checking functionality.
A good test case for these changes is something like the following that contains quotes, numbers, and parens:
Barak Obama said today, "I'm pleased with the passage of the Health Care bill." After 8 months of negotiations (which were sometimes quite heated), the bill is a reality.
With the above features added, your program should indicate that this document has no misspelled words.
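One possible way to handle the quotes, numbers, and parens in a test case like this is to clean each token before looking it up. The sketch below is only an assumption about how such a feature might work: the function names clean_word and words_to_check are hypothetical, and stripping with string.punctuation is one choice among several.

```python
import string

def clean_word(token):
    """Strip punctuation from both ends of token; interior apostrophes are kept."""
    return token.strip(string.punctuation)

def words_to_check(text):
    """Return the cleaned tokens of text, skipping empty strings and plain numbers."""
    words = []
    for token in text.split():
        word = clean_word(token)
        if word != "" and not word.isdigit():
            words.append(word)
    return words

sample = 'said today, "I\'m pleased." After 8 months (which were heated), done.'
cleaned = words_to_check(sample)
print(cleaned)
```

Because the stripping happens inside the spell checking code, nothing needs to be added to the ignore file. One trade-off of this approach is that a trailing apostrophe, as in students', is also removed.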
Once you are satisfied with your program, hand it in by typing handin21 in a terminal window.