CS91R Lab 01: NY Times games assistants
Due Wednesday, February 5, before midnight
Goals
The goals for this lab assignment are:
- Learning the basics of regular expressions
- Learning the basics of awk to prototype solutions on the command-line
- Learning how to use regular expressions in awk and python
- Applying these tools to build word game assistants
Cloning your repository
Log into the CS91R-S25 github organization for our class and find your git repository for Lab 01, which will be of the format lab01-user1-user2, where user1 and user2 are your and your partner’s usernames.
You can clone your repository using the following steps while connected to the CS lab machines:
# create your cs91r/labs directories if you have not done so
$ cd
$ mkdir cs91r
$ cd cs91r
$ mkdir labs
# cd into your cs91r/labs sub-directory and clone your lab01 repo
$ cd ~/cs91r/labs
$ git clone git@github.swarthmore.edu:CS91R-S25/lab01-user1-user2.git
# change directory to list its contents
$ cd ~/cs91r/labs/lab01-user1-user2
# ls should list the following contents
$ ls
bee.py collab.md ex1.py README.md sample.cast sample.gif wordle.py words.txt
Python Regular Expressions warmup
CLAB 2B contains information on how to write regular expressions in python. It has two questions, and for this warmup, you should complete the first question (#18). Work with your lab partner on this — you don’t need to work with your in-class group.
Regular expressions refresher
If you’d like a refresher on regular expressions, we’ve created a regular expression tutorial. We have provided an example of how to code the tutorial examples in python. Be sure you understand how the examples work before moving on to the next section. This file (ex1.py) is also in your Lab 01 repository.
1. Spelling Bee assistant
The New York Times hosts a number of word games. One of these games is Spelling Bee. Like the other games on the NY Times website, these puzzles are updated only once a day. As with any successful product, there are knock-off versions such as spellbee.org.
The spellbee.org version allows you to link to a specific puzzle, which is useful for briefly explaining the game. For example, let’s look at this puzzle, which has the letter L in the center in yellow and the letters E, Q, T, I, U, R around the outside in white. Your goal is to make as many words as possible with the letters provided. You can use each letter as many times as you want (0 or more times), but you must use the yellow center tile at least once. Words must be at least 4 letters long.
The dictionary that the NY Times uses and the dictionary that the spellbee.org website uses are both unpublished, but we can use our own dictionary: a dictionary that contains words that are legal to play in the board game Scrabble. Our dictionary list should contain all of the words in the spelling bee solution, and it will almost certainly contain extra words that aren’t in the solution. That’s because the Scrabble word list is much more expansive (and contains words that some users might find offensive). However, using this list should ensure that we find all of the answers the game wants us to find and then a bunch more that the game considers invalid.
You can find the dictionary here: /data/cs91r-s25/scrabble/scrabble.txt
1.1 Using awk to prototype a solution
In a single command line, use awk to produce all of the valid words for the spellbee.org puzzle described above.
For the solution we are looking for, you can pipe the output of one awk command into another, but you shouldn’t use any awk syntax that is more complicated than the examples provided above.
If you’d like to use DataCamp’s Regular Expressions Cheat Sheet to figure out how to solve it with more complex regular expressions, you’re welcome to do so, but be sure to show us how you would do it using only what we’ve taught you, too.
Questions
Put the answer to these questions, and any additional questions, in your README.md file.
- Use asciinema to record interactions in the shell using awk demonstrating your spelling bee assistant.
1.2 Using python to get answers
To begin, write a python program called bee.py that calculates all of the answers to the spelling bee game. You will need to read in all of the words from the dictionary file into python. Then, for each word in the file, use the same regular expressions you just used in awk to determine the solutions. Use the python regex example shown earlier for reference.
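One way to mirror a chain of awk filters in python is to apply several small re.search checks to each word. The sketch below is only an illustration, not the expected lab solution: the helper names read_words and filter_words, the sample word list, and the two patterns (starts with q, contains u) are all made up for the example.

```python
import re

def read_words(path):
    """Read one word per line from a dictionary file (hypothetical helper)."""
    with open(path) as infile:
        return [line.strip() for line in infile if line.strip()]

def filter_words(words, patterns):
    """Keep only the words that match every pattern, like piping awk into awk."""
    kept = []
    for word in words:
        if all(re.search(pattern, word) for pattern in patterns):
            kept.append(word)
    return kept

# Made-up sample data; a real program would call read_words on the dictionary.
sample_words = ["quilt", "tile", "qat", "unit"]
print(filter_words(sample_words, [r"^q", r"u"]))
```

Each pattern plays the role of one awk command in the pipeline, so a word survives only if it passes every stage.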
Once you have that working, add two other features:
- Your python program should read the letters of the puzzle from the command line.
- Optional: Your program should produce the maximum achievable score.
To read the letters of the puzzle from the command line, you will need to access a special list, sys.argv. To access that, you will need to import sys at the top of your program. In python, argv stores the name of your python program in argv[0] and then any other values you typed starting in argv[1]. See the provided python argv example to see this in use.
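As a rough sketch of the argv handling described above (not the provided argv example itself), the function below takes the argument list as a parameter so it can be tried without the command line. The function name main, the usage message, and the choice to lower-case the letters are assumptions made for illustration.

```python
import sys

def main(argv):
    """Return 0 on success, or 1 if the puzzle letters are missing."""
    if len(argv) < 2:
        # argv[0] is the program name, so fewer than 2 entries means no letters.
        print("usage: python3 bee.py <letters>", file=sys.stderr)
        return 1
    letters = argv[1].lower()
    print("Puzzle letters:", letters)
    return 0

# A real program would call main(sys.argv); a fixed list is used here for illustration.
main(["bee.py", "leiqrtu"])
```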
Output
When you are done, your program should work like this:
$ python3 bee.py
(Provide an error message to let the user know they should provide letters)
$ python3 bee.py leiqrtu
Words: 105
The words are:
eelier
...
You can also run your program like this if you’d like:
$ python3 bee.py < /data/cs91r-s25/scrabble/scrabble.txt
(Provide an error message to let the user know they should provide letters)
$ python3 bee.py leiqrtu < /data/cs91r-s25/scrabble/scrabble.txt
Words: 105
The words are:
eelier
...
Notice that eelier is not a valid word in the spellbee.org puzzle, and we report far more words than the NY Times expects because our dictionary contains more words than the NY Times dictionary. For this lab, that’s fine!
Questions
Put the answer to these questions, and any additional questions, in your README.md file.
- Use asciinema to record interactions with your bee.py program to demonstrate that it is working.
Optional: Calculating points
In the spelling bee game, you get points for each valid word that you make. Four-letter words are worth 1 point. Longer words are worth one point per letter. However, if you make a word that uses every puzzle letter at least once (called a "pangram"), you get 7 bonus points. Given the scoring rules, calculate the score for each word and report only the total score of all the words you can make. HINT: The python set data structure is very useful for figuring out if a word is a pangram.
If you implement this, your new output would look something like this:
$ python3 bee.py
(Provide an error message to let the user know they should provide letters)
$ python3 bee.py leiqrtu
Words: 105 Points: 485
The words are:
eelier
...
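The scoring rules for this optional part can be sketched with a set comparison, as the hint suggests. This is a minimal illustration that assumes every word passed in is already a valid answer; the function names and the three sample words are made up, and whether each sample word appears in the Scrabble dictionary is not checked here.

```python
def word_score(word, letters):
    """Score one already-valid word: 1 point for 4 letters, else its length."""
    points = 1 if len(word) == 4 else len(word)
    if set(letters) <= set(word):
        # Pangram: every puzzle letter appears in the word at least once.
        points += 7
    return points

def total_score(words, letters):
    """Add up the scores of all the words."""
    return sum(word_score(word, letters) for word in words)

# Sample words chosen for illustration: 1 + 5 + (7 + 7) points.
print(total_score(["tile", "utile", "quilter"], "leiqrtu"))
```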
Optional: Generating hints
If you’d like some more python programming practice or enjoy the spelling bee puzzle, here’s an optional extension.
The NY Times provides hints. Can you generate similar hints for the user’s letters?
See the NY Times glossary of spelling bee terms if you’re unfamiliar with this grid.
Here’s what the hints looked like from the puzzle on January 1, 2025:
Center letter is in bold.
T A B G L N O
WORDS: 47, POINTS: 195, PANGRAMS: 1, BINGO
      4    5    6    7    8    Σ
A:    1    3    1    -    -    5
B:    5    2    2    1    -   10
G:    2    2    1    2    -    7
L:    1    1    -    1    1    4
N:    -    1    1    1    -    3
O:    1    -    -    -    -    1
T:    8    6    1    -    2   17
Σ:   18   15    6    5    3   47
Two letter list:
AB-1 AL-2 AT-2
BA-2 BL-5 BO-3
GA-2 GL-2 GN-1 GO-2
LA-1 LO-3
NA-3
ON-1
TA-8 TO-9
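If you try this extension, one possible starting point is to count words by first letter and length, and by two-letter prefix. The sketch below assumes you already have the list of answer words; hint_counts and format_grid are hypothetical names, the sample words are made up, and the layout only approximates the grid shown above (it omits the bottom Σ row).

```python
from collections import Counter

def hint_counts(words):
    """Count words by (first letter, length) and by two-letter prefix."""
    grid = Counter((word[0].upper(), len(word)) for word in words)
    prefixes = Counter(word[:2].upper() for word in words)
    return grid, prefixes

def format_grid(grid):
    """Lay out the counts with one row per first letter and one column per length."""
    letters = sorted({letter for letter, _ in grid})
    lengths = sorted({length for _, length in grid})
    lines = ["    " + "".join(f"{n:>4}" for n in lengths) + "   Σ"]
    for letter in letters:
        row = [grid[(letter, n)] for n in lengths]
        cells = "".join(f"{c if c else '-':>4}" for c in row)
        lines.append(f"{letter}:  {cells}{sum(row):>4}")
    return "\n".join(lines)

# Made-up sample words, used only to show the shape of the output.
grid, prefixes = hint_counts(["tile", "tilt", "utile"])
print(format_grid(grid))
print(" ".join(f"{p}-{n}" for p, n in sorted(prefixes.items())))
```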
2. Wordle assistant
Another popular word puzzle on the NY Times website is Wordle. Like Spelling Bee, the puzzle is updated every day, and there are lots of knock-off versions, such as hellowordl.net. As with spellbee.org, we can link directly to a specific puzzle. Play this wordle puzzle to remind yourself how to play if you have never played or haven’t played in a while and need a refresher.
We will write an assistant for wordle so that, at any point in the game, we can get a list of all of the valid guesses we have remaining. For example, using the example puzzle above, if we’d guessed "STEAL" (follow along in another tab), we would know the following information:
- There is at least one S in the puzzle, but it is not in the first position.
- There is at least one E in the puzzle, but it is not in the third position.
- There is no T, A, or L in the puzzle.
Given a dictionary containing five-letter words, your assistant should provide a list of all words that match those criteria.
If we then guessed "HOUSE" (follow along in another tab), we would know:
- There is at least one S in the puzzle, but it is not in the first or fourth position.
- There is at least one E in the puzzle. One of them is in the fifth position and, if there are more E’s, they are not in the third position.
- There is at least one U in the puzzle, but not in the third position.
- There is no T, A, L, H or O in the puzzle.
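To make the clues above concrete, each one can be written as a small regular expression check. The sketch below is one possible encoding in python, not the expected solution; the function name is made up, and it assumes the word is lowercase and exactly five letters long.

```python
import re

def matches_after_steal_and_house(word):
    """Check one lowercase five-letter word against the clues listed above."""
    checks = [
        re.search(r"^[^s]..[^s]e$", word),  # S not 1st or 4th; E in 5th
        re.search(r"^..[^eu]", word),       # neither E nor U in 3rd position
        re.search(r"s", word),              # at least one S somewhere
        re.search(r"u", word),              # at least one U somewhere
        not re.search(r"[talho]", word),    # none of T, A, L, H, O
    ]
    return all(checks)

for candidate in ["issue", "house", "reuse"]:
    print(candidate, matches_after_steal_and_house(candidate))
```

Keeping one simple pattern per clue is close in spirit to piping one awk command into another.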
You will use the same Scrabble dictionary you used in the Spelling Bee assistant to build your Wordle assistant.
2.1 Using awk to prototype a solution
In a single command line, use awk to produce all of the valid remaining words after making at least one guess.
For the solution we are looking for, you can pipe the output of one awk command into another, but you shouldn’t use any awk syntax that is more complicated than the examples provided above.
If you’d like to use DataCamp’s Regular Expressions Cheat Sheet to figure out how to solve it with more complex regular expressions, you’re welcome to do so, but be sure to show us how you would do it using only what we’ve taught you, too.
You do not have to worry about what happens when the word you guess has repeated letters and they aren’t all in the answer. For example, if you guessed ISSUE in the puzzle above, you would get ISSUE. This tells you that there is only one S in the answer. However, you don’t have to explicitly handle that case; you just need to make sure there is no S in the second position.
Questions
Answer each of these questions as they relate to this puzzle. For each of the first 4 questions:
- provide text output in a markdown code block, and
- demonstrate that they all work in a single asciinema recording (see Section 3).
- To start, you guess STEAL and your output is STEAL. Given this information, what awk command (or series of awk commands) would you use to show all the remaining valid words?
- Your next guess is VIDEO and the output is VIDEO. Given this information and the information from your first guess, what awk command (or series of awk commands) would you use to show all the remaining valid words?
- Your third guess is PRICE and the output is PRICE. Given this information and the information from your previous guesses, what awk command (or series of awk commands) would you use to show all the remaining valid words?
- Your fourth guess is BEING and the output is BEING. Given this information and the information from your previous guesses, what awk command (or series of awk commands) would you use to show all the remaining valid words? If your awk commands were correct, you should only get two words as your output here.
For this last question, answer in your README.md file as plain text:
- Although you might not have explicitly written it down, what algorithm are you using (in your head) to construct the awk commands?
2.2 Using python to improve the assistant
Write a solution to the Wordle assistant in python. Start by hard-coding the examples you used from the command-line in awk. The regular expressions will be the same. As with the spelling bee assistant, you will have to read in a dictionary of words into python as a first step.
Once you are sure that the hard-coded examples are providing the same output as the awk examples, modify your program so that it accepts multiple command-line arguments in the following format:
$ python3 wordle.py <guessed_letters> <col1> <col2> <col3> <col4> <col5>
The <guessed_letters> argument will include all of the letters you’ve guessed so far. You can duplicate letters here and your program should handle that without a problem.
Each of the 5 <col> arguments will be either:
- The . symbol if no yellow or green letters have appeared at this position yet, or
- An upper-case letter for each green letter at this position AND a lower-case letter for each yellow letter at this position.
Here is an example of how you would run the program after making your first guess of "STEAL" in the puzzle above:
$ python3 wordle.py steal . . e . .
Be sure you understand why that is how you would run the program given the instructions above.
After making your second guess, "VIDEO", you could run your program like this:
$ python3 wordle.py stealvido . i e e .
After making your third guess, "PRICE", you could run your program like this:
$ python3 wordle.py pricestealvido . i Ie e e
Just like with the awk commands, your program should print all remaining valid words.
The ex1.py file may provide some guidance on how to code this up.
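One possible way to interpret the command-line format above is to turn the five column arguments into a single position pattern plus two letter sets. The sketch below is an illustration under stated assumptions, not the required design: build_checks and is_candidate are hypothetical names, words are assumed to be lowercase, and repeated letters are not handled specially.

```python
import re

def build_checks(guessed, cols):
    """Turn the arguments into a position pattern plus present/absent letter sets."""
    present = set()
    position_parts = []
    for col in cols:
        green = [ch.lower() for ch in col if ch.isupper()]
        yellow = [ch for ch in col if ch.islower()]
        present.update(green, yellow)
        if green:
            position_parts.append(green[0])                     # letter is fixed here
        elif yellow:
            position_parts.append("[^" + "".join(yellow) + "]")  # letters ruled out here
        else:
            position_parts.append(".")                           # nothing known yet
    # Guessed letters that were never yellow or green are not in the answer.
    absent = set(guessed.lower()) - present
    pattern = "^" + "".join(position_parts) + "$"
    return pattern, present, absent

def is_candidate(word, guessed, cols):
    """Return True if the word is consistent with everything known so far."""
    pattern, present, absent = build_checks(guessed, cols)
    return (re.search(pattern, word) is not None
            and all(letter in word for letter in present)
            and not any(letter in word for letter in absent))

# Arguments from the third example above: wordle.py pricestealvido . i Ie e e
print(build_checks("pricestealvido", [".", "i", "Ie", "e", "e"])[0])
```

A column such as Ie needs no extra handling in this sketch: once the green letter fixes the position, the yellow letter is excluded from it automatically.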
Questions
- Use asciinema to record your interactions with your wordle.py program to demonstrate that it is working.
- What improvements would you like to make to your current program (aside from handling repeated letters properly)?
- Would you make any changes to the program’s interface? If no, why not? If yes, what would the new interface look like? Show some sample interactions with the modified interface in a markdown code block.
3. How to turn in your solutions
Edit the README.md file that we provided to discuss how you solved each problem. For each part (spelling bee with awk, spelling bee with python, wordle with awk, wordle with python), use asciinema to record a terminal session and include it in your README.md.
For example, here is an asciinema recording of the awk examples shown above:
[asciinema recording of the awk examples]
To record your session in asciinema, use the following command:
$ asciinema rec -i 2 awk_regex.cast
When your session is over, convert it to a .gif file:
$ agg awk_regex.cast awk_regex.gif
Add any .cast and .gif files to your repository.
You can name your .cast files anything you’d like, but you will need at least four of them to include in your writeup. Do not worry if you make typos as you are working in a recording: we are not evaluating your ability to type!