CS21 Lab 3: Loops and Conditionals
Due Saturday, September 24, by 11:59pm
Programming Tips
As you write programs, use good programming practices:
- Use a comment at the top of the file to describe the purpose of the program (see example).
- All programs should have a main() function (see example).
- Use variable names that describe the contents of the variables.
- Write your programs incrementally and test them as you go. This is really crucial to success: don’t write lots of code and then test it all at once! Write a little code, make sure it works, then add some more and test it again.
- Don’t assume that if your program passes the sample tests we provide that it is completely correct. Come up with your own test cases and verify that the program is producing the right output on them.
- Avoid writing any lines of code that exceed 80 columns.
- Always work in a terminal window that is 80 characters wide (resize it to be this wide).
- In vscode, at the bottom right in the window, there is an indication of both the line and the column of the cursor.
Are your files in the correct place?
- Make sure all programs are saved to your cs21/labs/03 directory! Files outside that directory will not be graded.
$ update21
$ cd ~/cs21/labs/03
$ pwd
/home/username/cs21/labs/03
$ ls
Questions-03.txt
(should see your program files here)
Goals
The goals for this lab assignment are:
- practice using if-else statements
- continue working with for loops and accumulators
- use formatted printing
- learn how to import modules that extend python’s capabilities
1. Leet Speak
Leet speak is a method of modifying text by replacing some letters with other symbols (such as numbers). We will change every upper- and lower-case A to 4, every upper- and lower-case E to 3, every upper- and lower-case L to 1, every upper- and lower-case O to 0, and every upper- and lower-case T to 7.
In the file leet.py, write a program that takes a string as input and prints out the Leet speak version of the string.
You must use an accumulator to solve this problem.
$ python3 leet.py
Type in some text to have it made 1337!
Input: Leet speak
Leet version: 1337 sp34k

$ python3 leet.py
Type in some text to have it made 1337!
Input: Taylor Swift's new album comes out next month
Leet version: 74y10r Swif7's n3w 41bum c0m3s 0u7 n3x7 m0n7h

$ python3 leet.py
Type in some text to have it made 1337!
Input: tattletale tattoo
Leet version: 7477137413 747700

$ python3 leet.py
Type in some text to have it made 1337!
Input: unbudging skunks
Leet version: unbudging skunks
Take a look at the last two examples. The string tattletale tattoo has every letter converted to a number, so the output is a bit strange looking. And the string unbudging skunks has no letters converted, so the output looks the same as the input. That’s to be expected!
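If the accumulator pattern for strings is unfamiliar, here is a minimal sketch of it. It is deliberately not the leet.py solution: the function name swap_symbols and the substitutions (S to $, I to !) are made up for illustration. The idea is to start with an empty string and add one character to it on each pass through the loop.

```python
"""
Sketch of the string-accumulator pattern (not the leet.py solution).
The substitutions here (S -> $, I -> !) are made up for illustration.
"""

def swap_symbols(text):
    # start with an empty accumulator and grow it one character at a time
    result = ""
    for ch in text:
        if ch == "s" or ch == "S":
            result = result + "$"
        elif ch == "i" or ch == "I":
            result = result + "!"
        else:
            # any other character is copied over unchanged
            result = result + ch
    return result

def main():
    sample = "Mississippi"   # fixed sample text instead of user input
    print("Input: %s" % sample)
    print("Output: %s" % swap_symbols(sample))

main()
```

A string with none of the targeted letters comes back unchanged, just like unbudging skunks in the Leet speak examples.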
2. Digitally inspecting texts
We will gradually build up a program that allows users to investigate texts (e.g. novels) that are available through a python library that is installed on the CS lab machines.
You will start by writing a program that searches a text for a word that the user enters. We will show you how to read in all of the words from Jane Austen’s "Emma" as a list of strings. Your program will ask the user what word they want to search for. You will search through the words of the novel and report back how many times the word was found.
To get a list of words from the text, you’ll use the nltk library. Put the following line at the top of your program, which you will save in the file find.py.
import nltk
Once the nltk library has been imported, you can get the words from any number of files that are included with nltk. We will start with Jane Austen’s "Emma", which is accessible through the nltk library using the file name 'austen-emma.txt'. The nltk library will automatically read that file for you and store all of the words in a list of strings:
file_name = 'austen-emma.txt'
words = nltk.corpus.gutenberg.words(file_name)
In case you are wondering what this looks like, let’s just type those lines into python3:
$ python3
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> file_name = 'austen-emma.txt'
>>> words = nltk.corpus.gutenberg.words(file_name)
>>> words
['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']', ...]
>>> words[0]
'['
>>> words[3]
'Jane'
Notice that words is just a list of strings. The first word (if you can call it that) is the string '['. The second word is 'Emma'. The third word is 'by'. The fourth word (at index 3) is 'Jane'. You do not need to make this list. This list of words is automatically created for you when you run these lines:
file_name = 'austen-emma.txt'
words = nltk.corpus.gutenberg.words(file_name)
Now that you have a list of all the words in the text, you can try to count how many times a specific word appears. Ask the user to type in a word and count how many times that word appears in the list of words you read in. You are only looking for EXACT matches. For example, if the user types the, you will match only against the word the, but not words like there (which starts with the) or other (that has the in the middle) or even The (with a capital letter).
Here are four examples of the running program. The user input is the word typed after the prompt.
$ python3 find.py
Searching austen-emma.txt
What word do you want to search for? the
Found 4844 times.

$ python3 find.py
Searching austen-emma.txt
What word do you want to search for? horse
Found 9 times.

$ python3 find.py
Searching austen-emma.txt
What word do you want to search for? dog
Found 1 times.

$ python3 find.py
Searching austen-emma.txt
What word do you want to search for? giraffe
Found 0 times.
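To see what an exact match means in code, here is a minimal sketch that counts matches with an accumulator. It assumes a small hand-typed list (SAMPLE_WORDS) standing in for the list that nltk builds, and the function name count_exact is hypothetical. The == comparison is what makes the match exact: The, other, and there are all skipped.

```python
"""
Sketch: count exact matches of a word in a list of strings.
SAMPLE_WORDS is a hand-typed stand-in for the list nltk provides.
"""

SAMPLE_WORDS = ["The", "dog", "and", "the", "other", "dog",
                "sat", "there", "by", "the", "door"]

def count_exact(words, target):
    # numeric accumulator: add 1 each time a word equals the target
    count = 0
    for word in words:
        if word == target:   # exact match only: case and length must agree
            count = count + 1
    return count

def main():
    target = "the"   # fixed sample word instead of user input
    print("Found %d times." % count_exact(SAMPLE_WORDS, target))

main()
```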
2.1. Reporting the length of the text
You would expect that words would appear more often in longer texts than in shorter texts. Therefore, when investigating a text, it’s helpful to know how long the text is. Add information about the length of the text to your output. You can get the number of words using the len function on the list of words.
words = nltk.corpus.gutenberg.words('austen-emma.txt')
length = len(words)
Once you know the length of the text, report it as part of your output.
$ python3 find.py
Searching austen-emma.txt
Enter a word to find: Emma
Found 865 times.
There are 192427 words in the text.
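One way to produce those two output lines with formatted printing is sketched below. The helper build_report is a hypothetical name, the % style of formatting is only one option, and the short list is a hand-typed stand-in for the real list of words.

```python
"""
Sketch: report a match count and the length of the text.
The word list is a tiny hand-typed stand-in for the nltk list.
"""

def build_report(count, length):
    # %d formats an integer into the string
    line1 = "Found %d times." % count
    line2 = "There are %d words in the text." % length
    return line1 + "\n" + line2

def main():
    words = ["Emma", "Woodhouse", ",", "handsome", ",", "clever",
             ",", "and", "rich"]
    count = 0
    for word in words:
        if word == "Emma":
            count = count + 1
    # len() gives the number of items in the list
    print(build_report(count, len(words)))

main()
```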
2.2. Choosing the novel to read from
There are a handful of novels available to you in the nltk library, including Jane Austen’s "Emma" that you used above. Here are some of the options available:
Novel | File name |
---|---|
"Emma" by Jane Austen | austen-emma.txt |
"Persuasion" by Jane Austen | |
"Sense and Sensibility" by Jane Austen | |
"Alice’s Adventures in Wonderland" by Lewis Carroll | |
"Leaves of Grass" by Walt Whitman | whitman-leaves.txt |
"Julius Caesar" by William Shakespeare | |
"Hamlet" by William Shakespeare | |
"Macbeth" by William Shakespeare | |
"Stories to Tell to Children" by Sarah Cone Bryant | bryant-stories.txt |
"The Parent’s Assistant" by Maria Edgeworth | |
"Moby Dick" by Herman Melville | melville-moby_dick.txt |
Add a menu to your program that allows the user to choose which text they’d like to search in. Make Jane Austen’s "Emma" the first option, then choose at least three other texts you’d like to add to the menu. Here is an example of the program running:
$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: a
Searching austen-emma.txt
Enter a word to find: dog
Found 1 times.
There are 192427 words in the text.

$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: b
Searching melville-moby_dick.txt
Enter a word to find: dog
Found 17 times.
There are 260819 words in the text.

$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: c
Searching bryant-stories.txt
Enter a word to find: dog
Found 14 times.
There are 55563 words in the text.

$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: d
Searching whitman-leaves.txt
Enter a word to find: dog
Found 3 times.
There are 154883 words in the text.
If the user enters an invalid choice of text, let the user know that it was an invalid choice, then search "Emma" anyway, even though they didn’t select it. For example:
$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: f
Invalid selection.
Searching austen-emma.txt
Enter a word to find: dog
Found 1 times.
There are 192427 words in the text.
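The menu logic, including the fallback to "Emma", can be sketched with an if/elif/else chain. This is one possible arrangement, not the required one: the function name pick_file and its two return values are assumptions made for this sketch, and the four file names are the ones shown in the sample runs above.

```python
"""
Sketch: map a menu selection to a file name, defaulting to "Emma".
"""

def pick_file(selection):
    # returns the file name and whether the selection was a valid choice
    if selection == "a":
        return "austen-emma.txt", True
    elif selection == "b":
        return "melville-moby_dick.txt", True
    elif selection == "c":
        return "bryant-stories.txt", True
    elif selection == "d":
        return "whitman-leaves.txt", True
    else:
        # anything else is invalid: fall back to "Emma"
        return "austen-emma.txt", False

def main():
    # fixed sample selections instead of user input
    for selection in ["b", "f"]:
        file_name, valid = pick_file(selection)
        if not valid:
            print("Invalid selection.")
        print("Searching %s" % file_name)

main()
```

Putting the fallback in the final else means every possible input ends up with some file name, so the rest of the program never has to handle a missing choice.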
2.3. Concordance (OPTIONAL)
When doing searches like these, it’s often helpful to see some context of where the word occurred in the text. A common way to do this is to use the Key Word in Context (KWIC) method.
Create a KWIC display of the words you found in your searches by displaying the three words before and three words after the word you were searching for. See if you can figure out how to keep the word you are searching for (the "key word") centered in the output. For example, here is the KWIC output when searching for dog in "Stories to Tell Children":
$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: c
Searching bryant-stories.txt
Enter a word to find: dog
         new puppy - dog to take home
         the puppy - dog was dead .
           A puppy - dog , Mammy ,"
           a puppy - dog ! The way
           a puppy - dog is to take
         the puppy - dog ' s neck
         the puppy - dog on the ground
      palace his pet dog ran to meet
      But the little dog was so used
         kick my own dog , if I
 pretty little white dog . The keeper
the beautiful little dog to the court
     keep the little dog from growing ,
              Am I a dog , that thou
Found 14 times.
There are 55563 words in the text.
Be careful about words that occur near the start or end of the novel since they may not have 3 words before or after them. For example, Fancy is the last word of "Leaves of Grass", so the final line of the concordance cannot contain 3 words after Fancy:
$ python3 find.py
Choose the text to search from the following choices:
a. "Emma" by Jane Austen
b. "Moby Dick" by Herman Melville
c. "Stories to Tell Children" by Sarah Cone Bryant
d. "Leaves of Grass" by Walt Whitman
Your selection: d
Searching whitman-leaves.txt
Enter a word to find: Fancy
 - Bye My Fancy Good - bye
 - Bye My Fancy ! Good -
 - bye my Fancy ! Farewell dear
 - bye my Fancy . Now for
 - bye my Fancy . Yet let
hail ! my Fancy .
Found 6 times.
There are 154883 words in the text.
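One possible way to build a KWIC line, handling the start and end of the text, is sketched below. The function names kwic_line and kwic, the width parameter, and the sample list are all assumptions made for this sketch. It clamps the slice boundaries so they never run past either end of the list, and right-justifies the left context so the key word lines up in one column.

```python
"""
Sketch: Key Word in Context (KWIC) lines from a list of words.
SAMPLE_WORDS is a hand-typed stand-in for the nltk list.
"""

SAMPLE_WORDS = ["But", "the", "little", "dog", "was", "so", "used",
                "to", "my", "own", "dog"]

def kwic_line(words, i, width):
    # clamp so the slices never run past the ends of the list;
    # a negative start would wrap around to the end of the list
    start = max(0, i - 3)
    end = min(len(words), i + 4)
    left = " ".join(words[start:i])
    right = " ".join(words[i + 1:end])
    # right-justify the left context so the key word lines up
    line = "%s %s %s" % (left.rjust(width), words[i], right)
    return line.rstrip()   # drop the trailing space if nothing follows

def kwic(words, target, width=20):
    lines = []
    for i in range(len(words)):
        if words[i] == target:
            lines.append(kwic_line(words, i, width))
    return lines

def main():
    for line in kwic(SAMPLE_WORDS, "dog", 14):
        print(line)

main()
```

The second match in the sample is the last word of the list, so its line simply ends at the key word.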
3. Answer the Questionnaire
Each lab will have a short questionnaire at the end. Please edit the Questions-03.txt file in your cs21/labs/03 directory and answer the questions in that file.
Once you’re done with that, you should run handin21 again.
Submitting lab assignments
Remember to run handin21 to turn in your lab files! You may run handin21 as many times as you want. Each time it will turn in any new work. We recommend running handin21 after you complete each program or after you complete significant work on any one program.
Logging out
When you’re done working in the lab, you should log out of the computer you’re using.
First quit any applications you are running, like the browser and the terminal. Then click on the logout icon and choose "log out".
If you plan to leave the lab for just a few minutes, you do not need to log out. It is, however, a good idea to lock your machine while you are gone. You can lock your screen by clicking on the lock icon. PLEASE do not leave a session locked for a long period of time. Power may go out, someone might reboot the machine, etc. You don’t want to lose any work!