Week 7: top-down design (TDD)
Monday
This week and next week we are working on top-down design. This is a useful technique for writing larger, more complex programs. Today (Monday) we will learn about file input and output, so we can use data stored in files in our programs. On Wednesday we will start learning TDD.
Files
Motivations and possible uses:
- video game data files (read in terrain data; keep track of high scores)
- grader program: store student grade data in a file (don't have to type it in each time the program runs)
- iTunes: how does iTunes keep track of number of plays for each song??
Files can be plain text files, like the ones you edit with atom.
syntax
The basic syntax for opening a file is:
myfile = open(filename,mode)
where filename is the name of a file, and mode is the mode used for opening: usually read ("r") or write ("w") mode. Both arguments are strings, and myfile is just the variable I picked to store the file object returned by the open() function.
Here is an example of opening a file called poem.txt for reading, and storing the file object in a variable called infile:
infile = open("poem.txt", "r")
examples
Once you have a file object, you can use the input and output methods on the object.
OUTPUT
Here's how to open a file for writing (note: myfile is a variable name that I choose, and "newfile" is the name of the file to write to):
$ python3
>>> myfile = open("newfile", 'w')
>>> myfile.write("write this to the file \n")
>>> myfile.write("and this.... \n")
>>> myfile.close()
and here are the results of this:
$ ls
newfile
$ cat newfile
write this to the file
and this....
What happens if we leave out the \n on each line??
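One way to find out is to try it. The sketch below repeats the two write() calls without the \n and then reads the file back. The temporary directory is only an assumption so the experiment does not overwrite a real file.

```python
import os
import tempfile

# hypothetical scratch location (an assumption, not part of the notes)
path = os.path.join(tempfile.mkdtemp(), "newfile")

myfile = open(path, "w")
myfile.write("write this to the file ")   # no \n at the end
myfile.write("and this.... ")             # continues on the same line
myfile.close()

# read the whole file back as one string to see what was stored
infile = open(path, "r")
contents = infile.read()
infile.close()

print(repr(contents))   # repr() makes any newline characters visible
```

write() only puts into the file what you give it, so without the \n both strings end up on a single line.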
INPUT
I have a file called words.txt with a few words in it:
$ cat words.txt
happy
computer
lemon
zebra
To open a file for reading, use 'r' mode:
>>> infile = open("words.txt", 'r')
File words.txt must exist, otherwise we get an error.
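In Python 3 that error is a FileNotFoundError. If you would rather print a message than crash, something like the following sketch could work. The helper name openForReading and the file name used to try it out are made up for illustration.

```python
def openForReading(filename):
    """return an open file object, or None if the file does not exist"""
    try:
        return open(filename, "r")
    except FileNotFoundError:
        print("Could not find file: %s" % filename)
        return None

# a file name that (hopefully) does not exist, just to see the message
infile = openForReading("/no/such/dir/no-such-file.txt")
if infile is not None:
    infile.close()
```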
The infile variable, which is a FILE type, can be used as a sequence (e.g., in a for loop!):
>>> for line in infile:
...     print(line)
...
happy

computer

lemon

zebra

We can use the for loop like above, or we could use the file methods: readline() to read one line, readlines() to read them all at once.
>>> # need to close and reopen to get back to start of file
>>> infile.close()
>>>
>>> infile = open("words.txt", "r")
>>> word = infile.readline()
>>> print(word)
happy

>>> word = infile.readline()
>>> print(word)
computer

>>> infile.close()
>>> infile = open("words.txt", "r")
>>> words = infile.readlines()
>>> print(words)
['happy\n', 'computer\n', 'lemon\n', 'zebra\n']
So readlines() reads in EVERYTHING and puts each line into a python list.
NOTE: the newline characters are still part of each line!
Sometimes you want to read in EVERYTHING, all at once.
Sometimes it's better to read data in line-by-line and process each line as you go (use the for loop: for line in infile).
File I/O Notes:
- reading from and writing to files is usually S L O W
- for this reason, we usually read in data at the beginning of a program and store it in a list or other data structure (i.e., if we need the data throughout the program, it's much faster to refer to the list rather than the file)
- also, reading from a file is similar to watching a movie on a VHS tape: at the end of the movie, you have to rewind the tape to get back to the beginning. Once we run the for line in infile loop above, we are at the end of the file. You can "rewind" the file by closing and reopening it (or by using the seek() method in python)
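Here is a small sketch of the "rewind" idea using seek(0), which moves back to the start of the file. The temporary copy of words.txt is an assumption so the example can run anywhere.

```python
import os
import tempfile

# build a throwaway copy of words.txt (location is an assumption)
path = os.path.join(tempfile.mkdtemp(), "words.txt")
outfile = open(path, "w")
outfile.write("happy\ncomputer\nlemon\nzebra\n")
outfile.close()

infile = open(path, "r")
count = 0
for line in infile:            # this loop runs to the end of the file
    count += 1

leftover = infile.readline()   # nothing left to read: empty string
infile.seek(0)                 # "rewind" to the beginning
again = infile.readline()      # first line is available again
infile.close()

print(count, repr(leftover), repr(again))
```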
str methods: strip() and split()
Suppose we have a file of usernames and grades, like this:
$ cat grades.txt
saul: 93
thibault: 92.5
lauri: 100
andy: 70
jeff: 67.5
kevin: 85
If I want to find the average of all of those grades, I need to read in each line, then somehow pull out the grade and store it. This is where the str methods strip() and split() can be used. Here are examples of each:
>>> infile = open("grades.txt", "r")
>>> line = infile.readline()
>>> print(line)
saul: 93

>>> data = line.split(":")
>>> print(data)
['saul', ' 93\n']
>>> grade = float(data[1])
>>> print(grade)
93.0
So split() just splits a string and returns the results as a list. In the above example we split the string on a colon (":"). Here are a few more examples of split(). By default (with no arguments given), it splits the string on whitespace.
>>> S = "a,b,c,d,e,f,g"
>>> L = S.split(",")
>>> print(L)
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> phrase = "the quick brown fox jumped over the lazy dog"
>>> words = phrase.split()
>>> print(words)
['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
And strip() will strip off leading and trailing characters. Again, by default it strips off whitespace. If you provide an argument, it will strip off those characters instead:
>>> S = " hello\n"
>>> print(S)
 hello

>>> print(S.strip())
hello
>>>
>>> word = "Hello!!!!!"
>>> print(word.strip("!"))
Hello
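The two methods are often chained on one line of input: strip() removes the trailing newline, then split() breaks up what is left. The sketch below does this for a single line in the grades.txt format. The function name parseLine is made up for this example.

```python
def parseLine(line):
    """turn one 'name: grade' line into a name and a float grade"""
    data = line.strip().split(":")   # e.g. ['saul', ' 93']
    name = data[0]
    grade = float(data[1])           # float() ignores the leading space
    return name, grade

name, grade = parseLine("thibault: 92.5\n")
print(name, grade)
```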
Your turn
Can you write a program to read the grades.txt file into a python list of grades, and then calculate the average grade?
Here’s an example of what we want:
$ python3 grader.py
[93.0, 92.5, 100.0, 70.0, 67.5, 85.0]
average grade: 84.7
challenge
Once you have the grades in a list, can you find the highest and lowest grades?
$ python3 grader.py
[93.0, 92.5, 100.0, 70.0, 67.5, 85.0]
average grade: 84.7
highest grade: 100.0
lowest grade: 67.5
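If you get stuck on the arithmetic part, here is one possible sketch that starts from a list of grades that has already been read in (reading the file is still up to you). It assumes the built-in sum(), max(), and min() functions are allowed; the function name summarize is made up.

```python
def summarize(grades):
    """return the average, highest, and lowest of a non-empty list of grades"""
    average = sum(grades) / len(grades)
    return average, max(grades), min(grades)

grades = [93.0, 92.5, 100.0, 70.0, 67.5, 85.0]
average, highest, lowest = summarize(grades)
print(grades)
print("average grade: %.1f" % average)
print("highest grade: %.1f" % highest)
print("lowest grade: %.1f" % lowest)
```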
Wednesday
top-down design
As we write bigger and more complex programs, designing them first will save time in the long run. Similar to writing an outline for a paper, using top-down design means we write out main() first, using functions we assume will work (and that we will write later). We also decide on any data structures we will need.
What we want to avoid is writing a whole bunch of functions and code, and then realizing we forgot something, which changes some of the functions, which changes other functions, and so on.
Furthermore, we want a way to test each function as we write it. Many first-time programmers will write lots of functions and then start testing them. If each function has a few bugs in it, this makes it a lot harder to debug.
A typical top-down design includes the following:
- main() all written out
- function stubs written (def with params, function comment, dummy return value)
- data structures clearly defined (store data in a list? an object? a list of objects? something else?)
- the design should run without any syntax errors
- running the design should show how the program will eventually work
Once you have the design all written out, you can now attack each function one-at-a-time. Get the function to work. Test it to make sure it works, then move on to the next function.
TDD example
Suppose we want to write this square word game:
$ python3 squarewords.py

l|l|e
d|c|o
r|r|a

word 1? corralled
Correct! Score = 10

g|p|i
l|l|a
g|i|n

word 2?
Incorrect...word was: pillaging Score = 0

d|i|n
g|a|c
c|o|r

word 3? according
Correct! Score = 10

s|n|i
d|a|o
o|t|n

word 4? quit
The game uses 9-letter words, and displays them in a 3x3 box, where the letters run either vertically or horizontally, and the start of the word can be anywhere in the 3x3 box. The user’s job is to figure out each 9-letter word.
Here’s my TDD for the above program:
"""
squareword puzzle game as example of tdd

J. Knerr
Fall 2019
"""

from random import *

def main():
    words = read9("/usr/share/dict/words")
    score = 0
    wordnum = 0
    done = False
    while not done:
        word = words[wordnum]
        display(word)
        answer = getInput(wordnum)
        if answer == "quit":
            done = True
        elif answer == word:
            score += 10
            print("Correct! Score = %d" % (score))
        else:
            score -= 10
            print("Incorrect...word was: %s Score = %d" % (word,score))
        wordnum += 1

def display(word):
    """display word in 3x3 grid with random start position"""
    print(word)

def getInput(n):
    """get user's guess, make sure it's valid, return valid guess"""
    guess = input("word %d? " % (n+1))
    # should allow 9-letter word, quit, and empty string
    return guess

def read9(filename):
    """read all 9-letter words from file, shuffle the order, return list"""
    words = ["aaaaaaaaa","bbbbbbbbb","ccccccccc"]
    return words

main()
Notice that main() is completely written! And the goal is that it shouldn't need to change much as I implement the remaining functions.
Also, the program runs, but doesn’t really do much yet (since I haven’t really written all of the functions). Here’s what it looks like so far:
$ python3 design-squarewords.py
aaaaaaaaa
word 1? 123
Incorrect...word was: aaaaaaaaa Score = -10
bbbbbbbbb
word 2? bbbbbbbbb
Correct! Score = 0
ccccccccc
word 3? quit
Now, since I have a working program (it runs without syntax errors!), I can attack each function separately. I want to write a function and thoroughly test it before I move on to the next function.
write the getInput(n) function
Can you write the getInput(n) function? It should keep asking the user for input until it gets a valid string: either a 9-letter word, "quit", or the empty string.

a|r|y
m|a|x
i|l|l

word 1? hello
Please enter a 9-letter word!
word 1? 123456789
Please enter a 9-letter word!
word 1? wwww eeee
Please enter a 9-letter word!
word 1? abcdefghi
Incorrect...word was: maxillary Score = -10
Friday
review of squarewords functions
Here's one way to write the getInput(n) function from the squarewords.py file:
def getInput(n):
    """get user's guess, make sure it's valid, return valid guess"""
    while True:
        guess = input("word %d? " % (n+1))
        # should allow 9-letter word, quit, and empty string
        if guess == "" or guess == "quit":
            return guess
        elif len(guess)==9 and guess.isalpha()==True:
            return guess
        else:
            print("Please enter a 9 letter word!!")
The while True is just an infinite loop. I use it here to get the loop going, and not have to worry about a specific condition. Because it's an infinite loop, I need to make sure there's a way out. That's what the return guess lines do: return the user's guess back to main(), so both the while loop and the function are done. The only way out of this infinite loop is if we get valid input from the user. Otherwise we print the error message ("Please enter a 9 letter word!!") and loop back to the input() call at the top of the loop.
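Because getInput() calls input(), it is awkward to test automatically. One option (not what we did in class, just a sketch) is to pull the validity check out into its own function, here given the made-up name isValid(), so the rules can be checked without typing anything.

```python
def isValid(guess):
    """return True for a 9-letter word, "quit", or the empty string"""
    if guess == "" or guess == "quit":
        return True
    return len(guess) == 9 and guess.isalpha()

# try the same inputs used in the sample run above
for guess in ["hello", "123456789", "wwww eeee", "abcdefghi", "quit", ""]:
    print(repr(guess), isValid(guess))
```

getInput() could then loop until isValid(guess) is True.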
And here’s how to read in the data from the word file (one word per line), and only select the 9-letter, lowercase, all-alphabetic words:
def read9(filename):
    """read all 9-letter words from file, shuffle the order, return list"""
    words = []
    inf = open(filename, "r")
    data = inf.readlines()
    inf.close()
    # get 9-letter words from data, add to words
    for word in data:
        word = word.strip()
        if len(word)==9 and word.islower() and word.isalpha():
            words.append(word)
    return words
Note in this one how I use word.islower() and not word.islower()==True. Either would work, but word.islower() is already a boolean (True or False), so I don't need to compare it. I can just use it as is: if the word is 9 characters AND they are all lowercase AND they are all alphabetic characters (abcdefg….), add the word to the list of words.
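The display(word) function is still a stub at this point. Below is one possible sketch, not the version from class. It assumes the grid is filled in reading order (across the rows, or down the columns for the vertical case) and that the word wraps around from the last cell back to the first. The helper makeGrid() and its start and vertical parameters are made up so the layout can be tested without any randomness.

```python
from random import randrange, choice

def makeGrid(word, start, vertical):
    """return 3 row strings; word begins at cell number start and wraps around"""
    n = len(word)
    # letters[i] is the letter that lands in cell i (cells in reading order)
    letters = [word[(i - start) % n] for i in range(n)]
    rows = []
    for r in range(3):
        if vertical:
            # reading order goes down each column
            row = [letters[c * 3 + r] for c in range(3)]
        else:
            # reading order goes across each row
            row = letters[r * 3 : r * 3 + 3]
        rows.append("|".join(row))
    return rows

def display(word):
    """display word in 3x3 grid with random start position"""
    start = randrange(len(word))
    vertical = choice([True, False])
    for row in makeGrid(word, start, vertical):
        print(row)

display("corralled")
```

With start = 4 and vertical = False, makeGrid("corralled", 4, False) gives the same grid as the first puzzle in the sample run.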
top-down-design on flashcards
Here’s an example of the program I want: read in some flashcards from a file, ask the user each card, keep track of how many they get correct, print an appropriate message, ask if they want to go again.
$ python3 flashcards.py
Flashcards file? german.txt
==============================
essen: to eat
Correct!
- -- -- -- -- -- -- -- -- -- -
kaufen: to buy
Correct!
- -- -- -- -- -- -- -- -- -- -
besuchen: to visit
Correct!
- -- -- -- -- -- -- -- -- -- -
fahren: to travel
Correct!
- -- -- -- -- -- -- -- -- -- -
lieben: to jump
Nope....lieben = to love
- -- -- -- -- -- -- -- -- -- -
schlafen: to sleep
Correct!
- -- -- -- -- -- -- -- -- -- -
spielen: to run
Nope....spielen = to play
- -- -- -- -- -- -- -- -- -- -
trinken: to drink
Correct!
- -- -- -- -- -- -- -- -- -- -
verstehen: to keep
Nope....verstehen = to understand
- -- -- -- -- -- -- -- -- -- -
OK...not terrible.
Go again? (y/n) n
Bye!
And here’s a sample flashcards data file:
$ cat german.txt
essen:to eat
kaufen:to buy
besuchen:to visit
fahren:to travel
lieben:to love
schlafen:to sleep
spielen:to play
trinken:to drink
verstehen:to understand
So again, the goal of the top-down design process is:
- main() all written out
- function stubs written (def with params, function comment, dummy return value)
- data structures clearly defined (store data in a list? an object? a list of objects? something else?)
- the design should run without any syntax errors
- running the design should show how the program will eventually work
In class we did this part together. Here’s the design after we typed it all in:
"""
flashcards program

J. Knerr
Fall 2019
"""

from random import *

def main():
    filename = input("Flashcards file? ")
    cards = readFile(filename)
    done = False
    while not done:
        shuffle(cards)
        # ask questions
        ncorrect = flash(cards)
        # print message
        message(ncorrect, len(cards))
        # ask if they want to go again
        done = quit()

def quit():
    """return True if they want to quit"""
    result = input("Go again? ")
    if result != "y":
        return True
    else:
        return False

def message(ncorrect, nprobs):
    """print message to user based on percent correct"""
    print("Good work!")

def flash(cards):
    """given the cards, ask questions, return how many correct"""
    return 3

def readFile(filename):
    """open file, read in all data, return list-of-lists"""
    cards = [["q1","a1"], ["q2","a2"], ["q3","a3"]]
    return cards

main()
list-of-lists
Note the use of a "list of lists" in the readFile() function.
I want to read in each card and make a list, like this: ["essen", "to eat"].
Then I want to store all cards (which are lists) in a list.
So the final data structure will look like this:
cards = [['essen', 'to eat'], ['kaufen', 'to buy'], ['besuchen', 'to visit'],
['fahren', 'to travel'], ['lieben', 'to love'], ['schlafen', 'to sleep'],
['spielen', 'to play'], ['trinken', 'to drink'], ['verstehen', 'to understand']]
For the above, what is cards[1]? And what is cards[1][0]?
>>> cards = [['essen', 'to eat'], ['kaufen', 'to buy'], ['besuchen', 'to visit'], ...]
>>> cards[1]
['kaufen', 'to buy']
>>> cards[1][0]
'kaufen'
>>> cards[1][1]
'to buy'
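The same double indexing works inside a loop. The sketch below walks through a shortened cards list two ways: first with indices, then by unpacking each two-item card into two variables. The variable names question and answer are just my choice.

```python
cards = [['essen', 'to eat'], ['kaufen', 'to buy'], ['besuchen', 'to visit']]

# using indices: cards[i] is one card, cards[i][0] is its first item
for i in range(len(cards)):
    print(cards[i][0], "=", cards[i][1])

# unpacking: each card is a 2-item list, so it can fill two variables
questions = []
for question, answer in cards:
    questions.append(question)
print(questions)
```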
implement the readFile() function
Here's one way to write the readFile() function to read in the data and return it as a list of lists.
def readFile(filename):
    """open file, read in all data, return list-of-lists"""
    inf = open(filename,"r")
    lines = inf.readlines()
    inf.close()
    cards = []
    for line in lines:
        card = line.strip().split(":")
        cards.append(card)
    return cards
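With readFile() working, the flash(cards) stub is a natural next function to attack. Here is one possible sketch, based only on the sample run above. The extra ask parameter is my own addition (an assumption, not part of the class design): it defaults to input(), but a test can pass in a fake so the function can be checked without typing. The separator line is also just a guess at the one in the sample output.

```python
def flash(cards, ask=input):
    """given the cards, ask questions, return how many correct"""
    ncorrect = 0
    for question, answer in cards:
        guess = ask("%s: " % question)
        if guess.strip() == answer:
            print("Correct!")
            ncorrect += 1
        else:
            print("Nope....%s = %s" % (question, answer))
        print("-" * 30)   # separator between cards (exact style is a guess)
    return ncorrect

# test with canned answers instead of real keyboard input
fake = iter(["to eat", "to jump"])
result = flash([["essen", "to eat"], ["lieben", "to love"]],
               ask=lambda prompt: next(fake))
print(result)
```

Calling flash(cards) with no second argument behaves like the design in main(), asking the user at the keyboard.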