A file is a sequence of data that is stored on your computer. For many tasks, especially tasks that use large amounts of data, input data will come from one or more files, and you will write output to a file instead of to the screen.
A file is typically made of many lines. There is a special newline character that is stored at the end of each line in a file: "\n". However, it is not visible when you look at the file.
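One way to make the invisible newline visible is Python's built-in repr() function, which displays special characters as escape sequences. The sketch below uses a made-up two-line string in place of a real file, so the text itself is only an assumption for illustration.

```python
# A made-up string standing in for the contents of a two-line file.
contents = "first line\nsecond line\n"

# print() shows the text the way an editor would: each "\n" becomes a line break.
print(contents)

# repr() shows the newline characters explicitly as \n.
shown = repr(contents)
print(shown)

# Counting the newline characters tells us how many lines there are.
num_lines = contents.count("\n")
print(num_lines)
```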
Today, we'll work through a running example using a file named football.txt. First, let's look at this file in Atom.
Tate Golden Philadelphia Eagles
Agholor Nelson Philadelphia Eagles
Matthews Jeremy Philadelphia Eagles
Brown Antonio Pittsburgh Steelers
Bell Le'Veon Pittsburgh Steelers
Jackson DeSean TampaBay Buccaneers
In this file, data about each football player is stored on a separate line. Each line has the same format:
lastName firstName team mascot
To open a file, you do: <filevar> = open(<filename>, <mode>)
<filevar> is what your program will call the file.
<filename> is the name of the file, in this case football.txt.
<mode> is how you plan to use the file. "r" is for reading, "w" is for writing.
To close a file, you do: <filevar>.close()
There are a couple of ways that you can read data from a file. Here is perhaps the simplest:
infile = open("myfile", "r")
for line in infile:
    # process one line of the file
    pass  # placeholder so the empty loop body is valid Python
infile.close()
You can also read all the lines in at once:
infile = open("myfile", "r")
# lines is now a list of strings, one for each line of the file
lines = infile.readlines()
for line in lines:
    # process one line at a time
    pass  # placeholder so the empty loop body is valid Python
infile.close()
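To compare the two approaches on real data, the sketch below first creates a small temporary file (using the "w" mode mentioned above) and then reads it both ways. The file name and its three lines are made up for illustration.

```python
import os
import tempfile

# Create a small temporary file so the example is self-contained.
# (The file name and contents are made up for illustration.)
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "myfile.txt")
outfile = open(filename, "w")
outfile.write("alpha\nbeta\ngamma\n")
outfile.close()

# Approach 1: loop over the file one line at a time.
infile = open(filename, "r")
count = 0
for line in infile:
    count = count + 1
infile.close()

# Approach 2: read every line into a list of strings at once.
infile = open(filename, "r")
lines = infile.readlines()
infile.close()

print(count)
print(lines)
```

Both approaches see the same three lines, and each string in lines still ends with its newline character. The loop version only ever holds one line at a time, while readlines() gives you the whole list to index into.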
Let's start by just trying to see the contents of the file:
infile = open("football.txt", "r")
for line in infile:
    print(line)
infile.close()
Is the output of this code snippet what you expected? How is it different from the input file?
The first thing you'll probably want to do with a line from a file is remove the newline character from its end. You can do this with the strip() method.
For files with multiple pieces of data per line, you'll want to break the line up into those individual pieces. Do this with the split() method. It takes a single string and returns a list of "words", treating whitespace as the divider between pieces of data.
line.strip() -- remove leading and trailing whitespace (e.g., spaces, tabs "\t", newlines "\n")
line.split() -- treat line as a list of strings separated by whitespace, and return that list
line.split(<pattern>) -- like line.split(), but treat line as a list of strings separated by <pattern>
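Here is a minimal sketch of these string methods applied to one line in the football.txt format. The semicolon-separated string at the end is made up purely to show a custom separator.

```python
# One line as it would be read from football.txt, newline included.
line = "Tate Golden Philadelphia Eagles\n"

# strip() returns a new string without the surrounding whitespace.
# The original string in line is not changed.
cleaned = line.strip()

# split() with no argument breaks the string apart at whitespace.
fields = cleaned.split()

# split(";") breaks a string apart at semicolons instead.
colors = "red;green;blue".split(";")

print(repr(cleaned))
print(fields)
print(colors)
```

Because strip() and split() each return a new value, they can be chained as line.strip().split() to go from a raw line to a list of fields in one step.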
We've talked over the semester about how a list can contain any type of data. Lists can even contain other lists. Lists of lists are common when doing file I/O. If each line is a list of data, then it's common to load the entire contents of the file into one big list, where each element is itself a list representing the data from one line of the file.
Let's modify our program to store our information in a list of lists called players. Each player is a list containing lastName, firstName, team, and mascot.
players = [ ['Tate', 'Golden', 'Philadelphia', 'Eagles'],
            ['Agholor', 'Nelson', 'Philadelphia', 'Eagles'],
            ... ]
What would players[0] give? What about players[1]?
You can use double indexing to get at the data inside the player list.
players[0][0] returns Tate
players[0][1] returns Golden
players[1][1] returns Nelson
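Double indexing also works inside a loop, where the first index is replaced by the loop variable. The sketch below uses the first two players from football.txt; the variable names full_names and team_of_second are made up for illustration.

```python
# The first two players from football.txt, stored as a list of lists.
players = [["Tate", "Golden", "Philadelphia", "Eagles"],
           ["Agholor", "Nelson", "Philadelphia", "Eagles"]]

# Inside the loop, player is one inner list, so a single index is enough.
full_names = []
for player in players:
    # player[1] is the first name, player[0] is the last name
    full_names.append(player[1] + " " + player[0])

# Outside the loop, use two indices: which player, then which field.
team_of_second = players[1][2] + " " + players[1][3]

print(full_names)
print(team_of_second)
```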
Let's go back to our program and modify it to store the player information in a list of lists:
infile = open("football.txt", "r")
players = []
for line in infile:
    player = line.strip().split()
    players.append(player)
infile.close()
print(players)
print("There are %d players in the list." % (len(players)))
Write a function getPlayers() that takes a list of players and a team and prints out all players from the list on that team.
Write a main function that reads in a list of players from a file, asks the user for a team name, and uses getPlayers() to print out all players from the given team.
Today, we'll work through a running example using a file named bestInShow.txt. First, let's look at this file in Atom.
2018,Whippet,Whiskey
2017,Brussels Griffon,Newton
2016,Greyhound,Gia
2015,Skye Terrier,Charlie
2014,Bloodhound,Nathan
2013,American Foxhound,Jewel
In this file, data about each winner is stored on a separate line. Each line has the same format:
Year,Breed,Name
Reading in dog show winners from the file is a similar process to the example from last lecture. Each line of text represents one dog show winner.
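One difference from the football file is that the fields here are separated by commas, and some breed names contain spaces. The sketch below, using one line from bestInShow.txt, shows why split(",") is a better fit than plain split() for this format. Converting the year with int() is an optional extra step, not something the file format requires.

```python
# One line as it would be read from bestInShow.txt, newline included.
line = "2017,Brussels Griffon,Newton\n"

# Splitting on whitespace cuts the breed name in half and leaves the commas in.
by_space = line.strip().split()

# Splitting on commas gives exactly the three fields: year, breed, name.
by_comma = line.strip().split(",")

# Every field is a string; convert the year if you want to treat it as a number.
year = int(by_comma[0])

print(by_space)
print(by_comma)
print(year)
```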
Open up bestInShow.py. First, look at the code in the main function that reads in dog show winners from the file. Run the program. Does it do what you expected? Then, write a function getWinners() that takes a list of winners and a breed and prints out any winners of that breed. Test the function by calling getWinners() multiple times on different inputs. Does it do what you expected?
Opening a file for writing is similar to opening a file for reading --- just change the mode to "w".
Writing to a file is similar to printing to the screen. To write to a file, use the write() method.
outfile = open("helloworld.txt", "w")
outfile.write("hello world!!")
outfile.write("It's a brand new day!")
outfile.close()
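Unlike print(), write() does not add a newline for you, so two write() calls in a row put their text on the same line of the file. If you want each string on its own line, one option is to end each string with "\n" yourself. The sketch below writes to a temporary folder (an assumption made so the example does not touch your working directory) and then reads the file back to check the result.

```python
import os
import tempfile

# Write into a temporary folder so the example leaves your own files alone.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "helloworld.txt")

# End each string with "\n" so each one lands on its own line.
outfile = open(filename, "w")
outfile.write("hello world!!\n")
outfile.write("It's a brand new day!\n")
outfile.close()

# Read the file back to confirm that it contains two separate lines.
infile = open(filename, "r")
lines = infile.readlines()
infile.close()

print(lines)
```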
Modify bestInShow.py to add some additional winning dogs to your list dogs. Then, write the entire list of winners to a file called moreBestInShow.txt. How do the contents compare to bestInShow.txt?
Once you have created a top-down design of your program, you should start incrementally developing and testing your solution. The process of implementing and testing individual components in isolation is called unit testing. It has several advantages:
There are two main ways of unit testing in Python. First, you can test in some other function (e.g., in your main() function, or in a special test() function). Second, you can use the Python interpreter. Testing in the interpreter requires a little setup:
if __name__ == "__main__":
    # call main only if you're running as a program
    main()
python3
>>> from blackjack_tdd2 import *
>>> help(didWin)
>>> didWin(21,20)
>>> didWin(44, 18)
>>> didWin(22,28)
>>> didWin(15,22)
What should you test? Make several tests for each function, each with a different focus and on different arguments. Test boundary conditions too (e.g., what happens when you give an empty string?).
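As a rough sketch of the first approach (a special test() function), the example below checks a small helper with a typical case and two boundary cases. The helper isBust() is hypothetical and is not assumed to exist in blackjack_tdd2.py. Using assert statements is just one convenient way to write the checks; printing results and inspecting them by eye works too.

```python
def isBust(total):
    """Hypothetical helper: return True if a blackjack hand total is over 21."""
    return total > 21


def test():
    # Typical case: a total well over 21 is a bust.
    assert isBust(30) == True
    # Boundary cases: 21 is the highest safe total, 22 is the lowest bust.
    assert isBust(21) == False
    assert isBust(22) == True
    # Unusual input: an empty hand totals 0 and is not a bust.
    assert isBust(0) == False
    print("All tests passed.")


test()
```

If any assert fails, Python stops with an AssertionError that names the failing line, which points you at the case to investigate.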
Open blackjack_tdd2.py and test the generateDeck() implementation using the Python interpreter. There is at least one error. Fix the error(s) and retest until you are confident generateDeck() does what it should.
Once you finish, start implementing playUser() and playDealer(). Implement them one at a time, and thoroughly test whichever you do first before moving on to the second function.