Motivations and possible uses:
Files can be plain text files, like the ones you edit with vim.
The basic syntax for opening a file is:
myfile = open(filename,mode)
where filename is the name of a file, and mode is the mode used for opening: usually read ("r") or write ("w") mode. Both arguments are strings, and myfile is just the variable I picked to store the file object returned by the open() function.
Here is an example of opening a file called poem.txt for reading, and storing the file object in a variable called infile:
infile = open("poem.txt", "r")
Once you have a file object, you can use the input and output methods on the object.
Here's how to open a file for writing (note: myfile is a variable name that I choose, and "newfile" is the name of the file to write to):
$ python
>>> myfile = open("newfile", 'w')
>>> type(myfile)
<type 'file'>
>>> myfile.write("write this to the file \n")
>>> myfile.write("and this.... \n")
>>> myfile.close()
Here are the results:
$ ls
newfile
$ cat newfile
write this to the file
and this....
What happens if we leave out the \n on each line?
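One way to find out is to try it. Here is a small sketch that writes the same two strings without the \n and then reads the file back. The scratch directory from tempfile is my own addition (so no real file gets overwritten), and print is called with parentheses and a single argument so the sketch runs under both Python 2 and Python 3.

```python
import os
import tempfile

# Scratch location (an assumption for this sketch, not part of the notes).
path = os.path.join(tempfile.mkdtemp(), "newfile")

myfile = open(path, "w")
myfile.write("write this to the file")   # no \n at the end
myfile.write("and this....")             # lands right after the previous text
myfile.close()

# Read the whole file back to see what was actually stored.
infile = open(path, "r")
contents = infile.read()
infile.close()

print(contents)
```

write() does not add a newline for you, so the two strings end up on the same line of the file.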
I have a file called words.txt with a few words in it:
$ cat words.txt
happy
computer
lemon
zebra
To open a file for reading, use 'r' mode:
>>> infile = open("words.txt", 'r')
File words.txt must exist; otherwise we get an error.
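If you would rather handle that error than have the program stop, one option is to wrap the open() call in try/except. This is only a sketch: the file name no_such_file.txt and the opened flag are made up for illustration. In Python 2 the error raised is IOError; in Python 3 it is FileNotFoundError, which except IOError also catches.

```python
import os
import tempfile

# A path that does not exist (hypothetical name, inside a fresh empty directory).
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

opened = False
try:
    infile = open(missing, "r")
    opened = True
    infile.close()
except IOError as err:
    # We end up here because the file is not there.
    print("could not open file: %s" % err)
```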
Also note: infile is a variable with a file object stored in it:
>>> type(infile)
<type 'file'>
And it can be used as a sequence (in a for loop!):
>>> for line in infile:
...     print line
...
happy
computer
lemon
zebra
We can use the for loop like above, or we could use the file methods: readline() to read one line, readlines() to read them all at once.
>>> # need to close and reopen to get back to start of file
>>> infile.close()
>>>
>>> infile = open("words.txt", "r")
>>> word = infile.readline()
>>> print word
happy
>>> word = infile.readline()
>>> print word
computer
>>> infile.close()
>>> infile = open("words.txt", "r")
>>> words = infile.readlines()
>>> print words
['happy\n', 'computer\n', 'lemon\n', 'zebra\n']
So readlines() reads in EVERYTHING and puts each line into a Python list.
NOTE: the newline characters are still part of each line!
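If you don't want the newlines, one common approach is to call the string method strip() on each line. A minimal sketch, using a list typed in by hand to stand in for what readlines() returned above (the variable names lines and clean are my own):

```python
# Same list that readlines() gave us for words.txt.
lines = ['happy\n', 'computer\n', 'lemon\n', 'zebra\n']

clean = []
for line in lines:
    # strip() removes whitespace (including the \n) from both ends of the string
    clean.append(line.strip())

print(clean)
```

Note that strip() also removes leading and trailing spaces and tabs, not just the newline.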
Sometimes you want to read in EVERYTHING, all at once.
Sometimes it's better to read data in line-by-line and process each line as you go (use the for loop: for line in infile).
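Here is a rough sketch of both styles on the same file. read() returns the whole file as one string, and the for loop handles one line at a time. In between, seek(0) moves back to the start of the file, so there is no need to close and reopen it. The scratch copy of words.txt and the names everything and count are assumptions made for this example.

```python
import os
import tempfile

# Build a scratch copy of words.txt so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
outfile = open(path, "w")
outfile.write("happy\ncomputer\nlemon\nzebra\n")
outfile.close()

infile = open(path, "r")

# All at once: the entire file as a single string.
everything = infile.read()

# We are now at the end of the file, so go back to the beginning.
infile.seek(0)

# Line by line: process each line as we go (here we just count them).
count = 0
for line in infile:
    count += 1

infile.close()

print(len(everything))
print(count)
```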
After the for line in infile loop above, we are at the end of the file. You can "rewind" the file by closing and reopening it (or use the seek() method in Python).
Suppose we have a file of usernames and grades, like this:
$ cat grades.txt
lisa :95
jeff :35
jason :88
adam :97
frances :96
rich :77
Here's a program to figure out the average of those grades:
# store grades in python list, in case we need them later
grades = []
# read all grades into a list
gfile = open("grades.txt", "r")
for line in gfile:
    name, grade = line.split(":")  # split in two, based on colon
    grades.append(float(grade))
gfile.close()
# find ave of grades in list
total = 0.0
for g in grades:
    total += g
ave = total/len(grades)
print "ave grade = ", ave
How would you read in multiple quiz grades for each student, finding the average quiz grade for each student?
$ cat quizgrades.txt
lisa :98,100,95,93,99
jeff :58,50,55,53,59
rich :88,78,89,75,79
frances :67,78,89,90,99
$ python quizgrades.py
lisa: 97.0
jeff: 55.0
rich: 81.8
frances: 84.6
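One possible approach (a sketch, not the only answer): split each line on the colon as before, then split the grades part on commas and average those. The scratch copy of quizgrades.txt and the averages list are my own additions so the example can run by itself; a real quizgrades.py would just open the existing file.

```python
import os
import tempfile

# Scratch copy of quizgrades.txt with the same data as above.
path = os.path.join(tempfile.mkdtemp(), "quizgrades.txt")
qfile = open(path, "w")
qfile.write("lisa :98,100,95,93,99\n")
qfile.write("jeff :58,50,55,53,59\n")
qfile.write("rich :88,78,89,75,79\n")
qfile.write("frances :67,78,89,90,99\n")
qfile.close()

averages = []   # (name, average) pairs, kept in case we need them later

qfile = open(path, "r")
for line in qfile:
    name, quizzes = line.split(":")     # name on the left, grades on the right
    name = name.strip()                 # drop the space before the colon
    total = 0.0
    count = 0
    for q in quizzes.split(","):        # one string per quiz grade
        total += float(q)               # float() ignores the trailing \n
        count += 1
    ave = total / count
    averages.append((name, ave))
    print("%s: %.1f" % (name, ave))
qfile.close()
```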