CS21 Week 9: File I/O, Top-Down Design
Week 9 Topics
-
File I/O (reading from files)
-
Top-Down Design
-
More Nested Loop Practice
Monday
Lab 8
Some details about Lab 8, the 2-week Top-Down Design lab.
File Input/Output
For many tasks, especially those that use large amounts of data, a program reads its input from one or more input files. A program may also need to save its results, in which case it writes them to one or more output files instead of to the terminal window.
We are going to focus on reading from files this week. Writing to a file works in a very similar way, but we won’t write to files until later in the semester.
Files
A file is a sequence of data that is stored on your computer.
A file typically consists of many lines. There is a special newline character, "\n", that is stored at the end of each line in a file. However, it is not visible when you look at the file.
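To make the hidden newline character visible, a short sketch like the following can help. The string contents and the variable names (text, visible, num_lines) are illustrative assumptions, not part of the course files.

```python
# Two lines of text as they might be stored in a file (illustrative contents).
text = "computer\nscience\n"

# print() turns each "\n" into a line break, so the character itself is not shown.
print(text)

# repr() shows the string with its "\n" characters written out.
visible = repr(text)
print(visible)

# Counting the "\n" characters gives the number of lines stored in the text.
num_lines = text.count("\n")
print(num_lines)
```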
Today, we’ll use a running example using a file named words.txt.
We can view the file contents by opening the file in vim, or by cat’ing the file contents out to the terminal window:
-
vim words.txt
-
cat words.txt
computer
science
Python
Ninjas
CS21
Intro
Think
Fun
reading from a file
To read (or write) from (to) a file, the programmer must follow these three steps:
-
Open the file to read. To do this we call the open function, passing in the file name string and the mode in which to open the file ("r" for reading):
infile = open("words.txt", "r")
If the call to open is successful, it returns a new file object that we can use to read from the file.
-
Read the contents of the file using methods of the file class. One example is the readlines method, which reads each line of the file into a list of strings:
# lines is a list of strings, one string element for each line in the file
lines = infile.readlines()
Note: there are many other ways to read in values from a file, since not every program needs to read in all the values, or to read them all at one time (which can be a particular problem for huge files). This is just one way to read in the file contents.
-
Close the file when done reading from it:
infile.close()
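The three steps can be put together in one small runnable sketch. It assumes nothing about the real words.txt: so that it runs anywhere, it first creates a tiny stand-in file in a temporary directory (this setup writes to a file, which the course covers later, and is only here to make the example self-contained). The names tmpdir, filename, and outfile are illustrative.

```python
import os
import tempfile

# Setup for this sketch only: create a small stand-in for words.txt
# in a temporary directory (in lab, the real file already exists).
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, "words.txt")
outfile = open(filename, "w")
outfile.write("computer\nscience\nPython\n")
outfile.close()

# Step 1: open the file for reading.
infile = open(filename, "r")

# Step 2: read every line of the file into a list of strings.
lines = infile.readlines()

# Step 3: close the file when done reading from it.
infile.close()

# Each string in the list still ends with its newline character.
print(lines)
```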
A file is accessed sequentially: when you open a file to read (or write), reads (or writes) start at the current position in the file, which is initially the beginning of the file. As data are read (or written), the current position moves to immediately after the last value read (or written), and the next read (or write) starts from that new position, moving it again. We call this sequential access because in order to read the nth item in the file, the previous n-1 items must be read first.
For our example file, words.txt, the very first line read in from the file is:
computer
the next line read in is:
science
and so on.
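A minimal sketch of the current position moving as lines are read, assuming the first three words of the example file. To keep it runnable without words.txt, it uses io.StringIO, a standard-library object that behaves like an already-open text file; with a real file, the same readline and readlines calls would be made on the object returned by open.

```python
import io

# io.StringIO stands in for an open text file so this sketch runs anywhere.
infile = io.StringIO("computer\nscience\nPython\n")

# The first read starts at the beginning of the file.
first = infile.readline()

# The next read continues from where the previous read stopped.
second = infile.readline()

# readlines() returns only what remains after the current position.
rest = infile.readlines()

infile.close()

print(first, second, rest)
```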
There are ways to move the current file position around in the file other than by reading (or writing) from the start to the desired spot. For now, however, we will read the file in its default sequential order.
example program that reads from a file
Let’s look at the program readwords.py, which reads the contents of the words.txt file into a list of strings, one string per line in the file.
In this file, there is a single word per line. We are going to look at the main program, which has all three steps for reading in values from the file.
It contains a call to the function print_list to print out the resulting list of strings read in.
-
Is the output what you expected?
-
How is it different than the input file?
-
Do you have an idea of what is going on here?
We are going to implement the function fix_list, which uses the strip() string method to try to fix up the list. Then let’s uncomment the calls to fix_list and print_list at the end of main to see if it worked.
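One possible shape for fix_list is sketched below. This is an assumption about its design, not the version in readwords.py: here the function changes the list in place and returns nothing, and the sample list stands in for the result of readlines.

```python
def fix_list(lines):
    """Replace each string in lines with a copy that has its
    surrounding whitespace (including the newline) removed."""
    for i in range(len(lines)):
        lines[i] = lines[i].strip()


# Sample data shaped like the result of infile.readlines().
words = ["computer\n", "science\n", "Python\n"]
fix_list(words)
print(words)
```

Because strings cannot be changed in place, the loop assigns the stripped copy back into each list position rather than calling strip() alone.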
More ways to read
We are not going to look at this together, and you don’t need to read values from a file in more than one way right now, but if you are curious, the file filetest.py contains a set of functions for reading in the contents of a file in many different ways (all at once into a list, one line at a time, one character at a time). We will generally use the first way in this class as it is super handy, but if you use Python for processing large files later, these other ways might be useful.
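As a rough illustration of the other two approaches (this is not the code in filetest.py, just a sketch under assumed sample data), a file object can be read one line at a time with a for loop, or one character at a time with read(1). io.StringIO stands in for an open file so the example runs without any input file.

```python
import io

data = "CS21\nFun\n"

# One line at a time: a for loop over a file object yields each line in turn.
infile = io.StringIO(data)
line_list = []
for line in infile:
    line_list.append(line)
infile.close()

# One character at a time: read(1) returns the empty string at end of file.
infile = io.StringIO(data)
chars = []
ch = infile.read(1)
while ch != "":
    chars.append(ch)
    ch = infile.read(1)
infile.close()

print(line_list)
print(chars)
```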
Top-Down Design
Continue with Top-Down Design from last week.
Wednesday
We are going to continue with Top-Down Design today.
Friday
More File Input/Output
Today we are going to look at some other programs that read in values from a file and manipulate the values in different ways.
Files with numeric data
We are going to look at readnumbers.py, which reads in a set of numbers from an input file. You can specify one of two files to read from: numbers.txt in your current directory, or a file in a professor’s directory, given by its full path name /home/newhall/public/cs21/lotsofnums.txt.
We are going to look at the main program, which already has all three steps for reading the file contents into a list of strings, one per line, and then we will:
-
implement the function convert_to_int, which converts the list of strings to their corresponding int values
-
implement the get_average function, which takes this list of ints and computes the average, to test our converted list
Try out your solution with both files, starting with the smaller one, numbers.txt.
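A sketch of how these two functions might look, assuming that convert_to_int returns a new list and that get_average is only called on a non-empty list. The parameter names and sample values are illustrative, not taken from readnumbers.py.

```python
def convert_to_int(str_list):
    """Return a new list holding the int value of each string in str_list."""
    int_list = []
    for s in str_list:
        # int() ignores surrounding whitespace, so "10\n" converts to 10.
        int_list.append(int(s))
    return int_list


def get_average(nums):
    """Return the average of a non-empty list of numbers."""
    total = 0
    for n in nums:
        total = total + n
    return total / len(nums)


# Sample data shaped like the result of infile.readlines().
lines = ["10\n", "20\n", "33\n"]
values = convert_to_int(lines)
avg = get_average(values)
print(values, avg)
```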
Files with multiple elements per line
Finally, sometimes file data has multiple values per line. data.txt is an example file that contains lines of student data, where each student has four pieces of data (a name, an age, a major, and a GPA value), separated by commas.
cat data.txt
Alain,22,CS,3.1
Ali,19,Math,2.9
Anastassia,20,Math,3.0
...
We are going to look at a program, readdata.py, that reads the file contents into a list of strings and fixes up the list by calling fix_list, which uses strip to remove the trailing newline and any other surrounding whitespace.
We are then going to write a function called separate_list that takes the list and converts each element of the list from a string to a list of strings, such that these inner lists contain a string value for each of the 4 pieces of data associated with each student.
This is an example of breaking up a line that holds multiple pieces of data in order to access the individual data items, using the split string method. split takes a string to split on, which in this example is "," (split(",")), and returns a list of substrings created from the string object, splitting at each occurrence of the passed string: it treats commas as dividers between different pieces of data.
A call to split() that does not pass a string to split by splits the string at whitespace characters: it treats spaces as dividers between different pieces of data.
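A possible sketch of separate_list, assuming it changes the list in place and that the newlines have already been stripped. The sample lines come from the data.txt excerpt above; the last two lines show split() with no argument on an illustrative string.

```python
def separate_list(lines):
    """Replace each comma-separated string in lines with a list of its fields."""
    for i in range(len(lines)):
        lines[i] = lines[i].split(",")


# Two lines from data.txt, after the newlines have been stripped.
students = ["Alain,22,CS,3.1", "Ali,19,Math,2.9"]
separate_list(students)
print(students)

# With no argument, split() divides the string at runs of whitespace.
words = "Intro  Think Fun".split()
print(words)
```

Note that every field is still a string after splitting, so values such as the age and GPA would need to be converted with int() or float() before doing arithmetic on them.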
Common String methods for file I/O
-
line.strip(): remove leading and trailing whitespace (e.g., spaces, tabs "\t", newlines "\n")
-
line.split(): split line into a list of strings separated by whitespace, and return that list
-
line.split(<pattern>): like line.split(), but split into a list of strings separated by <pattern>
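These methods are often chained on a single line of file data. The sketch below assumes one line in the data.txt format; the variable names are illustrative.

```python
# strip() removes whitespace from both ends of a string, not just the end.
padded = "  \thello \n"
cleaned = padded.strip()
print(cleaned)

# One line as it would come out of readlines() for data.txt.
line = "Alain,22,CS,3.1\n"

# Strip the newline first, then split the result on commas.
fields = line.strip().split(",")

# Each field is a string; convert the numeric ones before using them.
name = fields[0]
age = int(fields[1])
major = fields[2]
gpa = float(fields[3])
print(name, age, major, gpa)
```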
Top-Down Design
Continue with Top-Down Design.