Take a look at the file called /scratch/knerr/itunesData-Feb2016.csv.
You can use less, cat, or head to view the contents of the file:
$ head /scratch/knerr/itunesData-Feb2016.csv
The A Team,4:19,Ed Sheeran,9/9/12; 12:34 PM,16
A-Punk,2:18,Vampire Weekend,9/16/13; 12:56 PM,27
A.M. Radio,3:57,Everclear,8/21/09; 5:37 PM,52
A.M. Radio,3:59,Everclear,12/11/09; 9:31 AM,17
About to Break,3:56,Third Eye Blind,10/11/10; 10:57 AM,22
Abracadabra,5:08,Steve Miller Band,4/13/12; 5:43 PM,4
Abraham,5:12,Mark Erelli,4/5/09; 3:00 PM,2
Absence Makes The Heart Grow Fonder,2:28,Loudon Wainwright III,1/8/07; 11:02 AM,28
Accelerate,3:34,R.E.M.,7/20/09; 3:52 PM,43
Accident Waiting To Happen,4:03,Billy Bragg,1/8/15; 3:09 PM,11
Each line in the file represents one song in my iTunes library, and
the data for each song consists of: song title, song length/time, artist,
date of purchase, and number of plays.
Since this is a CSV (comma-separated values) file, we can easily
pull all of this info into Python using file IO and str methods
like split().
Before we pull in all of this data, how shall we store it? One way is to make a list for each song ([title,time,artist,date,plays]), and then store all of those song lists in another list. So we will have a list-of-lists!
Here's a simple example of a Python list-of-lists:
>>> L1 = list("abc")
>>> L2 = list(range(1,4))
>>> L3 = list("XYZ")
>>> LOL = [L1,L2,L3]
>>> print(LOL)
[['a', 'b', 'c'], [1, 2, 3], ['X', 'Y', 'Z']]
>>> print(LOL[0])
['a', 'b', 'c']
>>> print(LOL[0][2])
c
Notice how, using indexing twice ([0][2]), we can select items from
the sub-lists!
readFile(filename) function: Given the name of the data file, read in all lines, store them
in a list, and return them to main(). For example, in main(),
I would like to call this function like this:
data = readFile("/scratch/knerr/itunesData-Feb2016.csv")
and the readFile() function would take care of opening the file,
reading all of the lines into a Python list, closing the file,
and returning the Python list.
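Here is a minimal sketch of that first step (open, read the lines, close, return). It is one possible way to write readFile(), not the only one. So that it runs anywhere, the sketch writes two lines copied from the head output above into a temporary file; the temporary file and the names sample, tmp, and lines are just for illustration and stand in for the real /scratch file.

```python
import os
import tempfile

def readFile(filename):
    """Read every line of filename into a list of strings and return it."""
    infile = open(filename, "r")
    lines = []
    for line in infile:
        lines.append(line.strip())   # strip() drops the trailing newline
    infile.close()
    return lines

# Stand-in for the real data file: two lines copied from the head output,
# written to a temporary file so the sketch runs anywhere.
sample = ("Abraham,5:12,Mark Erelli,4/5/09; 3:00 PM,2\n"
          "Accelerate,3:34,R.E.M.,7/20/09; 3:52 PM,43\n")
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False)
tmp.write(sample)
tmp.close()

lines = readFile(tmp.name)
print(len(lines))    # 2
print(lines[0])      # Abraham,5:12,Mark Erelli,4/5/09; 3:00 PM,2
os.remove(tmp.name)  # clean up the temporary file
```

At this stage each item in the returned list is still one whole line of text, commas and all.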
Start by just trying to read in all of the lines. Once you get that
working, try to split() a line into a sub-list:
>>> line = "Abraham,5:12,Mark Erelli,4/5/09; 3:00 PM,2"
>>> data = line.split(",")
>>> print(data)
['Abraham', '5:12', 'Mark Erelli', '4/5/09; 3:00 PM', '2']
>>> data[4] = int(data[4])
>>> print(data)
['Abraham', '5:12', 'Mark Erelli', '4/5/09; 3:00 PM', 2]
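Putting the split() and int() steps inside the file-reading loop gives a readFile() that returns a list-of-lists. This is only a sketch under two assumptions: no field contains a comma of its own, and every line has all five fields. As a stand-in for the real file it uses a small temporary file holding two lines from the head output; the names sample, tmp, and songs are illustrative.

```python
import os
import tempfile

def readFile(filename):
    """Return a list-of-lists, one [title, time, artist, date, plays] per line."""
    songs = []
    infile = open(filename, "r")
    for line in infile:
        fields = line.strip().split(",")   # assumes no commas inside a field
        fields[4] = int(fields[4])         # number of plays: str -> int
        songs.append(fields)
    infile.close()
    return songs

# Stand-in for the real data file so the sketch runs anywhere.
sample = ("Abraham,5:12,Mark Erelli,4/5/09; 3:00 PM,2\n"
          "Accelerate,3:34,R.E.M.,7/20/09; 3:52 PM,43\n")
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False)
tmp.write(sample)
tmp.close()

songs = readFile(tmp.name)
print(songs[0])      # ['Abraham', '5:12', 'Mark Erelli', '4/5/09; 3:00 PM', 2]
print(songs[1][4])   # 43
os.remove(tmp.name)  # clean up the temporary file
```

Double indexing works here just as it does on any list-of-lists: songs[1] is the second song, and songs[1][4] is its number of plays.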
Once you have all of the data stored in a list-of-lists, then you can start asking some interesting questions!
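For example, here is a rough sketch of two such questions: which song has the most plays, and how many total plays does one artist have? The function names mostPlayed() and playsByArtist() are hypothetical (my own choices for illustration), and the small songs list is typed in by hand from four lines of the head output so the sketch does not need the data file.

```python
def mostPlayed(songs):
    """Return the song sub-list with the largest play count (songs must be non-empty)."""
    best = songs[0]
    for song in songs:
        if song[4] > best[4]:
            best = song
    return best

def playsByArtist(songs, artist):
    """Return the total number of plays over all songs by the given artist."""
    total = 0
    for song in songs:
        if song[2] == artist:
            total = total + song[4]
    return total

# Four songs from the head output, already in list-of-lists form:
# [title, time, artist, date, plays]
songs = [
    ["The A Team", "4:19", "Ed Sheeran", "9/9/12; 12:34 PM", 16],
    ["A.M. Radio", "3:57", "Everclear", "8/21/09; 5:37 PM", 52],
    ["A.M. Radio", "3:59", "Everclear", "12/11/09; 9:31 AM", 17],
    ["Abraham", "5:12", "Mark Erelli", "4/5/09; 3:00 PM", 2],
]

top = mostPlayed(songs)
print(top[0], top[4])                      # A.M. Radio 52
print(playsByArtist(songs, "Everclear"))   # 69
```

Both functions follow the same pattern: loop over the outer list, and use a second index to pick one field out of each song.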