CS21 Lab 9: Sorting
Due Saturday, November 20, before midnight
Please read through the entire lab before starting!
This is a one week lab. Since the design process should be similar to the previous lab, lab 08, you will be doing the design and implementation for lab 9 in a single week as well. If you found the open-ended nature of the lab 08 challenging, you will want to start this lab early.
Goals
-
Write sorting programs, selection sort and bubble sort
-
Learn to filter and reduce dataset sizes before sorting
-
Continue practicing top-down design (TDD)
Earthquake Data
For this lab, we will continue to work with the same dataset we used in the previous lab, lab 08 Searching. As a reminder, here are some details about the data set, it is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. The dataset is a modified version from the CORGIS educational dataset archive, and it contains 8394 entries. We have defined an earthquake class to represent one earthquake record.
Using sorting to investigate earthquake data
You will implement a program called sorting.py
, which will
allow the user to search through this dataset and learn interesting
information. To start, copy your Lab 08 earthquakes.py
file into
a new sorting.py
that you will use for Lab 09:
cp ~/cs21/labs/08/earthquakes.py ~/cs21/labs/09/sorting.py
More details about copying files are here: UsingUnix.
You will adding three new sorting related options, options 4, 5, and 6. Depending on the user’s selection, beyond your lab 08, you will prompt them for additional information and then do one of:
-
Rank the Earthquakes by Magnitude, print the top N largest.
-
Sort Earthquakes within a location.
-
Sort Earthquakes by the Most Recent.
Similar to Lab 08, your porgram should prompt the user with a menu of seven choices:
Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice?
and then prompt them for additional information and execute one of the menu choices. You implemented the first three choices in Lab 08. Your program should also:
-
Find the largest earthquakes by magnitude, using selection sort.
-
Find the largest earthquakes in a location, using bubble sort.
-
Find the most recent earthquakes.
Find the largest earthquakes by magnitude
Note: This feature was an optional extra challenge in lab 08. If you implemented this feature as part of lab 8, please be sure that your version for lab 9 conforms to the specification below (most notably, that it uses selection sort).
For option 4 (find the largest earthquakes by magnitude), you must:
-
prompt the user for the number of earthquakes they want displayed
-
use selection sort to sort the earthquakes by decreasing order of magnitude
-
print out the N earthquakes with the highest magnitude
For example, here the user asks for the eight largest earthquakes (by magnitude):
$ python3 sorting.py Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 4 How many (N)? 8 Largest 8 earthquakes by magnitude: us100068jg: magnitude 7.7 @ 28km SSW of Agrihan, Northern Mariana Islands (18.5439, 145.541) on 2016-07-29 17:18:26 us10006exl: magnitude 7.4 @ South Georgia Island region (-55.2793, -31.874) on 2016-08-19 03:32:22 us10006d5h: magnitude 7.2 @ 109km E of Ile Hunter, New Caledonia (-22.4953, 173.1114) on 2016-08-11 21:26:35 us10006gbf: magnitude 6.8 @ 25km W of Chauk, Burma (20.9192, 94.5789) on 2016-08-24 06:34:55 us10006fik: magnitude 6.4 @ South Georgia Island region (-55.2775, -31.7546) on 2016-08-20 23:45:23 us10006a2k: magnitude 6.3 @ 70km ENE of Iwo Jima, Japan (24.9477, 142.0074) on 2016-08-04 12:24:33 us10006a1d: magnitude 6.2 @ 53km NW of Abra Pampa, Argentina (-22.3942, -66.0814) on 2016-08-04 10:15:12 us10006d6d: magnitude 6.2 @ South of the Fiji Islands (-25.1394, -177.3386) on 2016-08-11 23:29:33 Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 7
Find the largest earthquakes within a location
For option 5 (find the largest earthquakes within a location), you must:
-
prompt the user for a location string
-
prompt the user for the number of earthquakes they want displayed.
-
use linear search to find all the records for which the user-entered string appears in the full location.
-
use bubble sort to sort these earthquakes in descending order of magnitude.
Here are some examples:
$ python3 sorting.py Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 5 Location? Pennsylvania How many (N)? 3 Largest 3 earthquakes in Pennsylvania: ld60050103: magnitude 1.37 @ 8km E of Wellsboro, Pennsylvania (41.7358333, -77.196) on 2016-08-22 02:15:45 Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 5 Location? New Jersey How many (N)? 3 Largest 3 earthquakes in New Jersey: ld60122961: magnitude 0.98 @ 1km N of Wanaque, New Jersey (41.0555, -74.2913333) on 2016-08-08 22:01:44 ld60122426: magnitude 0.81 @ 0km NNE of Brookdale, New Jersey (40.8413333, -74.1771667) on 2016-07-30 20:34:44 ld60049603: magnitude 0.54 @ 0km W of Butler, New Jersey (41.0103333, -74.3435) on 2016-08-08 22:33:00 Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 7
Find the most recent earthquakes
For option 6 (find the most recent earthquakes), you must:
-
prompt user for the number of earthquakes, get the input N
-
sort the earthquakes by date/time. To help ease the implementation, the earthquake library provides a
compare_dates(quake1, quake2)
function, which takes as input two earthquake objects and returnsTrue
if quake 1 is more recent than quake 2, otherwise, it returnsFalse
. -
print the N most recent earthquakes
-
You can use either selection sort or bubble sort for this feature
This dataset is a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016, from 07/27/2016 to 08/25/2016. So your sorting should go from the latest one on 8/25/16, 23:58 towards the earliest one on 7/27/16, 0:19. |
Here is an example:
$ python3 sorting.py Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit Choice? 6 How many (N)? 5 Most Recent 5 earthquakes: ci37672360: magnitude 0.89 @ 14km NE of Yucaipa, CA (34.1191667, -116.9336667) on 2016-08-25 23:58:01 ci37672328: magnitude 1.55 @ 6km NNW of Chatsworth, CA (34.308, -118.6353333) on 2016-08-25 23:36:11 nc72685251: magnitude 1.06 @ 6km WNW of The Geysers, California (38.8050003, -122.8215027) on 2016-08-25 23:30:05 ak13879193: magnitude 1.4 @ 5km ESE of Big Lake, Alaska (61.4984;-149.8627) on 2016-08-25 23:19:18 nc72685246: magnitude 2.42 @ 22km ENE of Soledad, California (36.5154991, -121.0998306) on 2016-08-25 23:19:01
Optional Extra Challenge (Sort States)
For option 8 (Sort States by total numbers of Magnitude M+ Earthquakes), you must:
-
Prompt the user for an integer M from 1 to 7 (e.g., 5).
-
Print the States orders by total numbers of Magnitude M+ Earthquakes
-
Print the numbers of Magnitude M+ Earthquakes for each State in the list
-
If the integer M is not valid, print a suitable error message.
-
If no records match the State and M+, let the user know that you couldn’t find any.
Here is an example:
$ python3 sorting.py Please select one of the following choices: (1) Find by ID (2) Find by location (3) Map state (4) Find the largest earthquakes by magnitude (5) Find the largest earthquakes within a location (6) Find the most recent earthquakes (7) Quit (8) Sort States by total numbers of Magnitude M+ Earthquakes Choice? 8 Please enter an integer M (from 1 to 7) for Magnitude M+ Earthquakes. 3 Printing States sorted by total numbers of Magnitude 3+ Earthquakes: States, Numbers of Earthquakes MA, 103 PA, 92 CA, 88 ... ...
Provided Library
The earthquakes
library has some helpful tools for managing earthquake
records in your program.
Earthquake Class
The library provides an Earthquake
class, which encapsulates all the
information you’ll need to keep track of one record. You can create an
instance object of this class by calling the Earthquake()
constructor with
each field of information from one row of the earthquakes.txt
file passed in
as separate parameters. The number fields (magnitude, latitude, and longitude)
are floats, and all other fields are strings.
You can call the following methods on a Earthquake
object to retrieve information about it:
-
get_id()
: Get the ID field. (string) -
get_magnitude()
: Get the magnitude field. (float) -
get_location_full()
: Get the full location field. (string) -
get_latitude()
: Get the latitude field. (float) -
get_longitude()
: Get the longitude field. (float) -
get_location()
: Get the short location field. (string) -
get_date()
: Get the date/time field. (string)
For example:
>>> from earthquakes_lab import * >>> record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59") >>> print(record.get_magnitude()) 3.5 >>> print(record.get_location_full()) Parrish Hall, Swarthmore, Pennsylvania >>> print(record.get_location()) PA >>> print(record) id12345: magnitude 3.5 @ Parrish Hall, Swarthmore, Pennsylvania (39.90523, -75.35425) on 2021-11-13 23:59:59
Comparing Dates
For option 6, you need to sort earthquake objects by date/time. The
library provides a compare_dates(quake1,quake2)
function that you
can call on two earthquake objects that will return a Boolean, where
True
means that quake1
happened more recently than quake2
,
otherwise, it returns False
.
Validating Floats
For option 3, you’ll prompt the user to enter a minimum magnitude, which needs
to be a float. The library provides a is_float
function that you can call on
a string that will return a Boolean, where True
means the string can safely
be converted with float
:
>>> from earthquakes_lab import * >>> is_float("3.1") True >>> is_float("3.1.5") False >>> is_float("big") False
Requirements, Hints, and Tips
-
You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. Start with the main menu and get a skeleton of your code working first. Then add each option from the menu one at a time.
-
When reading the input data file, process the file once and store all the records as a list of
Earthquake
objects. Reading the file once will make it faster to process later. Creating a list ofEarthquake
objects will avoid some list of list headaches. -
To print an instance object of the
Earthquake
class, you can callprint()
directly on the object:record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59") print(record)
The class knows how to format itself into a string, so you can avoid writing nasty formatting strings.
-
Validate menu selections. If the user doesn’t enter a valid number (1-7), let the user know it wasn’t a valid selection and prompt them again for a new menu item.
-
When reading file input,
split()
only generates a list of strings. Some of the strings will need to be converted to floats.
Submitting
Answer the Questionnaire
Please edit
the Questions-09.txt
file in your cs21/labs/09
directory
and answer the questions in that file.
Once you’re done with that, run handin21
again.
Turning in your labs…
Remember to run handin21
to turn in your lab files! You may run handin21
as many times as you want. Each time it will turn in any new work. We
recommend running handin21
after you complete each program or after you
complete significant work on any one program.
Logging out
When you’re done working in the lab, you should log out of the computer you’re using. First quit any applications you are running, like the browser and the terminal. Then click on the logout icon ( or ) and choose "log out".
If you plan to leave the lab for just a few minutes, you do not need to log out. It is, however, a good idea to lock your machine while you are gone. You can lock your screen by clicking on the lock icon. PLEASE do not leave a session locked for a long period of time. Power may go out, someone might reboot the machine, etc. You don’t want to lose any work!