1. CS21 Lab 8: Searching an Earthquake Dataset
Due Sunday, April 6, before midnight
Please read through the entire lab before starting!
This is a one week lab. For the previous lab, you used a full week to practice the TDD component and a second week to implement your solution. Since the design process should be similar to the previous lab, you will be doing the design and implementation for lab 8 in a single week. If you found the open-ended nature of the last lab challenging, you will want to start this lab early.
1.1. Goals
-
Write a program that uses linear search
-
Write a program that uses binary search
-
Continue practicing top-down design (TDD)
-
Connect CS topics with real data
1.2. Earthquake Data
For this lab, we’ll be using a month’s worth of earthquake data collected by the United States Geological Survey (USGS) in 2016. The dataset is a modified version from the CORGIS educational dataset archive, and it contains 8394 entries. We have defined an earthquake class to represent one earthquake record.
2. Introduction: Searching the Dataset
Your task is to write a program that:
-
Reads in the dataset from a file, storing a list of records. The data is stored in the file:
/usr/local/doc/earthquakes.txt
Note: don’t copy this file to your 08 directory, just use
"/usr/local/doc/earthquakes.txt"
as the file name in your Python program. Each line of the file corresponds to one record and contains the following fields (separated by semicolons):
-
A unique ID (string)
-
The magnitude of the earthquake (float)
-
The full location of the earthquake (string)
-
The latitude of the earthquake (float)
-
The longitude of the earthquake (float)
-
A shortened location of the earthquake (string)
-
The date and time of the earthquake (string)
Here’s one example:
id000018;2.41;10km W of Spirit Lake, Idaho;47.968;-117.0135;ID;2016-08-02 18:50:57
Each line of data ends with a "newline" character, which causes the following text to start on the next line of the file. When you’re reading in data from the file, you can call the .strip() method on each line (prior to calling .split()) to remove any newlines or other unnecessary spaces.
As you read in earthquake data, each line of the file corresponds to one record, which your program should represent using an Earthquake object.
-
Prompts the user with a menu of four choices:
Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice?
Depending on the user’s selection, you will prompt them for additional information and then do one of:
-
Search the dataset by (full) location, using linear search.
-
Search the dataset by ID, using binary search.
-
Search the dataset by minimum and maximum magnitudes, using linear search.
-
Exit the program.
-
Your program should keep prompting until the user enters a valid choice, and then carry out the query described below for that menu option. The program should exit if the user selects option 4 (Quit).
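One possible way to handle the menu is to keep the validity check separate from the prompt loop. The sketch below is only an illustration: the function names is_valid_choice and get_choice and the wording of the error message are assumptions, not part of the lab's starter code.

```python
def is_valid_choice(text):
    """Return True if text is one of the menu options "1" through "4"."""
    return text.strip() in ["1", "2", "3", "4"]


def get_choice():
    """Prompt until the user enters a valid menu option; return it as an int."""
    choice = input("Choice? ")
    while not is_valid_choice(choice):
        # Hypothetical message; any clear wording would do.
        print("Invalid choice. Please try again.")
        choice = input("Choice? ")
    return int(choice)
```

Keeping the check in its own function makes it easy to test without typing anything at the prompt.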
3. Find by Location
For option 1 (find by location), you must:
-
Prompt the user for a location string.
-
Use linear search to find all the records for which the user-entered location string appears in the full location. For many locations, you are likely to find more than one matching record, so your search function should produce a list of matching records.
You can use the in operator for strings to help you determine whether or not the user-entered string appears in a record’s full location string as a substring. For example, "co" in "cold" and "old" in "cold" are both True, but "cld" in "cold" is False.
-
Your matching should be case-insensitive. For example, if the user enters "CA", your search should match records with "California", "CA", "cali", "Inca" etc.
-
Print the information for all matching records, one per line, if there were any matches. See the note about printing records in the tips section.
-
If no records match the entered location, let the user know that you couldn’t find any.
Here are some examples:
$ python3 earthquake.py
Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 1
Enter location: Yale
Found 2 results:
ok00005: magnitude 2.5 @ 12km WNW of Yale, Oklahoma (36.1735, -96.8148) on 2016-07-31 02:40:25
ok00063: magnitude 3.4 @ 7km WSW of Yale, Oklahoma (36.0798, -96.7672) on 2016-08-24 06:08:56

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 1
Enter location: penn
Found 1 results:
pa00001: magnitude 1.37 @ 8km E of Wellsboro, Pennsylvania (41.7358333, -77.196) on 2016-08-22 02:15:45

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 1
Enter location: burg
Found 7 results:
ca00721: magnitude 0.45 @ 37km E of Johannesburg, CA (35.3731667, -117.2306667) on 2016-08-17 03:28:07
ca00891: magnitude 1.8 @ 38km ESE of Johannesburg, CA (35.207, -117.2721667) on 2016-07-29 08:27:58
ca02196: magnitude 1.24 @ 9km ENE of Healdsburg, California (38.6458333, -122.7713333) on 2016-08-06 07:38:40
ca03220: magnitude 1.81 @ 12km NNW of Healdsburg, California (38.7179985, -122.9240036) on 2016-08-24 04:09:28
or00045: magnitude 1.7 @ 3km SE of Coburg, Oregon (44.1128333, -123.0371667) on 2016-07-31 11:02:54
or00067: magnitude 1.1 @ 6km NNW of Roseburg North, Oregon (43.3216667, -123.3376667) on 2016-08-09 14:21:13
tn00008: magnitude 1.88 @ 8km NNW of Dyersburg, Tennessee (36.1005, -89.426) on 2016-07-27 19:32:43

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 1
Enter location: Mars
Found 0 results:

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 4
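The case-insensitive substring search described in this section could be sketched as follows. This is a minimal illustration, not the required design: the Quake class is a hypothetical stand-in for the provided Earthquake class (it mimics only get_id() and get_location_full()), and the function name search_by_location is an assumption.

```python
class Quake:
    """Hypothetical stand-in for the lab's Earthquake class (two getters only)."""

    def __init__(self, quake_id, location_full):
        self.quake_id = quake_id
        self.location_full = location_full

    def get_id(self):
        return self.quake_id

    def get_location_full(self):
        return self.location_full


def search_by_location(records, query):
    """Linear search: return a list of every record whose full location
    contains query, ignoring upper/lower case."""
    matches = []
    query = query.lower()
    for record in records:
        # Lowercase both sides so "YALE" matches "Yale".
        if query in record.get_location_full().lower():
            matches.append(record)
    return matches


records = [
    Quake("ok00005", "12km WNW of Yale, Oklahoma"),
    Quake("ok00063", "7km WSW of Yale, Oklahoma"),
    Quake("pa00001", "8km E of Wellsboro, Pennsylvania"),
]
for quake in search_by_location(records, "YALE"):
    print(quake.get_id())
```

Because every record has to be checked, the function visits the whole list even after finding a match; that is what lets it return all matches rather than just the first.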
4. Find by ID
For option 2 (find by ID), you must:
-
Prompt the user to enter an ID.
-
Use binary search to find the ID in your list of records. The data file provided lists each earthquake in increasing order by ID.
-
Print the information for the record matching the user-entered ID, if there is one. See the note about printing records in the tips section.
-
If the ID was not in the list, let the user know you couldn’t find it.
Here are some examples:
$ python3 earthquake.py
Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 2
Enter ID: pa00001
Found 1 results:
pa00001: magnitude 1.37 @ 8km E of Wellsboro, Pennsylvania (41.7358333, -77.196) on 2016-08-22 02:15:45

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 2
Enter ID: ca00919
Found 1 results:
ca00919: magnitude 1.51 @ 13km WSW of Niland, CA (33.1883333, -115.6478333) on 2016-07-29 17:57:48

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 2
Enter ID: puppies
Found 0 results:

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 4
Goodbye!
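As an illustration of the binary search step, here is a minimal sketch. It assumes the records are sorted so that Python's string comparison on IDs agrees with the file's "increasing order by ID"; the Quake class is a hypothetical stand-in for the provided Earthquake class, and the name search_by_id is an assumption.

```python
class Quake:
    """Hypothetical stand-in for the lab's Earthquake class (ID only)."""

    def __init__(self, quake_id):
        self.quake_id = quake_id

    def get_id(self):
        return self.quake_id


def search_by_id(records, target_id):
    """Binary search over records sorted by ID.
    Returns a one-element list with the match, or an empty list."""
    low = 0
    high = len(records) - 1
    while low <= high:
        mid = (low + high) // 2
        mid_id = records[mid].get_id()
        if mid_id == target_id:
            return [records[mid]]
        elif mid_id < target_id:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return []


# Sample IDs taken from the examples above, already in increasing order.
records = [Quake("ca00721"), Quake("ca00919"), Quake("ok00005"),
           Quake("pa00001"), Quake("tn00008")]
result = search_by_id(records, "pa00001")
print(len(result))
```

Returning a list (rather than a single record or None) keeps the return type the same as the other two searches, so one printing function can handle all three.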
5. Find by Magnitude
For option 3 (find by magnitude), you must:
-
Prompt the user for a minimum magnitude. You need to ensure that the user enters a valid float. If not, continue to prompt until they do. We have provided an is_float function to help you.
-
Prompt the user for a maximum magnitude. You need to ensure that the user enters a valid float. If not, continue to prompt until they do. Additionally, the maximum magnitude should be greater than or equal to the minimum magnitude.
-
Use linear search to find every record whose magnitude lies between the minimum and maximum (inclusive), and print the matching records, one per line.
Here are some examples:
$ python3 earthquake.py
Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 3
Enter minimum magnitude: 5.5
Enter maximum magnitude: 5.6
Found 8 results:
ak02185: magnitude 5.6 @ 45km S of Semisopochnoi Island, Alaska (51.5396, 179.5501) on 2016-08-14 12:28:55
chi0049: magnitude 5.6 @ 155km SSE of San Pedro de Atacama, Chile (-24.286, -67.8647) on 2016-07-27 02:43:45
ind0007: magnitude 5.6 @ 7km SE of Labuhankananga, Indonesia (-8.194, 117.8145) on 2016-07-31 19:40:01
ita0006: magnitude 5.5 @ 4km NE of Norcia, Italy (42.8223, 13.1257) on 2016-08-23 22:33:30
jap0037: magnitude 5.6 @ Izu Islands, Japan region (29.8965, 139.1312) on 2016-08-22 05:33:08
per0009: magnitude 5.5 @ 39km N of Lluta, Peru (-15.6569, -72.0174) on 2016-08-14 22:59:00
sou0009: magnitude 5.5 @ 62km ENE of Bristol Island, South Sandwich Islands (-58.724, -25.6076) on 2016-08-02 03:32:29
sou0059: magnitude 5.6 @ 300km ESE of Grytviken, South Georgia and the South Sandwich Islands (-55.2031, -32.1207) on 2016-08-19 16:37:16

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 3
Enter minimum magnitude: 6.1
Enter maximum magnitude: 5.5
Maximum magnitude must be greater than minimum magnitude. Please try again.
Enter maximum magnitude: 6.3
Found 5 results:
arg0004: magnitude 6.2 @ 53km NW of Abra Pampa, Argentina (-22.3942, -66.0814) on 2016-08-04 10:15:12
ita0002: magnitude 6.2 @ 10km SE of Norcia, Italy (42.714, 13.1719) on 2016-08-23 21:36:33
jap0009: magnitude 6.3 @ 70km ENE of Iwo Jima, Japan (24.9477, 142.0074) on 2016-08-04 12:24:33
sou0007: magnitude 6.1 @ South Indian Ocean (-23.9619, 82.4789) on 2016-08-01 03:42:50
sou0039: magnitude 6.2 @ South of the Fiji Islands (-25.1394, -177.3386) on 2016-08-11 23:29:33

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 3
Enter minimum magnitude: 8.0
Enter maximum magnitude: 9.0
Found 0 results:

Please select one of the following choices:
(1) Find by location
(2) Find by ID
(3) Find by magnitude
(4) Quit
Choice? 3
Enter minimum magnitude: puppies
Invalid input. Please try again.
Enter minimum magnitude: 6.2
Enter maximum magnitude: ducks
Invalid input. Please try again.
Enter maximum magnitude: 3.14
Maximum magnitude must be greater than minimum magnitude. Please try again.
Enter maximum magnitude: 6.3
Found 4 results:
arg0004: magnitude 6.2 @ 53km NW of Abra Pampa, Argentina (-22.3942, -66.0814) on 2016-08-04 10:15:12
ita0002: magnitude 6.2 @ 10km SE of Norcia, Italy (42.714, 13.1719) on 2016-08-23 21:36:33
jap0009: magnitude 6.3 @ 70km ENE of Iwo Jima, Japan (24.9477, 142.0074) on 2016-08-04 12:24:33
sou0039: magnitude 6.2 @ South of the Fiji Islands (-25.1394, -177.3386) on 2016-08-11 23:29:33
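The magnitude search in the examples above appears to include both endpoints (a query of 5.5 to 5.6 returns records at exactly 5.5 and 5.6). A minimal sketch of such an inclusive range search follows; the Quake class is a hypothetical stand-in for the provided Earthquake class, and the name search_by_magnitude is an assumption.

```python
class Quake:
    """Hypothetical stand-in for the lab's Earthquake class (ID and magnitude)."""

    def __init__(self, quake_id, magnitude):
        self.quake_id = quake_id
        self.magnitude = magnitude

    def get_id(self):
        return self.quake_id

    def get_magnitude(self):
        return self.magnitude


def search_by_magnitude(records, min_mag, max_mag):
    """Linear search: return all records with min_mag <= magnitude <= max_mag.
    Assumes the caller has already checked that min_mag <= max_mag."""
    matches = []
    for record in records:
        if min_mag <= record.get_magnitude() <= max_mag:
            matches.append(record)
    return matches


# A few magnitudes taken from the sample output above.
records = [
    Quake("arg0004", 6.2),
    Quake("ita0006", 5.5),
    Quake("jap0037", 5.6),
    Quake("pa00001", 1.37),
]
for quake in search_by_magnitude(records, 5.5, 5.6):
    print(quake.get_id())
```

The data is not sorted by magnitude, so binary search does not apply here; a single pass over the list is the natural choice.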
6. Provided Library
The earthquakes_lab library has some helpful tools for managing earthquake records in your program.
6.1. Earthquake Class
The library provides an Earthquake class, which encapsulates all the information you’ll need to keep track of one record. You can create an instance object of this class by calling the Earthquake() constructor with each field of information from one row of the earthquakes.txt file passed in as separate parameters. The number fields (magnitude, latitude, and longitude) are floats, and all other fields are strings. You may use any of these methods in your solution, but you may not need to use all of them.
You can call the following methods on an Earthquake object to retrieve information about it:
-
get_id(): Get the ID field. (string)
-
get_magnitude(): Get the magnitude field. (float)
-
get_location_full(): Get the full location field. (string)
-
get_latitude(): Get the latitude field. (float)
-
get_longitude(): Get the longitude field. (float)
-
get_location(): Get the short location field. (string)
-
get_date(): Get the date/time field. (string)
For example:
>>> from earthquakes_lab import *
>>> record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
>>> print(record.get_magnitude())
3.5
>>> print(record.get_location_full())
Parrish Hall, Swarthmore, Pennsylvania
>>> print(record.get_location())
PA
>>> print(record)
id12345: magnitude 3.5 @ Parrish Hall, Swarthmore, Pennsylvania (39.90523, -75.35425) on 2021-11-13 23:59:59
6.2. Validating Floats
For option 3, you’ll prompt the user to enter a minimum magnitude, which needs to be a float. The library provides an is_float function that you can call on a string that will return a Boolean, where True means the string can safely be converted with float:
>>> from earthquakes_lab import *
>>> is_float("3.1")
True
>>> is_float("3.1.5")
False
>>> is_float("big")
False
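A validation loop built on a check like this might look like the sketch below. Since the library is not available outside the lab environment, looks_like_float is a hypothetical stand-in that simply tries the conversion; the provided is_float may be implemented differently and may treat edge cases (such as "nan" or "inf") differently. The name get_float is also an assumption.

```python
def looks_like_float(text):
    """Hypothetical stand-in for the provided is_float:
    True if float(text) succeeds, False otherwise."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def get_float(prompt):
    """Prompt repeatedly until the user types something convertible to a float."""
    text = input(prompt)
    while not looks_like_float(text):
        print("Invalid input. Please try again.")
        text = input(prompt)
    return float(text)
```

In your own program you would call the library's is_float inside the loop instead of the stand-in.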
7. Hints and Tips
-
You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. Start with the main menu and get a skeleton of your code working first. Then add each option from the menu one at a time (start with quit as it is easy and will allow you to exit your program when you are testing it). While much of the design is up to you, the requirements below are designed to avoid some headaches in the initial design.
-
To print an instance object of the Earthquake class, you can call print() directly on the object:
record = Earthquake("id12345", 3.5, "Parrish Hall, Swarthmore, Pennsylvania", 39.90523, -75.35425, "PA", "2021-11-13 23:59:59")
print(record)
The class knows how to format itself into a string, so you can avoid writing nasty formatting strings.
-
Validate menu selections. If the user doesn’t enter a valid number (1-4), let the user know it wasn’t a valid selection and prompt them again for a new menu item.
-
When reading file input, split() only generates a list of strings. Some of the strings will need to be converted to floats.
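To make the last tip concrete, here is a minimal sketch of turning one line of the file into typed fields. The function name parse_line is hypothetical, and returning a plain list is just for illustration; in the lab you would pass these seven values, in this order, to the Earthquake() constructor instead.

```python
def parse_line(line):
    """Split one semicolon-separated record into fields, converting
    magnitude, latitude, and longitude from strings to floats."""
    fields = line.strip().split(";")
    quake_id = fields[0]
    magnitude = float(fields[1])
    location_full = fields[2]
    latitude = float(fields[3])
    longitude = float(fields[4])
    location = fields[5]
    date = fields[6]
    return [quake_id, magnitude, location_full,
            latitude, longitude, location, date]


# The example record from the lab description, with its trailing newline.
sample = "id000018;2.41;10km W of Spirit Lake, Idaho;47.968;-117.0135;ID;2016-08-02 18:50:57\n"
fields = parse_line(sample)
print(fields)
```

Note that the full location contains a comma, which is why the file uses semicolons as the separator.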
8. Requirements
The code you submit for labs is expected to follow good style practices, and to meet one of the course standards, you’ll need to demonstrate good style on six or more of the lab assignments across the semester. To meet the good style expectations, you should:
In addition, you’ll need to demonstrate good top-down design practices on two or more lab assignments across the semester. To meet the top-down design expectations, you should:
-
When reading the input data file, process the file once and store all the records as a list of Earthquake objects. Reading the file once will make it faster to process later. Creating a list of Earthquake objects will avoid some list-of-lists headaches.
-
Validate menu selections. If the user doesn’t enter a valid number (1-4), let the user know it wasn’t a valid selection and prompt them again for a new menu item.
-
Validate your search inputs before calling your search functions. Your search functions should assume the input parameters (the list of records and the query terms) are valid.
-
Each search function should return a list of matching records as described above. If no records are found, return an empty list.
-
Search by ID must use binary search to find the matching record (if one exists).
-
Write one function that prints a list of Earthquake objects. See the hint above that you can print an Earthquake object quake directly using print(quake).
-
After each search, your program must return to the main menu and allow the user to perform another search or quit the program.
Answer the Questionnaire
After each lab, please complete the short Google Forms questionnaire. Please select the right lab number (Lab 08) from the dropdown menu on the first question.
Once you’re done with that, you should run handin21 again.
Submitting lab assignments
Remember to run handin21 to turn in your lab files! You may run handin21 as many times as you want. Each time it will turn in any new work. We recommend running handin21 after you complete each program or after you complete significant work on any one program.
Logging out
When you’re done working in the lab, you should log out of the computer you’re using.
First quit any applications you are running, including your vscode editor, the browser and the terminal. Then click on the logout icon and choose "log out".
If you plan to leave the lab for just a few minutes, you do not need to log
out. It is, however, a good idea to lock your machine while you are gone. You
can lock your screen by clicking on the lock icon.
PLEASE do not leave a session locked for a long period of time. Power may go
out, someone might reboot the machine, etc. You don’t want to lose any work!