git@github.swarthmore.edu:cs35-s20/lab10-<username>
Due Friday May 1 at 11:59pm.
The main goal of this lab is to gain experience working with graph algorithms. A secondary goal is to apply them on several different graph applications. We’re providing you with starter code for the graph data structures, some basic unit tests, and a helper function that loads in a graph from a file. You will implement three graph algorithms and then write three short programs that use these algorithms.
This is an individual lab.
The URL for your repository will be
git@github.swarthmore.edu:cs35-s20/lab10-<username>
where username
is your swarthmore user name.
Your starting code can be found in your lab10-<username>
repo. The following is a description of the repository’s initial contents; bolded files are those which you must change to complete this lab.
adts/
: As with your previous labs, the ADT interface declarations are stored here. The most notable additions for this lab are edge.h
, graph.h
and priorityQueue.h
, stlMinPriorityQueue
, stlMaxPriorityQueue
.
directedGraph.h/directedGraph.cpp
: A data structure for directed graphs, using adjacency lists.
undirectedGraph.h/undirectedGraph.cpp
: An undirected graph data structure using adjacency lists.
graphAlgorithms.h
/graphAlgorithms.cpp
: a file containing functions for the three graph algorithms you will implement. This file also contains a helper function loadGraphFromFile
which handles reading in a graph from a file.
test_data
: a directory containing several graphs you can test/run your programs on.
tests.cpp
: The unit testing program you will use to confirm that your graph
algorithms are correct.
manualTests.cpp
: A sandbox test program for you to use while you develop your program.
reachable.cpp
: A program which reads in a graph and determines whether there is a path between a source and destination node.
communication.cpp
: A program which reads in an undirected graph representing a wireless communication network and determines whether there is a short
path between two wireless devices.
outbreak.cpp
: A program which reads in a directed graph modeling potential infectious disease outbreaks and predicts how long it will take for an infectious disease to spread.
All of the graphs you’ll use this lab will come from files, provided in the
test_data
directory in your repository. All graph files are written in the following format:
The first line contains two integers n
and m
, representing the number of
vertices and edges in the graph respectively.
The second line contains either the string weighted
or the string unweighted
, specifying whether the graph is a weighted or unweighted graph.
The third line contains either the string directed
or undirected
, specifying whether or not the edges of the graph are directed.
The next n
lines contain a single string (one word, no spaces) representing the name of a vertex
The next m
lines each contain information about a single edge. First the line lists the source and destination vertices. If the graph is weighted, the line will also have an integer edge weight.
We have provided loadGraphFromFile(<filename>)
in graphAlgorithms.h.,cpp
to read graph files for you. It may still be useful to understand the format in case you want to write new test files. This function takes a the name of a graph file as input, creates the appropriate graph, and returns a pointer to the new Graph
.
In graphAlgorithms.cpp
, you will implement three different graph algorithms.
reachableDFS
Given a source node (src
), destination node (dest
), and a pointer to a graph object (g
), this method should return true
if a path exists
between src
and dest
. Your algorithm must utilize depth-first search to
solve the problem.
shortestLengthPathBFS
Given a source node (src
), destination node (dest
), and a pointer to a graph object (g
), this method should return a list of all vertices in the
shortest length path between src
and dest
. If no path exists, the method
will throw a runtime_error
. The list should the be nodes in the path, in order,
starting with src
. Your algorithm must utilize breadth-first search to
solve the problem.
singleSourceShortestPath
Given a source node (src
), and a pointer to a graph object (g
), this method should return the shortest cost path from src
to each vertex in the graph. The value is returned as a (dynamically allocated) dictionary that maps
a vertex (string) to the shortest path cost (int).
Note: For some graphs it might not be possible to reach every node in the graph starting from the source node. In class we interpreted unreachable nodes as having distance "infinity". However, C++ doesn’t have a predefined value for infinity. Instead, we’ll just not include unreachable nodes in the dictionary. The dictionary you return should contain only vertices reachable from the source node.
Also note: we won’t cover this algorithm until Tuesday’s class. We encourage you to
start with the first two algorithms, then complete the first two applications (reachable.cpp, communication.cpp
), and then implement this algorithm.
To gain experience using the algorithms you implemented in part I, you will write three short programs that use these graph algorithms.
reachable.cpp
For your first application, implement a basic program which loads in a graph, checks whether two vertices are reachable in the graph, and prints the results. Use command-line arguments to specify, in order, the name of the file containing the graph, the source vertex, and the destination vertex.
$ ./reachable test_data/basic_8 1 6
You can get from 1 to 6!
$ ./reachable test_data/basic_8 4 2
You can NOT get from 4 to 2!
Use the file I/O function loadGraphFromFile
to load the graph from the file into a new Graph
object. Assuming you’ve already implemented reachableDFS
, this program should not be much code.
Once you have compiled and tested your reachable application, you’ll be ready to move onto the next application!
You are in charge of product development at WirelessForAll.org, an NGO which provides affordable communication infrastructure in developing countries. You maintain a network of cheap devices which communicate wirelessly. Each wireless device can communicate only with other devices that are within 100m. However, messages can be relayed if there are device(s) in between. For example, if Alice and Carol are 200m apart, but Bob is 100m between each of them, a message from Alice to Carol can "hop" to Bob’s device and get sent on to Carol.
This communication network can be modeled as an undirected graph, where vertices correspond to wireless devices, and there is an edge between vertices if the devices are within 100m.
Unfortunately, each time a message is relayed, it gets garbled a bit! From experience, you know that once a message gets relayed four times, it becomes too garbled to be useful.
Write a program to help you understand when messages will get garbled. Your inputs should be an undirected, unweighted graph, a source vertex, and a destination vertex. Output (i) whether its possible to send any communication from source to destination, (ii) if so, what is the minimum number of steps it will take, and (iii) what the steps are along the shortest path.
You should not need to design new algorithms beyond those included in
graphAlgorithms.h
. Assuming you have already implemented and tested
the first two graph algorithms (reachableDFS, shortestLengthPathBFS
)
you should not have to write extensive code for this problem.
Here is some sample output:
$ ./communication test_data/communication200 B01 D47
destination is 2 steps from the source vertex!
here is the path: [B01 O48 D47]
$ ./communication test_data/communication200 B82 A03
destination is 1 steps from the source vertex!
here is the path: [B82 A03]
$ ./communication test_data/communication2000 A01 J98
destination is 4 steps from the source vertex!
here is the path: [A01 A81 C72 X97 J98]
$ ./communication test_data/communication2000 E08 G72
destination is far from the source vertex.
here is the path: [E08 E28 H71 G13 G50 S42 A26 G72]
As part of a summer research project, you are collaborating with epidemiologists (infectious disease experts) at the CDC who want to model an infectious disease outbreak. The epidemiologists used sophisticated statistical modeling to predict how quickly one person would catch the disease from another. Specifically, the data set consists of a list of entries e.g. (Alice, Bob, 15)
which indicates that if Alice was infected with the novel disease, then it would take Bob 15 days to become infected from Alice.
However, the epidemiologists have a problem---Bob might get infected sooner if he catches the disease quickly from someone else who catches it from Alice! How can you find the shortest infection path
from Alice to Bob?
This input can be modeled as a problem on a weighed directed graph, where vertices represent people, and a directed edge (u,v)
with weight w
represents that v
can become infected directly from u
in w
days.
Write a program outbreak.cpp
that takes a graph containing projected outbreak data, as well as a start person, and uses singleSourceShortestPath
to compute and output the minimum time before each other person is infected.
See this example output.
One way to minimize the spread of a infectious disease is to practice social distancing
.
Extend your outbreak.cpp
program to model the effects of social distancing. One way to do this is to take each vertex in the graph and remove 50% of its edges. How does that effect the infection time? What if you only removed 10% of the edges? What if you removed 90% of the edges?
You might want to use randomness to select which edges get removed. C++ has a library called random
to handle random number generation. To include this library, add #include <random
.
See this page for information about the random library
This lab involves writing a moderate amount of code. Much of it is modular, allowing you to work on and test small pieces of the assignment at a time. Completing this lab assignment will be much easier if you practice incremental development.
Below is a strategy which should organize your efforts; you are not required to proceed in this order, but it might help.
Begin by reading the graph ADT (adts/graph.h
) and understanding the DirectedGraph and Undirected Graph classes. Also read graphAlgorithms.h
to make sure you understand what is expected of your graph algorithms.
Run make
and ./tests
. Note that many unit tests will fail.
Implement reachableDFS
and run unit tests again. Expect to have many failed unit tests, but fewer than before.
Implement and test shortestLengthPathBFS
. At this point, you should only fail unit tests because you haven’t implemented singleSourceShortestPath
.
Implement and test the applications reachable.cpp
and communication.cpp
.
Implement and test the singleSourceShortestPath
algorithm. We’ll see pseudocode for this algorithm in lecture on Tuesday, so it’s productive to implement this later.
Finally, implement and test outbreak.cpp
.
Once you are finished, you should be able to compile and run the three graph applications. For your convenience, a few test graphs have been provided in the test_data directory.
For this lab, your program is required to run without memory errors or leaks. You should use valgrind
as you proceed in your testing to track memory errors. When you have a complete first draft of your implementation:
Commit and push what you have (in case something goes wrong).
Run valgrind ./tests
to make sure that the test program does not cause memory errors.
Once memory errors are cleared, run valgrind --leak-check=full ./tests
to identify and correct memory leaks.
Note: while your program is required to run without memory leaks, this is not the central pedagogical goal of this assignment. As with previous labs, memory leaks will result in minor point deductions unless the leaks are truly egregious.
As usual, you will also be required to observe good coding practices:
Your C++ code must have proper and consistent indentations.
You must have proper and consistent usage of spacing and braces for if
, else
, for
, and while
conditions as well as class definitions and code blocks.
When you are finished, you should have
A completed implementation of the three graph algorithms specified in graphAlgorithms.h
.
A completed implementation of the three graph applications specified in reachable.cpp
, communication.cpp
, outbreak.cpp
.
No memory errors
No memory leaks
Code which conforms to the style requirements
$ ./outbreak test_data/outbreak20 abagail
infection time from abagail to:
--> abagail: 0
--> ariona: 4
--> bettyjane: 5
--> chaye: 3
--> correen: 5
--> dakayden: 7
--> hodosh: 5
--> jhaniyah: 5
--> krispin: 7
--> kyris: 4
--> marzana: 3
--> nantambu: 5
--> pyles: 4
--> quirijn: 4
--> rhyson: 5
--> roandy: 7
--> rumell: 6
--> sretan: 6
--> turmaine: 5
--> yandier: 7
$ ./outbreak test_data/outbreak20 yandier
infection time from yandier to:
--> abagail: 9
--> ariona: 6
--> bettyjane: 9
--> chaye: 7
--> correen: 7
--> dakayden: 7
--> hodosh: 7
--> jhaniyah: 8
--> krispin: 8
--> kyris: 5
--> marzana: 7
--> nantambu: 9
--> pyles: 6
--> quirijn: 7
--> rhyson: 9
--> roandy: 6
--> rumell: 7
--> sretan: 7
--> turmaine: 8
--> yandier: 0