Lab 11: Social Meatia
Due on Sunday, May 1st at 11:59 PM. This is a team lab. You and your assigned lab partner(s) will complete this lab together. Make sure that you are familiar with the Partner Etiquette guidelines. You may discuss the concepts of this lab with other classmates, but you may not share your code with anyone other than course staff and your lab partner(s). Do not look at solutions written by students other than your team. If your team needs help, please post on the Piazza forum or contact the instructor or ninjas. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.
Overview
In this lab, you will write an implementation of the graph ADT using the data structures we discussed previously in this class. You will then write a collection of algorithms for that data structure. These algorithms will be used by several small programs to perform social-media-related tasks for the people of the fictional town of Hampton.
Note that this lab relies upon many of the previous data structures we have discussed in class. The algorithms at hand are not overly complex, but this lab may require a greater debugging effort than previous labs and will test your knowledge of how to use the data structures we have discussed thus far. You can find your starting code in the usual place.
Starting Code
Your starting code involves the usual data structure template and support files. It also includes implementations of several previous data structures as well as several small programs which make use of the graph ADT. Your task is to implement the graph data structure and its corresponding algorithms so that these small programs produce the expected output.
As usual, the starting code appars in your team’s repository. Notable files from that repository appear below. Those files which you may need to edit are bolded.
- Previous data structures appearing in the
library/
directory. You can include these data structures in your code by describing their relative paths, e.g.#include "library/hash_table.h"
. - Files to define directed and undirected graphs.
graph.h
: The definition of theGraph
interface.directed_graph.h
/directed_graph__definitions.h
: The definition of theDirectedGraph
data type, an implementation ofGraph
.undirected_graph.h
/undirected_graph__definitions.h
: The definition of theUndirectedGraph
data type, a subclass ofDirectedGraph
.
- Algorithms on graphs.
graph_algorithms.h
/graph_algorithms__definitions.h
: The definition of the graph algorithms used in this assignment.
- Files to implement the various small programs.
- Each program
PROG
has filesPROG_program.cpp
,PROG_main.h
, andPROG_main.cpp
. You should not have to change any of these files.
- Each program
You will not have to edit any files other than the definition of the Graph
-related classes and algorithms. You may not need to edit their definitions (if you are comfortable working with the fields and methods that are already present on the graph data structure), but you are permitted to do so.
C++ Standard Library
For this assignment, we are not using the STL. It is valuable to be familiar with the C++ STL, but the STL is quite complex and somewhat differently structured than the ADTs we’ve discussed in class. These differences are significant for engineering purposes only; the STL data structures’ theories are the same. The burden of learning this new interface, however, is significant and not the focus of this lab.
Step 1: Writing the Graph
Your first step should be to implement the basic features of the graph data structure: adding vertices and edges, finding the neighbors of a given vertex, and so on. Think carefully about memory management in this stage and make sure to test your data structure thoroughly! Any bugs in your graph are likely to cost you significant effort in the future.
The starter code uses two fields for the DirectedGraph
class: one to store all of the vertices and the other to store all of the edges. We store the edges as a mapping from each source vertex to a pointer to a list of Edge
objects from that source vertex; the Edge
class is defined in directed_graph.h
alongside the DirectedGraph
declaration. You are not required to keep these fields – you may change them if you like – but your graph must have significantly better performance than a linear search through a list of edges. We recommend that you use the fields that have been defined for you.
Debugging with GraphViz
In addition to the usual testing tools, this lab provides a utility file called graph_io.h
. That file contains a function render_graph
which is likely to be extremely helpful to you. Given a Graph
object, the render_graph
function can render that graph into a file written in the language DOT. DOT is a language used by GraphViz, a graph visualization software package, to draw graphs into file formats like PNG or PDF.
For example, suppose we wanted to create a graph with two vertices and a single edge between them. We might write the following test:
TEST(create_single_edge_directed_graph) {
Graph<int,float>* graph = new DirectedGraph<int,float>();
graph->add_vertex(1);
graph->add_vertex(2);
graph->add_edge(1,2,0.5);
render_graph<int,float,string>(
"create_single_edge_directed_graph.dot", // filename
graph, // the graph object to render
true); // because this is a directed graph
delete graph;
}
Once this test runs, a file named create_single_edge_directed_graph.dot
will be created in the root directory of the project. This file can then be converted to a PDF using the following command:
dot -Tpdf <create_single_edge_directed_graph.dot >create_single_edge_directed_graph.pdf
Here, the -T
tells dot
which file format to use. The <
tells your shell to give the program the contents of create_single_edge_directed_graph.dot
as its input (instead of taking input from your terminal). The >
tells your shell to write its output to the PDF file rather than to your terminal. The resulting PDF file will look like the figure to the right of the code above.
The code above gives explicit type parameters to render_graph
, which is defined as a template function. To call render_graph
, you should give type arguments in the following order:
- The type of vertex in the graph. Here, it is
int
. - The type of edge weights in the graph. Here, it is
float
. - The type used to name the vertices. We have not named the vertices in this example, so we can use any type; we chose
string
here.
Further Features
The render_graph
function provides additional behavior to name and highlight vertices as well as highlight paths. This behavior occurs when you pass additional parameters to the render_graph
function. For instance, the following test creates a three vertex graph and then highlights one of the edges it contains. It also provides names for the vertices.
TEST(create_three_vertex_directed_graph) {
Graph<int,float>* graph = new DirectedGraph<int,float>();
graph->add_vertex(1);
graph->add_vertex(2);
graph->add_vertex(3);
graph->add_edge(1,2,0.5);
graph->add_edge(1,3,0.75);
graph->add_edge(2,3,0.1);
Dictionary<int,string>* vertex_data = new HashTable<int,string>();
vertex_data->put(1, "A");
vertex_data->put(2, "B");
vertex_data->put(3, "C");
List<int>* highlight_vertices = new CircularArrayList<int>();
List< Edge<int,float> >* highlight_edges =
new CircularArrayList< Edge<int,float> >();
highlight_edges->insert_at_back(Edge<int,float>(1,2,0.5));
highlight_edges->insert_at_back(Edge<int,float>(2,3,0.1));
render_graph<int,float,string>(
"create_three_vertex_directed_graph.dot", // filename
graph, // the graph object to render
true, // because this is a directed graph
vertex_data, // names for the vertices
highlight_vertices, // the vertices to highlight (here, none)
highlight_edges, // the edges to highlight
true); // additionally, highlight vertices on the highlighted edges
delete graph;
delete vertex_data;
delete highlight_vertices;
delete highlight_edges;
}
Generating these graph images takes a not-insigificant amount of code but is ultimately worthwhile: it’s usually easier to spot a problem in a picture than it is in text. You should also note that the main
functions for each of the programs you will be working on generates these data structures in the course of its execution, so you can easily modify those functions to generate debugging output for you!
UndirectedGraph
We stated in class that an “undirected graph” is a graph in which edges have no particular direction: an edge from vertex A to vertex B is equivalent to an edge from vertex B to vertex A. We can approximate that behavior defining a subclass of DirectedGraph
called UndirectedGraph
. UndirectedGraph
overrides the methods used to add edges to the graph, giving them different behavior. Specifically, whenever a caller adds an edge from A to B, we simply add both edges to our DirectedGraph
data structure. The code for add_edge
has been written for you. You must write the code for get_edges
that makes the UndirectedGraph
object seem as if it only has one such edge (even though the DirectedGraph
implementation actually has two).
You are also responsible for understanding these concepts of polymorphism and overriding. The graph algorithms we write will operate on any Graph*
as long as the object maintains the invariants of the Graph
interface. This means that we can write algorithms (such as depth-first search) without knowing whether the graph is directed or not.
Step 2: Depth-First Searching for Oinked In
With implementations of the Graph
ADT complete, we can begin writing the graph algorithms that will make our programs work.
Those familiar with the social networking site Oinked In know that it primarily provides the service of guessing whether or not two people know each other. You will write a simple depth-first search algorithm which, given a graph, will be able to answer this pressing question.
The graph algorithm is_reachable
is called by the program oined_in
. You can build oined_in
either by running make
as usual or by compiling it specifically (make oinked_in
). That program expects to receive four command-line arguments:
- The name of the input graph file.
- The name of the results file.
- The IDs of the two people to check.
The oinked_in
program expects to receive a social network graph; several such graphs can be found in the test_data
directory with the prefix social_network
. Social network graphs are undirected. Although their edges technically have integer weights (so that they fit into our framework), those weights are always 1
. You should begin by testing your program on some very small social networks to see if your depth-first search is working correctly.
This simple tip may help you run the program from the command-line more effectively: instead of providing a normal filename to contain the results of the search, give /dev/stdout
as the filename. This will cause the results to be printed to the terminal instead of a file, allowing you to skip the intermediate file step for one-off tests.
Using Dictionaries as Sets
Remember that a depth-first search on a graph must contend correct with cycles. This is especially important here, since the graph is undirected (and so every edge represents a cycle!). In order to avoid revisiting a node, you’ll need some way of keeping track of the nodes you’ve visited before.
Fortunately, dictionary data structures can be employed as sets! To keep track of a set of V
, for instance, you can create a dictionary mapping V
to any other type (e.g. Dictionary<V,bool>
). Whenever an element is added to the set, you put
that element with an arbitrary boolean. When an element is removed from the set, you remove
that element.
A better idea would be to define a HashSet
class which uses a HashTable
field to perform these tasks. Then, you could write a series of operations such as add
, remove
, and even union
so that the code which uses the Set
would be more readable. You may do this, but you are not required to do so; just using a dictionary will work for this assignment.
Step 3: Breadth-First Searching and the Degrees of Kevin Bacon
The Six Degrees of Kevin Bacon is a simple parlor game in which, given an entertainment personality, one must use six or fewer steps to connect that person to Kevin Bacon, a particular actor. This problem can be solved by representing the relationships between people as a social network graph and then performing a breadth-first search to find the path between them. The graph algorithm shortest_path
will fulfill this requirement.
Once you have written shortest_path
, you can utilize it by running the program bacon
. This program expects three command-line arguments:
- The name of the input graph file.
- The name of the results file.
- The ID of the person to connect to Kevin Bacon.
Kevin Bacon appears in all of the social network graphs that you have been provided. His ID is always 0. As with the oinked_in
program above, bacon
expects to receive a social network graph.
Step 4: Network Latency Testing via Dijkstra’s Algorithm for Instaham
Instaham is a social network designed primarily for the purpose of distributing pictures of food. A user in this network subscribes to a feed and receives these images when they are posted by the feed’s owner. In order to maximize food image distribution potential, the network engineers of Hampton need to determine the speed with which an image can be sent from one point in town to another.
Your test_data
contains a collection of graphs which describe the town’s network. Town network graphs are undirected graphs weighted by float
s describing the delay of the connection. Given a point in town, the network engineers need a list of other town locations and the minimum time to reach that location. To accomplish this task, you will implement the single_source_shortest_path_costs
graph algorithm using Dijkstra’s algorithm.
Once you have implemented that function, you can use the program instaham
to test it. This program expects three command-line arguments:
- The name of the input graph file.
- The name of the results file.
- The ID of the location in town from which network latency information is required.
Step 5: Software Compilation for Facepork Using Topological Sort
In many programming languages, source code must be compiled in a particular order: a source file cannot be compiled until the libraries that it uses have been compiled. You will assist the developers of Facepork, a social networking site, in compiling their newest rendition of their software. (They may have recently discovered some minor security flaws.) Given a graph describing the dependencies of various source files on other source files, you will produce an order in which these files can be compiled.
The topological_sort
graph algorithm can complete this task. You should implement the topological sort using the recursive depth-first search algorithm we discussed in class. Topological sort only works on graphs without cycles, so you will test your algorithm on software project graphs, which appear in your test_data
with the prefix software_project
. Once you’ve implemented topological_sort
, you can test it using the facepork
program. This program only expects two arguments – the input graph and the output file – since a topological sort is a graph-wide operation which starts from an arbitrary position.
Note that the topological sort test results are generated using the approach we described in class, looping over the vertices in order from first to last and exploring each one. If you perform your operations in a different order, it is possible for you to produce a valid topological sort which does not match the provided test data.
Step 6: Network Design for Stynet Using Minimum Spanning Trees
For your final task, you must assist Hampton in designing Stynet, a new fiber-optic network for their town. The network should be able to reach all places in the town but should be as cheap as possible. You will load a town map (as in Instaham above) and take the weights of the graph to represent the cost of laying the cable between two points. Your objective will be to identify how to connect the nodes in the cheapest way possible.
Thankfully, you can complete this task by implementing the minimum_spanning_tree
graph algorithm. You may use any MST algorithm, although it’s probably easiest to use Prim’s algorithm from lecture. Once you’ve completed this task, you can test your implementation by running the stynet
program. As with facepork
, this program only requires an input graph file and the file to which results should be written.
Testing
In previous assignments, you were provided a single RUN_MAIN
routine. For this assignment, you have been provided one such routine for each of the above programs: RUN_OINKED_IN_MAIN
, RUN_BACON_MAIN
, etc. You’ve also been provided CHECK_FILES_EQUAL
, as usual.
Make sure you write thorough tests for your Graph
implementations before writing the graph algorithms; it will be extremely difficult to debug graph logic errors and your graph algorithms simultaneously! While you are not required to test the main
functions for each program, you are required to find some way to test your graph algorithms; running the main
functions in numerous different ways is sufficient to accomplish this (and is fairly terse).
Memory
Your program is required to run without any memory leaks or errors; you must free up all of the memory that you allocate and you must only use memory that you have allocated and initialized. We will use valgrind
to determine if you have done this correctly, so you should as well. You should also consider using valgrind
to find bugs in your program as you proceed; it’s one of two very powerful C++ debugging tools that we’ve discussed.
Coding Style Requirements
You are required to observe some good coding practices:
-
You should pick meaningful variable names.
// Good int* pixels = new int[size]; // Bad int* p = new int[size];
-
You should use correct and consistent indentation. Lines of code within a block (that is, surrounded by
{
and}
) should be indented four spaces further than the lines surrounding them.// Good if (condition) { cout << "Test" << endl; } // Bad if (condition) { cout << "Test" << endl; }
-
You should use a block whenever possible, even if it’s not necessary. This helps you avoid subtle or messy bugs in the future.
// Good if (condition) { cout << "Something" << endl; } // Bad if (condition) cout << "Something" << endl;
-
Any new methods or fields in your header files should have comments explaining their purpose and behavior. You are permitted to omit documentation for methods that are inherited from other classes; that is, if your class has a
foo
method because its superclass has afoo
method, you don’t need to document that method.// Good public: /** * Saves this image object to a file. * @param file The already-open stream into which this image should be written. * @return The number of pixels in the saved image. * @throws runtime_error If the image could not be saved due to an I/O error. */ int save(ofstream& file); // Bad public: int save(ofstream& file);
Your method/field documentation does not have to be in the format above, but you must describe the method’s behavior, its parameters, its return value, and any exceptions that it may throw. (If you’re indifferent, the above syntax is a good one to know; it’s a de facto standard used by Javadoc, Doxygen, and other tools that automatically process source code comments into other formats like searchable webpages.)
Peer Review
As this is a team assignment, you are required to complete a peer review of your lab partner or partners. You must do this even if everything is fine. You should complete this peer review after you have turned in your assignment. The peer review comes in the form of a simple Google Form and can be accessed here; in most cases, it will likely take less than a minute to complete. If you have trouble accessing this peer review, please make sure that you are logged into Google using your Swarthmore credentials.
Your peer review will be kept private; only course staff will have access to this information. You can also update your peer review after you have submitted it. However, if you do not submit a peer review, you will lose points on your lab. Please don’t forget!
Summary of Requirements
When you are finished, you should have
- Completed implementations of
DirectedGraph
andUndirectedGraph
- Completed implementations of the five graph algorithms described above
- Appropriate tests
- No memory errors!
- Code which conforms to the style requirements above
- A completed peer review