CS35 Lab 03: QuickSort and big-O

Due 11:59pm Sunday, 1 October 2017

The goals of this lab are to give you experience implementing nontrivial algorithms and to give you practice with asymptotic analysis and formal proofs. Concepts you will be familiar with after this lab include:

For this assignment, you will be working with a partner. Both partners should be present and working on the code together. You will both be responsible for understanding all concepts, so dividing and conquering is not an option. The academic integrity policy applies to the entire pair; you cannot work or share code with anyone outside your partner.

You and your lab partner will share the same git repository, which is named lab03-<user1>-<user2>.git Please access your lab by cloning the appropriate git repository, e.g.:

 $ git clone git@github.swarthmore.edu:CS35-F17/lab03-adanner1-mgagne1.git ./lab03
You can also get the ssh git url from CS35 github org.
Introduction
This lab consists of three parts: As with Lab 02, you and your partner will share a git repository. Please remember that, as with all partnered labs, both partners must learn this material. Do not divide up this lab to give one team member the theory material and another member the sorting algorithm.

Part 1: Big-O Analysis
Use the formal definition of Big-O notation to prove the following statements:
  1. $2n^2 + n + 5$ is $O(n^2)$
  2. $3n^3 - 17$ is $O(n^3)$
In order to complete this assignment, you will need to produce a document containing some mathematical expressions and figures. You will commit and push this document to your GitHub repository; the document may take the following forms: Your team must submit the document electronically by adding it to your git repository. In particular, you are not allowed to submit in one of the following forms:

Using LaTeX

LaTeX (pronounced lah-tek or lay-tek) is a document preparation system frequently used to produce conference papers, dissertations, and other academic works (in addition to other material). With LaTeX, you code your document much as you would an HTML document or a Python program. For instance, the LaTeX code
I'm \textit{not} making this up: $e^{i\pi}=-1$

produces

$\text{I'm } \textit{not} \text{ making this up: } e^{i\pi} = -1\ .$

You are not required to learn LaTeX for this course. However, it may be easier to use basic LaTeX to complete your homework than to use something like Microsoft Word's equation editor. In case you're interested, the following files are already part of your repository:

update: If you get an error using \begin{proof} in WrittenLab.tex, try adding \usepackage{amsthm} near the top of the .tex file.

Part 2: Mystery Functions
In this part of the lab, you will first analyze six simple loop structures and determine their asymptotic runtime complexity in terms of big-O notation. Then, using the provided program function_timer, you will graph the empirical runtimes of the functions.

The Functions

Begin by identifying the Big-O runtime for each of the following functions. Provide a brief justification for each answer — full proofs are not necessary. Give the strictest comparison you can find; e.g., don't give $O(2^n)$ if the function is also $O(n)$, and leave out any second-order terms: give $O(n^2)$ instead of $O(n^2 + n))$.
Ex1(n):
  for i=0...n
    a = i

Ex2(n):
  for i=0...2*n
    a=i

Ex3(n)
  for i=0...n
    for j=0...i
      a=i

Ex4(n):
  for i=0...n*n
    a=i

Ex5(n):
  for i=0...n
    for j=0...i*i
      a = i

Ex6(n):
  k=1
  for i=0...n
    for j = 0...k
      a = i
    k = 2*k
Hint: Analyze double-for-loops as we did in class. If the analysis is still hard to figure out, try to count the number of inner loop iterations for both small and large values. For example, for Ex3, the number of inner loop iterations when $i=0,1,2$ is $1,2,3$. When $i=n-2,n-1,n$ the number of inner loop iterations is $n-1, n, n+1$. Then, try to identify a pattern.

Using function_timer to Inspect the Functions

Each of the above functions has been implemented and packaged into a single executable function_timer that we provided for you. To access it, if you have not already done so, create a symbolic link to the function using the following commands from your lab03 directory:
[lab03]$ ln -s /usr/local/bin/function_timer_f17 ./function_timer
However, to make things interesting, we scrambled the order of the functions. Your job will be to match each function with its corresponding name. The function_timer program will provide empirical runtime data for each of the above functions, in a form that can be graphed by another tool called gnuplot.

To use this program, you must pick the following:

For instance, you can graph functions 3 and 4 within the range 10<=n<=80 by running this command:

$ ./function_timer -3 -4 -n 10 -m 80 | gnuplot
This will produce the following graph (actual results may vary):
To save a graph as a png file, add a -s <name of file>
 $ ./function_timer -3 -4 -n 10 -m 80 -s 34.png | gnuplot
Your writeup should include at least two graphs justifying your answer. Instructions for how to include images in a .tex file are in WrittenLab.tex.

Part 3: QuickSort
For the coding part of this lab, you will implement the quickSort sorting algorithm we described in lab. Your starting repository includes the following files: As always, files that you'll need to change are marked in blue. However, you'll be responsible for understanding all of the code.

To complete this lab, you should implement the quickSort function in quickSort.cpp, as we discussed in class. You'll need to be careful to follow the algorithm closely and to manipulate array indices accurately. Expect to have bugs in your first attempt. That's OK. Understanding what your bugs are and how to go about debugging/fixing them is a big part of becoming an excellent coder!

Unit Tests

To assist in the debugging process for this (and future!) labs, we're introducing a debugging tool called unit tests. A unit test is a small C++ function that tests a small piece or aspect of your code. Test Driven Development is a software engineering practice that uses extensive unit tests to ensure that software acts as expected. In principle, it makes code more reliable and makes bugs easier to identify and isolate.

We'll be using a unit testing framework called UnitTest++. This will let us easily write short effective unit tests for labs this semester. Write your unit tests in tests.cpp. We've given you two examples to get you started.

Use make to compile your tests. You will get a compilation message something like this:

$ make tests
clang++ -g -std=c++11 -Werror -c -o tests.o tests.cpp
clang++ -g -std=c++11 -Werror -o tests tests.o quickSort.o

Then execute your tests:

$ ./tests
Success: 6 tests passed.
Test time: 0.00 seconds.
If you have bugs in your sorting algorithm, the test program will tell you which unit tests fail.

You are required to write at least four more tests that investigate different aspects of sorting. Some ideas for additional tests include:

Although you have a minimum number of tests to write, you should ideally write as many as it takes for you to be confident in your code. Additionally, the ideal time to write your tests is prior to writing your code. Remember to rerun make tests each time you change tests.cpp.

Commenting and Coding Style Requirements

For this and future labs, graders will assign minor penalties for poor commenting and coding style. Here are some style tips for this lab:

Extra Challenge
We've seen multiple different sorting algorithms at this point. How fast is quickSort really? Can you write a faster sorting algorithm? If you're looking for an extra challenge, try one of the following: These additional features are optional – we'll look at them, but they won't be for credit. Make sure you complete and push the main lab before starting any extra challenge, and note in your README file that you've done extra challenges so the graders know which version of your lab to grade. (There's a chance your added features will break our grading scripts; we'll grade the base version of your lab to ensure you don't get penalized in the off-chance your extra features break our grading scripts)
Survey
When you have completed your lab, answer a few short survey questions in the file README.md

Summary

To summarize the requirements of this lab:

Submit

Once you are satisfied with your code, hand it in using git. Remember the add, commit, push development cycle. You can push as many times as you like, but only the most recent submission will be graded. You may want to run git status to confirm all modifications have been pushed.