CS35: Data Structures and Algorithms

Week 02 Lab Notes

Lab 01 Agenda

Clone lab01

This is the same as lab00. It is a good habit to run ssh-add when you log in, so you don't get repeatedly asked for your ssh password. If you go to the CS35 github org, you can find the clone link for your repo on the web interface. We've included a sample below, but copying the link directly from the web interface may reduce typos.
$ cd ~/cs35/
$ git clone git@github.swarthmore.edu:CS35-F17/lab01-<user>.git ./lab01
Replace <user> with your username.

Make a sym link

This lab has a number of image files that you can use for testing. Since the input files do not need to change for each user, we can store a read-only copy of the files in a common location and have each student set up a shortcut, or symbolic link, to this shared location.
$ cd ~/cs35/lab01
$ ln -s /usr/local/doc/lab01-images/  ./test_data
Note the path /usr/local/doc/lab01-images/ is local only to the machines on the CS network and will not work if you clone your code to your personal computer.

Header (.h) files

Last week, we showed how you can declare a function in one part of a file and define the function body in a later part of the file. In fact, you can do this across multiple files. In this lab, we introduce the header file, which usually ends with a .h or sometimes a .hpp extension. If you look at ppmio.h or image.h, you will see function declarations along with comments describing their intended purpose. Reading the header file is usually enough to learn how to call and use a function, and what you should expect the function to do and return. The implementation of these functions is usually in a corresponding .cpp file, e.g., ppmio.cpp or image.cpp. In one case this week, ppmio.cpp contains a lot of ugly details about how to read and write ppm files. You do not need to understand ppmio.cpp, but you should understand how to use ppmio.h.

To call a function defined in another file, you need to include the header file, but not the .cpp file. Note that ppmio.cpp contains the line

#include "ppmio.h"
as one of its includes. We use quotes "" when we are including local header files we wrote, and angle brackets <> when including common system-wide headers.

Our main program in picfilter.cpp includes both of our local header files

#include "image.h"
#include "ppmio.h"

Using header files can break larger programs into smaller, more modular pieces and can separate easy to understand function declarations in header files from sometimes messy implementations in .cpp files.

Compiling multiple files

With multiple files, compiling gets a little trickier, but not too much. The clang compiler will automatically include header files, and we can specify multiple .cpp files to compile on the same line.
$ clang++ -std=c++11 -o picfilter picfilter.cpp image.cpp ppmio.cpp
The name of the output file is specified with the -o flag, in this case -o picfilter. This option can go either near the front, or at the end of the clang++ command.

Compiling with -std=c++11

As mentioned in class, the C++ language is constantly evolving. Occasionally, the community will define a fixed standard that all compilers should try to implement. A recent standard is C++11, which added some nice features not available in previous versions of C++. In this lab, we would like to use one of those features. We can tell clang to use this standard with the -std=c++11 flag illustrated above.

What super fancy feature are we using that requires the latest and greatest of C++? Don't get too excited, it's just the fstream::open(string filename) method buried in ppmio.cpp. And since it's in a file you don't need to edit/read/understand, you don't need to worry too much about this feature. But if you forget the -std=c++11 you'll get the following weird error:

ppmio.cpp:25:13: error: no viable conversion from 'string' (aka
      'basic_string<char>') to 'const char *'
file.open(filename);

Pulling changes in examples repo

Last week, we talked about the add, commit, push process for git. While push sends your changes from the CS machines to github, we occasionally push updates of our own to your repos on github. You can bring these changes into your local copy using git pull:
$ cd ~/cs35/examples
$ git pull

Arrays

We covered the basic types in C++ last week in class. This week, we will learn how to create new classes and objects using OOP in C++. C++ arrays are not a type themselves, but rather a way to create an indexed container for storing multiple items of the same type. While they may look like Python lists in how individual elements are accessed, arrays in C++ are not objects and have no methods like .sort() or .reverse(), nor any equivalent of len().

To declare a static array, whose exact size you know at compile time, use the following syntax.

int a[10];
In the example above, the variable a is an array of ten integers. Individual elements can be accessed for reading/writing by their index, using the example syntax below.
a[0] = 7; //write
cout << a[0] << endl; //read value in first index

C++ does not check if you try to access an array out of bounds and has no way of determining the length of the array given only its name. If you want a data structure with these features, keep taking this course.

See simpleArray.cpp for a full program using an array.

Pointers

What if you don't know the size of an array at compile time? We can create an array dynamically at run-time by using a pointer type. A pointer is a variable that stores a location in memory. Recall that a variable is a named container that stores a value. For any given type, a variable of that type needs a fixed number of bytes to store its value. The compiler/computer ensures that each variable has its own reserved space in the computer's memory and that the spaces reserved for two variables do not overlap. Each location in memory has a fixed address, so the compiler/computer can keep track of which memory locations are available/in use. A pointer can store the address of a location in memory. We will talk about pointers in much greater detail in the next week or two, but for now, we simply demonstrate how to use a pointer to dynamically allocate arrays.

First, declare a variable that is a pointer to an int. The syntax is int* dynArr. The type of the variable dynArr is int*, which can hold either the address of a single integer or the address of the first integer in an array of integers. Once we know the size, we will use this pointer with the new keyword to allocate an array dynamically.

int* dynArr; /* a pointer to (eventually) a dynamically allocated array */
int size = 10;

dynArr = new int[size];
At this point, we can use dynArr just like a statically allocated array. The difference is in where the memory lives. Since the compiler knows how big a static array is at compile time, it can reserve enough space for the array on the function call stack, and when a static array goes out of scope, C++ can automatically free this reserved space for later reuse. Since dynamic arrays are created at run-time, the compiler can only reserve space on the stack for the pointer/address, not the entire array. The new command allocates space dynamically in a separate region of the computer's memory called the heap. We will talk more about pointers and the heap later in the semester, but for this week, what you need to know is that each call to new requires a matching call to delete to free the dynamically allocated memory. This is done when the memory for the array is no longer needed, using the sample syntax:
delete [] dynArr;  // free the memory of dynamically allocated array
In both incArray.cpp in your examples folder and picfilter.cpp for lab 01, the calls to new and delete have already been implemented in the proper place. We do not expect you to fully understand pointers or dynamic memory in this lab, but simply wanted to highlight a new C++ feature you will encounter in this lab.

The file incArray.cpp also shows how to pass arrays to functions, a feature you will use in lab 01. Note the syntax in the function declaration, int a[]. Here we are saying a is an array of 0 or more integers. We want the program to be flexible enough to work with an array of any size, but since arrays do not know their own size, we additionally pass a second parameter with the integer size of the array.

Try to implement the function void increment(int a[], int size) in incArray.cpp. Note the return type of this function is void. If we modify the contents of a locally inside increment, will the changes be visible in main or printArray? Let's find out.