1. Due Date
-
Due: Tuesday, October 8, before 11:59 PM
Your assigned partner for this lab is: Lab 4 Partners
2. Lab Goals:
-
Gain experience using pointers and dynamic memory allocation (malloc) in C.
-
Practice using
gdb
andvalgrind
to debug C programs. -
Convert C code with conditionals and loops to equivalent IA32 assembly code.
3. Lab Overview
This lab consists of 2 parts. In the first part you will write a C program that computes some statistics on a set of values that it reads in from a file. The program will use C pointers and dynamic memory allocation to allocate enough space for the set of values it reads in. In the second part you will write a sum function in IA32 assembly that is compiled into a program that you can then use test your function.
4. Lab Starting Point Code
4.1. Getting Your Lab Repo
Both you and your partner should clone your Lab repo into
your cs31/labs
subdirectory:
-
get your Lab ssh-URL from the CS31 git org. The repository to clone is named
Lab04-user1-user2
, where user1 and user2 are the user names of you and your Lab partner. -
cd into your
cs31/labs
subdirectory:$ cd ~/cs31/labs $ pwd
-
clone your repo
$ git clone [your_Lab04_repo_url]
There are more detailed instructions about getting your lab repo from the "Getting Lab Starting Point Code" section of the Using Git for CS31 Labs page.
4.2. Starting Point Files
Cloning the repository will give you the following starting point code files (note: the exact number of .txt input files in your repo may differ from this example listing):
$ ls
Makefile large.txt prog.c readfile.h stats.c
README.md larger.txt readfile.c small.txt sum.s
These files are:
-
Makefile:
builds two executable files,stats
for Part 1 andprog
for Part 2 of this lab assignment. -
readfile.h
andreadfile.c
: a library for file I/O. This is the same library used in Lab 2. Your program will make calls to functions in this library to read values from an input file. The instructions for using this library are explained below. Do not modify any code in these two files. -
stats.c
: starting point code for Part 1, the C stats program. It contains some code to get the file name from the command line argument and the start of theget_values
function that you will complete. your Part 1 solution should be implemented in this file. -
small.txt
,large.txt
, …: some example input files to use to test your stats program. You will want to create your own additional input files to more extensively test your program before you consider it finished. -
prog.c
: a main program for Part 2. You do not need to modify this program. -
sum.s
: starting point code for Part 2. Your Part 2 solution should be implemented in this file.
5. Part 1. C Pointers
Experimental scientists often want to compute some simple statistical analyses over the set of data. A useful tool would be a program that could compute these statistical results for any an arbitrarily sized data set (i.e. it would work for 10 or for 10 million data values without re-compilation).
5.1. Compiling and Running
For this lab, you will implement the program started in stats.c
that takes a
single command line argument, which is the name of a file of data values
(floats, one per line), and computes and prints out a set of
statistics about the data values:
$ ./stats small.txt
Results:
-------
num values: 16
min: 0.500
max: 3.000
mean: 1.812
median: 2.000
std dev: 1.031
unused array capacity: 4
Run make
to compile the stats
program. Make compiles your solution
in the stats.c
file, and also compiles and links in the readfile.o
library, and links in the C math library (-lm
). The
math library has a sqrt
function, which you will need for
the standard deviation calculations.
5.2. Program Structure and Statistics Functions
When run, your program should do the following:
-
Make a call to
getvalues
, passing in the filename of the file containing the data values, and passing in two pointers: the address of an int variable to store the size of the array (number of values read in) and the address of an int variable to store the total array capacity. This function is started for you. See the Requirements Section for details on how to allocate the array that this function returns. -
The
getvalues
function will return an array of float values that stores the values read in from the file, orNULL
on error (e.g.,malloc()
fails or the file cannot be opened). -
Sort the array of values. (see the note in Tips section about re-using your sorting function from Lab 2).
-
Compute the min, max, mean (average), median, and standard deviation of the set of values and print them out. (see notes below about median)
-
Print out the statistical results, plus information about the number of values in the data set and the amount of unused capacity in the array storing the values.
The statistics your program to compute on the set of values are the following:
-
num values: total number of values in the data set.
-
min: the smallest value in the data set.
-
max: the largest value in the data set.
-
mean: the average of the set of values. For example, if the set is: 5, 6, 4, 2, 7, the mean is 4.8 (24.0/5).
-
median: the middle value in the set of values. For example, if the set of values read in is: 5, 6, 4, 2, 7, the median value is 5 (2 and 4 are smaller and 6 and 7 are larger). If the total number of values is even, just use the (total/2) value as the median. For example, for the set of 4 values (2, 5, 6, 7), the median value is 6 because 4/2 is 2 and the value in position 2 of the sorted ordering of these values is 6 (2 is in position 0, 5 in position 1, 6 in position 2, 7 in position 3).
-
stddev: is given by the following formula:
Where N is the number of data values, xi is the ith data value, and X-bar is the mean of the values.
-
unused array capacity: this is really a statistic about the amount of the total capacity of your array that is not used to store this set of values.
5.3. File Format and File I/O
Your program will take as a command line argument the name of an input file that contains the data set. Each files contains some number of float values, one per line. The total number can vary from input file to input file, so it is up to your program to determine when it has read in all the values.
For example, a file with 4 values might look like this:
4.4
3.0
20.8
5.6
Included with the starting point code are a few sample input files
(ex. small.txt
, large.txt
) that you can use to run and test your program.
You will use the readfile library (that you also used in
Lab 2) for doing file I/O. Documentation about
library functions and how to use the library is in the library
header file (readfile.h
), which you can open in vim to view.
You should not change any code in readfile.c
or readfile.h
.
5.4. Lab Requirements
-
Implement your solution in
stats.c
, which includes some starting point code. -
The getvalues function: takes the name of a file containing input values, reads in values from the file into a dynamically allocated array, and returns the dynamically allocated array to the caller. The array’s size and capacity are "returned" to the caller through pass-by-pointer parameters:
float *get_values(int *size, int *capacity, char *filename);
-
The array of values must be dynamically allocated on the heap by calling
malloc
. You must start out allocating an array of 20 float values. As you read in values into the current array, if you run out of capacity:-
Call
malloc
to allocate space for a new array that is twice the size of the current full one. -
Copy values from the old full array to the new array (and make the new array the current one).
-
Free the space allocated by the old array by calling
free
.There are other ways to do this type of alloc and re-alloc in C. However, this is the way we want you to do it for this assignment: make sure you start out with a dynamically allocated array of 20 floats, then each time it fills up, allocate a new array of twice the current size, copy values from the old to new, and free the old.
-
-
When all of the data values have been read in from the file, the function should return the filled, dynamically allocated, array to the caller (the function’s return type is (
float *
). The array’s size and total capacity are "returned" to the caller through the pass-by-pointer parameterssize
andcapacity
.
-
-
After reading in the values from the file, your program should sort the array (and note that your program should be able to easily compute min, max and median on a sorted array of values). See the Tips section about how to re-use your sort function from the Lab 2.
-
For full credit, your program must be free of valgrind errors. Do not forget to close the file you opened! If you don’t, valgrind will likely report a memory leak.
-
Your code should be commented, modular, robust, and use meaningful variable and function names. This includes having a top-level comment describing your program and listing your and your partner’s names and the date. In addition, every function should include a brief description of its behavior. You may not use any global variables for this assignment.
-
It should be evident that you applied top-down design when constructing your submission (e.g., there are multiple functions, each with a specific, documented role). You should have at least 4 function-worthy functions.
-
You should not assume that we will test your code with the sample input files that have been provided.
-
When run, your program’s output should look like the output shown below from a run of a working program. To make my job of grading easier, please make your output match the example as closely as possible.
$ ./stats large.txt Results: ------- num values: 94 min: 0.333 max: 3.000 mean: 1.161 median: 1.000 std dev: 0.788 unused array capacity: 66
Note: just like you can use
\n
in the printf format string to insert a new line, you can use\t
to insert a tab character for pretty formatting. Also, you can specify formating of each placeholder in the printf string. For example, use%10.3f
to specific printing a float/double value in a field with of 10 with 3 places after the decimal point. Section 2.8 of the textbook has more information about formatting printf placeholders. -
For increased precision, use the C
double
type to store and compute the mean and the square root. -
Your code should be commented, modular, robust, and use meaningful variable and function names. Each function should include a brief description of its behavior. Look at the C code style guide for examples of complete comments, tips on how not to wrap lines, good indenting styles and suggestions for meaningful variable and function names in your program.
-
Code you submit should have my "TODO" comments removed (these are my notes to you, as you implement them remove the "TODO").
5.5. Tips
-
Before even starting to write code, use top-down design to break your program into manageable functionality.
-
Try getting getvalues to work without the re-allocation and copying part first (for fewer than 20 values). Once that works, then go back and get it to work for larger numbers of input values that require mallocing up new heap space, copying the old values to the new larger space, and freeing up the old space.
-
Sort function: You can use your sort function from Lab 2 in this program. To do so, just copy your sort function from the previous lab’s .c file into this file, and then change the array parameter to a pointer to float. And that is it — you can use the same sort function to sort the dynamically allocated array!
// change the prototype and function definition of your sorting function // so that its first parameter is float *array void sort(float *values, int size);
You can copy a chunk of code from one file by selecting it with the left mouse button held down and dragging to what you want to copy. In the other file in vim, enter insert mode and the choose the left mouse button to paste. If the indentation of the pasted function is weird. In command mode (Esc) in vim, visually select a section of code to reformat ( Shift-V
and then use the arrow keys (orj
andk
) to visually select the set of lines you wan to reformat), and then choose=
to fix indenting (vim figures this out by matching { }). -
The C math library has a function to compute square root:
double sqrt(double val);
. You can passsqrt
a float value and C will automatically convert the float value to a double. -
See the Lab 2 lab assignment for documentation about using the
readfile
library (and also look at thereadfile.h
comments for how to use its functions). -
Take a look at the textbook, weekly lab code and in-class exercises to remind yourself about malloc, free, pointer variables, dereferencing pointer variables, dynamically allocated arrays, and passing pointers to functions.
-
Use CNTRL-C to kill a running program stuck in an infinite loop.
6. Part 2: IA32 Coding
For this part, you will implement a sum function in IA32. You should
implement your program in sum.s
, which has a starting point of this
function. The starting point handles the stack set-up and function
return. As a result, you just need to implement the IA32 translation of
the function body. See the comments in sum.s
about where on the
stack is space for local variables, and where the parameter n
is located.
The C function you will implement in IA32 is:
/* computes the sum of the values 1-n
* n: an int value
* returns: the sum of values from 1-n, or -1 if n is not positive
*/
int sum(int n) {
int i, res;
if( n <= 0 ) {
return -1;
}
res = 0;
for(i=1; i <= n; i++) {
res = res + i;
}
return res;
}
The program prog.c
makes a call to this sum function. You can compile
(make
) and run prog
to test out your sum function. Here is
an example run:
make
./prog
This program computes the sum of 1-N
Enter an value for n: 10
The sum of 1 to 10 is 55
6.1. Tips
-
Write C-goto versions of sum function, the if and for loop parts in particular, to help you with the translations of those parts.
-
Try adding a little bit of IA32 code to
sum.s
at a time, then compile and test it out by runningprog
, and add some more. Parts that you test out do not even have to be parts of the solution, but testing if you can perform some of the steps like returning some value, and initializing a local variable. For example:-
first see if you can just return the value of n.
-
next, see if you can initialize res to 0 and return the value of res.
-
next, try the if stmt, and test passing positive and negative values to see if your function returns -1 when n is not positive.
-
then try implementing the for loop.
-
-
If you have errors, trace through the instruction’s on paper drawing memory contents and register contents as you go.
-
You can also try using gdb (or ddd) to debug your program at the assembly level. Set a breakpoint in
sum
and useni
to step through instruction execution. print to (ex.print $eax
) to print out register values.
7. Submitting
Please remove any debugging output prior to submitting.
To submit your code, commit your changes locally using git add
and
git commit
. Then run git push
while in your lab directory.
Only one partner needs to run the final git push
, but make sure both
partners have pulled and merged each others changes.
Also, it is good practice to run make clean
before doing a git add and
commit: you do not want to add to the repo any files that are built by gcc
(e.g. executable files). Included in your lab git repo is a .gitignore file
telling git to ignore these files, so you likely won’t add these types of files
by accident. However, if you have other gcc generated binaries in your repo,
please be careful about this.
Here are the commands to submit your solution in the sorter.c file (from
one of you or your partner’s cs31/labs/Lab4-partner1-partner2
directory:
$ make clean
$ git add stats.c
$ git add sum.s
$ git commit -m "lab 4 completed and well commented"
$ git push
If you have difficulty pushing your changes, see the "Troubleshooting" section and "can’t push" sections at the end of the Using Git for CS31 Labs page. And for more information and help with using git, see the git help page.
8. Handy Resources
-
Class piazza page for questions and answers about lab assignment
-
C Debugging: gdb Guide, Valgrind Guide, Chapter 3 of the textbook Week 5 in-lab examples
-
Misc: Some Useful Unix Commands, vi (and vim) quick reference, make and makefiles