This lab should be done with your newly assigned Lab 4 partner:
Here is the Lab 4 partner list for Labs A, B, C
Expectations for Working with Partners
Lab 4 Goals:
- To understand and use C pointer variables, dynamic heap memory allocation (malloc/free) and C style "pass by reference"
- To use gdb and valgrind to debug C programs, particularly for
memory access errors.
- To use a C program that makes use of: command line
arguments; file I/O; multiple .c and .h files; and library linking.
- To use make and a Makefile to build multiple executable files.
Lab 4 Starting Point Code
Both you and your partner should do:
- Get your Lab04 ssh-URL from the GitHub server for our class:
CS31-F17
- On the CS system, cd into your cs31/labs subdirectory
cd ~/cs31/labs
- Clone a local copy of your shared repo in your private
cs31/labs subdirectory:
git clone [your_Lab04_URL]
Then cd into your Lab04-you-partner subdirectory.
If all was successful, you should see the following files when you run ls:
Makefile grades.c grades1.txt readfile.c
QUESTIONNAIRE grades0.txt grades2.txt readfile.h
If this didn't work, or for more detailed instructions, see the
Using Git page.
As you and your partner work on your joint solution, you will want to
push and pull changes from the master into your local repos frequently.
C Pointers
Educators often want statistical analysis of a set of exam scores.
These types of analyses could be done for an exam from a single class
or an exam from an entire school district.
A useful tool would be a program
that computes statistical results for any size data set (i.e. it
would work for ten data values or for thousands without re-compilation).
You will implement the program started in grades.c
that takes a single command line argument, which is the name of a file of
grade values (floats, one per line), and computes and prints out a set of
statistics about the data values and prints out a histogram of the
grade distribution.
The starting point code comes with three input files that you can use to
test your solution.
This program needs the readfile library and the math library. We have given
you a Makefile that links in these libraries, so you should use the Makefile
to compile. By reading the Makefile, you can see how the executable
is built from a .c file and a .o file. The -lm flag tells gcc to link
in the math library.
grades.c has the starting point code for your program.
It contains a prototype for the getvalues function
that you need to implement, and it has some code in main that copies
the filename given at the command line into a string local variable.
Your program should do the following:
- Make a call to getvalues, passing in the name of the
file containing the data values, and passing in two values by reference:
the address of an int variable to store the size of the array (number
of values read in); and the address of an int variable to store the total
capacity of the array (number of values allocated).
getvalues returns an array of float values initialized to the values
read in from the file, or returns NULL on error (like if malloc fails or
if the file cannot be opened). It dynamically allocates the array
it returns and uses a doubling re-allocation algorithm as it needs
more space (see the Requirements section for
details about this algorithm).
- It then computes the max and min grades, the mean (average),
the median (the middle value), and the standard deviation of the set
of grade values. It prints out the total number of grades and each
of these computed statistics.
- It creates a histogram from the set of grade values, and prints
out the grade histogram.
A note on histogramming values: Each histogram bucket
counts the number of exam scores in a particular range: 0-9, 10-19, 20-29, etc.
Exam grades that have a fractional component (e.g. 89.75) should be counted
as being in a histogram bucket range based on their whole number part only
(e.g. 89.75 is a grade in the 80's not a grade in the 90's).
- It prints out information about the amount of unused capacity in
the array storing the grade values.
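The bucket rule from the histogram note can be sketched as follows. This is a minimal illustration, not the required design: count_buckets and NUM_BUCKETS are hypothetical names, and using 11 buckets (so that a grade of exactly 100 gets its own bucket) is an assumption. Casting a float to int drops the fractional part, and integer division by 10 then gives the bucket index.

```c
#define NUM_BUCKETS 11  /* assumption: 0-9, 10-19, ..., 90-99, and 100 */

/* Hypothetical helper: counts how many grades fall in each range of 10.
 * The cast to int truncates the fraction, so 89.75 -> 89 -> bucket 8. */
void count_buckets(const float *grades, int n, int counts[NUM_BUCKETS]) {
    for (int i = 0; i < NUM_BUCKETS; i++) {
        counts[i] = 0;
    }
    for (int i = 0; i < n; i++) {
        int bucket = (int)grades[i] / 10;
        /* skip anything outside 0.0-100.0 rather than index out of bounds */
        if (bucket >= 0 && bucket < NUM_BUCKETS) {
            counts[bucket]++;
        }
    }
}
```

With this rule, 89.75 and 80.0 both land in bucket 8, while 90.0 lands in bucket 9.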
Statistic Functions
The statistics you need to compute on the set of values are the following:
- mean: the average of the set of values. For example, if
the set is: 5, 6, 4, 2, 7, the mean is 4.8 (24.0/5).
- median: the middle value in the set of values. For example,
if the set is: 5, 6, 4, 2, 7, the median value is 5 (2 and 4 are smaller
and 6 and 7 are larger). If the number of values is even, just use
the value at index num/2 (counting from 0, in sorted order) as the median.
- stddev: is given by the following formula:
Where N is the number of data values, Xi is the ith data value, and
X-hat is the mean (average) value.
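As a check on the mean and median definitions above, here is a small sketch in C. It is one possible approach, not the one you must use: compute_mean, compute_median and cmp_float are hypothetical names, the median version sorts a copy of the data with the standard library's qsort, and it assumes "the num/2 value" means index num/2 of the values in sorted order. Both functions assume n is at least 1.

```c
#include <stdlib.h>

/* Hypothetical helper: arithmetic mean, accumulated in a double. */
double compute_mean(const float *vals, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += vals[i];
    }
    return sum / n;
}

/* Comparison function for qsort: orders floats from small to large. */
static int cmp_float(const void *a, const void *b) {
    float x = *(const float *)a;
    float y = *(const float *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts a copy so the caller's array is unchanged,
 * then returns the value at index n/2. Returns -1.0 if malloc fails. */
float compute_median(const float *vals, int n) {
    float *copy = malloc((size_t)n * sizeof(float));
    if (copy == NULL) {
        return -1.0f;
    }
    for (int i = 0; i < n; i++) {
        copy[i] = vals[i];
    }
    qsort(copy, (size_t)n, sizeof(float), cmp_float);
    float middle = copy[n / 2];
    free(copy);
    return middle;
}
```

For the example set 5, 6, 4, 2, 7 these give a mean of 4.8 and a median of 5, matching the values above.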
Sample Output
Here is sample output
from a working program run on the three input files that were
included with the starting point code. You should also test
your program on other input files that you create.
Requirements
- getvalues: The array of values must be dynamically allocated
on the heap by calling malloc. You should start by allocating
an array of 20 float values. If you run out of capacity as you are
reading in values:
- Call malloc to allocate space for a new array that is
twice the size of the current full one.
- Copy values from the old full array to the new array (and make
the new array the current one).
- Free the space allocated by the old array by calling free.
When all of the data values have been read in from the file, the function
should return the filled or partially filled array to the caller
(the function's return type is float *).
The size and capacity of the array should be "passed" back to the caller
via the pointer parameters that are used for pass-by-reference values.
NOTE: there are other ways to do this
type of alloc and re-alloc in C. However, this is the way I want you to
do it for this assignment: make sure you start out with a dynamically
allocated array of 20 floats, then each time it fills up, allocate a
new array of twice the current size, copy values from the old to new,
and free the old.
- Your program must have at least four function-worthy functions, be
well commented, and use meaningful variable names.
- All TODO comments in the starting point code should be removed in
the code you submit (these are my comments to you, not your comments
describing what your code does).
- You may assume that all grade values are between 0.0 and 100.0.
Grade values such as 67.5 or 70.25 are valid (don't assume whole number
grades only).
- The histogram counts the number of grades within each range. You may
assume range widths of 10 values: 0-9, 10-19, etc.
- You may use a statically declared array for the histogram, or you
can dynamically allocate it if you'd like. If you use a statically declared array,
you should use #define to define a constant for its length.
- Your program must be free of valgrind errors.
- If a function returns a value, the calling code should get the return value
and do something with it. For example, if a function returns an "error
or success" return value, the calling code should check for and handle
error return values.
- Your program should use good design, including being well commented and
having no wrapped lines. (See my C Style Guide)
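The doubling re-allocation described in the getvalues requirement can be sketched as below. To keep the example self-contained it does not use the readfile library: collect_values is a hypothetical stand-in that takes its input from an in-memory array src instead of a file, and START_CAPACITY is an assumed constant name. The parts that carry over are the malloc of 20 floats, the malloc/copy/free step when the array is full, and the two int pointer parameters used to pass size and capacity back to the caller.

```c
#include <stdlib.h>

#define START_CAPACITY 20

/* Hypothetical stand-in for getvalues: copies n values from src into a
 * heap array that doubles when full. Stores the number of values and the
 * allocated capacity through the pointer parameters.
 * Returns the array, or NULL if malloc fails. The caller must free it. */
float *collect_values(const float *src, int n, int *size, int *capacity) {
    int cap = START_CAPACITY;
    int count = 0;
    float *vals = malloc(cap * sizeof(float));
    if (vals == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        if (count == cap) {
            /* full: allocate twice the space, copy, free the old array */
            float *bigger = malloc(2 * cap * sizeof(float));
            if (bigger == NULL) {
                free(vals);
                return NULL;
            }
            for (int j = 0; j < count; j++) {
                bigger[j] = vals[j];
            }
            free(vals);
            vals = bigger;
            cap = 2 * cap;
        }
        vals[count] = src[i];
        count++;
    }
    *size = count;       /* "pass back" through the pointer parameters */
    *capacity = cap;
    return vals;
}
```

A caller would declare two int variables, pass their addresses (for example &size and &capacity), check the return value against NULL, and call free on the array when done. With 45 input values the capacity grows from 20 to 40 to 80, leaving 35 unused slots.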
Hints, Tips and Resources
- Try writing getvalues to work without the re-allocation and
copying part first (for fewer than 20 values). Once that works, then
go back and get it to work for larger input files that require
mallocing up new heap space, copying the old values to the new larger space,
and freeing up the old space.
- You may have written code for lab 2 that would be useful, or useful
as a reference, for this lab assignment.
- Use double variables to store and compute the mean and the square root.
- The C math library (in math.h), has a function to compute the square root:
double sqrt(double val);
- We haven't given you the full algorithm for finding the median. Devise your own
algorithm on paper before attempting to implement it.
- See the lab 2 assignment and the readfile.h comments
for information about using the readfile library.
- Take a look at the weekly lab code and in-class exercises to
remind yourself about malloc, free, pass-by-reference, pointer variables,
dereferencing pointer variables, and dynamically allocated arrays.
- Make use of my C programming resources and links for C references (C pointer
references in particular), and the C Style Guide for tips on good
commenting, avoiding line wrapping, and other programming style tips.
- Printing out the histogram does not require using C string variables, nor
does it require using the C string library functions to manipulate C strings.
Think about what you need to do for each bucket
in the histogram: print out the range it represents and then print out a star
some number of times corresponding to the value stored in that bucket. Remember
printf won't print a newline character unless you explicitly include
it (\n) in the format string.
- Make use of valgrind and gdb in debugging your program. Also,
look at the Wednesday in-lab exercises for some examples of how to use these tools.
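The histogram-printing hint above can be illustrated with a short sketch that uses only printf and two loops, with no C string variables. The function name print_histogram, the label layout, and the returned star count (included only so the result is easy to check) are assumptions for illustration. Your output should follow the sample output's format, which may differ.

```c
#include <stdio.h>

/* Hypothetical helper: prints one row per bucket, with the range label
 * followed by one star per grade counted in that bucket.
 * Returns the total number of stars printed. */
int print_histogram(const int *counts, int num_buckets) {
    int stars = 0;
    for (int b = 0; b < num_buckets; b++) {
        printf("%3d-%3d: ", b * 10, b * 10 + 9);
        for (int s = 0; s < counts[b]; s++) {
            printf("*");
            stars++;
        }
        printf("\n");   /* end the row explicitly */
    }
    return stars;
}
```

Note that the newline is printed once per bucket, after the inner loop, so each range gets its own line.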
Extra Challenge
The extra challenge is NOT a required part of the lab assignment.
DO NOT try this until you have finished all other parts of this lab.
Extra challenges are just a way to try something
a little more advanced for fun. We will award a nominal amount of
extra credit to your lab score for successful completion of
these (and it is all or nothing grading), but they will have
little to no impact on your lab grade. It is much, much better to
finish the required parts completely and to finish them well, than
to try the extra challenge but not complete the required parts.
The Extra Challenge
Lab Questionnaire
Every lab assignment comes with a file named QUESTIONNAIRE. In this file you will answer some questions about the lab assignment. Fill it out and submit it with your lab solution.
Submit
Before the Due Date
Only one of you or your partner needs to push your solution from
your local repo to the GitHub remote repo.
(It doesn't hurt if you both push, but the last pushed version before the
due date is the one we will grade, so be careful that you are pushing
the version you want to submit for grading.)
From one of your local repos (in your
~you/cs31/labs/Lab04-partner1-partner2 subdirectory):
git push
Troubleshooting
If git push fails, then there are likely local changes you haven't committed.
Commit those first, then try pushing again:
git add grades.c
git add QUESTIONNAIRE
git commit
git push
Another likely source of a failed push is that your partner pushed, and you
have not pulled their changes. Do a git pull. Compile and test that your
code still works. Then you can add, commit, and push.
If that doesn't work, take a look at the "Troubleshooting" section of the
Using Git page. You may need to pull and merge some changes from master into your
local. If so, this indicates that your partner pushed changes that you have
not yet merged into your local. Anytime you pull into your local, you need
to check that the result is that your code still compiles and runs before
submitting.