CS31 Lab 4: C Pointers and IA32

All parts due before 11:59pm Thursday, Oct 9

This lab should be done with a partner of your choosing.

The setup procedure for this lab will be similar to Labs 2 and 3. First, both you and your partner should run setup31 to grab the starting point code for this assignment. Suppose users molly and tejas which to work together. Molly (mdanner1) can start by running

  [~]$ setup31 labs/04 tdanner1
Once the script finishes, Tejas (tdanner1) should run
  [~]$ setup31 labs/04 mdanner1

For the next step only one partner should copy over the starting code

  [~]$ cd ~/cs31/labs/04
  [04]$ cp -r ~lammert/public/cs31/labs/04/* ./
  [04]$ ls
  Makefile loop.s mainloop.c readfile.c small.txt swap.s
  large.txt loop_C_goto_version mainswap.c readfile.h stats.c
Now push the changes to your partner
[04]$ git add *
[04]$ git commit -m "lab 4 start"
[04]$ git push
Your partner can now pull the changes. In this case if Tejas wishes to get files Molly pushed, he would run
[~]$ cd ~/cs31/labs/04
[04]$ git pull


Lab 4 Goals:


Part 1. C Pointers and Memory Allocation
Experimental scientists collect data from experiments and often want to compute some simple statistical analyses over the set of experimental data. A useful tool would be a program that could compute these statistical results for any size data set (i.e. it would work for 10 data values or for 10 million without re-compilation).

For this program you will implement the program started in stats.c that takes a single command line argument, which is the name of a file of data values (floats, one per line), and computes and prints out a set of statistics about the data values.

The starting point code comes with two input files that you can use to test your solution:

./stats small.txt
Results:
--------
num values:         16
      mean:      1.812
    median:      1.500
   std dev:      1.031
unused array capacity: 4

./stats large.txt
Results:
--------
num values:         94
      mean:      1.161
    median:      1.000
   std dev:      0.788
unused array capacity: 66
This program includes the readfile library code that it links in as well as linking in the math library: use the makefile to compile. You can see how the executable is built from a .c, a .o, and explicitly linking in the math library (-lm), by reading the Makefile.

In stats.c is the starting point into which you will put your code. It contains the prototype for the getvalues function that you need to implement, and has some code to get the filename command line argument.

You should structure your program in the following way:

  1. Make a call to getvalues, passing in the filename of the file containing the data values, and passing in two values by reference: the address of an int variable to store the size of the array (number of values read in); and the address of an int variable to store the total array capacity.

    Note: The function will return an array of float values initialized to the values read in from the file, or NULL on error (like malloc fails or the file cannot be opened).

  2. Compute the mean (average), the median (the middle value) and the standard deviation of the set of values and print them out.

    Note: You will likely need to sort the values prior to computing the median.

  3. Print out the results, plus information about the number of values in the data set and the amount of unused capacity in the array storing the values.

Statistic Functions

The statistics you need to compute on the set of values are the following:
  1. mean: the average of the set of values. For example, if the set is: 5, 6, 4, 2, 7, the mean is 4.8 (24.0/5).
  2. median: the middle value in the set of values. For example, if the set is: 5, 6, 4, 2, 7, the median value is 5 (2 and 4 are smaller and 6 and 7 are larger).
  3. stddev: is given by the following formula:

    $$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^N(x_i - \bar{x})^2}$$
    Where $N$ is the number of data values, $x_i$ is the $i$th data value, and $\bar{x}$ is the mean value of the $N$ input values.

Feel free to discuss the math on Piazza or in person with your classmates. While we're using it to ensure that your pointers are working correctly, the math itself is not the focus of this assignment...

Requirements

Hints & Tips

  • Try getting getvalues to work without the re-allocation and copying part first (for fewer than 20 values). Once that works, then go back and get it to work for larger numbers of input values that require mallocing up new heap space, copying the old values to the new larger space, and freeing up the old space.

  • Use doubles to store and compute the mean and the square root.

  • The C math library (in math.h), has functions to compute the square root:
    double sqrt(double val);

  • Your program must be free of valgrind errors.

  • See Lab 2 for documentation about using the readfile library (and also look at the readfile.h comments for how to use its functions).

  • Take a look at the weekly lab code and in-class exercises to remind yourself about malloc, free, pass-by-reference, pointer variables, dereferencing pointer variables, and dynamically allocated arrays.

  • Make use of my C programming resources and links for C references (C pointer references in particular), and the C Style Guide for tips on good commenting, avoiding line wrapping, and other programming style tips.

    Part 2. Writing a Loop in IA32
    In the file named loop.s finish the implementation of the following function in IA32:
    // this function computes the sum of the first x positive int values
    //   x: the top of the range of values to sum
    //   returns: the sum of the values 0 to x inclusive
    int loop(int x) {
      int res, i;
      res = 0;
      for(i=1; i <= x; i++) {
        res = res + i;
      }
      return res;
    }
    
    In mainloop.c is a main program that makes a call to this function. If you run make, you can build an executable, mainloop, that links in your loop.s solution. Use this to test your solution for correctness.

    Hints and Requirements

    • In the file loop_C_goto_version you should write your C goto translation of the loop function above (you do not need to compile this, I just want to set your application of Step 1 in converting a C for loop to IA32).
    • The basic stack set-up and return code is provided for you in loop.s, including space for the two local variables res and i.
    • The parameter value x is in the stack in the caller's stack frame. It can be accessed relative to the %ebp pointer:
      esp ---->    # loop's stack frame:
      
      
        ebp-0x8    # space for one local variable
        ebp-0x4    # space for one local variable
      ebp ---->
                   # main's stack frame:
        ebp+0x8:   # space for the parameter (x)
      
    • The return value must be stored in register %eax right before the leave, ret instruction sequence.
    • Comment your IA32 instructions with what they are doing in terms of the C code. As an example:
      movl 0x8(%ebp), %eax   # load x into R[%eax] 
      addl $1, %eax          #  x + 1
      
      Help:
    • Draw a picture of the stack, draw where parameter and argument values are and trace through the instruction execution to help you determine what IA32 instructions you need to use to implement loop.
    • Try implementing and testing incrementally. For example, first see if you can add code to return the value of the parameter. Next, try returning some function of the parameter (like x+10).
    • You can use gdb (or ddd) to debug your solution. Set a breakpoint in loop, and use disass, ni, and print $reg to see values as you go. If you want to see the value referred to by an address, you need to tell gdb what type the address points to. For example, if you want to see an int value at address 0x1234, do:
      # re-cast address as an int pointer (int *), then dereference the pointer *
      (gdb) *(int *)(0x1234)
      
    • look in mainloop.c for the call to your loop function to help you see how it is called.


    Part 3. Writing a Swap Function in IA32
    In the file named swap.s finish the implementation of the a swap function that swaps two int values:
    void swap(int *x, int *y);
    
    This function takes two ints passed by reference (the parameters point to the storage location of their argument variables). After the call, the two int variables passed to swap should have their values swapped. A call to swap would look like (see mainswap.c for another example):
    int a, b;
    a = 10;
    b = 8;
    printf("%d %d\n", a, b);  // prints: 10 8 
    swap(&a, &b);  // swap the values stored in a and b
    printf("%d %d\n", a, b);  // prints: 8 10 
    
    The swap.s file already contains some IA32 instructions that set up up and tear down the stack frame associated with the swap function. You will add IA32 instructions to the body of this IA32 function to perform swapping operation.

    In mainswap.c is code to test your swap implementation. Run make to build an executable, mainswap, that links in your loop.s solution and calls it. Use this to test your solution for correctness.

    Requirements

    • As its name implies, your swap function should swap values stored in the given input pointers.
    • Your code should access the function's parameter values of x and y on the stack in the caller's stack frame. They can be accessed relative to the %ebp pointer:
      esp ---->    # swap's stack frame:
      
      ebp ---->
                   # main's stack frame:
        ebp+0x8:   first parameter value
        ebp+0xc:   second parameter value
      
    • Briefly comment your IA32 instructions with what they are doing in terms of the C code. As an example:
      movl 0x8(%ebp), %eax   # load x into R[%eax] 
      addl $1, %eax          #  x + 1
      

    Hints & Tips

    • Start by figuring out what the C code version of swap would look like. Draw a picture of the stack, draw where parameter and argument values are and trace through the instruction execution to help you determine what IA32 instructions you need to use to implement swap.
    • Implement and test incrementally. For example, first see if you can set the value pointed to by x to the value pointed to by y. Once that works, then see if you can complete the function.
    • You can use gdb (or ddd) to debug your solution. Set a breakpoint in swap, and use disass, ni, and print $reg to see values as you go. If you want to see the value referred to by an address, you need to tell gdb what type the address points to. For example, if you want to see an int value at address 0x1234, do:
      # re-cast address as an int pointer (int *), 
      then dereference the pointer * (gdb) *(int *)(0x1234)
      

    Handy References
    Submit

    To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in the labs/04 directory. Only one partner needs to run the final push, but make sure both partners have pulled and merged each others changes. See the section on using a shared repo on the git help page.