This lab has two parts:
Checkpoint: Friday, December 2 push to git repo before 11:59pm.
Complete lab: Due by 11:59 pm, Thursday, December 8, 2022
By the checkpoint deadline, your program should implement functionality up to and including step 3(b) in Section 7.2.
Check the Lab 9 Partners link for your lab partner.
1. Lab Goals
-
Parallelize a shared-memory application using the
pthreads
library. -
Protect threaded code with synchronization primitives.
-
Revisit 2D arrays and file I/O in C.
-
Gain more expertise with C pointers and dynamic memory allocation.
-
Design input files to test the correctness of your solution,
-
Practice debugging threaded programs.
2. Lab Overview
This lab is designed to give you some practice writing and debugging multi-threaded pthreads programs, using synchronization primitives, and to give you some experience designing and running some parallel performance experiments.
You’ll implement a parallel version of Conway’s Game of Life using your Lab 6 sequential solution as a starting point for this lab. See the Lab 6 assignment to remind yourself about the sequential version of the program and the rules of the game.
Your Lab 9 solution will use the same method as Lab 6 for initializing a dynamically allocated board from an input file given as a command line argument. In addition to supporting the same command line options as Lab 6, this version of GOL will add three new command line options related to the parallelization. These command line options are outlined in detail below.
You will begin Lab 9 by first fixing any errors in your Lab 6 sequential solution.
Don’t worry about fixing the torus parts (wrapping at the edge cells) if you didn’t
have that working in Lab 6. As long as your Lab 6 gol
implementation for all the middle (non-edge) cells works, that’s enough to
move on to starting to parallelize your gol
implementation. And then changing
it to get ready for parallelization.
During the lab meeting time, we’ll also help you analyze your Lab 6 solution for any design choices that may make parallelization difficult, so that you can fix those before parallelizing.
After reading through the lab details and requirements, see Section 7 for tips on how to get started.
3. Lab Starting Point Code
3.1. Getting Your Lab Repo
Both you and your partner should clone your Lab repo into
your cs31/labs
subdirectory:
-
get your Lab ssh-URL from the CS31 git org. The repository to clone is named Lab9-userID1-userID2, where user1 and user2 are the user names of you and your lab partner.
-
cd into your
cs31/labs
subdirectory:$ cd ~/cs31/labs $ pwd
-
clone your repo
$ git clone [your Lab9-userID1-userID2 url] $ cd Lab9-userID1-userID2 $ ls
3.2. Starting Point Files
The files included in your repo are:
-
Makefile
: builds gol executable file -
colors.h
: contains an array of some different color3 values for use with ParaVis (you can use this or not in your solution) -
*.txt
: one or more example input files. You can copy more test files from your Lab 6 repo into this one, and you should create new input files too (particularly big ones to test the performance of your parallel solution). -
design_worksheet
: start your lab assignment by filling in this worksheet with the design of your parallelization of gol. You should bring this filled out worksheet with you to the ninja session, and to office hours and other times you are seeking help on your lab assignment. To print a copy of this worksheet to a CS lab printer, use thelpr
command:lpr design_worksheet
. See the "Getting Started" tips in Section 7.2 for information about how and when to use this worksheet.
One thing to note, is that there is no gol.c
file in the starting point code.
You’ll add this in from the Lab 6 solution that you or your partner
previously built.
3.3. Add gol.c to your repo and modify it
You need to copy over your gol.c
from the sequential game of life lab into
this repo as the starting point for this lab:
$ cd ~/cs31/labs/Lab6-userID1-userID2/
$ git pull
$ cd ~/cs31/labs/Lab9-userID1-userID2/
$ cp ~/cs31/labs/Lab6-userID1-userID2/gol.c .
$ git add gol.c
3.3.1. modifications to sequential gol.c before parallelizing
-
The first thing to do is to make sure that your gol.c file has the following includes (you may have already added
colors.h
in Lab 6):#include <pthreadGridVisi.h> #include <stdlib.h> #include <stdio.h> #include <unistd.h> #include <sys/time.h> #include <time.h> #include <string.h> #include <pthread.h> #include "colors.h"
-
Next, you will need to change your
play_gol
function prototype to match that of the thread function argument passed topthread_create
:void *play_gol(void *args)
This means you will have to modify it to return a value (returning
NULL
is fine), and you need to explicitly re-cast thevoid *
parameter to astruct gol_data *
inside this function in order to access field values of the passed argument. I recommend doing this by changing your function parameter name to a name not used inside your function (args
for example) and then adding a local variable with your old parameter name (data
for example) that you initialize to the re-castvoid *
parameter. This way your function code that refers todata
will work un-changed. For example:// change sequential play_gol function prototype from this: void play_gol(struct gol_data *data); // to this: void *play_gol(void *args); // be sure to change the return type too! // change name of parameter in function definition and local variable // of old paramter name and types that is assigned re-cast void* param void *play_gol(void *args) { struct gol_data *data; // add local variable of the type we know is passed in ... data = (struct gol_data *)args; // initialize it to the re-cast args value // function code can use data to reference fields in passed struct ... return NULL; // this function now needs to return some value!! }
-
For the
OUTPUT_VISI
mode in the sequential version, we hid a call topthread_create
that was needed by the ParaVis library. This code was hidden in a wrapper function namedconnect_animation
that was called frommain
. Now that your code is going to create its own threads, you will not be able to useconnect_animation
in this lab.As part of completing the lab, you will call
pthread_create
on your own in a loop, like we did in class and the in-lab example code. Those calls topthread_create
will do the same job asconnect_animation
, so you no longer need that function in your program.In other words:
// this line that currently exists main connect_animation(play_gol, &data); // will eventually be replaced by something like this, once you get // around to creating threads yourself: for (...) { pthread_create(..., NULL, play_gol, ...); // some details for you to figure out }
At this point in the lab, we’re not ready to make the changes we need to make the VISI code work seemlessly, so for now, just comment out the calls to
connect_animation
andrun_animation
. This means your VISI mode is broken, but we will talk about how restore that functionality when we get to Section 4.7.
After making these changes, compile and run your program, and test that this change to your sequential version of gol still works as before (with the exception of VISI mode being broken). |
4. Lab Details
4.1. Command line arguments
Your program will support the same command line arguments as the Lab 6 version, and will add three new command line arguments:
-
An integer to specify the number of threads (the degree of parallelism).
-
A 0/1 flag to specify how to parallelize the GOL program (0: row-wise grid cell allocation, 1: column-wise grid cell allocation). More details below.
-
A 0/1 flag to specify if the per-thread board allocation should be printed out or not at start up (0:don’t print configuration information, 1: print allocation information).
$ ./gol
usage: ./gol infile.txt output_mode[0,1,2] num_threads partition[0,1] print_partition[0,1]
Here are some example command lines:
# run with config values read from file1.txt, run in run mode OUTPUT_NONE (0)
# create 8 threads, use row-wise partitioning, print per-thread partitioning details:
$ ./gol file1.txt 0 8 0 1
# run with config file file2.txt, run in run mode OUTPUT_ASCII (1),
# create 15 threads, use column-wise partitioning, no partitioning details printed:
$ ./gol file2.txt 1 15 1 0
The print per-thread board allocation command line option will help you debug the grid cell allocation scheme to ensure that you are correctly allocating rows or columns across the worker threads. We suggest you try running with print partitioning when using run mode 0 (OUTPUT_NONE), as OUTPUT_ASCII will clear the terminal too quickly to view the partitioning information. See more details below about partitioning and the print partitioning option.
Your program should handle badly formed command lines and check for bad values entered (e.g., a negative number of threads). It should print out an error message and exit for badly formed command lines and handle most bad input values similarly.
4.2. Structure of parallel game of life
-
The main thread should begin by initializing the game board by reading the input file in the same way as the sequential version. The input files format has not changed. Your timing code should not include this step but should include all other parts of the execution.
-
After initializing the game board, the main thread spawns off worker threads that will play multiple rounds of the game, with each thread computing just its portion of cells of the new board each round (based on the partitioning information passed as a command line option). Grid cells are allocated to threads using either a row-wise or column-wise partitioning specified by a command line argument. If a command line is entered with more threads then the number of rows/columns to partition by, your program can print out an error message and exit. In other words, you don’t need to handle a case in which there are threads that have no rows/columns assigned to them. The total number of live cells each round should be calculated in parallel with each thread contributing to its portion of live cells to the total value. Note: If the print_partitioning option is specified, threads should print out their partition information before running rounds of gol.
-
If running in OUTPUT_ASCII mode, you should designate one thread to be the printing thread, and only that thread should call
print_board
to print the entire board after each round. Having threads print out just their part of the board will result in crazy interleaved output: a race condition! Synchronizing parallel board printing so that the board is printed in order is beyond what you need to support, and it would be less efficient than if just a single thread prints the entire board. -
If running in OUTPUT_VISI mode, each thread should update its part of the color3 image buffer in parallel and then make a call to
draw_ready
when it is done updating its part. -
At the end of the simulation’s execution, the main thread should print out the elapsed time (if run in mode OUTPUT_NONE and OUTPUT_ASCII). You will not be graded on absolute performance, but your code should run faster when run in parallel. As a sanity check, try running a large board instance (e.g., 1500x1500) for many rounds (e.g., 100+). You should see a substantial improvement between running with one thread and a handful of threads (2-8). It’s that sort of relative improvement that we’ll be looking for. Note: for comparison timing, run in mode OUTPUT_NONE, and make sure it does not include any extra synchronization or calls to
usleep
that may be part of the other run modes.
4.3. Synchronization
You will need to think about where and how to synchronize thread
actions to remove race conditions. For example, you need to ensure
that no thread starts the next round of play until all threads are
done with the current round, otherwise threads will not read
consistent values for the given round. If printing is enabled, you
need to ensure that the printing thread prints a consistent version of
the game board. If run in OUTPUT_ASCII
mode, you need to ensure
that the printing thread prints a consistent version of the game
board. You will also need to ensure that you don’t have a race
condition when updating the global live cell counter. Any
synchronization steps that are only necessary for a specific command
line option should not be included in runs that do not specify that
option.
4.4. Partitioning
You will implement two different ways of parallelizing the computation: one
partitions the board’s rows across threads, the other partitions the board’s
columns across threads. As an example, suppose that the board is 8x8, and that
there are 4 threads. Here is how the threads (with logical tids
0-3) would be
assigned to the grid using row and column partitioning, respectively:
row partitioning (option 0) column partitioning (option 1)
---------------- ----------------
0 0 0 0 0 0 0 0 0 0 1 1 2 2 3 3
0 0 0 0 0 0 0 0 0 0 1 1 2 2 3 3
1 1 1 1 1 1 1 1 0 0 1 1 2 2 3 3
1 1 1 1 1 1 1 1 0 0 1 1 2 2 3 3
2 2 2 2 2 2 2 2 0 0 1 1 2 2 3 3
2 2 2 2 2 2 2 2 0 0 1 1 2 2 3 3
3 3 3 3 3 3 3 3 0 0 1 1 2 2 3 3
3 3 3 3 3 3 3 3 0 0 1 1 2 2 3 3
When the number of threads does not evenly divide the dimension by which you
are partitioning, divide the rows (or columns) up so that there is at most a
difference of 1 assigned row (or column) between any two threads. For example,
for an 8x7 board and 3 threads, you would partition rows and columns like this
among the 3 threads (with tids
0-2):
row partitioning (option 0) column partitioning (option 1)
---------------- ----------------
0 0 0 0 0 0 0 0 0 0 1 1 2 2
0 0 0 0 0 0 0 0 0 0 1 1 2 2
0 0 0 0 0 0 0 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
2 2 2 2 2 2 2 0 0 0 1 1 2 2
2 2 2 2 2 2 2 0 0 0 1 1 2 2
In this partitioning scheme, a single row (or column) is assigned to exactly one thread; a single row (or column) is never split between two or more threads. Thus, in the above example threads 0 and 1 have one more row each of work to do than thread 2 in the row-wise partitioning, and thread 0 has one more column of work to do than threads 1 and 2 in the column-wise partitioning. Note how the imbalance of rows/columns is assigned to threads in this example. Your solution should assign them in the same way: every thread should either be assigned the same number of rows or columns or differ by at most one.
4.5. Printing Partitioning Information
When the 5th command line option is non-zero, your program should print thread partitioning information. Each thread should print:
-
Its thread id (the logical id number that you assign, not the
pthread_self
value). -
Its start and end row index values.
-
The total number of rows it is allocated.
-
Its start and end column index values.
-
The total number of columns it is allocated.
The threads will run in different orders, so you may see each thread’s output
in any order, which is fine. If you’d like, you can try to avoid some
interleaving of individual threads output by calling fflush
after printf
:
printf("tid %d my values ...\n", mytid, ...);
fflush(stdout); // force the printf output to be printed to the terminal
4.5.1. Example Partitioning Output
Here are some example runs with different numbers of threads partitioning a
100x100 grid (I’m using a config file I created named b100.txt
that defines
a 100x100 grid). The total number of rows and columns per thread are shown
in parentheses. Your output does not need to be identical to this, but it must
include all the required parts.
# 9 threads, column-wise partitioning
$ ./gol b100.txt 0 9 1 1
tid 0: rows: 0:99 (100) cols: 0:11 (12)
tid 1: rows: 0:99 (100) cols: 12:22 (11)
tid 2: rows: 0:99 (100) cols: 23:33 (11)
tid 4: rows: 0:99 (100) cols: 45:55 (11)
tid 3: rows: 0:99 (100) cols: 34:44 (11)
tid 5: rows: 0:99 (100) cols: 56:66 (11)
tid 6: rows: 0:99 (100) cols: 67:77 (11)
tid 7: rows: 0:99 (100) cols: 78:88 (11)
tid 8: rows: 0:99 (100) cols: 89:99 (11)
# 6 threads, row-wise partitioning
$ ./gol b100.txt 0 6 0 1
tid 0: rows: 0:16 (17) cols: 0:99 (100)
tid 1: rows: 17:33 (17) cols: 0:99 (100)
tid 3: rows: 51:67 (17) cols: 0:99 (100)
tid 2: rows: 34:50 (17) cols: 0:99 (100)
tid 4: rows: 68:83 (16) cols: 0:99 (100)
tid 5: rows: 84:99 (16) cols: 0:99 (100)
# 17 threads, row-wise partitioning
$ ./gol b100.txt 0 17 0 1
tid 0: rows: 0: 5 (6) cols: 0:99 (100)
tid 1: rows: 6:11 (6) cols: 0:99 (100)
tid 3: rows: 18:23 (6) cols: 0:99 (100)
tid 2: rows: 12:17 (6) cols: 0:99 (100)
tid 4: rows: 24:29 (6) cols: 0:99 (100)
tid 5: rows: 30:35 (6) cols: 0:99 (100)
tid 6: rows: 36:41 (6) cols: 0:99 (100)
tid 7: rows: 42:47 (6) cols: 0:99 (100)
tid 10: rows: 60:65 (6) cols: 0:99 (100)
tid 11: rows: 66:71 (6) cols: 0:99 (100)
tid 12: rows: 72:77 (6) cols: 0:99 (100)
tid 13: rows: 78:83 (6) cols: 0:99 (100)
tid 8: rows: 48:53 (6) cols: 0:99 (100)
tid 15: rows: 90:94 (5) cols: 0:99 (100)
tid 14: rows: 84:89 (6) cols: 0:99 (100)
tid 9: rows: 54:59 (6) cols: 0:99 (100)
tid 16: rows: 95:99 (5) cols: 0:99 (100)
Note that for the run with 17 threads, there are enough threads to see "out of order" execution due to the exact scheduling of threads on the CPUs. This is fine and expected behavior (the exact ordering of thread printing depends on when threads are scheduled to run on cores by the OS CPU scheduler).
4.6. Correctness
In addition to printing out per-thread board partitioning information to verify correct portioning, you should also ensure that your parallel version solves GOL correctly. To do this run both your parallel version and your sequential version on some of the same input files in OUTPUT_ASCII run mode (run mode 1). The results should be identical.
If you see incorrect and/or odd output, either your partitioning of the GOL game board is incorrect, or you are missing synchronization, or both. You can remove the call to system("clear") to see the history of each round to help you debug incorrect GOL computation (of course run for a small number of iterations).
4.7. Changes to ParaVis animation for pthreads
The ParaVis library supports animating pthread programs. You only need to make
a few small changes to get the OUTPUT_VISI
mode to work for the pthreads of
gol. The first few changes are all in main
:
-
The
init_pthread_animation
function takes the number of threads as the first parameter:visi_handle init_pthread_animation(int num_tids, int rows, int cols, char *name);
This number is the number of threads that will be spawned to simulate the parallel game of life and does not include the main thread. Only the main thread should call this function. The sequential version of
gol.c
hid this function from you in thesetup_animation
function. There, we passed 1 toinit_pthread_animation
since only 1 pthread was created to runplay_gol
(which we hid in theconnect_animation
function).In the parallel version, you should not be calling
setup_animation
anymore. Instead, you should callinit_pthread_animation
in main, update the first parameter to be the number of threads your program is running:// Previously in setup_animation, you had 1 thread hardcoded data->handle = init_pthread_animation(1, data->rows, data->cols, visi_name);
// Now, you should call init_pthread_animation in main and update it to be the
// number of threads you have in your program. (The variable names used below
// might not match yours.)
data.handle = init_pthread_animation(num_threads, data.rows, data.cols, visi_name);
+
-
You should move the code that gets the animation buffer as is from
setup_animation
to main. However, just like with theinit_pthread_animation
function above, only the main thread should call this function. Note that you’ll need to adjust the way you’re accessing thedata
struct in main, since it’s declared on main’s stack. -
As discussed in Section 3.3.1, you will replace the call to
connect_animation
with the loop that creates your threads. -
You should leave the line to
run_animation
unchanged, but that function call must be called before join your threads. -
In addition to the changes above that you made in
main
, your sequential code had the following comment inplay_gol
:// if ParaVis animation: // (a) call your function to update the color3 buffer // (b) call draw_ready(data->handle) // (c) call usleep(SLEEP_USECS) to slow down the animation
You need to modify the function that updates the
color3
buffer so that it is multi-threaded. As it’s currently written, this function colors every pixel of the image buffer. Since every thread calls this function, we don’t want every thread coloring every pixel. Instead, we only want each thread to color its subset of the pixels. Update this function so that it uses the partitioning information to only color the pixels that the thread is responsible for. -
In
play_gol
, be sure that every thread updates the color3 buffer as described above and every thread callsdraw_ready
when it is done updating the color3 buffer.
Further documentation about using the ParaVis interface is available as a comment at the top of the pthreadvisi_example.c
program from
Week 12. Also, more detailed documentation is available
in the pthreadGridVisi.h
header file that you can open in an editor:
$ vim /usr/local/include/qtvis/pthreadGridVisi.h
5. Sample Output
Here is output from a run of my gol program showing:
-
Example column-wise and row-wise partitioning output for different numbers of threads is shown in Section 4.5.1. Please follow our output format for your printing partitioning functionality.
Other program ouput should be identical to that from Lab 6.
-
Example runs showing the time improvement from the parallelization is shown below. This output was generated on a lab machine with 12 cores, using a configuration file (
big.txt
) that specified a 4000x4000 board and 500 iterations. The program was run for different numbers of threads (1, 4, 8, 12, and 20). Notice the time improvement with increased parallelization up to the number of cores (from 298 seconds for 1 thread to 50 seconds for 12 threads). Beyond the total number of cores (12, in the case), adding more threads does not help to improve total runtime.$ ./gol big.txt 0 1 0 0 Total time: 297.943 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 4 0 0 Total time: 78.617 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 8 0 0 Total time: 71.223 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 12 0 0 Total time: 50.128 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 20 0 0 Total time: 53.965 seconds After 500 rounds on 4000x4000, number of live cells is 17
These are runs with just the normal Makefile gcc command (there is no optimiztion compiler flag used to build this code). With compiler optimization each run would be much faster, but the trend of improvement from the multithreaded would remain.
6. Lab Requirements
-
Your program must satisfy all the requirements of the sequential gol lab, except that if you did not get the torus to work correctly, you do not need to fix that part (i.e. if your edge cells are not quite in your sequential gol, that is okay for your parallel solution).
-
Your program should handle the new command line arguments in addition to the previous ones.
-
The
total_live
global variable that should be updated each round to the number of live cells. Its value should be computed in parallel each round with each thread updating the portion oftotal_live
value for a round from its portion of the grid. -
The only other global variables you may add are any needed for pthread synchronization. Any values that threads need should be passed to them via the (
void *arg
) parameter. Look at the example pthread programs from Week 12, from in class, and in the textbook for some examples of how to do this. You will likely need to add fields to yourstruct gol_data
type that are needed for threads to perform gol on their part of the grid (e.g. which rows and columns this thread is operating on). -
The control flow in your sequential
main
function doesn’t need much change. You need to parse the additional command line arguments and make calls topthread_create
. If you find you are adding a lot of code tomain
, stop and create a function to do this work and call that function frommain
:main
should not get a lot longer, and its main control flow really should not change. -
For full credit, your solution should be correct, robust, and free of valgrind errors (when run in modes OUTPUT_ASCII or OUTPUT_NONE). "Still reachable" errors are okay.
-
You are NOT responsible for running valgrind in the OUTPUT_VISI mode (the ParaVis visualization mode). "Still reachable" errors are okay.
-
All
TODO
comments in the starting point code should be removed in your final submission. These are notes to you for what you need to add into the starting point code and where you need to add it. -
Use good modular design, and comment your code well.
-
Last lab! Good luck!
7. Getting Started
7.1. Get sequential gol.c ready for parallelizing
-
If you are working with a different partner than you had for Lab 6, start by deciding whose
gol.c
solution you will use for a starting point for this lab. -
After copying your Lab 6
gol.c
into your repo, make the modifications described in Section 3.3.1. Compile and run to make sure it works. -
Then fix any any parts of your Lab 6
gol.c
that will make parallelizing it difficult, and that are memory inefficient. As necessary, we may have included comments as part of your gradesheet that are in your Lab 6 repo. We can also look at your solution in lab on Wednesday and help you figure out changes that you may need to make.
7.2. Get Started on Parallelization
-
Start by reviewing Section 14.2 and 14.3 of the textbook and the in-lab examples.
-
Next, print out the
design_worksheet
from your repo and follow the directions for outlining the design of your solution (see Section 3.2 for printing and directions for how to this). -
After filling out the worksheet with your design. I recommend implementing and testing functionality in this order:
-
Without spawning any threads, write the partitioning function, i.e. figure out which section of the board would be assigned to a thread. Test the function’s implementation first without spawning threads by adding some calls in main to test different sizes and partitioning schemes (and remove these calls before moving on to the next step).
-
Implement the main thread create-join control flow, and test each thread’s row/column partitioning. Spawn the number of threads specified at the command line and have each thread just print its partitioning information (including its
tid
value) inplay_gol
and just return before actually running the remainder of the code inplay_gol
. (Be sure to remove thisreturn
after you’ve tested this and before moving on to the next step). -
Add fields to the
struct gol_data
that you will need for parallelization and add any code to initialize these fields. Some of these fields may be initialized by the main thread before spawning and other fields may be initialized by the threads themselves before they start playing rounds of gol). You can add some debug output to test this. As with the previous step, you should return from theplay_gol
function before executing the rest of the code onplay_gol
to test this step. When you’re confident it’s working, you should remove thereturn
before moving on to the next step. -
Add code so that each thread prints out its own partitioning information only when the final command line argument is non-zero.
-
Write and test the code to compute one round of the GOL simulation, with the work divided among the threads. This will include support for all run modes. Start with testing OUTPUT_ASCII on a small file with just one round. Once that is working, test OUTPUT_VISI to be sure things are working correctly.
-
Compute all rounds of the gol simulation, avoiding race conditions.
-
Count the number of live cells in a multi-threaded way, avoiding race conditions.
-
Check timing for multi-threaded in OUTPUT_NONE mode. You should see performance improvements with multi-threaded vs. sequential for large boards.
-
Make sure all other requirements have been met.
-
8. Tips
-
Look at the Lab 6 page for lots of tips related to sequential GOL that may be useful here too.
-
Implement and test incrementally!
-
Most of your changes will be to the
play_gol
function. You should also add helper functions to implement any new functionality.main
should not get much longer, so if you need to add alot of code to main, or you are adding the same code to multiple parts, functionize it and call it from main. Plus you can independently test this code by adding debugging calls to this function from main! -
You will need to add fields to your
struct gol_data
type that are needed by threads to perform gol on their part of the grid. -
Use gdb as you incrementally implement and test to find and fix bugs as you go. Use valgrind as you go to catch memory access errors as you make them.
-
As you test and debug, you may find it useful to use config files with small numbers of iterations and to comment out the call to system("clear") so you can examine the results of every intermediate iteration.
-
Running in OUTPUT_ASCII is helpful for finding and debugging partitioning errors and other correctness errors.
-
Running in OUTPUT_VISI and OUTPUT_ASCII are helpful for finding and debugging partitioning errors and other correctness errors.
-
ParaVis with colors.h. If you’d like to use some of the colors I have defined in
colors.h
you can set an RGB value in thecolor3
image array to a value incolors
like this:// assume buff is name of the color3 array: buff[buff_i] = colors[3]; // set this pixel to the 3rd color (Yellow)
If you want to use the colors to help with debugging row and column partitioning you can create the rainbow pattern of my examples by having each thread be assigned a color based on its thread id, and have it print its color for 0 (or 1) cell values in its partition. For example, if
data→tid
stores the value of a thread’s logical id, you could do something like this:if(live) { buff[buff_i] = c3_black; // set live cells to black } else { // assumes your thread id is stored in your struct as tid; // modify the line below if you called it something different. buff[buff_i] = colors[((data->tid)%8)]; // dead to my tid color }
Another thing to remember about the ParaVis display is that it displays coordinate (0,0) at the bottom left of the visualization, and coordinate (rows-1, 0) at the top left. This means you will need to perform some calculation on the row index into the color3 array to "flip" the visualization horizontally to match your view of it (so the ascii and the ParaVis display the world in the same way for the same input file).
-
The man pages for the various pthread functions are very helpful.
-
You can use the
perror()
function to print out error messages from failed system and pthread library calls.perror
prints out detailed, human-readable information about the error that occurred. You call it passing in a string with program-specific error message. For example:big_buff = malloc(...); if (big_buff == NULL) { perror("malloc of big buffer failed"); exit(1); }
See the man page for
perror
for more information and for the include files you need to add to your program to use it. -
You will need to use some synchronization in your solution. You may use any of the pthread synchronization primitives, including mutexes and barriers. Remember that before using any of these, you need to initialize them. You also need to make sure that all threads have access to the shared synchronization variables (either pass a reference to them at thread creation or declare them as static global variables). Here are some code snippets for using the barrier synchronization primitive:
// declare a global pthread_barrier_t static pthread_barrier_t barrier; // initialize it in a function (e.g., main) before using it: if(pthread_barrier_init(&barrier, NULL, num_threads)){ perror("pthread_barrier_init"); exit(1); } // threads than can call pthread_barrier_wait to synchronize: ret = pthread_barrier_wait(&barrier); if(ret != 0 && ret != PTHREAD_BARRIER_SERIAL_THREAD) { perror("pthread_barrier_wait"); exit(1); }
9. Submitting
Here are the commands to submit your solution
(from one of you or your partner’s cs31/labs/Lab9-userID1-userID2
directory:
$ git add gol.c *.txt
$ git commit -m "Lab 9 completed"
$ git push
And of course, if you want to submit any nifty starting point world config files you come up with, please do:
$ git add coolworld.txt
$ git commit -m "cool world"
$ git push
If you have difficulty pushing your changes, see the "Troubleshooting" section and "can’t push" sections at the end of the Using Git for CS31 Labs page. And for more information and help with using git, see the git help page.
10. Handy Resources
General Lab Resources
-
Class EdSTEM page for questions and answers about lab assignment
-
man and Manual pages documentation for libraries and commands
C programming Resource
-
C Programming Resources and Links including I/O in C, cmdline args, arrays
-
Week 8 in-lab examples: ParaVis, and other sequential GOL related examples.