This lab has two parts:
Checkpoint: Thursday, December 7 push to git repo before 11:59pm.
Complete lab: Due by 11:59 pm, Thursday, December 14, 2023
By the checkpoint deadline, your program should implement functionality up to and including step 3(d) in Section 7.2.
Check the Lab 9 Partners link for your lab partner.
1. Lab Goals
-
Parallelize a shared-memory application using the
pthreads
library. -
Protect threaded code with synchronization primitives.
-
Revisit 2D arrays and file I/O in C.
-
Gain more expertise with C pointers and dynamic memory allocation.
-
Design input files to test the correctness of your solution.
-
Practice debugging threaded programs.
2. Lab Overview
This lab is designed to give you some practice writing and debugging
multi-threaded pthreads
programs, using synchronization primitives, and to give you some experience designing and running some parallel performance experiments.
You’ll implement a parallel version of Conway’s Game of Life using your Lab 06 sequential solution as a starting point for this lab. See the Lab 06 assignment to remind yourself about the sequential version of the program and the rules of the game.
Your Lab 9 solution will use the same method as Lab 06 for initializing a dynamically allocated board from an input file given as a command line argument. In addition to supporting the same command line options as Lab 06, this version of GOL will add three new command line options related to the parallelization. These command line options are outlined in detail below.
You must begin Lab 9 by first fixing any errors in your Lab 06 sequential solution. You can start on this lab if the only problem with your previous solution is that the wrapping at the edge doesn’t work properly: as long as your Lab 06 implementation for all the middle (non-edge) cells works, that’s enough to move on to starting to parallelize your implementation. You can go back and fix the wrapping at the edge at the end.
During the lab meeting time, we may help you analyze your Lab 06 solution for any design choices that may make parallelization difficult, so that you can fix those before parallelizing.
After reading through the lab details and requirements, see Section 7 for tips on how to get started.
3. Lab Starting Point Code
3.1. Getting Your Lab Repo
Both you and your partner should clone your Lab repo into
your cs31/labs
subdirectory:
-
Get your lab ssh-URL from the CS31 GitHub organization. The repository to clone is named Lab9-userID1-userID2, where user1 and user2 are the user names of you and your lab partner.
-
cd into your
cs31/labs
subdirectory:$ cd ~/cs31/labs $ pwd
-
Clone your repo
$ git clone [your Lab9-userID1-userID2 url] $ cd Lab9-userID1-userID2 $ ls big.txt colors.h design_worksheet.txt Makefile square100.txt testedges.txt
3.2. Starting Point Files
The files included in your repo are:
-
Makefile
: builds thegol
executable file -
colors.h
: contains an array of some differentcolor3
values for use with ParaVis (you can optionally use this in your solution) -
design_worksheet.txt
: a worksheet to help you think through some of design challenges in this lab. -
big.txt
,testedges.txt
,square100.txt
: example input files. You can copy more test files from your Lab 06 repo into this one, and you should create new input files too (particularly big ones to test the performance of your parallel solution).
One thing to note, is that there is no gol.c
file in the starting point code!
You’ll add this in from the Lab 06 solution that you or your partner
previously built.
3.3. Add gol.c to your repo and modify it
You need to copy over your gol.c
from the sequential game of life lab into
this repo as the starting point for this lab:
$ cd ~/cs31/labs/Lab6-userID1-userID2/
$ git pull
$ cd ~/cs31/labs/Lab9-userID1-userID2/
$ cp ~/cs31/labs/Lab6-userID1-userID2/gol.c .
$ git add gol.c
3.3.1. modifications to sequential gol.c
before parallelizing
-
The first thing to do is to make sure that your
gol.c
file has the following includes (you may have already addedcolors.h
in Lab 6):#include <pthreadGridVisi.h> #include <stdlib.h> #include <stdio.h> #include <unistd.h> #include <sys/time.h> #include <time.h> #include <string.h> #include <pthread.h> #include "colors.h"
-
Next, you will need to change your
play_gol
function prototype to match that of the thread function argument passed topthread_create
:void *play_gol(void *args)
This means you will have to modify it to return a value (returning
NULL
is fine), and you need to explicitly re-cast thevoid *
parameter to astruct gol_data *
inside this function in order to access field values of the passed argument. I recommend doing this by changing your function parameter name to a name not used inside your function (args
for example) and then adding a local variable with your old parameter name (data
for example) that you initialize to the re-castvoid *
parameter. This way your function code that refers todata
will work un-changed. For example:// change sequential play_gol function prototype from this: void play_gol(struct gol_data *data); // to this: void *play_gol(void *args); // be sure to change the return type too! // change name of parameter in function definition and local variable // of old paramter name and types that is assigned re-cast void* param void *play_gol(void *args) { struct gol_data *data; // add local variable of the type we know is passed in ... data = (struct gol_data *)args; // initialize it to the re-cast args value // function code can use data to reference fields in passed struct ... return NULL; // this function now needs to return some value!! }
-
For the
OUTPUT_VISI
mode in the sequential version, we hid a call topthread_create
that was needed by the ParaVis library. This code was hidden in a wrapper function namedconnect_animation
that was called frommain
. Now that your code is going to create its own threads, you will not be able to useconnect_animation
in this lab.As part of completing the lab, you will call
pthread_create
on your own in a loop, like we did in class and the in-lab example code. Those calls topthread_create
will do the same job asconnect_animation
, so you no longer need that function in your program.In other words:
// this line that currently exists main connect_animation(play_gol, &data); // will eventually be replaced by something like this, once you get // around to creating threads yourself: for (...) { pthread_create(..., NULL, play_gol, ...); // some details for you to figure out }
At this point in the lab, we’re not ready to make the changes we need to make the VISI code work seemlessly, so for now, just comment out the calls to
connect_animation
andrun_animation
. This means your VISI mode is broken, but we will talk about how restore that functionality when we get to Section 4.7.
After making these changes, compile and run your program, and test that this change to your sequential version of gol still works as before (with the exception of VISI mode being broken). |
4. Lab Details
4.1. Command line arguments
Your program will support the same command line arguments as the Lab 06 version, and will add three new command line arguments:
-
An integer to specify the number of threads (the degree of parallelism).
-
A 0/1 flag to specify how to parallelize the GOL program (0: row-wise grid cell allocation, 1: column-wise grid cell allocation). More details below.
-
A 0/1 flag to specify if the per-thread board allocation should be printed out or not at start up (0:don’t print configuration information, 1: print allocation information).
$ ./gol
usage: ./gol infile.txt output_mode[0,1,2] num_threads partition[0,1] print_partition[0,1]
Here are some example command lines:
# run with config values read from file1.txt, run in run mode OUTPUT_NONE (0)
# create 8 threads, use row-wise partitioning, print per-thread partitioning details:
$ ./gol file1.txt 0 8 0 1
# run with config file file2.txt, run in run mode OUTPUT_ASCII (1),
# create 15 threads, use column-wise partitioning, no partitioning details printed:
$ ./gol file2.txt 1 15 1 0
The print per-thread board allocation command line option will help you debug
the grid cell allocation scheme to ensure that you are correctly allocating
rows or columns across the worker threads. We suggest you try running
with print partitioning when using run mode 0 (OUTPUT_NONE
), as
OUTPUT_ASCII
will clear the terminal too quickly to view the partitioning
information. See more details below about partitioning and the print
partitioning option.
Your program should handle badly formed command lines and check for bad values entered (e.g., a negative number of threads). It should print out an error message and exit for badly formed command lines and handle most bad input values similarly.
4.2. Structure of the parallel game of life
-
The main thread should begin by initializing the game board by reading the input file in the same way as the sequential version. The input files format has not changed. Your timing code should not include this step but should include all other parts of the execution.
-
After initializing the game board, the main thread spawns off worker threads that will play multiple rounds of the game, with each thread computing just its portion of cells of the new board each round (based on the partitioning information passed as a command line option). Grid cells are allocated to threads using either a row-wise or column-wise partitioning specified by a command line argument. If a command line is entered with more threads then the number of rows/columns to partition by, your program can print out an error message and exit. In other words, you don’t need to handle a case in which there are threads that have no rows/columns assigned to them. The total number of live cells each round should be calculated in parallel with each thread contributing to its portion of live cells to the total value. Note: If the print_partitioning option is specified, threads should print out their partition information before running rounds of gol.
-
If running in
OUTPUT_ASCII
mode, you should designate one thread to be the printing thread, and only that thread should callprint_board
to print the entire board after each round. Having threads print out just their part of the board will result in crazy interleaved output: a race condition! (Synchronizing parallel board printing so that the board is printed in order is beyond what you need to support, and it would be less efficient than if just a single thread prints the entire board.) -
If running in
OUTPUT_VISI
mode, each thread should update its part of thecolor3
image buffer in parallel and then make a call todraw_ready
when it is done updating its part. -
At the end of the simulation’s execution, the main thread should print out the elapsed time (if run in mode
OUTPUT_NONE
orOUTPUT_ASCII
). You will not be graded on absolute performance, but your code should run faster when run in parallel. As a sanity check, try running a large board instance (e.g., 1500x1500) for many rounds (e.g., 100+). You should see a substantial improvement between running with one thread and a handful of threads (2-8). It’s that sort of relative improvement that we’ll be looking for. Note: for comparison timing, run in modeOUTPUT_NONE
, and make sure it does not include any extra synchronization or calls tousleep
that may be part of the other run modes.
4.3. Synchronization
You will need to think about where and how to synchronize thread
actions to remove race conditions. For example, you need to ensure
that no thread starts the next round of play until all threads are
done with the current round, otherwise threads will not read
consistent values for the given round. If printing is enabled, you
need to ensure that the printing thread prints a consistent version of
the game board. If run in OUTPUT_ASCII
mode, you need to ensure
that the printing thread prints a consistent version of the game
board. You will also need to ensure that you don’t have a race
condition when updating the global live cell counter. Any
synchronization steps that are only necessary for a specific command
line option should not be included in runs that do not specify that
option.
4.4. Partitioning
You will implement two different ways of parallelizing the computation: one
partitions the board’s rows across threads, the other partitions the board’s
columns across threads. As an example, suppose that the board is 8x8, and that
there are 4 threads. Here is how the threads (with logical tids
0-3) would be
assigned to the grid using row and column partitioning, respectively:
row partitioning (option 0) column partitioning (option 1)
---------------- ----------------
0 0 0 0 0 0 0 0 0 0 1 1 2 2 3 3
0 0 0 0 0 0 0 0 0 0 1 1 2 2 3 3
1 1 1 1 1 1 1 1 0 0 1 1 2 2 3 3
1 1 1 1 1 1 1 1 0 0 1 1 2 2 3 3
2 2 2 2 2 2 2 2 0 0 1 1 2 2 3 3
2 2 2 2 2 2 2 2 0 0 1 1 2 2 3 3
3 3 3 3 3 3 3 3 0 0 1 1 2 2 3 3
3 3 3 3 3 3 3 3 0 0 1 1 2 2 3 3
When the number of threads does not evenly divide the dimension by which you
are partitioning, divide the rows (or columns) up so that there is at most a
difference of 1 assigned row (or column) between any two threads. For example,
for an 8x7 board and 3 threads, you would partition rows and columns like this
among the 3 threads (with tids
0-2):
row partitioning (option 0) column partitioning (option 1)
---------------- ----------------
0 0 0 0 0 0 0 0 0 0 1 1 2 2
0 0 0 0 0 0 0 0 0 0 1 1 2 2
0 0 0 0 0 0 0 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
1 1 1 1 1 1 1 0 0 0 1 1 2 2
2 2 2 2 2 2 2 0 0 0 1 1 2 2
2 2 2 2 2 2 2 0 0 0 1 1 2 2
In this partitioning scheme, a single row (or column) is assigned to exactly one thread; a single row (or column) is never split between two or more threads. Thus, in the above example threads 0 and 1 have one more row each of work to do than thread 2 in the row-wise partitioning, and thread 0 has one more column of work to do than threads 1 and 2 in the column-wise partitioning. Note how the imbalance of rows/columns is assigned to threads in this example. Your solution should assign them in the same way: every thread should either be assigned the same number of rows or columns or differ by at most one.
4.5. Printing Partitioning Information
When the 5th command line option is 1, your program should print thread partitioning information. Each thread should print:
-
Its thread id (the logical id number that you assign, not the
pthread_self
value). -
Its start and end row index values.
-
The total number of rows it is allocated.
-
Its start and end column index values.
-
The total number of columns it is allocated.
The threads will run in different orders, so you may see each thread’s output
in any order, which is fine. If you’d like, you can try to avoid some
interleaving of individual threads output by calling fflush
after printf
:
printf("tid %2d: ...\n", mytid, ...);
fflush(stdout); // force the printf output to be printed to the terminal
4.5.1. Example Partitioning Output
Here are some example runs with different numbers of threads partitioning a
100x100 grid. (I’m using a config file I created named square100.txt
included in
your repo.) The total number of rows and columns per thread are shown
in parentheses. Your output does not need to be identical to this, but it must
include all the required parts.
# 9 threads, column-wise partitioning
$ ./gol square100.txt 0 9 1 1
tid 0: rows: 0:99 (100) cols: 0:11 (12)
tid 1: rows: 0:99 (100) cols: 12:22 (11)
tid 2: rows: 0:99 (100) cols: 23:33 (11)
tid 4: rows: 0:99 (100) cols: 45:55 (11)
tid 3: rows: 0:99 (100) cols: 34:44 (11)
tid 5: rows: 0:99 (100) cols: 56:66 (11)
tid 6: rows: 0:99 (100) cols: 67:77 (11)
tid 7: rows: 0:99 (100) cols: 78:88 (11)
tid 8: rows: 0:99 (100) cols: 89:99 (11)
# 6 threads, row-wise partitioning
$ ./gol square100.txt 0 6 0 1
tid 0: rows: 0:16 (17) cols: 0:99 (100)
tid 1: rows: 17:33 (17) cols: 0:99 (100)
tid 3: rows: 51:67 (17) cols: 0:99 (100)
tid 2: rows: 34:50 (17) cols: 0:99 (100)
tid 4: rows: 68:83 (16) cols: 0:99 (100)
tid 5: rows: 84:99 (16) cols: 0:99 (100)
# 17 threads, row-wise partitioning
$ ./gol square100.txt 0 17 0 1
tid 0: rows: 0: 5 (6) cols: 0:99 (100)
tid 1: rows: 6:11 (6) cols: 0:99 (100)
tid 3: rows: 18:23 (6) cols: 0:99 (100)
tid 2: rows: 12:17 (6) cols: 0:99 (100)
tid 4: rows: 24:29 (6) cols: 0:99 (100)
tid 5: rows: 30:35 (6) cols: 0:99 (100)
tid 6: rows: 36:41 (6) cols: 0:99 (100)
tid 7: rows: 42:47 (6) cols: 0:99 (100)
tid 10: rows: 60:65 (6) cols: 0:99 (100)
tid 11: rows: 66:71 (6) cols: 0:99 (100)
tid 12: rows: 72:77 (6) cols: 0:99 (100)
tid 13: rows: 78:83 (6) cols: 0:99 (100)
tid 8: rows: 48:53 (6) cols: 0:99 (100)
tid 15: rows: 90:94 (5) cols: 0:99 (100)
tid 14: rows: 84:89 (6) cols: 0:99 (100)
tid 9: rows: 54:59 (6) cols: 0:99 (100)
tid 16: rows: 95:99 (5) cols: 0:99 (100)
Note that for the run with 17 threads, there are enough threads to see "out of order" execution due to the exact scheduling of threads on the CPUs. This is fine and expected behavior (the exact ordering of thread printing depends on when threads are scheduled to run on cores by the OS CPU scheduler).
4.6. Correctness
In addition to printing out per-thread board partitioning information to verify
correct portioning, you should also ensure that your parallel version solves
GOL correctly. To do this run both your parallel version and your sequential
version on some of the same input files in OUTPUT_ASCII
run mode (run mode 1).
The results should be identical.
If you see incorrect and/or odd output, either your partitioning of the
GOL game board is incorrect, or you are missing synchronization, or both.
You can remove the (multiple) calls to system("clear")
to see the history
of each round which should help you debug incorrect GOL computation.
Of course, try this first using a small number of iterations!
4.7. Changes to ParaVis animation for pthreads
You should not be focusing on this section at all until after you have the checkpoint complete.
The ParaVis library supports animating pthread
programs. You only
need to make a few small changes to get the OUTPUT_VISI
mode to work
for the parallel version of GOL.
-
In the sequential version of
gol.c
, you called the functionsetup_animation
. You will need to call that function in the parallel version too, but we will need to modify thesetup_animation
function to handle the fact that we now have many threads running the simulation.Find the
setup_animation
function in your code. Notice that there is a call to function namedinit_pthread_animation
. This function takes the number of threads as the first parameter:visi_handle init_pthread_animation(int num_tids, int rows, int cols, char *name);
In the sequential version of
gol.c
, we hard-coded the number of threads to be 1 since only one pthread was created to runplay_gol
(which we hid in theconnect_animation
function, to be discussed in a moment):// Previously in setup_animation, you had 1 thread hardcoded in setup_animation int num_threads = 1; data->handle = init_pthread_animation(num_threads, data->rows, data->cols, visi_name);
In the parallel version, you need the
num_threads
variable to equal the number of threads you are running in your simulation:// Be sure to set the variable num_threads equal to the number of threads // running the simulation. int num_threads = data->num_threads; // this name may differ in your implementation data->handle = init_pthread_animation(num_threads, data->rows, data->cols, visi_name);
-
As discussed in Section 3.3.1, you will comment out your call to
connect_animation
because that is now replaced with the loop that creates your threads. -
You should leave the line to
run_animation
inmain
unchanged, but that function call must be called before you join your threads. You may need to reorganize the structure in this part ofmain
to be sure you callrun_animation
before joining the threads. -
In addition to the changes above that you made in
main
, your sequential code had a call toupdate_colors
that set thecolor3
buffer. (You may have called this function something different, though we recommended you use the nameupdate_colors
at the time.)You need to modify the
update_colors
function so that it is multi-threaded. As it’s currently written, this function colors every pixel of the image buffer. Since you callupdate_colors
fromplay_gol
, every thread calls this function. But, we don’t want every thread coloring every pixel. Instead, we only want each thread to color the pixels in its partition of the board. Update this function so that it uses the partitioning information to only color the pixels that the thread is responsible for.In addition, you should have threads use different colors to color the board. The
colors.h
file (which you can open in your editor to view) contains an array calledcolors
that you can use so that, e.g. thread 0 colors the board usingcolors[0]
, and thread 1 colors the board usingcolors[1]
, etc. You will need to consider how to use this array when there are more than 8 threads. -
In
play_gol
, in addition to making sure that every thread callsupdate_colors
, you need to make sure every thread also callsdraw_ready
when it is done updating the color3 buffer.
Further documentation about using the ParaVis interface is available as a comment at the top of the pthreadvisi_example.c
program from
Week 12. Also, more detailed documentation is available
in the pthreadGridVisi.h
header file that you can open in an editor:
$ vim /usr/local/include/qtvis/pthreadGridVisi.h
5. Sample Output
Here is output from a run of my gol program showing:
-
Example column-wise and row-wise partitioning output for different numbers of threads is shown in Section 4.5.1. Please follow our output format for your printing partitioning functionality.
Other program ouput should be identical to that from Lab 06.
-
Example runs showing the time improvement from the parallelization is shown below. This output was generated on a lab machine with 12 cores, using a configuration file (
big.txt
) that specified a 4000x4000 board and 500 iterations. The program was run for different numbers of threads (1, 4, 8, 12, and 20). Notice the time improvement with increased parallelization up to the number of cores (from 298 seconds for 1 thread to 50 seconds for 12 threads). Beyond the total number of cores (12, in the case), adding more threads does not help to improve total runtime.$ ./gol big.txt 0 1 0 0 Total time: 297.943 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 4 0 0 Total time: 78.617 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 8 0 0 Total time: 71.223 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 12 0 0 Total time: 50.128 seconds After 500 rounds on 4000x4000, number of live cells is 17 $ ./gol big.txt 0 20 0 0 Total time: 53.965 seconds After 500 rounds on 4000x4000, number of live cells is 17
These are runs with just the normal Makefile gcc command (there is no optimiztion compiler flag used to build this code). With compiler optimization each run would be much faster, but the trend of improvement from the multithreaded would remain.
6. Lab Requirements
-
Your program must satisfy all the requirements of the sequential gol lab, except that if you did not get the edges to work correctly, you do not need to fix that part (i.e. if your edge cells are not quite in your sequential gol, that is okay for your parallel solution).
-
Your program should handle the new command line arguments in addition to the previous ones.
-
The
total_live
global variable that should be updated each round to the number of live cells. Its value should be computed in parallel each round with each thread updating the portion oftotal_live
value for a round from its portion of the grid. -
The only other global variables you may add are any needed for
pthread
synchronization. Any values that threads need should be passed to them via the (void *arg
) parameter. Look at the examplepthread
programs from Week 12, from in class, and in the textbook for some examples of how to do this. You will likely need to add fields to yourstruct gol_data
type that are needed for threads to perform gol on their part of the grid (e.g. which rows and columns this thread is operating on and the logical thread id number). -
The control flow in your sequential
main
function doesn’t need much change. You need to parse the additional command line arguments and make calls topthread_create
. If you find you are adding a lot of code tomain
, stop and create a function to do this work and call that function frommain
:main
should not get a lot longer, and its main control flow really should not change. -
For full credit, your solution should be correct, robust, and free of valgrind errors (when run in modes
OUTPUT_ASCII
orOUTPUT_NONE
). "Still reachable" errors are okay. -
You are NOT responsible for running valgrind in the
OUTPUT_VISI
mode (the ParaVis visualization mode). "Still reachable" errors are okay. -
All
TODO
comments in the starting point code should be removed in your final submission. These are notes to you for what you need to add into the starting point code and where you need to add it. -
Use good modular design, and comment your code well.
-
Last lab! 🎉 Good luck! 🥳
7. Getting Started
7.1. Get sequential gol.c ready for parallelizing
-
If you are working with a different partner than you had for Lab 6, start by deciding whose
gol.c
solution you will use for a starting point for this lab. -
After copying your Lab 06
gol.c
into your repo, make the modifications described in Section 3.3.1. Compile and run to make sure it works. -
Then fix any any parts of your Lab 06
gol.c
that will make parallelizing it difficult, and that are memory inefficient. As necessary, we may have included comments as part of your gradesheet that are in your Lab 6 repo. We can also look at your solution in lab on Wednesday and help you figure out changes that you may need to make.
7.2. Get Started on Parallelization
-
Start by reviewing Section 14.2 and 14.3 of the textbook and the in-lab examples.
-
The
design_worksheet.txt
file will not be graded, but it contains helpful ideas for outlining the design of your solution. You should consider reviewing this before starting to write code. -
We recommend that you implement and test functionality in this order:
-
Add fields to the
struct gol_data
that you will need to let each thread know which part of the partitioned board it is responsible for and any code needed to initialize these fields. Some of these fields may be initialized by the main thread before spawning and other fields may be initialized by the threads themselves before they start playing rounds of gol. (You can always go back and add more fields to the struct later if you need to.) -
Without spawning any threads, write the partitioning function, i.e. figure out which section of the board would be assigned to each thread. Test the function’s implementation first without spawning any threads by adding some calls in main to test different sizes and partitioning schemes (and remove these calls before moving on to the next step).
-
Implement the main thread create-join control flow, and test each thread’s row/column partitioning. Spawn the number of threads specified at the command line and have each thread print its partitioning information (including its logical
tid
value) inplay_gol
and then return before actually running the remainder of the code inplay_gol
. (Be sure to remove thisreturn
after you’ve tested this and before moving on to step (e) below). -
Add code so that each thread prints out its own partitioning information only when the final command line argument is non-zero.
-
Write and test the code to play one round of the GOL simulation, with the work divided among the threads. This will include support for all run modes. Start with testing
OUTPUT_ASCII
on a small file with just one round. Once that is working, testOUTPUT_VISI
to be sure things are working correctly. -
Compute all rounds of the GOL simulation, avoiding race conditions.
-
Count the number of live cells in a multi-threaded way, avoiding race conditions.
-
Check timing for multi-threaded in
OUTPUT_NONE
mode. You should see performance improvements with multi-threaded vs. sequential for large boards. -
Make sure all other requirements have been met.
-
8. Tips
-
Look at the Lab 06 page for lots of tips related to sequential GOL that may be useful here too.
-
Implement and test incrementally!
-
Most of your changes will be to the
play_gol
function. You should also add helper functions to implement any new functionality.main
should not get much longer, so if you need to add alot of code to main, or you are adding the same code to multiple parts, functionize it and call it from main. Plus you can independently test this code by adding debugging calls to this function from main! -
You will need to add fields to your
struct gol_data
type that are needed by threads to perform gol on their part of the grid. -
Use gdb as you incrementally implement and test to find and fix bugs as you go. Use valgrind as you go to catch memory access errors as you make them.
-
As you test and debug, you may find it useful to use config files with small numbers of iterations. You may also want to comment out the call to
system("clear")
so you can examine the results of every intermediate iteration. -
Running in
OUTPUT_VISI
andOUTPUT_ASCII
are helpful for finding and debugging partitioning errors and other correctness errors. -
ParaVis with colors.h: If you’d like to use some of the colors I have defined in
colors.h
you can set an RGB value in thecolor3
image array to a value incolors
like this:// assume buff is name of the color3 array: buff[buff_i] = colors[3]; // set this pixel to the 3rd color (Yellow)
If you want to use the colors to help with debugging row and column partitioning you can create the rainbow pattern of my examples by having each thread be assigned a color based on its thread id, and have it print its color for dead cell values in its partition and print a common color (eg black) for cells that are alive. For example, if
data→tid
stores the value of a thread’s logical id, you could do something like this:if(live) { buff[buff_i] = c3_black; // set live cells to black } else { // assumes your thread id is stored in your struct as tid; // modify the line below if you called it something different. buff[buff_i] = colors[((data->tid)%8)]; // dead to my tid color }
Don’t forget that the ParaVis library displays coordinate (0,0) at the bottom left of the visualization, and coordinate (rows-1, 0) at the top left.
-
The man pages for the various
pthread
functions are very helpful. -
You can use the
perror()
function to print out error messages from failed system and pthread library calls.perror
prints out detailed, human-readable information about the error that occurred. You call it passing in a string with program-specific error message. For example:big_buff = malloc(...); if (big_buff == NULL) { perror("malloc of big buffer failed"); exit(1); }
See the man page for
perror
for more information and for the include files you need to add to your program to use it. -
You will need to use some synchronization in your solution. You may use any of the
pthread
synchronization primitives, including mutexes and barriers. Remember that before using any of these, you need to initialize them. You also need to make sure that all threads have access to the shared synchronization variables (either pass a reference to them at thread creation or declare them as static global variables). Here are some code snippets for using the barrier synchronization primitive:// declare a global pthread_barrier_t static pthread_barrier_t barrier; // initialize it in a function (e.g., main) before using it: if(pthread_barrier_init(&barrier, NULL, num_threads)){ perror("pthread_barrier_init"); exit(1); } // each thread can then call pthread_barrier_wait to synchronize: ret = pthread_barrier_wait(&barrier); if(ret != 0 && ret != PTHREAD_BARRIER_SERIAL_THREAD) { perror("pthread_barrier_wait"); exit(1); }
9. Submitting
Here are the commands to submit your solution
(from one of you or your partner’s cs31/labs/Lab9-userID1-userID2
directory:
$ git add gol.c *.txt
$ git commit -m "Lab 9 completed"
$ git push
And of course, if you want to submit any nifty starting point world config files you come up with, please do:
$ git add coolworld.txt
$ git commit -m "cool world"
$ git push
If you have difficulty pushing your changes, see the "Troubleshooting" section and "can’t push" sections at the end of the Using Git for CS31 Labs page. And for more information and help with using git, see the git help page.
10. Handy Resources
General Lab Resources
-
Class EdSTEM page for questions and answers about lab assignment
-
man and Manual pages documentation for libraries and commands
C programming Resource
-
C Programming Resources and Links including I/O in C, cmdline args, arrays
-
Week 8 in-lab examples: ParaVis, and other sequential GOL related examples.