This challenge isn’t for any extra points, but once you complete your implementation, see if you can tune your program’s performance to run faster than mine.
Try running a large sized board for a large number of iterations. For example:
1024
1024
500
3
9 8
9 9
9 10
My program (compiled with -g -Wall
), took
about 16.5 seconds to run
on the CS lab machine named basil
:
./gol challenge_1024.txt 0
Total time: 16.509 seconds
After 500 rounds on 1024x1024, number of live cells is 3
Note: always run timing experiments with command line option 0 that does not include the animation part (i.e. there is no clear, printing the board, nor usleep between iterations). This experiment was run on 11/10/19.
If you program is much slower, then try to improve your program’s
performance to see if you can get closer to mine, or beat it.
However, try to improve your solution in a different copy of
its source (in faster.c
), so that you do not accidentally
break your correct and working solution in gol.c
.
1. Making Modifications
First, copy your existing solution to a new
file named faster.c, so that you don’t lose your correct, working gol
solution in gol.c
:
cp gol.c faster.c
Then copy over a new version of the Makefile
, and some test input
files (enter y to the question about replacing Makefile
):
cp ~newhall/public/cs31/golchallenge/* ./
2. Modify program for speed
Then try to improve the performance of the version of your
program that you have in faster.c
(leave gol.c
as your
first working version).
If yours is much slower than mine, think about your program’s
memory usage and see if you can improve the time by improving how
it uses memory.
When you change your code you need to ensure that it still solves the
correct problem. Verify that your changes still correctly implement gol
by running the faster.c
version on a small example in the ascii run mode to ensure this.
3. Running Experiments
When timing code you always want to run large enough experiments to
compare times (1 second vs. 2 seconds doesn’t necessarily say anything
about the two run times). You also want to time runs without output
statements (printfs
) and without calls to sleep
. You should
run each experiment on the same machine, and
run multiple times to see check if there is a lot of variation between
the measured run times of the same size. If there is a lot of variation,
then you would want to try to explain why. For this problem there should
be very little variation between the time of different runs of the same
size and number of iterations. If you see an unusually long run, it is
due to something else being run on the machine while you are timing your
program.
To ensure that you are running experiments without interference from other processes using up a lot of CPU and/or RAM, you can see what else is going on using these commands:
The who
command will list who else is logged into a machine.
The top
command will show cpu and memory usage of processes
running on a machine. Professor Newhall’s help pages have a lot more information about tools for examining system state.
3.1. Using basil:
To keep basil free as much as possible for other groups trying to do final timing to beat me, please first run timed runs on another machine and see how close you are getting.
You can, and should, do almost all your timing on any machine, just to see if you beat mine, you should run yours on the same machine as I ran my tests (not all CS lab machines are equally powerful).
You can find the specs of different CS lab machines on the: lab machine specs page
3.2. bash for loop
At the bash shell prompt (bash is the name of the Unix shell program)
you can write a bash loop to tell
bash to repeat an action some number of times. The syntax is almost like
a C for loop except do
and done
are in place of {
and }
, and you need double parens. Here is an example for loop to
run gol
on the same input 5 times (hit the enter key after each line,
the "$" is the bash prompt):
$ for ((i=0; i< 5; i++))
> do
> time ./faster challenge_1000.txt 0
> done
3.3. time command
The above example bash loop, uses the time command to run ./faster.
time times the entire execution of faster (in addition to the
output from my program using gettimeofday
timers in my code that do not
include board initialization time):
time ./gol challenge_1000.txt 0
Total time: 15.773 seconds
After 500 rounds on 1000x1000, number of live cells is 3
real 0m15.794s
user 0m15.773s
sys 0m0.016s
real
is wall time (total time), user
and system
are the portion of time the process spent running user-level code and
operating system-level code. I may talk a bit about what these mean,
but if you take the Operating Systems course, then these two times
will make a lot more sense.
4. Compiler Optimization
Make sure you compare a version of your code that is compiled using
the same compiler flags as used in the times shown (-g
in the above
times). You can often greatly improve your program’s execution time
by turning on compiler optimization during compilation. For example,
if I compile my code without debugging information (no -g
flag) and with
the highest level of compiler optimization (-O3
), I see a huge increase
in its performance just from the compiler optimization.
For example, here is my time running on basil for for my
optimized gol
program (this is the exact same C code as the results
above, the only difference is compiling with -03
):
./faster challenge_1000.txt 0
Total time: 2.285 seconds
After 500 rounds on 1000x1000, number of live cells is 3
The Makefile
you copied over builds faster with -O3
already. If you
need to debug it your faster program, in the Makefile change
the definition of FASTCFLAGS
to the one commented out (with
-g -Wall
).
5. Sample Times
5.1. Unoptimized GOL
Here are runtimes for different sized boards and iterations of my solution run on basil (times for each are the average of 5 runs):
# these are timed runs compiled with gcc flags -g -Wall
# gcc -g -Wall -o gol gol.c
#
total time for 500 iterations of 500x500 is ~4 secs
total time for 500 iterations of 1000x1000 is ~18.5 secs
total time for 500 iterations of 2000x2000 is ~73 secs
total time for 500 iterations of 4000x4000 is ~293.5 secs
In these experiments the number of iterations is the same for different size boards. These show how the runtime grows with the problem size: this shows linear increase in time with increase in problem size (each 4 times increase in the problem size results in about a 4 times increase in execution time). If you think about the complexity of GOL, this is what you would expect—it is an \(O(n)\) algorithm (where \(n\) is the number of grid cells) for a fixed number of iterations.
The timings were for these command lines (each run 5 times):
time ./gol challenge_500.txt 0
time ./gol challenge_1000.txt 0
time ./gol challenge_2000.txt 0
time ./gol challenge_4000.txt 0
5.2. Optimized GOL
Here are my times for my optimized gol
program (faster.c
).
(The C code for my gol.c
and faster.c
is identical,
so the improvement only reflects the compiler optimization flags.
# these are timed runs of my program compiled with the gcc flag -O3
# gcc -O3 -o faster gol.c
#
total time for 500 iterations of 500x500 is ~0.6 secs
total time for 500 iterations of 1000x1000 is ~3 secs
total time for 500 iterations of 2000x2000 is ~12 secs
total time for 500 iterations of 4000x4000 is ~48 secs
Just compiling the code with -03
compiler optimization flag,
results in code that is roughly 7 times faster than the version
that is compiled with -g
.
The timings were for these command lines (each run 5 times):
time ./faster challenge_500.txt 0
time ./faster challenge_1000.txt 0
time ./faster challenge_2000.txt 0
time ./faster challenge_4000.txt 0
In general, you want to do development with the -g
flag so that you
can easily debug your program. The -g
flag and the -O
flags are not
compatible, and -g
takes precedent. If you are running experiments on code you
have already debugged and tested, then you may want to enable compiler
optimization to improve its runtime.
And your fast solution should be correct regardless of the input file on which it is run…try out timing it on other files too.
6. Submit
If you improve your faster.c
version (beyond just the improvement
from compiling it with optimization -O3
that is in the Makefile
for building faster), then add, commit, and push it to your
git repo and also add a ChallengeNotes.adoc
file telling me how you
improved its performance and how well it performed (list some timing
results).
git add faster.c Makefile ChallengeNotes.adoc
git commit -m "faster program is faster"
git push
To compare your faster.c
to your gol.c
solution, you do not need to run on
basil, just run both on the same machine. If your solution is significantly
faster than mine, then make sure to tell me that in your ChallengeNotes.adoc
file
(and let me know in person or via email too).