A Mini Lab is one that I anticipate you can complete in a couple of hours; you should finish, or be close to finishing, by the end of a Thursday lab session. The purpose of Mini Labs is to introduce you to a parallel or distributed programming language/utility without having you solve a larger problem using it.
I give you only about 24 hours to complete a mini-lab because I want you to stop working on it and get back to focusing your effort on the regular lab assignment.
If you don't get a mini lab fully working, just submit what you tried. If you don't submit a solution, it is not a big deal. Mini Labs do not count very much towards your final grade, and not nearly as much as regular Labs--they are mini. If you submit a solution to Lab3, make sure to add the submission string in the README.md file and push it with your solution (see Submit instructions).
Your job in this lab is to use OpenMP to parallelize the code.
You should be careful to stick with the fork-join, fork-join, fork-join model of OpenMP; don't do things in the parallel parts that are really not parallel, or you will get weird/unexpected behavior. Do not try to "optimize" your code by reducing the number of fork-join blocks. You should, however, think about minimizing other parallel overheads as you design a solution; your goal is a solution designed so that parallelization yields a performance improvement. If your 1-thread execution wins out over the multi-thread runs, think about how you can remove some parallel overhead (consider space/time trade-offs, synchronization costs, ...). Make sure you are comparing runs for large enough problem sizes (N and M) with enough iterations.
I encourage you to try different partitionings of all or some of the matrices and see if you get different timed results. For example, see if you can partition one or more matrices by rows or by columns across threads:
row                  column
---                  ------
1 1 1 1 1 1 1 1      1 1 2 2 3 3 4 4
1 1 1 1 1 1 1 1      1 1 2 2 3 3 4 4
2 2 2 2 2 2 2 2      1 1 2 2 3 3 4 4
2 2 2 2 2 2 2 2      1 1 2 2 3 3 4 4
3 3 3 3 3 3 3 3      1 1 2 2 3 3 4 4
3 3 3 3 3 3 3 3      1 1 2 2 3 3 4 4
4 4 4 4 4 4 4 4      1 1 2 2 3 3 4 4
4 4 4 4 4 4 4 4      1 1 2 2 3 3 4 4

(each number indicates which of 4 threads is assigned that element)
cd cs87/labs
git clone [your_Lab03_URL]

Then cd into your Lab03-you subdirectory.
Makefile  README.md  matrixmult.c

If this didn't work, or for more detailed instructions on git, see: the Using git page.
cp -r ~newhall/public/cs87/openMP_examples .
With the starting point code, the sizes of N and M are tiny and the DEBUG definition is on. This will print out matrices and debug info as the code runs. Once you have something working, comment out DEBUG and make N and M big and try some timed runs to see if you get performance improvements with your parallel solutions. For example:
time ./mm_par 1000 0
time ./mm_seq 1000 0

Note: these executables take at least two command line options: the first is the number of iterations, the second specifies row-wise or column-wise partitioning, and an optional third is a partitioning block size. The row/column-wise and block-size options are there if you want to use them; you don't have to. They just give the starting point code a few more command line options that you can use if you'd like.
@@@@@ WE ARE SUBMITTING THIS FOR GRADING: your names
git add README.md
git add matrixmult.c
git commit
git push
If you have git problems, take a look at the "Troubleshooting" section of the Using git page.