For this lab assignment you will implement a parallel sorting algorithm
in MPI, and run and test it out on both our system and on the SDSC
Comet cluster.
This week you will focus on running your MPI application on our system,
and setting up your XSEDE and Comet account.
Starting next week and after break you will try out some large runs
on Comet.
You will work on this lab with your lab 4 partner.
Lab Goals
- Learn MPI programming.
- Run an MPI application on our system using openMPI.
- Implement a parallel sorting algorithm.
- Answer complexity questions about your implementation.
- Learn how to use an XSEDE resource.
- Test your MPI implementation on our system and on a large XSEDE system.
- Answer some questions about the complexity of a parallel algorithm.
Getting Started
I suggest looking at the following information:
Running OpenMPI on our system, and try out running the simple example code:
cp -r ~newhall/public/openMPI_examples .
Lab Starting Point Repo
- Get your Lab04 ssh-URL from the GitHub server for our class:
cs87-s20
- On the CS system, cd into your cs87/labs subdirectory
- Clone a local copy of your shared repo in your private
cs87/labs subdirectory:
git clone [your_Lab04_URL]
Then cd into your Lab04-you-partner subdirectory.
If all was successful, you should see the following files when you run ls:
Makefile RESULTS hostfile oddeven.sb questions.tex
README.md check_up.sh* hostfilebig oddevensort.c run.sh*
If this didn't work, or for more detailed instructions on git, see the
Using Git page (follow the instructions for repos on Swarthmore's GitHub Enterprise server).
Starting Point files
- Makefile: builds an openMPI parallel version of oddevensort
- README.md: see notes about #defines and sizes
- questions.tex: your answers to the complexity questions
- oddevensort.c: the starting point file for your implementation.
- hostfile*: example hostfiles for running on our system
- check_up.sh: a script to check if all the machines in a
hostfile are up
- run.sh: an example run script for running on our system
- RESULTS: for your timing results on our system
- oddeven.sb: an example slurm run script for running on comet
Project Details
For this lab you will implement parallel Odd-Even Sort in MPI. Odd-Even Sort is
similar to Bubble Sort in that it compares and swaps adjacent elements.
However, it does the comparisons and swaps in an order that is more easily
parallelizable than Bubble Sort.
Sequential Algorithm
The sequential Odd-Even Sort algorithm works as follows:
- Swapping adjacent elements is done in N rounds.
- In each round, pairs of elements are compared and swapped if they are not
in sorted order. The pairs compared and swapped depend on whether it is
an odd or an even round:
- If round i % 2 == 0 compare even/odd pair of elements
- If round i % 2 == 1 compare odd/even pair of elements
Here is an example sorting N=6 values: 3 8 9 5 6 2
round
0: 3 8 9 5 6 2 (even/odd)
^--^ ^--^ ^--^
1: 3 8 5 9 2 6 (odd/even)
^--^ ^--^
2: 3 5 8 2 9 6 (even/odd)
^--^ ^--^ ^--^
3: 3 5 2 8 6 9 (odd/even)
^--^ ^--^
4: 3 2 5 6 8 9 (even/odd)
^--^ ^--^ ^--^
5: 2 3 5 6 8 9 (odd/even)
^--^ ^--^
Note: the end elements are not involved in compares and swaps in every round.
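For reference, here is a rough sketch of the sequential algorithm in C (just one way to write it; for this lab you implement the parallel version):

/* sketch: sequential odd-even sort of an array of n ints */
void seq_oddeven_sort(int *vals, int n) {
  int round, i, start, tmp;
  for (round = 0; round < n; round++) {
    /* even rounds compare (0,1),(2,3),...; odd rounds compare (1,2),(3,4),... */
    start = (round % 2 == 0) ? 0 : 1;
    for (i = start; i + 1 < n; i += 2) {
      if (vals[i] > vals[i + 1]) {   /* swap if out of order */
        tmp = vals[i];
        vals[i] = vals[i + 1];
        vals[i + 1] = tmp;
      }
    }
  }
}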
Parallel Algorithm
In the parallel version, the assumption is that P (the number of processes)
is much smaller than N. The parallelization is done in terms of P, not N.
Assume Pi is the ith process, where i = 0 to P-1 is the rank of each
process participating in the parallelization.
- Each Pi is allocated a contiguous portion of the list.
- Some number of sorting rounds occur where Pi exchanges its sorted
list of values with one of its neighbors.
At each round, process Pi does the following:
- Pi sorts its portion of the list locally using any sorting algorithm.
- If the round number is even (round % 2 == 0), an even/odd exchange happens
- If Pi is even, it sends its sorted portion to Pi+1
Pi keeps the smallest half of the items for the next round, Pi+1 the
largest half for the next round.
- If Pi is odd, it sends its sorted portion to Pi-1
Pi keeps the largest half of the items for the next round,
Pi-1 the smallest half.
- If the round number is odd (round % 2 == 1), an odd/even exchange happens
- If Pi is even, it sends its sorted portion to Pi-1
Pi keeps the largest half of the items for the next round,
Pi-1 keeps the smallest half for the next round.
- If Pi is odd, it sends its sorted portion to Pi+1
Pi keeps the smallest half of the items for the next round,
Pi+1 the largest half.
Remember that the end-most Pis may not participate in every round (see the sketch of this neighbor pattern below).
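To make the exchange pattern concrete, here is one possible helper (the name exchange_partner is just for illustration, not part of the starting point code) that computes which neighbor a rank pairs with in a given round:

/* sketch: return the rank this process exchanges with in this round,
 * or -1 if it sits the round out (the end-most ranks in some rounds) */
int exchange_partner(int rank, int round, int p) {
  int partner;
  if (round % 2 == 0) {              /* even/odd exchange */
    partner = (rank % 2 == 0) ? rank + 1 : rank - 1;
  } else {                           /* odd/even exchange */
    partner = (rank % 2 == 0) ? rank - 1 : rank + 1;
  }
  if (partner < 0 || partner >= p) { /* no neighbor on that side */
    return -1;
  }
  return partner;
}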
Implementing Odd-Even Sort in MPI
You will implement parallel Odd-Even Sort in MPI. Your implementation will ignore two issues often associated with sorting:
- How to distribute the N elements over the P processes.
- How to merge the P sorted parts back into a single sorted array.
In your implementation:
- Each process will generate a set of SIZE random int values
as its portion of the N items to sort.
- Processes will compute all rounds of Odd-Even sort so that at the
end, process 0 has the smallest SIZE elements in sorted order, process 1
the next smallest SIZE elements in sorted order, ..., process P-1 has
the largest SIZE elements in sorted order.
The sorted portions DO NOT need
to be assembled together into a single sorted array on one of the nodes.
In other words, your MPI program can exit with the sorted list distributed
in SIZE chunks over the P processes such that P0 has the smallest SIZE
values in sorted order, P1 the next smallest sorted SIZE values, and so on
(a sketch of the keep-half merge step is below).
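One way (not the only way) to implement the "keep half" step is to merge your sorted chunk with your partner's sorted chunk and keep size values from the appropriate end. The function below is a sketch under that assumption (mine, theirs, and result are hypothetical names for separate buffers of size ints):

/* sketch: keep the size smallest (keep_small != 0) or size largest values
 * from two sorted chunks; the kept values end up sorted in result */
void keep_half(int *mine, int *theirs, int *result, int size, int keep_small) {
  int i, j, k;
  if (keep_small) {
    i = 0; j = 0;
    for (k = 0; k < size; k++) {           /* merge from the front */
      if (mine[i] <= theirs[j]) { result[k] = mine[i++]; }
      else                      { result[k] = theirs[j++]; }
    }
  } else {
    i = size - 1; j = size - 1;
    for (k = size - 1; k >= 0; k--) {      /* merge from the back */
      if (mine[i] >= theirs[j]) { result[k] = mine[i--]; }
      else                      { result[k] = theirs[j--]; }
    }
  }
}

A nice property of doing the exchange this way is that each process's chunk stays sorted after every round, so there is no need to fully re-sort locally each round.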
Starting point code
In the starting point oddevensort.c file is a definition for SIZE.
When you run your MPI program you will specify the number of processes
to spawn (e.g.
-np 8 spawns 8 MPI processes for a run).
The number of values a particular run is sorting (N) is implicitly specified
by the value of SIZE and the number of processes (N = SIZE * P).
The starting point code has an optional command line argument that specifies
a different value for the number of elements each process should sort:
# sort 8*SIZE values such that each of the 8 processes gets SIZE elements
mpirun -np 8 --hostfile hostfile ./oddevensort
# run with optional command line argument:
# sort 800 values such that each of the 8 processes gets 100 elements
mpirun -np 8 --hostfile hostfile ./oddevensort 100
As you first develop your solution, uncomment the DEBUG definition in
the starting point file, and try running with small SIZE values and
smallish numbers of processes to test correctness. With DEBUG defined,
each MPI process will print out its portion of the array before
performing odd-even sort and then after, where you should see that process 0
has the smallest SIZE values in sorted order, process 1 the next, and so on.
Try different SIZE values and numbers of processes to make sure your
sort is correct (correct number of iterations, correct Pi's exchanging at
each step).
You can also try odd-even sorting in the other direction after
doing it one way first (i.e. try a descending sort after you do the ascending
sort). Sorting an array that starts out sorted in the opposite order is often a
good way to see if you are missing an edge case. To do this you
may want to functionize all comparisons so that it is easy to
change your code to sort in either direction (see the sketch below).
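For example (a hypothetical helper, not part of the starting point code), a comparison function with a direction flag makes it easy to flip between ascending and descending test runs:

#define ASCENDING  1
#define DESCENDING 0

/* sketch: return 1 if a and b are out of order for the given direction */
int out_of_order(int a, int b, int direction) {
  if (direction == ASCENDING) {
    return a > b;     /* swap if the left value is larger  */
  }
  return a < b;       /* swap if the left value is smaller */
}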
Sample Output
Here is
output from my program.
It shows a few runs with different numbers of processes and sizes, run
in debug mode where each process prints out its portion of the unsorted
array before sorting and prints out its portion after.
It is a good idea to add this debug printing functionality to your program to help
you verify the correctness of your implementation.
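Something along these lines is enough (a sketch only; it assumes stdio.h is included, and output from different ranks may interleave):

#ifdef DEBUG
/* sketch: print this rank's chunk with a label such as "before" or "after" */
void print_chunk(int rank, const char *label, int *vals, int size) {
  int i;
  printf("P%d %s: ", rank, label);
  for (i = 0; i < size; i++) {
    printf("%d ", vals[i]);
  }
  printf("\n");
  fflush(stdout);
}
#endif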
Additional Requirements
In addition to a correct implementation, your solution should be well designed, well commented, with good
error detection and handling, and free of memory access errors. Use
constants, meaningful variable names, and good modular design.
See my C Style guide and other C programming documentation.
You should also not use global variables in your solution.
Complexity Questions: Due Friday before 5pm
In addition to implementing and testing Odd-Even Sort in MPI, you should
answer the following questions about Odd-Even Sort. Write up your answers
in the
questions.tex file, and push the .tex
file (not the .pdf) to your git repo before the due date. In the .tex file are
some examples of using LaTeX math mode and verbatim, which may be useful for your
write-up (you are not required to use either).
- What is the big-O complexity of the Sequential Odd-Even Sort Algorithm on
a list of N items?
Show your work and give a detailed explanation of how you got your answer.
- Given P processors, what is the big-O runtime complexity of the
Parallel Odd-Even Sorting Algorithm?
Show your work and give a detailed explanation of how you got your answer.
- Given P processors, how much space is needed to perform the parallel sort of N values?
Explain your answer.
Your answer will likely depend on how you do the exchange step, so explain
how you did it in your answer.
- Is Odd-Even sort a good algorithm for sorting on a GPU? Why or why not?
- Is Odd-Even sort a good algorithm for sorting on a cluster using MPI?
Why or Why not?
Feel free to list pluses and minuses when comparing the GPU and MPI cases.
Test runs on our system and XSEDE Experiments on SDSC Comet
Due Thursday: Results from some initial runs, including some largish-sized runs on our system are due Thursday.
Due After Break (TBA): Results from larger-sized runs on our system and on XSEDE are due after break.
You should run some large-sized runs on our system; see the run.sh
script included in the starting point for an example of how to start a
bunch of runs of different sizes.
Try varying the problem size in a few ways (both large sizes of data to sort
and large numbers of processes and hosts).
With your lab submission you should submit a file RESULTS that
lists runtimes for different sizes that you ran on our system. See the
comment at the top of the file about what to list and how.
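One common way to collect these times (a sketch, not a required structure) is to bracket the sorting rounds with MPI_Wtime calls after a barrier and report the time measured on rank 0:

double start, end;
MPI_Barrier(MPI_COMM_WORLD);
start = MPI_Wtime();

/* ... all rounds of odd-even sort ... */

MPI_Barrier(MPI_COMM_WORLD);
end = MPI_Wtime();
if (rank == 0) {
  printf("sort time: %g seconds\n", end - start);
}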
If you do some runs on a lot of nodes, make use of autoMPIgen to pick a good hostfile for you, and run the check_up.sh script to remove unreachable nodes from
this file. Also please be mindful that machines are heavily used for
lab sessions, classes, and ninja sessions, so please avoid running
computationally intensive experiments during these times.
Smaller runs should be okay.
Test out a large-sized run on one node and make sure that your running
oddevensort processes are not allocating more space than can fit into RAM
(run htop to see). If they are, back off the size until the cumulative
memory fits into RAM (with some to spare). You should see no swap space
usage.
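As a rough example of the arithmetic (assuming 4-byte ints): 8 processes on one node, each sorting 100,000,000 values, need about 8 * 100,000,000 * 4 bytes, or roughly 3.2 GB, just for the unsorted chunks, plus whatever extra buffer space your exchange step allocates per process.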
More details about larger runs and experiments on XSEDE will be
assigned next week.
This Week do the following:
- Try some timed runs of different sizes on our system. These should include
some large sized runs, but not huge. You will run some really large ones over spring
break.
- Make sure to set up your XSEDE and your Comet
account this week (follow all the directions under
"XSEDE and Comet Account Set-up"):
XSEDE and Comet accounts.
- Try ssh'ing into Comet, and try out scp'ing over my MPI hello world example
and try running it on Comet (follow the directions under "Using Comet and
submitting jobs").
Useful Functions and Resources
- Running OpenMPI on our system. This includes information about
setting up ssh keys and ssh-agent so that MPI can ssh into hosts without
you having to give your password, and also information about my simple
MPI example programs you can copy over and try out on our system:
cp -r ~newhall/public/openMPI_examples .
- Remember that MPI message sends and receives have associated
fixed-size buffers. On a regular send, the sending process typically
doesn't block on the send call after the data are copied to the buffer
(you can configure different send semantics, but this is the default).
Sometimes, however, a process may block on send if the
buffer is full and the receiving process has not yet received the buffered
data.
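One way to avoid this kind of blocking during the exchange step is MPI_Sendrecv, which pairs the send and receive in a single call; the sketch below reuses the hypothetical names (partner, mine, theirs) from the sketches above:

/* sketch: swap sorted chunks with my neighbor for this round */
MPI_Sendrecv(mine,   SIZE, MPI_INT, partner, 0,  /* send my chunk      */
             theirs, SIZE, MPI_INT, partner, 0,  /* receive partner's  */
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);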
- check_up.sh and run.sh: scripts with starting
point code to check if the machines in a hostfile are up, and to run
a bunch of experiments. To run the check_up.sh script:
./check_up.sh hostfile
- autoMPIgen (and smarterSSH): these are tools that are useful for finding
machines that are less loaded on our system. autoMPIgen will generate
an mpi host file for you, picking good machines to run on based on command line
criteria. smarterSSH picks a good machine to ssh into on our system,
and can also be used to list load information about lab machines. If you use
autoMPIgen to generate a hostfile, keep in mind that things change,
and you don't want to just keep using this one same hostfile over and over. Instead,
run it to regenerate a good set of hosts periodically.
Here are some examples of how to run smarterSSH to get statistics:
smarterSSH -h
# run smarterSSH in -i mode to list machine statistics
smarterSSH -i -v -n 50
smarterSSH -i -v -n 50 -c
smarterSSH -i -v -n 50 -m
# ssh's you into a good machine
smarterSSH
Here are some examples of how to run autoMPIgen to generate MPI host files in
different ways:
autoMPIgen -h
# generate a host file (hostfile) of 16 hosts selected
# based on a function of available cpu and memory
autoMPIgen -n 16 hostfile
# generate a host file of 16 hosts that includes the cpu count
autoMPIgen -n 16 -q hostfile
# generate a host file, choosing 8 hosts based on memory usage
autoMPIgen -n 8 -m hostfile
# generate a host file, choosing 32 hosts based on cpu usage
autoMPIgen -n 32 -c hostfile
# generate a host file, randomly choosing 64 hosts
autoMPIgen -n 64 -r hostfile
- machines just for our class:
cardamom 8 core
crabby 8 core
cucumber 8 core
elderberry 8 core
hyssop 8 core
mint 8 core
parsley 8 core
stew 8 core
zucchini 12 core
- srand, rand: C random number functions:
// srand: seed the random number generator using the current time
// (only seed one time):
srand(time(NULL));
// then call rand to get the next value in the random sequence
// (returns a value from 0 to RAND_MAX; use % to get a smaller range):
next_val = rand() % MAXVAL;
- Using XSEDE to run on Comet. My directions and examples for running MPI programs on Comet using your XSEDE account.
- The XSEDE portal: https://portal.xsede.org/
- Lab Machine specs
page contains information about most of the lab machines, including
number of cores.
- More XSEDE links
- Links to other MPI references and tutorials
Submit
Before the Due Date,
one of you or your partner should push
your solution to github from one of your local repos to the GitHub
remote repo. (it doesn't hurt if you both push, but the last
pushed version before the due date is the one I will grade, so be
careful that you are pushing the version you want to submit for grading):
From one of your local repos (in your ~you/cs87/labs/Lab05-partner1-partner2 subdirectory)
git add questions.tex
git add oddevensort.c
git add RESULTS
git commit
git push
If you have git problems, take a look at the "Troubleshooting" section of the
Using git
page.