Due Dates

Our System and Strelka Runs: before noon, Saturday Nov. 11

ACCESS Anvil Runs: before noon, Saturday Nov. 11

I’m giving you a long time to complete it because of your project proposal. Also, runs may take a long time, but you can start them or submit them and come back later to see the results.

You should not spend more than 4 hours on both parts, and strive for less than that. The goal is for you to do a few large runs and get some practice using the Strelka cluster and an ACCESS machine.

Overview and Goals

This is a continuation of your Lab 4 assignment that involves doing a few large runs of your program on different systems. The goal is to give you some practice running on different systems and some practice with large-sized runs.

You should spend no more than 4 hours total on this part (really, do not spend much more than that on this… just stop, it is not worth it).

With your Lab 4 partner, you will do some large runs of your MPI odd-even sort on the following systems and submit the timed results of a few of the largest runs (the results of 2-3 big runs on each system are enough; a few more on our system and Strelka are fine, but don't run a large number on Anvil):

  1. CS lab machines

  2. the Swarthmore Strelka cluster. Submit the results of a few large runs; be careful not to allocate sizes larger than the total available RAM (i.e., don't trigger VM paging)

  3. the ACCESS Anvil cluster. Try out some small tests first to see that it runs correctly, and do not do any large runs that have debug printf statements. Your runs should have no output other than timing and possibly a line printing the initial size (N) and the number of processes (P).

I encourage you to try out some runs on Strelka before you run on ACCESS, to get some practice with sbatch and slurm batch scripts.

Goals:

  • Remind yourself of the vim editor that you can use when ssh’ed into cs, Strelka, and ACCESS systems.

  • Try out some large-size runs on CS lab machines and two clusters.

  • Practice with the slurm batch scheduler: using sbatch, and writing and using slurm batch scripts.

  • Practice with scp to transfer files between remote hosts and CS lab machines.

Big runs on the CS system

Run these late at night or during other times when the CS machines are mostly idle. Do not run during times when classes, labs, or ninja sessions are in the labs, and also avoid times in the evenings when labs tend to be full.

Run some large runs with lots of hosts, large -np values, and large problem sizes (N) on CS lab machines. In particular, do some runs that distribute processes over a large number of nodes and sort a large N.

  • First test out some runs with large data sizes (N) (-np doesn't have to be that big), and make sure your program doesn't deadlock when the MPI message buffer is smaller than the data size you send.

    • Try your largest sized run (in terms of N and P on a single node) and make sure that its execution does not exceed physical RAM size (run htop on a node running one or more of the MPI processes and make sure it doesn't use the swap partition). If your sizes are too big to fit in memory, scale them back a good bit.

    • Also, it is very important that you are certain that your runs finish. Make sure there is no infinite loop in your code that makes your program steal all the cpu cycles on all the lab machines!

  • Next, write a run script (or scripts) to run a few big experiments (a sketch of one possible script appears after this list).

    • Test out your script before running it (you can comment out the actual calls to time mpirun … to just check the script variable values and its output). Then uncomment to actually run.

    • You do not need to do a whole bunch of runs; just a few different large runs are fine.

  • Finally, start your experiments running (ssh in to start, use screen or tmux and script, or use at or cron to run them), and check back after they should be done to verify that they finished. This should be a quick 10 minutes of your time (ssh in, check that hosts are up, start your script in a screen or tmux session, then ssh in later to see that it finished).
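As a rough sketch only, a run script for a few big experiments might look something like the following (the hostfile name, -np values, and problem sizes are placeholders, and it assumes your oddevensort takes N and a number of iterations on the command line, as in the slurm script example later on this page; adjust everything to your own runs):

#!/bin/bash
# sketch of a run script for a few big oddevensort runs on the CS lab machines
# (hostfile name, -np values, and sizes below are placeholders -- change them)
hostfile=hostfileHUGE
for np in 64 128; do
  for n in 50000000 100000000; do
    echo "==== run: np=$np N=$n ===="
    time mpirun -np $np --hostfile $hostfile ./oddevensort $n 3
  done
done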

See running OpenMPI on our system for more information about ssh-agent, mpirun, creating hostfiles, finding machines on our system, and using autoMPIgen.

Use the check_up.sh script to test that all machines are reachable

./check_up.sh hostfilename

You can copy a version of the check_up.sh script that works with hostfiles with or without slots=X annotations after each host name:

cp ~newhall/public/cs87/check_up.sh .

Here is more information about hostfiles in openMPI: about hosts and hostfiles

If you want to limit the number of MPI processes spawned per node, specify the number of slots after each host in the hostfile. I have some large hostfiles with different numbers of slots you are welcome to use as starting points:

cp ~newhall/public/cs87/hostfileHUGE* .
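For reference, a hostfile with slot counts just lists one host per line with a slots=X annotation. The host names below are made up (the hostfileHUGE files contain real CS lab machine names):

# example hostfile: at most 4 MPI processes will be launched per listed host
labmachine01 slots=4
labmachine02 slots=4
labmachine03 slots=4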

Once you have a large hostfile, you can create an experiment script for some large-sized runs using this hostfile.

IMPORTANT: before running a huge set of experiments, make sure your largest-sized run completes (i.e., that it does not deadlock and run forever). If your oddeven sort deadlocks, see the Lab 4 write-up for some hints about this and how to fix it.

See the Experiment Tools page from Lab 1 for a reminder about useful tools for running experiments, writing scripts, and some useful commands/utilities: screen, script, tmux, bash scripts, cron, …

Using scp, ssh, vim on Anvil and Strelka

You will need to use vim to edit files on Anvil and Strelka, and you will need to use scp to copy files between the CS system and Anvil and Strelka. For example, for your ACCESS odd-even sort big runs, you need to scp your oddevensort.c file from your repo on our system to Anvil. And you may want to scp result and slurm script files from Anvil back to our system. You can scp each file one by one or make a tar file containing all files you want to copy over, and just scp the single tar file.

Some scp basics (there are specific examples of scp commands in later sections about running on Strelka and Anvil):

# scp a file from your CS account onto Anvil from Anvil:
on_cs$ ssh you@anvil.rcac.purdue.edu
anvil$ scp you_on_cs@cs.swarthmore.edu:./cs87/labs/Lab04-repo/oddevensort.c .

# remember, you can use mv to move a file to a different location if you
# scp it into the wrong directory, for example:
anvil$ mv oddevensort.c cs87/oddeven/.

# scp a file from Anvil to CS from Anvil
anvil$ scp ACCESSRESULTS you_on_cs@cs.swarthmore.edu:./cs87/labs/Lab04-repo/.

You can also initiate the copy from the CS side, in either direction:

# scp a file from your CS account onto Anvil from CS
on_cs$ scp filename you_at_xsede@anvil.rcac.purdue.edu:./cs87/oddeven/.

# scp a file from your Anvil onto CS from CS
on_cs$ scp  you_at_xsede@anvil.rcac.purdue.edu:./cs87/oddeven/ACCESSRESULTS .

Big Runs on Strelka

Try runs on Strelka before trying some runs on ACCESS. You will need to use vim to edit files on Strelka.

Swarthmore's Strelka cluster is set up to use the slurm batch scheduler like ACCESS systems. It is also a resource you can use for your course project. On both Strelka and Anvil, you will use sbatch to submit jobs to slurm.

To get a Strelka account, complete this form, which asks for your SSH public key: Strelka account request

Once you have an account, try out your oddeven sort on Strelka, first with a smaller sized run to see that it works, and then try a couple larger runs.

running on Strelka

Strelka (and Anvil) uses the slurm scheduler, so the process you use to compile and run on the ACCESS system is the same one you will use on this system.

Basically, you will use scp to copy your oddevensort.c file onto Strelka, along with a Makefile and a sample slurm batch script. Once this is set up, you just compile your code and then you can submit oddevensort job(s) to slurm using sbatch.

Here are the steps:

First, ssh into strelka, and create some initial subdirectories for your cs87 work:

$ ssh you@strelka.swarthmore.edu

# create some subdirectory structure the first time:
strelka$ mkdir cs87
strelka$ chmod 700 cs87
strelka$ cd cs87
strelka$ mkdir oddeven
strelka$ cd oddeven

strelka$ pwd           # should be in ~/cs87/oddeven

strelka$ scp you@cs.swarthmore.edu:./cs87/Labs/Lab4-repo/oddevensort.c .

Second, edit a slurm script to run oddeven sort.

You can copy over some makefiles and example slurm scripts to use

I have a Makefile and example slurm batch scripts on strelka that you can copy over and try out (note that the last two characters in my user name are the letter l followed by the number 1, they look very similar):

strelka$ cd ~/cs87/oddeven
strelka$ cp /home/tnewhal1/Makefile .
strelka$ cp /home/tnewhal1/*.sb .

There are two example slurm batch scripts for oddeven and one for the helloworld program. I suggest creating a few other .sb slurm scripts, each to run different sizes. The slurm scripts you copied over are:

  • oddeven.sb: runs oddeven on 2 nodes, 10 processes per node

  • oddeven.bigger.sb: runs oddeven on 4 nodes, 20 processes per node

  • helloworld_strelka.sb: for helloworld from my openMPI example programs

You will need to edit these to change the path to the output file.

Open the oddeven.sb slurm batch script in vim to see what the settings are. You can change these for larger or smaller sized runs.

You should change this to a path in your directory, not mine:

#SBATCH --output=/home/tnewhal1/cs87/oddeven/oddeven.%j.out
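The copied oddeven.sb has the actual settings, but as a rough sketch, a slurm batch script for an MPI run generally looks something like this (the node counts, time limit, output path, and problem size below are example values to adjust):

#!/bin/bash
# sketch of an oddeven.sb-style slurm batch script (example settings only)
#SBATCH --nodes=2                # total number of nodes
#SBATCH --ntasks-per-node=10     # MPI processes per node
#SBATCH -t 00:10:00              # run time upper bound (hh:mm:ss)
#SBATCH --output=/home/you/cs87/oddeven/oddeven.%j.out   # change to your path

time mpirun -np $SLURM_NTASKS ./oddevensort 900000 3     # N and num iterations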

Third, run module to load the version of the openMPI module you want to use:

# searches for modules with mpi in them
strelka$ module -r spider mpi

# pick one of the gcc openMPI ones to load and use
strelka$ module load openmpi/4.1.4-gcc-8.5.0

# more info:
man module

Fourth, compile oddeven, and submit your job to slurm:

strelka$ make
strelka$ sbatch oddeven.sb

# see queue state (or just your jobs in the queue)
strelka$ squeue
strelka$ squeue -u yourusername

# scancel: delete a job from the queue using its job id (squeue lists job ids)
strelka$ scancel jobid

# sinfo: to see how resources are currently allocated, unallocated, totals:
sinfo -o "%n %T %C"

After your program is run, you will see a result file in the location specified in your slurm batch file.
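For example, with an --output setting like the one above, the result files end up in your ~/cs87/oddeven directory and can be listed and viewed like this (the job number in the file name will differ for each run):

strelka$ ls ~/cs87/oddeven/oddeven.*.out
strelka$ cat ~/cs87/oddeven/oddeven.12345.out    # 12345 is the slurm job id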

Let’s try it out first with helloworld

We will try these steps together in Thursday lab:

  • after logging into strelka, scp over my openMPI_examples:

scp -r you@cslab.cs.swarthmore.edu:/home/newhall/public/openMPI_examples .
  • cd into the helloworld subdirectory

  • load the version of openmpi we want to use, and compile:

    module load openmpi/4.1.4-gcc-8.5.0
    make
  • let’s look at the slurm run script file and then submit it:

    vim hello_strelka.sb
    sbatch hello_strelka.sb

More Strelka information

Here is some more information about module commands:

# list available modules
module avail
------------------------- /opt/modulefiles -----------------------
  list of modules
  ...
(L) are loaded

# list modules with a pattern (mpi) in their name
module -r spider mpi

# list loaded modules
module list

# to load an mpi version into your environment:
module load  openmpi/4.0.2-gcc-8.4.1

# to unload a module
module unload  openmpi/4.0.2-gcc-8.4.1
module unload  gcc/8.4.1

If you echo your PATH environment variable, you will see the path to this module added to the front of your path:

echo $PATH
/opt/apps/mpi/openmpi-4.0.2_intel-19.0.5.281/bin: ...

You can also add this directory to your PATH in your .bashrc file (or put the module load command there) and then avoid running module load each time you log in.
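For example (just a sketch; use whichever module name module -r spider mpi showed you, or the bin directory that appears in your PATH after a module load):

# in ~/.bashrc on strelka: load the Open MPI module automatically at login
module load openmpi/4.1.4-gcc-8.5.0
# ...or put its bin directory on your PATH directly, for example:
# export PATH=/opt/apps/mpi/openmpi-4.0.2_intel-19.0.5.281/bin:$PATH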

For more information on Strelka: Strelka cluster

Big Runs on Anvil

Run some large runs of your Lab 4 solution on Anvil. Do this only after first running some large runs on the CS machines and on Strelka, to make sure your solution does not have any deadlock or other errors that would make it run forever, and to get some practice with sbatch before using Anvil.

It is important that you estimate a reasonably accurate upper bound for its runtime in your slurm script. Do not make the time super long: if your application deadlocks, it will use up a lot of ACCESS CUs for no good reason. Pick a reasonable upper-bound estimate (you want to pad it a bit so that it is long enough for your application to finish, but you do not want a deadlocked process to continue to use CUs for a huge over-estimate of total runtime). Do some experimentation with times on ACCESS, and use times on our system, to help you pick a good upper-bound estimate. Your time doesn't have to be super close, but if you expect the run to easily complete within 15 minutes, submit a slurm script with a time of a few minutes beyond this (maybe 17), and don't submit one with a runtime of 1 day, for example.

What to do

This is a high-level overview of an order in which to try out some things on ACCESS (details follow):

  • First make sure to set up your ACCESS and your Anvil account (follow all the directions under "ACCESS and Anvil Account Set-up"): ACCESS and Anvil accounts

  • Next, try ssh’ing into Anvil, and try running my MPI hello world example. You can scp it from CS to your Anvil account.

    anvil$ cd ~/cs87
    anvil$ scp you_on_cs@cslab.cs.swarthmore.edu:/home/newhall/public/ACCESS_MPI.tar .
    anvil$ tar xvf ACCESS_MPI.tar
    
    # then cd into the ACCESS_MPI subdirectory:
    anvil$ cd ACCESS_MPI

    Then open the README file in vim and follow the directions for how to submit jobs. Also look here: using ACCESS Anvil

  • Once you have tried out the helloworld, then practice with some small runs of oddevensort before submitting larger runs.

    • Once you have figured out slurm and how to submit jobs with helloworld, scp over your Lab04 solution (details: scp, slurm scripts) and build it on Anvil.

    • Then, write a slurm script to run oddeven sort (use the helloworld ones as examples). Try a few very small runs with some debugging output (don't enable all debugging output) to make sure it runs on Anvil.

    • Next, try a small run with printing disabled in preparation for larger runs (remove all debug output from your program by commenting out #define DEBUG, except that you can keep printing out timing information if your program does that).

    • Finally, do a few larger runs and submit their result files.

scp, slurm scripts

On Anvil you will need to scp over your oddevensort.c file, and either copy my Makefile and slurm batch file(s) or create your own to run it.

First, scp over your oddevensort.c file:

on_cs$ pwd       # get path to your Lab 4 repo
on_cs$ ssh you@anvil.rcac.purdue.edu

anvil$ mkdir cs87
anvil$ chmod 700 cs87
anvil$ cd cs87
anvil$ mkdir oddeven
anvil$ cd oddeven
anvil$ scp you_on_cs@cslab.cs.swarthmore.edu:./cs87/labs/Lab04-repo/oddevensort.c .

On Anvil, either create a Makefile and slurm run script (oddeven.sb) to do a small run of oddeven sort (use the helloworld ones as examples), or copy over the ones I have on my Anvil account as starting points to try out:

# from your cs87/oddeven/ directory:
anvil$ cp /home/x-tnewhall/oddeven.sb .
anvil$ cp /home/x-tnewhall/Makefile .

Then:

  • load the openmpi module (one time each login), and then run make to compile:

    anvil$ module load openmpi/4.0.6
    
    anvil$ make
  • run your oddevensort by submitting your oddeven.sb batch script to slurm with sbatch:

    anvil$ sbatch oddeven.sb
  • you can see the status of your job in the run queue:

    anvil$ squeue -u yourusername

Assignment: try some large runs of your solution on Anvil

Try out at least two large runs of your Lab 4 solution on Anvil.

  • Try out some small runs first, in the debug queue, to debug your slurm script before trying larger runs on the wide queue.

  • Make sure to disable or remove all debug output from your program (comment out #define DEBUG) in your long runs.

  • Write a couple of slurm submission scripts for long runs (large sizes). In your slurm scripts, you will want to modify at least these four lines (use the wide queue instead of the debug queue for your experiment runs):

#SBATCH -p debug               # which queue: **change to wide**
#SBATCH --nodes=2              # Total number of nodes: **increase!**
#SBATCH --ntasks-per-node=16   # number of tasks per node: **increase!**
#SBATCH -t 00:02:00            # Run time (hh:mm:ss) - 2 mins: **increase to 5 or 10 mins**


time mpirun -np $SLURM_NTASKS ./oddevensort 900000 3  # increase N and num iters

As a reference point, my solution (compiled without optimization, -g) completes in less than 6 seconds for the sizes in my oddeven.sb (shown above) when run on Anvil's debug queue.

My solution (compiled with -g), run on a moderate size (900,000) on the wide queue with 8 nodes and 32 tasks per node (256 total processes) for 1 iteration, takes 15 seconds.

Increasing nodes and ntasks-per-node will increase both total size N and P, and should result in more time. You may also want to recompile with -O2 in CFLAGS instead of -g to produce a better optimized executable (which will run faster).

In general:

  • Choose way more than 2 nodes and 16 total MPI tasks in your runs.

  • Try large sized arrays for each process to sort via command line args to your executable (add them to the command in the slurm script).

  • You will need to adjust the estimated runtime (try not to exceed 15 mins, and it is fine if your big runs don’t take all that long, but if you can get a few that are several minutes, that is great. Increasing P should help to increase time). If your estimate is too small and your program runs longer than your estimate, it will be killed before it completes. If your estimate is too long, it will wait in the job queue for much longer than it should.

be careful about time values on Anvil

Be mindful of ACCESS allocation compute units as you run these.

If your submitted job is killed because its runtime is longer than the time value you have in your slurm script, and if that value is already around 10 or 15 minutes, do not increase the time (unless it was almost done, in which case increasing it by a couple of minutes is okay). Instead, change some other sizes (number of nodes, number of tasks per node, size, number of iterations) so that a smaller run completes in about that time. Particularly if you are using a slow local sort, try smaller size values.

I suggest trying some smaller runs first to get some time estimates for your solution and then try some larger runs relative to the runtime of your solution.

Submit

You will submit the following via git:

  1. BIGRESULTS: add this file (like RESULTS) that contains the results of your large run tests on the CS machines. Make sure to edit this file so that it is clear what run sizes the results you show are from.

  2. Two Strelka output files from two large runs. The only program output should be the process with rank 0 printing out the size of the problem: N and P (make sure you have no debug printing output in these runs… these files should be very small). If you forget to include this printing of the problem size N and P, then edit the output file to include these values (copy in the slurm script corresponding to the run, or just edit the output file to add them).

  3. Two Anvil output files from two large runs. The only program output should be the process with rank 0 printing out the size of the problem: N and P (make sure you have no debug printing output in these runs… these files should be very small). If you forget to include this printing of the problem size N and P, then edit the output file to include these values (copy in the slurm script corresponding to the run, or just edit the output file to add them).

    vim ACCESSResults
    # in vim you can import file contents using :r in ESC (command) mode
    # put cursor where you want file inserted:
    :r output.jobnum....
    :r oddeven.sb        # this job's slurm script
    :r output.jobnum....
    :r oddeven2.sb       # this job's slurm script
    ...

    You can scp the ACCESSResults file over to your cs account and add it to your repo:

    anvil$ scp filename you@cs.swarthmore.edu:./cs87/Labs/Lab04-you-partner/.
    
    # or scp to home and then on CS just mv file from your
    # home directory into your Lab04 repo:
    anvil$ scp filename you@cs.swarthmore.edu:.
    cs$ cd ~/cs87/labs/Lab04-you-partner
    cs$ mv ~/filename .
  4. A couple of output files from running on the Swarthmore Strelka cluster, similar to the ACCESS results output (e.g., collected into a StrelkaResults file).

Then just add these to your git repo and commit and push:

$ cd ~/cs87/labs/Lab04-you-partner
$ git add BIGRESULTS
$ git add ACCESSResults
$ git add StrelkaResults
$ git commit -m "lab 4b results"
$ git push

Before the due date, you or your partner should push Anvil output files from two large runs to your Lab04 repo (scp them over to your cs account so you can git add, commit, and push them).

If you have git problems, take a look at the "Troubleshooting" section of the Using git page.

Handy Resources