1. Results/Measures
You should run timed runs of different experiments, and multiple instances of each experiment. Present results as the average over runs, and note standard deviations. See the Lab 1 Experiments section for more details.
Here are some measures that you may want to use to present results:
-
Average Total Runtime (and include Standard Deviation too).
-
Speed-up:
Speed up = (Sequential Time) / (Parallel Time)
-
Efficiency:
Efficiency = (Speed-up) / P, where P is the number of cores or threads
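As a rough illustration of the measures above, here is a minimal Python sketch that computes the average, standard deviation, speed-up, and efficiency from lists of runtimes. The function names and the timing values are made up for illustration, and using the sample standard deviation (rather than the population one) is an assumption; use whichever your write-up calls for.

```python
from statistics import mean, stdev


def summarize(times):
    """Return (average, sample standard deviation) of runtimes in seconds."""
    avg = mean(times)
    # stdev needs at least two data points; report 0.0 for a single run
    sd = stdev(times) if len(times) > 1 else 0.0
    return avg, sd


def speedup(seq_time, par_time):
    """Speed-up = (Sequential Time) / (Parallel Time)."""
    return seq_time / par_time


def efficiency(seq_time, par_time, p):
    """Efficiency = Speed-up / P, where P is the number of cores or threads."""
    return speedup(seq_time, par_time) / p


# Hypothetical timings in seconds (not real measurements)
seq_runs = [40.0, 42.0, 41.0]   # sequential version, 3 runs
par_runs = [10.0, 11.0, 12.0]   # parallel version with P = 4 threads, 3 runs

seq_avg, seq_sd = summarize(seq_runs)
par_avg, par_sd = summarize(par_runs)
print(f"sequential: avg={seq_avg:.2f}s stddev={seq_sd:.2f}s")
print(f"parallel:   avg={par_avg:.2f}s stddev={par_sd:.2f}s")
print(f"speed-up:   {speedup(seq_avg, par_avg):.2f}")
print(f"efficiency: {efficiency(seq_avg, par_avg, 4):.2f}")
```

Note that speed-up and efficiency are computed here from the averaged runtimes, not from individual runs.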
2. Examples to try out
Copy over some example scripts that we will use to try out some of these tools, and that may be useful examples for writing and using your own experiment scripts:
cd cs87
cp -r ~newhall/public/cs87/experiment_tools .
cd experiment_tools
ls -l
chmod 700 *.sh # if need be, set executable permission so you can run them
-
run.sh
: example bash script for running a bunch of experiments with different input parameters. To run:
./run.sh
-
run_outfile.sh
: example bash script for running a bunch of experiments with different parameters and capturing all output to a file. Note the bash command syntax for running time matrixmult and redirecting its output to a file (&>> appends stdout and stderr to the specified output file). To run:
# output will go to file named "myoutputfile" in your home directory
./run_outfile.sh ~/myoutputfile
# or to default file name "output" in the current directory
# you can't write into my directory so it will only work if you copied
# this script over into your subdirectory
./run_outfile.sh
-
killmytests.sh
: an example script to kill all your experiments. Always:
-
first pkill -9 the run script (run_outfile.sh in this example)
-
then pkill -9 your program executable (matrixmult in this example)
This script is very useful if you want to stop all your experiments from running. In particular, if you want to stop them in the middle of the night, you can schedule a cron job to run this script.
3. Tools/Utilities
3.1. how to find an "idle" machine to use
You want to run your experiments on a machine that is mostly idle so that other processes running on the machine do not interfere with your results. If no one is logged in, the machine may be idle, but someone could be running jobs in a screen session. Also, there may be people logged in who are not actively running anything on the system (they forgot to log out when they left the lab, for example).
You can run who
to see who is logged in, and top -H
or htop
to see what is running on a machine, and who is logged in, and
smarterSSH
to find good machines on which to run experiments. In
htop
if you hit the F6
key you can sort results by different
columns. My
top and htop help page
describes some examples of how to configure top to select what data top shows
(htop is similarly configurable). See the man pages for top
and htop for more options.
Let top
run for a minute or so to be certain a machine is
idle before firing off a lot of experiments. Also, if you
use machines reserved for our class, then you should be able
to find an idle one (please share these machines).
Here is some more information about finding idle machines
3.2. screen and tmux
screen
and tmux
are useful for logging in, starting something running,
and then logging out while it runs: the processes you start
running in a tmux or screen session stay running when you log out.
Here are the steps to using screen
:
-
login and run screen to start a screen session:
screen
-
start the script you plan to run in this session. I suggest running a bash script of experiments inside a script session (details below), or run a bash script that redirects output of each run to a file.
-
detach from the screen session by typing:
Ctrl-a d
-
then log out of the computer if you’d like
To re-attach to a screen session, log in to the computer
on which you ran screen
and started some experiments
running and detached, and then run:
screen -r
And you can attach and detach as many times as you’d like from the same screen session.
Here is some more information about tmux. For your use of tmux to run experiments, you likely don’t need to configure tmux with multiple panes.
Make sure to exit your screen
and tmux
sessions on machines when you
are done using them.
3.3. script and dos2unix
script captures a terminal session to a file. dos2unix cleans up the resulting file after you quit script. See more details here:
Python is a nice language to use to process the resulting typescript file: pull out the timing results for related runs, compute averages and standard deviations, and print the results in a nice form.
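Here is a minimal sketch of that kind of post-processing. It makes two assumptions that you should check against your own output: each run is preceded by an echoed label line of the form used in the example bash script in the bash scripts section ("gol -t ... -n ... -m ... -k ..."), and timing comes from the bash time built-in, which prints a line like "real 0m1.234s". The function names and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# Assumed label format: the line echoed by the run script before each run
LABEL_RE = re.compile(r"^gol -t \d+ -n \d+ -m \d+ -k \d+$")
# Assumed timing format: bash's time built-in, e.g. "real    0m1.234s"
REAL_RE = re.compile(r"^real\s+(\d+)m([\d.]+)s$")


def parse_times(lines):
    """Map each echoed label to the list of 'real' times (in seconds) after it."""
    runs = defaultdict(list)
    label = None
    for line in lines:
        line = line.strip()
        if LABEL_RE.match(line):
            label = line
            continue
        m = REAL_RE.match(line)
        if m and label is not None:
            runs[label].append(int(m.group(1)) * 60 + float(m.group(2)))
    return dict(runs)


def report(runs):
    """Print run count, average, and sample std dev for each configuration."""
    for label, times in runs.items():
        sd = stdev(times) if len(times) > 1 else 0.0
        print(f"{label}: runs={len(times)} avg={mean(times):.3f}s stddev={sd:.3f}s")


# Made-up sample lines standing in for a cleaned-up typescript file
sample_lines = [
    "gol -t 1 -n 256 -m 256 -k 1000",
    "real\t0m2.000s",
    "user\t0m1.900s",
    "sys\t0m0.010s",
    "gol -t 1 -n 256 -m 256 -k 1000",
    "real\t0m4.000s",
    "gol -t 2 -n 256 -m 256 -k 1000",
    "real\t1m1.500s",
]
report(parse_times(sample_lines))
```

To process a real file, pass an open file object instead of the sample list, for example: with open("typescript") as f: report(parse_times(f)). If you time runs with a different tool (such as /usr/bin/time), the output format differs and the regular expression would need to change.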
3.4. bash scripts
Write a bash script to fire off a bunch of experiments. Then just run
the bash script and come back later when done. It’s good to have some
echo
commands in your bash script to print out some information
about particular runs: this will help with your post-processing scripts
to find timing results and compute averages and std dev. The lab01
starting point code included one example bash script; try it out to see what it
does. I also have links to bash programming off my help pages:
bash shell programming
When you create a bash script, make sure the file is executable to run it:
vim runexper.sh # or some other editor
chmod 700 runexper.sh # set to executable
ls -l
./runexper.sh
Also, try running your bash script a few times before starting it up in screen and coming back later: make sure it is doing what you think it is. You can always comment out the call to the gol program in the script to see if it is doing what you want (# is the bash single-line comment):
#!/bin/bash
for ((n=256; n <= 2049; n=n*2))
do
  for ((t=1; t <= 32; t=t*2))
  do
    echo ""
    echo "gol -t $t -n $n -m $n -k 1000"
    # time ./gol -t $t -n $n -m $n -k 1000 -x
  done
done
If I run the above bash script I’ll see all the calls to echo print out parameter configs and see if they are what I expect. Then uncomment and run.
In your bash script make sure you run time ./gol …
to collect runtimes.
3.5. cron
You can add a cron job to run your script at a particular date and time by
editing the crontab file on the machine on which you are running your experiments (e.g.,
on chervil
or some of the other CS87-only machines):
$ ssh chervil
$ crontab -e
Then add a line like this to run the killmytests.sh script at a specific time and date (here, at 8pm (20:00) on January (1) 31):
0 20 31 1 * /home/newhall/public/cs87/experiment_tools/killmytests.sh
Similarly you can add a cron job to run your experiments at a specific time (here I’m starting them at 4:05 am on February 3):
5 4 3 2 * /home/newhall/public/cs87/experiment_tools/run_outfile.sh ./mytests
NOTE: after your cron jobs run, please make sure to run crontab -e
again
to remove them from the crontab file (so that cron doesn’t run them every year
on this date at this time until we remove your account).
You can run cal
and date
to list the current date and time.
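The field order in a crontab line (minute, hour, day of month, month, day of week) is easy to get wrong, so here is a small Python sketch that builds the line from a date and time. The function name cron_line is made up for illustration, and the year passed to datetime is arbitrary: cron entries have no year field, which is why an entry left in the crontab file would run again every year.

```python
from datetime import datetime


def cron_line(when, command):
    """Build a crontab entry: minute hour day-of-month month day-of-week command."""
    # "*" in the day-of-week field means "any day of the week"
    return f"{when.minute} {when.hour} {when.day} {when.month} * {command}"


# 8pm (20:00) on January 31; the year is ignored by cron
print(cron_line(datetime(2025, 1, 31, 20, 0),
                "/home/newhall/public/cs87/experiment_tools/killmytests.sh"))
```

This prints the same kill-script entry shown earlier in this section; you would still paste the resulting line into the file opened by crontab -e.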
4. Let’s Try some stuff out
Let’s try some of these steps together in the example you copied over.
First let’s try out screen and script:
-
ssh into a machine, see if idle
-
start screen
-
cd to directory containing gol and bash script
-
start script
-
start bash script to run experiments
-
hit return and type
exit
(to terminate script…good practice)
-
detach from screen
-
run top -H just to see if program is running
-
log out of machine
Then later, ssh back in the machine and re-attach to screen session.
On a different machine, create a cron job to run a test script and another to kill my test script and running test programs.
-
run
date
to get the current time
-
run
crontab -e
and let’s start run_outfile.sh in 2 minutes and kill it one minute later. In this example, let’s say it is Sept. 14th at 1:30pm right now:
$ crontab -e
# start run_outfile.sh (with output file mytests in your home directory)
# at 1:32pm on Sept. 14 (minute:32, hour:13, day:14, month:9)
32 13 14 9 * /home/tnas/cs87/experiment_tools/run_outfile.sh ~/mytests
# run killmytests.sh at 1:33pm on Sept. 14
33 13 14 9 * /home/tnas/cs87/experiment_tools/killmytests.sh
Now, let’s run top -H
or htop
and see what happens.
You can also run ps --user <yourusername>
to list all your
running processes on a machine.
5. Handy Resources
-
Class EdStem page for questions and answers about assignment
-
The Handy Resources and Experiments Sections of the Lab 1 assignment page.
-
tools for running experiments off my Help Pages, which has more documentation and links to information about utilities for running experiments.
-
CS Machine Specs page from the "cs lab help"
-
bash shell programming links