CS45: Lab 4

Content:

Problem Introduction
Part 1: System call Implementation Details
Part 2: Experiments and Report
Useful Unix Utilities
Latex: Word processing like a Computer Scientist
What to Hand in
Demo

Problem Introduction

This assignment is a study of Linux's implementation of virtual memory. It is a two part assignment, the first part involves implementing a simple system call to obtain page fault statistics for individual processes, groups of processes, and for the system as a whole. The second part involves evaluating Linux's page replacement policy using your system call (and other utilities) to obtain performance measures of workload tests you design and run. You will write a 3-5 page report describing your experiments and results.

This project requires a fair amount of time in designing, writing, and running user-level experiments, and very little time writing and testing your system call. I'd recommend leaving yourself at least one full week, and better yet, 1.5 weeks, for designing, writing, and running experiments and for writing your final report.

Project Details
Part 1: Adding a new system call

Start by setting up a git repo for your lab 4 work (remember to add three users: you, partner, and your shared cs45X user). Then, copy over my starting point files and add them to your repo:

cp ~newhall/public/cs45/lab04/* .

The starting point contians files with #includes and a Makefile for a test program.

Implement a new system call:

int pgfltstats(pid_t pid, int flag, pf_info_struct *info)

The pf_info_struct should contain fields for:

total number of minor page faults
total number of major page faults
number of processes included in these counts

If pid is:

-1: then return total count statistics for all processes in the system.
positive then look at the flag field and do the following:
1. flag is PGFLTSTAT_PROC: then return count statistics for the single process with the matching pid
2. flag is PGFLTSTAT_OWNER: then return the total count statistics for all process in the system with the same owner (uid) as the specified pid. Use the real_cred field in the task_struct to find the owner's uid value.

You can use fields in task_struct for obtaining the values you need. Make sure to check for error conditions and return appropriate error values as in the previous assignments.

You can use vmstat, ps and /proc/pid/ files to help you determine if your system call is implemented correctly (look at vmstat, ps, and proc man pages for more info). Here is an example of printing out the uid of every process in the system, and then sorting the results by pid:

ps axo pid,ruid,time,cmd | sort -k 2

You may also want to use your procinfo system call from lab 2 (perhaps modifying it to include uid information).

Part 2: Experiments Testing the VM system and a Written Report

For this part, you will use your system call (and other utilities) to collect information about page faults and other performance data for different programming loads. I want you to (through experiments) determine what page replacement algorithm Linux likely implements. I'm sure you can figure out what Linux's page replacement policy is just by reading Linux documentation and/or reading Linux source code. However, the point of this assignment is to see if you can verify experimentally what you know (or suspect) about Linux's page replacement policy.

Kernel code you may want to browse:

page fault handling routines in arch/x86/mm/fault.c and mm/memory.c
page replacement code in mm/vmscan.c, which begins with try_to_free_pages.

Think about running experiments to answer the questions about how the policy works for different types of workloads, and what that tells you about the policy. In designing your experiments, you should write user-level test programs that access memory in different patterns, and run different experiments using them to collect performance data to help you answer questions like:

When does Linux's page replacement policy work well (what types of workloads)?
When does it not work well?
How well does it work for a mixed workload?
etc.

And what do the answers to these questions tell you about Linux's page replacement policy? For example try and answer questions like:

is it most like FIFO?
or MRU?
or LRU?
or Working Set?
Why or why not? And, in what way(s)?

The idea is to design experiments that support your answers to these questions.

Getting Started With Testing

To get started, you will likely want to read kernel code and other Linux documentation to help you develop some initial hypotheses. You can test negative hypotheses as well as positive ones (i.e. "I don't think Linix implements the X policy, and I'm going to test this by ..."), but you should have some positive hypotheses as well; there is a difference between negative hypotheses that can help support your positive ones, and bad hypotheses (and bad experiments) that don't really show anything.

Next,think about how to design experiments to test your hypotheses:

You will want to write some user-level test programs that allocate memory and access it repeatedly in some pattern (you need to allocate enough memory to cause page faults).
Try to write very simple test programs; complicated test programs often complicate what they are testing, and simple test programs often clearly test what you think they are testing.
Use these test programs to design and run experiments to help support (or refute) your hypotheses.

Look at the details in the Report section below to help guide your hypothesis and experiments design and testing.

To write test programs that use enough memory to trigger paging, look at the /proc/meminfo file. This give the total amount of free memory (MemFree, in Kbytes). If you are running an experiments, make sure the the process(es) allocate enough memory to trigger page replacement (they need to allocate enough virtual memory space so that it won't all fit into free physical memory). However, be careful that you do not allocate too much memory, otherwise, you can run out of swap space and the kernel will start killing process. The amount of free swap space can be found in /proc/meminfo (SwapFree, in Kbytes).

Note: it will take several rounds of trial and error to get your experiments set so that they are really testing the thing you want to test. As a result, start early, run some initial tests, see what they show (or don't show) and then re-design to get tests that show what you want.

Also, each individual experiment should be run several times; a single run of an experiment is not definitive. You should run several iterations of each experiment and take the average (and compute stddev) over this set.

You may also want to periodically re-boot and start with a clean system from time to time as you go: RAM will get warmed up after the first run of something (the a.out file can still be in RAM from the previous run), and this may or may not be important to your experiments.

Report

You should write 3-5 page report (do not give me anything longer than 5 pages) describing your experiments and results.

In particular, your report should have the following:

Introduction. Describe at a high-level what you are doing and why, and describe what your high-level hypothesis is (i.e. "We started out guessing that Linux's page replacement policy is X, but our experiments will show that it is in fact, more like Y" or "we start out making no assumptions about Linux's page replacement policy, and our experiments will answer the question of is it more like X, Y or Z.").
Short paragraph describing how you implemented your system call to get get per-process, per-owner, and system wide page fault counts? Also, describe other Unix utilities you used to collect performance data for your experiments.
Main part: describe your experiments and results: For each experiment, you should:
1. Clearly state what your hypothesis is
  (i.e. "Application that does X should be a bad/good case for Linux's page replacement policy because ...)
2. Explain how your experiment is testing that hypothesis
  (We are testing this hypothesis by running application P, which does ..., N times on and collecting X,Y, and Z metrics using our system call, vmstat, time,... With these measures, we can see whether or not ... because ...).
3. Present your Results.
4. Explain your results!
  (Our results show that ... This does/does not match our expected result, because ... (if it doesn't, think about some other tests you could run to explain why it doesn't, or look for data you do have to explain it)
Quality of Experiments is much, much better than quantity; a few well thought out and well done experiments is sufficient.
Conclusion. It should include a statement of the major result(s) of your experiments (which policy(ies) is Linux's most like? and in what way(s)?). Also, tell me what you found to be the most difficult part of the hypothesis testing phase of this assignment.

Some useful system calls and utilities for testing purposes

sleep(val) make program block for val seconds
usleep(val) make program block for val micro seconds
clone create a kernel level thread
you can also write a pthreads program to create threads. See my documentation about pthread programming and pthreads man pages.
fork you know what this does
your system call
your system calls from previous labs (Events or procinfo may be useful for testing or for use in test programs you write).
time a.out times the execution of the a.out file
vmstat: prints out statistics about the VM system. This may help you in verifying that your system call is implemented correctly (vmstat alone is not enough to do your testing...you need your system call). 'vmstat -n 1' prints out vmstat statistics every 1 second.
watch -n X [command]: every X seconds execute command. For example, to print out the contents of the stat file for process 22832 every 1 second I'd do the following:
```
	watch -n 1 cat /proc/22832/stat
```
/proc: file system interface to processes. You can read (and write) variables in a process' address space (I'd recommend never writing).
- in /proc/sys is the /proc interface to the kernel. By cat'ing files in these subdirectories, you can read kernel statistics.
- in /proc/[pid] directories are the /proc interface to individual processes in the system. Again you can cat the contents of files in these directories to get per-process statistics.
- /proc/meminfo
- /proc/vmstat: system-wide VM information
- /proc/swaps: prints out the current amount of swap space in use. You can cat this out to see if your experiments are triggering swapping. In Linux, only pages that are written to are swapped out (like stack, heap or global space). Pages of instructions (.text) are paged in from the file system and not to/from swap.
tools for running experiments using screen, script, and writing bash scripts to run a set of experiments. You can also write some simple python scripts to process post experiment output in a typescript file, for example. As you write bash scripts to run large sets of experiments, keep this post processing in mind (you may want to echo out some state periodically to make it easier to write such a post processing script).
You will also likely want to run virtualbox in non-gui mode so that you can start up a bunch of experiments, logout, and then come back later to see how they worked out.

Latex

You can use any word processing software you'd like to write your report. However, if you want to learn latex (word processing the Unix way), there are some starting points for a report (paper) and a report using bibtex for references (bibtex) in the following directory:

/home/newhall/public/latex_examples

For this assignment, the example in the paper subdirectory is likely a good starting point. In the paper subdirectory is an README file with information on how to create a .dvi .ps and .pdf file from the latex source file (example.tex). In addition, there is a makefile that contains rules for building all of these. The example.tex file, contains examples of importing postscript figure files, creating sections and subsections, using different fonts, creating lists, and referencing figures within a document. Probably the easiest thing to do is to grab the contents of this directory as a starting point for your report, then edit example.tex with your report.

Also, off my Unix and C help pages, I have much more documentation about creating documents, and figures, graphs: Tools for Documents and Figures

Submit

Code: using cs45handin, submit a tar ball with the following:
1. README file with (1) your names, (2) how many late days you have used on this assignment, and how many you have used total, (3) a description of how to run your test programs (show example command lines, or how to run bash scripts).
2. copies of all .c and .h files you added or modified in the kernel to implement your system call
3. copies of all your test programs used in experiments, and the test program you used to test your system call for correctness.
  also include any bash scripts you wrote for running experiments.
Report: turn in a hardcopy of your report in class on the same day that you submit your tar ball. You should also email me a pdf file; but I want a print out handed in in class too.
NOTE: If you turn in your report later than the beginning of class on the due date, then your assignment is late (even if your tar ball was submitted before the due date).

Demo

You and your partner will sign up for a 20 minute demo slot. You should show me that your system call is correct, robust, and implements all required features (and think about verification of this to demo me). In addition, I'd like to see you run one or more of your test programs that you ran as part of your experiments.

CS45 Lab 4: Paging Experiments

Getting Started With Testing

Report

Demo