CS 45 — Lab 4: Simulating a Page Table
Checkpoint: March 21/22 (demo during lab)
Due: Wednesday, March 27 @ 11:59 PM
1. Overview
For this lab, you’ll be translating a sequence of virtual memory accesses into their corresponding physical memory addresses. You’ll build a simulator that will keep a single-level page table for one process and update the page-to-frame memory mappings as the process accesses virtual memory addresses. You’ll be simulating the page table and physical memory rather than using Linux because:
-
The details of memory hardware are nasty, particularly for architectures like x86 that have been around for a long time and support lots of backwards-compatibility features.
-
If your non-simulated implementation breaks, you’re unlikely to get much feedback, making development very frustrating.
-
Building an abstract simulator should cover the important high-level concepts without bogging us down in architectural details.
The provided starter code has three main components:
-
frames.h
&frames.c
: Simulates physical memory (manages physical frames). -
pages.h
&pages.c
: Simulates virtual memory mappings (page table) and tracks the page replacement policy. -
simulator.c
: The main file that reads a sequence of addresses, performs memory mappings, and prints the results.
Your simulator will take several parameters to vary the characteristics of the memory system you’re simulating, including the size in bytes of a page/frame, the size in bytes of the virtual address space, the size in bytes of the physical address space, and the page replacement policy to use.
1.1. Goals
-
Design and implement a single-level paging system.
-
Compare and contrast page replacement policies.
-
Use the
mmap()
function to map a file into a process’s virtual address space. -
Reason about the sizes and number of bits used to track pages and frames.
1.2. Handy References
1.3. Lab Recordings
[Week 2] |
2. Requirements
2.1. Memory and Files
Your simulator should free all memory, close all files it’s no longer using,
and get a clean bill of health from valgrind
: no invalid memory accesses,
uninitialized variables, or memory leaks.
Running valgrind
shouldn’t be an afterthought! Use it as you incrementally
test your code, and it’ll help you to catch memory problems sooner.
2.2. Address Translation
Your simulator should read a sequence of virtual memory addresses from a memory-mapped file. The file will contain addresses that are stored as 64-bit integers in a little-endian binary format (the native format for integers on x86).
The simulator already reads in the sizes of a page/frame, virtual address space, and physical address space (all in bytes) as command line parameters. It passes those values to the physical and virtual memory initializers, which you will write. Those initializers should set up the memory based on the provided sizes. You may assume that:
-
The page/frame size will evenly divide the VAS and PAS sizes.
-
The page/frame size will be a power of two.
-
The VAS size will be a power of two.
-
Your simulator will not be asked to translate a virtual address that is larger than the provided VAS size.
2.2.1. Mapping the Input File
The mmap()
function will
allow you to place the contents of an input file directly into your process’s
virtual address space. That way, you won’t need to call functions like
read()
or fread()
to access the file, you’ll be able to access it as if it
were an array loaded into memory.
When you map the file, you’ll want to map it as private (only accessible to the
simulator process) and read only (your simulator doesn’t change the contents).
While mmap()
allows you to express where in the virtual address space you’d
like the file to be mapped, for our purposes we don’t actually care — it’s
easier to just let the OS decide for us.
PLEASE look over the manual page for
mmap()
. It will tell you
how to express all of the preferences stated above. It will also describe how
you can check the return value for errors. I’m happy to answer mmap()
related questions, but I’d also like you to experiment a bit with memory
mapping the file.
2.3. Reporting Results
Your simulator can print any debugging information you’d like to the standard
error stream, but standard out should be reserved for your results, which
should be formatted as follows: for each input virtual address, you should
print the corresponding full physical address that it maps to, followed by a
newline (\n
). Each address should be printed as a 64-bit hexadecimal number
with a "0x" prefix and any necessary leading zeroes to pad the length of the
address to 16 hex digits. The starter code has an example of the printf
format
you should use.
Finally, after all the addresses have been translated and printed, you should print just a single number on a line by itself: the total number of page faults that occurred during your translations.
2.4. Free Frame Management
When initializing physical memory, frame 0 should be permanently reserved for the OS. You should initialize all other frames as free. When you detect a page fault (you are asked to map a page that is not in physical memory), you should first check physical memory to see if any frames are available. If so, you should give the faulting page the lowest numbered free frame and mark the frame as in-use.
2.5. Page Replacement
When you detect a page fault after all frames are full, you’ll need to evict a
mapped page to make room for the one that’s faulting. Your simulator should
support three page replacement policies to choose a victim page: LRU, Clock,
and a third policy that you invent. The selected policy will be passed in as
an integer command line parameter using constants that are defined in
pages.h
: POL_LRU (1), POL_CLOCK (2), and POL_CUSTOM (3).
-
For LRU, you should track the full order of page usage and evict the least recently used page.
-
For clock, you should initialize the clock hand to page 0.
-
For your custom policy, you can implement any non-trivial policy you want. To be eligible for extra credit, your custom policy must be deterministic (i.e., not random — every run should produce the same results if given the same inputs).
2.6. Data Storage and Function Interface
You’re welcome to add or modify structs, but you should not change any of the
function prototypes in pages.h
and frames.h
. Any global state or helper
functions you add to pages.c
and frames.c
should be static
(only accessible
in the file it’s defined in). Adding helper functions in those files is
strongly encouraged, but the simulator should ONLY be able to interact with
physical memory using the functions declared in frames.h
and the page table
using functions declared in pages.h
.
To be clear:
-
pages.c
should not includeframes.h
-
frames.c
should not includepages.h
-
data should not be shared between either module and the main simulator unless it’s through the provided function interface, which you should not change.
3. Checkpoint
To meet the checkpoint requirements, you should:
-
Add enough to
translate()
to see the sequence of virtual addresses that are arriving for translation (for debugging purposes). -
Sketch out a design for LRU (data structures and the operations you’ll need fully described in comments).
4. Tips & FAQ
-
When getting started, you should sketch out your
translate()
function using the functions that are available to you inpages.h
andframes.h
, even if you haven’t implemented them yet. -
The starter code provides several example sequences in the
input/
directory that you can use for testing. Thenotes
file contains more information about them. If you’re using these tests, make sure your VAS is at least 65536 bytes:./simulator 4096 65536 <PAS> <policy> <input file>
-
You can test your own memory address sequences by using the
write_bytes.py
python script to create a new mmapable input file. To use it, add a sequence of memory addresses in the list at the top and then run it with one argument: the filename you’d like to write. -
Even though you’ll only be using one page replacement algorithm at a time, it’s probably easiest to keep state for all the algorithms in your page struct rather than trying to dynamically allocate just one or the other (e.g., using lots void *'s, casting, and/or function pointers).
-
You might find the sys/queue.h header helpful, particularly the TAILQ library it provides.
5. Bonus Competition
This portion of the lab is not required. It’s for fun and a small amount of extra credit.
For a chance at extra credit and eternal CS 45 glory, you may optionally nominate your custom page replacement algorithm to compete against other submissions.
5.1. Rules for Participation
-
To participate, you must define a third replacement policy, numbered 3, and write a brief description of the intuition behind your policy in a file named
competition.txt
. Just a couple of sentences is fine. If nocompetition.txt
file is present, your submission will not enter the competition. -
You may NOT look ahead in the stream of memory accesses. That is, your simulator must only consider the address it’s currently translating plus any past information that it has seen, but it may not look at the addresses that will be accessed in the future.
-
You may NOT submit your LRU or Clock implementation as your competition submission. It needs to be a new algorithm of your design!
-
Your policy must be deterministic. That is, given the same inputs it must always produce the same output.
5.2. Prizes
I’ll run your new policy on two address streams: one that has good locality, and another that has poor locality. For each of the two inputs, I’ll declare a winner and runner-up based on the total number of page faults. The winner gets 1 extra lab point, and the runner-up gets 1/2. In the interest of balance, a group can only win one of the two files, with the good locality file being determined first.
6. Submitting
Please remove any excessive debugging output prior to submitting.
To submit your code, commit your changes locally using git
add
and git commit
. Then run git push
while in your lab
directory.