1. Due Date
- Checkpoint (floors 1 and 2): due 11:59 PM, Tuesday, March 3 (before break). You may not use late days on the checkpoint.
- Complete Solution (floors 1, 2, 3, and 4; floor 5 is extra credit): due 11:59 PM, Tuesday, March 17.
Your partner for this lab is: Lab 5 Partners
2. Handy Resources
- Weekly Resources:
  - Refer to the Week 6 weekly lab page
  - IA32 Reference Sheet, by Julie Zelenski
- General Resources
3. Lab Goals
- Gain experience reading and tracing through the execution of IA32 assembly instructions.
- Enhance your understanding of how C statements, data structure accesses, and function calls translate to IA32 instructions.
- Practice with tools for examining binary files.
- Put your GDB skills to work to solve an assembly code puzzle.
4. Lab Overview
You have been sleepwalking again, and you wake up on the roof of Parrish Hall. You need to find your way through its maze of floors and out of the building in time for your first class. The problem is that, due to construction, no stairwell connects more than two floors. As a result, you need to travel along each floor to find the next open stairwell down to the floor below. However, there are people or things along your path that can trip you up and impede your progress, forcing you to run back to the roof to try again.
In this assignment, you and your partner are going to receive a binary maze program. Your maze has 5 phases, one for each floor of Parrish Hall (from the roof to out the door). Each floor’s phase is a binary puzzle that needs to be solved to move on to the next floor. To solve a puzzle you need to enter a correct phrase on stdin (you can also have your maze read phrases from a file given as a command line argument).
Your goal is to solve all phases/floors of your maze, limiting the number of times you trip up along the way and have to start all over.
Your maze will automatically notify me every time you trip up and whenever you have solved a puzzle on a particular floor.
5. Lab Starting Point Code
5.1. Getting Your Lab 5 Lab Repo
Both you and your partner should clone your lab repo into your cs31/labs subdirectory:
- Get your lab ssh-URL from the CS31 git org. The repository to clone is named Lab05-user1-user2, where user1 and user2 are the user names of you and your lab partner.
- cd into your cs31/labs subdirectory:
$ cd ~/cs31/labs
$ pwd
- Clone your repo:
$ git clone [your Lab05-user1-user2 url]
$ cd Lab05-user1-user2
$ ls
README.md maze_ID maze_how_we_solved_it
Next, add a file named soln.txt to your repo. One of you should run:
$ touch soln.txt
$ git add soln.txt
$ git commit -m "adding soln.txt"
$ git push
Then the other partner should do a git pull.
The files in here include: README.md, maze_ID, maze_how_we_solved_it, and soln.txt.
The README.md contains details of what you’re expected to fill in for each of these other files.
There are more detailed instructions about getting your lab repo in the "Getting Lab Starting Point Code" section of the Using Git for CS31 Labs page.
5.2. Getting Your Maze: we will do this together in lab.
- First, cd into your ~/cs31/labs/Lab05-user1-user2 subdirectory.
- Next, only one of you will follow these steps to get your maze.
Each time you register on the maze server, you will receive a unique maze with unique solutions. For this reason, you should only perform the following steps once!
- In a browser on a CS lab machine, one of you should enter this URL:
http://squash.cs.swarthmore.edu:8000/
- Enter your CS user name and choose Submit.
- Choose to save the mazeX.tar file in the dialog box that pops up. Save this .tar file into your Lab05-user1-user2 repository directory. If you are not able to choose your Lab05-user1-user2 directory as the download directory, then the file is likely saved in your Downloads directory. Use the mv command to move it into your lab repo:
cd ~/cs31/labs/Lab05-user1-user2
mv ~/Downloads/mazeX.tar .
- Add your mazeX.tar file to your repo:
git add mazeX.tar
git commit -m "our maze"
git push
At this point, your partner can git pull and you can both untar your mazeX.tar file. A tar file is an archive file (a single file that contains a number of files). These files can be extracted by running the tar command:
cd ~/cs31/labs/Lab05-user1-user2
ls # you should see your mazeX.tar file
tar xvf mazeX.tar
README maze* main.c
Do not run the maze program yet!
5.3. Starting Point and Maze Program Code
The files included in your repo are:
- maze_ID: a file into which you will add your maze ID (maze number).
- maze_how_we_solved_it: a file into which you will add your description of how you solved each phase of your maze.
- soln.txt: a file into which you will add the solution to each maze phase, one solution per line. You can use this file as input to your maze program.
The files included in your mazeX.tar file are:
- main.c: contains the maze program’s main function. You can open main.c in vim (or another text editor) and see what the code is doing.
- maze: your maze program binary. This binary contains an IA32 maze (described below) that you need to solve for this lab. Do not run the maze program yet!
5.4. Checking the Status of your Maze
To check the status of your maze program, enter the following url into a browser (that is running on a CS machine):
http://squash.cs.swarthmore.edu:8000/scoreboard
or click here: Maze Scoreboard
5.5. Running your Maze program
The maze program can be run both with and without your soln.txt input file (once you have solved some floors, we recommend running with it).
Do not run the maze program yet!
As you are solving your maze, you are almost always going to want to run it in gdb or ddd. Just start it in gdb, set breakpoints, run (with or without the soln.txt input file), and step through its execution:
gdb ./maze
(gdb) break main
(gdb) run # run maze
# OR
(gdb) run soln.txt # run maze with your soln.txt input file
You can also just run your maze program from the command line.
./maze
./maze soln.txt
If you are afraid your maze is about to trip up, just enter CTRL-C to kill it.
To make reporting work correctly, your maze will only run on one of the CS lab machines. The maze program and the browser connecting to the scoreboard must both be running on a CS machine. If you want to work from home, you should ssh into one of the lab machines and run your maze (and possibly the browser too, if you want to check the scoreboard) on a CS machine. See the remote access CS help page for how to do this.
6. Lab Details
The binary maze is a program that consists of a sequence of assembly language puzzles, one corresponding to each floor of Parrish that you need to pass through to get out the door. Each puzzle expects you to type a particular string when prompted. If you type the correct string, then the puzzle is solved and the maze program proceeds to the next floor. Otherwise, the maze program issues a trip-up message and terminates.
You will submit your Lab 5 solution in two parts:
- Part 1: The Checkpoint: getting past the first two floors of your maze.
- Part 2: The Complete Solution: getting past the first four floors of the maze and out the door, and submitting your write-up of how you solved the puzzle on each floor.
- Extra credit: solving floor 5. To receive extra credit for floor 5, you must also include a description of how you solved it in your write-up.
The maze program is solved when every puzzle on every floor has been solved. You will be penalized for every trip-up that you let fully happen (1/4 of a point for each one), so you need to be careful not to trip up too many times. You can receive up to 71 points out of 66 total for this lab (5 points are extra credit):
- Solving each of the first 4 puzzles is worth 10 points (40 points total).
- Solving the 5th puzzle is worth 5 points (this is not required, and you must include in your write-up a description of how you solved it to receive all 5 points).
- Your write-up of how you solved each floor is worth 4 points per floor (16 points total). Your write-up should be in the file maze_how_we_solved_it in your lab repo directory.
- The checkpoint is worth 10 points.
Additionally, up to 5 points will be taken off in total for trip-ups. You will lose a point for each 4th trip-up (number 4, 8, 12, …), so you get a few for "free". I will not take off more than 5 points total for trip-ups, unless it is clear that you are trying a brute force approach.
Note: All labs in the course have equal weight when determining your overall lab score. This lab is not worth 10x the previous ones!
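The trip-up deduction described above amounts to integer division with a cap. Here is a minimal sketch in C, assuming the rule is exactly "one point per 4th trip-up, at most 5 points". The function name is made up for illustration, and the brute-force exception is not modeled:

```c
/* Hypothetical helper (not part of the lab code): points deducted for a
 * given number of trip-ups, assuming one point is lost at trip-up number
 * 4, 8, 12, ... and the total deduction is capped at 5 points. */
unsigned tripup_penalty(unsigned trip_ups)
{
    unsigned penalty = trip_ups / 4;   /* integer division: 0-3 trip-ups cost nothing */
    return penalty > 5 ? 5 : penalty;  /* never more than 5 points */
}
```

Under these assumptions, 3 trip-ups cost 0 points, 11 trip-ups cost 2 points, and 20 or more trip-ups cost the full 5 points.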
6.1. Solving Your Maze
- You must run your maze on one of the CS lab machines; the maze will always trip up if run elsewhere. There are several other tamper-proofing devices built into the maze binary as well. In particular, using the gdb set command while trying to solve your maze will cause a trip-up.
To kill your maze executable (to make it exit without tripping up), type CTRL-C. This way you can run your maze, solve a puzzle on a floor, and then exit and come back later to try the puzzle on the next floor.
- You can use many tools to help you solve your maze. Look at the hints section below for some tips and ideas. The best way is to use ddd or gdb to step through the execution of the disassembled binary.
- Although the puzzles on each floor get progressively harder to solve, the expertise you gain as you move from floor to floor should offset this difficulty.
- Once you have solved the puzzle on a floor, I encourage you to run your maze with a soln.txt file containing the input phrases for the floors you have solved. The format of the file should be one phrase per line, in order of the maze floors. Using an input file will help to prevent you from accidentally tripping up in the maze on a previously solved floor. For example:
./maze soln.txt
reads the input lines from soln.txt until it reaches EOF (end of file), and then switches over to stdin for the remaining input. This feature is also nice so you don’t have to keep retyping the solutions to floors you have already solved. The maze ignores blank input lines, both in the file and on stdin.
- To avoid accidentally tripping up in the maze, you will need to learn how to single-step through the assembly code and how to set breakpoints. You will also need to learn how to inspect both the registers and the memory states. One of the nice side-effects of doing the lab is that you will get very good at using a debugger. This is a crucial skill that will pay big dividends the rest of your career!
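The soln.txt behavior described above (read phrases from the file until EOF, then continue on stdin, skipping blank lines) can be pictured with a short C sketch. This is only an illustration of the described behavior, not the maze’s actual code. The function name and its parameters are hypothetical, and it assumes a "blank" line means an empty line:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: read the next non-blank line into buf.
 * Lines come from *src first; when *src hits EOF, reading continues from
 * fallback (stdin, in the maze's case). Returns 1 on success and 0 when
 * no input is left. Lines longer than the buffer are split. */
int next_phrase(FILE **src, FILE *fallback, char *buf, size_t size)
{
    for (;;) {
        if (fgets(buf, (int)size, *src) == NULL) {
            if (*src == fallback)
                return 0;              /* fallback is exhausted too */
            *src = fallback;           /* file hit EOF: switch over */
            continue;
        }
        buf[strcspn(buf, "\r\n")] = '\0';  /* strip the line ending */
        if (buf[0] != '\0')                /* ignore blank lines */
            return 1;
    }
}
```

A caller would open soln.txt with fopen, pass its address as src and stdin as fallback, and call next_phrase once per floor.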
7. Lab Requirements
- You must solve your maze by examining it at the assembly code level using tools like gdb, ddd, strings, objdump, and other similar tools for examining binary files.
- For the checkpoint you need to get past floors 1 and 2, and you need to edit the maze_ID file and push it to your repo.
- For the complete solution you need to get past floors 1-4, complete and submit the write-up of how you solved each floor, and submit the solution to each floor.
Write-up Requirements (for the Complete solution only)
- Edit the maze_how_we_solved_it file in vim to include a short explanation of how you solved each floor and a short explanation of what each floor is doing.
- Describe at a high level what the original C code is doing for each floor. For example, is it doing some type of numeric computation, string processing, function calls, etc.? Also describe the specific computation it is doing (e.g., what type of string processing, and how is that being used?).
- Don’t describe this part in terms of registers and assembly code; describe what the puzzle on each floor is doing at a higher level in terms of C semantics. You do not need to reverse engineer the IA32 code and translate every part of it to equivalent C code. Instead, give a rough idea of equivalent C or pseudo code for the main part of the puzzle on each floor.
For example, something like "uses an if-else to choose to do X or Y based on the input value Z" is an appropriate level of explanation. Something like "moves the value at %ebp-8 into register %eax" is way too low-level.
- The lab write-up should not be onerous; you should be able to explain each puzzle in a short paragraph or two (maybe with a few lines of C or pseudo code to help explain). I recommend doing the write-ups for each floor as you solve them.
- Excessively verbose, low-level descriptions will be penalized, as will vague descriptions; you want to clearly demonstrate that you figured out what that floor is doing by examining the IA32 code for each floor in your maze executable.
- If you are unable to solve a floor, you can still receive partial credit for it in the write-up part by telling me what you have determined about that floor.
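To give a feel for the "few lines of C or pseudo code" level of detail, here is a sketch of a purely hypothetical floor. It is invented for illustration and does not correspond to any real maze; the function name and the computation are assumptions:

```c
#include <stdio.h>

/* Hypothetical floor, invented for illustration only. Returns 1 if the
 * input phrase passes and 0 if it would cause a trip-up. */
int floor_example(const char *input)
{
    int x, y;

    /* The phrase must contain two integers. */
    if (sscanf(input, "%d %d", &x, &y) != 2)
        return 0;

    /* An if-else picks the expected second value based on the first. */
    int expected;
    if (x % 2 == 0)
        expected = x / 2;    /* even: half of x */
    else
        expected = x - 1;    /* odd: one less than x */

    return y == expected;
}
```

A matching write-up sentence might read: "This floor reads two integers; if the first is even, the second must be half of it, otherwise it must be one less than the first."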
8. Tips
There are many ways of solving your maze. There are various tools for examining the program binary without running the maze program. These may provide some helpful information for solving some floors. The most useful tool will be gdb (and ddd), which will allow you to run the maze program, set breakpoints, step through the execution of its instructions, and examine its execution state. This will help you to discover information about what the program does, and you can use this information to solve your maze.
Remember that the maze program must be run on a CS machine. See Section 5.5 for information about how to run your maze, and how to run it remotely.
8.1. How not to solve the maze lab
Do not try to use brute force! You could write a program that tries every possible input string to find the right one. But this is no good for several reasons:
- You lose 1/4 point (up to a max of 5 points) every time you guess incorrectly and the maze trips up. Every 4th trip-up costs 1 point (3 trip-ups cost 0 points).
- Every time you guess wrong, a message is sent to the mazelab server. You could very quickly saturate the network with these messages, and cause the system administrators to revoke your computer access.
- We haven’t told you how long the input strings are, nor have we told you what characters are in them. Even if you made the (incorrect) assumptions that they are all less than 80 characters long and contain only letters, you would have 26^80 guesses for each floor. This would take a very, very long time to run, and you would not get the answer before the assignment is due.
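To put the 26^80 figure in perspective, here is a rough back-of-the-envelope sketch in C. The function names are hypothetical, and the guessing rate used in the note below is an assumption chosen to be wildly optimistic:

```c
/* Number of candidate strings of exactly `len` characters over an alphabet
 * of `alphabet` symbols. A double is used because the exact value is far
 * too large for any integer type; the result is approximate. */
double count_candidates(unsigned alphabet, unsigned len)
{
    double n = 1.0;
    for (unsigned i = 0; i < len; i++)
        n *= alphabet;
    return n;
}

/* Years needed to try n candidates at `rate` guesses per second. */
double years_to_try(double n, double rate)
{
    const double seconds_per_year = 365.0 * 24.0 * 60.0 * 60.0;
    return n / rate / seconds_per_year;
}
```

count_candidates(26, 80) is about 1.6 x 10^113. Even at an assumed one billion guesses per second, years_to_try puts the search on the order of 10^96 years.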
8.2. How to solve the maze lab!
There are many tools that are designed to help you figure out both how programs work, and what is wrong when they don’t work. Here is a list of some of the tools you may find useful in analyzing your maze, and hints on how to use them. Also refer to the Week 6 weekly lab page for more information on using gdb and tools for examining binaries:
- ddd (or gdb) maze: The GNU debugger will be your most useful tool. You can trace through a program line by line, examine memory and registers, look at both the source code and assembly code (we are not giving you the source code for most of your maze), set breakpoints, and set memory watchpoints. ddd is likely a more helpful interface for this lab due to its support for multiple windows in which you can simultaneously see register values and assembly code as you enter gdb commands.
- Draw the stack and register contents as you are tracing through code in gdb, and take notes as you go (this will also help you with the write-up part of the lab assignment).
- strings maze: display the printable strings in your maze.
- objdump: objdump may provide some helpful information, but ddd and gdb will be your most useful tools.
  - objdump -t maze: prints out the maze’s symbol table. The symbol table includes the names of all functions and global variables in the maze, the names of all the functions the maze calls, and their addresses. You may learn something by looking at the function names.
  - objdump -d maze: disassembles all of the code in the maze. You can also just look at individual functions. Although objdump -d gives you a lot of information, it doesn’t tell you the whole story. Calls to system-level functions are displayed in a cryptic form. For example, a call to sscanf might appear as:
8048c36: e8 99 fc ff ff call 80488d4 <_init+0x1a0>
To determine that the call was to sscanf, you need to disassemble within a running maze program using gdb.
- Looking for documentation about a particular tool? The man command will help you find documentation about unix utilities, and in gdb the help command will explain gdb commands:
$ man objdump
(gdb) help ni
Here is some more information about man
8.3. Notes on odd instructions or code sequences
You may find some code in your maze lab that uses instructions that we have not talked about in class. Here are some notes about some of these:
- Conditional Move (cmov). Conditional move instructions perform a move only if the specified condition is true (the move is performed based on the condition code settings, similar to how conditional jumps test condition codes and jump or not based on their values). For example, the cmovne instruction tests the condition not equal (ne) and performs the mov only if the condition is true (ne is true when the zero flag is 0).
Stack canary code. You may see some odd code sequences around function return code that includes an instruction that looks like this:
xor %gx:0x14,%eax
. For example:0x56556a53 <+100>: xor %gs:0x14,%eax 0x56556a5a <+107>: jne 0x56556a61 <func+114> < function return instructions here> 0x56556a61 <+114>: call 0xf7eb1ee0 <__stack_chk_fail>
You can ignore this sequence of instructions in figuring out the solution to a maze floor. You should, however, look at the
<function return instructions here>
parts of the code that are surrounded by this sequence of instructions.This is stack canary testing code that the compiler generates to check that the stack is not corrupted by the function’s execution (called a buffer overflow error). The code checks something called a stack canary that is used to detect buffer overflow. If the compared value doesn’t match the stack canary value then the stack has been corrupted and the
__stack_check_fail
function is called, likely terminating the program with a stack memory error. Here is some more information about what a stack canary is: stack canaries -
Instructions that reference parts of registers. You may see code sequences that reference the lower half or lower quarter of a register. For example:
mov %al,-0xa1(%ebp)
stores a 1 byte value from the lower byte in register
eax
to the specified address. Instructions with sub-registers are generated when the C code manipulates values that are smaller than 4-bytes, such as short or char values. Specific sub-bytes of the 4-byte registereax
can be specified in instructions as registers:ah
is the byte in bits 15-8 of registereax
;al
is the byte in bits 7-0 of registereax
; andax
is the 2 byte value in bits 15-0 of registereax
. See the lecture notes, and "Advanced Register Notation" section of Chapter 6.1 of the textbook for more information about the names of these portions of the general purpose registers.
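The sub-register names and the cmovne condition described in the notes above can be mirrored in plain C. The sketch below is only a model: a uint32_t stands in for %eax, the function names are made up for illustration, and a compiler may or may not actually emit cmov for the last function:

```c
#include <stdint.h>

/* Model %eax as a 32-bit value and pick out the named pieces with
 * shifts and masks. */
uint8_t  low_byte_al(uint32_t eax)  { return (uint8_t)(eax & 0xFF); }         /* bits 7-0  */
uint8_t  high_byte_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFF); }  /* bits 15-8 */
uint16_t low_word_ax(uint32_t eax)  { return (uint16_t)(eax & 0xFFFF); }      /* bits 15-0 */

/* What a cmp followed by cmovne computes: the destination takes the
 * source value only when the compared operands were not equal (zero flag
 * is 0); otherwise the destination keeps its old value. */
int select_if_not_equal(int a, int b, int dest, int src)
{
    return (a != b) ? src : dest;
}
```

For example, with eax holding 0x12345678, al is 0x78, ah is 0x56, and ax is 0x5678.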
9. Submitting
Whether or not you have solved a maze floor, and how many times you have tripped up in your maze, is automatically submitted to the maze server by your maze program, but you still need to submit a few things via GitHub:
- For the checkpoint, in addition to solving the first two floors, you need to fill in the maze_ID file with your maze number and the names of you and your partner.
- For the complete solution, you should additionally submit soln.txt, which contains the inputs you used to solve each floor, and maze_how_we_solved_it, a file that describes how you solved each floor, as detailed above. Note: Floor 5 is not required. If you solve it, you will receive extra credit, but you must also include a description of how you solved it in your write-up.
To submit your code, commit your changes locally using git add and git commit. Then run git push while in your lab directory.
Only one partner needs to run the final git push, but make sure both partners have pulled and merged each other’s changes.
Here are the commands to submit your complete solution (from one of you or your partner’s cs31/labs/Lab05-user1-user2 directory):
$ git add maze_how_we_solved_it soln.txt maze_ID
$ git commit -m "Lab 5 completed"
$ git push
If you have difficulty pushing your changes, see the "Troubleshooting" and "can’t push" sections at the end of the Using Git for CS31 Labs page. For more information and help with using git, see the git help page.