CS31: Lab 5

Handy References

The IA32 cheat sheet.
GDB guide
The weekly lab pages for week 4 and week 6
Figure 3.30 (page 255) in the supplemental textbook is a nice summary of some useful GDB commands.

Lab 5 Goals:

Gain experience reading and tracing through the execution of IA32 assembly instructions.
Enhance your understanding of how IA32 translates to C instructions, data structure access, and function calls.
Practice with tools for examining binary files.
Put your GDB skills to work. We've been blabbering about it in labs for long enough!

Lab Description

You've been sleep walking again, and you just woke up on the roof of Parrish Hall. You need to find your way through its maze of floors and out of the building in time for your first class. The problem is that due to construction, there is no stairwell that connects more than two floors. As a result, you need to travel along each floor to find the next open stairwell down to the next floor below. However, there are people or things along your path that can trip you up and impede your progress, forcing you to run back to the roof to try again.

In this assignment, you and your partner are going to receive a binary maze program. Your maze has 5 phases, one for each floor of Parrish Hall (from the roof to out the door). Each floor's phase is a binary puzzle that needs to be solved to move on to the next floor. To solve a puzzle you need to enter a correct phrase on stdin (you can also have your maze read phrases from a file given as a command line argument).

Your goal is to solve all phases/floors of your maze, limiting the number of times you trip-up along the way and have to start all over.

Your maze will automatically notify me every time you trip-up and whenever you have solved a puzzle on a particular floor. Note: To make this reporting work correctly, your maze will only run on one of the CS lab machines.

Introduction

The binary maze is a program that consists of a sequence of assembly language puzzles, one corresponding to each floor of Parrish that you need to pass through to get out the door. Each puzzle expects you to type a particular string when prompted. If you type the correct string, then the puzzle is solved and the maze program proceeds to the next floor. Otherwise, the maze program issues a trip-up message and terminates. The maze program is solved when every puzzle on every floor has been solved. You will be penalized for every trip-up that you let fully happen (1/4 a point for each one), so you need to be careful not to trip-up too many times.

You can receive up to 71 points out of 66 total for this lab (5 points are extra credit):

Solving the first 4 puzzles are each worth 10 points (40 pts total).
Solving the 5th puzzle is worth 5 points (this is not required, and you must include in your write-up a description of how you solved it to receive all 5 points).
Your write-up of how you solved each floor is worth 4 points for each floor (16 points total). Your write-up should be in the file maze_how_we_solved_it in your cs31/labs/05 subdirectory.
The checkpoint is worth 10 points.

Additionally, I will take up to 5 points off total for trip-ups. You will lose a point for each 4th trip-up (number 4, 8, 12, ...), so you get a few for "free". I will not take off more than 5 points total for trip-ups, unless it is clear that you are trying a brute force approach.

(Note: All labs in the course have equal weight when determining your overall lab score. This lab is not worth 10x the previous ones!).

Solving Your Maze

You must run your maze on one of the CS lab machines; the maze will always trip-up if run elsewhere. There are several other tamper-proofing devices built into the maze binary as well. In particular, using the gdb set command while trying to solve your maze will cause a trip-up.

To kill your maze executable (to make it exit without tripping-up), type CNTRL-C. This way you can run your maze, solve a puzzle on a floor, and then exit and come back later to try the puzzle on the next floor.

You can use many tools to help you solve your maze. Look at the hints section below for some tips and ideas. The best way is to use ddd or gdb to step through the disassembled binary.

Although the puzzles on each floor get progressively harder to solve, the expertise you gain as you move from floor to floor should offset this difficulty.

Once you have solved the puzzle on a floor, I encourage you to run your maze with a soln.txt file containing the input phrases for the floors you have solved. The format of the file should be one phrase per line, in order of the maze floors. Using an input file will help to prevent you from accidentally tripping up in the maze on a previously solved floor. For example:

./maze soln.txt

will read the input lines from soln.txt until it reaches EOF (end of file), and then switch over to stdin for the remaining input. This feature is also nice so you don't have to keep retyping the solutions to floors you have already solved. The maze ignores blank input lines, both in the file and on stdin.

To avoid accidentally tripping up in the maze, you will need to learn how to single-step through the assembly code and how to set breakpoints. You will also need to learn how to inspect both the registers and the memory states. One of the nice side-effects of doing the lab is that you will get very good at using a debugger. This is a crucial skill that will pay big dividends the rest of your career.

Getting your maze: we will do this together in lab.

First, both you and your partner should clone your Lab 5 git repository. In it, you should find a few files: README, maze_ID, maze_how_we_solved_it, and soln.txt. The README contains details of what you're expected to fill in each of these files.

Next, only one of you will follow these steps to get your maze. Each time you register on the maze server, you will receive a unique maze with unique solutions. For this reason, you should only perform the following steps once!

In firefox on a CS lab machine, one of you or your partner should enter this url:
```
http://sesame.cs.swarthmore.edu:8000/
```

Enter your CS user name and choose Submit.

Choose to save the mazeX.tar file in the dialog box that pops up. Save this .tar file in your Lab 5 repository. Add and commit the file via git add and git commit and then push it to GitHub. If you're not given a choice of where to save the file, it's probably in /home/[username]/downloads. Move it to your Lab 5 repo with a mv command and then add/commit/push it.

Both you (immediately) and your partner (after running git pull) can untar the mazeX.tar file:
```
cd ~/cs31/labs/Lab5-...
ls     # you should see your mazeX.tar file
tar xvf mazeX.tar
```

If you cd into your mazeX subdirectory, you will see 3 files:
```
README  maze*  maze.c
```
You do NOT need to check these files in to your git repository.

Do not run the maze program yet!

Checking the status of your maze

From firefox running on a CS lab machine, enter this url (you cannot connect to this url from outside our network):

http://sesame.cs.swarthmore.edu:8000/scoreboard

You will see your maze in this list. The scoreboard displays how many stages you have solved, how many total trip-ups you have had, and your score so far. Note that your maze will not appear on the list until you either solve a floor or trip up at least once.

Re-load the page to see the latest information.

Write-up Requirements (Complete solution only)

Edit the "maze_how_we_solved_it" in your favorite text editor to include a short explanation of how you solved each floor and a short explanation of what the floor is doing.

Describe at a high-level what the original C code is doing for each floor. For example, is it doing some type of numeric computation, string processing, function calls, etc. and describe the specific computation it is doing (i.e. what type of string processing and how is that being used?).

Don't list registers and assembly code for this part, but describe what the puzzle on each floor is doing at a higher-level in terms of C semantics. You do not need to reverse engineer the IA32 code and translate every part of it to equivalent C code. Instead, give a rough idea of equivalent C or pseudo code for the main part of the puzzle on each floor. For example, something like "uses an if-else to choose to do X or Y based on the input value Z" is an appropriate right level of explanation. Something like "moves the value at %ebp-8 into register %eax" is way too low-level.

This part of the lab should not be onerous; you should be able to explain each puzzle in a short paragraph or two (maybe with a few lines of C or pseudo code to help explain). I recommend doing the write-ups for each floor as you solve them.

Excessively verbose, low-level descriptions will be penalized, as will vague descriptions; you want to clearly demonstrate to me that you figured out what that floor is doing by examining the IA32 code for each floor in your maze executable.

If you are unable to solve a floor, you can still receive partial credit for it in the write-up part by telling me what you have determined about that floor.

Tips

There are many ways of solving your maze. You can examine it in great detail without ever running the program, and figure out exactly what it does. This is a useful technique, but it not always easy to do. You can also run it under a debugger, watch what it does step by step, and use this information to solve it. This is probably the fastest way of solving it.

We do make one request, please do not use brute force! You could write a program that will try every possible input string to find the right one. But this is no good for several reasons:

You lose 1/4 point (up to a max of 5 points) every time you guess incorrectly and the maze trips-up.

Every time you guess wrong, a message is sent to the mazelab server. You could very quickly saturate the network with these messages, and cause the system administrators to revoke your computer access.

We haven't told you how long the strings are, nor have we told you what characters are in them. Even if you made the (incorrect) assumptions that they all are less than 80 characters long and only contain letters, then you will have 26^80 guesses for each floor. This will take a very very long time to run, and you will not get the answer before the assignment is due.

There are many tools that are designed to help you figure out both how programs work, and what is wrong when they don't work. Here is a list of some of the tools you may find useful in analyzing your maze, and hints on how to use them. And refer to the week05 and week06 weekly lab pages for information on using gdb and tools for examining binaries:

ddd (or gdb) maze
The GNU debugger will be your most useful tool. You can trace through a program line by line, examine memory and registers, look at both the source code and assembly code (we are not giving you the source code for most of your maze), set breakpoints, and set memory watch points.
draw the stack and register contents as you are tracing through code in gdb, and take notes as you go (this will also help you with the write-up part of the lab assignment).

objdump -t maze
This will print out the maze's symbol table. The symbol table includes the names of all functions and global variables in the maze, the names of all the functions the maze calls, and their addresses. You may learn something by looking at the function names.
objdump -d maze
Use this to disassemble all of the code in the maze. You can also just look at individual functions. Reading the assembler code can tell you how the maze works.

Although objdump -d gives you a lot of information, it doesn't tell you the whole story. Calls to system-level functions are displayed in a cryptic form. For example, a call to sscanf might appear as:
```
8048c36:  e8 99 fc ff ff  call   80488d4 <_init+0x1a0> 
```
To determine that the call was to sscanf, you would need to disassemble within gdb.
strings maze
This utility will display the printable strings in your maze.

Looking for documentation about a particular tool? The man command will help you find documentation about unix utilities, and in gdb the help command will explain gdb commands:

$ man objdump

(gdb) help ni

Submitting

Whether or not you have solved a maze floor, and how many times you have tripped-up your maze is automatically submitted to the maze server by your maze program, but you still need to submit a few things via GitHub:

For the checkpoint: in addition to solving the first two floors, you need to fill in the maze_ID file with your maze number and the name of you and your partner.
For the complete solution, you should additionally submit soln.txt, which contains the inputs you used to solve each floor, and maze_how_we_solved_it, a file the describes how you solved each floor, as detailed above. Note: Floor 5 is not required. If you solve it, you will receive extra credit, but you must also include a description of how you solved it in your write-up.

CS 31 Lab 5: Binary Maze
(The great escape from Parrish Hall)

Checkpoint (floors 1 and 2): due 11:59 PM, Tuesday, October 23

Complete Solution (floors 1, 2, 3, 4, (5 extra credit)): due 11:59, Tuesday, October 30

Handy References

Lab 5 Goals:

Lab Description

Introduction

Solving Your Maze

Getting your maze: we will do this together in lab.

Checking the status of your maze

Write-up Requirements (Complete solution only)

Tips

Submitting

CS 31 Lab 5: Binary Maze (The great escape from Parrish Hall)

Checkpoint (floors 1 and 2): due 11:59 PM, Tuesday, October 23

Complete Solution (floors 1, 2, 3, 4, (5 extra credit)): due 11:59, Tuesday, October 30

Handy References

Lab 5 Goals:

Lab Description

Introduction

Solving Your Maze

Getting your maze: we will do this together in lab.

Checking the status of your maze

Write-up Requirements (Complete solution only)

Tips

Submitting

CS 31 Lab 5: Binary Maze
(The great escape from Parrish Hall)