Lab Due Date: Tuesday, September 6, 11:59 PM
Handy References
Lab 0 Goals
- Use git to clone a repository full of starter code.
- Practice writing C programs and refresh your understanding of the memory layout of a process.
- Use GDB to identify and access functions, local variables, and function arguments in the stack.
- Use GDB to map function calling conventions and x86 register values.
Overview
This lab is meant to help you explore the basics of reverse engineering and to set up for more advanced buffer overflow attacks in the following labs. This week we will work through a C program, compile it to assembly, and learn the basics of stack layout.
Recall from CS 31 that the stack data structure is organized into units called frames.
- Each stack frame maintains the invariant that %esp, the stack pointer, points to the top of the stack and %ebp, the base (or frame) pointer, points to the bottom of the current frame.
- Within each stack frame, we maintain state about the function, including its local variables, the previous stack frame's base address (the caller’s frame pointer), the address of the instruction to return to in the caller, and the function arguments.
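The frame contents listed above can be sketched as offsets from %ebp. This is a simplified model of the usual 32-bit x86 (cdecl) layout, assuming 4-byte arguments and locals and no compiler padding or alignment; the function name `frame_layout` is made up for illustration, and the layout you observe in gdb may differ.

```python
# Simplified model of a 32-bit x86 (cdecl) stack frame as offsets from %ebp.
# Assumptions: every argument and local is 4 bytes, and the compiler adds
# no padding. Real frames seen in gdb may contain extra alignment space.
WORD = 4  # bytes per 32-bit word


def frame_layout(num_args, num_locals):
    """Return {offset_from_ebp: description}, highest address first."""
    layout = {}
    # Arguments sit above the return address, first argument lowest.
    for i in reversed(range(num_args)):
        layout[2 * WORD + i * WORD] = f"argument {i + 1}"
    layout[WORD] = "return address (saved eip)"
    layout[0] = "caller's saved ebp"
    # Locals sit below %ebp, growing toward lower addresses.
    for i in range(num_locals):
        layout[-(i + 1) * WORD] = f"local variable {i + 1}"
    return layout


layout = frame_layout(num_args=1, num_locals=1)
for offset, what in layout.items():
    print(f"ebp{offset:+d}: {what}")
```

Comparing this picture with the output of info frame and x/20w $esp is one way to check where the model and the real binary disagree.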
In this lab we will use our understanding of the stack's memory layout to explore how we might run simple stack buffer overflows.
Lab Requirements
You will be required to:
- Understand the memory layout for fib.c and use gdb to answer the questions in lab0-worksheet.adoc.
- Guess the secret code in main.c by deploying a basic buffer overflow attack.
- Describe how and why your buffer overflow attack works in lab0-worksheet.adoc.
Getting your Lab0 Starting Point Code
Log into CS88 Github for our class and get the ssh-URL to your lab git repository. Follow along with the prompts below to SSH, create a lab directory, and clone your lab repo. For a refresher on getting set up with git, take a look at Git Setup.
# ssh into our lab machines
ssh yourusername@lab.cs.swarthmore.edu

# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd cs88
mkdir labs
cd labs

# clone your lab0 repo into your labs sub-directory
git clone [your-ssh-URL]

# change directory to list its contents
cd lab0-you

# ls should list the following contents
ls
Makefile README.md lab0-worksheet.adoc fib.c secret main.c
If you have not yet been assigned a github account, please follow the instructions below to access lab0 code and to get started with the lab.

# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd cs88
mkdir labs
cd labs
mkdir lab0-you
cd lab0-you

# copy the starter code into lab0
# USAGE: cp <source> <destination>
$ cp ~chaganti/public/cs88/lab0/* ./   # <-- the dot means "here" (the current directory)
$ ls
Makefile README.md lab0-worksheet.adoc main.c secret
Lab-0 Functionality
- To get started with the lab, run make. You should now see a compiled binary fib along with secret. Run the following command to set executable permissions on secret: chmod +x secret. A successful attack on secret is shown below:
$ ./secret < attack
Enter secret number:
You are so wrong!
You win!
As you work through this lab, it will be helpful to have the source code open in the editor of your choice and a separate terminal window to run gdb.
- Your first task is to use gdb to walk through fib and use the references provided on this lab page to answer the questions in lab0-worksheet.adoc. Here are some basic gdb commands to get you started:

gdb fib                 # runs the gdb debugger on fib
break main              # sets a breakpoint in main
break 4                 # sets a breakpoint on line 4
run                     # runs the code and stops at the first breakpoint
info break              # displays a list of breakpoints
Example:
Num Type       Disp Enb Address    What
1   breakpoint keep y   0x0000120e in main at fib.c:17
2   breakpoint keep y   0x000011ae in f at fib.c:4
disas main              # shows the disassembly for main
info reg                # shows the registers and their values
info reg esp ebp eip    # shows only the named registers
Example:
esp 0xffffd810 0xffffd810
ebp 0xffffd818 0xffffd818
eip 0x565561ae 0x565561ae <f+17>
info frame              # shows information about the current frame
help info frame         # provides the "man" page equivalent for the command
- You can also run a single gdb command in a separate terminal window to view the output of disas. Open a new terminal window and run one of the following commands.

# command to view assembly from functions main and f in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main" -ex "disassemble f"

# command to view assembly from function main in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main"

# command to view assembly from function main in file secret
gdb -batch -ex 'file secret' -ex "disassemble main"
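If you find yourself retyping these batch invocations, they can be assembled programmatically. The sketch below only builds the argument list for the same gdb -batch form shown above; the helper name `gdb_disassemble_command` is hypothetical, and nothing here runs gdb.

```python
import shlex


def gdb_disassemble_command(binary, functions):
    """Build the argv for a one-shot gdb run that disassembles each function."""
    argv = ["gdb", "-batch", "-ex", f"file {binary}"]
    for name in functions:
        argv += ["-ex", f"disassemble {name}"]
    return argv


cmd = gdb_disassemble_command("fib", ["main", "f"])
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

On a machine with gdb installed, the list could be passed to subprocess.run(cmd) to capture the disassembly from a script.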
There are many more examples in the readings above; walk through those readings to answer the questions in the worksheet.
Simple Buffer Overflow
Once you are familiar with the workings of fib.c, you should be ready to move on to secret.
For the second part of the lab, you are provided with a compiled binary, secret, that is called by main. Your job is to get to endGame() without losing. To do so, you must either provide input that correctly guesses the secret string and the correct secret value or, as a savvy reverse-engineering hack, bypass all of the checks and force the program to execute endGame.
To see how secret works, let’s run it with some arbitrary user input:
$ ./secret
Enter secret number:
1
You are so wrong!
Unfortunately, we don’t have the source code for secret (that would make this task trivial). But we do have main, which calls functions defined in secret. Looking at main, we see that it takes as input a char array buf that is 12 characters long and performs some operations on this buffer before calling endGame.
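Because buf is a fixed 12-byte array on the stack, it helps to picture what may sit next to it. The sketch below is a simplified Python model, not the real layout of secret: it assumes the saved ebp and the return address sit directly above the buffer with no padding, and the helper names and both addresses are hypothetical.

```python
import struct

BUF_SIZE = 12  # size of buf in main, per the lab description


def make_frame(saved_ebp, return_addr):
    """Model a frame as: buf (12 bytes) | saved ebp (4) | return address (4)."""
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, return_addr)


def unbounded_copy(frame, data):
    """Copy input starting at buf[0] without ever checking BUF_SIZE.

    The model stops at the end of the frame; a real copy would keep going.
    """
    for i, byte in enumerate(data[:len(frame)]):
        frame[i] = byte


def return_address(frame):
    """Read back the 4-byte little-endian return address slot."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


# Hypothetical values, chosen only to make the overwrite visible.
frame = make_frame(saved_ebp=0xFFFFD818, return_addr=0x08048600)
unbounded_copy(frame, b"1" * 20)
print(hex(return_address(frame)))  # the slot now holds ASCII '1' bytes (0x31)
```

In this model, an input of 12 bytes or fewer leaves the return address untouched, while anything longer starts to overwrite the saved ebp and then the return address.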
- Naive Approach: Your first task is to use gdb to figure out what the functions getSecretCode and calculateValue do. Use gdb to walk through each function and work out, at a high level, what it does.
- Security Mindset: In this task, you decide to be smarter and look for loopholes or vulnerabilities in the code that you can exploit to get past the checks, reach endGame, and succeed.
  - Notice that one of the first instructions in main is the call to scanf. If you type man scanf in the terminal, you will notice that scanf places no limit on the length of the string it reads unless the format string specifies a maximum field width!
  - You can now leverage this: enter a secret number that far exceeds the length of buf and see the results. If you see a segmentation fault, you are on the right track! You’ve effectively overflowed the buffer into neighboring regions of memory (in this case, the stack memory located at higher addresses) and corrupted the state of the frame, causing the program to SEGFAULT.

$ ./secret
Enter secret number:
1233943249324320948234091238401923874129348710329587495
You are so wrong!
Segmentation fault (core dumped)
  - Since our goal is not simply to corrupt the program but to manipulate it into letting us win, we can be smarter about what we put in our input. We can overflow the stack so that, when we reach the saved eip (the return address stored on the stack), we overwrite it with the address of endGame.
  - To accomplish this, let’s first run secret in gdb with twelve 1s as input. To automate the process of creating inputs, we can use a python command: python -c 'print("1"*12)' > allOnes.
  - In gdb, you can redirect this input when you run secret as follows:

$ gdb secret
(gdb) break main
Breakpoint 1 at 0x8048575
(gdb) run < allOnes
Breakpoint 1, 0x08048575 in main ()
(gdb)

You can now put a breakpoint right after the call to scanf to confirm that our input of all 1s is in memory. Hint: Try using the examine command x/20w $esp to view 20 words at the top of the stack.
  - Once you figure out the starting address of the array buf, you can calculate the distance from this memory address to the location of the saved eip on the stack. If we manage to overwrite the saved eip so that it points to endGame, then we have successfully launched a buffer overflow attack!
  - In order to add a memory address to our secret number and change the saved eip, we need to account for the fact that x86 CPUs store multi-byte values such as memory addresses in little-endian format, i.e., from the least-significant byte (the "little end") to the most-significant byte at consecutive, increasing addresses.
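  - Therefore, if we wanted to store a memory address of 0xabcdefgh in our buffer following a string of 1s, we would use the following command: python -c 'print "1"*12 + "\xgh\xef\xcd\xab"' > attack. Here gh, ef, cd, and ab are placeholders for the hex values of the address's bytes. Note that this one-liner uses Python 2 print syntax; under Python 3, print would encode bytes above 0x7f as UTF-8, so the raw bytes would need to be written with sys.stdout.buffer.write instead.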
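The last few steps (measure the distance from buf to the saved return address, then append the target address in little-endian order) can be sketched in Python 3. This is an illustration under stated assumptions, not the solution for secret: every address below is hypothetical and must be replaced with values you find in gdb, the helper name `build_payload` is made up, and the code assumes 32-bit addresses.

```python
import struct


def build_payload(buf_addr, ret_slot_addr, target_addr, filler=b"1"):
    """Return filler bytes up to the saved return address, then the target.

    buf_addr:      address of buf[0] on the stack (found with gdb)
    ret_slot_addr: address of the saved return address on the stack
    target_addr:   address the function should return to
    """
    offset = ret_slot_addr - buf_addr
    if offset < 0:
        raise ValueError("the return-address slot must be above buf")
    # "<I" packs a 32-bit unsigned integer least-significant byte first.
    return filler * offset + struct.pack("<I", target_addr)


# Hypothetical addresses for illustration only.
payload = build_payload(
    buf_addr=0xFFFFD7FC,
    ret_slot_addr=0xFFFFD80C,
    target_addr=0x080484CB,
)
print(payload)
```

To use such a payload, the bytes could be written to a file opened in binary mode, for example open("attack", "wb"), followed by a newline. If the program reads its input with a %s-style conversion, reading stops at the first whitespace byte, so an address containing a space, tab, or newline byte would be cut short.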
Miscellaneous hints
Good systems programming and reverse engineering involves:
1. use gdb to incrementally walk through your code and provide input
2. use a piece of paper to draw out the stack
3. locate the addresses on the stack
4. repeat step 1
- To rerun the code in gdb, you can simply call run again and ask gdb to start from the beginning, rather than quitting gdb. This way, your breakpoints are preserved.
- You can also change the values of variables during execution in gdb using (gdb) set {int}0xfffff = 3 (here 0xfffff stands in for the address of the variable).
Grading Rubric
Total: 30 points
- 10 points for successfully causing a buffer overflow in secret.
- 20 points for completing the worksheet.
Submitting
Please remove any debugging output prior to submitting.
To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory.