1. Goals for this week:
- Learn tools for examining binary files (gdb and ddd in particular)
- Practice examining a binary program file to discover what it is doing
- Introduction to Lab 5.
2. Starting Point Code
Start by creating a week06 subdirectory in your cs31/inlab directory and copying over some files:
$ cd ~/cs31/inlab
$ mkdir week06
$ cd week06
$ pwd
/home/you/cs31/inlab/week06
$ cp ~richardw/public/cs31/week06/* ./
$ ls
Makefile mystery* README simplefuncs.c
Compile simplefuncs.c using the provided Makefile:
$ make
3. Tools for examining binary files
Some tools for examining binary files:
- strings dumps all the strings in a binary file:
$ strings simplefuncs
- objdump -t or nm to list the symbol table contents:
$ objdump -t simplefuncs # list symbol table in the executable file
$ nm --format sysv simplefuncs # list symbol table in the executable file
The symbol table includes the names of all functions and global variables in
the program. There is a lot of information in the symbol table that looks
odd, but you should be able to see an entry for the two functions main and func1, and see where their start addresses are in memory.
- gdb (and ddd): for debugging programs at the assembly code level and examining the state of CPU registers and memory as the program runs. These will be the most useful tools for the next lab assignment.
3.1. gdb (and ddd) for debugging at the assembly code level
With gdb you can debug and trace through a program execution at the assembly code level. This includes executing individual IA32 instructions, examining register values, and disassembling functions.
First, let’s open up simplefuncs.c in an editor. Then, let’s try some things out in gdb:
$ gdb simplefuncs
(gdb) break main
(gdb) break func1
(gdb) run
In gdb you can disassemble code using the disass command:
(gdb) disass
(gdb) disass func1
You can set a break point at a specific instruction:
(gdb) break *0x08049d9d # set breakpoint at specified address
Or you can break at a particular offset into a function:
(gdb) break *main+29 # set breakpoint at offset +29 in main
And you can step or next at the instruction level using ni or si (si steps into function calls, ni skips over them):
(gdb) ni # execute the next instruction then gdb gets control again
(gdb) ni
(gdb) ni
(gdb) ni
(gdb) ni
(gdb) disass
(gdb) cont # continue to next break point
Now that we are at the call to func1, let’s step into this function using si (we also have a breakpoint at this function; let’s see when it is hit):
(gdb) si # step into instructions in the called function (func1)
(gdb) disass
(gdb) ni
(gdb) where
(gdb) disass
(gdb) cont
The difference between si and ni shows up in what each does on a call instruction. si gives gdb control again at the first instruction of the called function. ni gives gdb control again at the instruction immediately after the call instruction (the instruction at the return address). In other words, si "steps into" the called function, whereas ni lets the called function code continue, and only after the function returns does gdb get control again.
You can print out the values of individual registers like this:
(gdb) print $eax
You can also view all register values:
(gdb) info registers
You can also use the display command to automatically display values each time a breakpoint is reached:
(gdb) display $eax
(gdb) display $edx
You can use the examine command (x) to display the contents of a memory location. The memory address operand to x can be specified as the name of the register storing the address value or as an absolute memory address value. Here are some examples:
(gdb) x $esp-0x8 # see what p and x display for the same value
(gdb) p $esp-0x8
(gdb) p *(int *)($ebp-0x8) # here is how to print value at memory location
(gdb) x $ebp-0x8 # or a much easier way using x
# here is an example of examining the contents at a memory location
# specifying the address in two different ways (the exact address
# value in the second depends on what $esp + 0x1c is; it can vary from run to run)
(gdb) x $esp + 0x1c
(gdb) x 0xffffd2fc
The examine command also takes formatting options to tell it how to interpret the memory at the address:
(gdb) x/wd $ebp-0x8 # examine memory at address ($ebp-8) as an int in decimal
# w: word size (32 bit on IA32) d: signed decimal
(gdb) x/wx $ebp-0x8 # examine memory at address ($ebp-8) as a 4-byte value in hex
(gdb) x/s $ebp-0x8 # examine memory at address ($ebp-8) as a string
Examine’s formatting is sticky: the last format specification is reused for subsequent calls until you explicitly specify a different one. This differs from print, which does not remember a format between calls and falls back to its default each time (int for a register such as $eax).
(gdb) x/wd $ebp-0x8 # examine memory at address ($ebp-8) as an int in decimal
(gdb) x $ebp-0xc # examine memory ($ebp-0xc) with /wd formatting (sticky formatting)
The sticky formatting also applies to the size of the value stored at the address (i.e., whether it is the address of a 1-byte, 2-byte, or 4-byte value). This "size stickiness" can result in some seemingly strange behavior when switching between formats, and fixing it sometimes requires specifying the size in the format options to x (e.g. x/wd in the example below).
(gdb) x/wd $ebp-0x8 # examine memory at address ($ebp-8) as a 4-byte int in decimal
(gdb) x/s $ebp-0x8 # examine memory at address ($ebp-8) as a string
(gdb) x/d $ebp-0x8 # examine memory at address ($ebp-8) as a 1-byte decimal
(gdb) x/wd $ebp-0x8 # examine memory at address ($ebp-8) as an int
# NEED to specify /wd to say interpret this as addr of a 4-byte
# word rather than to a 1-byte (x/s set it to 1 byte address)
Because of this behavior, we recommend that you always specify the byte-width (w for 4 bytes) when you specify int or hex formatting for an int or unsigned int value: x/wd or x/wx.
For more information about the x command see the IA32 debugging links in Section 5.
3.1.1. ddd
We are going to try debugging this in ddd instead of gdb, because ddd has a nicer interface for viewing assembly, registers, and stepping through program execution:
$ ddd simplefuncs
The gdb prompt is in the bottom window. There are also menu options and buttons for gdb commands, but typing commands at the gdb prompt is likely easier.
Choose View→Machine Code Window to view the IA32 assembly code.
You can view the register values as the program runs (choose Status→Registers to open the register window).
For more information see the IA32 debugging links in Section 5.
4. Try out some of these tools on a program binary
Run the mystery binary a few times and see what it is doing:
$ ./mystery
The program is asking you for input, but there is really not a lot of information provided to guess the right input, and this executable was not compiled with -g, so there is no C code information we can get from it when we run it in gdb.
Let’s examine the assembly code to see if we can figure out what to enter.
Let’s try running it in ddd and disassembling some code.
$ ddd ./mystery
(gdb) break main
(gdb) run
(gdb) disass
Let’s consider some questions about this program:
- what does main’s control flow look like?
- let’s add some breakpoints around function calls and in functions
- let’s examine some state around functions
- we can print out strings using x/s
(gdb) x/s base_addr_of_string
5. Handy References
- gdb for IA32 assembly debugging IA32 gdb debugging guide
- GDB for Assembly (from the gdb Guide). (assembly debugging and x command)
- Sections 3.2 and 3.5 of textbook (assembly debugging, print, display, info and x commands)
- Tools for examining phases of compiling and running C programs