We mentioned before that simply saying the program starts with the main() function is not quite true. Below we examine what happens to a typical dynamically linked program when it is loaded and run (statically linked programs go through a similar process, but without the dynamic linker stage).
Firstly, in response to an exec system call, the kernel allocates the structures for a new process and reads the specified ELF file from disk.
We mentioned that ELF has a program interpreter field, PT_INTERP, which can be set to 'interpret' the program. For dynamically linked applications that interpreter is the dynamic linker, namely ld.so, which allows some of the linking process to be done on the fly before the program starts.
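To make the PT_INTERP field concrete, the following is a minimal sketch that scans the program headers of an ELF file and copies out the interpreter path. It assumes a Linux system with glibc (for `<link.h>` and the `ElfW()` macro) and a file with the same word size as the program reading it; `find_interp` is a name invented for this example, not an existing API.

```c
#include <assert.h>
#include <link.h>     /* ElfW() and the ELF structure definitions (glibc) */
#include <stdio.h>
#include <string.h>

/*
 * Copy the PT_INTERP string of the ELF file at `path` into `buf`.
 * Returns 0 if found, 1 if the file has no PT_INTERP segment (for
 * example a statically linked binary), and -1 on error.
 * Assumption: the file has the same word size as this program.
 */
int find_interp(const char *path, char *buf, size_t len)
{
    ElfW(Ehdr) ehdr;
    ElfW(Phdr) phdr;
    FILE *f;
    int result = 1;

    assert(buf != NULL && len > 0);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* The ELF header sits at offset zero and starts with a magic number. */
    if (fread(&ehdr, sizeof(ehdr), 1, f) != 1 ||
        memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0) {
        fclose(f);
        return -1;
    }

    for (unsigned i = 0; i < ehdr.e_phnum; i++) {
        long off = (long)(ehdr.e_phoff + (unsigned long)i * ehdr.e_phentsize);

        if (fseek(f, off, SEEK_SET) != 0 ||
            fread(&phdr, sizeof(phdr), 1, f) != 1) {
            result = -1;
            break;
        }
        if (phdr.p_type != PT_INTERP)
            continue;

        /* The segment contents are the interpreter path itself. */
        size_t n = phdr.p_filesz < len ? phdr.p_filesz : len - 1;

        if (fseek(f, (long)phdr.p_offset, SEEK_SET) != 0 ||
            fread(buf, 1, n, f) != n) {
            result = -1;
            break;
        }
        buf[n] = '\0';
        result = 0;
        break;
    }

    fclose(f);
    return result;
}
```

Calling `find_interp("/proc/self/exe", buf, sizeof buf)` inspects the running program itself; the path it reports varies between architectures and distributions.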
In this case, the kernel also reads in the dynamic linker code and starts the program from the entry point address specified by it. We examine the role of the dynamic linker in depth in the next chapter, but suffice to say it does some setup, like loading any libraries required by the application (as specified in the dynamic section of the binary), and then starts execution of the program binary at its entry point address (i.e. the _start function).
The kernel needs to communicate some things to programs when they start up; namely the arguments to the program, the current environment variables and a special structure called the Auxiliary Vector or auxv (you can request that the dynamic linker show you some debugging output of the auxv by setting the environment variable LD_SHOW_AUXV=1).
The arguments and environment are fairly straightforward, and the various incarnations of the exec system call allow you to specify these for the program.
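As an illustration of specifying both the arguments and the environment, here is a small sketch built on fork() and execve(). It assumes a POSIX system with a shell at /bin/sh; `run_shell_with_env` is a helper name made up for this example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `command` with /bin/sh in a child process, giving the child exactly
 * the environment in `envp` (a NULL-terminated array of "NAME=value"
 * strings). Returns the child's exit status, or -1 on failure.
 */
int run_shell_with_env(const char *command, char *const envp[])
{
    /* The argument vector the new program will see as argv[]. */
    char *argv[] = { "sh", "-c", (char *)command, NULL };
    int status;
    pid_t pid;

    assert(command != NULL && envp != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        execve("/bin/sh", argv, envp);
        _exit(127);    /* only reached if execve() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with an environment containing only `GREETING=hello` (an arbitrary variable chosen for illustration), the command `test "$GREETING" = hello` exits with status 0, showing that the child received the environment it was handed rather than the parent's.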
The kernel communicates this by putting all the required information on the stack for the newly created program to pick up. Thus when the program starts it can use its stack pointer to find all the startup information required.
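The layout of this startup information can be sketched in C. On Linux the environment pointer array is followed directly by the auxiliary vector, so stepping past the NULL that ends the environment lands on the first auxv entry. The sketch below assumes Linux with glibc, and assumes the environment array is still the original one from the stack (calls such as setenv() may move it); `auxv_from_stack` is a name invented here.

```c
#include <assert.h>
#include <link.h>      /* ElfW(auxv_t) and the AT_* constants (glibc) */
#include <stddef.h>

/* The process environment; initially this points into the startup stack. */
extern char **environ;

/*
 * Look up one auxiliary vector entry by walking the startup stack.
 * `envp` must be the original environment array placed on the stack by
 * the kernel; the auxiliary vector begins right after its terminating
 * NULL. Returns the entry's value, or 0 if the type is not present.
 */
unsigned long auxv_from_stack(char **envp, unsigned long type)
{
    ElfW(auxv_t) *auxv;

    assert(envp != NULL);

    while (*envp != NULL)    /* skip the environment strings */
        envp++;
    envp++;                  /* step over the NULL terminator */

    for (auxv = (ElfW(auxv_t) *)envp; auxv->a_type != AT_NULL; auxv++)
        if (auxv->a_type == type)
            return auxv->a_un.a_val;

    return 0;
}
```

Calling `auxv_from_stack(environ, AT_PAGESZ)` early in a program should yield the system page size. Real programs would normally not walk the stack by hand like this; the sketch only shows where the information sits.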
The auxiliary vector is a special structure for passing information directly from the kernel to the newly running program. It contains system-specific information that may be required, such as the default size of a virtual memory page on the system, or hardware capabilities; that is, specific features that the kernel has identified in the underlying hardware that userspace programs can take advantage of.
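A program can query these values without touching the stack itself. The sketch below uses getauxval() from `<sys/auxv.h>`, which assumes glibc 2.16 or later on Linux; the two wrapper function names are invented for this example.

```c
#include <assert.h>
#include <sys/auxv.h>  /* getauxval(), AT_PAGESZ, AT_HWCAP */
#include <unistd.h>    /* sysconf() */

/* Page size as reported by the kernel in the auxiliary vector. */
unsigned long auxv_page_size(void)
{
    unsigned long size = getauxval(AT_PAGESZ);

    /* The kernel-supplied value should agree with what libc reports. */
    assert(size == (unsigned long)sysconf(_SC_PAGESIZE));
    return size;
}

/*
 * Bit mask of hardware capabilities. The meaning of each bit is specific
 * to the architecture, so it is returned here without interpretation.
 */
unsigned long auxv_hwcap(void)
{
    return getauxval(AT_HWCAP);
}
```

getauxval() returns 0 when the requested entry is absent, so a zero from `auxv_hwcap()` should be read as "nothing reported" rather than as an error.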
We mentioned previously that system calls are slow, and modern systems have mechanisms to avoid the overheads of calling a trap to the processor.
In Linux, this is implemented by a neat trick between the dynamic loader and the kernel, all communicated with the auxv structure. The kernel actually adds a small shared library into the address space of every newly created process which contains a function that makes system calls for you. The beauty of this system is that if the underlying hardware supports a fast system call mechanism the kernel (being the creator of the library) can use it, otherwise it can use the old scheme of generating a trap. This library is named linux-gate.so.1, so called because it is a gateway to the inner workings of the kernel.
When the kernel starts the dynamic linker it adds an entry to the auxv called AT_SYSINFO_EHDR, which is the address in memory where the special kernel library lives. When the dynamic linker starts it can look for the AT_SYSINFO_EHDR pointer and, if found, load that library for the program. The program has no idea this library exists; this is a private arrangement between the dynamic linker and the kernel.
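Although the arrangement is private, nothing stops a curious program from peeking. The following sketch, assuming Linux with glibc's getauxval(), fetches the AT_SYSINFO_EHDR value and checks that the memory there starts with the ELF magic number, as a shared library image should. Both function names are made up for illustration.

```c
#include <assert.h>
#include <elf.h>       /* ELFMAG, SELFMAG */
#include <string.h>
#include <sys/auxv.h>  /* getauxval(), AT_SYSINFO_EHDR */

/*
 * Return the address of the kernel-provided library, or NULL if the
 * kernel did not supply an AT_SYSINFO_EHDR entry for this process.
 */
const void *kernel_library_base(void)
{
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Non-zero if the memory at `base` begins with the ELF magic number. */
int looks_like_elf(const void *base)
{
    assert(base != NULL);
    return memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

A NULL result from `kernel_library_base()` simply means no such library was mapped, so callers should check for it before calling `looks_like_elf()`.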
We mentioned that programmers make system calls indirectly through calling functions in the system libraries, namely libc. libc can check to see if the special kernel binary is loaded, and if so use the functions within that to make system calls. As we mentioned, if the kernel determines the hardware is capable, this will use the fast system call method.
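The difference between going through a libc function and requesting a system call by number can be sketched as follows. This assumes Linux with glibc's syscall() function; getpid is used only because it takes no arguments and cannot fail, not because it is routed through the kernel library. The two function names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    /* for the syscall() declaration in glibc */
#endif
#include <assert.h>
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>

/* Ask for the process ID through the ordinary libc wrapper function. */
pid_t pid_via_wrapper(void)
{
    return getpid();
}

/* Ask for the process ID by requesting the system call by number. */
pid_t pid_via_raw_syscall(void)
{
    long ret = syscall(SYS_getpid);

    assert(ret > 0);    /* getpid cannot fail */
    return (pid_t)ret;
}
```

Both functions return the same value; the point is only that the wrapper hides how the kernel is actually entered, which leaves libc free to pick the fastest mechanism available.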
Once the kernel has loaded the interpreter it passes control to the entry point as given in the interpreter file (note we will not examine how the dynamic linker starts at this stage; see Chapter 9, Dynamic Linking for a full discussion of dynamic linking). The dynamic linker will jump to the entry point address as given in the ELF binary.
    $ cat test.c

    int main(void)
    {
        return 0;
    }

    $ gcc -o test test.c

    $ readelf --headers ./test | grep Entry
      Entry point address:               0x80482b0

    $ objdump --disassemble ./test

    [...]

    080482b0 <_start>:
     80482b0:       31 ed                   xor    %ebp,%ebp
     80482b2:       5e                      pop    %esi
     80482b3:       89 e1                   mov    %esp,%ecx
     80482b5:       83 e4 f0                and    $0xfffffff0,%esp
     80482b8:       50                      push   %eax
     80482b9:       54                      push   %esp
     80482ba:       52                      push   %edx
     80482bb:       68 00 84 04 08          push   $0x8048400
     80482c0:       68 90 83 04 08          push   $0x8048390
     80482c5:       51                      push   %ecx
     80482c6:       56                      push   %esi
     80482c7:       68 68 83 04 08          push   $0x8048368
     80482cc:       e8 b3 ff ff ff          call   8048284 <__libc_start_main@plt>
     80482d1:       f4                      hlt
     80482d2:       90                      nop
     80482d3:       90                      nop

    08048368 <main>:
     8048368:       55                      push   %ebp
     8048369:       89 e5                   mov    %esp,%ebp
     804836b:       83 ec 08                sub    $0x8,%esp
     804836e:       83 e4 f0                and    $0xfffffff0,%esp
     8048371:       b8 00 00 00 00          mov    $0x0,%eax
     8048376:       83 c0 0f                add    $0xf,%eax
     8048379:       83 c0 0f                add    $0xf,%eax
     804837c:       c1 e8 04                shr    $0x4,%eax
     804837f:       c1 e0 04                shl    $0x4,%eax
     8048382:       29 c4                   sub    %eax,%esp
     8048384:       b8 00 00 00 00          mov    $0x0,%eax
     8048389:       c9                      leave
     804838a:       c3                      ret
     804838b:       90                      nop
     804838c:       90                      nop
     804838d:       90                      nop
     804838e:       90                      nop
     804838f:       90                      nop

    08048390 <__libc_csu_init>:
     8048390:       55                      push   %ebp
     8048391:       89 e5                   mov    %esp,%ebp
     [...]

    08048400 <__libc_csu_fini>:
     8048400:       55                      push   %ebp
     [...]
Above we investigate the very simplest program. Using readelf we can see that the entry point is the _start function in the binary. At this point we can see in the disassembly that some values are pushed onto the stack. The first value, 0x8048400, is the __libc_csu_fini function; 0x8048390 is the __libc_csu_init function; and finally 0x8048368 is the main() function. After this the __libc_start_main function is called.
__libc_start_main is defined in the glibc sources, in sysdeps/generic/libc-start.c. The function is quite complicated and hidden among a large number of defines, as it needs to be portable across the very wide range of systems and architectures that glibc can run on. It does a number of specific things related to setting up the C library which the average programmer does not need to worry about. The next point where the library calls back into the program is to handle init code.
init and fini are two special concepts that call parts of code in shared libraries that may need to be called before the library starts or when the library is unloaded, respectively. You can see how this might be useful for library programmers to set up variables when the library is started, or to clean up at the end. Originally the functions _init and _fini were looked for in the library; however this became somewhat limiting as everything was required to be in these functions. Below we will examine just how the init/fini process works.
At this stage we can see that the __libc_start_main function will receive quite a few input parameters on the stack. Firstly it will have access to the program arguments, environment variables and auxiliary vector from the kernel. Then the initialization code (_start) will have pushed onto the stack the addresses of the functions that handle init and fini, and finally the address of the main function itself.
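To tie the pushed values to something more readable, here is a toy stand-in for __libc_start_main. The parameter order mirrors the pushes in _start read in reverse (the value pushed last becomes the first parameter), which matches the glibc prototype as far as we are aware; everything else is a deliberate simplification. In particular the real function arranges for fini and rtld_fini to run at exit rather than calling fini directly, and all names beginning with toy_ or demo_ are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*main_fn)(int, char **, char **);
typedef void (*hook_fn)(void);

/* Toy model only: run init, then main, then fini, and return main's result. */
int toy_libc_start_main(main_fn main_func, int argc, char **argv,
                        hook_fn init, hook_fn fini, hook_fn rtld_fini,
                        void *stack_end)
{
    char **envp;
    int result;

    assert(main_func != NULL && argv != NULL);
    (void)rtld_fini;    /* unused in this sketch */
    (void)stack_end;    /* unused in this sketch */

    envp = &argv[argc + 1];    /* environment follows argv's NULL */

    if (init != NULL)
        init();                /* constructors run before main */
    result = main_func(argc, argv, envp);
    if (fini != NULL)
        fini();                /* destructors run after main */

    return result;
}

/* Demonstration hooks that record the order in which they are called. */
static char trace[4];
static size_t trace_len;

static void note(char c)
{
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = c;
}

static void demo_init(void) { note('i'); }
static void demo_fini(void) { note('f'); }

static int demo_main(int argc, char **argv, char **envp)
{
    (void)argv;
    note('m');
    /* expect one argument and a visible first environment string */
    return (argc == 1 && envp[0] != NULL) ? 0 : 1;
}

/* Run the toy startup path; the returned string records the call order. */
const char *toy_start_demo(void)
{
    /* argv and envp laid out back to back, as on the initial stack */
    static char *vec[] = { "prog", NULL, "HOME=/", NULL };
    int rc;

    trace_len = 0;
    rc = toy_libc_start_main(demo_main, 1, vec, demo_init, demo_fini,
                             NULL, NULL);
    trace[trace_len] = '\0';
    return rc == 0 ? trace : "";
}
```

Calling `toy_start_demo()` returns the string "imf": init first, then main, then fini.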
We need some way to indicate in the source code that a function should be called by init or fini. With gcc we use attributes to label two functions as constructors and destructors in our main program. These terms are more commonly used with object-oriented languages to describe object lifecycles.
    $ cat test.c
    #include <stdio.h>

    void __attribute__((constructor)) program_init(void)  {
      printf("init\n");
    }

    void  __attribute__((destructor)) program_fini(void) {
      printf("fini\n");
    }

    int main(void)
    {
      return 0;
    }

    $ gcc -Wall  -o test test.c

    $ ./test
    init
    fini

    $ objdump --disassemble ./test | grep program_init
    08048398 <program_init>:

    $ objdump --disassemble ./test | grep program_fini
    080483b0 <program_fini>:

    $ objdump --disassemble ./test

    [...]
    08048280 <_init>:
     8048280:       55                      push   %ebp
     8048281:       89 e5                   mov    %esp,%ebp
     8048283:       83 ec 08                sub    $0x8,%esp
     8048286:       e8 79 00 00 00          call   8048304 <call_gmon_start>
     804828b:       e8 e0 00 00 00          call   8048370 <frame_dummy>
     8048290:       e8 2b 02 00 00          call   80484c0 <__do_global_ctors_aux>
     8048295:       c9                      leave
     8048296:       c3                      ret
    [...]

    080484c0 <__do_global_ctors_aux>:
     80484c0:       55                      push   %ebp
     80484c1:       89 e5                   mov    %esp,%ebp
     80484c3:       53                      push   %ebx
     80484c4:       52                      push   %edx
     80484c5:       a1 2c 95 04 08          mov    0x804952c,%eax
     80484ca:       83 f8 ff                cmp    $0xffffffff,%eax
     80484cd:       74 1e                   je     80484ed <__do_global_ctors_aux+0x2d>
     80484cf:       bb 2c 95 04 08          mov    $0x804952c,%ebx
     80484d4:       8d b6 00 00 00 00       lea    0x0(%esi),%esi
     80484da:       8d bf 00 00 00 00       lea    0x0(%edi),%edi
     80484e0:       ff d0                   call   *%eax
     80484e2:       8b 43 fc                mov    0xfffffffc(%ebx),%eax
     80484e5:       83 eb 04                sub    $0x4,%ebx
     80484e8:       83 f8 ff                cmp    $0xffffffff,%eax
     80484eb:       75 f3                   jne    80484e0 <__do_global_ctors_aux+0x20>
     80484ed:       58                      pop    %eax
     80484ee:       5b                      pop    %ebx
     80484ef:       5d                      pop    %ebp
     80484f0:       c3                      ret
     80484f1:       90                      nop
     80484f2:       90                      nop
     80484f3:       90                      nop

    $ readelf --sections ./test
    There are 34 section headers, starting at offset 0xfb0:

    Section Headers:
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            00000000 000000 000000 00      0   0  0
      [ 1] .interp           PROGBITS        08048114 000114 000013 00   A  0   0  1
      [ 2] .note.ABI-tag     NOTE            08048128 000128 000020 00   A  0   0  4
      [ 3] .hash             HASH            08048148 000148 00002c 04   A  4   0  4
      [ 4] .dynsym           DYNSYM          08048174 000174 000060 10   A  5   1  4
      [ 5] .dynstr           STRTAB          080481d4 0001d4 00005e 00   A  0   0  1
      [ 6] .gnu.version      VERSYM          08048232 000232 00000c 02   A  4   0  2
      [ 7] .gnu.version_r    VERNEED         08048240 000240 000020 00   A  5   1  4
      [ 8] .rel.dyn          REL             08048260 000260 000008 08   A  4   0  4
      [ 9] .rel.plt          REL             08048268 000268 000018 08   A  4  11  4
      [10] .init             PROGBITS        08048280 000280 000017 00  AX  0   0  4
      [11] .plt              PROGBITS        08048298 000298 000040 04  AX  0   0  4
      [12] .text             PROGBITS        080482e0 0002e0 000214 00  AX  0   0 16
      [13] .fini             PROGBITS        080484f4 0004f4 00001a 00  AX  0   0  4
      [14] .rodata           PROGBITS        08048510 000510 000012 00   A  0   0  4
      [15] .eh_frame         PROGBITS        08048524 000524 000004 00   A  0   0  4
      [16] .ctors            PROGBITS        08049528 000528 00000c 00  WA  0   0  4
      [17] .dtors            PROGBITS        08049534 000534 00000c 00  WA  0   0  4
      [18] .jcr              PROGBITS        08049540 000540 000004 00  WA  0   0  4
      [19] .dynamic          DYNAMIC         08049544 000544 0000c8 08  WA  5   0  4
      [20] .got              PROGBITS        0804960c 00060c 000004 04  WA  0   0  4
      [21] .got.plt          PROGBITS        08049610 000610 000018 04  WA  0   0  4
      [22] .data             PROGBITS        08049628 000628 00000c 00  WA  0   0  4
      [23] .bss              NOBITS          08049634 000634 000004 00  WA  0   0  4
      [24] .comment          PROGBITS        00000000 000634 00018f 00      0   0  1
      [25] .debug_aranges    PROGBITS        00000000 0007c8 000078 00      0   0  8
      [26] .debug_pubnames   PROGBITS        00000000 000840 000025 00      0   0  1
      [27] .debug_info       PROGBITS        00000000 000865 0002e1 00      0   0  1
      [28] .debug_abbrev     PROGBITS        00000000 000b46 000076 00      0   0  1
      [29] .debug_line       PROGBITS        00000000 000bbc 0001da 00      0   0  1
      [30] .debug_str        PROGBITS        00000000 000d96 0000f3 01  MS  0   0  1
      [31] .shstrtab         STRTAB          00000000 000e89 000127 00      0   0  1
      [32] .symtab           SYMTAB          00000000 001500 000490 10     33  53  4
      [33] .strtab           STRTAB          00000000 001990 000218 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)

    $ objdump --disassemble-all --section .ctors ./test

    ./test:     file format elf32-i386

    Contents of section .ctors:
     8049528 ffffffff 98830408 00000000           ............
One of the values pushed onto the stack for __libc_start_main was the initialisation function __libc_csu_init. If we follow the call chain through from __libc_csu_init we can see it does some setup and then calls the _init function in the executable. The _init function eventually calls a function called __do_global_ctors_aux. Looking at the disassembly of this function we can see that it appears to start at address 0x804952c and loop along, reading a value and calling it. We can see that this starting address is in the .ctors section of the file; if we have a look inside this we see that it contains the first value -1, a function address (stored in little-endian byte order, so the bytes appear reversed in the dump) and the value zero.
Read as a 32-bit number, the address is 0x08048398, which is the address of the program_init function! So the format of the .ctors section is firstly a -1, then the addresses of the functions to be called on initialisation, and finally a zero to indicate the list is complete. Each entry will be called (in this case we only have the one function).
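The walk over such a table can be modelled in a few lines of C. This is only a sketch of the idea, not the actual crt code: it builds its own small table in the same shape as the .ctors contents (a -1 marker, function addresses, a terminating zero) and, like the loop in the __do_global_ctors_aux disassembly, starts at the last function slot and steps towards lower addresses until it meets the -1. All names are invented for illustration, and converting -1 to a function pointer relies on the usual behaviour of gcc on Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ctor_fn)(void);

/* Marker stored in the first slot of the table (the -1 seen in .ctors). */
#define CTOR_SENTINEL ((ctor_fn)(intptr_t)-1)

/*
 * Call every function in a .ctors-style table. `last` points at the
 * final function slot (the one just before the terminating zero); the
 * walk moves backwards until it reaches the -1 marker.
 * Returns the number of functions called.
 */
size_t run_ctor_table(ctor_fn *last)
{
    size_t called = 0;

    assert(last != NULL);

    while (*last != CTOR_SENTINEL) {
        (*last)();
        last--;
        called++;
    }
    return called;
}

/* Two demonstration constructors that record the order they ran in. */
static char order[3];
static size_t order_len;

static void first_ctor(void)  { order[order_len++] = 'a'; }
static void second_ctor(void) { order[order_len++] = 'b'; }

/* Build a small table, walk it, and return the recorded call order. */
const char *ctor_table_demo(void)
{
    ctor_fn table[] = { CTOR_SENTINEL, first_ctor, second_ctor, NULL };
    size_t n;

    order_len = 0;
    n = run_ctor_table(&table[2]);
    order[order_len] = '\0';
    return n == 2 ? order : "";
}
```

Because the walk runs from the end of the table towards the marker, `ctor_table_demo()` returns "ba": the entry nearest the terminating zero is called first.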
Once __libc_start_main has completed the _init call it finally calls the main() function! Remember that it had the stack set up initially with the arguments and environment pointers from the kernel; this is how main gets its argc, argv[], envp[] arguments.
The process now runs and the setup phase is complete.
A similar process is enacted with the .dtors section for destructors when the program exits. __libc_start_main calls these when the main() function completes.
As you can see, a lot is done before the program gets to start, and even a little after you think it is finished!