You might have noticed a critical problem with relocations when thinking about the goals of a shared library. We mentioned previously that the big advantage of a shared library with virtual memory is that multiple programs can use the code in memory by sharing of pages.
The problem stems from the fact that libraries have no guarantee about where they will be put into memory. The dynamic linker will find the most convenient place in virtual memory for each library required and place it there. Think about the alternative if this were not to happen; every library in the system would require its own chunk of virtual memory so that no two overlapped. Every time a new library were added to the system it would require allocation. Someone could potentially be a hog and write a huge library, not leaving enough space for other libraries! And chances are, your program doesn't ever want to use that library anyway.
Thus, if you modify the code of a shared library with a relocation, that code no longer becomes sharable. We've lost the advantage of our shared library.
Below we explain the mechanism for doing this.
So imagine the situation where we take the value of a symbol. With only relocations, we would have the dynamic linker look up the memory address of that symbol and re-write the code to load that address.
A fairly straight forward enhancement would be to set aside space in our binary to hold the address of that symbol, and have the dynamic linker put the address there rather than in the code directly. This way we never need to touch the code part of the binary.
The area that is set aside for these addresses is called
the Global Offset Table, or GOT. The GOT lives in a section of
the ELF file called .got
.
The GOT is private to each process, and the process must have write permissions to it. Conversely the library code is shared and the process should have only read and execute permissions on the code; it would be a serious security breach if the process could modify code.
1 $ cat got.c extern int i; 5 void test(void) { i = 100; } 10 $ gcc -nostdlib -shared -o got.so ./got.c $ objdump --disassemble ./got.so ./got.so: file format elf64-ia64-little 15 Disassembly of section .text: 0000000000000410 <test>: 410: 0d 10 00 18 00 21 [MFI] mov r2=r12 20 416: 00 00 00 02 00 c0 nop.f 0x0 41c: 81 09 00 90 addl r14=24,r1;; 420: 0d 78 00 1c 18 10 [MFI] ld8 r15=[r14] 426: 00 00 00 02 00 c0 nop.f 0x0 42c: 41 06 00 90 mov r14=100;; 25 430: 11 00 38 1e 90 11 [MIB] st4 [r15]=r14 436: c0 00 08 00 42 80 mov r12=r2 43c: 08 00 84 00 br.ret.sptk.many b0;; $ readelf --sections ./got.so 30 There are 17 section headers, starting at offset 0x640: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align 35 [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .hash HASH 0000000000000120 00000120 00000000000000a0 0000000000000004 A 2 0 8 [ 2] .dynsym DYNSYM 00000000000001c0 000001c0 40 00000000000001f8 0000000000000018 A 3 e 8 [ 3] .dynstr STRTAB 00000000000003b8 000003b8 000000000000003f 0000000000000000 A 0 0 1 [ 4] .rela.dyn RELA 00000000000003f8 000003f8 0000000000000018 0000000000000018 A 2 0 8 45 [ 5] .text PROGBITS 0000000000000410 00000410 0000000000000030 0000000000000000 AX 0 0 16 [ 6] .IA_64.unwind_inf PROGBITS 0000000000000440 00000440 0000000000000018 0000000000000000 A 0 0 8 [ 7] .IA_64.unwind IA_64_UNWIND 0000000000000458 00000458 50 0000000000000018 0000000000000000 AL 5 5 8 [ 8] .data PROGBITS 0000000000010470 00000470 0000000000000000 0000000000000000 WA 0 0 1 [ 9] .dynamic DYNAMIC 0000000000010470 00000470 0000000000000100 0000000000000010 WA 3 0 8 55 [10] .got PROGBITS 0000000000010570 00000570 0000000000000020 0000000000000000 WAp 0 0 8 [11] .sbss NOBITS 0000000000010590 00000590 0000000000000000 0000000000000000 W 0 0 1 [12] .bss NOBITS 0000000000010590 00000590 60 0000000000000000 0000000000000000 WA 0 0 1 [13] .comment PROGBITS 0000000000000000 00000590 0000000000000026 0000000000000000 0 0 1 [14] .shstrtab STRTAB 0000000000000000 000005b6 000000000000008a 0000000000000000 0 0 1 65 [15] .symtab SYMTAB 0000000000000000 00000a80 0000000000000258 0000000000000018 16 12 8 [16] .strtab STRTAB 0000000000000000 00000cd8 0000000000000045 0000000000000000 0 0 1 Key to Flags: 70 W (write), A (alloc), X (execute), M (merge), S (strings) I (info), L (link order), G (group), x (unknown) O (extra OS processing required) o (OS specific), p (processor specific)
Above we create a simple shared library which refers to an external symbol. We do not know the address of this symbol at compile time, so we leave it for the dynamic linker to fix up at runtime.
But we want our code to remain sharable, in case other processes want to use our code as well.
The disassembly reveals just how we do this with the
.got
. On IA64 (the
architecture which the library was compiled for) the register
r1
is known as the
global pointer and always points to where
the .got
section is loaded
into memory.
If we have a look at the
readelf
output we can see
that the .got
section starts
0x10570 bytes past where library was loaded into memory. Thus
if the library were to be loaded into memory at address
0x6000000000000000 the .got
would be at 0x6000000000010570, and register
r1
would always point to this
address.
Working backwards through the disassembly, we can see
that we store the value 100 into the memory address held in
register r15
. If we look
back we can see that register 15 holds the value of the memory
address stored in register 14. Going back one more step, we
see we load this address is found by adding a small number to
register 1. The GOT is simply a big long list of entries, one
for each external variable. This means that the GOT entry for
the external variable i
is
stored 24 bytes (that is 3 64 bit addresses).
1 $ readelf --relocs ./got.so Relocation section '.rela.dyn' at offset 0x3f8 contains 1 entries: 5 Offset Info Type Sym. Value Sym. Name + Addend 000000010588 000f00000027 R_IA64_DIR64LSB 0000000000000000 i + 0
We can also check out the relocation for this entry too. The relocation says "replace the value at offset 10588 with the memory location that symbol i is stored at".
We know that the .got
starts at offset 0x10570 from the previous output. We have
also seen how the code loads an address 0x18 (24 in decimal)
past this, giving us an address of 0x10570 + 0x18 =
0x10588 ... the address which the relocation is for!
So before the program begins, the dynamic linker will
have fixed up the relocation to ensure that the value of the
memory at offset 0x10588 is the address of the global variable
i
!