The presence of the dynamic linker provides both some advantages we can utilise and some extra issues that need to be resolved to get a functional system.
One potential issue is different versions of libraries. With only static libraries there is much less potential for problems, as all library code is built directly into the binary of the application. If you want to use a new version of the library you need to recompile it into a new binary, replacing the old one.
This is obviously fairly impractical for common libraries, the most common of course being libc, which is included in almost all applications. If it were only available as a static library, any change would require every single application in the system to be rebuilt.
However, changes in the way the dynamic library works could cause multiple problems. In the best case, the modifications are completely compatible and nothing externally visible is changed. On the other hand, the changes might cause the application to crash; for example, if a function that used to take an int changes to take an int *. Worse, the new library version could have changed semantics and suddenly start silently returning different, possibly wrong values. This can be a very nasty bug to try and track down; when an application crashes you can use a debugger to isolate where the error occurs, whilst data corruption or modification may only show up in seemingly unrelated parts of the application.
The dynamic linker requires a way to determine the version of libraries within the system so that newer revisions can be identified. There are a number of schemes a modern dynamic linker can use to find the right versions of libraries.
sonames
Using sonames we can add some extra information to a library to help identify versions.
As we have seen previously, an application lists the libraries it requires in DT_NEEDED fields in the dynamic section of the binary. The actual library is held in a file on disk, usually in /lib for core system libraries or /usr/lib for optional libraries.
To allow multiple versions of the library to exist on disk, they obviously require differing file names. The soname scheme uses a combination of names and file system links to build a hierarchy of libraries.
This is done by introducing the concept of major and minor library revisions. A minor revision is one wholly backwards compatible with a previous version of the library; this usually consists of only bug fixes. A major revision is therefore any revision that is not compatible; e.g. changes the inputs to functions or the way a function behaves.
As each library revision, major or minor, will need to be kept in a separate file on disk, this forms the basis of the library hierarchy. The library name is by convention libNAME.so.MAJOR.MINOR [30]. However, if every application were directly linked against this file we would have the same issue as with a static library; every time a minor change happened we would need to rebuild the application to point to the new library.
What we really want to refer to is the major number of the library. If this changes, it is reasonable to require that we recompile our application, since we need to make sure our program is still compatible with the new library.
Thus the soname is libNAME.so.MAJOR. The soname should be set in the DT_SONAME field of the dynamic section in a shared library; the library author can specify this version when they build the library.
Thus each minor version library file on disk can specify the same major version number in its DT_SONAME field, allowing the dynamic linker to know that this particular library file implements a particular major revision of the library API and ABI.
To keep track of this, an application called ldconfig is commonly run to create symbolic links named for the major version to the latest minor version on the system. ldconfig works by running through all the libraries that implement a particular major revision number, and then picks out the one with the highest minor revision. It then creates a symbolic link from libNAME.so.MAJOR to the actual library file on disk, i.e. libNAME.so.MAJOR.MINOR.
The final piece of the hierarchy is the compile name for the library. When you compile your program, to link against a library you use the -lNAME flag, which goes off searching for the libNAME.so file in the library search path. Notice, however, that we have not specified any version number; we just want to link against the latest library on the system. It is up to the installation procedure for the library to create the symbolic link between the compile name libNAME.so and the latest library code on the system. Usually this is handled by your package management system (dpkg or rpm). This is not an automated process because it is possible that the latest library on the system may not be the one you wish to always compile against; for example, if the latest installed library were a development version not appropriate for general use.
The general process is illustrated below.
[Figure: sonames]
When the application starts, the dynamic linker looks at the DT_NEEDED field to find the required libraries. This field contains the soname of the library, so the next step is for the dynamic linker to walk through all the libraries in its search path looking for it.
This process conceptually involves two steps. Firstly the dynamic linker needs to search through all the libraries to find those that implement the given soname. Secondly the file names for the minor revisions need to be compared to find the latest version, which is then ready to be loaded.
We mentioned previously that there is a symbolic link set up by ldconfig between the library soname and the latest minor revision. Thus the dynamic linker should only need to follow that link to find the correct file to load, rather than having to open all possible libraries and decide which one to go with each time the application is run.
Since file system access is comparatively slow, ldconfig also creates a cache of libraries installed in the system. This cache is simply a list of sonames of libraries available to the dynamic linker and a pointer to the major version link on disk, saving the dynamic linker from having to read entire directories full of files to locate the correct link. You can inspect this with /sbin/ldconfig -p; it actually lives in the file /etc/ld.so.cache. If the library is not found in the cache the dynamic linker will fall back to the slower option of walking the file system, thus it is important to re-run ldconfig when new libraries are installed.
We've already discussed how the dynamic linker gets the address of a library function and puts it in the PLT for the program to use. But so far we haven't discussed just how the dynamic linker finds the address of the function. The whole process is called binding, because the symbol name is bound to the address it represents.
The dynamic linker has a few pieces of information; firstly the symbol that it is searching for, and secondly a list of libraries that that symbol might be in, as defined by the DT_NEEDED fields in the binary.
Each shared object library has a section, marked SHT_DYNSYM and called .dynsym, which is the minimal set of symbols required for dynamic linking; that is, any symbol in the library that may be called by an external program.
In fact, there are three sections that all play a part in describing the dynamic symbols. Firstly, let us look at the definition of a symbol from the ELF specification:
    typedef struct {
            Elf32_Word    st_name;
            Elf32_Addr    st_value;
            Elf32_Word    st_size;
            unsigned char st_info;
            unsigned char st_other;
            Elf32_Half    st_shndx;
    } Elf32_Sym;
Field | Value |
---|---|
st_name | An index to the string table |
st_value | Value; in a relocatable shared object this holds the offset from the section of index given in st_shndx |
st_size | Any associated size of the symbol |
st_info | Information on the binding of the symbol (described below) and what type of symbol this is (a function, object, etc). |
st_other | Not used in the original specification; current ABIs store the symbol's visibility here |
st_shndx | Index of the section this symbol resides in (see st_value) |
As you can see, the actual string of the symbol name is held in a separate section (.dynstr); the entry in the .dynsym section only holds an index into the string section. This creates some level of overhead for the dynamic linker; the dynamic linker must read all of the symbol entries in the .dynsym section and then follow the index pointer to find the symbol name for comparison.
To speed this process up, a third section called .hash is introduced, containing a hash table of symbol names to symbol table entries. This hash table is pre-computed when the library is built and allows the dynamic linker to find the symbol entry much faster, generally with only one or two lookups.
Whilst we usually say that finding the address of a symbol is the process of binding that symbol, the term symbol binding also has a separate meaning.
The binding of a symbol dictates its external visibility during the dynamic linking process. A local symbol is not visible outside the object file it is defined in. A global symbol is visible to other object files, and can satisfy undefined references in other objects.
A weak reference is a special type of lower priority global reference. This means it is designed to be overridden, as we will see shortly.
Below we have an example C program which we analyse to inspect the symbol bindings.
    $ cat test.c
    static int static_variable;

    extern int extern_variable;

    int external_function(void);

    int function(void)
    {
            return external_function();
    }

    static int static_function(void)
    {
            return 10;
    }

    #pragma weak weak_function
    int weak_function(void)
    {
            return 10;
    }

    $ gcc -c test.c
    $ objdump --syms test.o

    test.o:     file format elf32-powerpc

    SYMBOL TABLE:
    00000000 l    df *ABS*            00000000 test.c
    00000000 l    d  .text            00000000 .text
    00000000 l    d  .data            00000000 .data
    00000000 l    d  .bss             00000000 .bss
    00000038 l     F .text            00000024 static_function
    00000000 l    d  .sbss            00000000 .sbss
    00000000 l     O .sbss            00000004 static_variable
    00000000 l    d  .note.GNU-stack  00000000 .note.GNU-stack
    00000000 l    d  .comment         00000000 .comment
    00000000 g     F .text            00000038 function
    00000000         *UND*            00000000 external_function
    0000005c  w    F .text            00000024 weak_function

    $ nm test.o
             U external_function
    00000000 T function
    00000038 t static_function
    00000000 s static_variable
    0000005c W weak_function
Notice the use of #pragma to define the weak symbol. A pragma is a way of communicating extra information to the compiler; its use is not common but occasionally is required to get the compiler to do out of the ordinary operations.
We inspect the symbols with two different tools; in both cases the binding is shown in the second column. The codes should be quite straightforward (and are documented in each tool's man page).
It is often very useful for a programmer to be able to override a symbol in a library; that is to subvert the normal symbol with a different definition.
We mentioned that the order in which libraries are searched is given by the order of the DT_NEEDED fields within the binary. However, it is possible to insert libraries ahead of all of these, so that they are the first libraries to be searched; this means that any symbols within them will be found in preference to definitions in the libraries that follow.
This is done via an environment variable called LD_PRELOAD, which specifies libraries that the dynamic linker should load before any others.
LD_PRELOAD
    $ cat override.c
    #define _GNU_SOURCE 1
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <dlfcn.h>

    pid_t getpid(void)
    {
            pid_t (*orig_getpid)(void) = dlsym(RTLD_NEXT, "getpid");
            printf("Calling GETPID\n");

            return orig_getpid();
    }

    $ cat test.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
            printf("%d\n", getpid());
    }

    $ gcc -shared -fPIC -o liboverride.so override.c -ldl
    $ gcc -o test test.c
    $ LD_PRELOAD=./liboverride.so ./test
    Calling GETPID
    15187
In the above example we override the getpid function to print out a small statement when it is called. We use the dlsym function provided by libc with the RTLD_NEXT argument, telling it to continue on and find the next symbol called getpid.
The concept of the weak symbol is that the symbol is marked as a lower priority and can be overridden by another symbol. Only if no other implementation is found will the weak symbol be the one that is used.
The logical extension of this for the dynamic loader is that all libraries should be loaded, and any weak symbols in those libraries should be ignored in favour of normal symbols in any other library. This was indeed how weak symbol handling was originally implemented in Linux by glibc.
However, this was actually incorrect to the letter of the Unix standard at the time (SysVr4). The standard actually dictates that weak symbols should only be handled by the static linker; they should remain irrelevant to the dynamic linker (see the section on binding order below).
At the time, the Linux implementation of making the dynamic linker override weak symbols matched SGI's IRIX platform, but differed from others such as Solaris and AIX. When the developers realised this behaviour violated the standard it was reversed, and the old behaviour was relegated to requiring a special environment flag (LD_DYNAMIC_WEAK) to be set.
We have seen how we can override a function in another library by preloading another shared library with the same symbol defined. The symbol that gets used is the first definition found in the order in which the dynamic loader searches the loaded objects.
Libraries are loaded in the order they are specified in the DT_NEEDED entries of the binary. This in turn is decided from the order that libraries are passed in on the command line when the object is built. When symbols are to be located, the dynamic linker starts at the executable itself, then looks in any preloaded libraries, and then works through the remaining libraries in the order they were loaded until the symbol is found.
Some shared libraries, however, need a way to override this behaviour. They need to say to the dynamic linker "look first inside me for these symbols, rather than starting from the beginning of the search order". Libraries can set the DT_SYMBOLIC flag in their dynamic section to get this behaviour (this is usually set by passing the -Bsymbolic flag on the static linker's command line when building the shared library).
What this flag is doing is controlling symbol visibility. The symbols in the library can not be overridden so could be considered private to the library that is being loaded.
However, this loses a lot of granularity since the library is either flagged for this behaviour, or it is not. A better system would allow us to make some symbols private and some symbols public.
That better system comes from symbol versioning. With symbol versioning we specify some extra input to the static linker to give it some more information about the symbols in our shared library.
    $ cat Makefile
    all: test testsym

    clean:
            rm -f *.so test testsym

    liboverride.so : override.c
            $(CC) -shared -fPIC -o liboverride.so override.c

    libtest.so : libtest.c
            $(CC) -shared -fPIC -o libtest.so libtest.c

    libtestsym.so : libtest.c
            $(CC) -shared -fPIC -Wl,-Bsymbolic -o libtestsym.so libtest.c

    # libraries are listed after the source so the linker sees the
    # undefined reference before it scans them
    test : test.c libtest.so liboverride.so
            $(CC) -o test test.c -L. -ltest

    testsym : test.c libtestsym.so liboverride.so
            $(CC) -o testsym test.c -L. -ltestsym

    $ cat libtest.c
    #include <stdio.h>

    int foo(void) {
            printf("libtest foo called\n");
            return 1;
    }

    int test_foo(void)
    {
            return foo();
    }

    $ cat override.c
    #include <stdio.h>

    int foo(void)
    {
            printf("override foo called\n");
            return 0;
    }

    $ cat test.c
    #include <stdio.h>

    /* provided by libtest; declared here so the call is not implicit */
    int test_foo(void);

    int main(void)
    {
            printf("%d\n", test_foo());
    }

    $ cat Versions
    {global: test_foo;  local: *; };

    $ gcc -shared -fPIC -Wl,-version-script=Versions -o libtestver.so libtest.c

    $ gcc -o testver test.c -L. -ltestver

    $ LD_LIBRARY_PATH=. LD_PRELOAD=./liboverride.so ./testver
    libtest foo called
    1

    00000574 l     F .text  00000054              foo
    000005c8 g     F .text  00000038              test_foo
In the simplest case as above, we simply state if the symbol is global or local. Thus in the case above the foo function is most likely a support function for test_foo; whilst we are happy for the overall functionality of the test_foo function to be overridden, if we do use the shared library version it needs unaltered access to foo, so nobody should be able to modify the support function.
This allows us to keep our namespace better organised. Many libraries might want to implement something that could be given a common name like read or write; however, if they all did, the actual version given to the program might be completely wrong. By specifying symbols as local only, the developer can be sure that nothing will conflict with that internal name, and conversely the name they chose will not influence any other program.
An extension of this scheme is symbol versioning. With this you can specify multiple versions of the same symbol in the same library. The static linker appends some version information after the symbol name (something like @VER) describing what version the symbol is given.
If the developer implements a function that has the same name but possibly a binary or programmatically different implementation, they can increase the version number. When new applications are built against the shared library, they will pick up the latest version of the symbol. However, applications built against earlier versions of the same library will be requesting older versions (e.g. will have older @VER strings in the symbol name they request) and thus get the original implementation.
[30] You can optionally have a release as a final identifier after the minor number. Generally this is enough to distinguish all the various versions of a library.