Libraries

Developers soon tired of having to write everything from scratch, so one of the first inventions of computer science was libraries.

A library is simply a collection of functions which you can call from your program. A library has many advantages, not least of which is that you save time by reusing work someone else has already done, and you can generally be more confident that it has fewer bugs (since many other people probably use the library too, and you benefit from them finding and fixing bugs). A library is much like an executable, except that instead of being run directly, its functions are invoked with parameters from your executable.
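As a small illustration of reusing library code, the sketch below sorts an array with qsort from the standard C library instead of implementing a sort from scratch. The names compare_ints and sort_ints are made up for this example; only qsort comes from the library.

```c
#include <stdlib.h>

/* Comparison callback for qsort(): orders two ints ascending.
 * compare_ints is a hypothetical helper name, not a library function. */
static int compare_ints(const void *a, const void *b)
{
        int x = *(const int *)a;
        int y = *(const int *)b;

        /* Avoids the overflow that "x - y" could cause. */
        return (x > y) - (x < y);
}

/* Sort an array in place by calling into the C library's qsort(). */
void sort_ints(int *values, size_t count)
{
        qsort(values, count, sizeof(values[0]), compare_ints);
}
```

The sorting algorithm itself lives in the library; our code only supplies the comparison and the data.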

Static Libraries

The most straightforward way of using a library function is to have the object files from the library linked directly into your final executable, just as with those you have compiled yourself. When linked like this the library is called a static library, because the library code in the executable will remain unchanged unless the program is rebuilt.

The appeal of this approach is that the final result is a single executable that does not depend on a separate library file at run time.

Inside static libraries

A static library is simply a group of object files. The object files are kept in an archive, which leads to their usual .a suffix. You can think of an archive as similar to a zip file, but without compression.

Below we show the creation of a basic static library and introduce some common tools for working with libraries.
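To make the archive idea a little more concrete: files in the common ar format begin with the eight-byte signature "!<arch>" followed by a newline. The sketch below checks a buffer for that signature; has_ar_magic and the AR_MAGIC macros are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* The 8-byte signature found at the start of an ar archive. */
#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8

/* Return 1 if the buffer starts with the ar signature, 0 otherwise.
 * has_ar_magic is a hypothetical helper, not part of any real tool. */
int has_ar_magic(const unsigned char *buf, size_t len)
{
        if (buf == NULL || len < AR_MAGIC_LEN)
                return 0;

        return memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) == 0;
}
```

Passing it the first eight bytes read from a .a file should return 1, while the start of a zip file or an executable should return 0.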

Example 8.14. Creating and using a static library
    $ cat library.c
    /* Library Function */
    int function(int input)
    {
            return input + 10;
    }

    $ cat library.h
    /* Function declaration */
    int function(int);

    $ cat program.c
    #include <stdio.h>
    /* Library header file */
    #include "library.h"

    int main(void)
    {
            int d = function(100);

            printf("%d\n", d);
    }

    $ gcc -c library.c
    $ ar rc libtest.a library.o
    $ ranlib ./libtest.a
    $ nm --print-armap ./libtest.a

    Archive index:
    function in library.o

    library.o:
    00000000 T function

    $ gcc -L . program.c -ltest -o program

    $ ./program
    110

Firstly we compile our library to an object file, just as we have seen in the previous chapter.

Notice that we define the library API in the header file. The API consists of declarations (prototypes) for the functions in the library; this is so that the compiler knows the argument and return types of the functions when building object files that reference the library (e.g. program.c, which #includes the header file).
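The split between declaration and definition can be sketched in a single file as below. The include guard (LIBRARY_H) is an addition that the example header above does not have; it is a common convention to stop the header being processed twice, not something the example requires.

```c
/* ---- what would live in library.h ---- */
#ifndef LIBRARY_H          /* include guard: an assumption, not in the original */
#define LIBRARY_H

/* Declaration only: tells the compiler the argument and return types. */
int function(int input);

#endif /* LIBRARY_H */

/* ---- what would live in library.c ---- */

/* Definition: the actual code that ends up in library.o. */
int function(int input)
{
        return input + 10;
}
```

A caller such as program.c only needs the first part to compile; the second part is supplied later, at link time, by the library.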

We create the library with the ar (short for "archive") command. By convention static library file names are prefixed with lib and have the extension .a. The c argument tells ar to create the archive, and r tells it to insert the specified object files into the archive, replacing any existing members of the same name.[27]

Next we use the ranlib application to build an index in the archive listing the symbols defined by the object files it contains. This helps the linker to find symbols quickly; with just one symbol this step may seem a little redundant, but a large library may have thousands of symbols, so an index can significantly speed up finding references. We inspect this new index with the nm application. We see the symbol for function() at offset zero in library.o, as we expect; the T indicates that it is defined in the text (code) section.

You then specify the library to the compiler with -lname, where name is the filename of the library without the lib prefix and the .a extension (so libtest.a becomes -ltest). We also provide an extra search directory for libraries, namely the current directory (-L .), since by default the current directory is not searched for libraries.
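The naming convention behind -lname can be sketched as a small helper that turns a name such as test into the file name libtest.a. This is only an illustration: static_lib_filename is a hypothetical function, and a real linker also considers shared libraries and walks a list of search directories.

```c
#include <stdio.h>
#include <string.h>

/* Build the static library file name for "-l<name>", e.g. "test" -> "libtest.a".
 * Returns 0 on success, -1 if the arguments are invalid or the buffer is too small.
 * static_lib_filename is a hypothetical helper for illustration only. */
int static_lib_filename(const char *name, char *out, size_t outsz)
{
        int written;

        if (name == NULL || out == NULL || strlen(name) == 0)
                return -1;

        written = snprintf(out, outsz, "lib%s.a", name);
        if (written < 0 || (size_t)written >= outsz)
                return -1;      /* output was truncated */

        return 0;
}
```

With the example above, the name test combined with the search directory given by -L . leads the linker to ./libtest.a.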

The final result is a single executable with our new library included.

Static Linking Drawbacks

Static linking is very straightforward, but has a number of drawbacks.

There are two main disadvantages. Firstly, if the library code is updated (to fix a bug, say) you have to rebuild your program into a new executable. Secondly, every program in the system that uses that library contains a copy in its executable. This is very inefficient (and a pain if you find a bug and have to rebuild every program, as per point one).

For example, the C library (glibc) is used by almost all programs, and provides all the common functions such as printf. If it were statically linked, every one of those executables would carry its own copy.

Shared Libraries

Shared libraries are an elegant way around the problems posed by a static library. A shared library is a library that is loaded dynamically at runtime for each application that requires it.

The application simply records that it will require a certain library; when the program runs, the library is loaded into memory and calls into it are resolved by the dynamic linker. If the library is already loaded for another application, the code can be shared between the two, saving considerable resources with commonly used libraries.

This process, called dynamic linking, is one of the more intricate parts of a modern operating system. As such, we dedicate the next chapter to investigating the dynamic linking process.



[27] Archives created with ar pop up in a few different places around Linux systems other than just static libraries. One widely used application is the .deb packaging format used by Debian, Ubuntu and some other Linux systems. debs use archives to keep all the application files together in the one package file. Red Hat RPM packages use an alternative but similar format called cpio. Of course the canonical application for keeping files together is the tar file, which is a common format for distributing source code.