Description
1.1 Aims
The aim of this lab is to familiarize you with the use of linkers in C. After
completing this lab, you should be familiar with the following topics:
• Recognizing linker error messages during the course of compilation.
• The process used by a linker to resolve symbols.
• The use of libraries and the differences between static and dynamic libraries.
1.2 Background
There are 3 possible types of object files:
Relocatable object file This type of file contains code and data but will
need to be combined with other object files and libraries to create an
executable object file. Usually, this kind of object file has extension .o
and is produced using gcc's -c option.
Executable object file Contains code and data for a complete program (except shared libraries) which is ready to be loaded into memory and executed.
Shared object file A type of relocatable object file which can be loaded into
memory and linked dynamically either at load time or at run time.
Recall that building an executable program involves the following steps:
1. Compiling all the .c files into .o files.
2. Linking the .o files along with any needed libraries to produce an executable file.
A library is a collection of object files. When a library is linked into a program,
only those object files which define symbols required by the program are actually
linked into the program.
There are two kinds of libraries:
Static Libraries All object modules needed from the library are linked into
the executable at link time.
Dynamic Libraries The object modules needed from the library are not
linked into the program until it is loaded into memory or even later. Also
known as shared libraries as the library code can be shared simultaneously by multiple processes by being loaded (at possibly different virtual
addresses) into the virtual address spaces of the processes.
1.3 Exercises
1.3.1 Starting up
Use the startup directions from the earlier labs to create a lab7 directory and
fire up a terminal whose output you are logging using the script command.
Make sure that your lab7 directory contains a copy of the files directory.
1.3.2 Exercise 1: Recognizing and Fixing a Linker Error
Change over to the log10 directory and look at the log10.c program. It contains
code to read doubles from stdin and print out their base-10 logarithms. Once
you have looked at the source code, build the log10 executable by typing make.
You should receive a linker error saying there is an undefined reference to
log10. The problem is that the log10() function is defined in the math library,
but the provided Makefile does not link in the math library.
Do a man on the log10 function. You will see a requirement that you link with
the math library using -lm. When you get such a link error you should look at
the man page for the problematic function to figure out which library you may
be missing.
Now add the -lm flag to the LDLIBS make variable and re-make. This time it
should build correctly. You can test it by simply typing echo 10 1000 2 |
./log10.
Notice that in addition to the log10() function, the program also uses the
printf() and scanf() functions. The reason you did not get an error for these
functions is that they are defined in the standard C library libc which is linked
in by default.
Where are these libraries located? Traditionally, they were in /usr/lib. However, since modern systems may need to support multiple architectures (like
64-bit and 32-bit versions of x86), the version of Linux you are running has them in
/usr/lib/x86_64-linux-gnu. All the library file names start with the prefix lib
and have extension either .a for a static library or .so (standing for shared
object) for a dynamic library. Specifying the linker option -lXXX means to link
with the library with name libXXX.
Do a ls -l /usr/lib/x86_64-linux-gnu/libm.* which will list all files in
/usr/lib/x86_64-linux-gnu/ which begin with the prefix libm. (including the dot). You should
see two: libm.a and libm.so; the former is a GNU ld script referring to the
static math library and the latter is a script referring to the dynamic math
library.
Look inside the libm.a script; you should see a reference to a specific version
LIBM_VERSION of the actual libm library.
List out all the symbols from that specific version of the libm static library using
nm /usr/lib/x86_64-linux-gnu/LIBM_VERSION >libm.nm 2>/dev/null (the
2> redirects stderr to a bit sink). Look at the generated output in libm.nm
using a text editor. You should see a definition for log10 listed under
the object file w_log10.o with __log10 defined as a code symbol (aka text
symbol) using T, and log10 defined as a weak symbol using W (a weak symbol
can be overridden by a non-weak definition). You should see the other symbols
referenced by that object file but not defined within that object file marked as undefined
using U.
Do a man on nm (if it does not work on your local system pull a man page off
the web). Using the information documenting the symbol types, look at the nm
output in libm.nm to discover which object file defines the __ieee754_log10
symbol referenced by the w_log10.o object file.
1.3.3 Exercise 2: Multiply-Defined Symbols
Change over to the directory multiple-symbols. Look at the files contained
there; note that sym is defined differently in def1.c and def2.c. If you type make,
you will get a multiply-defined symbol error for sym.
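The linker classifies all external identifiers which have initializers as strong symbols and allows only a single strong definition in a program. A declaration
without an initializer declares a weak symbol. Multiple declarations for the same
weak symbol are merged together; when weak symbols are linked with a strong
symbol with the same spelling, the strong symbol definition wins. (This merging is
the traditional behavior that gcc provides under -fcommon; GCC 10 and later default
to -fno-common, which reports duplicate uninitialized definitions as errors too.)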
You can fix the error by changing one of the definitions to be a weak symbol by
removing the initializer.
1.3.4 Exercise 3: Multiple-Definition Bug
Change over to the directory multiple-defs. Observe that x is declared with inconsistent types in main.c and f.c. The definition in main.c is a strong definition
and dominates over that in f.c.
Build the program by typing make. You may receive a link warning. If you
ignore the warning and run the program you will see that the inconsistent types
for x have caused a pernicious bug: when f() changes x, it also happens to change the
value of y! Can you understand why?
This kind of bug can be avoided by putting the declaration of a symbol
referenced by multiple program files into a single header file and #include'ing
the header file into all files which reference or define the symbol.
1.3.5 Exercise 4: Dynamic versus Static Linking
Change over to the static-dynamic directory. The log10.c program it contains
is identical to that from Exercise 1. However, the Makefile is set up to build both a statically-linked and a dynamically-linked executable. Build them by simply typing make. You should see make
building a statically-linked log10-static executable and a dynamically-linked
log10-dynamic executable.
Do a ls -l. You should see a dramatic difference in size between log10-static
and log10-dynamic. That is because log10-static contains within it all the
library code needed, whereas for log10-dynamic, the library code is linked in
dynamically (at load-time or later).
Do a nm on both executables: nm log10-static >log10-static.nm and nm
log10-dynamic >log10-dynamic.nm. Look at both output files using a text
editor. In log10-dynamic.nm you should see that log10 is undefined (U), but in
log10-static.nm you should see it is defined as a weak (W) symbol.
1.3.6 Exercise 5: Building a Non-Standard Library
This exercise involves building static and dynamic versions of a custom library
to add/multiply vectors (based on the example from the text, ch. 7). Change
over to the libvec directory and look at the two files addvec.c and multvec.c
(which will be put into the library) and a test file testvec.c which will be linked
with the library.
Then specifically look at the Makefile which is reproduced here with the lines
numbered to facilitate discussion:
01 CFLAGS = -g -Wall -fPIC -std=c11
02
03 OBJS = \
04 addvec.o \
05 multvec.o
06
07 all: libvec.so libvec.a testvec-static testvec-dynamic
08
09 libvec.so: $(OBJS)
10 $(CC) -shared $(OBJS) -o $@
11
12 libvec.a: $(OBJS)
13 ar rcs $@ $(OBJS)
14
15 testvec-static: testvec.o
16 $(CC) -static testvec.o -L. -lvec -o $@
17
18 testvec-dynamic: testvec.o
19 $(CC) testvec.o -L. -lvec -o $@
20
21 .PHONY: clean
22 clean:
23 rm *.o *.so *.a testvec-*
24
Line 1 defines the options used for compilation. The one option which may not
have been seen earlier is -fPIC which specifies generating Position-Independent
Code. This is usually necessary when generating shared libraries which can be
simultaneously loaded at different addresses in the virtual address spaces of
multiple processes. -g turns on debugging, -Wall turns on reasonable warnings
and -std=c11 specifies the C dialect as C11.
Lines 3 – 5 specify the objects included in the libraries.
Line 7 lists all the targets to be built.
Lines 9 and 10 specify how to build the dynamic library libvec.so. The
-shared option builds a shared library.
Lines 12 and 13 specify how to build the static library libvec.a. The program
ar is used to archive the object files together: r specifies inserting the object
files into the archive with replacement, c creates the archive and s creates a
symbol table in the archive to facilitate searching (this can also be done using
the separate ranlib command).
Lines 15 and 16 specify how to build the test program using the static library.
The -static option specifies that no dynamic libraries should be used, -L.
says to add the current directory to the set of directories in which libraries are
searched for, and the -lvec option specifies the libvec.a library.
Lines 18 and 19 specify how to build the test program using the dynamic library.
Since no -static option is used, it will use dynamic libraries and link with
libvec.so in the current directory.
Build all the targets by typing make. Once again, do a ls -l listing to observe
the significant difference in size between the statically-linked executable and the
dynamically-linked executable.
Now run the statically-linked executable by typing a command like ./testvec-static 1 2 3
which should print out the sum and product of the vector [1, 2, 3] with itself.
Try the same thing with the dynamically-linked executable by typing a command like ./testvec-dynamic 1 2 3. You will get an error saying that it
cannot find the dynamic library libvec.so. This shows that the library is
needed at run time, not just at link time.
You will need to tell the system to add the current directory to the set of directories which are searched for libraries when running the program. One way
to do so is to add the current directory to the LD_LIBRARY_PATH environment variable. If using a sh-based shell (your shell prompt will usually contain
a $ character), simply type LD_LIBRARY_PATH=. ./testvec-dynamic 1 2 3
which should work. If using a csh-based shell (your shell prompt contains a %
character) you will need 2 commands:
% setenv LD_LIBRARY_PATH .
% ./testvec-dynamic 1 2 3
You can list out the dynamic dependencies of a dynamically-linked executable
by using the ldd command. Type ldd testvec-dynamic to see what libraries
the testvec-dynamic executable depends on. Then set up your environment so that
ldd no longer reports libvec.so as not found.
1.3.7 Exercise 6: Symbol Definitions using nm
Stay in the same libvec directory as the previous exercise.
Produce a nm dump of the symbols in the statically-linked executable by running
nm testvec-static >testvec-static.nm. Look at testvec-static.nm using
a text editor and find the definitions for the symbols main, addvec and multvec.
Then run gdb on that executable and print out the address of main, addvec and
multvec using p &main, p &addvec and p &multvec. You should see that the
addresses match the values in the nm dump.
Do the same thing with the dynamically-linked executable: specifically, produce a nm dump of testvec-dynamic using nm testvec-dynamic >testvec-dynamic.nm
and use a text editor to find the definitions for the symbols main,
addvec and multvec. You should see that main is defined but addvec and
multvec are not yet defined; they can only be defined once the dynamic library
libvec.so has been loaded. Run gdb on the dynamic executable using gdb
testvec-dynamic. Before you start the program print out the address of main,
addvec and multvec. The address of main should print out fine, but the other
two should print out only partial information.
Before you can run the program inside gdb for the dynamically-linked executable, you will need to set the LD_LIBRARY_PATH using set env LD_LIBRARY_PATH .
at the gdb prompt. Then put a breakpoint at main() using b main
and run the program using r 1 2 3. When the program stops at main(), you
should once again print out the addresses of main, addvec and multvec. You
will now see that the latter two are defined and are at much higher addresses
than main, showing that they are in a different memory area (or segment) from
the text segment containing main.
1.3.8 Exercise 7: Building Your Own Libraries
Change over to the libgeom directory. It contains a specification file geom.h for
routines which calculate the perimeter and area of circles and rectangles and
implementation files circ.c and rect.c. It also contains a crude interactive test
program testgeom.c.
Create a Makefile which
1. Will build a static library libgeom.a containing the circ.o and rect.o
object files.
2. Will build a dynamic library libgeom.so containing the circ.o and
rect.o object files.
3. Will build a statically-linked executable testgeom-static which contains
the test program linked with the static library.
4. Will build a dynamically-linked executable testgeom-dynamic which contains the test program linked with the dynamic library.
You can use the libvec Makefile as a starting point.
Build all of the above targets by running make on your Makefile. Test your
static and dynamic executables.
1.4 References
Text, chapter 7