Different Ways of Building Woe32 DLLs

Bruno Haible
<bruno@clisp.org>
Started: 2006-08-25
Last modified: 2006-09-03

Abstract: We present different ways of building shared libraries on Woe32 platforms, and make suggestions for libtool and gcc.

Prerequisite knowledge: Some basic knowledge about how linkers work and about the implementations of shared libraries. A good introduction is the book "Linkers and Loaders" by John R. Levine, online at http://www.iecc.com/linker/. Knowledge about ELF is also useful.

Woe32 Introduction

By Woe32, here, we mean those platforms that use the Win32 DLL format; the ones discussed below are cygwin, mingw, and msvc.

Building shared libraries on Woe32 is well-known to be error-prone. Why?

One of the reasons is that on Woe32 with MSVC, there are three different ABIs, and the one which nearly everyone uses - the third one - is not the default.

The second reason is that the include file for a shared library on Woe32 and the include file for the same library, compiled as a static library (.a), are different: the include file for a shared library must have __declspec(dllimport) attributes at various places, whereas the include file for the static library must not.

But this is not all. Even if you are careful enough to always use the right ABI flags and the correct include file, you'll get hassles. More on this below.

ELF Introduction

Most Unix platforms nowadays (Solaris, Linux, FreeBSD, OpenBSD, NetBSD, ...) use the ELF format for object files, libraries and executables. Why is building shared libraries a breeze on ELF platforms? Just use
gcc -shared -o libfoo.so obj1.o obj2.o ...
Explanation: The ELF format was designed with the requirement that a program consisting of several C modules, separated into a main executable and any number of disjoint shared libraries, should behave semantically identically to the same program linked statically. Furthermore, building a shared library should be possible without changes to the source code; only different compiler and linker options should be needed.

Another requirement is central in ELF: If a symbol (= function or variable with external linkage) in the main executable has the same name as the one from a shared library, the one in the executable takes precedence.

Based on these requirements, the ELF format was designed to have

ELF and variables

We won't go into all the details; we only look at what happens to variable accesses. Assume we have a variable in the same shared library:
extern int samelibvar[5];
and a variable in a different shared library:
extern int externvar[5];
What does the code to access the second element of such an array look like?
int get_samelibvar() { return samelibvar[1]; }
int get_externvar() { return externvar[1]; }
Compiled with "gcc -O2 -fomit-frame-pointer -fPIC" on x86 platforms:
get_samelibvar:
        call    .L2
.L2:
        popl    %ecx
        addl    $_GLOBAL_OFFSET_TABLE_+[.-.L2], %ecx
        movl    samelibvar@GOT(%ecx), %eax
        movl    4(%eax), %eax
        ret

get_externvar:
        call    .L4
.L4:
        popl    %ecx
        addl    $_GLOBAL_OFFSET_TABLE_+[.-.L4], %ecx
        movl    externvar@GOT(%ecx), %eax
        movl    4(%eax), %eax
        ret
The first three instructions in each function are PIC boilerplate. They could also read
        leal    _GLOBAL_OFFSET_TABLE_-get_externvar(%eip), %ecx
if the x86 had such an instruction.

The next instruction is the most interesting. You can see here that
  1. The code for both variables is the same. This is because samelibvar could actually be overridden by another samelibvar in the executable! The compiler therefore has no room for optimization. (Better optimizations are possible, with gcc-4's -fvisibility option, but it requires source code modifications.)
  2. The code fetches the address of the variable externvar from a pointer-wide memory location, symbolically denoted externvar@GOT, in the current shared library's GOT. There is an externvar@GOT in every shared library that uses the variable; at runtime they all point to the same memory location.
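
The indirection described in point 2 can be modelled in plain C. This is only a sketch of the idea, not what the compiler or linker emits: the names externvar_storage and got_entry_externvar are hypothetical stand-ins for the variable's storage in the other shared library and for the externvar@GOT slot.

```c
/* Model of a GOT-style variable access.
   All names here are illustrative assumptions. */

/* Stands in for the storage of externvar inside the other shared library. */
static int externvar_storage[5] = { 11, 22, 33, 44, 55 };

/* Stands in for the pointer-wide slot externvar@GOT.  At run time the
   dynamic linker would fill it in; here it is initialized statically. */
static int *got_entry_externvar = externvar_storage;

/* Counterpart of:
     movl externvar@GOT(%ecx), %eax
     movl 4(%eax), %eax                */
int get_externvar (void)
{
  int *address = got_entry_externvar;  /* load the address from the slot */
  return address[1];                   /* load the second element */
}
```

The function performs two loads, one for the address and one for the value, which matches the two movl instructions in the listing above.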

Woe32 and variables

Now, for comparison, what does this look like on a Woe32 platform, with the PE format? The variable declarations indicate the location of the variable with respect to the current compilation unit:
extern __declspec(dllexport) int samelibvar[5];

extern __declspec(dllimport) int externvar[5];
Compiled with "gcc -O2 -fomit-frame-pointer -fPIC" on cygwin or mingw, or with "cl -O2 -Oy" on msvc:
_get_samelibvar:
        movl    _samelibvar+4, %eax
        ret

_get_externvar:
        movl    __imp__externvar, %eax
        movl    4(%eax), %eax
        ret
You can see here that
  1. The code for the samelibvar access is optimized. No indirection is needed, since the variable resides in the same library.
  2. The code fetches the address of the variable externvar from a pointer-wide memory location called _imp__externvar. As in the ELF case, there may be many of these _imp__externvar variables, but they all point to the same externvar.

"Premature optimization is the root of all evil." -- Donald Erwin Knuth

Look again at the two assembly snippets for get_samelibvar and get_externvar.
  1. The code for get_samelibvar is the same as when no __declspec had been given at all. It's the default code. In other words, __declspec(dllexport) does not immediately lead to different code generation. Rather, its primary purpose is that when the variable is actually defined, it will be marked as "exported" in the object file and likewise later in the shared library.
  2. The code for get_externvar is longer than the code for get_samelibvar. This makes it impossible to convert from the default assembly code to this one in the linker. (In theory this would be possible. Techniques like "linker relaxation" allow the linker to change code. But the compiler has to know that it must not simplify the difference of two addresses of labels in the same function to a numeric constant, but rather leave that to the linker. But then you have a different object file format than PE. [Object file formats such as ELF-Xtensa can store the difference of two addresses in a symbolic way.]) In other words, the __declspec(dllimport) is necessary for the compiler to generate the correct code.
The same __declspec attributes also apply to functions.
extern __declspec(dllexport) int samelibfunc();
extern __declspec(dllimport) int externfunc();
int get_samelibfunc() { return samelibfunc(); }
int get_externfunc() { return externfunc(); }
Again, the default code is used when __declspec(dllexport) is in effect, and the code used when __declspec(dllimport) is in effect has one more indirection. Here is the code, compiled with "gcc -O -fPIC" on cygwin or mingw:
_get_samelibfunc:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        call    _samelibfunc
        leave
        ret

_get_externfunc:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        call    *__imp__externfunc
        leave
        ret
or with "cl -Os -Oy" on msvc:
_get_samelibfunc:
        pushl   %ebp
        movl    %esp, %ebp
        call    _samelibfunc
        popl    %ebp
        ret

_get_externfunc:
        pushl   %ebp
        movl    %esp, %ebp
        call    *__imp__externfunc
        popl    %ebp
        ret
But there is a big, crucial difference. For functions, when you omit the __declspec(dllimport) attribute although the referenced function is in a different shared library, the linker will provide a trampoline function, roughly like this:
_externfunc:
        jmp     *__imp__externfunc
The linker will warn when doing this, but it will do the right thing. Whereas for variables, when you omit the __declspec(dllimport) attribute although the referenced variable is in a different shared library, the linker does nothing, and you're left with a link error: "undefined reference to `externvar'" (gcc) or "unresolved external symbol _externvar" (msvc).

(Technical precision: For every shared library libfoo.dll in PE format, there is a static library libfoo.dll.a that contains only these _imp__* trampoline functions. It is a static library, therefore an _externfunc symbol and an _imp__externfunc pointer memory location will be present in every executable and every shared library that uses externfunc. -- In contrast, on ELF systems, the linker and runtime linker perform this business of indirections automatically and without the need of a .dll.a file. And they also do it for variables!)
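
The two call paths for functions can be modelled in plain C. This is a sketch under stated assumptions, not the code the toolchain generates: externfunc_in_dll, imp_externfunc and externfunc_trampoline are hypothetical names standing in for the function in the other DLL, the _imp__externfunc pointer slot, and the linker-provided _externfunc trampoline.

```c
/* Model of the import trampoline for functions.
   All names here are illustrative assumptions. */

/* Stands in for the real function living in the other DLL. */
static int externfunc_in_dll (void) { return 42; }

/* Stands in for the pointer slot _imp__externfunc, which the loader
   fills in at run time; here it is initialized statically. */
static int (*imp_externfunc) (void) = externfunc_in_dll;

/* Stands in for the trampoline _externfunc:  jmp *__imp__externfunc  */
static int externfunc_trampoline (void) { return imp_externfunc (); }

/* With __declspec(dllimport): one indirect call through the slot. */
int call_with_dllimport (void) { return imp_externfunc (); }

/* Without it: a direct call to the trampoline, which then goes
   through the slot. */
int call_without_dllimport (void) { return externfunc_trampoline (); }
```

Both paths reach the same function; the second merely takes one extra jump. No analogous fallback can be modelled for a variable, since a data access has no place where a jump could be interposed.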

Woe32 problem summary

In summary, some code changes are needed for the sake of shared libraries on Woe32.
  1. You must ensure that the relevant functions and variables are exported from the library.
  2. You must ensure that variables imported from other shared libraries are known as __declspec(dllimport).
These are two different problems. They can be solved together, but they can also be solved separately.

Woe32 problem 1: Exports

There are four ways to export functions and variables from a Woe32 shared library. (See also the GNU binutils documentation ld.info sections "`ld' and WIN32 (cygwin/mingw)" and "Options Specific to i386 PE Targets".)
  1. Export them all. This works only when using the GNU linker, not (IIRC) when using the MS linker. The linker option that triggers this is called --export-all-symbols. It is also the default behaviour if no symbol is explicitly exported, i.e. if the library would otherwise have no exported symbols.
  2. Export a selected set of symbols. How?
    1. By use of __declspec(dllexport) in the declaration or the definition of each such symbol.
    2. By use of asm statements that write into the .drectve section:
      asm (".section .drectve\n");
      asm (".ascii \" -export:variable,data\"\n");
      for a variable,
      asm (".section .drectve\n");
      asm (".ascii \" -export:function\"\n");
      for a function. This works only when using GCC, not MSVC.
    3. By use of an export list. This is a .def file that has, in particular, an EXPORTS statement followed by a list of all exported symbols.
Evaluation / Comparison:

Method 1 is clearly the easiest to put in place when portability only to cygwin and mingw is desired. No source code modifications. At most one modification in Makefile.am:
libfoo_la_LDFLAGS += -Wl,--export-all-symbols

Method 2a has the drawback that it requires source code modifications of the include files; however, these modifications can be applied automatically (change "extern" to "extern LIBFOO_DLL_EXPORTED"). Its advantage is that it can work with C++ as well.
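
A minimal sketch of method 2a might look as follows. LIBFOO_DLL_EXPORTED is the macro name used above; the platform guard and the symbols foo_version and foo_add are assumptions made for illustration only.

```c
/* Sketch of method 2a.  The guard and the declared symbols are
   illustrative assumptions. */
#if defined _WIN32 || defined __CYGWIN__
# define LIBFOO_DLL_EXPORTED __declspec(dllexport)
#else
# define LIBFOO_DLL_EXPORTED
#endif

/* In the header: "extern" has become "extern LIBFOO_DLL_EXPORTED". */
extern LIBFOO_DLL_EXPORTED int foo_version;
extern LIBFOO_DLL_EXPORTED int foo_add (int a, int b);

/* In the library's source file: ordinary definitions. */
int foo_version = 1;
int foo_add (int a, int b) { return a + b; }
```

On platforms other than Woe32 the macro expands to nothing, so the same header can be used unchanged there.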

Methods 2b and 2c don't work for C++, because of the name mangling.

Method 2c additionally has the drawback that it works in a single configuration only; a library cannot export different sets of symbols depending on configuration settings.

Recommendation:

If msvc is not among the target platforms, use method 1.

If msvc is a target platform, use method 2a.

Woe32 problem 2: Exported variables

There are four ways to deal with the requirement to declare exported variables specially. (See also the GNU binutils documentation ld.info section "`ld' and WIN32 (cygwin/mingw)".)
  1. Don't export variables at all. Export only functions from the shared library.
  2. Use the GNU ld --enable-auto-import option.  It is the default on Cygwin since July 2005.
  3. Define a macro that expands to __declspec(dllexport) when building the library and to __declspec(dllimport) when building code outside the library, and use it in all header files of the library that declare variables.
  4. Define a macro that expands to __declspec(dllimport) always, and use it in all header files of the library that declare variables.
Evaluation / Comparison:

Method 1 has the drawback of severely affecting the programming style in use. It does not let the programmer use full ANSI C. It lets one platform dictate the code style on all platforms.

Method 2 has three fatal drawbacks:

Method 3 has the drawback that it makes a distinction between code inside and code outside the library. Thus the boundaries of the library have an influence on the source.

Method 4 has the benefit that the partitioning of the source files into libraries (which source file goes into which library) does not affect the source code; only the Makefiles reflect it. The performance loss due to the unnecessary indirection for references to variables from within the library defining the variable is acceptable.

Recommendation:

Method 1 is unacceptable.

Method 2 is unacceptable as well. But since it's enabled by default, we have to disable it, through the gl_WOE32_DLL autoconf macro, contained in the file woe32-dll.m4.

Method 3 is acceptable if
  1. the header files are unique to this library (not shared with other packages), and
  2. the library sources are contained in one directory, making it easy to define a -DBUILDING_LIBXYZ flag for the library. Example:
    #ifdef BUILDING_LIBASPRINTF
    # define LIBASPRINTF_DLL_EXPORTED __declspec(dllexport)
    #else
    # define LIBASPRINTF_DLL_EXPORTED __declspec(dllimport)
    #endif
Method 4 is always acceptable.
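
For comparison, a sketch of method 4 could look like this. The macro name LIBXYZ_DLL_VARIABLE, the platform guard, and the symbols xyz_counter and read_counter are assumptions chosen for illustration. The declaration and the definition are shown in one block for brevity; in a real package they would sit in the header and in a library source file respectively.

```c
/* Sketch of method 4.  The macro name, the guard and the symbols
   are illustrative assumptions. */
#if defined _WIN32 || defined __CYGWIN__
# define LIBXYZ_DLL_VARIABLE __declspec(dllimport)
#else
# define LIBXYZ_DLL_VARIABLE
#endif

/* In the header: the same declaration is seen by code inside and
   outside the library; no BUILDING_LIBXYZ flag is involved. */
extern LIBXYZ_DLL_VARIABLE int xyz_counter;

/* In a library source file: an ordinary definition. */
int xyz_counter = 7;

/* Any code, inside or outside the library, accesses it the same way. */
int read_counter (void) { return xyz_counter; }
```

Compiled as a single file on a Woe32 platform, this block is itself a self-reference: the variable is defined where it was declared with dllimport.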

Use method 3 or 4, depending on the circumstances.

The catch

So, why isn't method 4 in wider use? The reasons are that
  1. the compiler signals warnings, making the developer think that he is on the wrong path,
  2. libtool fails to handle self-references, i.e. references to a symbol from within the shared library that exports the symbol lead to a link error.

Libtool suggestion

When symbols (both variables and functions) are referenced from within the shared library that exports the symbols, link errors about the _imp__* indirections result. Here is an example:

/* ============== shared.h ============== */
extern __declspec(dllimport) int externvar[5];
extern __declspec(dllimport) int externfunc(int);

/* ============== shared.c ============== */
#include "shared.h"

extern int externfunc2 (int);  /* defined in shared2.c */

int externvar[5] = { 11, 22, 33, 44, 55 };

int externfunc (int x)
{
  if (x == 0)
    return 42;
  else
    return externfunc2 (x);
}

/* ============== shared2.c ============== */
#include "shared.h"

int externfunc2 (int x)
{
  return externvar[x] + externfunc (0);
}

Compile with these commands:
mingw-libtool --mode=compile i386-pc-mingw32-gcc -O2 -c shared.c
mingw-libtool --mode=compile i386-pc-mingw32-gcc -O2 -c shared2.c
objects="shared.lo shared2.lo"
mingw-libtool --mode=link i386-pc-mingw32-gcc -O2 -Wl,--disable-auto-import -o libshared.la -rpath `pwd` -no-undefined $objects

This yields the link errors:
.libs/shared2.o:shared2.c:(.text+0x8): undefined reference to `_imp__externvar'
.libs/shared2.o:shared2.c:(.text+0xe): undefined reference to `_imp__externfunc'

The problem is that the linker generates _imp__* pointer variables only for exported symbols it finds among the dependencies. (It probably does so to be compatible with the MSVC linker, and the MSVC linker doesn't care about this situation since Microsoft recommends method 3, not method 4.)

How can the linker be told to generate _imp__* pointer variables also for exported symbols in the library itself? It is a kind of chicken-and-egg problem, since as long as the library is not built, it cannot be used as a dependency of itself...

The trick is to add these _imp__* pointer variables right before linking, based on the symbol lists of the constituent object files. Here is a first attempt:

lo_to_o='s,\.lo$,.o,'
(for o in $objects; do
   i386-pc-mingw32-nm .libs/`echo $o | sed -e "$lo_to_o"`
 done
) > nm-output
cat nm-output | sed -n -e 's,^.* U __imp__\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > needed-imps
cat nm-output | sed -n -e 's,^.* T _\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > defined-text-syms
cat nm-output | sed -n -e 's,^.* [DR] _\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > defined-data-syms
cat defined-text-syms defined-data-syms | LC_ALL=C sort > defined-syms
LC_ALL=C join needed-imps defined-syms | \
  sed -e 's/^\(.*\)$/extern int \1; void *_imp__\1 = \&\1;/' > exports.c
i386-pc-mingw32-gcc -O2 -fomit-frame-pointer -S exports.c
mingw-libtool --mode=compile i386-pc-mingw32-gcc -O2 -fomit-frame-pointer -c exports.c
mingw-libtool --mode=link i386-pc-mingw32-gcc -O2 -Wl,--disable-auto-import -o libshared.la -rpath `pwd` -no-undefined $objects exports.lo

This could basically work. But apparently, those symbols for which an _imp__ indirection is already defined in the DLL are not put into the libshared.dll.a. This leads to link errors later, when linking against libshared. To work around this, we need to export these symbols explicitly. And then, of course, we need --export-all-symbols because it is no longer the default.

lo_to_o='s,\.lo$,.o,'
(for o in $objects; do
   i386-pc-mingw32-nm .libs/`echo $o | sed -e "$lo_to_o"`
 done
) > nm-output
cat nm-output | sed -n -e 's,^.* U __imp__\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > needed-imps
cat nm-output | sed -n -e 's,^.* T _\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > defined-text-syms
cat nm-output | sed -n -e 's,^.* [DR] _\(.*\),\1,p' | LC_ALL=C sort | LC_ALL=C uniq > defined-data-syms
cat defined-text-syms defined-data-syms | LC_ALL=C sort > defined-syms
(
  LC_ALL=C join needed-imps defined-text-syms | \
    sed -e 's/^\(.*\)$/asm (".section .drectve\\n"); asm (".ascii \\" -export:\1\\"\\n"); extern int \1; void *_imp__\1 = \&\1;/'
  LC_ALL=C join needed-imps defined-data-syms | \
    sed -e 's/^\(.*\)$/asm (".section .drectve\\n"); asm (".ascii \\" -export:\1,data\\"\\n"); extern int \1; void *_imp__\1 = \&\1;/'
) > exports.c
i386-pc-mingw32-gcc -O2 -fomit-frame-pointer -S exports.c
mingw-libtool --mode=compile i386-pc-mingw32-gcc -O2 -fomit-frame-pointer -c exports.c
mingw-libtool --mode=link i386-pc-mingw32-gcc -O2 -Wl,--disable-auto-import -Wl,--export-all-symbols -o libshared.la -rpath `pwd` -no-undefined $objects exports.lo

This second attempt does work.

Libtool should incorporate this functionality of resolving self-references. It's a major simplification for the developer, because method 4 is so much simpler than method 3.

Note that this will work equally well in C++ as in C, since the name mangling does not matter.

Further note: We were only interested in variables, not functions, because functions are already handled well without __declspec(dllimport). But the manipulations should be performed on functions as well, because in C++, the developer might need to use __declspec(dllimport) for an entire class, and the class defines both variables (called "static data members" in C++) and functions (called "member functions" in C++).

When the option -export-symbols or -export-symbols-regex is passed to libtool, the asms for exporting the symbols should only be used for the symbols designated for export; for symbols that are not to be exported (but have nevertheless been marked __declspec(dllimport) by the programmer) the code without asms from the first attempt should be used.
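
The per-symbol choice between the two forms of generated line can be sketched in C. This merely restates the sed expressions from the two attempts above; the function format_exports_line and the enum export_kind are hypothetical names, and nothing here is taken from libtool itself.

```c
#include <stdio.h>

/* How a symbol marked __declspec(dllimport) should be treated.
   These names are illustrative assumptions. */
enum export_kind { NOT_EXPORTED, EXPORTED_FUNCTION, EXPORTED_DATA };

/* Writes into buf the line of exports.c for the symbol sym.
   Returns what snprintf returns. */
int format_exports_line (char *buf, size_t size, const char *sym,
                         enum export_kind kind)
{
  if (kind == NOT_EXPORTED)
    /* Form from the first attempt: only the _imp__ pointer variable. */
    return snprintf (buf, size,
                     "extern int %s; void *_imp__%s = &%s;\n",
                     sym, sym, sym);

  /* Form from the second attempt: an -export directive in the
     .drectve section, followed by the _imp__ pointer variable. */
  return snprintf (buf, size,
                   "asm (\".section .drectve\\n\"); "
                   "asm (\".ascii \\\" -export:%s%s\\\"\\n\"); "
                   "extern int %s; void *_imp__%s = &%s;\n",
                   sym, kind == EXPORTED_DATA ? ",data" : "",
                   sym, sym, sym);
}
```

For example, calling it with "externvar" and EXPORTED_DATA produces a line containing -export:externvar,data, while NOT_EXPORTED produces the line without any asm statement.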

GCC suggestion

On code like this:

extern __declspec(dllimport) int externvar[5];
extern __declspec(dllimport) int externfunc(int);

int externfunc2 (int x)
{
  return externvar[x] + externfunc (0);
}

int externvar[5] = { 11, 22, 33, 44, 55 };

int externfunc (int x)
{
  if (x == 0)
    return 42;
  else
    return externfunc2 (x);
}

gcc reports warnings:

warning: 'externvar' defined locally after being referenced with dllimport linkage
warning: 'externfunc' defined locally after being referenced with dllimport linkage

Once libtool is changed to not cause link errors for self-references, this warning is no longer needed, and gcc should remove it.

Resources

This writeup is available at http://www.haible.de/bruno/woe32dll.html. Test cases are available at http://www.haible.de/bruno/woe32dll.tar.gz.