The goal of this post is to analyse the key differences between linking libraries in .NET and C++. The idea is to explore the pitfalls you face when planning how to structure your components in a reusable way.
The focus is on the C++ side. The .NET aspects are here mostly to demonstrate the differences and to clarify the complexities involved in C++.
This post focuses on C++ for Windows, but the concepts should be very similar on Linux, where a .dll becomes a .so (and a .lib becomes a .a).
When I initially wrote this, I had in mind the integration of unit test frameworks with C++. In C# this is very easy and handy: you add a reference from your code, which lives in its own class library, to another assembly that references the test framework. In C++, a whole new set of problems crops up, which I’ll demonstrate here.
How do you use a library in .NET?
Usually in .NET you can ship your application in several different formats: a console application, a Windows application, a web application, a service, etc. In most cases each one is just a .csproj that compiles to a runnable .exe or .dll, depending on the kind of application: an .exe for console apps, Windows apps and Windows services, and a .dll (combined with some .aspx files and some other framework magic) for web apps.
These application containers (.exe or .dll files) can make use of code in other class libraries. Class libraries are by definition compiled to a .dll and can be “referenced” by application containers. In .NET this is fairly simple: once you create a dependency on a .dll, you can use the namespaces and classes of that class library in your application.
Why is this possible?
Since .NET is a byte-code platform, everything is compiled to IL (intermediate language), which is only compiled to platform-dependent binary code at run time. Exactly: at run time. The first time your application hits a function, the framework compiles its code and replaces the pointer to that function with a pointer to the real binary code, so it doesn’t need to be compiled again. This is what is often referred to as “JIT” (just-in-time) compiling. Some other environments use the same technique, like Node.js and some Python runtimes (PyPy, for example).
The framework also contains the reflection API, which lets you get data about the types exposed by those class libraries at run time, straight from the compiled assembly.
Because of this process, and because all the binaries (including the class libraries referenced by your application) share the same virtual machine, the runtime makes sure that all the platform-dependent code is fully compatible with each other. Sweet, huh?
How to organise your code in libraries in C++
In C++ your code can compile to static libraries (.lib), dynamic libraries (.dll) and what I called application containers, like executables (Windows applications, console applications, UWP applications, etc.). The .lib would be the preferred target if you want to reuse code. You can then link this code into a .dll or an .exe.
So, that is just following the same process, right? Reference the .libs and .dlls in your executables and done!
No. I wish it were as simple as that. First, because you don’t have reflection or any easy way of inspecting what is inside a .lib or a .dll at run time. The only way of knowing what is in there is through header files, at compile time.
A header file is a file that contains the declarations (but not the implementation) of the classes and functions. Usually you declare the classes and functions in the header and implement the function bodies in source files (.c, .cpp). This also involves the concept of forward declarations: in one header you declare that a given name is just a class, for example, but don’t define its members. The compiler allows you to refer to it, as long as the full definition is provided later, in another header.
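To make the header, source and forward-declaration split concrete, here is a minimal sketch squeezed into a single file. The names Logger and Formatter and the file names in the comments are made up for illustration; in a real project each section would live in its own file.

```cpp
#include <string>

// --- logger.h (hypothetical): declarations only ---
class Formatter;  // forward declaration: the members are not known yet

class Logger {
public:
    explicit Logger(const Formatter& formatter);
    std::string log(const std::string& message) const;

private:
    // A reference (or pointer) to an incomplete type is allowed.
    const Formatter& formatter_;
};

// --- formatter.h (hypothetical): the full definition arrives later ---
class Formatter {
public:
    std::string format(const std::string& text) const {
        return "[log] " + text;
    }
};

// --- logger.cpp (hypothetical): the function bodies ---
Logger::Logger(const Formatter& formatter) : formatter_(formatter) {}

std::string Logger::log(const std::string& message) const {
    // Formatter must be a complete type here, because a member is called.
    return formatter_.format(message);
}
```

The Logger declaration only needs to know that Formatter is a class. The complete definition is required only where its members are actually used.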
So, to “reference” headers, you add include paths to the application you are building. In other words, when you compile something in C++ you need to tell the compiler the paths of all the headers needed by the particular binary you are building.
So, it is easy, then. Reference the headers, .libs and .dlls and job done. Not quite yet. There’s another important thing: binary compatibility.
What is the idea of binary compatibility in C++?
Imagine you compile a set of classes into a static library (.lib) with compiler “A”. Then you use the same headers and link this .lib into an executable, but you compile the executable with compiler “B”. It is not guaranteed to work.
The reason is that shared types may not have the same implementation or the same memory layout. The implementation of std::string in compiler “A” might be completely different from the one in compiler “B”, even though they look the same when you use them (same members, method signatures, etc.). From a binary point of view, they might not match.
Now multiply this problem by different versions of compilers, different versions of the standard library and different versions of the runtime! You can have a very interesting time with this. And remember, this is C++. If a problem happens, you will probably get a crash (if you are lucky), or your application will start behaving in a funny way and you will have no idea why (if you are unlucky).
The same problem applies to .dlls. Even though they look quite reusable, they may not be, because their interface can’t safely expose any standard library type or any other complex type that can have binary compatibility problems. That’s why most DLLs expose only very basic types, like char* to represent strings, integers and doubles: on a given platform, those scalar types have a common definition across compilers.
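As a rough sketch of that style of interface, the function below keeps std::string as an internal detail and exposes only char pointers and integers. The name make_greeting and its error convention are invented for this example. On Windows the declaration would normally also carry __declspec(dllexport), which is left out here so the sketch builds with any compiler.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// C linkage keeps the symbol name unmangled, and the signature uses only
// scalar types and raw pointers. The caller owns the buffer.
// Returns the greeting length, or -1 on bad arguments or a too-small buffer.
extern "C" int make_greeting(const char* name, char* buffer,
                             std::size_t buffer_size) {
    if (name == nullptr || buffer == nullptr || buffer_size == 0) {
        return -1;
    }
    // std::string is used freely inside the library, never across its boundary.
    const std::string greeting = std::string("Hello, ") + name;
    if (greeting.size() + 1 > buffer_size) {
        return -1;  // no room for the text plus the terminating '\0'
    }
    std::memcpy(buffer, greeting.c_str(), greeting.size() + 1);
    return static_cast<int>(greeting.size());
}
```

The caller allocates the buffer and the library only writes into it, so no memory allocated by one runtime is ever freed by another.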
Linking in a nutshell
This will look very simple when you read it, but afterwards I’ll explore some of the potential problems.
Looking at the whole process, building an application (and libraries) in C++ goes like this:
- The compiler compiles each source file (.cpp) into an .obj file. It needs the headers to do that. The .obj is just a binary version of the source file. Sometimes a header ends up included more than once in the same source file. That’s why you have “#pragma once” or #ifndef include guards: they stop the header’s contents from being processed twice, which would generate compiler errors.
- There’s a variation of this process which involves “precompiled headers”. I won’t discuss this here.
- A static library (.lib) is just a set of .obj files glued together. In this model, nothing checks whether everything declared in the headers is implemented: no linking happens at this stage, so a missing implementation is not reported as an error.
- A .dll is slightly different. Because the .dll contains something that can be executed (although inside another process), the linker makes sure that all the functions referenced in the code are implemented. If you call a function that a header declares but nothing implements, you get a linker error (prefixed with LNK) telling you that the implementation couldn’t be found.
- For an .exe, the process is the same as for a .dll. In addition, the .exe expects the application entry point (main or WinMain) to be implemented somewhere.
- For both .exe and .dll, if you reference other static libraries (.lib), you have to specify all of these libraries to the linker.
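The include guard mentioned in the first step can be sketched in a single file by pasting the same header text twice, which is effectively what the preprocessor sees when a header is included twice. The Point type, the POINT_H macro and the manhattan function are assumptions made for this example.

```cpp
// First "inclusion" of point.h (hypothetical header).
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Second "inclusion" of the same header text. POINT_H is already defined,
// so the preprocessor skips the body and Point is not defined twice.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Ordinary code that uses the type declared by the header.
inline int manhattan(const Point& p) {
    const int ax = p.x < 0 ? -p.x : p.x;
    const int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}
```

Without the guard, the second copy would be a redefinition of Point and the compiler would reject the file.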
Potential problems linking C++ code
Some errors only happen at link time
Although C++ is very powerful and flexible, it can cause some really interesting issues if you are not careful about how you organise your code.
One example: imagine you define class library 1 with a header that declares a function, but with no source file that implements it.
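A hypothetical header of that kind could look like the sketch below. The name compute_total is invented here and is not taken from any real library.

```cpp
// library1.h (hypothetical): the only place compute_total is mentioned.

// Declared here, but no .cpp file in class library 1 ever defines it.
int compute_total(int a, int b);

// A later call such as compute_total(1, 2) in an .exe or .dll would still
// compile, because the compiler only needs the declaration. The failure
// shows up when that binary is linked: the linker cannot resolve the symbol
// (MSVC reports this as LNK2019, an unresolved external symbol).
```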
Even if you don’t have any .cpp file implementing this function, the library will compile. You will only see an error about the missing implementation when you add class library 1 as a reference to a binary that gets linked (a .dll or an .exe). But you will have no idea which class library declared this function without implementing it.
To make everything even more fun, some developers use this as a feature, as a way of creating an “extension point” in the application. They only declare the function, expecting the implementation to be somewhere else, like a plugin.
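A minimal sketch of such an extension point, with all three roles collapsed into one file. The names plugin_name and describe_plugin are hypothetical; in practice the declaration and the caller would sit in the library, and the definition in the application or plugin that links against it.

```cpp
#include <string>

// --- library header: the extension point is declared, not defined ---
std::string plugin_name();  // whoever links the final binary must define it

// --- library code: calls the function without knowing who implements it ---
std::string describe_plugin() {
    return "Loaded plugin: " + plugin_name();
}

// --- application or plugin code: supplies the missing definition ---
std::string plugin_name() {
    return "csv-exporter";
}
```

If the last function is removed, the library part still compiles on its own. Only linking the final .exe or .dll fails.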
Since you use external headers to compile your application, it is very common for different projects to handle warnings differently. One project may report something as a warning while another disables it; one may use “warnings as errors” while another doesn’t. This means that if, for instance, you want to add FakeIt.hpp (FakeIt is a mocking framework for C++) to your existing project, it looks very straightforward, but if FakeIt produces warnings that your project treats as errors, you’ll need to ignore them manually.
You can work around this by “wrapping” your include in directives like #pragma warning(push), #pragma warning(disable: <number>) and #pragma warning(pop) (for Microsoft VC++), or #pragma clang diagnostic push, #pragma clang diagnostic ignored and #pragma clang diagnostic pop (for Clang).
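Here is a sketch of that wrapping. The warning silenced (unused parameter, C4100 on MSVC) is only an illustration and not necessarily one that FakeIt triggers. The function third_party_answer stands in for the contents of a hypothetical third-party header.

```cpp
// Save the current warning state and silence one warning.
#if defined(_MSC_VER)
#  pragma warning(push)
#  pragma warning(disable : 4100)  // unreferenced formal parameter
#elif defined(__clang__)
#  pragma clang diagnostic push
#  pragma clang diagnostic ignored "-Wunused-parameter"
#elif defined(__GNUC__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wunused-parameter"
#endif

// Stand-in for: #include "third_party.hpp" (hypothetical header)
inline int third_party_answer(int unused) { return 42; }

// Restore the previous warning state so your own code keeps its settings.
#if defined(_MSC_VER)
#  pragma warning(pop)
#elif defined(__clang__)
#  pragma clang diagnostic pop
#elif defined(__GNUC__)
#  pragma GCC diagnostic pop
#endif
```

The push and pop pair keeps the suppression local to the third-party include, so the rest of the project is still checked with its usual warning level.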
Defines and namespaces
Sometimes 3rd party includes redefine or expect some defines like “NULL”, “UINT”, etc. Since those live in the global namespace, you may end up with ambiguous names. The same goes for class names and types.
Using namespaces for your types can save you a lot of time and frustration. If your code has to define some macros of its own (which I’d avoid), make sure they all use an internal name that can be redefined easily in a single file.
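Both pieces of advice can be sketched as follows. The namespace myproject, the macro MYPROJECT_BUFFER_SIZE and the file name config.h are assumptions made for this example. Note that a namespace only protects types and functions; if the third-party name is a macro, it is replaced by the preprocessor everywhere, regardless of namespaces.

```cpp
// Stand-in for a third-party header that puts a type in the global namespace.
typedef unsigned int UINT;

namespace myproject {
// Same name, but no clash, because it lives inside the namespace.
typedef unsigned long long UINT;

// Inside the namespace, UINT refers to myproject::UINT.
inline UINT max_value() { return ~0ULL; }
}  // namespace myproject

// config.h (hypothetical): project macros carry a prefix and are defined in
// one place, so they can be overridden from the build system if needed.
#ifndef MYPROJECT_BUFFER_SIZE
#define MYPROJECT_BUFFER_SIZE 256
#endif
```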
For 3rd party code, I don’t have any interesting strategies to avoid ambiguous defines. You will probably have to change the 3rd party code, which I would avoid if I could.
Standard C Library issues
Sometimes 3rd party code is compiled into a .lib against one runtime library, while the binary that links this .lib uses a different runtime library. You can get linker errors because of that. Make sure they match.
In Microsoft Visual Studio this is controlled by the /MD, /MDd, /MT and /MTd switches (project Properties, C/C++, Code Generation, Runtime Library).
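When you are unsure which setting a given project was built with, one possible trick is to let the code report it. The sketch below relies on the MSVC predefined macros _DLL (defined for /MD and /MDd) and _DEBUG (defined for /MDd and /MTd). The function name runtime_library_flag is made up, and the result is only as reliable as those macros: a project that defines _DEBUG by hand would confuse it.

```cpp
// Reports the runtime library switch this translation unit was compiled with.
inline const char* runtime_library_flag() {
#if !defined(_MSC_VER)
    return "not MSVC";  // the switches below only exist for MSVC
#elif defined(_DLL) && defined(_DEBUG)
    return "/MDd";      // debug, DLL runtime
#elif defined(_DLL)
    return "/MD";       // release, DLL runtime
#elif defined(_DEBUG)
    return "/MTd";      // debug, static runtime
#else
    return "/MT";       // release, static runtime
#endif
}
```

Calling this from both the library and the executable, for example in a start-up log line, makes a mismatch visible.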
I hope this information helps you understand better why some of these issues happen. For people used to C/C++ this may be straightforward, but for people starting with this technology it can be scary.