The Dynamic Cast Story

Summary

This section describes the dynamic_cast problems that appear on Linux with recent compilers. The typical symptom is a dynamic_cast that should succeed (the object really is of the derived type) but fails. Mpacts should now handle this problem in a way that is mostly transparent to users. However, related issues might still pop up in the future, which is why this section exists.

The problem: Multiple definitions

The reason why dynamic_casts fail on Linux when they should succeed lies in a ‘new’ policy adopted by recent compilers on Linux. On Windows, and with older compilers on Linux, the legality of a dynamic cast is verified by comparing the type_info::name() strings of the object’s type and of the requested target class. With recent compilers on Linux, the type_info pointers are compared directly instead, which makes dynamic_casts faster.

In principle C++ code should honor the “one definition” rule, stating that any symbol may be defined only once in any executable. Most compilers also enforce this rule at the linking stage, and will indeed report duplicated symbols.

Python modules, however, are in fact small shared libraries that are not linked against each other. If a symbol is defined in more than one module, the compiler/linker reports no error. Moreover, if two modules contain the same class definition, both type_info::name() strings are identical, but the two type_info pointers differ.

This results in a failing dynamic_cast on Linux, while everything works perfectly on Windows. The failure can be hard to track down, since it is not expected to occur. A more elaborate description can be found in [gcc-dso].
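The difference between the two comparison strategies can be illustrated with a minimal sketch. The helper names same_type_by_address and same_type_by_name are hypothetical and not part of Mpacts or of any runtime; the real checks happen inside the C++ runtime library. The sketch only shows why two type_info objects with the same name but different addresses are treated differently by the two strategies.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Hypothetical helper: address-based comparison, as described for recent
// compilers on Linux. Two copies of the same class in two modules have
// two distinct type_info objects, so this returns false for them.
inline bool same_type_by_address(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Hypothetical helper: name-based comparison, as described for Windows.
// Duplicated definitions carry the same name, so this still returns true.
inline bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}

// Usage inside a single binary, where only one definition of each class exists.
inline void check_name_comparison() {
    Derived d;
    Base& b = d;  // typeid on a polymorphic reference yields the dynamic type
    assert(same_type_by_name(typeid(b), typeid(Derived)));
    assert(!same_type_by_name(typeid(b), typeid(Base)));
}
```

Within one binary both strategies normally agree. The problem described above only arises when the same class is compiled into two separately loaded modules.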

The possible solutions

The remainder of this section proposes several possible solutions, each with its advantages and disadvantages.

Avoid the use of multiple definitions

In principle the cleanest way to solve this problem is to avoid having multiple definitions in the first place.

Advantages:

  • Theoretically the cleanest solution

Disadvantages:

  • The user needs to be aware of this issue.
  • The modularity of Mpacts requires some symbols to be duplicated in order to have a minimal module interdependence. Current wisdom says that we don’t need to export all symbols; classes needed in multiple modules need to be defined in one of the base libraries (and their symbols explicitly exported, see libspec.h).
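As an illustration of explicitly exporting a class from a base library, the following is a minimal sketch. The macro names BASELIB_API and BASELIB_BUILD and the classes Shape and Sphere are hypothetical; the actual macros live in libspec.h and may look different. The sketch assumes GCC/Clang visibility attributes on Linux and declspec on Windows.

```cpp
#include <cassert>

// Hypothetical: normally passed by the build system (e.g. -DBASELIB_BUILD)
// only while compiling the base library itself.
#define BASELIB_BUILD

#if defined(_WIN32)
  #if defined(BASELIB_BUILD)
    #define BASELIB_API __declspec(dllexport)
  #else
    #define BASELIB_API __declspec(dllimport)
  #endif
#else
  // Make the class symbols (including its type information) visible to other modules.
  #define BASELIB_API __attribute__((visibility("default")))
#endif

// Declared in a header of the base library and exported.
class BASELIB_API Shape {
public:
    virtual ~Shape();  // defined out of line, in exactly one .cpp of the base library
};

class BASELIB_API Sphere : public Shape {
public:
    ~Sphere() override;  // likewise defined out of line
};

// These definitions belong in one translation unit of the base library.
Shape::~Shape() = default;
Sphere::~Sphere() = default;

// Usage: modules that link against the base library share this one definition.
inline void check_cast_through_base() {
    Sphere s;
    Shape* p = &s;
    assert(dynamic_cast<Sphere*>(p) == &s);
}
```

Defining at least one virtual function out of line keeps the class’s type information in a single library rather than in every module that includes the header, at least with compilers that follow the Itanium C++ ABI (GCC, Clang).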

Provide a dynamic cast from within the module

This solution requires each module (shared library) to provide a static member function that performs the actual cast and returns a pointer to the derived class. The idea is that a derived object is always created from within this module, since the module contains the constructor. When the cast is guaranteed to use the symbol definition of this module (i.e. the cast is performed inside the module by the function just described), the dynamic_cast will always succeed.
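A minimal sketch of such a cast function is given here. The class names Particle and Sphere and the function name cast are hypothetical and only serve as illustration. The important assumption is that the function body is compiled into the module that also contains the constructor of the derived class, so it is defined out of line rather than in the header.

```cpp
#include <cassert>

class Particle {
public:
    virtual ~Particle() = default;
};

class Sphere : public Particle {
public:
    explicit Sphere(double radius) : radius_(radius) {}
    double radius() const { return radius_; }

    // Only declared in the header; other modules call this instead of
    // using dynamic_cast themselves.
    static Sphere* cast(Particle* p);

private:
    double radius_;
};

// Belongs in the module's own .cpp file, so that the dynamic_cast is
// resolved against this module's definition of Sphere.
Sphere* Sphere::cast(Particle* p) {
    return dynamic_cast<Sphere*>(p);  // nullptr if p is not a Sphere
}

// Usage from calling code.
inline void check_module_cast() {
    Sphere s(2.0);
    Particle* p = &s;
    Sphere* back = Sphere::cast(p);
    assert(back != nullptr && back->radius() == 2.0);
}
```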

Advantages:

  • The solution is relatively safe conceptually: if the static cast function is systematically created and used, its behavior is very similar to that of regular dynamic casts.

Disadvantages:

  • The user needs to be aware of this issue.
  • Places a heavy burden on the developer, since the function needs to be created for any object that will be cast.
  • There is no clean way to prevent the user from using dynamic_cast directly instead of the static function that does the cast. (Redefining dynamic_cast with a macro could be considered, but since defines are evil, this could be really dangerous. Moreover, it would make legal uses of dynamic_cast within a single module very difficult.)
  • Dynamic casts are also used implicitly, for instance in try/catch statements, and will fail horribly there (even with dynamic_cast redefined).

Don’t use dynamic casts

Dynamic casts can be avoided (to some extent) and replaced by static casts combined with a method that verifies that the cast is legal. This verification method could be a virtual function that returns a custom type descriptor that can be checked at runtime.
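The idea can be sketched as follows. All names (ShapeKind, kind, static_kind, checked_cast) are hypothetical and not taken from Mpacts. The sketch only accepts an exact type match; supporting casts to intermediate base classes would need a richer descriptor.

```cpp
#include <cassert>

// Custom type descriptor: one enumerator per class that can be cast to.
enum class ShapeKind { Sphere, Box };

class Shape {
public:
    virtual ~Shape() = default;
    virtual ShapeKind kind() const = 0;  // the runtime-checkable descriptor
};

class Sphere : public Shape {
public:
    static constexpr ShapeKind static_kind = ShapeKind::Sphere;
    ShapeKind kind() const override { return static_kind; }
};

class Box : public Shape {
public:
    static constexpr ShapeKind static_kind = ShapeKind::Box;
    ShapeKind kind() const override { return static_kind; }
};

// static_cast guarded by the descriptor; no type_info is consulted,
// so duplicated type_info objects cannot make it fail.
template <class T>
T* checked_cast(Shape* p) {
    if (p != nullptr && p->kind() == T::static_kind) {
        return static_cast<T*>(p);
    }
    return nullptr;
}

// Usage from calling code.
inline void check_descriptor_cast() {
    Sphere s;
    Shape* p = &s;
    assert(checked_cast<Sphere>(p) == &s);
    assert(checked_cast<Box>(p) == nullptr);
}
```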

Advantages:

  • Since there are not that many cases where this is relevant (the number of types would be less than 20), this solution could work.

Disadvantages:

  • The user needs to be aware of this issue.
  • There is no way to prevent the user from using dynamic_cast. Even redefining it is not an option.
  • Places a heavy burden on the developer, since the function needs to be overridden for any object that will be cast to.
  • Dynamic casts are also used implicitly, for instance in try/catch statements; these will still fail inexplicably.

Use “weak linkage” and load symbols globally

On Linux it is possible to define all symbols as “weak” symbols. A weak definition is used only if no other definition is available; if a definition already exists, the existing one is used.

While this might sound as if it solves the problem completely, it does not. On Linux, a library is by default loaded “locally”, with its own private ‘namespace’, to avoid namespace clashes. In effect, if two libraries with duplicated symbols are loaded, the duplicated symbols end up in different namespaces and the weak linkage has no effect.

This is why the modules, after being compiled with weak linkage, should be loaded globally.
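At the level of the dynamic loader, loading a library globally corresponds to passing RTLD_GLOBAL to dlopen. The sketch below shows this from C++; the helper names global_load_flags and load_globally are hypothetical. It is an illustration of the mechanism only: Python modules are loaded by the interpreter itself, which exposes the same flags through sys.setdlopenflags, so the snippet is not how Mpacts loads its modules.

```cpp
#include <cassert>
#include <dlfcn.h>

// Flags for a global load: symbols of the library become available for
// resolving references in libraries that are loaded afterwards.
inline int global_load_flags() {
    return RTLD_NOW | RTLD_GLOBAL;
}

// Hypothetical helper: load a shared library globally.
// Returns nullptr on failure; dlerror() then describes the problem.
inline void* load_globally(const char* path) {
    return dlopen(path, global_load_flags());
}

// Sanity check on the flag combination.
inline void check_global_flags() {
    assert((global_load_flags() & RTLD_GLOBAL) != 0);
    assert((global_load_flags() & RTLD_NOW) != 0);
}
```

On older glibc versions, code that calls dlopen has to be linked with -ldl.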

Advantages:

  • Effectively solves the dynamic_cast problem completely.
  • No user awareness required.

Disadvantages:

  • Globally loaded libraries can result in namespace clashes, with subtle and very hard to find bugs. This is especially true if all subsequently loaded libraries, in particular non-Mpacts libraries, are also loaded globally.

This solution was previously selected, but the effects of loading symbols globally (numpy/scipy!) were too severe. We now (try to) stick to the first solution, i.e. not exporting any duplicated symbols. This means that any class that is potentially cast between two modules has to be defined in a common base library (libETility, libArraythmetic, libMpacts).

Solution implementation details

The module import is overloaded using the import hooks described in [PEP302].

References

[PEP302] New Import Hooks. http://www.python.org/dev/peps/pep-0302/
[gcc-dso] dynamic_cast, throw, typeid don’t work with shared libraries. http://gcc.gnu.org/faq.html#dso