A Casting Show

In C++ there are two kinds of type conversions: implicit and explicit. The latter are called type casts, and they are what this post is about.

Overview

C++ has the following capabilities for explicit type conversions:

  • The C++ cast operators are keywords defined in the language. While they look like template functions, they are part of the language itself, i.e. the behavior is implemented in the compiler, not in the standard library. There are four of them:
    1. `const_cast`
    2. `reinterpret_cast`
    3. `static_cast`
    4. `dynamic_cast`
  • The C-style and function-style casts. The C-style cast consists of the type you want in parentheses, followed by the expression you want converted into that type, e.g. `(double)getInt()`. The function-style cast works only slightly differently, by stating the target type followed by the source expression in parentheses, i.e. `double(getInt())`. It is equivalent to the C-style cast in every respect, except that the target type has to be a single word, so `unsigned long`, `const double` and any kind of pointer or reference are not allowed.
  • Construction of a temporary value in C++11. It looks similar to the function-style cast: `long{getInt()}` but uses the initializer list with curly braces introduced in C++11. It has a few more restrictions than the function-style cast, e.g. converting to a user-defined type is only possible when the target type has a corresponding constructor, but not when the source type has a corresponding conversion operator.
  • Functions that take a parameter of one type and return an object of another type, representing the same value. While they are technically not real casts, they have the same look and feel and usage, and sometimes even are implemented with casts. Prominent examples are `std::move`, `std::dynamic_pointer_cast` and `boost::lexical_cast`.
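
To put the syntactic forms side by side, here is a minimal sketch. `getInt` stands in for the function used in the examples above and is defined here only for illustration; `Meters` and the `via...` function names are hypothetical.

```cpp
// Hypothetical helper, standing in for the getInt() used in the text.
int getInt() { return 42; }

// Hypothetical user-defined type with a converting constructor.
struct Meters {
  explicit Meters(int v) : value(v) {}
  int value;
};

// C++ cast operator
double viaCastOperator() { return static_cast<double>(getInt()); }

// C-style cast
double viaCStyle() { return (double)getInt(); }

// function-style cast
double viaFunctionStyle() { return double(getInt()); }

// C++11 construction of a temporary with curly braces;
// int to long is not a narrowing conversion, so this is accepted
long viaBraces() { return long{getInt()}; }

// curly braces with a user-defined type call the constructor
Meters viaUserDefined() { return Meters{getInt()}; }

// Note: short{getInt()} would be rejected, because the braces
// forbid narrowing conversions.
```

All of these yield the value 42 in the respective target type; the forms differ only in which conversions the compiler is willing to pick for you.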

The four cast operators represent the basic conversions possible in C++, so I will explain them in detail. The other possibilities will be covered only briefly.

const_cast

This cast has one sole purpose: removing constness from a pointer or reference. In theory, it can also be used to add constness, but since that is possible via an implicit conversion, using an explicit cast for it is not recommended. It is the only cast operator that can remove constness; the other cast operators are not allowed to do so.

void foo(MyClass const& myObject) {
  MyClass& theObject = const_cast<MyClass&>(myObject);
  // do something with theObject
}

Casting away the constness of an object can be dangerous. In the example above, the user of the function will expect their object to remain unchanged. The `const_cast`, on the other hand, gives full write access to the object, so it could be changed. Irresponsible use of `const_cast` can therefore lead to unexpected behavior, hard-to-debug bugs and even undefined behavior.

In many cases `const_cast` is only necessary due to design problems. Const correctness is often missing in legacy code, or it is perceived as hard to get right because developers mix up semantic and syntactic constness or don’t use `mutable` when appropriate.
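
As an illustration of the `mutable` point, here is a minimal sketch of a class whose const method updates an internal cache without any `const_cast`. The `Circle` class and the `areaOf` function are hypothetical and exist only for this example.

```cpp
// Hypothetical class: area() is semantically const (the observable state
// does not change), but it updates an internal cache.
class Circle {
  double radius;
  mutable bool areaCached = false;  // mutable: may change in const methods
  mutable double cachedArea = 0.0;
public:
  explicit Circle(double r) : radius(r) {}

  double area() const {
    if (!areaCached) {
      cachedArea = 3.141592653589793 * radius * radius;
      areaCached = true;  // allowed without const_cast
    }
    return cachedArea;
  }
};

// Works on a const reference, as callers would expect.
double areaOf(Circle const& c) { return c.area(); }
```

Marking only the cache members as `mutable` keeps the rest of the object protected by the compiler, which a `const_cast` on `this` would not do.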

Be suspicious whenever you encounter a const_cast in code. Think twice before you write one.

There are a few cases where `const_cast` is indeed the right thing to do. The best-known case is accessor functions that have a const and a non-const version, where the former returns a const reference (or pointer) and the latter a non-const reference:

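class MyContainer {
  int* data;
  void checkIndex(unsigned index) const; // assumed helper, e.g. throws on a bad index
public:
  // the non-const version delegates to the const version
  int& getAt(unsigned index) {
    auto const_this = static_cast<MyContainer const*>(this);
    return const_cast<int&>(const_this->getAt(index));
  }
  int const& getAt(unsigned index) const {
    checkIndex(index);
    return data[index];
  }
};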

More generally, `const_cast` is then used to access a const object in a way that syntactically might change the object, but you know for sure that it does not. This is mostly restricted to the object’s own methods, since encapsulation demands that outsiders cannot be sure when a non-const operation does not alter the object.

reinterpret_cast

`reinterpret_cast` is the most aggressive, least safe and (hopefully) least used of the four C++ cast operators. It can be used only on integral types, enums, all kinds of pointers including function and member pointers, and null pointer constants like `nullptr`. It is meant to be used to convert types that are otherwise not compatible, i.e. mainly from pointer to int and back, or from pointer to X to pointer to Y and back, where X and Y are unrelated types.

The usual behavior is to just reinterpret the bit representation of the source value as the bit representation of the target value. No checks are applied, which means if you use the cast, you are on your own. For example, you can indeed cast a `car*` into a `duck*`, and casting it back is guaranteed to give you the same `car*`. Actually using the `duck*` will most surely result in undefined behavior. In fact, any use of `reinterpret_cast` that cannot be done via other casts has a bunch of “DANGER” and “Undefined Behavior” signs around it.

I know of only two examples where there is no option but to use `reinterpret_cast`: casting pointer values to int, to log them in the well-known `0x50C0FFEE` format, and storing a pointer where another pointer (or int) is meant to be stored. The latter is e.g. the case in Borland’s VCL, where GUI objects have the ability to store data in a `TObject` pointer. If you want to store a context that is not derived from `TObject`, you have to store and retrieve it by casting your object’s address to and from `TObject*`. `void*` would have been a better choice in my opinion.
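
The logging case can be sketched as follows. This assumes the implementation provides `std::uintptr_t` (it is optional in the standard, but commonly available); `formatAddress` and `roundTrip` are hypothetical helper names, not standard functions.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Format an address as hexadecimal text, e.g. for a log line.
std::string formatAddress(void const* p) {
  // pointer -> integer: only reinterpret_cast can do this
  auto const value = reinterpret_cast<std::uintptr_t>(p);
  std::ostringstream out;
  out << "0x" << std::hex << std::uppercase << value;
  return out.str();
}

// Round trip: pointer -> integer -> pointer of the same type
// yields the original pointer value again.
int* roundTrip(int* p) {
  auto const value = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<int*>(value);
}
```

Keeping the casts inside two small functions like these is one way to follow the advice to encapsulate them.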

If you need to write a `reinterpret_cast`, take a break, go for a walk, and reconsider that decision. If you come back and still want to write it, encapsulate the cast(s) carefully.

static_cast

`static_cast` is the most straightforward cast. Suppose you have an expression `a` of type `A` and want that value converted to type `B`, and the conversion is possible per se, i.e. the types are not unrelated, so you don’t need a `reinterpret_cast`. If the conversion is not implicit, or the compiler cannot select the right implicit conversion because you passed the value to a function whose overloads are preferred over the one you want or make the call ambiguous, then you have to force the conversion explicitly.

If `B` is a user-defined class type, it is common to use a function-style cast or call the conversion constructor explicitly, i.e. `B(a)` or `B{a}`. Both have the same effect as a `static_cast` in this case. In all other cases, i.e. if you convert to or between built-in types, use `static_cast` explicitly. Cases where this is necessary are:

  1. narrowing conversions between numbers (int to short, double to int, …)
  2. conversions between integrals and enums
  3. conversion from `void*` to any other pointer type
  4. downcasts of pointers or references in class hierarchies when you know the dynamic type of the object (see below)

Points 3 and 4 are to be used with caution: If you `static_cast` to a pointer (or reference) of type `T*`, the compiler believes you and assumes you really know that in fact there is a `T` at the address stored in the pointer. If there is something else, it will still treat the bits and bytes at that location as if there was a `T`, causing undefined behavior and hopefully blowing the program up right in your face. (I say hopefully because a crash is much less pain to debug than a silent failure that lets the program just act weird but continue).
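
The four cases listed above might look like this in code. It is a sketch with hypothetical names (`Color`, `Base`, `Derived` and the helper functions), not a recommendation for any particular design.

```cpp
enum class Color { Red = 0, Green = 1, Blue = 2 };

struct Base { virtual ~Base() = default; };
struct Derived : Base { int payload = 7; };

// 1. narrowing conversion between numbers (the fractional part is discarded)
int truncate(double d) { return static_cast<int>(d); }

// 2. conversions between integrals and enums
Color toColor(int i) { return static_cast<Color>(i); }
int toInt(Color c) { return static_cast<int>(c); }

// 3. void* back to the pointer type it originally came from;
//    the caller must guarantee that p really points to an int
int readInt(void* p) { return *static_cast<int*>(p); }

// 4. downcast; only valid because the caller guarantees that
//    b really refers to a Derived object
int payloadOf(Base& b) { return static_cast<Derived&>(b).payload; }
```

In cases 3 and 4 the compiler cannot verify the guarantee stated in the comments; breaking it leads to the undefined behavior described above.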

dynamic_cast

This cast is used for downcasts and crosscasts of pointers and references in class hierarchies. You pass in a pointer of class X, casting it to a pointer of a class somewhere else in the class hierarchy. Casting to a base class (upcast) is implicitly possible and does not need an explicit cast.

Depending on whether the type of the object behind that pointer (called the dynamic type) in fact is of that other class or not, the result of the cast is the new pointer or a null pointer. Of course, if the object is of a type that is derived from the target class, the cast succeeds as well. Since references cannot be null, `dynamic_cast` on a reference throws a `std::bad_cast` exception if the cast does not succeed.

class B {
public:
  virtual ~B() = default; // dynamic_cast needs a polymorphic source type
};
class D1: public B {};
class D2: public B {};

void foo() {
  D1 d1;
  D2 d2;
  B* b1 = &d1;
  B* b2 = &d2;

  D1* d1b1 = dynamic_cast<D1*>(b1); //ok, d1b1 now points to d1
  D1* d1b2 = dynamic_cast<D1*>(b2); //result is a null pointer because *b2 is not a D1

  D1& rd1b2 = dynamic_cast<D1&>(*b2); //throws std::bad_cast
}

People often view the presence of `dynamic_cast` with suspicion because it is often a hint to a flawed design. Many naive applications of `dynamic_cast` can be solved more cleanly with virtual functions.

Before applying a dynamic_cast, check whether the problem at hand can be solved better with virtual functions.
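
To make that advice concrete, here is a small sketch contrasting a type switch built on `dynamic_cast` with the equivalent virtual function. The `Animal` hierarchy and both function names are hypothetical.

```cpp
#include <string>

// Hypothetical hierarchy, for illustration only.
struct Animal {
  virtual ~Animal() = default;
  virtual std::string sound() const = 0;
};
struct Dog : Animal {
  std::string sound() const override { return "Woof"; }
};
struct Cat : Animal {
  std::string sound() const override { return "Meow"; }
};

// Type-switch version: has to be extended for every new derived class.
std::string soundViaCast(Animal const& a) {
  if (dynamic_cast<Dog const*>(&a)) return "Woof";
  if (dynamic_cast<Cat const*>(&a)) return "Meow";
  return "?";
}

// Virtual-function version: each class supplies its own behavior.
std::string soundViaVirtual(Animal const& a) { return a.sound(); }
```

Both functions give the same answer for `Dog` and `Cat`, but only the virtual version keeps working unchanged when another derived class is added.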

Downcasts in class hierarchies

`dynamic_cast`, `static_cast` and `reinterpret_cast` can all three be used to cast a base class pointer or reference into a pointer or reference to a more derived class. So what is the difference between the three?

As shown above, `dynamic_cast` checks if the dynamic type of the object is of the expected class. That check is performed at run time which needs access to run time type information (RTTI) and costs a few CPU cycles. The other two casts occur (almost) purely at compile time and are therefore faster. However, if you do not know the dynamic type of the object, you have no other option.

If you know the dynamic type and the relationship between the two classes is a line of single inheritances, then the two other casts do exactly the same, which is exactly nothing. The new pointer contains the exact same address, it just has another type. However, in case of `static_cast` the compiler checks if that conversion is even possible, i.e. if the target type is indeed a derived class of the source type, so it is more secure than `reinterpret_cast`. The following example will lead to a compiler error:

class B; //forward declaration
class D; //forward declaration

B* pb;
D* pd = static_cast<D*>(pb); //ERROR: B* is not convertible to D*

With multiple inheritance, the memory layout may be such that the address of the derived object differs from the address of a base class subobject:

class B1 { int i; };
class B2 { int j; };

class D : public B1, public B2 {};

void bar() {
  D d;
  B2* pb2 = &d;
  D* pd1 = static_cast<D*>(pb2);
  D* pd2 = reinterpret_cast<D*>(pb2);
}

Let’s assume for simplicity that `sizeof(int)` is 4, and there are no padding bytes, and we are in a typical environment where the subobjects are stored in order in memory. Compared to the address of `d` itself, the offset of the `B1` subobject and its member `i` is 0, i.e. they have the same address. The offset of the `B2` subobject and `j` is 4.

When the compiler sees the line `B2* pb2 = &d;` it knows that offset and performs the implicit conversion from `D*` to `B2*` by adding 4, so that pointer does indeed point to the `B2` subobject. The `static_cast` is doing the exact opposite: The compiler subtracts 4 and `pd1` has again the address with offset 0, pointing correctly to `d`. The `reinterpret_cast` on the other hand will preserve the value of `pb2`, so `pd2` will contain the same address, pointing to offset 4 and not to `d`. Accessing it will result in undefined behavior. Oops.
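
The address arithmetic described here can be observed without ever dereferencing the bad pointer. The sketch below repeats the class layout from the example (with member initializers added) and uses hypothetical function names; the offset of the `B2` subobject is typical but not guaranteed by the standard.

```cpp
struct B1 { int i = 1; };
struct B2 { int j = 2; };
struct D : B1, B2 {};

// True if the static_cast downcast lands on the complete object again.
bool staticCastRestoresAddress() {
  D d;
  B2* pb2 = &d;                       // implicit upcast, may adjust the address
  return static_cast<D*>(pb2) == &d;  // the downcast undoes the adjustment
}

// True if the B2 subobject lives at a different address than d itself.
// This is the typical layout, but an assumption about the implementation.
bool b2SubobjectIsOffset() {
  D d;
  B2* pb2 = &d;
  return static_cast<void*>(pb2) != static_cast<void*>(&d);
}

// True if reinterpret_cast leaves the stored address untouched, so the
// resulting D* does not point to the start of d. The pointer is only
// compared here, never dereferenced.
bool reinterpretCastKeepsAddress() {
  D d;
  B2* pb2 = &d;
  D* pd2 = reinterpret_cast<D*>(pb2);
  return static_cast<void*>(pd2) == static_cast<void*>(pb2);
}
```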

For downcasts use either dynamic_cast or static_cast, never reinterpret_cast.

C-style cast and function style cast

When the compiler sees a C-style or function-style cast, it tries to apply different sequences of elementary conversions. The first one that is possible is applied. The sequences are, in order:

  1. `const_cast`
  2. `static_cast`
  3. `static_cast` followed by `const_cast`
  4. `reinterpret_cast`
  5. `reinterpret_cast` followed by `const_cast`

As seen above, `reinterpret_cast` is very unsafe, so you don’t want the compiler to accidentally apply that one. As a corollary, you don’t want to use these casts to convert something to pointers, references or other built-in types. `const_cast` can be applied only to pointers and references, which we have ruled out already, so what remains is a sole application of `static_cast`. That is the reason why I mentioned the possibility of function-style casts to user-defined types in the `static_cast` section. Since that leaves no composed types as target types, the C-style form is never necessary and therefore discouraged. Instead of a C-style cast you can also use the constructor call conversion.
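
A short sketch of the accidental `reinterpret_cast` mentioned above: with two unrelated types, the C-style cast compiles without complaint, while `static_cast` would be rejected. `Car`, `Duck` and the function names are hypothetical, and the resulting pointers are only compared, never dereferenced.

```cpp
// Two unrelated hypothetical types.
struct Car  { int wheels = 4; };
struct Duck { int legs = 2; };

// The C-style cast compiles: there is no static_cast from Car* to Duck*,
// so the compiler silently falls back to reinterpret_cast.
Duck* asDuckCStyle(Car* c) { return (Duck*)c; }

// The explicit spelling makes the danger visible in the source.
Duck* asDuckExplicit(Car* c) { return reinterpret_cast<Duck*>(c); }

// static_cast<Duck*>(c) would not compile, which is the kind of
// error you want to get in this situation.
```

Both functions return the same address, which shows that the harmless-looking C-style cast did exactly what the `reinterpret_cast` does.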

Use function-style casts for conversions to user-defined classes only. Don’t use C-style casts.

Conclusion

Be careful when applying casts, no matter what kind of cast. Always make yourself aware of the risks and implications, especially if it is not a `static_cast`.

20 Comments


  1. Why no in the static_cast used in non-const ‘getAt’ function (in the example for const_cast)? I have not seen that usage before.

    1. I have to confess I don’t fully understand the question. If you meant the missing `< MyContainer const*>` after the `static_cast`, then this was simply a rendering problem in the code formatting plugin.

      1. I see it’s fixed now. Yes, when I viewed it, the was not there after static_cast.

        1. Hmm. My comment got altered. Yes, the type that you added now appears. No confusion now.

          1. Yeah I noticed that things with angle brackets sometimes are interpreted as maybe-html-tags that don’t get printed.


    2. p.s. Thanks for finding me on Twitter. Like the blog!

  2. Hey Arne, great article!

    What do you think about wrapping existing casts in stricter ones?
    I gave a lightning talk about this at Meeting C++ 2016.
    (https://www.youtube.com/watch?v=62c_Xm6Zh1k)

    I think “thin wrappers” which are stricter and do as much checks as possible at compile-time (also assertions) are a good idea to make the code safer and more readable, and also to clearly express your intentions.

    My code is here if you want to take a look:
    https://github.com/SuperV1234/meetingcpp2015/tree/master/0_MeaningfulCasts

    Thankfully, boost (and probably other mature libraries) also provide their “cast wrappers” for numbers and class hierarchies.

  3. Hi Arne!

    Maybe you should additionally add (from the new standard for casting of std::shared_ptr’s):

    * const_pointer_cast
    * static_pointer_cast
    * dynamic_pointer_cast

    kind regards,
    Christian

    1. Hi Christian,
      thanks for the suggestion. This post was at the time purely targeted at built in casts. I will cover the ones you mentioned in a post about shared_ptr in the future.


  4. Wrt. reinterpret_cast: I use it when converting between char* and anyIntegralType*, that is when casting an int buffer to a char buffer. Using reinterpret_cast directly seems much clearer to me than the other option of doing two static casts via void* (see, e.g. http://stackoverflow.com/questions/24626972/reinterpret-cast-vs-static-cast-for-writing-bytes-in-standard-layout-types).

    The only annoying thing is that casting int* to char* via reinterpret_cast is virtually guaranteed to be a safe operation, while casting char* to int*, even if the buffer contents are good, can invoke UB due to alignment issues.

    Personally I feel that rather than explaining what reinterpret_cast does or does not, we need to look very hard at what it is still used and needed for and come up with safer wrapper functions. (See e.g. http://stackoverflow.com/a/27237839/321013)

  5. Your example for const_cast is seriously flawed: It’s the non-const version which should call the const one. Otherwise changing the implementation of the non-const version may break const correctness.

    class MyContainer {
      int* data;
    public:
      int& getAt(unsigned index) {
        return const_cast<int&>(static_cast<const MyContainer&>(*this).getAt(index));
      }
      const int& getAt(unsigned index) const {
        checkIndex(index);
        return data[index];
      }
    };
    
    1. Thanks for pointing that out. Fixed!

      1. Uhu? I still see a couple of errors in the code snippet:

        1) the const version of getAt returns “int&” instead of “int const &”
        2) the non-const version of getAt is missing a “static_cast(this)” when delegating to the const version of getAt, which would result in infinite recursion.

        Maybe something didn’t work with updating the snippet or my browser is playing (caching) tricks on me?

        1. The thing that did not work when updating the snippet was my brain 😉 I hopefully fixed it now.

  6. Nice post! Very good overview of casts.

    Also, I would add a point that in some cases you can’t implicitly add ‘const’ to a type. For example:

    class Test {};
    
    Test* t;
    const Test ct;
    Test ** t1 = &t; //OK
    Test const ** t2 = t1; //ERROR. Should use Test const * const *
    
    //Otherwise you could do something like this:
    *t2 = &ct;
    //Now you can modify const ct object through the t pointer

    1. Hi Evgeny, thanks for your comment. I would consider your example an edge case or rather a purely academic case. To be honest, I have never come across a situation where a pointer to pointer was needed, or useful, and I can’t think of a situation where they should be used. In fact, I would strongly advise against it, since the double indirection adds complexity that should be avoided.

      1. Sure, it’s generally not very common to see code like this. In my practice I met a similar case when I was working with a slightly outdated API that received an array of strings as char const **.

        1. Wrap that API so it does not leak the **’s, then dealing with them is a one-time problem 😉
