A Casting Show

Arne Mertz, January 22, 2015

In C++ there are two kinds of type conversions: implicit and explicit. The latter are called type casts, and they are what this post is about.

Overview

C++ has the following capabilities for explicit type conversions:

1. The C++ cast operators are keywords defined in the language. While they look like template functions, they are part of the language itself, i.e. the behavior is implemented in the compiler, not in the standard library. There are four of them:
   1. `const_cast`
   2. `reinterpret_cast`
   3. `static_cast`
   4. `dynamic_cast`
2. The C-style and function-style casts. The C-style cast consists of the type you want in parentheses, followed by the expression you want to be converted into that type, e.g. `(double)getInt()`. The function-style cast works only slightly differently, by stating the target type followed by the source expression in parentheses, i.e. `double(getInt())`. It is equivalent to the C-style cast in every respect, except that the target type has to be a single word, so `unsigned long`, `const double` and any kind of pointer or reference are not allowed.
3. Construction of a temporary value in C++11. It looks similar to the function-style cast, e.g. `long{getInt()}`, but uses the initializer list with curly braces introduced in C++11. It has a few more restrictions than the function-style cast, e.g. a conversion to a user-defined type is only possible when the target type has a corresponding constructor, but not when the source type has a corresponding conversion operator.
4. Functions that take a parameter of one type and return an object of another type, representing the same value. While they are technically not real casts, they have the same look, feel and usage, and are sometimes even implemented with casts. Prominent examples are `std::move`, `std::dynamic_pointer_cast` and `boost::lexical_cast`.
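To make the list above concrete, here is a minimal sketch that puts the different spellings side by side. The helper `getInt()` stands in for the function used in the examples above, and `Meters` is a hypothetical user-defined type invented for this illustration; `std::to_string` is used as one more example of a conversion function.

```cpp
#include <string>

// Hypothetical helper standing in for getInt() from the text.
int getInt() { return 42; }

// Hypothetical user-defined type with a converting constructor.
struct Meters {
  explicit Meters(double v) : value(v) {}
  double value;
};

// 1. C++ cast operator
double viaCastOperator() { return static_cast<double>(getInt()); }

// 2. C-style and function-style cast
double viaCStyleCast() { return (double)getInt(); }
double viaFunctionStyle() { return double(getInt()); }

// 3. Construction of a temporary with braces (C++11)
long viaBraces() { return long{getInt()}; }
Meters viaConstructor() { return Meters{2.5}; }

// 4. A function that returns the same value as another type
std::string viaFunction() { return std::to_string(getInt()); }
```

All of these compile and yield the same value in a different type; the differences lie in what each form allows the compiler to do silently.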

The four cast operators represent the basic conversions possible in C++, so I will explain them in detail. The other possibilities will be covered only briefly.

const_cast

This cast has one sole purpose: removing constness from a pointer or reference. In theory, it can also be used to add constness, but since that is possible via an implicit conversion, explicitly using a cast for it is not recommended. It is the only cast operator that can remove constness; the other cast operators are not allowed to do so.

void foo(MyClass const& myObject) {
  MyClass& theObject = const_cast<MyClass&>(myObject);
  // do something with theObject
}

Casting away the constness of an object can be dangerous. In the example above, the users of the function will expect their object to remain unchanged. The const_cast, on the other hand, gives full write access to the object, so it could be changed. Irresponsible use of const_cast can therefore lead to unexpected behavior, hard-to-debug bugs and even undefined behavior (for example, when the object was originally declared const and is then modified).

In many cases const_cast is only necessary due to design problems. Const correctness is often missing in legacy code, or it is perceived as hard to get right because developers mix up semantic and syntactic constness or don’t use mutable when appropriate.
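As a sketch of the mutable point: a cache that is filled lazily inside a const method does not change the semantic state of the object, so the cache members can be declared mutable instead of casting away the constness of `this`. The class `Report` and its members are hypothetical names made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class: the cache is an implementation detail, so filling it
// does not change the observable (semantic) state of the object.
class Report {
public:
  explicit Report(std::string text) : text_(std::move(text)) {}

  // Semantically const: the caller sees the same report before and after.
  std::size_t wordCount() const {
    if (!cacheValid_) {
      cachedCount_ = countWords();  // allowed because the member is mutable
      cacheValid_ = true;
    }
    return cachedCount_;
  }

private:
  // Counts runs of non-space characters.
  std::size_t countWords() const {
    std::size_t count = 0;
    bool inWord = false;
    for (char c : text_) {
      if (c == ' ') {
        inWord = false;
      } else if (!inWord) {
        inWord = true;
        ++count;
      }
    }
    return count;
  }

  std::string text_;
  mutable bool cacheValid_ = false;       // syntactically changed in a const method
  mutable std::size_t cachedCount_ = 0;   // but semantically invisible to callers
};
```

No const_cast is needed here, and the method can be called on a const Report.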

Be suspicious whenever you encounter a const_cast in code. Think twice before you write one.

There are a few cases where const_cast is indeed the right thing to do. The best-known cases are accessor functions that have a const and a non-const version, where the former returns a const reference (or pointer) and the latter a non-const reference:

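class MyContainer {
  int* data;
  void checkIndex(unsigned index) const; // bounds check, defined elsewhere
public:
  // The non-const version delegates to the const version.
  int& getAt(unsigned index) {
    auto const_this = static_cast<MyContainer const*>(this);
    return const_cast<int&>(const_this->getAt(index));
  }
  int const& getAt(unsigned index) const {
    checkIndex(index);
    return data[index];
  }
};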

More generally, const_cast is used to access a const object in a way that syntactically might change the object, although you know for sure that it does not. This is mostly restricted to the object’s own methods, since encapsulation demands that outsiders cannot be sure when a non-const operation does not alter the object.

reinterpret_cast

reinterpret_cast is the most aggressive, unsafe and (hopefully) least used of the four C++ cast operators. It can be used only on integral types, enums, all kinds of pointers including function and member pointers, and null pointer constants like nullptr. It is meant to be used to convert types that are otherwise not compatible, i.e. mainly from pointer to int and back, or from pointer to X to pointer to Y and back, where X and Y are unrelated types.

The usual behavior is to just reinterpret the bit representation of the source value as the bit representation of the target value. No checks are applied, which means that if you use the cast, you are on your own. For example, you can indeed cast a car* into a duck*, and casting it back is guaranteed to give you the same car* (provided duck has no stricter alignment requirements than car). Actually using the duck* will most surely result in undefined behavior. In fact, any use of reinterpret_cast that cannot be done via other casts has a bunch of “DANGER” and “Undefined Behavior” signs around it.

I know of only two cases where there is no option but to use reinterpret_cast: casting pointer values to int, to log them in the well-known 0x50C0FFEE format, and storing a pointer where another pointer (or int) is meant to be stored. The latter is e.g. the case in Borland’s VCL, where GUI objects have the ability to store data in a TObject pointer. If you want to store a context that is not derived from TObject, you have to store and retrieve it by casting your object’s address to and from TObject*. void* would have been a better choice in my opinion.

If you need to write a `reinterpret_cast`, take a break, go for a walk, and reconsider that decision. If you come back and still want to write it, encapsulate the cast(s) carefully.
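What such an encapsulation might look like is sketched below for the two use cases mentioned above. All names (`formatAddress`, `Context`, `Slot`) are hypothetical and only serve the illustration, and the sketch assumes that the platform provides `std::uintptr_t`, which is the case on mainstream compilers.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: the one place where an address becomes a number.
// Assumes std::uintptr_t is available on the platform.
std::string formatAddress(const void* p) {
  std::ostringstream out;
  out << "0x" << std::hex << std::uppercase
      << reinterpret_cast<std::uintptr_t>(p);
  return out.str();
}

// Hypothetical context type that we want to park in a foreign slot.
struct Context {
  int id;
};

// Hypothetical wrapper around a storage slot that only knows integers.
// The casts are confined to store() and load(), and the interface is typed.
class Slot {
public:
  void store(Context* c) { raw_ = reinterpret_cast<std::uintptr_t>(c); }
  Context* load() const { return reinterpret_cast<Context*>(raw_); }

private:
  std::uintptr_t raw_ = 0;
};
```

Callers only ever see `Context*` going in and coming out, so nobody can accidentally retrieve the stored address as a pointer to some other type.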

static_cast

static_cast is the most straightforward cast. Suppose you have an expression a of type A and want that value converted to type B, and the conversion is possible per se, i.e. the types are not unrelated, so you don’t need a reinterpret_cast. If the conversion is not implicit, or the compiler cannot select the right implicit conversion because you passed the value to a function whose other overloads are preferred over the one you want or make the call ambiguous, then you have to force the conversion explicitly.

If B is a user-defined class type, it is common to use a function-style cast or call the conversion constructor explicitly, i.e. B(a) or B{a}. Both have the same effect as a static_cast in this case. In all other cases, i.e. if you convert to or between built-in types, use static_cast explicitly. Cases where this is necessary are:

1. narrowing conversions between numbers (int to short, double to int, …)
2. conversions between integrals and enums
3. conversion from `void*` to any other pointer type
4. downcasts of pointers or references in class hierarchies when you know the dynamic type of the object (see below)

Points 3 and 4 are to be used with caution: If you static_cast to a pointer (or reference) of type T*, the compiler believes you and assumes you really know that there is in fact a T at the address stored in the pointer. If there is something else, it will still treat the bits and bytes at that location as if there were a T, causing undefined behavior and hopefully blowing the program up right in your face. (I say hopefully because a crash is much less painful to debug than a silent failure that lets the program act weird but continue.)
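The four cases can be sketched in a few lines. The types `Color`, `Shape` and `Circle` and all function names are hypothetical and exist only for this illustration; the comments mark where the caller has to guarantee that the cast is valid.

```cpp
// Hypothetical types for illustration only.
enum class Color { Red = 0, Green = 1, Blue = 2 };

struct Shape {
  virtual ~Shape() = default;
};
struct Circle : Shape {
  double radius = 1.0;
};

// 1. Narrowing conversion: the fractional part is discarded.
int toWholeNumber(double d) { return static_cast<int>(d); }

// 2. Conversions between integrals and enums, in both directions.
Color toColor(int i) { return static_cast<Color>(i); }
int toInt(Color c) { return static_cast<int>(c); }

// 3. From void* to a typed pointer: only valid if p really points to an int.
int readInt(void* p) { return *static_cast<int*>(p); }

// 4. Downcast: only valid if s really is a Circle. Nothing checks that.
double radiusOf(Shape& s) { return static_cast<Circle&>(s).radius; }
```

In the last two functions the compiler simply trusts the caller, so passing a pointer to something other than an int, or a Shape that is not a Circle, results in undefined behavior.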

dynamic_cast

This cast is used for downcasts and cross-casts of pointers and references in class hierarchies. You pass in a pointer of class X, casting it to a pointer of a class somewhere else in the class hierarchy. Casting to a base class (upcast) is implicitly possible and does not need an explicit cast.

Depending on whether the type of the object behind that pointer (called the dynamic type) is in fact of that other class or not, the result of the cast is the new pointer or a null pointer. Of course, if the object is of a type that is derived from the target class, the cast succeeds as well. Since references cannot be null, dynamic_cast on a reference throws a std::bad_cast exception if the cast does not succeed. Note that downcasts and cross-casts with dynamic_cast only compile if the source class is polymorphic, i.e. has at least one virtual function.

#include <typeinfo> // std::bad_cast

class B {
public:
  virtual ~B() = default; // dynamic_cast needs a polymorphic base class
};
class D1: public B {};
class D2: public B {};

void foo() {
  D1 d1;
  D2 d2;
  B* b1 = &d1;
  B* b2 = &d2;

  D1* d1b1 = dynamic_cast<D1*>(b1); //ok, d1b1 now points to d1
  D1* d1b2 = dynamic_cast<D1*>(b2); //result is a null pointer because *b2 is not a D1

  D1& rd1b2 = dynamic_cast<D1&>(*b2); //throws std::bad_cast
}

People often view the presence of dynamic_cast with suspicion because it is often a hint to a flawed design. Many naive applications of dynamic_cast can be solved more cleanly with virtual functions.

Before applying a dynamic_cast, check whether the problem at hand cannot be solved better with virtual functions.
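A minimal sketch of that trade-off, using a hypothetical `Animal` hierarchy invented for this illustration: the first function asks for the dynamic type with dynamic_cast and needs a new branch for every new class, while the second leaves the decision to a virtual function.

```cpp
#include <string>

// Hypothetical hierarchy for illustration only.
struct Animal {
  virtual ~Animal() = default;
  virtual std::string sound() const = 0;
};
struct Dog : Animal {
  std::string sound() const override { return "woof"; }
};
struct Cat : Animal {
  std::string sound() const override { return "meow"; }
};

// Type switch with dynamic_cast: every new Animal needs another branch here.
std::string soundWithCasts(const Animal& a) {
  if (dynamic_cast<const Dog*>(&a)) return "woof";
  if (dynamic_cast<const Cat*>(&a)) return "meow";
  return "?";
}

// Virtual dispatch: new classes work without touching this function.
std::string soundWithVirtual(const Animal& a) { return a.sound(); }
```

Both functions return the same results for Dog and Cat, but only the second one keeps working unchanged when another class is added to the hierarchy.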

Downcasts in class hierarchies

dynamic_cast, static_cast and reinterpret_cast can all three be used to cast a base class pointer or reference into a pointer or reference to a more derived class. So what is the difference between the three?

As shown above, dynamic_cast checks if the dynamic type of the object is of the expected class. That check is performed at runtime which needs access to runtime type information (RTTI) and costs a few CPU cycles. The other two casts occur (almost) purely at compile time and are therefore faster. However, if you do not know the dynamic type of the object, you have no other option.

If you know the dynamic type and the relationship between the two classes is a line of single inheritances, then the two other casts do exactly the same thing, which is exactly nothing. The new pointer contains the exact same address; it just has another type. However, in the case of static_cast the compiler checks whether that conversion is even possible, i.e. whether the target type is indeed a derived class of the source type, so it is safer than reinterpret_cast. The following example will lead to a compiler error:

class B; //forward declaration
class D; //forward declaration

B* pb;
D* pd = static_cast<D*>(pb); //ERROR: B* is not convertible to D*

In the case of multiple inheritance, the memory layout may be such that the address of the derived object differs from the address of the base class object:

class B1 { int i; };
class B2 { int j; };

class D : public B1, public B2 {};

void bar() {
  D d;
  B2* pb2 = &d;
  D* pd1 = static_cast<D*>(pb2);
  D* pd2 = reinterpret_cast<D*>(pb2);
}

Let’s assume for simplicity that sizeof(int) is 4, and there are no padding bytes, and we are in a typical environment where the subobjects are stored in order in memory. Compared to the address of d itself, the offset of the B1 subobject and its member i is 0, i.e. they have the same address. The offset of the B2 subobject and j is 4.

When the compiler sees the line B2* pb2 = &d; it knows that offset and performs the implicit conversion from D* to B2* by adding 4, so that pointer does indeed point to the B2 subobject. The static_cast is doing the exact opposite: The compiler subtracts 4 and pd1 has again the address with offset 0, pointing correctly to d. The reinterpret_cast on the other hand will preserve the value of pb2, so pd2 will contain the same address, pointing to offset 4 and not to d. Accessing it will result in undefined behavior. Oops.

For downcasts use either dynamic_cast or static_cast, never reinterpret_cast.
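The address adjustment described above can be observed with a small sketch. It reuses the class names from the example but gives the members default values; the two function names are hypothetical. Whether the B2 subobject really sits at a different address depends on the implementation, so the second function reports what the compiler at hand does instead of assuming it.

```cpp
struct B1 { int i = 1; };
struct B2 { int j = 2; };
struct D : B1, B2 {};

// True if static_cast maps the pointer to the B2 subobject back to the D object.
// This is guaranteed as long as pb2 really points into a D.
bool staticCastRestoresObject(D& d) {
  B2* pb2 = &d;                       // implicit upcast, may adjust the address
  return static_cast<D*>(pb2) == &d;  // the downcast undoes that adjustment
}

// True if the B2 subobject lives at a different address than the whole object.
// Typical implementations return true here, but the layout is not guaranteed.
bool baseSubobjectIsOffset(D& d) {
  B2* pb2 = &d;
  return static_cast<void*>(pb2) != static_cast<void*>(&d);
}
```

If `baseSubobjectIsOffset` returns true, a reinterpret_cast from `B2*` to `D*` would yield a pointer that does not point to the D object, which is exactly the situation described above.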

C-style cast and function-style cast

When the compiler sees a C-style or function-style cast, it tries to apply different sequences of elementary conversions. The first one that is possible is applied. The sequences are, in order:

1. `const_cast`
2. `static_cast`
3. `static_cast` followed by `const_cast`
4. `reinterpret_cast`
5. `reinterpret_cast` followed by `const_cast`

As seen above, reinterpret_cast is very unsafe, so you don’t want the compiler to accidentally apply that one. As a corollary, you don’t want to use these casts to convert something to pointers, references or other built-in types. const_cast can be applied only to pointers and references, which we have ruled out already, so what remains is a sole application of static_cast. That is the reason why I mentioned the possibility of function-style casts to user-defined types in the static_cast section. Since that leaves no composed types as target types, the C-style form is never necessary and therefore discouraged. Instead of a C-style cast, you can also use the constructor call conversion.

Use function-style cast for conversions to user-defined classes only. Don’t use C-style casts.
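A short sketch of why the silent selection is a problem: the C-style cast below compiles without complaint and acts as a const_cast, while the same conversion written as a static_cast is rejected by the compiler. Both function names are hypothetical, and both functions assume that the object behind the pointer was not declared const, since writing to it would otherwise be undefined behavior.

```cpp
// Assumption: the int behind p was not declared const.
void resetViaCStyle(const int* p) {
  int* q = (int*)p;  // compiles: silently acts as const_cast<int*>(p)
  *q = 0;
}

void resetViaNamedCast(const int* p) {
  // int* q = static_cast<int*>(p);  // would not compile: casts away const
  int* q = const_cast<int*>(p);      // the intent is visible and easy to search for
  *q = 0;
}
```

The named cast does the same thing here, but it tells the reader which conversion happens and fails to compile if the types change in a way that would need a different cast.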

Conclusion

Be careful when applying casts, no matter what kind of cast. Always make yourself aware of the risks and implications, especially if it is not a static_cast.

21 Comments

Teodor Calin
7 years ago

Hi Arne,

Thank you for the very instructive post.

Something bothers me regarding the cast-operators gospel: why is it that “conversion from void* to any other pointer type” is a use case for static_cast and not reinterpret_cast ?
Please take a look at the following code:

int j = 7;
// Does not compile
// double* pDbl = static_cast<double*>(&j);

// Old C-style type erasure, pervasive in legacy code
void* pV = &j;
// Compiles, no warning -> bogus value is printed
double* pDbl = static_cast<double*>(pV);
cout << "Int to void to double : " << *pDbl << endl;

// Compiles, no warning -> same bogus value is printed
double* pDbl2 = reinterpret_cast<double*>(&j);
cout << "Int to double : " << *pDbl2 << endl;

We know that on static_cast between pointers the compiler does some checking, because static down-casting is allowed, but not static casting between unrelated types. Can it actually do any checking when we static_cast from void* ? If not, is it not correct to use the reinterpret_cast instead ?

Also, you argue against using C-style cast thus: it is as unsafe as reinterpret_cast, because sometimes reduced to the no-checks-involved reinterpret_cast. Then would you not agree that in legacy code the C-style casts between pointer types (including casts from void*) should be mechanically replaced with reinterpret_casts – and only after careful consideration promoted, if applicable, to static_cast or dynamic_cast ?

Finally, I have to disagree with the labeling of reinterpret_cast as “unsafe” and static_cast as “straightforward”. The danger of using C-style cast came from the C compiler lumping together a bunch of use cases and forcing a use-it-and-pray decision on the developer. If we look at the four C++ casts, it is static_cast that still combines several use cases under a single syntax, while the other three appear to have their own well-defined niche.

Rich
8 years ago

Why no in the static_cast used in non-const ‘getAt’ function (in the example for const_cast)? I have not seen that usage before.

1. Arne Mertz
  8 years ago
  
  I have to confess I don’t fully understand the question. If you meant the missing <MyContainer const*> after the static_cast, then this was simply a rendering problem in the code formatting plugin.
  
  1. Rich
    8 years ago
    
    I see it’s fixed now. Yes, when I viewed it, the was not there after static_cast.
    
    1. Rich
      8 years ago
      
      Hmm. My comment got altered. Yes, the type that you added now appears. No confusion now.
      
      1. Arne Mertz
        8 years ago
        
        Yeah I noticed that things with angle brackets sometimes are interpreted as maybe-html-tags that don’t get printed.
2. Rich
  8 years ago
  
  p.s. Thanks for finding me on Twitter. Like the blog!
  
  1. Arne Mertz
    8 years ago
    
    Thank you!
    
Vittorio Romeo
9 years ago

Hey Arne, great article!

What do you think about wrapping existing casts in stricter ones?
I gave a lightning talk about this at Meeting C++ 2016.
(https://www.youtube.com/watch?v=62c_Xm6Zh1k)

I think “thin wrappers” which are stricter and do as many checks as possible at compile-time (also assertions) are a good idea to make the code safer and more readable, and also to clearly express your intentions.

My code is here if you want to take a look:
https://github.com/SuperV1234/meetingcpp2015/tree/master/0_MeaningfulCasts

Thankfully, boost (and probably other mature libraries) also provide their “cast wrappers” for numbers and class hierarchies.

@cwschmidt
9 years ago

Hi Arne!

Maybe you should additionally add (from the new standard for casting of std::shared_ptr’s):

const_pointer_cast
static_pointer_cast
dynamic_pointer_cast

kind regards,
Christian

1. Arne Mertz
  9 years ago
  
  Hi Christian,
  thanks for the suggestion. This post was at the time purely targeted at built in casts. I will cover the ones you mentioned in a post about shared_ptr in the future.
  
Martin Ba
10 years ago

Wrt. reinterpret_cast : I use it when converting btw. char* and anyIntegralType*, that is when casting an int buffer to a char buffer. Using reinterpret_cast directly seems much clearer to me than the other option of doing two static casts via void* (see, e.g. http://stackoverflow.com/questions/24626972/reinterpret-cast-vs-static-cast-for-writing-bytes-in-standard-layout-types).

The only annoying thing is that casting int* to char* via reinterpret_cast is virtually guaranteed to be a safe operation, while casting char* to int*, even if the buffer contents are good, can invoke UB due to alignment issues.

Personally I feel that rather than explaining what reinterpret_cast does or does not, we need to look very hard at what it is still used and needed for and come up with safer wrapper functions. (See e.g. http://stackoverflow.com/a/27237839/321013)

Marcel
10 years ago

Your example for const_cast is seriously flawed: It’s the non-const version which should call the const one. Otherwise changing the implementation of the non-const version may break const correctness.

class MyContainer {
  int* data;
public:
  int& getAt(unsigned index) {
    return const_cast<int&>(static_cast<const MyContainer&>(*this).getAt(index));
  }
  const int& getAt(unsigned index) const {
    checkIndex(index);
    return data[index];
  }
};

1. Arne Mertz
  10 years ago
  
  Thanks for pointing that out. Fixed!
  
  1. Andrea Bigagli
    10 years ago
    
    Uhu? I still see a couple of errors in the code snippet:
    
    1) the const version of getAt returns “int&” instead of “int const &”
    2) the non-const version of getAt is missing a “static_cast(this)” when delegating to the const version of getAt, which would result in infinite recursion.
    
    Maybe something didn’t work with updating the snippet or my browser is playing (caching) tricks on me?
    
    1. Arne Mertz
      10 years ago
      
      The thing that did not work when updating the snippet was my brain 😉 I hopefully fixed it now.
      
Evgeny Muralev
10 years ago

Nice post! Very good overview of casts.

Also, I would add the point that in some cases you can’t implicitly add ‘const’ to a type. For example:

class Test {};

Test* t;
const Test ct;
Test ** t1 = &t; //OK
Test const ** t2 = t1; //ERROR. Should use Test const * const *

//Otherwise you could do something like this:
*t2 = &ct;
//Now you can modify const ct object through the t pointer

1. Arne Mertz
  10 years ago
  
  Hi Evgeny, thanks for your comment. I would consider your example an edge case or rather a purely academic case. To be honest, I have never come across a situation where a pointer to pointer was needed, or useful, and I can’t think of a situation where they should be used. In fact, I would strongly advise against it, since the double indirection adds complexity that should be avoided.
  
  1. Evgeny Muralev
    10 years ago
    
    Sure, it’s generally not very common to see code like this. In my practice I met a similar case when I was working with a slightly outdated API which received an array of strings as char const **.
    
    1. Arne Mertz
      10 years ago
      
      Wrap that API so it does not leak the **’s, then dealing with them is a one-time problem 😉
      