Modern C++ features: in-place construction

Move constructors are often cheaper than copy constructors, which makes constructing an object and immediately relocating it more efficient in modern C++ than in C++03. However, moving only the parts needed to construct the object in the right place can be even more efficient. Several standard library facilities use perfect forwarding to construct objects right where they are needed.

Example

From copy to move

Let’s consider this little C++03 code snippet:

typedef std::vector<int> Numbers;
std::vector<Numbers> numbersContainer;
numbersContainer.reserve(1);

int newNumbers[] = {1, 1, 2, 3, 5};
numbersContainer.push_back( Numbers(newNumbers, newNumbers + sizeof(newNumbers)/sizeof(newNumbers[0])) );

What we are doing here is inserting a new `std::vector<int>` with the content of the array at the end of `numbersContainer`. The vector is initialized with the content of the array `newNumbers`.  Without too much detail, the execution steps for the last line will be roughly the following:

  1. Construct a temporary `std::vector<int>` (aka. `Numbers`) from two pointers
  2. Copy construct a new object from the original constructed in step 1 at the end of `numbersContainer`’s storage:
    1. Allocate memory for the copied content
    2. Set the internal members accordingly (pointer to memory, capacity)
    3. Copy the content and set the internal member for the size accordingly
  3. Adjust the member for the size of `numbersContainer`
  4. Destroy the temporary, including a deallocation

Before I go into the details, here is the same code, polished for C++11:

using Numbers = std::vector<int>;
std::vector<Numbers> numbersContainer;
numbersContainer.reserve(1);

auto newNumbers = std::array<int, 5>{1, 1, 2, 3, 5};
numbersContainer.push_back( Numbers(std::begin(newNumbers), std::end(newNumbers)) );

We are using a type alias here, which is the modern equivalent of the `typedef`. In this case it’s essentially the same, but more convenient, as it defines the type in the same order we are used to from other definitions in C++. The other change is the use of `std::array` instead of a plain C array, and of `std::begin()`/`std::end()` instead of manual pointer calculations. The crucial point, however, is that `push_back` now has an overload taking an rvalue reference, so it can move the temporary instead of copying it. Here are the execution steps:

  1. Construct a temporary `std::vector<int>` (aka. `Numbers`) from the two iterators/pointers
  2. Move construct a new object from the original constructed in step 1 at the end of `numbersContainer`’s storage:
    1. Copy the internal members of the temporary, “stealing the guts”
    2. Set at least the internal data pointer of the temporary to null
  3. Adjust the member for the size of `numbersContainer`
  4. Destroy the empty temporary, which does nothing

Step 1 is equivalent to the C++03 version: in common implementations `std::array` iterators are plain pointers, although the standard does not require that. Step 3 is the same in both cases; it’s only cheap bookkeeping. Steps 2 and 4 are the interesting difference: the allocation and the subsequent deallocation do not take place, because we moved the temporary.
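To make the “stealing the guts” step visible, here is a minimal sketch. The function name `pushBackStealsBuffer` is made up for this illustration, and a named vector moved with `std::move` stands in for the temporary so that its buffer address can be recorded first. It relies on the behavior of the common standard library implementations with the default allocator, where the moved-to vector takes over the source’s heap buffer.

```cpp
#include <array>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

using Numbers = std::vector<int>;

// Hypothetical helper: returns true if push_back(Numbers&&) reused the heap
// buffer of the source vector instead of allocating and copying.
bool pushBackStealsBuffer() {
    std::vector<Numbers> numbersContainer;
    numbersContainer.reserve(1);  // as in the article: no reallocation later

    auto newNumbers = std::array<int, 5>{1, 1, 2, 3, 5};
    Numbers source(std::begin(newNumbers), std::end(newNumbers));

    const int* buffer = source.data();  // remember where the elements live
    assert(buffer != nullptr);

    // std::move turns the named vector into an rvalue, so the
    // rvalue-reference overload of push_back is selected.
    numbersContainer.push_back(std::move(source));

    // The stored vector now owns the very same buffer with all five elements.
    return numbersContainer.back().data() == buffer
        && numbersContainer.back().size() == 5;
}
```

If the function returns `true`, no second buffer was allocated for the five integers: only the vector’s internal members changed hands.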

We can do better: in-place construction

Let’s analyze if we could do better – at least in theory. We can’t get around the construction of a `vector<int>`, because that’s what is stored in `numbersContainer`. We can’t get rid of step 3 either, because the invariants of `numbersContainer` demand the bookkeeping. Step 4 does nothing, so what remains is step 2, the move construction.

In this case that does not look like much: copy three pointers or integers (data pointer, size, capacity) and set another one to 0. However, move constructors need not be that cheap. Objects that store their data on the heap can just swap a few pointers like `std::vector` does, but data stored in the object itself cannot be moved; it has to be copied.
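A small sketch of that last point, using a hypothetical type `InlineBuffer` (not part of the original example) whose data lives directly inside the object. Its implicitly generated move constructor moves each `int`, which is the same as copying it, so the “move” touches all 1024 elements and leaves the source unchanged.

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical type: the data is stored in the object itself, not on the heap.
struct InlineBuffer {
    std::array<int, 1024> values{};
};

// Moves an InlineBuffer and reports whether the source still holds its data.
bool sourceKeepsDataAfterMove() {
    InlineBuffer source;
    source.values.fill(42);
    assert(source.values[0] == 42);

    // Implicit move constructor: moves the array element by element.
    // Moving an int is a plain copy, so nothing can be "stolen" here.
    InlineBuffer target(std::move(source));

    // Both objects now contain 1024 times the value 42.
    return target.values[0] == 42 && target.values[1023] == 42
        && source.values[0] == 42 && source.values[1023] == 42;
}
```

For such a type, moving costs as much as copying, which is exactly the case where avoiding the temporary altogether pays off.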

So, wouldn’t it be nice if we could get rid of the temporary and the move construction as well? As a matter of fact, since C++11 `std::vector` has a method `emplace_back` that takes an arbitrary number of arguments and uses perfect forwarding to construct the new object right in place:

using Numbers = std::vector<int>;
std::vector<Numbers> numbersContainer;
numbersContainer.reserve(1);

auto newNumbers = std::array<int, 5>{1, 1, 2, 3, 5};
numbersContainer.emplace_back( std::begin(newNumbers), std::end(newNumbers) );

Without further ado, here’s what happens:

  1. Perfectly forward any arguments …
  2. … to normally construct the new object at the end of `numbersContainer`’s storage
  3. Adjust the member for the size of `numbersContainer`

That’s it. Step 2 is the exact same constructor call we had for the temporary before, the one we can’t get around. Step 3 is the bookkeeping we’ll always have. The perfect forwarding is very easily optimized away by the compiler. There’s no unnecessary overhead left.
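The difference between the two calls can be checked with a small counting type. The following is only a sketch: `Tracer`, `Counts`, and the two functions are names invented for this illustration. As in the article, `reserve(1)` keeps reallocation out of the picture.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type that counts how its objects come into existence.
struct Tracer {
    static int constructed;
    static int copied;
    static int moved;

    explicit Tracer(int) { ++constructed; }
    Tracer(const Tracer&) { ++copied; }
    Tracer(Tracer&&) noexcept { ++moved; }

    static void reset() { constructed = copied = moved = 0; }
};

int Tracer::constructed = 0;
int Tracer::copied = 0;
int Tracer::moved = 0;

struct Counts {
    int constructed;
    int copied;
    int moved;
};

Counts snapshot() {
    return Counts{Tracer::constructed, Tracer::copied, Tracer::moved};
}

// push_back with a temporary: one normal construction plus one move.
Counts countPushBack() {
    Tracer::reset();
    std::vector<Tracer> tracers;
    tracers.reserve(1);
    tracers.push_back(Tracer(1));
    assert(tracers.size() == 1);
    return snapshot();
}

// emplace_back forwards the argument: one normal construction, nothing else.
Counts countEmplaceBack() {
    Tracer::reset();
    std::vector<Tracer> tracers;
    tracers.reserve(1);
    tracers.emplace_back(1);
    assert(tracers.size() == 1);
    return snapshot();
}
```

`countPushBack()` reports one construction and one move, while `countEmplaceBack()` reports a single construction and neither a copy nor a move.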

There are lots of functions like this in the standard library. Of course there is `emplace_front` as well, and whenever a container has an `insert` method, there is a corresponding `emplace` method. `std::make_shared` and `std::make_unique` (the latter since C++14) perfectly forward their arguments to achieve in-place construction.

Consider using emplace functions to achieve in-place construction instead of moves or copies.
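A few of these functions in use, as a rough sketch; the function names are made up for this illustration. Note that the forwarded arguments are passed to the constructor with parentheses, so `(3, 7)` selects the count/value constructor of `std::vector<int>` and yields three sevens, not the two-element list `{3, 7}`.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

using Numbers = std::vector<int>;

// emplace_front: constructs Numbers(3, 7) at the front of the deque.
Numbers viaEmplaceFront() {
    std::deque<Numbers> numbersDeque;
    numbersDeque.emplace_front(3, 7);
    assert(!numbersDeque.empty());
    return numbersDeque.front();
}

// emplace: the counterpart of insert, taking a position plus constructor
// arguments. When inserting before existing elements, an implementation may
// still need an intermediate object and has to shift the later elements.
std::vector<Numbers> viaEmplace() {
    std::vector<Numbers> numbersContainer;
    numbersContainer.emplace_back(1, 0);                       // {0}
    numbersContainer.emplace(numbersContainer.begin(), 2, 9);  // {9, 9} in front
    return numbersContainer;
}

// make_shared: forwards its arguments to the constructor of the new object.
std::shared_ptr<Numbers> viaMakeShared() {
    return std::make_shared<Numbers>(3, 7);
}
```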

Readability

The emplace functions remove some redundancy. In that last example we did not explicitly state that we put a new `Numbers` object into the `numbersContainer` like we had to do with `push_back`. However, if we apply the basic clean code principles of clear variable names and short, concise functions, there is enough information to keep track of what is going on.


7 Comments

  1. jon

    Am I the only one who wishes reserve had been called something different (at least, something that didn’t start with “re”)? If I had a dollar for every time I’ve typed reserve when I meant resize and then wondered why my program was crashing, well, I probably couldn’t retire but I could at least buy a few rounds of cocktails 🙂

  2. Arne,

    Thanks for the post. I have one question which could sound very snarky, but that isn’t the intent. I am being sincere.

    What are you trying to accomplish with this line of code?

    numbersContainer.reserve(1);

    The typical use of reserve() is to preallocate container space for one or both of these reasons:
    *) reduce the time cost of multiple reallocations as the container grows.
    *) guarantee that pointers, references, and iterators into a container remain valid over a series of container additions.

    I don’t see that this does either of these.
    *) Since we are only adding one item to the vector, there is no cost of multiple reallocations. It possibly does move the time cost from line six to line three, but I don’t see any net time savings.
    *) In this code snippet we don’t have any pointers, references, or iterators into the container before the allocation, so there is nothing that might be invalidating.

    It occurred to me that there is another possible use of reserve() that is not the typical motivation for its use. Most libraries are likely to implement reserve() by allocating *exactly* the amount of memory required to contain the passed number of elements[0] (assuming any allocation is required). That means that calling reserve not only avoids the time cost of reallocations, it also avoids the overallocation that results from the normal implementation of doubling[0] the buffer size on reallocation.

    I believe libstdc++ and libc++ both allocate only enough to hold one element[0] on the first push_back(), so in practice the reserve() call won’t make any difference to buffer size either.

    I may be coming across as making too big a deal about this, but since it is being held up as example code, I think it is fair game to ask the question.

    It is also fair to answer with something like: I think it is a good habit to use reserve() whenever you know a container’s maximum size and this is just the result of that habit.

    Jon

    [0] There is nothing in the standard that requires this behavior.

    1. Arne Mertz

      Hi Jon,
      thanks for your thoughts and for taking the time to write such a long comment! You are right that it’s a habit to use reserve() whenever I know in advance that the vector will grow to a certain size. That’s how it got in there in the first place, when the code was still more complicated. I then reduced the example to a minimum and wondered for a while if I should take out the call to reserve() as well.

      There are some reasons why I left it in in the end: I have met several developers, new and veteran, who did know what a vector is and roughly how it works, but did not know about reserve. If you look around in the C++ blogosphere, you don’t see it very often because it is not needed for example code. I left it in so some developers can learn something more besides the essence of this blog post, and others get reminded of the good habit to use it before a known number of insertions.

      The other reason is that in all the listings of what happens during the push_back, I would have had to add the reallocation and, for the more general description of push_back to a nonempty vector, the internal copy/move of all elements to the new memory. Adding that step would have distracted from the topic I wanted to cover. Leaving it out would have invited comments about missing details. So I sneakily left the reserve() in to make life simple – you caught me 😉

      I did not know about the implementation details of reserve you mention. But I would not use reserve in example code just to avoid the overallocation of a few bytes. That would be needless overoptimization and contrary to anything I try to write about in this blog.

      Arne

  3. OVVYYYXXX

    One more thing about this stuff!

    I would like to have something called destructor when I need it.

    So, what am I talking about? Instead of the destructor as we have it nowadays, it would be nice to have the possibility to erase the whole object when we don’t need it any more.

    Yes, we have pointers, and we could use delete to get rid of the memory that is occupied at that time, but the destructor will not remove some stuff that is left over.
    It is like programmers do not know what they are doing.

    1. Resto

      @OVVYYYXXX Nothing stops you from having a MyObject::Destroy() function that can both be called manually and is automatically called in the destructor (but watch out for double deletes, especially with multi-threading).
      Also be sure to use scopes (i.e. { }) properly.

  4. Jonas Hammarberg

    How about

    using Numbers = std::vector<int>;
    std::vector<Numbers> numbersContainer {{1, 1, 2, 3, 5}};

    1. Arne Mertz

      Yes, if you know beforehand that this exact sequence will have to be put in the container, you’re obviously right. The array was meant as an example for “we get some numbers from somewhere”. The point is to explain what happened to C++03’s `push_back` 🙂
