Managing the lifetime of dynamically allocated memory and the objects residing in it is hard to get right. It is usually handled by assigning other objects ownership of the allocated memory, i.e. assigning the responsibility of handling and releasing the memory to those objects. Correctly designing the ownership and its transfer can be a challenge in itself.
One of the last things on my last job was a longer debugging session, examining a crash during application shutdown. It only happened in the optimized build, and only if a few specific features like logging were configured in a particular way.
The crash happened during the cleanup of a cache. The cache contained a few thousand objects, which were deleted one after another. The deletion of the 849th or so object crashed with an access violation.
Debugging in that particular environment is a pain, especially because it does not show the location of the access violation in an optimized build. Incrementally narrowing down the problem with lots of breakpoints is the only option, while the debugger itself crashes often enough, erasing any unsaved breakpoints. Yay.
However, after some time debugging it turned out that the crash happened during the destruction of a shared_ptr. The shared_ptr had ownership of one of about 30 objects of a class used to configure the business logic. Most of them could be deleted without problems, but this single one crashed. It was always the same one.
Only after recording the addresses of all of those objects and setting a breakpoint in their destructor did I notice that the destructor of this particular object got called twice. The call originated from a shared_ptr destructor both times.
I was confused. A shared_ptr is supposed to handle shared ownership of an object, so you can have multiple shared_ptrs to the same object, and only the last shared_ptr to be destroyed will destroy the object, too. So it should not happen that an object owned by a group of shared_ptrs gets deleted twice.
The cause was simple: there were two separate groups of shared_ptrs owning the same object. Once the reference counter of the first group hit 0, it destroyed the object. When the reference counter of the second group hit 0 as well, the destructor was called again, and bad things happened.
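The two independent reference counters can be reproduced in a few lines. The following is only a sketch, not the original code: Config, CountingDeleter and count_destroy_requests are hypothetical names, and the custom deleter merely counts destroy requests instead of deleting, because a real double delete is undefined behavior and would make the example unsafe to run.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for one of the configuration objects.
struct Config { int value = 42; };

// Counts how often a shared_ptr group asks to destroy the object.
// It deliberately does NOT delete, so the sketch stays well-defined.
struct CountingDeleter {
    int* calls;
    void operator()(Config*) const { ++*calls; }
};

int count_destroy_requests() {
    int calls = 0;
    Config* raw = new Config;  // legacy-style creation with new
    {
        // Two shared_ptrs built from the same raw pointer:
        // each one gets its own, independent reference counter.
        std::shared_ptr<Config> first(raw, CountingDeleter{&calls});
        std::shared_ptr<Config> second(raw, CountingDeleter{&calls});

        // A copy joins the group of the shared_ptr it was copied from.
        std::shared_ptr<Config> copy = first;
        assert(first.use_count() == 2);   // first + copy
        assert(second.use_count() == 1);  // knows nothing about the others
    }  // both groups reach zero here, so both request destruction
    delete raw;  // the one real deletion
    return calls;
}
```

With the default deleter, each of the two destroy requests would be a real delete on the same address, which is the kind of double deletion described above.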
How did it happen that there were two groups owning the same object? Those objects were created in another part of the program, using raw pointers and new. That’s not necessarily a bad thing in itself, although I would strongly discourage such practices in modern C++.
The code, however, is legacy C++98 code which has yet to be refactored to use more robust techniques like smart pointers. So, creating the objects using raw pointers was OK.
Then, however, raw pointers to those objects were requested from the object cache and used to initialize the shared_ptrs which were part of other objects. Those shared_ptrs had been introduced recently in a series of refactorings aimed at replacing an older, less stable form of ownership management. Usually, this was a one-to-one relationship, so there was always exactly one shared_ptr claiming ownership of each configuration object.
In this particular case, however, with the logging configured differently, there were two objects referring to the same configuration object, and both contained a shared_ptr that got initialized with the same raw pointer from the cache, leading to the two separate reference counters.
Lessons to learn
Object ownership has to be designed in a holistic way. You can’t have two different methods of managing object ownership at the same time (e.g. the cache and the shared_ptrs), because that will be confusing and error prone at best.
Obviously, the best approach would be to have the same method of object ownership management from beginning to the end of an object’s lifetime, but sometimes that is not feasible.
So if you have to change the ownership management at a specific point in an object’s lifetime, make sure to get that change right. Don’t leave remains of the old ownership management behind, because that will be essentially the same as having the old and new methods coexist.
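One way such a change could look, sketched under assumptions: the cache itself holds shared_ptrs and hands out copies, so every consumer joins the same reference counter. ConfigCache, Config, add, get and owners_after_two_lookups are hypothetical names for illustration only; this is not how the original code base was fixed.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a configuration object.
struct Config {
    explicit Config(int v) : value(v) {}
    int value;
};

// Hypothetical cache that owns its entries through shared_ptr
// and never exposes raw owning pointers.
class ConfigCache {
public:
    void add(const std::string& key, int value) {
        entries_[key] = std::make_shared<Config>(value);
    }

    // Returns a copy of the stored shared_ptr, so the caller shares
    // the cache's reference counter instead of creating a new one.
    std::shared_ptr<Config> get(const std::string& key) const {
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Config>> entries_;
};

// Two consumers asking for the same entry end up in one ownership group.
long owners_after_two_lookups() {
    ConfigCache cache;
    cache.add("logging", 1);
    std::shared_ptr<Config> a = cache.get("logging");
    std::shared_ptr<Config> b = cache.get("logging");
    return a.use_count();  // the cache entry, a and b
}
```

The design choice here is that there is exactly one place where ownership starts, and everything else is a copy of that shared_ptr; no second counter can appear because no raw pointer is ever wrapped again.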
Some of those different methods are specifically designed to be compatible and make the change possible without problems. For example, shared_ptr has a constructor that takes a unique_ptr. That way, you can transfer unique ownership into shared ownership, but only by moving the unique_ptr into the shared_ptr, so the unique_ptr will not have any ownership afterwards.
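A minimal sketch of that transfer, with Config and transfer_leaves_unique_empty as hypothetical names chosen for illustration:

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for an owned object.
struct Config { int value = 42; };

bool transfer_leaves_unique_empty() {
    std::unique_ptr<Config> unique(new Config);

    // The constructor only accepts an rvalue, so std::move is required;
    // copying from an lvalue unique_ptr would not compile.
    std::shared_ptr<Config> shared = std::move(unique);

    // After the move, the unique_ptr is empty and the shared_ptr
    // is the sole owner.
    return unique == nullptr
        && shared.use_count() == 1
        && shared->value == 42;
}
```

Because the old owner is emptied as part of the transfer, there is never a moment where both smart pointers believe they own the object.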
Be extremely careful when (re-)designing ownership management for your objects. Check twice when you change the management during an object’s lifetime.