shared_ptr Versus unique_ptr in Factory Functions

When it comes to factory functions, there is often a dispute about which kind of smart pointer to return. As it turns out, the choice depends on the circumstances, so here’s a list of pros and cons.

Alternatives

Before I jump right in, I’d like to get alternative return types right out of the way. Some of them may be a viable option in rare circumstances, but I’ll focus on the more general cases.

Raw pointers with ownership. Don’t. Unless you can’t use a recent library that has smart pointers. And even then, roll your own smart pointer, upgrade your compiler, whatever. Manage your object ownership responsibly.

References and pointers without ownership. If the caller does not own the produced object, the factory needs to assign the ownership to another unrelated object. (Managing ownership in the factory itself would violate the Single Responsibility Principle.) That in turn can lead to dangling pointers or references, so it’s hardly an option.

Single-element containers like `optional`, `variant` and `any` each have a drawback: they are restricted to one or a handful of classes, or the user needs to know the class of the stored object, or they don’t support runtime polymorphism. In factory functions, any of these three impediments is usually a problem.

`std::auto_ptr` has been deprecated since C++11 and was removed in C++17. Use `std::unique_ptr` instead.

User-defined smart pointers may be an option in some cases. However, they force users of the factory to use those smart pointers as well, whereas the standard library smart pointers are ubiquitous, making functions that return them usable in a wider context. In addition, if you define a smart pointer class for ownership management, try to provide a conversion from `std::unique_ptr`.

unique_ptr: the pros

Which brings us to one of the strengths of `unique_ptr`:

  • It has the ability to release ownership, so you can give it to practically any other smart pointer, including `shared_ptr`, and to other means of ownership management like Boost’s pointer containers.
  • It has (close to) zero overhead. The ownership semantics of `unique_ptr` are very simple, so there is no costly ownership management happening behind the scenes. Resetting the original to null in a move operation is about all there is to it. In terms of space, a normal `unique_ptr` without a custom deleter is usually just a pointer, nothing more.
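To make the first point concrete, here is a minimal sketch, assuming C++14 for `std::make_unique`. The names `Shape`, `Triangle`, `makeShape`, `makeSharedShape` and `cornersViaRawOwner` are made up for illustration. It shows a factory that returns `unique_ptr`, a caller that turns the result into a `shared_ptr`, and a caller that takes the object out with `release()`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class hierarchy, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int corners() const = 0;
};

struct Triangle : Shape {
    int corners() const override { return 3; }
};

// The factory commits to unique ownership only.
std::unique_ptr<Shape> makeShape() {
    return std::make_unique<Triangle>();
}

// A caller that wants shared ownership: the temporary unique_ptr
// converts implicitly into a shared_ptr.
std::shared_ptr<Shape> makeSharedShape() {
    return makeShape();
}

// A caller that hands the object to some other owner via release().
int cornersViaRawOwner() {
    std::unique_ptr<Shape> owner = makeShape();
    Shape* raw = owner.release();  // owner is null now; raw owns the object
    assert(owner == nullptr);
    int n = raw->corners();
    delete raw;                    // the new owner is responsible for cleanup
    return n;
}
```

In real code the raw pointer from `release()` would go straight into another owning object rather than being deleted by hand; the manual `delete` only keeps the sketch self-contained.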

unique_ptr: the cons

The simplicity of `unique_ptr` can become a weakness when the factory function is not as simple. Basically everything that is beyond “`new` and forget” can be problematic.

  • If you have complex memory management in place you may need a custom deleter as one of `unique_ptr`’s template parameters. That way, the deleter type gets exposed to any code using the `unique_ptr`, creating additional dependencies.
  • If your factory function uses caching, `unique_ptr`’s possessive nature can become a problem. You can’t just store plain pointers in your cache, since the objects may get deleted, leaving you with dangling pointers. If you want your cache to get notified of the destruction, you have to use that custom deleter I just talked about.
  • The implicit conversion from `unique_ptr` to `shared_ptr` may give you a great deal of flexibility, but it comes at the cost of having to allocate a second piece of memory for the shared count. Similar performance penalties may apply if you give ownership to other smart pointers. So keep that in mind if the factory function and its surroundings are likely to be a performance bottleneck.
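The deleter problem from the first two points can be sketched as follows. `Widget`, `PoolDeleter`, `releasedCount` and `makePooledWidget` are hypothetical names, and the counter only stands in for whatever a real deleter would do, such as returning memory to a pool or notifying a cache.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical widget type, for illustration only.
struct Widget { int id; };

// Counts how often the custom deleter ran.
inline int& releasedCount() {
    static int count = 0;
    return count;
}

struct PoolDeleter {
    void operator()(Widget* w) const {
        ++releasedCount();  // stand-in for "notify the cache" or "return to pool"
        delete w;
    }
};

// The deleter is part of the pointer type, so every caller of the
// factory now depends on PoolDeleter as well.
using PooledWidgetPtr = std::unique_ptr<Widget, PoolDeleter>;

PooledWidgetPtr makePooledWidget(int id) {
    return PooledWidgetPtr(new Widget{id});
}

// This is not the same type as a plain std::unique_ptr<Widget>.
static_assert(!std::is_same<PooledWidgetPtr, std::unique_ptr<Widget>>::value,
              "the deleter changes the pointer type");
```

Any function that wants to accept the result has to spell out `std::unique_ptr<Widget, PoolDeleter>` or be a template, which is the additional dependency mentioned above.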

shared_ptr: the pros

Where `unique_ptr` has its weaknesses, `shared_ptr` can shine and vice versa. So here is the flip side of the above, apart from the obvious difference between shared and unique ownership:

  • Returning a `shared_ptr` from the beginning enables you to use `make_shared`, which performs a single memory allocation for both the shared count and the object itself.
  • Caching is easily done with `shared_ptr`’s brother `weak_ptr`, which takes no part in the shared ownership but knows whether the object has already been destroyed or is still alive.
  • Last but not least, `shared_ptr` uses type erasure to hide the deleter on the free store together with the shared count. That way users of the `shared_ptr` are only dependent on the stored type itself.
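A short sketch of the last two points, again with hypothetical names (`Widget`, `destroyedCount`, `makeWidget`, `makeTrackedWidget`, `isAliveAfterOwnerDies`): a factory with a custom deleter has exactly the same return type as one without, and a `weak_ptr` can tell whether the object is still around.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical widget type, for illustration only.
struct Widget {
    explicit Widget(int i) : id(i) {}
    int id;
};

// Counts how often the custom deleter ran.
inline int& destroyedCount() {
    static int count = 0;
    return count;
}

// Plain factory; make_shared typically needs just one allocation.
std::shared_ptr<Widget> makeWidget(int id) {
    return std::make_shared<Widget>(id);
}

// Factory with a custom deleter. The deleter is type-erased,
// so the return type is the same as above.
std::shared_ptr<Widget> makeTrackedWidget(int id) {
    return std::shared_ptr<Widget>(new Widget(id), [](Widget* w) {
        ++destroyedCount();
        delete w;
    });
}

// A weak_ptr observes the object without keeping it alive.
bool isAliveAfterOwnerDies() {
    std::weak_ptr<Widget> observer;
    {
        auto owner = makeWidget(1);
        observer = owner;
        assert(!observer.expired());
    }  // the last owner goes out of scope here
    return !observer.expired();
}
```

Note that the custom deleter variant cannot use `make_shared`, so it pays for the separate allocation of the bookkeeping object.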

shared_ptr: the cons

All the nice things `shared_ptr` can do for us come at some cost of course:

  • `shared_ptr` needs to track the number of pointers to the same object, so copy operations etc. are nontrivial and less performant than operations on a `unique_ptr`.
  • The number of `shared_ptr`s, `weak_ptr`s and the deleter have to be managed per object on the free store, so if you don’t use `make_shared` you have the overhead of an additional allocation and deallocation for each object in typical implementations.
  • Besides the small overhead in space per object for the bookkeeping, a `shared_ptr` needs access not only to the pointee but also to the bookkeeping object. Therefore it usually contains at least two pointers, making it at least twice as large as a basic `unique_ptr`.
  • Once you have committed to shared ownership there is no going back. It is not possible for a `shared_ptr` to release ownership since it might be in fact shared, so a single `shared_ptr` does not have the right to give the ownership away.
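If you want to see the bookkeeping on your own platform, a small sketch like the following can help. The helper names are made up, and the sizes are implementation details rather than guarantees of the standard, so treat the numbers as typical, not as fixed.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical widget type, for illustration only.
struct Widget { int id = 0; };

// Typically the size of one raw pointer.
inline std::size_t uniqueSize() { return sizeof(std::unique_ptr<Widget>); }

// Typically the size of two raw pointers: pointee plus bookkeeping object.
inline std::size_t sharedSize() { return sizeof(std::shared_ptr<Widget>); }

// Every copy has to update the shared count.
long ownersAfterCopy() {
    auto first = std::make_shared<Widget>();
    auto second = first;       // increments the reference count
    return first.use_count();  // two owners at this point
}

// There is no shared_ptr::release(); the closest thing is reset(),
// which only gives up this one pointer's share of the ownership.
```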

Conclusion

The differences in performance and memory footprint of `shared_ptr` and `unique_ptr` are relatively small but can be notable, especially if you do not use `make_shared`. However, the rule is, as always, “measure first, optimize after”. What really matters are the semantic differences.

As a default, consider using `unique_ptr`, since it gives you more flexibility and simplicity. If the factory function becomes more complex or if you know from the start that you will need shared ownership or `weak_ptr`, return `shared_ptr`.


5 Comments

  1. StefanFlorea

    Hi, very nice article!

    It helped me a lot but I am still a bit confused about using unique_ptr and caching the results.

    Could you tell me why this would be dangerous?

    std::unique_ptr<Widget> createCachedWidget(int id)
    {
        static std::map<int, Widget*> cache;
    
        Widget* widget = cache[id];
        std::unique_ptr<Widget> widgetOjb;
    
        if  (!widget) // (1)
        {
            widgetOjb = std::make_unique<Widget>(id);
            cache[id]   = widgetOjb.get();
        }
        else
        {
            std::cout << "Widget already in cache" << "\n";
        }
        return widgetOjb;
    }

    The alternative would be to return a shared_ptr and cache it using weak_ptr as it gives me the ability to check for dangling pointers… but I would still have to lock it and check if it is not null. Same as (1)

    So my question would be: isn’t checking for nullptr on a unique pointer the same as having a weak_ptr to a shared_ptr that I lock and check for null?

    Thanks a lot!
    Stefan F.

    1. Arne Mertz

      Hi Stefan, thanks for your question and sorry for the formatting problems, I corrected that for you.

      Your example has a serious problem: If there is a cached object, you return a null pointer. Usually a client would expect to get either the cached object or a fresh one, but not null. Getting a null would basically be like “hey, I remember that I created the object you need in the past. No idea if it still exists or not. I have a pointer to it in my cache, but I won’t tell. Go look for it elsewhere.”

      If we try to fix that, we get different problems: The pointer in the cache might be dangling, and we have no way of knowing. We can fix that by removing pointers from the cache when the object is destroyed (e.g. using a custom deleter). But even if we knew for sure it wasn’t dangling, the `unique_ptr` that got returned the first time we called the function is the only pointer having ownership of the object, and we cannot create a second `unique_ptr` for it.

      To make things short: We do caching because we want to access the created object from more than one location. That in turn means we need shared ownership. The `shared_ptr/weak_ptr` alternative of your example would look like this:

      std::shared_ptr<Widget> createCachedWidget(int id)
      {
          static std::map<int, std::weak_ptr<Widget>> cache;
      
          auto widget = cache[id].lock(); //1
          if  (!widget)
          {
              widget = std::make_shared<Widget>(id);
              cache[id] = widget;
          }
          return widget;
      }

      The marked line will give an empty `shared_ptr` if the cache has not contained an object for the id until then (because the map will create an empty `weak_ptr` for you), as well as if the `weak_ptr` has expired.

      1. StefanFlorea

        Hi Arne,

        Thank you for your very fast reply!

        I think this line was the eye opener for me “We do caching because we want to access the created object from more than one location”.

        And it makes a lot of sense that objects in the cache would be accessed from more than one location. If not, caching would not be needed and I would just return a non-cached unique_ptr.

        Thanks again,
        It was a pleasure to discover your blog 🙂

  2. David Haim

    I’ve been thinking about it a bit.
    I think that in this specific case – we can have 2 more options:

    1. Return a raw pointer. Yes! This way, we are basically saying: “it is not the factory’s job to decide how to manage the resource; catch it with either a unique_ptr or a shared_ptr”. This may seem a bit controversial, but it is nonetheless logical. The developer then assigns the pointer to whatever kind of ownership they decide on.

    2. Use templates to let the developer decide how the resource is returned.
    for example, the factory method may look like:
    auto resource = Factory::produce(args…);
    This way the developer also chooses how the resource is managed, and the ownership is granted within the factory itself.

    1. Arne Mertz

      Hi David, thanks for your thoughts. While the first option may be intriguing at first, returning raw pointers with ownership has its problems and is easy to use in a way that loses exception safety. Using a template parameter to decide which type of pointer to return can indeed be an option though, if there really is no sensible single type you can use. I’d consider it as the second best option to having a specific return type, because it is more complex and means you allow clients to instantiate the template with a return type it has not been tested with.

