Covariance with Smart Pointers

Covariance can be a useful concept, e.g. when implementing the abstract factory design pattern. However, in modern C++ we should return smart pointers, and the compiler does not treat those as covariant the way it treats raw pointers.

Abstract factory

I will not go into too much detail about the abstract factory pattern, as it is not the point of this article. You can best look it up in the “Gang of Four” book or on the web. For the code in this post I will borrow the very popular example also used on Wikipedia:

Consider an abstract factory returning equally abstract widgets for our GUI. Those widgets can be buttons, text fields, dropdown boxes etc. Depending on the GUI framework you use (e.g. differing by operating system), a concrete factory creates concrete implementations of the widgets.

Abstract factory with smart pointers

I have written about factories returning smart pointers earlier. For this example I’ll take the simpler alternative and use std::unique_ptr. Our basic code may look roughly like this:

#include <iostream>
#include <memory>

struct AbstractButton {
  virtual void click() = 0;
  virtual ~AbstractButton() = default;
};
struct AbstractWidgetFactory {
  virtual std::unique_ptr<AbstractButton> createButton() const = 0;
  virtual ~AbstractWidgetFactory() = default;  // factories are used through base pointers
};

struct FancyButton : AbstractButton {
  void click() final override { std::cout << "You clicked me, I'm so fancy!\n"; }
};
struct FancyWidgetFactory : AbstractWidgetFactory {
  std::unique_ptr<AbstractButton> createButton() const final override {
    return std::make_unique<FancyButton>();  
  }
};

int main() {
  std::shared_ptr<AbstractWidgetFactory> theWidgetFactory = std::make_shared<FancyWidgetFactory>();
   
  auto theButton = theWidgetFactory->createButton();
  theButton->click();
}

A need for covariance

Let’s assume that our factories get some more features. For example, we could have a functionality that creates a simple message window with an “OK” button.

std::unique_ptr<AbstractWindow> createMessageWindow(std::string const& text) {
  auto theWindow = theWidgetFactory->createWindow();
  theWindow->addText(text);
 
  auto theButton = theWidgetFactory->createButton();
  theButton->setText("OK");
  theWindow->add(std::move(theButton));
  return theWindow;  
}

This is pretty abstract, and given the proper interface on AbstractButton and AbstractWindow it is completely agnostic of the concrete classes we have. But what if there are specialties for message windows?
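
To make “the proper interface” concrete, here is a minimal sketch of that generic algorithm. The members of `AbstractButton` and `AbstractWindow`, the `Plain*` classes, `buttonCount` and `demoMessageWindow` are assumptions made up for illustration, and the factory is passed in as a parameter so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal interfaces; the post only implies these members.
struct AbstractButton {
  virtual void setText(std::string const& text) = 0;
  virtual ~AbstractButton() = default;
};
struct AbstractWindow {
  virtual void addText(std::string const& text) = 0;
  virtual void add(std::unique_ptr<AbstractButton> button) = 0;
  virtual std::size_t buttonCount() const = 0;  // assumed helper for checking
  virtual ~AbstractWindow() = default;
};
struct AbstractWidgetFactory {
  virtual std::unique_ptr<AbstractButton> createButton() const = 0;
  virtual std::unique_ptr<AbstractWindow> createWindow() const = 0;
  virtual ~AbstractWidgetFactory() = default;
};

// The algorithm only talks to the abstract interfaces.
std::unique_ptr<AbstractWindow> createMessageWindow(
    AbstractWidgetFactory const& factory, std::string const& text) {
  auto theWindow = factory.createWindow();
  theWindow->addText(text);

  auto theButton = factory.createButton();
  theButton->setText("OK");
  theWindow->add(std::move(theButton));
  return theWindow;
}

// One concrete widget family, only there to exercise the algorithm.
struct PlainButton : AbstractButton {
  void setText(std::string const& text) override { text_ = text; }
private:
  std::string text_;
};
struct PlainWindow : AbstractWindow {
  void addText(std::string const& text) override { texts_.push_back(text); }
  void add(std::unique_ptr<AbstractButton> button) override {
    buttons_.push_back(std::move(button));
  }
  std::size_t buttonCount() const override { return buttons_.size(); }
private:
  std::vector<std::string> texts_;
  std::vector<std::unique_ptr<AbstractButton>> buttons_;
};
struct PlainWidgetFactory : AbstractWidgetFactory {
  std::unique_ptr<AbstractButton> createButton() const override {
    return std::make_unique<PlainButton>();
  }
  std::unique_ptr<AbstractWindow> createWindow() const override {
    return std::make_unique<PlainWindow>();
  }
};

// Usage sketch: call this from main() to run the algorithm once.
void demoMessageWindow() {
  PlainWidgetFactory factory;
  auto window = createMessageWindow(factory, "Hello");
  assert(window->buttonCount() == 1);
}
```

Nothing in `createMessageWindow` names a concrete class, which is exactly why it cannot reach any `FancyButton`-only functionality.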

If we implement that algorithm in the FancyWidgetFactory we do not gain much, because createButton still returns a unique_ptr<AbstractButton>. We know that it is in fact a FancyButton, but we cannot use that unless we apply some ugly downcasts.

std::unique_ptr<AbstractWindow> FancyWidgetFactory::createMessageWindow(std::string const& text) {
  auto theWindow = createWindow();
  theWindow->addText(text);
 
  auto theButton = createButton(); //unique_ptr<AbstractButton>
  static_cast<FancyButton*>(theButton.get())->doFancyStuff(); //EWW!
  theButton->setText("OK");
  theWindow->add(std::move(theButton));
  return theWindow;  
}

Covariance with raw pointers

In the olden days, when open fire was considered more romantic than dangerous, we would use raw pointers as return values from our factories. The callers would have to deal with ownership management, failing and burning down the house on a regular basis.

In those days, covariant return values were easy: A virtual function that returns a (raw) pointer can be overridden by a function that returns a pointer to a more derived class:

AbstractButton* OldschoolAbstractWidgetFactory::createButton();
FancyButton* OldschoolFancyWidgetFactory::createButton();

Since a FancyButton is an AbstractButton, this makes perfect sense, and the compiler knows that. With smart pointers it’s not so easy, since for the compiler they are only templates instantiated with two classes that happen to be related.

That relation does not transfer to the template instantiations, since it usually makes no sense. A std::vector<Base> is not related to a std::vector<Derived> the way a Base* is related to a Derived*.
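
The difference can be checked at compile time. The following is a small sketch, not the post's original code: `fanciness` and `demoRawCovariance` are made-up names, and the button classes are reduced to the bare minimum. It shows a covariant raw-pointer override next to what the type traits say about the template instantiations.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

struct AbstractButton {
  virtual ~AbstractButton() = default;
};
struct FancyButton : AbstractButton {
  int fanciness() const { return 42; }  // hypothetical FancyButton-only member
};

struct OldschoolAbstractWidgetFactory {
  virtual AbstractButton* createButton() const = 0;  // caller owns the result
  virtual ~OldschoolAbstractWidgetFactory() = default;
};
struct OldschoolFancyWidgetFactory : OldschoolAbstractWidgetFactory {
  // Covariant return type: FancyButton* may override AbstractButton*.
  FancyButton* createButton() const override { return new FancyButton; }
};

// Called on the derived factory, the static type is the derived pointer.
static_assert(
    std::is_same<decltype(std::declval<OldschoolFancyWidgetFactory const&>()
                              .createButton()),
                 FancyButton*>::value,
    "raw pointers are covariant");

// unique_ptr<FancyButton> converts to unique_ptr<AbstractButton> ...
static_assert(std::is_convertible<std::unique_ptr<FancyButton>,
                                  std::unique_ptr<AbstractButton>>::value,
              "conversion exists");
// ... but the two instantiations are not related by inheritance,
static_assert(!std::is_base_of<std::unique_ptr<AbstractButton>,
                               std::unique_ptr<FancyButton>>::value,
              "no inheritance between instantiations");
// and for std::vector there is not even a conversion.
static_assert(!std::is_convertible<std::vector<FancyButton>,
                                   std::vector<AbstractButton>>::value,
              "vectors are unrelated");

// Usage sketch: wrap the raw pointer right away so nothing leaks.
void demoRawCovariance() {
  OldschoolFancyWidgetFactory fancyFactory;
  std::unique_ptr<FancyButton> button(fancyFactory.createButton());
  assert(button->fanciness() == 42);
}
```

The conversion between the two `unique_ptr` instantiations is an ordinary library constructor. The rule for covariant overrides only looks at pointer and reference types, so it never takes that conversion into account.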

Achieving covariance with smart pointers

So now we know the problem. How do we solve it with the means the language allows us? Let’s analyze the situation:

  • We want `createButton` on an `AbstractWidgetFactory` to return something that holds a button. Which concrete button that will be depends on the concrete factory.
  • We want `createButton` on a `FancyWidgetFactory` to return something that holds a `FancyButton`, so we don’t need to cast.
  • We want to have smart pointers, but those are not considered covariant by the language.

The last point leads us to a simple conclusion: if we want the first two points to be true, createButton simply cannot be virtual. The solution is, as it is so often, another layer of indirection. We can just give the factory classes a nonvirtual interface and let the virtual call take place in another function:

struct AbstractWidgetFactory {
  std::unique_ptr<AbstractButton> createButton() const {
    return doCreateButton();
  }
  // ...
private:
  virtual std::unique_ptr<AbstractButton> doCreateButton() const = 0;
};

struct FancyWidgetFactory : AbstractWidgetFactory {
  std::unique_ptr<FancyButton> createButton() const {
    return std::make_unique<FancyButton>();  
  }
  // ...
private:
  std::unique_ptr<AbstractButton> doCreateButton() const final override {
    return createButton();
  }
};

We can now write the creation of our fancy message window without any ugly casts:

  std::unique_ptr<AbstractWindow> createMessageWindow(std::string const& text) final override {
    auto theWindow = createWindow();
    theWindow->addText(text);
 
    auto theButton = createButton(); //unique_ptr<FancyButton>
    theButton->doFancyStuff();       //no more casts
    theButton->setText("OK");
    theWindow->add(std::move(theButton));
    return theWindow;  
  }  

All this just works because std::unique_ptr to derived classes can always be converted into std::unique_ptr to their base class. Since this also applies to std::shared_ptr the same pattern can be used to achieve covariance with those.
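
As a sketch of the std::shared_ptr variant mentioned above: the factory classes follow the pattern from this post, while `isFancy` and `demoSharedCovariance` are hypothetical names added only so the result can be checked.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct AbstractButton {
  virtual bool isFancy() const { return false; }  // assumed helper for checking
  virtual ~AbstractButton() = default;
};
struct FancyButton : AbstractButton {
  bool isFancy() const override { return true; }
};

struct AbstractWidgetFactory {
  std::shared_ptr<AbstractButton> createButton() const {
    return doCreateButton();
  }
  virtual ~AbstractWidgetFactory() = default;
private:
  virtual std::shared_ptr<AbstractButton> doCreateButton() const = 0;
};

struct FancyWidgetFactory : AbstractWidgetFactory {
  std::shared_ptr<FancyButton> createButton() const {
    return std::make_shared<FancyButton>();
  }
private:
  std::shared_ptr<AbstractButton> doCreateButton() const final override {
    // shared_ptr<FancyButton> converts implicitly to shared_ptr<AbstractButton>.
    return createButton();
  }
};

// The static return type depends on the static type of the factory.
static_assert(
    std::is_same<decltype(std::declval<FancyWidgetFactory const&>()
                              .createButton()),
                 std::shared_ptr<FancyButton>>::value,
    "derived factory hands out the derived smart pointer");
static_assert(
    std::is_same<decltype(std::declval<AbstractWidgetFactory const&>()
                              .createButton()),
                 std::shared_ptr<AbstractButton>>::value,
    "base factory hands out the base smart pointer");

// Usage sketch: both routes end up creating a FancyButton.
void demoSharedCovariance() {
  FancyWidgetFactory fancy;
  AbstractWidgetFactory const& base = fancy;
  assert(fancy.createButton()->isFancy());
  assert(base.createButton()->isFancy());
}
```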

But there’s a problem

As was discussed by rhalbersma in the comments, having the nonvirtual createButton method redefined in the derived class can lead to several issues. The most important one is that the behavior can be surprising to users, which never is a good thing.
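
One way such a surprise can show up is sketched below. It assumes, purely for illustration, that the base class later adds some bookkeeping to its nonvirtual createButton (a call counter standing in for logging). `loggedCalls` and `demoSurprise` are made-up names.

```cpp
#include <cassert>
#include <memory>

struct AbstractButton {
  virtual ~AbstractButton() = default;
};
struct FancyButton : AbstractButton {};

struct AbstractWidgetFactory {
  std::unique_ptr<AbstractButton> createButton() const {
    ++loggedCalls_;  // stand-in for logging or other bookkeeping
    return doCreateButton();
  }
  int loggedCalls() const { return loggedCalls_; }
  virtual ~AbstractWidgetFactory() = default;
private:
  virtual std::unique_ptr<AbstractButton> doCreateButton() const = 0;
  mutable int loggedCalls_ = 0;
};

struct FancyWidgetFactory : AbstractWidgetFactory {
  // Hides the base class function instead of overriding it.
  std::unique_ptr<FancyButton> createButton() const {
    return std::make_unique<FancyButton>();
  }
private:
  std::unique_ptr<AbstractButton> doCreateButton() const final override {
    return createButton();
  }
};

// Usage sketch: the same object behaves differently depending on the
// static type through which createButton is called.
void demoSurprise() {
  FancyWidgetFactory fancy;
  AbstractWidgetFactory const& base = fancy;
  base.createButton();   // base interface: the call is counted
  fancy.createButton();  // statically bound to the hiding function: not counted
  assert(fancy.loggedCalls() == 1);
}
```

Both calls create a FancyButton, but only the one made through the base reference runs the bookkeeping.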

The straightforward solution is to rename the method in the derived class, e.g. to createFancyButton. That way the overall functionality remains, although it is more explicit and less surprising. This may not be the “true form” of covariance any more, but that is the kind of compromise we have to make.
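
A minimal sketch of the renamed variant could look like this. `doFancyStuff` returns a bool here only so the example can check it, and `demoRenamed` is a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct AbstractButton {
  virtual ~AbstractButton() = default;
};
struct FancyButton : AbstractButton {
  bool doFancyStuff() { return true; }  // stand-in for FancyButton-only behavior
};

struct AbstractWidgetFactory {
  std::unique_ptr<AbstractButton> createButton() const {
    return doCreateButton();
  }
  virtual ~AbstractWidgetFactory() = default;
private:
  virtual std::unique_ptr<AbstractButton> doCreateButton() const = 0;
};

struct FancyWidgetFactory : AbstractWidgetFactory {
  // Distinct name: nothing inherited from the base class is hidden.
  std::unique_ptr<FancyButton> createFancyButton() const {
    return std::make_unique<FancyButton>();
  }
private:
  std::unique_ptr<AbstractButton> doCreateButton() const final override {
    return createFancyButton();
  }
};

// createButton keeps its base class meaning, even on the derived factory.
static_assert(
    std::is_same<decltype(std::declval<FancyWidgetFactory const&>()
                              .createButton()),
                 std::unique_ptr<AbstractButton>>::value,
    "createButton is the inherited function");
static_assert(
    std::is_same<decltype(std::declval<FancyWidgetFactory const&>()
                              .createFancyButton()),
                 std::unique_ptr<FancyButton>>::value,
    "createFancyButton returns the concrete type");

// Usage sketch: callers who need the concrete type ask for it explicitly.
void demoRenamed() {
  FancyWidgetFactory fancy;
  auto theButton = fancy.createFancyButton();  // unique_ptr<FancyButton>
  assert(theButton->doFancyStuff());           // no cast needed
}
```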

You can find the full code on my GitHub repository.

Conclusion

If you really need covariance with smart pointers, it is manageable, although you have to add that extra layer of indirection. There should be better alternatives though, since C++ is not (only) an object-oriented language.

There obviously is no perfect solution to the problem, but I hope I could show a possible approach to problems like this: If there is no single functionality that provides what we need, we can try to add another layer of indirection and combine the layers to produce the desired result.

Thanks to Jason Turner and Joshua Ogunyinka for bringing this topic up on Twitter recently.

20 Comments


  1. I don’t understand why having

    unique_ptr createButton() {
    return make_unique();
    }
    in FancyWidgetFactory will return a unique pointer to abstract button and have to do some indirect interface for that. What about
    unique_ptr createButton() {
    return make_unique();
    }
    why don’t we have to go through indirect interface to get the FancyWindow and not the AbstractWindow?

    1. It seems your template parameters have been eaten. Could you use markdown backticks or code tags to fix that?

  2. This is a second try. The contents of the template brackets were removed from my previous post on submission, which makes the text unreadable. I avoid using template brackets in this post.

    The root problem is the fundamental concept of templates introduced by C++. Why can std::unique_ptr of FancyButton not be a subtype of std::unique_ptr of AbstractButton? Because in std::unique_ptr of AbstractButton the type parameter is already bound to AbstractButton, which makes it impossible for any subtype to re-bind the type parameter to a subtype of AbstractButton. In fact, AbstractWidgetFactory should return a smart pointer whose type parameter is not bound to AbstractButton, because an instance of AbstractButton does not exist at all. It should have an output type something like “template of ButtonType on std::unique_ptr of ButtonType where ButtonType is a subtype of AbstractButton”. Then you could have a subtype of std::unique_ptr with FancyButton. However, in C++, a template type is not a type, so there is no way to have this subtype relation. Well, you may make createButton a template function and use ‘concept’ and ‘requires’ to constrain the type parameter for the smart pointer. But this makes the whole thing much more complicated than necessary. Raw pointers and references are treated specially in C++: even when the referenced type is unbound, the raw pointer or reference is still considered a type. This makes the concept of a smart pointer inconsistent with raw pointers and references.

  3. The root problem is the fundamental concept of templates introduced by C++. Why can std::unique_ptr<FancyButton> not be a subtype of std::unique_ptr<AbstractButton>? Because in std::unique_ptr<AbstractButton> the type parameter is already bound to AbstractButton, which makes it impossible for any subtype to re-bind the type parameter to a subtype of AbstractButton. In fact, AbstractWidgetFactory should return a smart pointer whose type parameter is not bound to AbstractButton, because an instance of AbstractButton does not exist at all. It should have an output type something like “template<ButtonType> std::unique_ptr<ButtonType> where ButtonType is a subtype of AbstractButton”. Then you could have a subtype of std::unique_ptr of it. However, in C++, a template type is not a type, so there is no way to have this subtype relation. Well, you may make createButton a template function and use ‘concept’ and ‘requires’ to constrain the type parameter for the smart pointer. But this makes the whole thing much more complicated than necessary. Raw pointers and references are treated specially in C++: even when the referenced type is unbound, the raw pointer or reference is still considered a type. This makes the concept of a smart pointer inconsistent with raw pointers and references.

  4. Why is covariance not allowed for smart pointers? Is it hard for the compiler?

    1. Smart pointers are a pure library construct, while covariance for overridden virtual functions is a language feature. One could extend the compiler to allow covariant unique_ptr and shared_ptr, but that wouldn’t solve the problem for handwritten smart pointers. In principle, the compiler can’t possibly decide what is a pointer and what isn’t.

  5. This issue is more general than factory functions. The covariance requirement is for a transform F, such that if S < T, then F(S) < F(T).

    This problem litters C++ because the type system is badly constructed. Here is another example: const. Const doesn’t propagate through templates.

    The heart of this problem is that nominal typing is specifically designed to require manual specification of all operations. If you want something done automatically you have to use structural typing with a carefully crafted minimal set of combinators.

    So what can you do? If you want covariant smart pointers, then you just have to implement the conversions. That is the whole intent of nominal typing. And you have to apply them as well. Can you do it? Well, actually it’s trivial: for some “smart” pointer type P with a method get()->T* and a method set(T) you can obviously just grab the pointer, do the cast, and set the pointer in the result object. Templates can’t be type checked, so you can write the code. If you use it where the conversion S* -> T* doesn’t exist, your code will break on instantiation. So all you need now is to make that a constructor and you’re done.

  6. Typo here: AnstractWidgetFactory

  7. I actually doubt that a Factory should return a smart pointer. Smart pointers represent the RAII idiom and one should not make every pointer smart. Smart pointers are the source of resource management. Factory itself obviously does not hold that resource so it should not take responsibility for resource management. Moreover, callers may have their own resource management schemes, so wrapping the pointer just to unwrap it the next step is a bit strange.

    1. The factory does not hold the resource, but it creates it. Therefore it should not return raw pointers but smart pointers. unique_ptr does not have overhead compared to owning raw pointers, so it is a good default. Any good custom resource management has built in ways to take over ownership from unique_ptr so this should not be an issue either.

  8. Minor correction: in “Achieving covariance with smart pointers” you need the factory on this line:

    auto theButton = createButton(); //unique_ptr<FancyButton>

  9. I never understood this problem very well, or the language design about this particular corner or the real uses.

    But isn’t it telling us that factories should still return raw pointers and that it is your responsibility to use a unique_ptr to capture the value? (the real type information is still inside the unique_ptr)

    As an orthogonal justification, why should the factory know that you want a unique_ptr and not a shared_ptr? Perhaps raw pointers still have their uses, for example as return values.

    1. Having raw pointers returned from the factory is a design that may lead to exception safety issues. Think for example someFunc(factory.createButton(), factory.createWindow()). I discussed returning smart pointers from factory functions in another post here.

  10. Sorry for an unfortunate error in my previous comment, please replace all occurrences of createWindow to createButton (which is the one you redefine). The argument itself remains the same.

    Another argument from Effective C++, Item 36: “Never redefine an inherited non-virtual function”. You get different behavior if you (or your users) accidentally create a reference/pointer of type AbstractWidgetFactory to your theWidgetFactory (because createButton is statically bound).

    1. Hi rhalbersma,
      thank you for pointing out those issues. Some I would argue against, but the most compelling argument is the surprise for the user. I’ll add an update to the post.

  11. Thanks for sharing this useful technique.
    It is a nice alternative to using a template function, that would be specialized for fancy widgets.

    Reply

  12. Nice blog post! One nitpick: FancyWidgetFactory::createButton hides the name of AbstractWidgetFactory::createButton. It’s probably better to rename it to FancyWidgetFactory::createFancyButton, and also to make it a private helper function.

    1. Hi, and thanks for your comment.
      The whole point of createButton in the derived class is to override the behavior of the one in the base class by hiding its name. That way you mimic true covariant functions by having a function with the same name returning a more derived pointer.

      1. I understand the mechanism but I think it is fragile and underhanded.

        It is fragile because redefining names from a base class in a derived class hides all names from the base. E.g. if AbstractWidgetFactory would add another overload createWindow(int width, int height), then this would be hidden in FancyWidgetFactory. (I suspect that is why you have a createMessageWindow(const char*) instead of an overload createWindow(const char*))

        To remedy this, you could write a using AbstractWidgetFactory::createWindow inside FancyWidgetFactory. If you want to guard your FancyWidgetFactory against future overload extensions in AbstractWidgetFactory, then you must write these using statements already today.

        The mechanism is also underhanded because you violate the is-a relationship between the base and derived factory class. E.g. if AbstractWidgetFactory::createWindow uses the NVI to do logging before calling do_createWindow, then you miss that in the derived class.

        I can’t think of an easy way to remedy the NVI, except to manually add logging in your redefined createWindow. This makes your derived factory vulnerable to future updates of the abstract factory (the logging mechanism might even be a private implementation detail of the base class, and you would have to guess about its exact reimplementation, especially if you don’t have the source of the base factory).

        It’s a pity that covariant return types don’t extend to smart pointers, but the combination of using public inheritance for your factory and redefining member functions is surprising and dangerous to users. IMO, they would be better served by having to explicitly call FancyWidgetFactory::createFancyWindow instead if they insist on receiving a unique_ptr<FancyWindow>.
