Lambdas Part 2: Capture Lists and Stateful Closures

In the last post of my series about (relatively) new C++ features I introduced lambda expressions, which define and create function objects on the fly. I left a few details untouched, such as what the capture list is and how it works, details about several closure type member functions that only make sense if the closure has member variables, and how to solve problems that require function objects with state.

Luckily, all these things fit together nicely: you can give closure objects state by giving them member variables, and with member variables the member function details make sense. You might have guessed it: this is achieved by the capture list.

Back to examples

Let’s recap the C++03 example problem I did not solve in the last post:

struct HasLessGoldThan {
  unsigned threshold;
  bool operator()(Hero const& hero) const {
    return hero.inventory().gold() < threshold;
  }
  HasLessGoldThan(unsigned ui) : threshold(ui) {}
};
 
vector<Hero> heroes;
//...
vector<Hero>::iterator newEnd = remove_if(heroes.begin(), heroes.end(), HasLessGoldThan(5u));
heroes.erase(newEnd, heroes.end());

This can in fact be solved with a stateless lambda expression:

vector<Hero> heroes;
//...
auto newEnd = remove_if(begin(heroes), end(heroes), 
  [](Hero const& hero){
    return hero.inventory().gold() < 5u;    
  }
);
heroes.erase(newEnd, heroes.end());

The crux with this code is that we encoded the constant `5u` directly into the lambda. What if it is not a constant but a calculated value?

unsigned goldThreshold = /* calculate... */ 5u;
auto newEnd = remove_if(begin(heroes), end(heroes), HasLessGoldThan(goldThreshold));

As with the handcrafted function object above, we’d like to just pass the calculated value into the lambda, and preferably use it the same way we used the constant above. If we just replace the `5u` with `goldThreshold`, the compiler will complain, because a lambda body cannot use a local variable of the enclosing function unless that variable has been captured.

Capturing state

However, we can add just a little extra, and the lambda expression will do exactly what we need:

unsigned goldThreshold = /* calculate... */ 5u;
auto newEnd = remove_if(begin(heroes), end(heroes), 
  [goldThreshold](Hero const& hero){
    return hero.inventory().gold() < goldThreshold;    
  }
);

Here we mention the external variable `goldThreshold` in the capture list of the lambda expression to make it accessible inside the lambda. The capture list is a comma-separated list, so we can just as easily capture two or more variables:

auto goldThreshold = /* calculate... */ 5u;
auto offset = 2u;
//...

  [goldThreshold, offset](Hero const& hero){
    return hero.inventory().gold() < (goldThreshold - offset);    
  }
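
For readers who want to compile this, here is a minimal self-contained sketch of the two-variable capture. The `Inventory` and `Hero` classes are hypothetical stand-ins for the ones used in this post, and `removePoorHeroes` is a helper name I made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal stand-ins for the Hero class used in this post.
class Inventory {
  unsigned gold_;
public:
  explicit Inventory(unsigned gold) : gold_(gold) {}
  unsigned gold() const { return gold_; }
};

class Hero {
  Inventory inventory_;
public:
  explicit Hero(unsigned gold) : inventory_(gold) {}
  Inventory const& inventory() const { return inventory_; }
};

// Removes all heroes that own less than (goldThreshold - offset) gold.
// Assumes offset <= goldThreshold, otherwise the unsigned subtraction wraps.
void removePoorHeroes(std::vector<Hero>& heroes,
                      unsigned goldThreshold, unsigned offset) {
  assert(offset <= goldThreshold);
  auto newEnd = std::remove_if(begin(heroes), end(heroes),
    // both values are copied into the closure object
    [goldThreshold, offset](Hero const& hero) {
      return hero.inventory().gold() < (goldThreshold - offset);
    }
  );
  heroes.erase(newEnd, end(heroes));
}
```

Calling `removePoorHeroes(heroes, 5u, 2u)` on heroes with 1, 3 and 10 gold would leave the heroes with 3 and 10 gold in the vector.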

Capture by value versus capture by reference

In the example above, the `goldThreshold` is captured by value. That means that the closure has a member variable (with the same name) that is a copy of the `goldThreshold` variable we calculated outside.

Capture by value implies that if we were to change the original value before invoking the closure, it would have no effect, since we did not change the closure’s member variable. In addition, the lambda body cannot modify the captured value because, as I described in the last post, the function call operator is const qualified. So at least that makes sense now.
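
A small sketch of that snapshot behavior; the function name `valueCaptureSnapshot` is just an illustrative choice:

```cpp
#include <cassert>

// Returns what the closure sees after the original variable has changed.
unsigned valueCaptureSnapshot() {
  unsigned goldThreshold = 5u;
  // the closure stores its own copy of goldThreshold (5) right here
  auto getThreshold = [goldThreshold]() { return goldThreshold; };
  goldThreshold = 100u;            // changes only the original
  assert(goldThreshold == 100u);
  return getThreshold();           // still the value from the time of capture
}
```

The copy is made when the lambda expression is evaluated, not when the closure is called, so the function returns 5.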

The alternative is capture by reference: The member variable of the closure then is not a copy, but a reference to the original, so the function call operator behaves differently if we change the outside value, and in turn it can modify the member and the outside value itself.

To capture by reference instead of by value, prefix the variable name with an ampersand in the capture list:

unsigned richCounter = 0;
unsigned poorCounter = 0;

for_each(begin(heroes), end(heroes),
  // capture both counters by reference
  [&richCounter, &poorCounter](Hero const& hero){
    auto const& gold = hero.inventory().gold();
    if (gold > 1000) {
      ++richCounter;
    }
    else if (gold < 10) {
      ++poorCounter;
    }
  }
);

cout << richCounter << " rich heroes and " 
     << poorCounter << " poor heroes found!\n";
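
The counters above show the closure writing to the outside variables. The other direction works as well: a by-reference capture sees changes made outside after the closure was created. A minimal sketch with made-up names:

```cpp
#include <cassert>

// Shows both directions of a by-reference capture.
unsigned referenceCaptureDemo() {
  unsigned counter = 0u;
  auto increment = [&counter]() { ++counter; };        // writes to the original
  auto read      = [&counter]() { return counter; };   // reads the original

  increment();      // counter is now 1
  counter = 41u;    // change made outside the closures
  increment();      // the closure sees 41 and makes it 42
  assert(read() == counter);
  return counter;
}
```

Keep in mind that the closure only stores a reference, so the captured variable has to outlive every call of the closure.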

Capturing member variables

If you create a lambda inside a member function and want it to access member variables of the object the function is called on, then you cannot simply capture those variables. Instead you have to capture the `this` pointer.

Luckily, there is no need to prefix the members with `this->` every time inside the lambda. The compiler will figure that out for us.

struct Beast {
  unsigned strength;
  
  void attack(vector<Hero>& heroes) {
    for_each(begin(heroes), end(heroes),

      [this](Hero& hero){
        hero.applyDamage(strength);
      }

    );
  }
};

The `this` pointer can only be captured by value, not by reference. If the member function in which the lambda expression is used is const qualified, the captured `this` pointer is a pointer to const as well.
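
Here is a sketch of the const case. The `Hero` struct with its `health` member and the clamping in `applyDamage` are assumptions of mine to keep the example self-contained:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal Hero with just enough interface for this sketch.
struct Hero {
  unsigned health;
  void applyDamage(unsigned damage) {
    health = (damage < health) ? health - damage : 0u;
  }
};

struct Beast {
  unsigned strength;

  // const member function: inside the lambda, this is a Beast const*
  void attack(std::vector<Hero>& heroes) const {
    std::for_each(begin(heroes), end(heroes),
      [this](Hero& hero) {
        hero.applyDamage(strength);  // read access to members is fine
        // ++strength;               // would not compile: *this is const here
      }
    );
  }
};
```

Because `attack` is const, it can be called on a `Beast const`, and the lambda can read `strength` but not change it.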

Dealing with multiple captures

If you have to use a lot of external variables inside the lambda, the capture list can become a bit long. Besides the fact that this may be a good point to rethink your design (like long function parameter lists, long capture lists are a code smell), there is help in the form of default captures:

At the beginning of the capture list, you can provide either an `&` or a `=` to declare that all variables used in the lambda expression are implicitly captured by reference or by value, respectively. Once you have done so, you cannot explicitly capture single variables, including the `this` pointer, the same way.

[=, &a, &b]  //default: by value, but capture a and b by reference
[&, c]       //default: by reference, but capture c by value
[=, this, d] //ERROR: this and d may not be captured by value,
             //since default is already capture by value
             //(C++20 allows the explicit this here, but still not d)
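
To see the mixed forms in action, here is a small sketch with arbitrary sample values; `mixedCaptureDemo` is a name invented for this example:

```cpp
#include <cassert>

// Returns the sum of what the two closures compute.
int mixedCaptureDemo() {
  int a = 1, b = 2, c = 3;

  // default by value, but a is captured by reference
  auto byValueDefault = [=, &a]() { a += b; return a + c; };
  // default by reference, but c is captured by value
  auto byRefDefault = [&, c]() { b += c; return b; };

  c = 30;  // not seen by either closure, both hold a copy of the old c

  int first  = byValueDefault();  // a becomes 1 + 2 = 3, returns 3 + 3
  int second = byRefDefault();    // b becomes 2 + 3 = 5, returns 5
  assert(a == 3 && b == 5);       // both were modified through references
  return first + second;
}
```

Only the variables that are actually used in the lambda body are captured by a default capture.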

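Probably the most used and most “natural” capture list is `[=]`. It is sufficient in most cases and prevents dangling references to local variables. Keep in mind that it captures the `this` pointer implicitly, and that this is a copy of the pointer, not of the object, so members can still dangle if the closure outlives the object. C++20 deprecates this implicit capture of `this`. With that caveat, consider using `[=]` as a default capture list.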

Init captures

Until now we have only captured existing variables by name, and capturing by value always gave us a copy. C++14 introduces a means to work around those limitations by allowing us to create new member variables for the closure and initialize them with whatever we like:

auto uPtrOutside = make_unique<Beast>();

thread newThread{ 
  [uPtrInside = move(uPtrOutside), anotherUPtr = make_unique<Hero>()] () {
    //...
  }
};

Here, `uPtrInside` is moved from `uPtrOutside`, and `anotherUPtr` is the result of a function call – both are member values of the closure, not references, and both are initialized with a move, not a copy.

You can capture references with init captures as well, again by prefixing the name with an ampersand. You can also reuse names from the outer scope. For example, if the `uPtrOutside` had a meaningful name, the init capture for it could look like this:

[uPtrMeaningfulName = move(uPtrMeaningfulName)]
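
A compilable sketch of both kinds of init capture, without the thread. The names `initCaptureDemo`, `value`, `counter` and `count` are made up for illustration:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns the value read through the moved-in pointer plus the counter.
int initCaptureDemo() {
  auto value = std::make_unique<int>(40);
  int counter = 0;

  // value is moved into the closure under the same name;
  // count is a reference to the outer counter
  auto lam = [value = std::move(value), &count = counter]() {
    count += 2;
    return *value + count;
  };

  assert(value == nullptr);  // the outer pointer has been moved from
  int result = lam();        // 40 + 2
  assert(counter == 2);      // modified through the reference capture
  return result;
}
```

On the right-hand side of an init capture the name still refers to the outer variable, which is why reusing the name works.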

Closure member variable types

The rules for deducing the types for all those closure member variables are mostly the same rules as for `auto` variables, i.e. as for templates. That includes the problems with braced initializers, so better stay clear of those in init captures as well.

However, when capturing by value, the closure members retain the const and volatile qualifiers of their originals, i.e. capturing a `const string` by value will create a const copy inside the closure object. This does not apply to init captures, so if you need a non-const capture of a const variable, use an init capture with the same name, like `[a = a]`.
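
The difference only shows up when the lambda tries to modify its copy, so this sketch uses a `mutable` lambda. The function and variable names are illustrative only:

```cpp
#include <cassert>
#include <string>

// Returns the modified copy held by the closure.
std::string constCaptureDemo() {
  std::string const name = "Hero";

  // plain capture: the closure member is a std::string const,
  // so even a mutable lambda could not modify it
  // auto bad = [name]() mutable { name += "!"; };  // would not compile

  // init capture: the member type is deduced like auto, dropping the const
  auto good = [name = name]() mutable {
    name += "!";
    return name;
  };

  std::string result = good();
  assert(name == "Hero");  // the original is untouched
  return result;
}
```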

Returning to closure member functions

In the first post about lambdas I wrote about the different member functions that are present in the closure type. In the light of stateful closures, let’s have a look at them again:

Constructors and destructor

The defaulted copy and move constructors as well as the defaulted destructor make sense now. You can copy and/or move a closure object, or you can’t, depending on its members. A noncopyable and nonmovable closure would not be of much use, so be cautious before you do fancy stuff with init captures. The destructor simply destroys the closure members, as it should.
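
As a sketch of how the members decide what a closure can do, the following checks a closure that holds a `unique_ptr`. The name `moveOnlyClosureDemo` is made up for this example:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Returns the value stored behind the pointer after moving the closure.
int moveOnlyClosureDemo() {
  auto lam = [ptr = std::make_unique<int>(7)]() { return *ptr; };
  using Closure = decltype(lam);

  // the unique_ptr member makes the closure type move-only
  static_assert(!std::is_copy_constructible<Closure>::value,
                "closure with a unique_ptr member is not copyable");
  static_assert(std::is_move_constructible<Closure>::value,
                "closure with a unique_ptr member is movable");

  auto moved = std::move(lam);  // fine: uses the defaulted move constructor
  // auto copy = moved;         // would not compile
  return moved();
}
```

One practical consequence: such a move-only closure cannot be stored in a `std::function`, which requires a copyable callable.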

Conversion to function pointer

Lambdas and closures are no magic: the compiler has no way to hide additional state behind a plain function pointer, so the conversion is only available for lambdas that capture nothing. As soon as a lambda captures anything, the conversion is gone.
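
The following sketch checks this at compile time. `BinaryOp`, `applyOp` and `functionPointerDemo` are hypothetical names chosen for the example:

```cpp
#include <cassert>
#include <type_traits>

using BinaryOp = int (*)(int, int);

// Takes a plain function pointer; it has no room for extra state.
int applyOp(BinaryOp op, int lhs, int rhs) { return op(lhs, rhs); }

int functionPointerDemo() {
  int offset = 10;
  auto stateless = [](int a, int b) { return a + b; };
  auto stateful  = [offset](int a, int b) { return a + b + offset; };

  static_assert(std::is_convertible<decltype(stateless), BinaryOp>::value,
                "captureless lambda converts to a function pointer");
  static_assert(!std::is_convertible<decltype(stateful), BinaryOp>::value,
                "capturing lambda does not convert");

  // applyOp(stateful, 1, 2);  // would not compile
  return applyOp(stateless, 1, 2) + stateful(1, 2);  // 3 + 13
}
```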

Function call operator

The function call operator is implicitly declared const. That way, closures cannot change their captured state from call to call, which makes sense. After all, they are little helper objects, not full-grown classes with mutable state that happen to have only a single method.

However, if you truly need to work around that fact, you can do so by explicitly declaring the lambda mutable. The parameter list is no longer optional in that case (at least before C++23, which relaxes this rule):

auto lam = [callcount = 0u] () mutable { 
  cout << ++callcount; 
};
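
Since the counter lives in the closure object itself, copying the closure copies the state as well. A small sketch, with `mutableCounterDemo` being a name made up for illustration:

```cpp
#include <cassert>

// Returns the count reported by the third call on the original closure.
unsigned mutableCounterDemo() {
  auto counter = [callcount = 0u]() mutable { return ++callcount; };

  counter();              // 1
  counter();              // 2
  auto copy = counter;    // copies the closure including its state
  assert(copy() == 3u);   // the copy continues from 2 ...
  return counter();       // ... and so does the original, independently
}
```

This matters when you pass a mutable lambda to a standard algorithm: algorithms take their function objects by value, so they work on a copy and your original closure does not see the changes.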

Conclusion

Lambdas are a mighty feature in the new C++ landscape, equipped with a lot of extras to make corner cases work. They can simplify your code by a good measure, as long as you don’t make the lambdas themselves too complicated.

Use lambdas where they can make your code simpler and more readable. Stick to a handful of lines and only a few captures. When a lambda becomes more complicated, consider using a normal class to give it a name, factor out helper methods etc.
