Ranges: the STL to the Next Level

Today’s post is the first guest post on this blog. Jonathan Boccara tells us about some exciting features that will hit the standard library after C++17: Ranges. Jonathan develops and teaches C++ at Murex and his focus is on writing expressive code. You can reach him @JoBoccara or find him on his blog Fluent C++.

The C++ Standard Template Library (STL) is a fantastic tool for making code more correct and expressive. It is mainly composed of two parts:

  • The containers, such as std::vector or std::map for instance,
  • The algorithms, a fairly large collection of generic functions that operate amongst others on containers. They are mostly found under the algorithm header.

A lot of manual operations performed on containers with for loops can be replaced by calls to algorithms of the STL. This has the effect of making the code clearer, because instead of having to mentally parse a complex for loop, a reader of the code can instantly understand what is going on if the offending for loops are replaced with explicit names such as std::copy, std::partition or std::rotate.

In his very popular talk on this topic, Sean Parent shows impressive refactorings of code by using a couple of these algorithms.

However, the STL has several aspects that can be improved. In this post we focus on two of them:

  • All algorithms manipulate iterators pointing into the collection they operate on. While this is handy in specific cases like stopping at a precise point in a container, the largely general case is to traverse the whole container from its .begin() to its .end().

    Therefore, portions of code that use the STL end up being littered with iterators:

    std::copy(v1.begin(), v1.end(), std::back_inserter(v2));
    std::set_difference(v2.begin(), v2.end(), v3.begin(), v3.end(), std::back_inserter(v4));
    std::transform(v3.begin(), v3.end(), std::back_inserter(v4));
    

    (Note: the std::back_inserter used above is an output iterator that does a push_back into the container it is passed, every time it is assigned to. This relieves the programmer from the sizing of the output)

  • Algorithms do not compose well. I found that a recurring need encountered by C++ developers who use the STL is to apply a function only on elements of a collection that satisfy a predicate.

    Applying a function f on all the elements of a collection input and putting the results in a vector output is achieved by std::transform:

    std::transform(input.begin(), input.end(), std::back_inserter(output), f);
    

    And filtering the elements on a predicate p is done with std::copy_if:

    std::copy_if(input.begin(), input.end(), std::back_inserter(output), p);
    

    But there is no easy way to combine these two calls, and there is no such thing as a “transform_if” algorithm.

Ranges provide a different approach to the STL that solves these two issues in a very elegant manner. Ranges were initially introduced in Boost, and are now on their way to standardization. I believe they will have a major impact on the way we deal with collections in code.

The concept of Range

At the center of all this is the concept of Range. Essentially, a range is something that can be traversed. More precisely, a range is something that has a begin() and an end() method, that return objects (iterators) that let you iterate over the range (that is, move along the elements of the range, and be dereferenced to access these elements).

Expressed in pseudocode a range would be something that complies to the following interface:

Range {
  begin()
  end()
}

In particular, this implies that all STL containers are themselves ranges.

Ranges were already used in some way by code using the STL before the Range concept was defined, but clumsily. As seen at the beginning of this post they were manipulated directly with two iterators, typically a begin and an end. With ranges though, you generally don’t see iterators. They are here, but abstracted away by the concept of range.

This is important to understand. Iterators are technical constructs that let you iterate over a collection, but they are generally too technical for your functional code. Most of the time, what you are really trying to represent is a range, which corresponds better to the level of abstraction of your code. Like a range of cash flows, a range of lines in a screen, or a range of entries coming up from the database.

So coding in terms of ranges is a huge improvement, because in that sense iterators violate the principle of respecting levels of abstraction, which I deem is the most important principle for designing good code.

In range libraries, STL algorithms are redefined to take directly ranges as parameters, instead of two iterators, like:

ranges::transform(input, std::back_inserter(output), f);

As opposed to:

std::transform(input.begin(), input.end(), std::back_inserter(output), f);

Such algorithms reuse the STL versions in their implementation, by forwarding the begin and the end of the range to the native STL versions.

Smart iterators

Even though they are abstracted away by ranges, range traversals are implemented with iterators. The full power of ranges comes from its combination with smart iterators. Generally speaking, an iterator of a collection has two responsibilities:

  • Moving along the elements of the collection (++, –, etc.)
  • Accessing the elements of the collection (*, ->)

For example, a vector iterator does just this. But “smart” iterators that originated in boost customize one or both of these behaviours. For instance:

  • The transform_iterator is constructed with another iterator it and a function (or function object) f, and customizes the way it accesses elements: when dereferenced, the transform_iterator applies f to *it and returns the result.
  • The filter_iterator is constructed with another iterator it and a predicate p. It customizes the way its moves: when advancing by one (++) a filter_iterator, it advances its underlying iterator it until it reaches an element that satisfies the predicate or the end of the collection.

Combining Ranges and Smart iterators: Range adaptors

The full power of ranges comes with their association with smart iterators. This is done with range adaptors.

A range adaptor is an object that can be combined with a range in order to produce a new range. A subpart of them are view adaptors: with them, the initial adapted range remains unchanged, while the produced range does not contain elements because it is rather a view over the initial one, but with a customized iteration behaviour.

To illustrate this let’s take the example of the view::transform adaptor. This adaptor is initialized with a function and can be combined with a range to produce a view over it, that has the iteration behaviour of a transform_iterator over that range. Range adaptors can be combined with ranges with operator|, which gives them a elegant syntax.

With the following collection of numbers:

std::vector numbers = { 1, 2, 3, 4, 5 };

The range

auto range = numbers | view::transform(multiplyBy2);

is a view over the vector numbers that has the iteration behaviour of a transform_iterator with the function multiplyBy2. So when you iterate over this view, the results you get are all these numbers, multiplied by 2. For instance:

ranges::accumulate(numbers | view::transform(multiplyBy2), 0);

returns 1*2 + 2*2 + 3*2 + 4*2 + 5*2 = 30 (similarly to std::accumulate, ranges::accumulate does the sum of the elements of the range it is passed to).

There are plenty of other range adaptors. For example, view::filter takes a predicate and can be combined with a range to build a view over it with the behaviour of a filter_iterator:

ranges::accumulate(numbers | view::filter(isEven), 0);

returns 2 + 4 = 6.

An important thing to note is that the ranges resulting from associations with range adaptors, although they are merely view over the ranges they adapt and don’t actually store elements, answer to the range interface (begin, end) so they are themselves ranges. Therefore adaptors can adapt adapted ranges, and can be effectively combined the following way:

ranges::accumulate(numbers | view::filter(isEven) | view::transform(multiplyBy2), 0);

returns 2*2 + 4*2 = 12. And this gives a solution to the initial problem of not being able to combine algorithms together.

Conclusion

Ranges raise the level of abstraction of code using the STL, therefore clearing up code using the STL from superfluous iterators. Range adaptors are a very powerful and expressive tool to apply operations on elements of a collection, in a modular way.

Ranges are the future of the STL. To go further you can have a look at the initial range library in boost or to the proposal for standardization from Eric Niebler. As this proposal depends on concepts, that were not included in C++17, ranges have not been standardized yet. Until they are, you can dig into Eric Niebler’s range library range-v3 that is compatible with the current versions of the C++ language. It is available in Visual Studio 2015 Update 3 with a fork of the popular range-v3 library.

Previous Post
Next Post

2 Comments


Leave a Reply

Your email address will not be published. Required fields are marked *