Implementing Array Access for Tuple

Contents

This week I exchange guest posts with Jonathan Müller about accessing tuple elements. Jonathan is a CS student passionate about C++. He’s working on various C++ projects like memory, an allocator library, or standardese, a C++ documentation generator. You can find him online at his blog and on Twitter.

std::tuple is a generalization of std::pair for a variadic number of arguments, not just two. And it is a great generalization, except for one – crucial – thing: access. std::get<0>(tuple) is horrible compared to pair.first.

Obviously we can’t implement std::tuple with a variadic number of members, easch with names like first, second, third and so on. But since std::tuple is basically an array where each element can have a different type, is it actually possible to access a tuple with the operator[] syntax just like an array?

It is and this post shows how an operator[] for std::tuple could be written.

Note: As operator[] must be a member function you can’t really extend std::tuple directly. Instead you need to provide a wrapper, but this post doesn’t focus on that.

A first attempt that doesn’t work

C++17 adds if constexpr: you can have an if statement based on a compile-time constant, where only one branch is properly compiled. Maybe we can use something like this:

decltype(auto) operator[](std::size_t idx)
{
    if constexpr (idx == 0u)
        return std::get<0>(*this);
    else if constexpr (idx == 1u)
        return std::get<1>(*this);
    // extends further on
}

In case you are not familiar with decltype(auto): Since C++14, you have automatic type deduction for functions, i.e. write auto instead of a return type, and the compiler figures it out. decltype(auto) is also automatic type deduction, but instead of using the auto rules, it uses the decltype() rules. Here it means that it will return a reference to the element, not a copy. Arne has written a post about decltype, you can find it here.

Ignoring the fact that we can’t hard-code all possible branches but would need some kind of recursion, this doesn’t work for two reasons: First, decltype(auto) doesn’t work the way we want it here. ~~If there are different return types (which is usually the case for std::tuple), that’s an error~~ (This is wrong, due to the if constexpr only one branch will ever be active, so ther is no problem). Second, the parameter isn’t a compile-time constant, so we can’t use if constexpr. This is true even if we call it with a compile-time constant as in tuple[0].

Furthermore, a function may only have one return type, it can’t change depending on the parameters – unless that parameter is a template parameter. That’s the reason std::get works: it’s a template so the return type can change.

So let’s make operator[] a template:

A second attempt that doesn’t work either

template <std::size_t I>
decltype(auto) operator[](std::size_t idx)
{
    // what to do with parameter idx?
    return std::get<I>(*this); 
}

While this would work, there is a problem: There is no syntax to specify the template parameter:

tuple<0>[0] = 0; // won't work

The only way to call that overload is like this…

tuple.operator[]<0>(0) = 0;

…and that’s somehow worse than std::get<0>(tuple).

A third attempt that works but is ugly

But we’re really close: All we need to do is trick the compiler into deducing the template parameters for us. If a template parameter depends on a function parameter, there is no need to specify it, the compiler can deduce that.

But how do we trick the compiler into doing work for us? We need to be more flexible with the parameter. Remember, this is C++, we can do crazy stuff with operator overloading. For example, we are not limited to integral types for an operator[], we can use any type we want.

We need a template that is parametrized on the index we want to access, let’s just call it index:

template <std::size_t I>
struct index {};

index doesn’t actually need to do anything, it is just a tag. Check out this post by Arne for more information about tag types and templates.

Then we can overload our operator[] so that it accepts index:

template <std::size_t I>
decltype(auto) operator[](index<I>)
{
    return std::get<I>(*this);
}

And this finally works:

tuple[index<0>{}] = 0;

We now have to pass a parameter of type index, so we create a temporary. The compiler sees the type of the argument and deduces the template parameter for us, which is then a compile-time constant we can pass to std::get.

This technique is also something I’ve described in a blog post: Function templates – deduce template arguments or pass explicitly?.

However, it is still kind of ugly. With some variable template we can get rid of the braces, but it’s still not quite tuple[0].

A fourth attempt that works and is beautiful

One C++11 feature can help to make this pretty though: user-defined literals. We can create an integral literal – let’s call it _i for index – that creates an index object for us.

If you are not familiar with user-defined literals, Arne got you covered as well.

But again we run into the same problem: a function parameter is not a compile-time constant. So using the simple overload for an integral user-defined literal doesn’t work:

auto operator"" _i(unsigned long long idx)
{
    return index<idx>{}; // error: idx not a compile-time constant
}

Are we back to square one?

No, because for user-defined literals there is a way to get the parameter as compile-time constant directly: You can create an overload that gets the raw literal as character sequence in the template arguments. With that overload we can create our correct index:

template <char... Digits>
auto operator"" _i()
{
    return index<parse<Digits...>()>{};
}

Where parse is a constexpr function that parses the string literal for us:

template <char... Digits>
constexpr std::size_t parse()
{
    // convert to array so we can use a loop instead of recursion
    char digits[] = {Digits...}; 

    // straightforward number parsing code
    auto result = 0u;
    for (auto c : digits)
    {
        result *= 10;
        result += c - '0';
    }
    
    return result;
}

Putting it all together

With the index template access to std::tuple looks like this:

tuple[index<0>{}] = 0;

And with the literal operator it looks like this:

tuple[0_i] = 0;

And that’s a lot nicer than either std::get<0>(tuple) or even pair.first. You can find the full code to play with here.

We’ve combined two techniques here: using tag templates to let the compiler deduce parameters and using literal operators to convert values to types. The second technique is really powerful and used a lot in the meta programming library Boost.Hana. It provides a literal _c that converts an integer to std::integral_constant.

Now that we have an operator[] for our tuple, it is one step closer to array. So a sensible question is: can we actually iterator over a tuple? Can we write:

for (auto& element : tuple)
    do_sth(element);

Head over to my blog to see Arne’s solution: http://foonathan.net/blog/2017/03/01/tuple-iterator.html.

5 Comments

Guille Becker
8 years ago Permalink

result *= multiplier;
….
multiplier *= 10u;

is wrong, works only up to 2 digits and fails in the 3r iteration.

1. Arne Mertz
  8 years ago Permalink
  
  Thanks for pointing out the error. You might want to reload the page, though it has been fixed a few days ago 😉
  
Sergey Shamov
8 years ago Permalink

Not bad. Especially trick with the parsing of raw chars. But I prefer to use std::placeholders::_1-_N as a tag types. This seems to me more standard way of doing things

1. Arne Mertz
  8 years ago Permalink
  
  I disagree. std::placeholders are, as the name says, meant to be placeholders for something else. Usually, they are used to denote parameters in std::bind that are left open and result in parameters in the generated function object. In this case, we are dealing with indices, which are values, not placeholders. Therefore, using different types and a different notation is good practice and less confusing.
  
  1. Sergey Shamov
    8 years ago Permalink
    
    The point make sense. But I am not so scrupulous and do not make a so subtle distinctions. After all, this is C++. Here a value can be a template parameter. And types can be used as function parameters. This is especially true for a type tags. They are invented for this. Yes, strictly speaking, we pass a value to the function. But this value does not interest us. Only its type is important.

Write clean and maintainable C++