Generating Iterators

Created 30th July, 2017 03:58 (UTC), last edited 27th August, 2017 06:43 (UTC)

Now that we have co_yield working, and understand how it works, we should be able to convert our generator quite easily to provide iterators so that it acts like a normal C++ collection (or range, if you're following the Ranges TS work).

int main() {
    for ( auto v : count() )
        std::cout << ">> " << v << std::endl;
    return 0;
}

The refactoring is mostly a matter of moving code around, but there are a couple of things worth looking at more closely. First though, some easy bits:

class iterator {
    generator &owner;
    bool done;
    void advance() {
        std::cout << "Moving to next" << std::endl;;
        owner.coro.resume();
        std::cout << "Are we done? ";
        auto still_going = not owner.coro.done();
        std::cout << (still_going ? "There is another" : "We're done") << std::endl;
        done = not still_going;
    }
};

The done flag is interesting. Because we don't have any sort of position, and certainly no way of rolling back, there isn't much we can do for equality of these iterator instances other than to tell whether they're both at the end or not.

All we can do with the coroutine handle is either destroy or resume it. We can't copy the coroutine itself and we can't step back to a previous suspension point. Our options here are very limited. This leads to the following for !=:

bool operator != (const iterator &r) const {
    return done != r.done;
}

And now it becomes clear that our iterator constructor should mark done as true for the end iterator:

iterator(generator &o, bool d)
: owner(o), done(d) {
    if ( not done ) advance();
}

In generator we now know how to make and return the iterators:

iterator begin() {
    return iterator{*this, false};
}
iterator end() {
    return iterator{*this, true};
}

This just leaves the final two iterator members we need for our ranged for loop:

iterator &operator ++ () {
    advance();
    return *this;
}
int operator * () const {
    return owner.coro.promise().current_value;
}

Running gives us what we expect:

New promise created
generator
Initial suspend
~generator (empty)
Moving to next
Going to yield 1
Yielded 1
Are we done? There is another
>> 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Final suspend
Are we done? We're done
~generator (contains a coroutine)
Promise died

To make our generator properly useful in the general case we should now turn it into a template and fill in the rest of the iterator so that it has all of the expected members. I'll leave that as an exercise.
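
If you want a starting point, here's a rough sketch of one shape the templated version could take. This isn't the article's code: it uses the C++20 <coroutine> header rather than the experimental one that was current when this was written, the logging has been dropped, and names like handle_type are my own choices.

#include <coroutine>
#include <cstddef>
#include <exception>
#include <iterator>
#include <utility>

template<typename T>
class generator {
public:
    struct promise_type;
    using handle_type = std::coroutine_handle<promise_type>;

    struct promise_type {
        T current_value;
        generator get_return_object() {
            return generator{handle_type::from_promise(*this)};
        }
        std::suspend_always initial_suspend() noexcept { return {}; }
        std::suspend_always final_suspend() noexcept { return {}; }
        std::suspend_always yield_value(T value) {
            current_value = std::move(value);
            return {};
        }
        void return_void() {}
        void unhandled_exception() { std::terminate(); }
    };

    class iterator {
        generator *owner;
        bool done;
        void advance() {
            owner->coro.resume();
            done = owner->coro.done();
        }
    public:
        // The type members a "real" iterator is expected to provide.
        using iterator_category = std::input_iterator_tag;
        using value_type = T;
        using difference_type = std::ptrdiff_t;
        using pointer = const T *;
        using reference = const T &;

        iterator(generator &o, bool d) : owner(&o), done(d) {
            if ( not done ) advance();
        }
        bool operator == (const iterator &r) const { return done == r.done; }
        bool operator != (const iterator &r) const { return done != r.done; }
        iterator &operator ++ () { advance(); return *this; }
        const T &operator * () const {
            return owner->coro.promise().current_value;
        }
    };

    iterator begin() { return iterator{*this, false}; }
    iterator end() { return iterator{*this, true}; }

    generator(generator &&g) : coro(std::exchange(g.coro, nullptr)) {}
    ~generator() { if ( coro ) coro.destroy(); }

private:
    explicit generator(handle_type h) : coro(h) {}
    handle_type coro;
};

With something along these lines in place, count() and primes() can be declared to return generator<int>, and the iterator carries the type members that standard algorithms expect to find.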

Yielding primes

So let's look at yielding something a little bit more interesting:

generator primes(int limit) {
    if ( limit >= 2 ) co_yield 2;
    for ( int n = 3; n <= limit; n += 2 ) {
        if ( is_prime(n) ) co_yield n;
    }
}
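
The primes coroutine leans on an is_prime helper that isn't shown in this section. Any implementation will do; a minimal trial-division sketch (my own, not taken from the original code) looks like this:

bool is_prime(int n) {
    // Good enough for a demonstration: primes() only ever asks about odd n >= 3,
    // so trying the odd divisors up to the square root is sufficient.
    for ( int d = 3; d * d <= n; d += 2 ) {
        if ( n % d == 0 ) return false;
    }
    return true;
}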

And we can call primes like this:

int main() {
    std::cout << "coro4" << std::endl;
    for ( auto v : primes(10) )
        std::cout << ">> " << v << std::endl;
    return 0;
}

It's true that iterators can also be used as generators and we don't need this co_yield machinery to produce primes, but it's also true that this code is simpler, and that this just works. We haven't needed to alter any of our generator class code, or the way that the iterators work.

This is one of the big benefits of using coroutines. In the pre-coroutine world, if you want to do different things then each one needs a different iterator implementation where you have to break the code up in exactly the right way. With co_yield the compiler does this for you.

We passed a parameter to the coroutine this time. How does that work? We've already seen the coroutine_handle class that we use to deal with the promise_type, and to resume and destroy the coroutine itself. We've sometimes referred to this as a stack frame, because that's also what it is: the compiler arranges to allocate some memory for us where information about the coroutine can be stored for as long as we keep it alive.

As well as our promise_type, the compiler keeps whatever bookkeeping it needs to remember where the coroutine is up to (which suspension point we're currently at when the coroutine isn't running), and it also keeps the arguments and all the local variables there. This “stack frame” that is kept on the heap is what allows us to remember everything we need to run inside the coroutine as we bounce between it and the code that uses the results.

Because there is only a single stack frame, and it lives on the heap, these sorts of coroutine are, rather confusingly, referred to as “stackless” coroutines. The name makes more sense once you realise that we have only a single frame remembered on the heap rather than a whole alternative stack: there's a stack frame, but not a whole stack.

If we were to have a whole stack then the sorts of coroutines we would be talking about would be “stackful” coroutines. Each type has its own set of trade-offs, but on most platforms stackful coroutines are easy to implement as a library and don't require compiler support. Stackless coroutines are most easily done with compiler support, and it's stackless coroutines that look likely to be standardised first (hopefully in C++20). The Boost coroutine libraries implement stackful coroutines if you want to play with those.
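
To give a flavour of the stackful style, here's roughly what our original count() example could look like written against Boost.Coroutine2. This is a sketch based on that library's documented pull/push interface rather than anything from the original article:

#include <boost/coroutine2/all.hpp>
#include <iostream>

int main() {
    // The lambda is the coroutine body; calling yield(n) suspends it and hands
    // n back to the caller, much as co_yield does in the compiler-supported version.
    boost::coroutines2::coroutine<int>::pull_type numbers(
        []( boost::coroutines2::coroutine<int>::push_type &yield ) {
            for ( int n = 1; n <= 3; ++n )
                yield(n);
        });
    while ( numbers ) {
        std::cout << ">> " << numbers.get() << std::endl;
        numbers(); // resume the coroutine to produce the next value
    }
    return 0;
}

Note that no compiler support is needed here: the library switches to a separately allocated stack for the coroutine body.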

One of the biggest downsides of stackless coroutines is that they are “infectious”, in the sense that once you start to use them more and more of your code becomes coroutines. This isn't so much a problem when you're only using co_return and co_yield, but we'll see it when we start to want to use co_await.

Stackful coroutines don't suffer from this, but the requirement to have a full stack means that each coroutine uses a lot more memory, so you can't scale to as many coroutines as you can with stackless ones.

Yielding an infinity

Let's play around with our prime numbers a bit more. Right now we have to know up front what the largest number we want to deal with is going to be. Let's change that:

generator primes() {
    co_yield 2;
    for ( int n = 3; true; n += 2 ) {
        if ( is_prime(n) ) co_yield n;
    }
}
int main() {
    for ( auto v : primes() )
        std::cout << ">> " << v << std::endl;
    return 0;
}

Normally an infinite loop like this is a Bad Idea™, and indeed when we run this it does loop forever, so some limit is probably in order. Let's try putting it on the consumer side:

int main() {
    for ( auto v : primes() ) {
        if ( v > 15 ) break;
        std::cout << ">> " << v << std::endl;
    }
    return 0;
}

This is the output:

New promise created
generator
Initial suspend
~generator (empty)
Moving to next
Yielded 2
Are we done? There is another
>> 2
Moving to next
Yielded 3
Are we done? There is another
>> 3
Moving to next
Yielded 5
Are we done? There is another
>> 5
Moving to next
Yielded 7
Are we done? There is another
>> 7
Moving to next
Yielded 11
Are we done? There is another
>> 11
Moving to next
Yielded 13
Are we done? There is another
>> 13
Moving to next
Yielded 17
Are we done? There is another
~generator (contains a coroutine)
Promise died

If you look at the end of this you can see that, despite the fact that the coroutine never finishes, the promise destructor ran just fine. It turns out that everything that is in the stack frame is also properly dealt with: when the coroutine is destroyed through its handle the compiler makes sure that the correct destructors are called for everything it contains.
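
The place where that happens is the generator destructor: it only has to destroy the coroutine through its handle and the compiler-generated cleanup takes care of the rest. The destructor isn't repeated in this section, so the following is a reconstruction that matches the log messages above rather than a verbatim copy:

~generator() {
    if ( coro ) {
        // Destroying a suspended coroutine through its handle also runs the
        // destructors of everything held in its frame, including the promise.
        std::cout << "~generator (contains a coroutine)" << std::endl;
        coro.destroy();
    } else {
        std::cout << "~generator (empty)" << std::endl;
    }
}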

We can absolutely rely on this behaviour in the same way that we can rely on the compiler calling the correct destructors when an exception bypasses a function. So long as a coroutine is suspended it's always safe to destroy it and never get around to continuing it. Later on we'll play with some more things this can be used for. For now though, we'll move on to the last of the keywords: co_await.
