
Awaiting

Created 27th August, 2017 06:37 (UTC), last edited 27th August, 2017 06:55 (UTC)

We've covered co_return and co_yield now. The final new keyword introduced by the coroutines TS is co_await. Where the first two keywords allow a coroutine to suspend whilst its consumer gets ready to use results, co_await is used to allow the coroutine to suspend whilst it waits for things it needs. I tend to think of co_return and co_yield as servicing the outside of the coroutine and co_await as dealing with the inside.

We're going to start with the refactored lazy class that we had a couple of tutorials ago and make it so that our lazy class supports co_await.

lazy<int> answer() {
    std::cout << "Thinking deep thoghts..." << std::endl;
    co_return 42;
}

sync<int> await_answer() {
    std::cout << "Started await_answer" << std::endl;
    auto a = answer();
    std::cout << "Got a coroutine, let's get a value" << std::endl;
    auto v = co_await a;
    std::cout << "And the coroutine value is: " << v << std::endl;
    v = co_await a;
    std::cout << "And the coroutine value is still: " << v << std::endl;
    co_return 0;
}

int main() {
    return await_answer().get();
}

The first thing to notice here is that we now have two coroutines, both answer and await_answer. We can't perform the await inside main because main (along with constructors and destructors) isn't allowed to be a coroutine.

The reason for this is that a coroutine's return value needs to be properly dealt with from its calling context so that the coroutine does its work properly. If we tried to turn main into a coroutine there'd be nothing to handle this for us. There are similar problems for both constructors and destructors, neither of which return values.

Because there are two coroutines in play we'll only keep the prints for the lifetime tracking of the lazy instance and not the sync one. We're using sync only as a mechanism to allow us to enter a context within which we can use coroutines and it doesn't play any part in the example other than that.

The co_await API

Just like our co_return and co_yield examples, co_await requires a particular API on anything we want to await. There are three parts (a minimal sketch of an awaitable follows the list):

  1. await_ready returns a bool saying whether a suspend is needed: true means don't suspend, false means suspend.
  2. await_suspend which is called when the suspend is needed because the value isn't ready yet.
  3. await_resume which returns the value that is being awaited on. This becomes the result of the co_await expression.
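
For reference, the TS's own trivial awaitables, suspend_always and suspend_never, implement exactly this API and nothing more. Here's a minimal sketch of my own (ready_answer isn't part of the TS, just an illustration) of an awaitable whose value is always available:

struct ready_answer {
    int value;
    // Always ready, so a co_await on this never suspends...
    bool await_ready() { return true; }
    // ...which means this will never be called.
    void await_suspend(std::experimental::coroutine_handle<>) {}
    // This becomes the result of the co_await expression.
    int await_resume() { return value; }
};

Inside a coroutine, co_await ready_answer{42} simply evaluates to 42 without any suspension taking place.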

Let's look at what these would look like for our lazy class.

bool await_ready() {
    const auto ready = coro.done();
    std::cout << "Await " << (ready ? "is ready" : "isn't ready") << std::endl;
    return ready;
}

We can easily tell if the value is available by looking to see if the lazy is done or not. If it is done then we can move directly on to returning the value to the co_await. If it isn't done we're going to need to resume the lazy body so it can co_return something into the promise type for us to pass on.

void await_suspend(std::experimental::coroutine_handle<> awaiting) {
    std::cout << "About to resume the lazy" << std::endl;
    coro.resume();
    std::cout << "About to resume the awaiter" << std::endl;
    awaiting.resume();
}

Here we can directly see the two coroutines in our example. The function is a member of the lazy returned by answer so the class member coro is the handle to the answer coroutine. We always resume this because we can only end up here if the coroutine isn't done already (checked in the await_ready member).

The awaiting handle that is passed in refers to the await_answer coroutine (the one whose return object is our sync instance), which is where the co_await appears. Once the lazy is done we resume this coroutine.

The last part (returning the value) is trivial:

auto await_resume() {
    const auto r = coro.promise().value;
    std::cout << "Await value is returned: " << r << std::endl;
    return r;
}

When we run this we can see things working as expected (remember that we only print lifetime information for the lazy instance and not the sync one):

Started await_answer
Promise created
Send back a lazy
Created a lazy object
Started the coroutine, wait for the go!
Move constructed a lazy object
Lazy gone
Got a coroutine, let's get a value
Await isn't ready
About to resume the lazy
Thinking deep thoughts…
Got an answer of 42
Finished the coro
About to resume the awaiter
Await value is returned: 42
And the coroutine value is: 42
Await is ready
Await value is returned: 42
And the coroutine value is still: 42
Promise died
Lazy gone

The first time await_ready is called the lazy hasn't yet run, so we can see answer being entered when it is resumed and then the execution path as the value makes its way back out. The second time, however, the lazy is already done so we don't go down the resume path; we skip directly on to returning a second copy of the value.

Comparison with get

Let's look again at our get implementation:

T get() {
    std::cout << "We got asked for the return value..." << std::endl;
    if ( not coro.done() ) coro.resume();
    return coro.promise().value;
}

You'll notice that all the three parts of our new API appear in this:

  1. The if statement's conditional expression appears as await_ready.
  2. The resume appears in the await_suspend.
  3. The return is handled by the await_resume.

There is one rather important difference though. The await_suspend is also given the handle to the second coroutine, and this allows it to do a number of interesting things, all of which will have to wait for future tutorials. We'll see our first example in the next article, where we'll use co_await for something a bit cooler and more surprising.

Addendum 1: operator co_await

The thing that is a bit nasty about this code is that the await API is implemented directly on the lazy instance, and this allows it to be used (and abused) by anybody who has a lazy instance. Luckily there is something we can do about that. We can add await support to any type by adding support for operator co_await. Like many of these operators it can be defined as a method, or stand alone. The signature for the member version would be:

awaitable_type operator co_await();

Where awaitable_type now implements the above API. If we want it to be stand alone then it would be:

template<typename T>
awaitable_type operator co_await(lazy<T> &);

This is analogous to operators like == and <<.

Using this, together with the trick of defining a local type, we can write the following:

auto operator co_await() {
    struct awaitable_type {
        handle_type coro;
        bool await_ready() {
            const auto ready = coro.done();
            std::cout << "Await " << (ready ? "is ready" : "isn't ready") << std::endl;
            return ready;
        }
        void await_suspend(std::experimental::coroutine_handle<> awaiting) {
            std::cout << "Got to resume the lazy" << std::endl;
            coro.resume();
            std::cout << "Got to resume the awaiter" << std::endl;
            awaiting.resume();
        }
        auto await_resume() {
            const auto r = coro.promise().value;
            std::cout << "Await value is returned: " << r << std::endl;
            return r;
        }
    };
    return awaitable_type{coro};
}

Addendum 2: Tail call optimisation

In the current TS we have to resume both coroutines in await_suspend. The lazy coroutine's body is executed first, and when it suspends after the co_return our await_suspend takes over again and then continues execution of the awaiting coroutine, in this case:

void await_suspend(std::experimental::coroutine_handle<> awaiting) {
    std::cout << "About to resume the lazy" << std::endl;
    coro.resume();
    std::cout << "About to resume the awaiter" << std::endl;
    awaiting.resume();
}

It's not quite as dire as it seems though, because the call is in the tail position. This allows the compiler to perform tail call optimisation to remove the stack frame, something that the optimisers in modern compilers are pretty good at. This means optimised production builds should be fine, but it may prove a problem for debug builds.

To deal with this better a future version of the TS is likely to allow await_suspend to return a coroutine handle to be used as a continuation (that is, to be called immediately after await_suspend returns). This looks like:

auto await_suspend(std::experimental::coroutine_handle<> awaiting) {
    std::cout << "About to resume the lazy" << std::endl;
    coro.resume();
    std::cout << "Returning awaiting coroutine as a continuation" << std::endl;
    return awaiting;
}

In this version the stack frame for our await_suspend is completely gone before the awaiting coroutine is resumed.


Posted: 27th August, 2017 07:52 (UTC)
First announced.

Generating Iterators

Created 30th July, 2017 03:58 (UTC), last edited 27th August, 2017 06:43 (UTC)

Now that we have co_yield working and understand how it works we should be able to quite easily convert our generator to provide iterators so it acts like a normal C++ collection (or range if you're following the ranges TS work).

int main() {
    for ( auto v : count() )
        std::cout << ">> " << v << std::endl;
    return 0;
}

The refactoring is pretty much a matter of just moving code around, but there are a couple of things worth looking at more closely. First, though, some easy bits:

class iterator {
    generator &owner;
    bool done;
    void advance() {
        std::cout << "Moving to next" << std::endl;;
        owner.coro.resume();
        std::cout << "Are we done? ";
        auto still_going = not owner.coro.done();
        std::cout << (still_going ? "There is another" : "We're done") << std::endl;
        done = not still_going;
    }
};

The done flag is interesting. Because we don't have any sort of position, and certainly no way of rolling back, there isn't much we can do for equality of these iterator instances other than tell whether they're both at the end or not.

All we can do with the coroutine handle is either destroy or resume it. We can't copy the coroutine and we can't step back to a previous suspension point. Our options here are very limited. This leads to the following for !=:

bool operator != (const iterator &r) const {
    return done != r.done;
}

And now it becomes clear that our iterator constructor should mark done as true for the end iterator:

iterator(generator &o, bool d)
: owner(o), done(d) {
    if ( not done ) advance();
}

In generator we now know how to make and return the iterators:

iterator begin() {
    return iterator{*this, false};
}
iterator end() {
    return iterator{*this, true};
}

This just leaves the final two iterator members we need for our ranged for loop:

iterator &operator ++ () {
    advance();
    return *this;
}
int operator * () const {
    return owner.coro.promise().current_value;
}

Running gives us what we expect:

New promise created
generator
Initial suspend
~generator (empty)
Moving to next
Going to yield 1
Yielded 1
Are we done? There is another
>> 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Final suspend
Are we done? We're done
~generator (contains a coroutine)
Promise died

To make our generator properly and generally useful we should now turn it into a template and fill in the rest of the iterator so it has all of the expected members. I'll leave that as an exercise.
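
As a head start on that exercise, the extra members a standard-library style input iterator is usually expected to provide look something like this (a sketch only; the type aliases follow the usual std::iterator_traits conventions and are my assumption about what's wanted, not code from the article):

using iterator_category = std::input_iterator_tag;
using value_type = int;
using difference_type = std::ptrdiff_t;
using pointer = const int *;
using reference = const int &;

bool operator == (const iterator &r) const {
    return done == r.done;
}

Templating the generator is then largely a matter of replacing int with the template parameter throughout.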

Yielding primes

So let's look at yielding something a little bit more interesting:

generator primes(int limit) {
    if ( limit >= 2 ) co_yield 2;
    for ( int n = 3; n <= limit; n += 2 ) {
        if ( is_prime(n) ) co_yield n;
    }
}
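
The is_prime helper isn't part of the listing; any primality test will do. A simple trial division version (my own sketch, not from the original) is plenty for these examples:

bool is_prime(int n) {
    // Trial division: check 2, then odd divisors up to the square root of n.
    if ( n < 2 ) return false;
    if ( n % 2 == 0 ) return n == 2;
    for ( int d = 3; d * d <= n; d += 2 )
        if ( n % d == 0 ) return false;
    return true;
}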

And we can call it like this:

int main() {
    std::cout << "coro4" << std::endl;
    for ( auto v : primes(10) )
        std::cout << ">> " << v << std::endl;
    return 0;
}

It's true that iterators can also be used as generators and we don't need this co_yield machinery to produce primes, but it's also true that this code is simpler, and that this just works. We haven't needed to alter any of our generator class code, or the way that the iterators work.

This is one of the big benefits of using coroutines. In the pre-coroutine world, if you want to do different things then each one is a different iterator implementation where you have to break the code up in exactly the right way yourself. With co_yield the compiler does this for you.
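
To see what co_yield is saving us, here's roughly what a hand-written equivalent of primes looks like (a sketch of my own, not code from the article). Note how the loop has to be turned inside out so that the “where were we?” state lives in data members rather than in straight-line code:

struct prime_range {
    int limit;
    struct iterator {
        int current, limit;
        int operator * () const { return current; }
        iterator &operator ++ () {
            // Step to the next prime, or past the limit if there are no more.
            do current += (current == 2 ? 1 : 2);
            while ( current <= limit && not is_prime(current) );
            return *this;
        }
        // As with the generator's iterator, equality only really knows about "the end".
        bool operator != (const iterator &) const { return current <= limit; }
    };
    iterator begin() const { return {2, limit}; }
    iterator end() const { return {limit + 1, limit}; }
};

A range based for over prime_range{10} produces the same values as primes(10), but every new sequence means writing another class like this by hand.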

We passed a parameter to the coroutine this time. How does that work? We've already seen the coroutine_handle class that we use to get at the promise_type and to resume and destroy the coroutine itself. We've sometimes referred to this as a stack frame, because that's also what it is. The compiler arranges to allocate some memory for us where information about the coroutine can be stored for as long as we keep it alive.

As well as our promise_type the compiler keeps whatever bookkeeping it needs to remember where the coroutine is up to (which suspension point we're currently at when the coroutine isn't running), and it also keeps the arguments and all the local variables there. This “stack frame” that is kept on the heap is what allows us to remember everything we need to run inside the coroutine as we bounce between it and the code that uses the results.
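
You can picture that frame as something like the following. This is purely illustrative (the layout is my guess at what a frame has to contain, not anything a real compiler documents or exposes):

struct primes_frame {                 // hypothetical frame for primes(int limit)
    generator::promise_type promise;  // where the yielded values are stored
    int limit;                        // the coroutine's argument
    int n;                            // the local loop variable
    int suspend_point;                // which co_yield we are currently stopped at
};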

Because only a single frame is kept on the heap, rather than a whole alternative stack, these sorts of coroutine are, rather confusingly, referred to as “stackless” coroutines: there's a stack frame, but not a whole stack.

If we were to have a whole stack then the sorts of coroutines we would be talking about would be “stackful coroutines”. Each type has its own set of trade-offs, but on most platforms stackful coroutines are easy to implement as a library and don't require compiler support. Stackless coroutines are most easily done with compiler support, and it's stackless coroutines that look set to be standardised first (hopefully in C++20). The Boost coroutine libraries implement stackful coroutines if you want to play with those.

One of the biggest downsides of stackless coroutines is that they are “infectious”, in the sense that once you start to use them, more and more of your code becomes coroutines. This isn't so much a problem when you're only using co_return and co_yield, but we'll see it when we start to want to use co_await.

Stackful coroutines don't suffer from this, but the requirement to have a full stack means that each coroutine uses a lot more memory, so they don't scale to as many coroutines as stackless ones do.

Yielding an infinity

Let's play around with our prime numbers a bit more. Right now we have to know up front what the largest number we want to deal with is going to be. Let's change that:

generator primes() {
    co_yield 2;
    for ( int n = 3; true; n += 2 ) {
        if ( is_prime(n) ) co_yield n;
    }
}
int main() {
    for ( auto v : primes() )
        std::cout << ">> " << v << std::endl;
    return 0;
}

Normally an infinite loop like this is a Bad Idea™ and indeed when we run this it does loop forever, so probably some limit is in order. Let's try to put it on the consumer side:

int main() {
    for ( auto v : primes() ) {
        if ( v > 15 ) break;
        std::cout << ">> " << v << std::endl;
    }
    return 0;
}

This is the output:

New promise created
generator
Initial suspend
~generator (empty)
Moving to next
Yielded 2
Are we done? There is another
>> 2
Moving to next
Yielded 3
Are we done? There is another
>> 3
Moving to next
Yielded 5
Are we done? There is another
>> 5
Moving to next
Yielded 7
Are we done? There is another
>> 7
Moving to next
Yielded 11
Are we done? There is another
>> 11
Moving to next
Yielded 13
Are we done? There is another
>> 13
Moving to next
Yielded 17
Are we done? There is another
~generator (contains a coroutine)
Promise died

If you look at the end of this you can see that, despite the fact that the coroutine never finishes, the promise destructor worked just fine. It turns out that everything in the stack frame is also properly dealt with. When the coroutine is destroyed through its handle the compiler makes sure that the correct destructors are called for everything in it.

We can absolutely rely on this behaviour in the same way that we can rely on the compiler calling the correct destructors when an exception bypasses a function. So long as a coroutine is suspended it's always safe to destroy it and never get around to continuing it. Later on we'll play with some more things this can be used for. For now though, we'll move on to the last of the key words: co_await.
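
A quick way to convince yourself of this is to give a coroutine a local with a noisy destructor and then abandon it part way through. This sketch reuses the generator class from above; the noisy type is just something I've made up for the demonstration:

struct noisy {
    ~noisy() { std::cout << "noisy destroyed" << std::endl; }
};

generator counting() {
    noisy n;
    for ( int i = 1; true; ++i ) co_yield i;
}

int main() {
    for ( auto v : counting() ) {
        if ( v > 3 ) break;
        std::cout << ">> " << v << std::endl;
    }
    return 0;
}

When the loop breaks, the generator's destructor destroys the suspended coroutine, and the frame teardown runs ~noisy just as you'd hope.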


Posted: 30th July, 2017 04:28 (UTC)
First announced.

Yielding Generators

Created 15th July, 2017 05:28 (UTC), last edited 30th July, 2017 03:59 (UTC)

We've seen how the promise_type together with the coroutine return type handles the interactions between the caller and the coroutine itself.

Our target is to be able to do something pretty simple:

generator count() {
    std::cout << "Going to yield 1" << std::endl;
    co_yield 1;
    std::cout << "Going to yield 2" << std::endl;
    co_yield 2;
    std::cout << "Going to yield 3" << std::endl;
    co_yield 3;
    std::cout << "count() is done" << std::endl;
}

co_yield is actually very similar to the co_return examples we saw previously. There is really just one additional component, and one small change:

  1. Because our generator doesn't have any sort of return value there is an implicit return that produces nothing. This means that instead of return_value we are going to need to fill in a return_void method.
  2. For each value that is yielded the compiler will arrange to make a call to yield_value.

These two things together mean that we need to suspend at the yields (so the user of the generator gets a chance to read the value) and we know we're finished yielding values when the return_void promise method gets called.

Let's get started

Let's sketch out something that should look fairly familiar:

struct generator {
    struct promise_type {
    };
    using handle_type =
        std::experimental::coroutine_handle<promise_type>;
private:
    generator(handle_type h)
    : coro(h) {
        std::cout << "generator" << std::endl;
    }
    handle_type coro;
};

This should look fairly similar to our lazy and sync examples.

We can track the lifetime of the promise to help us see what's going on and we also suspend the coroutine both at its start and end, all by adding these members to our promise_type:

promise_type() {
    std::cout << "Promise created" << std::endl;
}
~promise_type() {
    std::cout << "Promise died" << std::endl;
}
auto initial_suspend() {
    std::cout << "Started the coroutine, let's pause!" << std::endl;
    return std::experimental::suspend_always{};
}
auto final_suspend() {
    std::cout << "Finished the coroutine" << std::endl;
    return std::experimental::suspend_always{};
}

We still need to return the generator object from our promise_type, and we also have to do something if an exception happens.

auto get_return_object() {
    std::cout << "Send back a generator" << std::endl;
    return generator{handle_type::from_promise(*this)};
}
void unhandled_exception() {
    std::cout << "Exception...." << std::endl;
    std::exit(1);
}

For our co_return based lazy and sync classes we also had to provide a mechanism for the co_return value to find its way to the caller. We actually still have to do this, but our coroutine example doesn't have a return statement, so now we have to implement the return_void member:

auto return_void() {
    std::cout << "return_void" << std::endl;
    return std::experimental::suspend_never{};
}

We're not going to suspend here because we want to just carry on and suspend at the final suspension point (where we will stop so the generator can clean up for us).

Yielding values

The last thing we need is to handle the yield itself. When the coroutine gets to a co_yield the compiler will call yield_value on our promise_type so we can see the value and decide what to do.

auto yield_value(int value) {
    std::cout << "Yielded " << value << std::endl;
    current_value = value;
    return std::experimental::suspend_always{};
}

It's also important that we suspend here. If we don't then our coroutine will carry on and run another co_yield. This will cause yield_value to be called again and we'll lose the value that was just given to us—it'll get overwritten by the new one.
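
For comparison, this is what it would look like if we didn't suspend. It's not something you'd want to ship; it's only here to make the problem concrete:

auto yield_value(int value) {
    std::cout << "Yielded " << value << std::endl;
    current_value = value;
    // Without the suspend the coroutine runs straight on to the next
    // co_yield, overwriting current_value before anybody has read it.
    return std::experimental::suspend_never{};
}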

As we saw before, we now have the coroutine performing its half of the control flow. The rest of the control flow that happens needs to be in the generator code because that is what our caller will interact with.

There is a dance between the generator and the promise_type to control the resumption of the coroutine and the generation of values in exactly the right way. We need to make sure that we suspend and resume so that we get all of the values and don't skip any (something that's especially easy to get wrong at the start and end).

We need some sort of API that users of generator are going to use to fetch values. The obvious one is some sort of iterator based interface, but let's start with something a little bit simpler that involves less code.

int main() {
    auto g = count();
    while ( g.move_next() )
        std::cout << ">> " << g.current_value() << std::endl;
    return 0;
}

The implication here is that move_next will resume the coroutine until a co_yield or a return happens. If a co_yield happens we're going to return true; if the coroutine returns then we're going to return false to signal the end of the yielded sequence. Inside the loop we expect the current_value method to return the last value yielded.

The latter is very straightforward:

int current_value() {
    return coro.promise().current_value;
}

This is exactly as we had with the earlier sync and lazy examples. We store the last yielded value in the promise and can easily retrieve it through the coroutine handle we store in the generator.

For fetching values we want this sequence:

  1. Enter the coroutine, but suspend before doing anything.
  2. When move_next is called, resume the coroutine so that it can either return or yield a value.
  3. If the coroutine is now done we didn't get a value, so move_next returns false.
  4. If we're not done then return true so that move_next can be called again (going back to step 2).

Let's do that and also track what's going on:

bool move_next() {
    std::cout << "Moving to next" << std::endl;;
    coro.resume();
    std::cout << "Are we done? ";
    auto still_going = not coro.done();
    std::cout << (still_going ? "There is another" : "We're done") << std::endl;
    return still_going;
}

We now have enough to run:

Promise created
Send back a generator
generator
Started the coroutine, let's pause!
Moving to next
Going to yield 1
Yielded 1
Are we done? There is another
>> 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Finished the coroutine, brakes on
Are we done? We're done

Examining this we can see the dance between the two sides, the promise_type and the generator, play out in the right way.

  1. The promise creates and returns the generator.
  2. The coroutine is entered, but suspended.
  3. The generator's move_next is called and it advances the coroutine.
  4. The coroutine yields its first value.
  5. move_next finds that the coroutine has yielded so returns true.
  6. The value is printed.
  7. Steps 3 to 6 are repeated another couple of times.

The last time through is a bit different. The final move_next results in a call to return_void rather than yield_value and the coroutine ends up suspended at the final suspension point. coro.done() now returns true, so we know that the coroutine has yielded everything it can (or wants to), move_next returns false and we're all done.

To see the importance of getting this dance right let's change the initial_suspend to keep going:

auto initial_suspend() {
    std::cout << "Started the coroutine, keep going!" << std::endl;
    return std::experimental::suspend_never{};
}

Now we'll see this sequence:

Promise created
Send back a generator
generator
Started the coroutine, keep going!
Going to yield 1
Yielded 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Finished the coroutine, brakes on
Are we done? We're done

And because we didn't stop at the beginning we've resumed past the first value before it could be read. We can fix this by changing the way the generator is used:

int main() {
    std::cout << "coro3" << std::endl;
    auto g = count();
    do {
        std::cout << ">> " << g.current_value() << std::endl;
    } while ( g.move_next() );
    return 0;
}

Now we see all of our values:

Promise created
Send back a generator
generator
Started the coroutine, keep going!
Going to yield 1
Yielded 1
>> 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Finished the coroutine, brakes on
Are we done? We're done

But of course we have another problem. The code that uses the generator now assumes that the generator must yield at least one value. Our first API is probably the best one for a general-purpose generator, so let's put that back.
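
To make that assumption concrete, imagine a generator that yields nothing at all (a hypothetical example, not from the original code):

generator nothing() {
    std::cout << "Nothing to yield" << std::endl;
    co_return;
}

With the do/while version of main there is no yielded value for the first current_value() call to read, and the move_next() that follows resumes a coroutine that has already finished, which is exactly the kind of trouble we want to avoid.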

Fixing the leak

We still have a more subtle problem though. Notice we never see the “Promise died” message. This is because we're not managing the lifetime of the coroutine itself yet. We have the same issue that we saw with sync and lazy. We need to make the generator movable, but not copyable, and then destroy the coroutine at the correct point in time.

~generator() {
    std::cout << "~generator "
        << (not coro ? "(empty)" : "(contains a coroutine)")
        << std::endl;
    if ( coro ) coro.destroy();
}
generator(const generator &) = delete;
generator(generator &&g)
: coro(g.coro) {
    g.coro = nullptr;
};

Now we can see the coroutine itself is destroyed when our generator goes out of scope:

Promise created
Send back a generator
generator
Started the coroutine, let's pause!
~generator (empty)
Moving to next
Going to yield 1
Yielded 1
Are we done? There is another
>> 1
Moving to next
Going to yield 2
Yielded 2
Are we done? There is another
>> 2
Moving to next
Going to yield 3
Yielded 3
Are we done? There is another
>> 3
Moving to next
count() is done
return_void
Finished the coroutine, brakes on
Are we done? We're done
~generator (contains a coroutine)
Promise died

Next we're going to convert this to an iterator, a straightforward enough procedure, but with a few things worth pointing out.


Posted: 16th July, 2017 03:55 (UTC)
First announced.

Moved to HTTPS

Posted 14th July, 2017 12:06 (UTC), last edited 14th July, 2017 12:09 (UTC)

Thanks to the magic of Let's Encrypt, this site is now on HTTPS with a proper certificate. Yay :)

I've also dropped the www. part of the URL, which has long annoyed me.

A more realistic coroutine

Created 29th June, 2017 04:57 (UTC), last edited 15th July, 2017 05:39 (UTC)

Having gotten something working, we still have a small problem. We have a coroutine that we can start, and we can choose when to suspend it, but we can't yet resume it.

The coroutines TS describes a class std::experimental::coroutine_handle which is our interface to the coroutine itself. It's a template which needs to be told the promise_type we're using.

We can add the following to our sync class:

struct promise_type;
using handle_type = std::experimental::coroutine_handle<promise_type>;

We should store the handle in the sync object, so we can replace our default constructor with this:

sync(handle_type h)
: coro(h) {
    std::cout << "Created a sync object" << std::endl;
}
handle_type coro;

Now in our promise_type we will need to pass the correct handle when we create the sync object. The coroutine_handle has a static member that allows us to do this:

auto get_return_object() {
    std::cout << "Send back a sync" << std::endl;
    return sync<T>{handle_type::from_promise(*this)};
}

Now that we have that, we should change our shared_ptr members. In the promise_type we'll replace ptr with a plain T member called value, and we can now remove it entirely from our sync object. This does mean our sync::get member needs to pull the value from the coroutine_handle, but there's an API for that:

T get() {
    std::cout << "We got asked for the return value..." << std::endl;
    return coro.promise().value;
}

We clearly also have to change the way we record the return value in the promise_type:

auto return_value(T v) {
    std::cout << "Got an answer of " << v << std::endl;
    value = v;
    return std::experimental::suspend_never{};
}

If we try to run this we'll find we still have a problem with the lifetimes. The sync instance still outlives our promise_type instance so we hit undefined behaviour. The lifetime issue can be solved by suspending before we run off the end of the coroutine:

auto final_suspend() {
    std::cout << "Finished the coro" << std::endl;
    return std::experimental::suspend_always{};
}

I've rather arbitrarily changed the final_suspend member here, but you could also change return_value. Just so long as the value in the promise_type gets filled in before we suspend we'll be fine.

Now it leaks again of course, but we can plug the leak by destroying the coroutine through its handle in the sync destructor:

~sync() {
    std::cout << "Sync gone" << std::endl;
    if ( coro ) coro.destroy();
}

Now everything lines up and it all works, although there is still a subtle problem. If we copy the sync object around then we're going to end up destroying the coroutine_handle from more than one place.

When we deal with a coroutine_handle we have to remember that this is a resource and needs to be managed properly. The simplest way to deal with this is to make our sync movable but not copyable.

sync(const sync &) = delete;
sync(sync &&s)
: coro(s.coro) {
    std::cout << "Sync moved leaving behind a husk" << std::endl;
    s.coro = nullptr;
}
~sync() {
    std::cout << "Sync gone" << std::endl;
    if ( coro ) coro.destroy();
}
sync &operator = (const sync &) = delete;
sync &operator = (sync &&s) {
    coro = s.coro;
    s.coro = nullptr;
    return *this;
}

This final version deals correctly with all lifetime issues and outputs something along these lines:

Promise created
Send back a sync
Created a sync object
Started the coroutine, don't stop now!
Thinking deep thoughts…
Got an answer of 42
Finished the coro
Sync gone
Got a coroutine, let's get a value
We got asked for the return value…
And the coroutine value is: 42
Sync gone
Promise died

A lazy coroutine

We now have a working coroutine that executes its body when it's entered. Let's instead see if we can make a lazy coroutine that doesn't execute its body until it gets asked for the value.

Our general strategy is going to be this:

  1. Start the coroutine
  2. Return the lazy instance that will co-ordinate the return value.
  3. Suspend the coroutine.
  4. When the value in the lazy object is asked for, resume the coroutine.
  5. Suspend the coroutine after the co_return.
  6. The lazy object will return the value.
  7. Destroy the coroutine when the lazy is destructed.

There aren't actually that many changes we need to make. First of all just duplicate the sync class and call it lazy—we'll refactor this at the end.

Just to check nothing boneheaded has happened, first alter answer to return a lazy<int> rather than a sync and make sure we observe the same behaviour.

In order to delay the execution of the coroutine body we need to return suspend_always from initial_suspend:

auto initial_suspend() {
    std::cout << "Started the coroutine, put the brakes on!" << std::endl;
    return std::experimental::suspend_always{};
}

Now when we run it the coroutine body is never run so we don't see the proper value.

Promise created
Send back a lazy
Created a lazy object
Started the coroutine, put the brakes on!
Lazy gone
Got a coroutine, let's get a value
We got asked for the return value…
And the coroutine value is: 0
Lazy gone
Promise died

Luckily the coroutine_handle has a member that allows us to resume the execution of the coroutine at a time of our choosing.

T get() {
    std::cout << "We got asked for the return value..." << std::endl;
    coro.resume();
    return coro.promise().value;
}

With this additional change we now get the right result again, and we can see that things happen when we wanted them to:

Promise created
Send back a lazy
Created a lazy object
Started the coroutine, put the brakes on!
Lazy gone
Got a coroutine, let's get a value
We got asked for the return value…
Thinking deep thoughts…
Got an answer of 42
Finished the coro
And the coroutine value is: 42
Lazy gone
Promise died

Control the return type, control the coroutine

This describes the general pattern when working with coroutines. The way that the coroutine works in detail is fully described by the way that its return type controls the coroutine machinery provided by the language. This has the unfortunate side effect of rendering auto return type deduction useless for coroutines. Oh well.

Notice how in both our sync and lazy examples the body of answer has not changed at all. In both cases the literal 42 is returned (an int), but we wrap it in something that controls how that value finds its way to the caller of answer.

Now that we understand that, we can refactor our code to remove the redundancy and highlight the important differences between the two implementations.

With that done it's easy to see that the promise types for the two differ in whether or not they suspend at the initial suspend point; they also (obviously) differ in the type of object returned by get_return_object. The final place they differ is in the get implementation, which for lazy must also resume the coroutine:

template<typename T>
struct sync : public coreturn<T> {
    using coreturn<T>::coreturn;
    using handle_type = typename coreturn<T>::handle_type;
    T get() {
        std::cout << "We got asked for the return value..." << std::endl;
        return coreturn<T>::get();
    }
    struct promise_type : public coreturn<T>::promise {
        auto get_return_object() {
            std::cout << "Send back a sync" << std::endl;
            return sync<T>{handle_type::from_promise(*this)};
        }
        auto initial_suspend() {
            std::cout << "Started the coroutine, don't stop now!" << std::endl;
            return std::experimental::suspend_never{};
        }
    };
};


template<typename T>
struct lazy : public coreturn<T> {
    using coreturn<T>::coreturn;
    using handle_type = typename coreturn<T>::handle_type;
    T get() {
        std::cout << "We got asked for the return value..." << std::endl;
        this->coro.resume();
        return coreturn<T>::get();
    }
    struct promise_type : public coreturn<T>::promise {
        auto get_return_object() {
            std::cout << "Send back a lazy" << std::endl;
            return lazy<T>{handle_type::from_promise(*this)};
        }
        auto initial_suspend() {
            std::cout << "Started the coroutine, put the brakes on!" << std::endl;
            return std::experimental::suspend_always{};
        }
    };
};

The common part is now:

template<typename T>
struct coreturn {
    struct promise;
    friend struct promise;
    using handle_type = std::experimental::coroutine_handle<promise>;
    coreturn(const coreturn &) = delete;
    coreturn(coreturn &&s)
    : coro(s.coro) {
        std::cout << "Coreturn wrapper moved" << std::endl;
        s.coro = nullptr;
    }
    ~coreturn() {
        std::cout << "Coreturn wrapper gone" << std::endl;
        if ( coro ) coro.destroy();
    }
    coreturn &operator = (const coreturn &) = delete;
    coreturn &operator = (coreturn &&s) {
        coro = s.coro;
        s.coro = nullptr;
        return *this;
    }
    struct promise {
        friend struct coreturn;
        promise() {
            std::cout << "Promise created" << std::endl;
        }
        ~promise() {
            std::cout << "Promise died" << std::endl;
        }
        auto return_value(T v) {
            std::cout << "Got an answer of " << v << std::endl;
            value = v;
            return std::experimental::suspend_never{};
        }
        auto final_suspend() {
            std::cout << "Finished the coro" << std::endl;
            return std::experimental::suspend_always{};
        }
        void unhandled_exception() {
            std::exit(1);
        }
    private:
        T value;
    };
protected:
    T get() {
        return coro.promise().value;
    }
    coreturn(handle_type h)
    : coro(h) {
        std::cout << "Created a coreturn wrapper object" << std::endl;
    }
    handle_type coro;
};

A final problem

There is one final thing we need to think about. What happens if we fetch the answer from the coroutine more than once? We can change main like this:

int main() {
    std::cout << "coro1" << std::endl;
    auto a = answer();
    std::cout << "Got a coroutine, let's get a value" << std::endl;
    auto v = a.get();
    std::cout << "And the coroutine value is: " << v << std::endl;
    v = a.get();
    std::cout << "And the coroutine value is still: " << v << std::endl;
    return 0;
}

On my machine I get this output:

Promise created
Send back a lazy
Created a coreturn wrapper object
Started the coroutine, put the brakes on!
Coreturn wrapper moved
Coreturn wrapper gone
Got a coroutine, let's get a value
We got asked for the return value…
Thinking deep thoughts…
Got an answer of 42
Finished the coro
And the coroutine value is: 42
We got asked for the return value…

And then it crashes. What happens, of course, is that we resume a coroutine which is already at its final suspension point. That's undefined behaviour: we run off the end of it and its state is torn down underneath us. This ends badly.

What we need to do is make sure we only resume the coroutine the first time get is called. Luckily there is a simple way to do this:

T get() {
    std::cout << "We got asked for the return value..." << std::endl;
    if ( not this->coro.done() ) this->coro.resume();
    return coreturn<T>::get();
}

done will return true only when the coroutine is at its final suspension point (the one after the body is exited). With that change we have a nice reliable lazy function that will only ever be evaluated once.


Next time we're going to extend this further and start to look at the co_yield mechanism which will allow us to build some generators.


Posted: 2nd July, 2017 16:12 (UTC)
First announced.