By Jake Edge
May 13, 2015
It is already possible to create coroutines for
asynchronous processing in Python.
But a recent proposal would elevate coroutines to a full-fledged language
construct, rather than treat them as a type of generator as they are
currently. Two new keywords, async and await, would be
added to the language to support coroutines as first-class Python features.
A coroutine is a kind of function that can
suspend and resume its execution at various pre-defined locations in its
code. Subroutines are a special case of coroutines that have just a single
entry point and complete their execution by returning to their caller.
Python's coroutines (both the existing generator-based variety and the
newly proposed one) are not fully general, since they can only
transfer control back to their caller when suspending their execution,
rather than switching to some other arbitrary coroutine as in the general case.
When coupled with an event loop, coroutines can be used to do asynchronous
processing, I/O in particular.
Python's current coroutine support is based on the enhanced generators from PEP 342, which was
adopted into Python 2.5. That PEP changed the yield statement
to be an expression, added several new methods for generators
(send(), throw(), and close()), and ensured that
close() would be called when generators get garbage-collected.
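The averaging generator below is a common illustration of PEP 342-style
coroutines (it is not from the article): yield acts as an expression whose
value is supplied by send(), and close() shuts the generator down.

```python
def averager():
    """A generator-based coroutine keeping a running average (PEP 342 style)."""
    total = 0.0
    count = 0
    average = None
    while True:
        # yield as an expression: send() resumes execution here and
        # its argument becomes the value of the yield expression
        value = yield average
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)              # "prime" the coroutine: run it to the first yield
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0
avg.close()            # raises GeneratorExit inside the generator
```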
That functionality was further enhanced in Python 3.3 with PEP 380, which
added the yield from expression to allow a generator to
delegate some of its functionality to another generator (i.e. a sub-generator).
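A minimal sketch of that delegation (the function names here are
illustrative, not from the PEP): yield from forwards values from the
sub-generator and captures its return value.

```python
def subgen():
    yield 1
    yield 2
    return 'done'          # becomes the value of the yield from expression

def delegator():
    # yield from transparently forwards send()/throw()/close() to subgen()
    result = yield from subgen()
    yield result

print(list(delegator()))   # [1, 2, 'done']
```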
But all of that ties coroutines to generators, which can be confusing and
also limits where in the code it is legal to make an asynchronous call. In
particular, the with and for statements could
conceptually use an asynchronous call to a coroutine, but cannot because
the language syntax does not allow yield expressions in those
locations. In addition, if a refactoring of the coroutine moves the
yield or yield from out of the function (into
a called function, for example), it is no longer treated as a coroutine,
which can lead to non-obvious errors; the asyncio module
works around this deficiency with an @asyncio.coroutine decorator.
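The pitfall can be demonstrated with the standard inspect module (the
function names below are hypothetical): moving the yield from into a
helper silently changes how Python classifies the outer function.

```python
import inspect

def work():
    yield 1

def original():
    # Contains a yield from, so Python compiles it as a generator function
    yield from work()

def helper():
    yield from work()

def refactored():
    # The yield from moved into helper(); this is now an ordinary
    # function that merely returns a generator object when called
    return helper()

print(inspect.isgeneratorfunction(original))    # True
print(inspect.isgeneratorfunction(refactored))  # False
```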
PEP 492 is meant to
address all of those issues. The ideas behind it were first raised by
Yury Selivanov on the python-ideas mailing list in mid-April; the
proposal was enthusiastically embraced by many in that thread, and by
May 5 it had been accepted for Python 3.5 by Guido van Rossum. Not only
that, but the implementation was merged on May 12.
It all moved rather quickly,
though it was discussed at length in multiple threads on both python-ideas
and python-dev.
The changes are fairly straightforward from a syntax point of view:
async def read_data(db):
    data = await db.fetch('SELECT ...')
    ...
That example (which comes from the PEP) would create a
read_data()
coroutine using the new
async def construct. The
await expression would suspend execution of
read_data()
until the
db.fetch() awaitable completes and returns its result.
await is similar to
yield from, but it validates
that its argument is an awaitable.
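A runnable variant of the PEP's sketch, with a hypothetical fetch()
coroutine standing in for db.fetch() and asyncio.sleep() simulating the
I/O wait (asyncio.run() is used for brevity; it arrived later, in
Python 3.7):

```python
import asyncio

async def fetch(query):
    # Hypothetical stand-in for db.fetch(); sleep(0) simulates async I/O
    await asyncio.sleep(0)
    return [('row', query)]

async def read_data(query):
    # Execution of read_data() suspends here until fetch() completes
    data = await fetch(query)
    return data

print(asyncio.run(read_data('SELECT ...')))   # [('row', 'SELECT ...')]
```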
There are several different types of awaitable. A native coroutine object,
as returned by calling a native coroutine (i.e. one defined with
async def) is an awaitable, as is a generator-based coroutine
that has been decorated with @types.coroutine. Future
objects, which represent some processing that will complete in the future,
are also awaitable. Finally, any object that implements the
__await__() magic method is awaitable.
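A minimal sketch of that last kind of awaitable (the Lazy class is an
illustration, not from the PEP): __await__() must return an iterator,
which is usually done by writing it as a generator.

```python
import asyncio

class Lazy:
    """A minimal awaitable: __await__() must return an iterator."""
    def __init__(self, value):
        self.value = value

    def __await__(self):
        if False:
            yield           # the yield makes __await__ a generator
        return self.value   # the return value becomes the await result

async def main():
    return await Lazy(42)

print(asyncio.run(main()))  # 42
```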
There is a problem that comes with adding new keywords to a language,
however: any variable with the same name as a new keyword suddenly
becomes a syntax error. To avoid that problem, Python 3.5 and 3.6 will
"softly deprecate" async and await as variable names rather than
treating them as syntax errors. The parser will keep track of
async def blocks and treat the keywords differently within
those blocks, which will allow existing uses to continue to function.
There are two other uses of async that will come with the new
feature: asynchronous context managers (i.e. with) and iterators
(i.e. for). Inside a coroutine, these two constructs can be used
as shown in these examples from the PEP:
async def commit(session, data):
    ...

    async with session.transaction():
        ...
        await session.update(data)
        ...

async for row in Cursor():
    print(row)
Asynchronous context managers must implement two magic
async methods,
__aenter__() and
__aexit__(), both of which return
awaitables, while an asynchronous
iterator would implement
__aiter__() and
__anext__().
Those are effectively the asynchronous versions of the magic methods used
by the existing synchronous context manager and iterator.
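Filling in those protocols gives something like the following sketch (the
Transaction and Cursor classes are hypothetical stand-ins, written for
modern Python where __aiter__() is a plain method returning the iterator):

```python
import asyncio

class Transaction:
    # Asynchronous context manager: both methods are coroutines, so
    # calling them returns awaitables, as the protocol requires
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False        # do not suppress exceptions

class Cursor:
    # Asynchronous iterator over a fixed set of rows
    def __init__(self, rows):
        self._rows = iter(rows)

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return next(self._rows)
        except StopIteration:
            # The async protocol uses its own sentinel exception
            raise StopAsyncIteration

async def main():
    results = []
    async with Transaction():
        async for row in Cursor(['a', 'b']):
            results.append(row)
    return results

print(asyncio.run(main()))  # ['a', 'b']
```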
The main question early on was whether the deferred "cofunction" feature (PEP 3152) might be
a better starting point. The author of that PEP, Greg Ewing, raised the issue, but there was a lot of
agreement that the syntax proposed by Selivanov was preferable to
the codef, cocall, and the like from Ewing's proposal.
There was a fair amount of back and forth, but the cofunction syntax for
handling
certain cases got rather complex and non-Pythonic in the eyes of some. Van
Rossum summarized the problems with
cofunctions while rejecting that approach.
There were also several suggestions of additional asynchronous features
that could be added,
but nothing that seemed too urgent. There was some bikeshedding on
the keywords (and their order, some liked def async, for
example). The precedence of await was also debated at some length,
with the result being that, unlike yield and
yield from, which have the lowest precedence, await
has a high precedence: between exponentiation and the group of
subscripting, calls, and attribute references.
Mark Shannon complained that there was no
need to add new syntax to do what Selivanov was proposing. Others had made
similar observations, and the point was not disputed by Selivanov or other
proponents; the idea is simply to make it easier to program with
coroutines. Beyond that, Van Rossum wants the places where a coroutine can
be suspended to be obvious from reading the
code:
But new syntax is the whole point of the PEP. I want to be able to
*syntactically* tell where the suspension points are in coroutines.
Currently this means looking for yield [from]; PEP 492 just adds looking
for await and async [for|with]. Making await() a function defeats the
purpose because now aliasing can hide its presence, and we're back in the
land of gevent or stackless (where *anything* can potentially suspend the
current task). I don't want to live in that land.
Over a two to three week period, multiple versions of the PEP were posted
and debated, with Selivanov patiently explaining his ideas or modifying
them based on the feedback. For a feature that seems likely to be quite
important in Python's future, the whole process went remarkably quickly—and
smoothly. It will probably take a fair amount more time for those ideas to
sink in more widely with Python developers.