By Jake Edge
May 13, 2015
It is already possible to create coroutines for
asynchronous processing in Python.
But a recent proposal would elevate coroutines to a full-fledged language
construct, rather than treat them as a type of generator as they are
currently. Two new keywords, async and await, would be
added to the language to support coroutines as first-class Python features.
A coroutine is a kind of function that can
suspend and resume its execution at various pre-defined locations in its
code. Subroutines are a special case of coroutines that have just a single
entry point and complete their execution by returning to their caller.
Python's coroutines (both the existing generator-based variety and the
newly proposed one) are not fully general, since they can only
transfer control back to their caller when suspending their execution,
rather than switching to some other arbitrary coroutine as in the general case.
When coupled with an event loop, coroutines can be used to do asynchronous
processing, I/O in particular.
Python's current coroutine support is based on the enhanced generators from PEP 342, which was
adopted into Python 2.5. That PEP changed the yield statement
to be an expression, added several new methods for generators
(send(), throw(), and close()), and ensured that
close() would be called when generators get garbage-collected.
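The averaging generator below is a common illustration of PEP 342-style
coroutines (it is not from the article): yield acts as an expression whose
value is supplied by send(), and close() shuts the generator down.

```python
def averager():
    """A generator-based coroutine keeping a running average (PEP 342 style)."""
    total = 0.0
    count = 0
    average = None
    while True:
        # yield as an expression: send() resumes execution here and
        # its argument becomes the value of the yield expression
        value = yield average
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)              # "prime" the coroutine: run it to the first yield
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0
avg.close()            # raises GeneratorExit inside the generator
```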
That functionality was further enhanced in Python 3.3 with PEP 380, which
added the yield from expression to allow a generator to
delegate some of its functionality to another generator (i.e. a sub-generator).
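A minimal sketch of that delegation (the function names here are
illustrative, not from the PEP): yield from forwards values from the
sub-generator and captures its return value.

```python
def subgen():
    yield 1
    yield 2
    return 'done'          # becomes the value of the yield from expression

def delegator():
    # yield from transparently forwards send()/throw()/close() to subgen()
    result = yield from subgen()
    yield result

print(list(delegator()))   # [1, 2, 'done']
```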
But all of that ties coroutines to generators, which can be confusing and
also limits where in the code it is legal to make an asynchronous call. In
particular, the with and for statements could
conceptually use an asynchronous call to a coroutine, but cannot because
the language syntax does not allow yield expressions in those
locations. In addition, if a refactoring of the coroutine moves the
yield or yield from out of the function (into
a called function, for example), it is no longer treated as a coroutine,
which can lead to non-obvious errors; the asyncio module
works around this deficiency with an @asyncio.coroutine decorator.
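The pitfall can be demonstrated with the standard inspect module (the
function names below are hypothetical): moving the yield from into a
helper silently changes how Python classifies the outer function.

```python
import inspect

def work():
    yield 1

def original():
    # Contains a yield from, so Python compiles it as a generator function
    yield from work()

def helper():
    yield from work()

def refactored():
    # The yield from moved into helper(); this is now an ordinary
    # function that merely returns a generator object when called
    return helper()

print(inspect.isgeneratorfunction(original))    # True
print(inspect.isgeneratorfunction(refactored))  # False
```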
PEP 492 is meant to
address all of those issues. The ideas behind it were first raised by
Yury Selivanov on the python-ideas mailing list in mid-April; the
proposal was enthusiastically embraced by many in that thread, and by
May 5 it had been accepted for Python 3.5 by Guido van Rossum. Not only
that, but the implementation was merged on May 12.
It all moved rather quickly,
though it was discussed at length in multiple threads on both python-ideas
and python-dev.
The changes are fairly straightforward from a syntax point of view:
async def read_data(db):
    data = await db.fetch('SELECT ...')
    ...
That example (which comes from the PEP) would create a
read_data()
coroutine using the new
async def construct. The
await expression would suspend execution of
read_data()
until the
db.fetch() awaitable completes and returns its result.
await is similar to
yield from, but it validates
that its argument is an awaitable.
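A runnable variant of the PEP's sketch, with a hypothetical fetch()
coroutine standing in for db.fetch() and asyncio.sleep() simulating the
I/O wait (asyncio.run() is used for brevity; it arrived later, in
Python 3.7):

```python
import asyncio

async def fetch(query):
    # Hypothetical stand-in for db.fetch(); sleep(0) simulates async I/O
    await asyncio.sleep(0)
    return [('row', query)]

async def read_data(query):
    # Execution of read_data() suspends here until fetch() completes
    data = await fetch(query)
    return data

print(asyncio.run(read_data('SELECT ...')))   # [('row', 'SELECT ...')]
```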
There are several different types of awaitable. A native coroutine object,
as returned by calling a native coroutine (i.e. one defined with
async def) is an awaitable, as is a generator-based coroutine
that has been decorated with @types.coroutine. Future
objects, which represent some processing that will complete in the future,
are also awaitable. Finally, any object that implements the
__await__() magic method is awaitable.
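A minimal sketch of that last kind of awaitable (the Lazy class is an
illustration, not from the PEP): __await__() must return an iterator,
which is usually done by writing it as a generator.

```python
import asyncio

class Lazy:
    """A minimal awaitable: __await__() must return an iterator."""
    def __init__(self, value):
        self.value = value

    def __await__(self):
        if False:
            yield           # the yield makes __await__ a generator
        return self.value   # the return value becomes the await result

async def main():
    return await Lazy(42)

print(asyncio.run(main()))  # 42
```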
There is a problem that comes with adding new keywords to a language,
however: any variable with the same name as a new keyword suddenly
becomes a syntax error. To avoid that problem, Python 3.5 and 3.6 will
"softly deprecate" async and await as variable names rather than
treating them as syntax errors. The parser will keep track of
async def blocks and treat the keywords differently within
those blocks, which will allow existing uses to continue to function.
There are two other uses of async that will come with the new
feature: asynchronous context managers (i.e. with) and iterators
(i.e. for). Inside a coroutine, these two constructs can be used
as shown in these examples from the PEP:
async def commit(session, data):
    ...

    async with session.transaction():
        ...
        await session.update(data)
        ...

async for row in Cursor():
    print(row)
Asynchronous context managers must implement two magic
async methods,
__aenter__() and
__aexit__(), both of which return
awaitables, while an asynchronous
iterator would implement
__aiter__() and
__anext__().
Those are effectively the asynchronous versions of the magic methods used
by the existing synchronous context manager and iterator.
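Filling in those protocols gives something like the following sketch (the
Transaction and Cursor classes are hypothetical stand-ins, written for
modern Python where __aiter__() is a plain method returning the iterator):

```python
import asyncio

class Transaction:
    # Asynchronous context manager: both methods are coroutines, so
    # calling them returns awaitables, as the protocol requires
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False        # do not suppress exceptions

class Cursor:
    # Asynchronous iterator over a fixed set of rows
    def __init__(self, rows):
        self._rows = iter(rows)

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return next(self._rows)
        except StopIteration:
            # The async protocol uses its own sentinel exception
            raise StopAsyncIteration

async def main():
    results = []
    async with Transaction():
        async for row in Cursor(['a', 'b']):
            results.append(row)
    return results

print(asyncio.run(main()))  # ['a', 'b']
```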
The main question early on was whether the deferred "cofunction" feature (PEP 3152) might be
a better starting point. The author of that PEP, Greg Ewing, raised the issue, but there was a lot of
agreement that the syntax proposed by Selivanov was preferable to
the codef, cocall, and the like from Ewing's proposal.
There was a fair amount of back and forth, but the cofunction syntax for
handling
certain cases got rather complex and non-Pythonic in the eyes of some. Van
Rossum summarized the problems with
cofunctions while rejecting that approach.
There were also several suggestions of additional asynchronous features
that could be added,
but nothing that seemed too urgent. There was some bikeshedding on
the keywords (and their order, some liked def async, for
example). The precedence of await was also debated at some length,
with the result being that, unlike yield and
yield from, which have the lowest precedence, await
has a high precedence: between exponentiation and the group of
subscripting, calls, and attribute references.
Mark Shannon complained that there was no
need to add new syntax to do what Selivanov was proposing. Others had made
similar observations, and the point was not disputed by Selivanov or other
proponents; the idea is simply to make it easier to program with
coroutines. Beyond that, Van Rossum wants the places where a coroutine can
be suspended to be obvious from reading the
code:
But new syntax is the whole point of the PEP. I want to be able to
*syntactically* tell where the suspension points are in coroutines.
Currently this means looking for yield [from]; PEP 492 just adds looking
for await and async [for|with]. Making await() a function defeats the
purpose because now aliasing can hide its presence, and we're back in the
land of gevent or stackless (where *anything* can potentially suspend the
current task). I don't want to live in that land.
Over a two to three week period, multiple versions of the PEP were posted
and debated, with Selivanov patiently explaining his ideas or modifying
them based on the feedback. For a feature that seems likely to be quite
important in Python's future, the whole process went remarkably quickly—and
smoothly. It will probably take a fair amount more time for those ideas to
sink in more widely with Python developers.