Article / 4th Jun 2018

A Django Async Roadmap

I think that the time has come to start talking seriously about bringing async functionality into Django itself, and so I have been working on a draft "roadmap" for what I think this might look like. I've run this past a few people - some of whom were Django core members, and some who weren't - but I'm now posting it up for public feedback (see the end for where to discuss this).

I'm not saying we should do exactly what I outline here, or indeed that we must do it at all, but it's the culmination of quite a lot of research into asynchronous support, what it means, and how we can add it to Django in a feasible way rather than rewriting everything.

Firstly, let's start with what this is not. It's not, in any form, an attempt to bring WebSocket or any other protocol handling into Django - but it will probably help their implementation in other projects. Channels will still exist for WebSocket handling (that's covered below).

Instead, the goal is to make Django a world-class example of what async can enable for HTTP requests, such as:

Doing ORM queries in parallel
Allowing views to query external APIs without blocking threads
Running slow-response/long-poll endpoints alongside each other efficiently

And, on top of all this, bringing easy performance improvements to any project that spends the majority of its time blocking on databases or sockets (which is most projects!).
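To make the "in parallel" point concrete, here is a minimal sketch using plain asyncio. The fake_query coroutine is a hypothetical stand-in for an awaitable ORM query or external API call; nothing here is a real Django API.

```python
import asyncio
import time


async def fake_query(name, delay):
    # Hypothetical stand-in for an awaitable ORM query or API call.
    await asyncio.sleep(delay)
    return name + " rows"


async def view():
    # Both "queries" are in flight at once, so the total wait is roughly
    # the slowest one rather than the sum of the two.
    return await asyncio.gather(
        fake_query("users", 0.2),
        fake_query("orders", 0.2),
    )


def timed_run():
    # Run the async view from synchronous code and measure the wall time.
    start = time.monotonic()
    results = asyncio.run(view())
    return results, time.monotonic() - start
```

Run one after the other, the two waits would add up to about 0.4 seconds; gathered, they should overlap and finish in roughly half that.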

Channels has been a great proving ground for a lot of the ideas and techniques to use, and I think will continue to serve a purpose for quite a while as the place for WebSockets, broadcast groups, channel layers and related functionality, but it can only do so much sitting underneath an entirely synchronous Django.

At the same time, it's imperative that we keep Django backwards-compatible with existing code and, especially, don't make it any more complicated for beginners to dive in and get started. We shouldn't expect every developer to know about, or want, asynchronous code support.

This roadmap lays out plans for a phased introduction of async functionality into Django in a way that can remain backwards-compatible, and gradually make the core of Django fully asynchronous over the course of several releases.

The end goals are:

To end up with a majority of the blocking parts of Django (such as sessions, auth, the ORM, handlers, etc.) being async-native with a synchronous wrapper exposed on top where needed (for user simplicity and/or backwards compatibility)
To keep Django's familiar layout of models/views/templates/middleware, albeit with a few changes as necessary.
To have Django be at least as fast as it was before, if not faster, and not cause significant performance regressions at any stage of this plan.
To allow people to write fully-async websites if they desire, from the request handler all the way down to the view and ORM, but to not force this as the default.
To adopt new talent into the Django team by ensuring the changes are done in a way where we can have new contributors helping out on large-scale features, the kind that traditionally caused people to be added to the core team but have been more scarce recently.

This is no small endeavour, and I expect the overall effort to take on the order of years. That said, I believe it can be done incrementally in a way that provides benefits in every release.

Timing

Why now?

Django 2.1 will be the first release that only supports Python 3.5 and up, and so this provides us the perfect place to start working on async-native code, as async def and similar native support for coroutines was not present in Python before.
Asynchronous database backends for Python are starting to appear, and even ones that aren't natively async have been proven to work well in a threadpool.
The Web is slowly shifting to use cases that prefer high-concurrency workloads and large parallelisable queries (especially API patterns like GraphQL).

Why not now?

asyncio still has growing pains, and there are alternative async frameworks for Python that propose to replace it but are incompatible.
- There is sufficient momentum behind asyncio that I think it is going to end up staying. I would like to explore ways that we can help gradually evolve it, though, rather than replacing it wholesale, and maybe ways we can provide common async operations as safe higher-level primitives.
Developers are still unfamiliar with developing Python applications with async support, and there's a lack of documentation, tutorials and tooling to help with it.
- Django could prove to be a good catalyst for helping these materials get made, and by default we'll still try and keep people away from most async stuff (apart from maybe await-ing a few ORM queries).

Changes

I'll split Django into its main features and address them individually, as we ideally want to work on several things in parallel (though I imagine Request Path and ORM come first - see "Timeline" below).

Request Path

The Django request path is approximately:

A WSGI Handler, which turns the raw request into an HttpRequest object, and an HttpResponse into a raw response
A URL routing layer, which maps incoming requests to a view
Middleware, which processes requests and responses on the way to and from a view
Views, which run business logic

The first two parts - handlers and the URL routing layer - are internal to Django and expose very few direct public APIs, and so rewriting these to be natively async-capable is easier. Middleware and views are, however, provided by our users, and so backwards compatibility must be maintained. Especially important is that views are kept simple - we should not require users to build something like Channels-style consumers if they don't need the power.

This would be achieved by changing the middleware and view flow to be fully asynchronous, and wrapping any synchronous middleware in a threadpool-based wrapper that allows it to still execute. We may even change to just have ASGI middleware rather than Django middleware - the two are now quite similar in concept after the middleware changes in the last few Django releases - but there would be some extra work needed to make them line up.

The bottom layer of this middleware stack would then inspect to see if the thing it was calling (the item pointed to by the URL routing) was a synchronous view, asynchronous view or maybe even an ASGI application (if we want to support those as well as views) and execute it appropriately.

Currently, detecting if something is an async view or ASGI application would not be possible as both would just look like a callable that returns a coroutine; there are a couple of potential solutions to this. We can detect if something is a synchronous View, however, and preserve backwards compatibility (and run it in a threadpool).
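The part that is detectable can be sketched as follows. This is only an illustration under assumed names (call_view, sync_view and async_view are hypothetical): it tells a plain function apart from a coroutine function and pushes the former into the default threadpool. It does not attempt the harder problem of telling an async view from an ASGI application.

```python
import asyncio
import functools
import inspect


async def call_view(view, request):
    # Hypothetical bottom layer of the middleware stack.
    if inspect.iscoroutinefunction(view):
        # Async view: await it directly on the event loop.
        return await view(request)
    # Sync view: run it in the default threadpool so it cannot block the loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, functools.partial(view, request))


def sync_view(request):
    return "sync:" + request


async def async_view(request):
    return "async:" + request
```

Both kinds of view can then be called the same way from async code, for example asyncio.run(call_view(sync_view, "req")).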

Even if there is a fully asynchronous path through the handler, WSGI compatibility has to also be maintained; in order to do this, the WSGIHandler will coexist alongside a new ASGIHandler, and run the system inside a one-off eventloop - keeping it synchronous externally, and asynchronous internally.

This will allow async views to do multiple asynchronous requests and launch short-lived coroutines even inside of WSGI, if you choose to run that way. If you choose to run under ASGI, however, you will then also get the benefits of requests not blocking each other and using fewer threads.
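The "synchronous externally, asynchronous internally" idea for the WSGI path might look roughly like this. It is a sketch only: async_handler is a hypothetical stand-in for the async middleware and view stack, and a real handler would need to deal with streaming, errors and loop reuse.

```python
import asyncio


async def async_handler(path):
    # Hypothetical async core; a real one would run middleware and the view.
    await asyncio.sleep(0)
    return "200 OK", b"Hello from " + path.encode("utf-8")


def wsgi_app(environ, start_response):
    # A standard WSGI callable on the outside. Inside, a one-off event
    # loop is created, runs the coroutine to completion, and is closed.
    status, body = asyncio.run(async_handler(environ.get("PATH_INFO", "/")))
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

Within async_handler a view could still gather several coroutines for a single request; what WSGI cannot offer is overlap between different requests on the same thread.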

ORM

The ORM is arguably the biggest and most complex part of Django, and thus will be the one that takes the most work to convert - especially as async database backends are still an area of active research and development in Python.

Fortunately, most of that complexity is hidden behind a relatively small public API. The only real blocking operations you can do against the ORM are:

Evaluating a QuerySet
Introspecting tables, columns or indexes
Calling save/update/etc. on a model instance
Loading a lazy ForeignKey/RelatedField attribute on a model instance

Sadly, it is not possible to have asynchronous attribute access in Python 3 and so we cannot preserve the lazy ForeignKey attribute traversal. However, everything else can be kept:

QuerySets gain both __await__ (which returns a full result list of all rows) and __aiter__ (paginated/cursor-based results)
Introspection APIs just become awaitable functions
Save, update and similar model instance methods become awaitable functions
Related field references on model instances only work if they were included in a select_related/prefetch_related statement
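As a rough illustration of the first point, a class can support both `await qs` and `async for row in qs` by defining __await__ and __aiter__. SketchQuerySet below is hypothetical and holds its rows in memory; it is not Django's QuerySet, and the sleeps merely mark where a database wait would go.

```python
import asyncio


class SketchQuerySet:
    """Hypothetical stand-in for a QuerySet; not Django's real class."""

    def __init__(self, rows):
        self._rows = list(rows)

    async def _fetch_all(self):
        await asyncio.sleep(0)  # pretend to wait on the database
        return list(self._rows)

    def __await__(self):
        # `await qs` evaluates the whole query and returns a list of rows.
        return self._fetch_all().__await__()

    async def __aiter__(self):
        # `async for row in qs` yields rows one at a time, as a cursor would.
        for row in self._rows:
            await asyncio.sleep(0)  # pretend to fetch the next row
            yield row


async def demo():
    qs = SketchQuerySet(["a", "b", "c"])
    everything = await qs
    streamed = [row async for row in qs]
    return everything, streamed
```

Attribute access has no equivalent hook, which is why the lazy ForeignKey traversal cannot be carried over in the same way.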

Internally, we would first run most of the existing ORM code - the query builder and database backends - inside a threadpool, providing asynchronous support over the top. Then, we would progressively move the "synchronous barrier" down towards the cursor wrappers, allowing databases that don't have async backends to then decide to run that in a thread themselves, while async-native backends can have full control over what they do.

Migrations would initially remain unchanged, as they aren't designed to run in a request environment. They could potentially be made async at a later date, though it's unclear if this has any benefits.

Templates

The Django template renderer is difficult to change, as we have found previously, but many uses of QuerySets come from templates and so we need to think about them.

Initially, we would just leave it as a synchronous render inside of a threadpool, which will run those ORM queries in blocking mode inside a thread, effectively working as they do now but providing an async-capable render method for use in asynchronous views.

As a second phase, we would look at what it would take to make it async-capable, and if we want to do that work rather than fully recommending another template language (the most popular alternative, Jinja2, is async-capable). The tricky part of this would be determining when things are called sync vs async; how do we dereference variables/handle things like the {% for %} tag? Should we even allow querysets to be called in templates any more?

Forms

Form validation and saving would have to go fully async. In this case, we could probably solve the "is it sync or async?" problem with a constructor argument to the form which says if it's async or sync. If we ever wanted to make async the default, we could change this argument from default-False to default-True.

Caching

Django's cache layer is reasonably simple, and we would simply make a parallel, asynchronous version of the existing API. This would, unfortunately, have to be namespaced or prefixed, since you can't have a function that's both synchronously callable and that also returns an awaitable.
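The reason for separate names can be seen in a small sketch. The get_async name is purely an assumption for illustration (the naming scheme is not decided here): calling a coroutine function without await returns a coroutine object, not the cached value, so one name cannot serve both kinds of caller.

```python
import asyncio
import inspect


class SketchCache:
    """Hypothetical in-memory cache; method names are illustrative only."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    async def get_async(self, key, default=None):
        # A real backend would await network I/O here.
        await asyncio.sleep(0)
        return self._data.get(key, default)


def demo():
    cache = SketchCache()
    cache.set("answer", 42)
    pending = cache.get_async("answer")  # no await: a coroutine, not 42
    was_coroutine = inspect.iscoroutine(pending)
    value = asyncio.run(pending)  # running the coroutine produces 42
    return cache.get("answer"), was_coroutine, value
```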

Sessions

Session backends will need to become async-capable; since they're only called directly by Django, we can have them advertise if they are async or not via an attribute, and then run them in a threadpool if needed. Eventually, we would assume all backends are async-capable and ship a "wrapper backend" that takes a sync backend and turns it async, so people can still write them synchronously if needed.
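A minimal sketch of the attribute-plus-wrapper approach, with every name assumed for illustration (is_async, load, AsyncWrapperBackend and load_session are hypothetical, not an existing Django interface):

```python
import asyncio


class SyncSketchBackend:
    is_async = False  # hypothetical attribute advertising the backend's style

    def load(self, session_key):
        # A real backend would do blocking I/O here.
        return {"session_key": session_key}


class AsyncWrapperBackend:
    """Wraps a sync backend so callers can always await it."""

    is_async = True

    def __init__(self, backend):
        self._backend = backend

    async def load(self, session_key):
        # Run the blocking call in the default threadpool.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._backend.load, session_key)


async def load_session(backend, session_key):
    # The framework inspects the attribute and wraps sync backends itself.
    if not getattr(backend, "is_async", False):
        backend = AsyncWrapperBackend(backend)
    return await backend.load(session_key)
```

Because only the framework calls the backend, the wrapping stays invisible to anyone who writes a synchronous backend.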

The few end-user session functions will gain asynchronous versions. Most interaction with sessions is reading and writing from request.session, which does not need to be async as it's not saved as soon as you write to it.

Authentication

Similarly to session backends, authentication backends will also need to be optionally-async with an attribute advertising their status. End-user functions, like login/logout, will need optional-async versions as well. Namespacing here is again tricky; we'll need to provide both sync and async versions of these functions for a good while, and they're top-level in a module.

Admin

The core admin classes can be rebuilt to be more async once the rest of the work they would need is complete, but initially it would be left alone and serve as a decent test of the backwards-compatibility of the changes.

Email

The core email sending code will initially remain as synchronous-in-threadpool, but gain async interfaces to trigger email sending. Later on, we can then investigate switching it to use a fully asynchronous SMTP transport.

Static files

The only part here that needs changing would be the static file serving code, which can just run under the backwards-compatibility synchronous layer initially and then be upgraded to full-async at leisure.

Signals

Signals would have to change to have the ability to be called asynchronously - their async-ness would have to be part of their register call, and they would have to be called asynchronously if they had asynchronous listeners.
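One possible shape for this, as a sketch with hypothetical names throughout (SketchSignal, the is_async flag and send_async are assumptions, not Django's Signal API):

```python
import asyncio


class SketchSignal:
    """Hypothetical signal; not Django's real Signal class."""

    def __init__(self):
        self._receivers = []  # list of (receiver, is_async) pairs

    def connect(self, receiver, is_async=False):
        # Async-ness is declared when the listener is registered.
        self._receivers.append((receiver, is_async))

    def send(self, **kwargs):
        # The synchronous entry point refuses to run if any listener is async.
        if any(is_async for _, is_async in self._receivers):
            raise RuntimeError("async receivers connected; use send_async()")
        return [receiver(**kwargs) for receiver, _ in self._receivers]

    async def send_async(self, **kwargs):
        results = []
        for receiver, is_async in self._receivers:
            if is_async:
                results.append(await receiver(**kwargs))
            else:
                # Called inline for brevity; a fuller version would probably
                # push sync receivers into a threadpool.
                results.append(receiver(**kwargs))
        return results


def on_save(instance):
    return "sync saw " + instance


async def on_save_async(instance):
    await asyncio.sleep(0)
    return "async saw " + instance
```

Whether the synchronous send should raise, or instead spin up an event loop for the async listeners, is a design choice this sketch does not try to settle.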

Other Areas

There are other assorted parts of Django that will need some async work, but none of them present significant challenges like the above sections. In general, we would only convert function calls that have the potential to block (or to call user-supplied code) to be asynchronous, and any converted function would have a deprecation plan and name change.

Timeline

This is very rough, but the idea is to make sure we don't disrupt an LTS release. It is also… ambitious, but a good roadmap always is.

It's also worth noting that the work can be done iteratively so there's no risk of the whole thing being some mammoth 3-year project/branch that might not land. The idea would be to keep this work as close to master as possible.

Django 2.1: Current in-progress release. No async work.
Django 2.2: Initial work to add async ORM and view capability, but everything defaults to sync, and async support is mostly threadpool-based.
Django 3.0: Rewrite the internal request handling stack to be entirely asynchronous, add async middleware, forms, caching, sessions, auth. Start deprecation process for any APIs that are becoming async-only.
Django 3.1: Continue improving async support, with potential async templating changes.
Django 3.2: Finish deprecation process and have a mostly-async Django.

The ORM and request handling portions are the important things to get working initially; most of the other areas covered above can be done as and when people become available to work on them.

I would, in fact, be reasonably happy if we only converted the request handling path as a first move; this is the only thing we would need to do to unblock people who wish to do their own asynchronous development on Django.

Safety

Perhaps the main flaw of Python's async implementation is the fact that you can accidentally call synchronous functions from asynchronous contexts - and they'll happily work fine and block the entire event loop until they're finished.

Since Django is meant to provide a safe environment for our users, we'll add detection code to all synchronous user-called endpoints in Django that are blocking/have async alternatives, and alert if they are called on a thread with an active async event loop. This should prevent most accidental usage of synchronous code, which is likely to happen most during initial porting of existing code from a sync to an async context.

This problem also extends to overriding key lookup (__getitem__), attribute lookup (__getattr__), and other operators - you cannot do blocking operations inside these with Python's async solution, as they cannot be called in an asynchronous way. Django fortunately does very little of this, but those uses that exist - such as lazily resolving ForeignKeys, as mentioned above - will have to fundamentally change.
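The detection for synchronous endpoints could be approximated with a decorator like the one below. It is a sketch under assumptions: sync_only and blocking_query are hypothetical names, and it raises an error where a real implementation might prefer to warn.

```python
import asyncio
import functools


def sync_only(func):
    """Refuse to run func on a thread that has a running event loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop is running on this thread, so blocking is fine.
            return func(*args, **kwargs)
        raise RuntimeError(
            func.__name__ + "() would block the event loop; use the async version"
        )

    return wrapper


@sync_only
def blocking_query():
    return "rows"


async def careless_view():
    # Calls the synchronous API from async code by mistake.
    try:
        blocking_query()
    except RuntimeError as exc:
        return str(exc)
    return "not detected"
```

Called from ordinary synchronous code, blocking_query() behaves as before; called inside a coroutine, it fails loudly instead of silently stalling every other request on the loop.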

Funding

We cannot expect such a large endeavour to happen purely on the back of volunteer work. We should expect to have to raise funds for this project and be able to pay the bigger contributors for their time (and not just coding, but also other aspects like technical writing, project management, etc.).

This could come in several forms - including Kickstarter-style campaigns, direct corporate sponsorship of the development, or directing more money to DSF donations generally with this outlined as a concrete funding result.

What does this mean for Channels?

I would propose that Channels continues to exist as a place for WebSocket handling code, multiplexing, and similar challenges, but stops having to include things like the ASGI handler and authentication/session support.

The core routing and protocol switching parts of Channels would likely move into Django itself as the layer that exists underneath Views; the generic consumers and channel layer support would not.

Daphne would continue to be maintained until enough alternative, mature ASGI servers existed, at which point we would seek to either find new permanent maintainers or sunset it.

Further Discussion

This isn't just some plan I expect us to go with and stick to perfectly - it's meant to be a concrete starting point for discussing what this would mean for Django, if we should invest in it, and what it means for us to do that.

To that end, I have started a discussion on the django-developers mailing list about this post; please chime in there if you can with feedback and comments. If you don't want to post publicly on the list, you are also more than welcome to email comments to me at andrew@aeracode.org and I will aggregate and relay them to everyone anonymously.

I think this presents an exciting way to take Django into the future and lead the way in terms of building a world-class web framework with asynchronous support - but I still want to make sure everyone is on board first!