Sorting querysets with NULLs in Django

One thing which I’ve found surprisingly hard to do in Django over the years is sort a list of items on a database column when that column might have NULLs in it. The problem is that the default Postgres behaviour is to give NULL a higher sort value than everything else, so when sorting in descending order, all the NULLs appear at the top. This is particularly strange if, say, you want a list of items sorted by most recently updated, and the ones at the top are the ones that have never had an update.

If we were writing the SQL directly, we could just add NULLS LAST to the ORDER BY clause, but that would be a really rubbish reason to drop down to raw SQL mode in Django.

Fortunately, Django 1.8 has introduced a new feature: Func() expressions. These expressions let you run SQL-level functions like LOWER(), SUM() etc. and annotate your queryset with a new column containing the result. I didn’t want to run a database function, but what I discovered was that it is really easy to subclass and make your own Func() expression, giving you access to a template for generating SQL! The base class looks something like:

class Func(Expression):
    function = None
    template = '%(function)s(%(expressions)s)'

    # Other stuff

Normally you are supposed to override the function attribute, which then gets fed into the template and wrapped around the existing SQL statement. However, it is equally possible to override the template attribute itself and get rid of the wrapping function altogether! This led me to create my own “function” which just returns a boolean to say whether the current SQL statement (completely generated by the ORM and untouched by human hands) evaluates to NULL:

class IsNull(Func):
    template = '%(expressions)s IS NULL'

Welcome to Hacksville!

From here it’s simply a case of annotating your existing queryset with this field, and then adding it to the .order_by() statement:

MyModel.objects\
    .annotate(last_update=Max('update__created_date'))\
    .annotate(last_update_isnull=IsNull('last_update'))\
    .order_by('last_update_isnull', '-last_update')

First we sort on last_update_isnull in ascending order (it will be either true or false, so all the “yes, it is NULL” items will go to the bottom of the list). Then we use the last_update field, which is what we really want to sort on, as the secondary sort field, safe in the knowledge that all the NULLs are already out of the way.

So there you have it: my moderately hacky solution that is quite small and crucially still plays nicely with the ORM 🙂