Category Archives: Python

FP: a quiet revolution

Functional Programming (FP) is taking over the programming world, which is kind of weird since it has taken over the programming world at least once before. If you aren’t a developer then you may never even have heard of it. This post aims to explain what it is and why you might care about it even if you never program a computer – and how you might go about adopting it in your organisation.

Not too long ago, every graduate computer scientist would have spent some time doing FP, perhaps in a language called LISP. FP was considered a crucial grounding in CompSci and some FP texts gained a cult following. The legendary “wizard book” Structure and Intpretation of Computer Programs was the MIT Comp-101 textbook.

Famously a third of students dropped out in their first semester because they found this book too difficult.

I think this was likely to be how MIT taught the course as much as anything, but nevertheless functional programming (and the confusingly-brackety LISP) started getting a reputation for being too difficult for mere mortals.

Along with the reputation for impossibility, universities started getting a lot of pressure to turn out graduates with “useful skills”. This has always seemed a bit of a waste of university’s time to me – they are very specifically not supposed to be useful in that sense. I’d much rather graduates got the most out of their limited time at university learning the things that only universities can provide, rather than programming which, bluntly, we can do a lot more effectively than academics.

Anyway, I digress.

The rise of Object Orientation

So it came to pass that universities decided to stop teaching academic languages and start teaching Java. Ten years ago I’d guess well over half of all university programming courses taught Java. Java is not a functional language and until recently had no functional features. It was unremittingly, unapologetically Object Oriented (OO).  Contrary to Sun’s bombastic marketing when they released Java (and claimed it was a revolution in programming) Java as a language was about as mainstream and boring as it could be. The virtual machine (the JVM) was much more interesting, and I’ll come back to that later.

(OO is not in itself opposed to FP, and vice versa. Many languages – as we’ll see – are able to support both paradigms. However OO, particularly the way it was taught with Java, encourages a way of thinking about data flowing through a system, and this leads to data being copied and duplicated… which leads to all sorts of problems managing state. FP meanwhile tends to think in terms of transformation of data, and relies on the programming language to deal with the menial tasks of deciding when to copy data whilst doing so. When computers were slow this could cause significant bottlenecks, but computers these days are huge and fast and you can get more of them easily, so it doesn’t matter nearly as much – until it suddenly does of course. Anyway, I digress again.)

In the workplace meanwhile FP had never really taken off. The vast majority of software is written using imperative languages like ‘C’ or Object Oriented languages like.. well pretty much any language you’ve heard of. Perl, Python, Java, C#, C++ – Object Orientation had taken over the world. FP’s steep learning curve, reputation for impossibility, academic flavour and at times performance constraints made it seem something only a lunatic would select.

And so did some proclaim, Fukuyama-like, the “end of history”: Object Orientation was the one true way to build software. That is certainly how it seemed until a few years ago.

Then something interesting started happening, a change that has had far-reaching effects on many programming languages: existing OO languages started gaining FP features. Python was an early adopter here but a lot of OO languages started gaining a smattering of FP features.

This has provided an easy way for existing programmers to be exposed to how FP thinks about problem solving – and the way one approaches a large problem in FP can be dramatically different to traditional OO approaches.

Object Oriented software has been so dominant that its benefits and drawbacks are rarely discussed – in fact the idea that it might have drawbacks would have been thought madness by many until recently.

OO does have real benefits. It provides a process-driven approach for analysis, where your problem domain is analysed first for the data that exists in the business or whatever, and then behaviours are hooked onto these data. A large system is decomposed by responsibilities towards data.

There are some other things where OO helps too, although they don’t maybe sound so great. Mediocre can be good enough – and when you’ve got hundreds of programmers on a mammoth government project you need to be able to accommodate the mediocre. The reliance on process and good enough code means your developers become more replaceable. Need one thousand identical carbon units? Lets go!

Of course you don’t get that for free. The resulting code often has problems, and sometimes severe ones. Non-localised errors are a major problem, with causes and effects being removed by billions of lines of code and sometimes weeks of execution. State becomes a constant problem, with huge amounts of state being passed around inside transactions. Concurrency issues are common as well, with unnecessary locking or race conditions being rife.

The outcome is also often very difficult to debug, with a single thread of execution sometimes involving hundreds of cooperating objects, each of which only contributes only one or two lines of code.

The impact of this is difficult to quantify, but I don’t think it is unfair to put some of the epic failures large scale IT to the choices of these tools and languages.


Strangely one of the places where FP is now being widely practised is in front-end applications, specifically Single-Page Applications (SPAs) written in frameworks like React.

The most recent Javascript standards (officially called, confusingly, ECMAScript) have added oodles of functional syntax and behaviour, to the extent that it is possible to write it almost entirely functionally. Furthermore, these new javascript standards can be transpiled into previous versions of Javascript, meaning they will run pretty much anywhere.

Since pretty much every device in the world has a Javascript virtual machine installed, this means we now have the worlds largest ever installed based of functional computers – and more and more developers are using it.

The FP frameworks that are emerging in Javascript to support functional development are bringing some of the more recent research and design from universities directly into practice in a way that hasn’t really happened previously.


The other major movement has been the development of functional languages that run on the Java Virtual Machine (the JVM). Because these languages can call Java functions it means they come with a ready-built standard library that is well known and well documented. There’s a bunch of these with Clojure and Scala being particularly prominent.

These have allowed enterprise teams with a large existing commitment to Java to start developing in FP without throwing away their existing code. I suspect it has also allowed them to retain some senior staff who would otherwise have left through boredom.

Ironically Java itself has added loads of functional features over the last few years, in particular lambda functions and closures.

How to adopt FP

We’ve adopted FP for some projects with some real success and there is a lot of enthusiasm for it here (and admittedly the odd bit of resistance too). We’ve learned a few things about how to go about adopting it.

First, you need to do more design work. Particularly with developers who are new to the approach, spending more time in design is of great benefit – but I would argue this is generally the case in our industry. An abiding problem is the resistance to design and the need to just write some code. Even in the most agile processes design is critical and should not be sidelined. Accommodating this design work in your process is crucial. This doesn’t mean big fat documents, but it does mean providing the space to think and for teams to discuss design before implementation, perhaps with spikes for prototypes.

Second, get up to speed with supporting libraries that work in a functional manner, and avoid those that are brutally OO. Just using ramda encourages developers to work in a more functional manner and develop composable interfaces.

Third, there is still a problem with impenetrable jargon, and it can be a turn off. Avoid talking about monads, even if you think you need one 😉

Finally, you really do not need to be smarter to work with FP. There is a learning curve and it is really quite steep in places, but once you’ve climbed it the kinds of solutions you develop feel just as natural as the OO ones did previously.






There is a new version of gunicorn, 19.0 which has a couple of significant changes, including some interesting workers (gthread and gaiohttp) and actually responding to signals properly, which will make it work with Heroku.

The HTTP RFC, 2616, is now officially obsolete. It has been replaced by a bunch of RFCs from 7230 to 7235, covering different parts of the specification. The new RFCs look loads better, and it’s worth having a look through them to get familiar with them.

Some kind person has produced a recommended set of SSL directives for common webservers, which provide an A+ on the SSL Labs test, while still supporting older IEs. We’ve struggled to find a decent config for SSL that provides broad browser support, whilst also having the best levels of encryption, so this is very useful.

A few people are still struggling with Git.  There are lots of git tutorials around the Internet, but this one from Git Tower looks like it might be the best for the complete beginner. You know it’s for noobs, of course, because they make a client for the Mac 🙂

I haven’t seen a lot of noise about this, but the EU has outlawed pre-ticked checkboxes.  We have always recommended that these are not used, since they are evil UX, but now there’s an argument that might persuade everyone.

Here is a really nice post about splitting user stories. I think we are pretty good at this anyhow, but this is a nice way of describing the approach.

@monkchips gave a talk at IBM Impact about the effect of Mobile First. I think we’re on the right page with most of these things, but it’s interesting to see mobile called-out as one of the key drivers for these changes.

I’d not come across the REST Cookbook before, but here is a decent summary of how to treat PUT vs POST when designing RESTful APIs.

Fastly have produced a spectacularly detailed article about how to get tracking cookies working with Varnish.  This is very relevant to consumer facing projects.

This post from Thought Works is absolutely spot on, and I think accurately describes an important aspect of testing The Software Testing Cupcake.

As an example for how to make unit tests less fragile, this is a decent description of how to isolate tests, which is a key technique.

The examples are Ruby, but the principle is valid everywhere. Still on unit testing, Facebook have open sourced a Javascript unit testing framework called Jest. It looks really very good.

A nice implementation of “sudo mode” for Django. This ensures the user has recently entered their password, and is suitable for protecting particularly valuable assets in a web application like profile views or stored card payments.

If you are using Redis directly from Python, rather than through Django’s cache wrappers, then HOT Redis looks useful. This provides atomic operations for compound Python types stored within Redis.

Using mock.patch in automated unit testing

Mocking is a critical technique for automated testing. It allows you to isolate the code you are testing, which means you test what you think are testing. It also makes tests less fragile because it removes unexpected dependencies.

However, creating your own mocks by hand is fiddly, and some things are quite difficult to mock unless you are a metaprogramming wizard. Thankfully Michael Foord has written a mock module, which automates a lot of this work for you, and it’s awesome. It’s included in Python 3, and is easily installable in Python 2.

Since I’ve just written a test case using mock.patch, I thought I could walk through the process of how I approached writing the test case and it might be useful for anyone who hasn’t come across this.

It is important to decide when you approach writing an automated test what level of the system you intend to test. If you think it would be more useful to test an orchestration of several components then that is an integration test of some form and not a unit test. I’d suggest you should still write unit tests where it makes sense for this too, but then add in a sensible sprinkling of integration tests that ensure your moving parts are correctly connected.

Mocks can be useful for integration tests too, however the bigger the subsystem you are mocking the more likely it is that you want to build your own “fake” for the entire subsystem.

You should design fake implementations like this as part of your architecture, and consider them when factoring and refactoring. Often the faking requirements can drive out some real underlying architectural requirements that are not clear otherwise.

Whereas unit tests should test very limited functionality, I think integration tests should be much more like smoke tests and exercise a lot of functionality at once. You aren’t interested in isolating specific behaviour, you want to make it break. If an integration test fails, and no unit tests fail, you have a potential hotspot for adding additional unit tests.

Anway, my example here is a Unit Test. What that means is we only want to test the code inside the single function being tested. We don’t want to actually call any other functions outside the unit under test. Hence mocking: we want to replace all function calls and external objects inside the unit under test with mocks, and then ensure they were called with the expected arguments.

Here is the code I need to test, specifically the ‘fetch’ method of this class:

class CloudImage(object):

    __metaclass__ = abc.ABCMeta

    blocksize = 81920
    def __init__(self, pathname, release, arch):
        self.pathname = pathname
        self.release = release
        self.arch = arch
        self.remote_hash = None
        self.local_hash = None

    def remote_image_url(self):
        """ Return a complete url of the remote virtual machine image """

    def fetch(self):
        remote_url = self.remote_image_url()"Retrieving {0} to {1}".format(remote_url, self.pathname))
            response = urllib2.urlopen(remote_url)
        except urllib2.HTTPError:
            raise error.FetchFailedException("Unable to fetch {0}".format(remote_url))
        local = open(self.pathname, "w")
        while True:
            data =
            if not data:

I want to write a test case for the ‘fetch’ method. I have elided everything in the class that is not relevant to this example.

Looking at this function, I want to test that:

  1. The correct URL is opened
  2. If an HTTPError is raised, the correct exception is raised
  3. Open is called with the correct pathname, and is opened for writing
  4. Read is called successive times, and that everything returned is passed to local.write, until a False value is returned

I need to mock the following:

  1. self.remote_image_url()
  2. urllib2.urlopen()
  3. open()
  5. local.write()

This is an abstract base class, so we’re going to need a concrete implementation to test. In my test module therefore I have a concrete implementation to use. I’ve implemented the other abstract methods, but they’re not shown.

class MockCloudImage(base.CloudImage):
    def remote_image_url(self):
        return "remote_image_url"

Because there are other methods on this class I will also be testing, I create an instance of it in setUp as a property of my test case:

class TestCloudImage(unittest2.TestCase):

    def setUp(self):
        self.cloud_image = MockCloudImage("pathname", "release", "arch")

Now I can write my test methods.

I’ve mocked self.remote_image_url, now i need to mock urllib2.urlopen() and open(). The other things to mock are things returned from these mocks, so they’ll be automatically mocked.

Here’s the first test:

    def test_fetch(self, m_open, m_urlopen):
        m_urlopen().read.side_effect = ["foo", "bar", ""]
        self.assertEqual(m_open.call_args,'pathname', 'w'))
        self.assertEqual(m_open().write.call_args_list, ['foo'),'bar')])

The mock.patch decorators replace the specified functions with mock objects within the context of this function, and then unmock them afterwards. The mock objects are passed into your test, in the order in which the decorators are applied (bottom to top).

Now we need to make sure our read calls return something useful. Retrieving any property or method from a mock returns a new mock, and the new returned mock is consistently returned for that method. That means we can write:


To get the read call that will be made inside the function. We can then set its “side_effect” – what it does when called. In this case, we pass it an iterator and it will return each of those values on each call.

Now we call call our fetch method, which will terminate because read eventually returns an empty string.

Now we just need to check each of our methods was called with the appropriate arguments, and hopefully that’s pretty clear how that works from the code above. It’s important to understand the difference between:




The first is the arguments passed to “open(…)”. The second are the arguments passed to:

local = open(); local.write(...)

Another test method, testing the exception is now very similar:

    def test_fetch_httperror(self, m_open, m_urlopen):
        m_urlopen.side_effect = urllib2.HTTPError(*[None] * 5)
        self.assertRaises(error.FetchFailedException, self.cloud_image.fetch)

You can see I’ve created an instance of the HTTPError exception class (with dummy arguments), and this is the side_effect of calling urlopen().

Now we can assert our method raises the correct exception.

Hopefully you can see how the mock.patch decorator saved me a spectacular amount of grief.

If you need to, it can be used as a context manager as well, with “with”, giving similar behaviour. This is useful in setUp functions particularly, where you can use the with context manager to create a mocked closure used by the system under test only, and not applied globally.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Reviewing Django REST Framework

Recently, we used Django REST Framework to build the backend for an API-first web application. Here I’ll attempt to explain why we chose REST Framework and how successfully it helped us build our software.

Why Use Django REST Framework?

RFC-compliant HTTP Response Codes

Clients (javascript and rich desktop/mobile/tablet applications) will more than likely expect your REST service endpoint to return status codes as specified in the HTTP/1.1 spec. Returning a 200 response containing {‘status’: ‘error’} goes against the principles of HTTP and you’ll find that HTTP-compliant javascript libraries will get their knickers in a twist. In our backend code, we ideally want to raise native exceptions and return native objects; status codes and content should be inferred and serialised as required.

If authentication fails, REST Framework serves a 401 response. Raise a PermissionDenied and you automatically get a 403 response. Raise a ValidationError when examining the submitted data and you get a 400 response. POST successfully and get a 201, PATCH and get a 200. And so on.


You could PATCH an existing user profile with just the field that was changed in your UI, DELETE a comment, PUT a new shopping basket, and so on. HTTP methods exist so that you don’t have to encode the nature of your request within the body of your request. REST Framework has support for these methods natively in its base ViewSet class which is used to build each of your endpoints; verbs are mapped to methods on your view class which, by default, are implemented to do everything you’d expect (create, update, delete).


The base ViewSet class looks for the Accepts header and encodes the response accordingly. You need only specify which formats you wish to support in your

Serializers are not Forms

Django Forms do not provide a sufficient abstraction to handle object PATCHing (only PUT) and cannot encode more complex, nested data structures. The latter limitation lies with HTTP, not with Django Forms; HTTP forms cannot natively encode nested data structures (both application/x-www-form-urlencoded and multipart/form-data rely on flat key-value formats). Therefore, if you want to declaratively define a schema for the data submitted by your users, you’ll find life a lot easier if you discard Django Forms and use REST Framework’s Serializer class instead.

If the consumers of your API wish to use PATCH rather than PUT, and chances are they will, you’ll need to account for that in your validation. The REST Framework ModelSerializer class adds fields that map automatically to Model Field types, in much the same way that Django’s ModelForm does. Serializers also allow nesting of other Serializers for representing fields from related resources, providing an alternative to referencing them with a unique identifier or hyperlink.


Should you choose to go beyond an AJAX-enabled site and implement a fully-documented, public API then best practice and an RFC or two suggest that you make your API discoverable by allowing OPTIONS requests. REST Framework allows an OPTIONS request to be made on every endpoint, for which it examines request.user and returns the HTTP methods available to that user, and the schema required for making requests with each one.


Support for OAuth 1 and 2 is available out of the box and OAuth permissions, should you choose to use them, can be configured as a permissions backend.


REST framework provides a browsable HTTP interface that presents your API as a series of forms that you can submit to. We found it incredibly useful for development but found it a bit too rough around the edges to offer as an aid for third parties wishing to explore the API. We therefore used the following snippet in our file to make the browsable API available only when DEBUG is set to True:



REST Framework gives you an APITestCase class which comes with a modified test client. You give this client a dictionary and encoding and it will serialise the request and deserialise the response. You only ever deal in python dictionaries and your tests will never need to contain a single instance of json.loads.


The documentation is of a high quality. By copying the Django project’s three-pronged approach to documentation – tutorial, topics, and API structure, Django buffs will find it familiar and easy to parse. The tutorial quickly gives readers the feeling of accomplishment, the high-level topic-driven core of the documentation allows readers to quickly get a solid understanding of how the framework should be used, and method-by-method API documentation is very detailed, frequently offering examples of how to override existing functionality.

Project Status

At the time of writing the project remains under active development. The roadmap is fairly clear and the chap in charge has a solid grasp of the state of affairs. Test coverage is good. There’s promising evidence in the issue history that creators of useful but non-essential components are encouraged to publish their work as new, separate projects, which are then linked to from the REST Framework documentation.



We found that writing permissions was messy and we had to work hard to avoid breaking DRY. An example is required. Let’s define a ViewSet representing both a resource collection and any document from that collection:

class JobViewSet(ViewSet):
    Handles both URLS:
    serializer_class = JobSerializer
    permission_classes = (IsAuthenticated, JobPermission)

    def get_queryset(self):
        if self.request.user.is_superuser:
            return Job.objects.all()

        return Job.objects.filter(
            Q(applications__user=request.user) |

If the Job collection is requested, the queryset from get_queryset() will be run through the serializer_class and returned as an HTTPResponse with the requested encoding.

If a Job item is requested and it is in the queryset from get_queryset(), it is run through the serializer_class and served. If a Job item is requested and is not in the queryset, the view returns a 404 status code. But we want a 403.

So if we define that JobPermission class, we can fail the object permission test, resulting in a 403 status code:

class JobPermission(Permission):
    def get_object_permission(self, request, view, obj):
    if obj in Job.objects.filter(
        Q(applications__user=request.user) |
        return True
    return False

Not only have we duplicated the logic from the view method get_queryset (we could admittedly reuse view.get_queryset() but the method and underlying query would still be executed twice), if we don’t then the client is sent a completely misleading response code.

The neatest way to solve this issue seems to be to use the DjangoObjectPermissionsFilter together with the django-guardian package. Not only will this allow you to define object permissions independently of your views, it’ll also allow you filter querysets using the same logic. Disclaimer: I’ve not tried this solution, so it might be a terrible thing to do.

Nested Resources

REST Framework is not built to support nested resources of the form /baskets/15/items. It requires that you keep your API flat, of the form /baskets/15 and /items/?basket=15.

We did eventually choose to implement some parts of our API using nested URLs however it was hard work and we had to alter public method signatures and the data types of public attributes within our subclasses. We required entirely highly modified Router, Serializer, and ViewSet classes. It is worth noting that REST Framework deserves praise for making each of these components so pluggable.

Very specifically, the biggest issue preventing us pushing our nested resources components upstream was REST Framework’s decision to make lookup_field on the HyperlinkedIdentityField and HyperlinkedRelatedField a single string value (e.g. “baskets”). To support any number of parent collections, we had to create a NestedHyperlinkedIdentityField with a new lookup_fields list attribute, e.g. ["baskets", "items"].


REST Framework is great. It has flaws but continues to mature as an increasingly popular open source project. I’d whole-heartedly recommend that you use it for creating full, public APIs, and also for creating a handful of endpoints for the bits of your site that need to be AJAX-enabled. It’s as lightweight as you need it to be and most of what it does, it does extremely well.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Django Class-Based Generic Views: tips for beginners (or things I wish I’d known when I was starting out)

Django is renowned for being a powerful web framework with a relatively shallow learning curve, making it easy to get into as a beginner and hard to put down as an expert. However, when class-based generic views arrived on the scene, they were met with a lukewarm reception from the community: some said they were too difficult, while others bemoaned a lack of decent documentation. But if you can power through the steep learning curve, you will see they are also incredibly powerful and produce clean, reusable code with minimal boilerplate in your

So to help you on your journey with CBVs, here are some handy tips I wish I had known when I first started learning all about them. This isn’t a tutorial, but more a set of side notes to refer to as you are learning; information which isn’t necessarily available or obvious in the official docs.

Starting out

If you are just getting to grips with CBVs, the only view you need to worry about is TemplateView. Don’t try anything else until you can make a ‘hello world’ template and view it on your dev instance. This is covered in the docs. Once you can handle that, keep reading the docs and make sure you understand how to subclass a ListView and DetailView to render model data into a template.

OK, now we’re ready for the tricky stuff!

Customising CBVs

Once you have the basics down, you will find that most of your work revolves around subclassing the built-in class-based generic views and overriding one or two methods. At the start of your journey, it is not very obvious what to override to achieve your goals, so remember:

  • If you need to get some extra variables into a template, use get_context_data()
  • If it is a low-level permissions check on the user, you probably want dispatch()
  • If you need to do a complicated database query on a DetailView, ListView etc, try get_queryset()
  • If you need to pass some extra parameters to a form when constructing it via a FormView, UpdateView etc, try get_form() or get_form_kwargs()

If you haven’t heard of, go there and bookmark it now. It is possibly the most useful reference out there for working with class-based generic views. When you are subclassing views and trying to work out which methods to override, and the official docs just don’t seem to cut it, has your back. If it wasn’t for that site, I think we would all be that little bit grumpier about using CBVs.


CBVs cut a LOT of boilerplate code out of the process of writing forms. You should already be using ModelForms wherever you can to save effort, and there are generic class-based views available (CreateView/UpdateView) that allow you to plug in your ModelForms and reduce your boilerplate code even further. Always use this approach if you can. If your form does not map to a particular model in the database, use FormView.


If you want to put some guards on your view e.g. check if the user is logged in, check they have a certain permission etc, you will usually want to do it on the dispatch() method of the view. This is the very first method that is called in your view, so if a user shouldn’t have access then this is the place to intercept them:

from django.core.exceptions import PermissionDenied
from django.views.generic import TemplateView

class NoJimsView(TemplateView):
    template_name = 'secret.html'

    def dispatch(self, request, *args, **kwargs):
        if request.user.username == 'jim':
            raise PermissionDenied # HTTP 403
        return super(NoJimsView, self).dispatch(request, *args, **kwargs)

Note: If you just want to restrict access to logged-in users, there is a @require_login decorator you can add around the dispatch() method. This is covered in the docs, and it may be sufficient for your purposes, but I usually end up having to modify it to handle AJAX requests nicely as well.

Multiple inheritance

Once you start subclassing and overriding generic views, you will probably find yourself needing multiple inheritance. For example, perhaps you want to extend your “No Jims” policy (see above) to several other views. The best way to achieve this is to write a small Mixin and inherit from it along with the generic view. For example:

class NoJimsMixin(object):
    def dispatch(self, request, *args, **kwargs):
        if request.user.username == 'jim':
            raise PermissionDenied # HTTP 403
        return super(NoJimsMixin, self).dispatch(request, *args, **kwargs)

class NoJimsView(NoJimsMixin, TemplateView):
    template_name = 'secret.html'

class OtherNoJimsView(NoJimsMixin, TemplateView):
    template_name = 'other_secret.html'

Now you have entered the world of python’s multiple inheritance and Method Resolution Order. Long story short: order is important. If you inherit from two classes that both define a foo() method, your new class will use the one from the parent class that was first in the list. So in the above example, in your NoJimsView class, if you listed TemplateView before NoJimsMixin, django would use TemplateView’s dispatch() method instead of NoJimsMixin’s. But in the above example, not only will your NoJimsMixin’s dispatch() get called first, but when you call super(NoJimsMixin, self).dispatch(), it will call TemplateView’s dispatch() method. How I wish I had known this when I was learning about CBVs!


As you browse around the docs, code and, you will see references to Views, BaseViews and Mixins. They are largely a naming convention in the django code: a BaseView is like a View except it doesn’t have a render_to_response() method so it won’t render a template. Almost all Views inherit from a corresponding BaseView and add a render_to_response() method e.g. DetailView/BaseDetailView, UpdateView/BaseUpdateView etc. This is useful if you are subclassing from two Views, because it means you can choose which one renders the final output. It is also useful if you want to render to JSON, say in an AJAX response, and don’t need HTML rendering at all (in this case you’d need to provide your own render_to_response() method that returns a HttpResponse).

Mixin classes provide a few helper methods, but can’t be used on their own, as they are not full Views.

So in short, if you are just subclassing one thing, you will usually subclass a View. If you want to manually render a non-HTML response, you probably need a BaseView. If you are inheriting from multiple classes, you will need a combination of some or all of View, BaseView and Mixin.

A final note on AJAX

Django is not particularly good at serving AJAX requests out of the box, and once you start trying to use CBVs to do AJAX form submissions, things get quite complicated.

The docs offer some help with this in the form of a Mixin you can copy and paste into your code, which gives you JSON responses instead of HTML. You will also need to pass CSRF tokens in your POST requests, and again there is an example of how to do this in the docs.

This should be enough to get you started, but I often find myself having to write some extra Mixins, and that is before even considering the javascript code on the front end to send requests and parse responses, complete with handling of validation and transport errors. Here at Isotoma, we are working on some tools to address this, which we hope to open-source in the near future. So watch this space!


In case you hadn’t worked it out, we at Isotoma are fans of Django’s class-based generic views. They are definitely not straightforward for newcomers, but hopefully with the help of this article and other resources (did I mention, it’ll be plain sailing before you know it. And once you get what they’re all about, you won’t look back.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Buildout Basics Part 1

Introduction to the series

This is the first in a 3 part series of tutorials about creating, configuring and maintaining buildout configuration files, and how buildout can be used to deploy and configure both python-based and other software.
During the course of this series, I will cover buildout configuration files, some buildout recipes and a simple overview structure of a buildout recipe. I will not cover creating a recipe, or developing buildout itself.

For a very good guide on the python packaging techniques that we will be relying on, see this guide:

All code samples will be python 2.4+ compatible, system command lines will be debian/ubuntu specific, but simple enough to generalise out to most systems (OSX and Windows included).
Where a sample project or code is required, I’ve used Django as it’s what I’m most familiar with, but this series is all about the techniques and configuration in buildout itself, it is not Django specific, so don’t be scared off if you happen to be using something else.

Buildout Basics

So, what’s this buildout thing anyway?

If you’re a python developer, or even just a python user, you will probably have come across either easy_install or pip (or both). These are pretty much two methods of achieving the same thing; namely, installing python software

$> sudo easy_install Django

This is a fairly simple command, it will install Django onto the system path, so from anywhere you can do

>>> import django

While this is handy, it’s not ideal for production deployment. System installing a package will lead to problems with maintenance, and probably also lead to version conflict problems, particuarly if you have multiple sites or environments deployed on the same server. One environment may need Django 1.1, the newest may need 1.3. There are significant differences in the framework from one major version to another, and a clean upgrade may not be possible. So, system-installing things is generally considered to be a bad idea, unless it’s guaranteed to be a dedicated machine.

So what do you do about it?
One answer is to use a virtualenv. This is a python package that will create a ‘clean’ python environment in a particular directory:

$> virtualenv deployment --no-site-packages

This will create a directory called ‘deployment’, in which is a clean python intepreter with only a local path. This environment will ignore any system-installed packages (--no-site-packages), and give you a fresh, self contained python environment.

Once you have activated this environment, you can then easy_install or pip install the packages you require, and then use them as you would normally, safe in the knowledge that anything you install in this environment is not going to affect the mission-critical sales (or pie-ordering website) process that’s running in the directory next door.

So, if virtualenv can solve this problem for us, why do we need something else, something more complex to solve essentially the same problem?

The answer, is that buildout doesn’t just solve this problem, it solves a whole lot more problems, particuarly when it comes to ‘how do I release this code to production, yet make sure I can still work on it, without breaking the release?’

Building something with buildout

The intent is to show you the parts for a buildout config, then show how it all fits together. If you want to see the final product, then dissemble it to find the overall picture, scroll to the end of this, have a look, then come back. Go on, it’s digital, this will still be here when you come back….

Config Files

Pretty much everything that happens with buildout is controlled by its config file (this isn’t quite true, but hey, ‘basics’). A config file is a simple ini (ConfigParser) style text file; that defines some sections, some options for those sections, and some choices in the options.

In this case, the sections of a buildout config file (henceforth referred to as buildout.cfg) are generally referred to as parts. The most important of these parts is the buildout part itself, which controls the options for the buildout process.

An absolute minimum buildout part looks something like this:

parts = noseinstall

While this is not a complete buildout.cfg, it is the minimum that is required in the buildout part itself. All is is doing is listing the other parts that buildout will use to actually do something, in this case, it is looking for a single part named noseinstall. As this part doesn’t exist yet, it won’t actually work. So, lets add the part, and in the next section, see what it does:

parts = noseinstall

recipe = zc.recipe.egg
eggs = Nose

An aside about

We now have a config file that we’re reasonably sure will do something, if we’re really lucky, it’ll do something that we actually want it to do. But how do we run it? We will need buildout itself, but we don’t have that yet. At this point, there are two ways to proceed.

  1. sudo apt-get install python-zc.buildout
  2. wget && python

For various reasons, unless you need a very specific version of buildout, it is best to use the file. This is a simple file that contains enough of buildout to install buildout itself inside your environment. As it’s cleanly installed, it can be upgraded, version pinned and generally used in the same manner as in a virtualenv style build. If you system-install buildout, you will not be able to easily upgrade the buildout instance, and may run into version conflicts if a project specifies a version newer than the one you have. Both approaches have their advantages, I prefer the second as it is slightly more self contained. Mixing the approaches (using with a system-install is possible, but can expose some bugs in the buildout install procedure).

The rest of this document is going to assume that you have used to install buildout.

Running some buildout

Now we have a method of running buildout, it’s time to do it in the directory where we left the buildout.cfg file created earlier:

$> bin/buildout

At this point, buildout will output something along the lines of:

Getting distribution for 'zc.recipe.egg'.
Got zc.recipe.egg 1.3.2.
Installing noseinstall.
Getting distribution for 'Nose'.
no previously-included directories found matching 'doc/.build'
Got nose 1.0.0.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests-2.6'.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests'.

Your output may not be exactly similar, but should contain broadly those lines.

The simple sample here is using the zc.recipe.egg recipe. This is probably the most common of all buildout recipes as it is the one that will do the heavy work of downloading an egg, analysing its for dependencies (and installing them if required), and then finally installing the egg into the buildout path for use. Recipes are just python eggs that contain code that buildout will run. The easiest way to think of this is that while a recipe is an egg, recipe contains instructions for the buildout process itself, and therefore will not be available to code at the end.

An analysis of the buildout output shows exactly what it has done. It has downloaded an egg for zc.recipe.egg and run the noseinstall part. Let’s take a closer look at that noseinstall part from before:

recipe = zc.recipe.egg
eggs = Nose

So, we can see why buildout has installed zc.recipe.egg, it is specified in the recipe option of this part, so buildout will download it, install it and then run it. We will take a closer look at the construction of a recipe in a later article, but for now, assume that buildout has executed a bunch of python code in the recipe, and we’ll carry on.
The python code in this case will look at the part that it is in, and look for an option called eggs. As we have specified this option, it will then look at this as a list, and install all the eggs that we have listed; in this case, just the one, the unittest test runner Nose.
As you can see from the bottom of the buildout output, the recipe has downloaded Nose, extracted it and created two files; bin/nosetests and bin/nosetests-2.6. Running one of those files like so:

$> bin/nosetests

Ran 0 tests in 0.002s


We can see that this is nose, as we expect it to be. Two files have been generated because that is that the for Nose defines, a base nosetest executable, and one for the specifc python version that we have used (python 2.6 in my case). These are specified in the that makes up the nose egg, which will be covered in a later article.


We can install buildout into a development environment, and use a simple config file to install a python egg. The next article will cover a development example for using with django, and some niceties such as version pinning and running python in our buildouted environment.

The Third Manifesto Implementers’ Workshop

Earlier this month I went to the Third Manifesto Implementers’ Workshop at Northumbria University in Newcastle. A group of us discussed recent developments in implementing the relational data model.

The relational data model was proposed by E. F. Codd in 1969 in response to the complex, hierarchical, data storage solutions of the time which required programs to be written and compiled for each database query. It was a powerful abstraction, but unfortunately SQL and its implementations missed out on important features, and broke it in fundamental ways. In response to this problem, and the industry’s approach towards object-databases, Chris Date and Hugh Darwen wrote “The Third Manifesto” (TTM) to put forward their ideas on how future database systems should work. I urge you to read their books (even if you’re not interested in the subject) – the language is amazing: precise, concise, comprehensive and easy to read – other technical authors don’t come close.

The relational model treats data as sets of logical propositions and allows them to be queried, manipulated and constrained declaratively. It abstracts away from physical storage and access issues which is why it will still be used a hundred years from now (and why NoSQL discussions like these are retrograde). If you’re writing loops to query your data, or having to navigate prescribed connection paths, then your abstractions are feeble and limited.

At the workshop, I talked about my project, Dee, which implements the relational ideas from TTM in Python. You can see my slides here (or here if you have an older browser).

Erwin Smout gave a couple of talks about implementing transition constraints and dispensing with data definition language.

David Livingstone walked us through the RAQUEL architecture – a layered approach along the lines of the OSI network model.

Hugh Darwen discussed the features and implementation of IBM’s Business System 12, one of the first ever relational database systems which had some surprisingly dynamic features, including key inferencing, so that view definitions could keep tabs on their underlying constraints.

Chris Date took us through his latest thoughts on how to update relational views in a generic way. The aim is for database users to be able to treat views and base tables in the same way, for both reading and writing. Lots to think about here, and my to-do list for Dee has grown another section.

Adrian Hudnott discussed a couple of research projects around optimising multiple relational assignments and tracking the source of updates so that transition constraints could be more effective.

Renaud de Landtsheer gave an insight into the work he’s been doing implementing first-order-logic constraints within Oracle databases.

I sat next to Toon Koppelaars (whose name went down well with the Geordies) and then I realised I had his book (Applied Mathematics for Database Professionals) waiting to be read on my desk at work, thanks to the eclectic Isotoma library (and Wes).

It was a packed couple of days with plenty of food for thought. Thank you to David Livingstone and Safwat Mansi for organising and hosting such an enjoyable and interesting event.

Tamper-protection for Bank Transactions

If you need to send electronic transactions to Swedish banks, you’ll be required to add anti-tampering seals to the files. The banks recommend you use a third-party system to create the HMAC SHA256-128 seals, but that could involve a fair amount of expensive server software and maintenance contracts (some linked to the number of people who work in your company).

Instead, you can do it yourself in Python like this:

import hmac
import hashlib
import string

NORMALISE = string.maketrans(
    'xC9xC4xD6xC5xDCxE9xE4xF6xE5xFC' + ''.join(
    [chr(x) for x in range(0,32)]) + ''.join(
    [chr(x) for x in range(127,256)
     if x not in (201,196,214,197,220,233,
    'x40x5Bx5Cx5Dx5Ex60x7Bx7Cx7Dx7E' + ''.join(
    [chr(195) for x in range(0,32)]) + ''.join(
    [chr(195) for x in range(127,256)
     if x not in (201,196,214,197,220,233,

def hex_to_bytes(hexs):
    """Convert string of hex into bytes"""
    return ''.join(['%s' % chr(int(hexs[i:i+2], 16))
                    for i in range(0, len(hexs), 2)])

def get_signature(contents, key):
    """Calculate the HMAC SHA256-128 signature

       contents - an iso-8859-1 (latin-1) encoded string
       key - a string of hex characters

       Returns a 32 char string of hex characters (128 bits)
    key = hex_to_bytes(key)

    #Normalise the contents
    contents = contents.translate(NORMALISE, 'rn')

    dig =, msg=contents,
    return ''.join(['%02X' % ord(x) for x in dig[:16]])

And then to calculate the signature for a file:

>>> print get_signature(open('bankfile.dat').read(),
...                     '1234567890abcdef1234567890abcdef')

Scaffolding template tags for Django forms

We love Django here at Isotoma, and we love using Django’s awesome form classes to generate self-generating, self-validating, [X]HTML forms.

However, in practically every new Django project I find myself doing the same thing over and over again (and I know others do too): breaking the display of a Django form instance up into individual fields, with appropriate mark-up wrappers.

Effectively I keep recreating the output of BaseForm.as_p/as_ul/as_table with template tags and mark-up.

For example, outputting a login form, rather than doing:

{{ form.as_p }}

We would do:

{% if form.username.errors %}
  {% for error in form.username.errors %}
    {{ error }}
  {% endfor %}
{% endif %}
{{ form.username.label }} {{ form.username }}
{% if form.password.errors %}
  {% for error in form.password.errors %}
    {{ error }}
  {% endfor %}
{% endif %}
{{ form.password.label }} {{ form.password }}

Why would you want to do this? There are several reasons, but generally it’s to apply custom mark-up to a particular element (notice I said mark-up, not styling, that can be done with the generated field IDs), as well as completely customising the output of the form (using <div>‘s instead etc.), and also because some designers tend to prefer this way of looking at a template.

“But”, you might say, “Django already creates all this for us with the handy as_p/as_ul/as_table methods, can you just take the ouput from that?”
Well, yes, in fact on a project a couple of weeks ago that’s exactly what I did, outputting as_p in a template, and then editing the source chucked out in a browser.
Which gave me the idea to create a simple little tool to do this for me, but with the Django template tags for dynamically outputting the field labels and fields themselves.

I created django-form-scaffold to do just this, and now I can do this from a Python shell:

>>> from dfs import scaffold
>>> from MyProject.MyApp.forms import MyForm
>>> form = MyForm()
>>> # We can pass either an instance of our form class
>>> # or the class itself, but better to pass an instance.
>>> print scaffold.as_p(form)

{% if %}{% for error in %}
{{ error }}{% endfor %}{% endif %}
<p>{{ }} {{ }}</p>
{% if form.password1.errors %}{% for error in form.password1.errors %}
{{ error }}{% endfor %}{% endif %}
<p>l{{ form.password1.label }} {{ form.password1 }}</p>
{% if form.password2.errors %}{% for error in form.password2.errors %}
{{ error }}{% endfor %}{% endif %}
<p>{{ form.password2.label }} {{ form.password2 }}</p>

Copy and paste this into a template, tweak, and Robert’s your mother’s brother.

As well as as_p(), the dfs.scaffold module also has the equivalent functions as_ul(), as_table, and an extra as_div() function.

Writing interactive command line DB query tools with pyDBCLI

While some people can’t navigate a relational database without reaching for a GUI, the vast majority of us spend a good proportion of our lives inside interactive command line interfaces, such as psql or mysql.

What do you do, however, if there isn’t a CLI query tool available for the DB you’re working with?
I had this exact same problem recently with a project, the main reason for the lack of any such tooling is because I wasn’t actually dealing with a real DB at all, but an ODBC interface to a web service that exposed reporting data as if it were tables in a DB. Rather than spend the rest of my life messing around with queries in a hooked up spreadsheet, or repeatedly writing one off Python snippets, I decided to write my own psql-like tool.

Thus the first version of pyDBCLI was born; well not really, my first tool really only handled querying Webtrends (I’ll save that for another post), while pyDBCLI is a base class for making such tools as long as you have a DB API compatible cursor to query.
pyDBCLI is based on the fantastic cmd.Cmd, so you can extend pretty much exactly as you would would Cmd, except with some extra properties such as cursor and multi_prompt are provided.

In order to make a Cmd based tool behave more like psql I ended up overriding the parseline method with regular expressions to handle escaped commands such as “d” and “c”, fuzzing them if not escaped, or un-escaping them if escaped so that Cmd’s unmodified parseline method can handle parsing and dispatching to defined do_* methods.

The other main change was modifying the default method to dispatch command lines to the a query method for querying against the DB API cursor; as well as this default and several other commands will detect an unfinished SQL query and wait for a finishing character (“;” by default) before sending it (or doing anything) else. This is what the multi_prompt property is for, the prompt property is replaced with multi_prompt when a query spans more than one line, and is set back again when the query is finished and executed.

2 example tools are bundled with pyDBCLI, in the extras package:

  • odbc – a tool to query an ODBC exposed data source, using PyODBC; takes PyODBC compatible DSN strings.
  • litecli – a tool to query a SQLite database; SQLite has it’s own CLI tool to do this, which is very well rounded and much better than litecli, but this is provided as a fairly functional example tool.

Tomorrow I’ll discuss the original reason I whipped up pyDBCLI: creating a tool for querying Webtrends, via ODBC, quickly and with more ease.