Following an earlier post, Duncan McGreggor has asked us, over on the Twisted blog, to expand a bit on where we think Twisted may be lacking in its support for concurrency. I’m afraid this has turned into a meandering essay, since I needed to reference so much background. It does come to the point eventually…
An unsolved problem
To many people it must seem as though “computers” are a solved problem. They seem to improve constantly, they do many remarkable things, and the Internet, for example, is a wonder of the modern world. Of course there are screw-ups, especially in large IT projects, and these are generally blamed on incompetent officials, greedy consulting firms and so on.
Although undoubtedly officials are incompetent and consultants are greedy, these projects are often crippled by the failure of industry to recognise that some of the core problems of systems design remain unsolved. Concurrency is one of the major areas where projects fall down. Building an IT system to service a single person is straightforward. Rolling that same system out to service hundreds of thousands is not.
It may seem odd to people outside the world of software, but concurrency (“doing several things at once”) is *still* one of the hot topics in software architecture and language design. Not only is it not a solved problem, there’s still a lot of disagreement on what the problem even *is*.
Here’s a typical scenario in IT systems rollout, one that every experienced engineer will have been involved in. The project seemed to be going OK, the software was substantially complete and people were talking about live dates. So the developers chuck it over the wall to the systems guys, so they can run some tests to work out how much hardware they’ll need.
And the answer comes back something like “we’re going to need one server per user” or “it falls over with four simultaneous users”. And I can tell you, if you get *that far* and discover this, the best option is to flee. Run for the hills and don’t look back.
There has always been a distinction between the worlds of academia and industry. Academics frame problems in levels of theoretical purity, and then address them in the abstract. Industry is there to solve immediate problems on the ground, using the tools that are available.
Academics have come up with a thousand ways to address concurrency, and a lot of these were dreamt up in the early days of computing. All the things I’m going to talk about here were substantially understood in the eighties. But these days it takes twenty years for something to make it from academia to something industry can use, and that time lag is increasing.
Industry only really cares about its tooling. The fact that academics have dreamt up some magic language that does really cool stuff is of no interest if there isn’t an ecosystem big enough to use. That ecosystem needs trained developers, books, training courses, compilers, interpreters, debuggers, profilers and of course huge systems libraries to support all the random crap every project needs (oh, it’s just like the last project except we need to write iCalendar files *and* access a remote MIDI music device). It also needs actual physical “tin” on which to run the code, and the characteristics of the tin make a lot of difference.
Toy academic languages are *no use*, as far as most of industry is concerned, for solving their problems. If you can’t go and get five hundred contractors with it on their CV, then you’re stuck.
The multicore bombshell
So, all industry has these days, really, is C++ and Java. C++ is still very widely used, but Java is gaining ground rapidly, and one of the reasons for this is its support for concurrency. I’ll quote Steve Yegge:
But it’s interesting because C++ is obviously faster for, you know, the short-running [programs], but Java cheated very recently. With multicore! This is actually becoming a huge thorn in the side of all the C++ programmers, including my colleagues at Google, who’ve written vast amounts of C++ code that doesn’t take advantage of multicore. And so the extent to which the cores, you know, the processors become parallel, C++ is gonna fall behind.
But for now, Java programs are getting amazing throughput because they can parallelize and they can take advantage of it. They cheated! Right? But threads aside, the JVM has gotten really really fast, and at Google it’s now widely admitted on the Java side that Java’s just as fast as C++.
His point here is vitally important. The reason Java is gaining is not an abstract language reason; it’s because of a change in the architecture of computers. Most new computers these days are multicore: they have more than one CPU core on the processor die. Java has fundamental support for threading, which is one approach to concurrency, and so some programs can take advantage of the extra cores. On a quad-core machine, with the right program, threaded Java code can run up to four times faster than single-threaded C++ code. A win, right?
Well here’s a comment from the master himself, Don Knuth:
I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the “Itanium” approach that was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write.
Let me put it this way: During the past 50 years, I’ve written well over a thousand programs, many of which have substantial size. I can’t think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX….
I know that important applications for parallelism exist—rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.
(via Ted Ts’o)
Hardware designers are threatening to increase the number of cores massively. Right now you get two-, four- or maybe eight-core systems, but soon there may be hundreds of cores. This is important.
The problems with threading
Until recently, if you’d said to pretty much any developer that concurrency was an unsolved problem, they’d have looked at you like you were insane. Threading was the answer – everyone knew that. It’s supported by the kernels of all major operating systems. Any serious software used threads widely to handle all sorts of concurrency, and hey it was easy – Java, for example, provides primitives in the language itself to manage synchronisation and all the other stuff you need.
But then some people started realising that it wasn’t quite so good as it seemed. Steve Yegge again:
I do know that I did write a half a million lines of Java code for this game, this multi-threaded game I wrote. And a lot of weird stuff would happen. You’d get NullPointerExceptions in situations where, you know, you thought you had gone through and done a more or less rigorous proof that it shouldn’t have happened, right?
And so you throw in an “if null”, right? And I’ve got “if null”s all over. I’ve got error recovery threaded through this half-million line code base. It’s contributing to the half million lines, I tell ya. But it’s a very robust system.
You can actually engineer these things, as long as you engineer them with the certain knowledge that you’re using threads wrong, and they’re going to bite you. And even if you’re using them right, the implementation probably got it wrong somewhere.
It’s really scary, man. I don’t… I can’t talk about it anymore. I’ll start crying.
This is a pretty typical experience of anyone who has coded something serious with threads. Weird stuff happens. You get deadlocks and breakage and just utterly confusing random stuff.
And you know, all those times your Windows system just goes weird, and stuff hangs and crashes and all sorts. I’m willing to bet a good proportion of those are due to errors in threading.
In reality threads are hard. It’s sort of accepted wisdom these days (at least amongst some of the community) that threads are actually too hard. Too hard for most programmers anyhow.
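To make the “weird stuff” a bit more concrete, here is a minimal sketch in Python of the classic lost-update bug. The `Counter` class and the `run` helper are made up for this illustration and are not taken from any of the systems mentioned above. The unsafe version reads a shared value and writes it back in two separate steps, so two threads can overwrite each other’s work; the safe version does the same thing while holding a lock.

```python
import threading


class Counter:
    """Hypothetical shared counter used only for this illustration."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read, add, write back. Another thread may run between the read
        # and the write, in which case one of the two updates is lost.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        # Holding the lock makes the read-modify-write behave atomically.
        with self._lock:
            self.value += 1


def run(method_name, n_threads=4, n_increments=10_000):
    """Call the named method from several threads and return the final count."""
    counter = Counter()
    method = getattr(counter, method_name)

    def worker():
        for _ in range(n_increments):
            method()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

`run("safe_increment")` always comes back with 40000. `run("unsafe_increment")` may come back with 40000 or with something smaller, and which one you get can change from run to run, which is exactly what makes this class of bug so miserable to track down.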
We’re Python coders, so Python is obviously of particular interest to us. We also write concurrent systems. Python’s creator (Guido van Rossum) took an approach to threading which has become pretty standard in most modern “dynamic” languages. Rather than ensure the whole Python core is “thread-safe” he introduced a Global Interpreter Lock. This means that in practice only one thread can be running Python code at any moment: while one thread holds the lock the others have to wait, because the whole interpreter is locked.
It certainly means threads in Python are massively less useful than they are in, say, Java. For a lot of people this has doomed Python – “what no threads!?” they cry, and then move on. Which is a shame, because threads are not the only answer, and as I’ve said I don’t even think they are a good answer.
Enter Twisted. Twisted is single-threaded, so it avoids all of the problems of threads. Concurrency is handled cooperatively, with separate subsystems within your program yielding control, either voluntarily or when they would block (i.e. when they are waiting for input).
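The cooperative idea can be sketched in a few lines of plain Python. To be clear, this is not Twisted’s API (Twisted is built around a reactor and callbacks); it is a toy round-robin scheduler using generators, and the names `task` and `run_all` are invented for the example. Each task does a little work and then yields, which is the point at which the single thread moves on to the next task.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: do one step of work, then give up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_all(tasks):
    """Run tasks round-robin in a single thread until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # task finished, drop it
        ready.append(current)      # otherwise put it at the back of the queue


log = []
run_all([task("a", 2, log), task("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Because a task can only be interrupted where it chooses to yield, there is no point at which another task can see its data half-updated, so no locks are needed. The flip side is that a task which never yields stalls everything else.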
This model fits a large proportion of programming problems very effectively, and it’s much more efficient than threads. So how does this handle multicore? Pretty effectively right now. We design our software in such a way that core parts can be run separately and scaled by adding more of them (“horizontal” scaling in the parlance). Our soon-to-be-released CMS, Exotypes, works this way, using multiple processes to exploit multiple cores.
This is a really effective approach. We can run say six processes, load balance between them and it takes great advantage of the hardware. Because we’ve designed it to work this way, we can even scale across multiple physical computers, giving us a lot of potential scale.
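As a rough sketch of the multi-process approach (and only a sketch: this is not how Exotypes is built, and `WORKER` and `parallel_sum_of_squares` are names invented here), the following splits a CPU-bound job into chunks and hands each chunk to a separate Python process using the standard `subprocess` module. Each process has its own interpreter, so the operating system is free to schedule them on different cores.

```python
import subprocess
import sys

# Code run by each worker process: sum the squares in [lo, hi) and print it.
WORKER = (
    "import sys; "
    "lo, hi = int(sys.argv[1]), int(sys.argv[2]); "
    "print(sum(i * i for i in range(lo, hi)))"
)


def parallel_sum_of_squares(n, n_procs=4):
    """Sum i*i for i in range(n), splitting the range across processes."""
    step = max(1, -(-n // n_procs))  # ceiling division, at least 1
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]

    # Start every worker before waiting on any, so they run side by side.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(lo), str(hi)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for lo, hi in bounds
    ]

    # Each worker shares nothing with the others; we only combine results.
    return sum(int(p.communicate()[0]) for p in procs)
```

The workers never touch each other’s memory, which is why the same design stretches to several machines: swap the local process launch for a network call and the structure stays the same.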
But what of machines of the future? Over a hundred cores, run a hundred processes? Over a thousand? At large numbers of cores the multi-process model breaks down too. In fact I don’t think any commonly deployed OS will handle this sort of hardware well at all, except for specialised applications. This is where I think Twisted falls down, through no fault of its own. I just suspect, like Don Knuth, that the hardware environment of the future is one that’s going to be extremely challenging for us to work in.
Two worlds, reprise
Of course, these issues have been addressed in academia, and I think, to finally answer Duncan’s question, that the long term solution to concurrency has to be addressed as part of the language. The only architecture that I think will handle it is the sort of thing represented in Erlang – lightweight processes that share no state.
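For readers who haven’t met the model, here is a very loose imitation of the share-nothing style in Python. It is emphatically not Erlang: Python threads are neither lightweight nor isolated, and the names `counter_actor` and `demo` are made up for this sketch. What it does show is the discipline: the counter’s state lives in a local variable that only one “process” can touch, and everyone else talks to it by sending messages to its mailbox.

```python
import queue
import threading


def counter_actor(mailbox):
    """Own a private count; change or reveal it only in response to messages."""
    count = 0  # private state, never shared directly
    while True:
        message, reply_to = mailbox.get()
        if message == "increment":
            count += 1
        elif message == "get":
            reply_to.put(count)
        elif message == "stop":
            return


def demo(n_senders=4, n_messages=100):
    """Have several senders message one actor, then ask it for the total."""
    mailbox = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox,))
    actor.start()

    def sender():
        for _ in range(n_messages):
            mailbox.put(("increment", None))

    senders = [threading.Thread(target=sender) for _ in range(n_senders)]
    for s in senders:
        s.start()
    for s in senders:
        s.join()

    # All increments are already queued, so "get" is handled after them.
    reply = queue.Queue()
    mailbox.put(("get", reply))
    total = reply.get()

    mailbox.put(("stop", None))
    actor.join()
    return total
```

`demo()` returns 400 every time, with no explicit lock in the application code, because only the actor ever modifies `count`.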
Erlang addresses the challenge of multicore computers fantastically well, but as a language for writing real programs it has some huge gaps. I don’t think it’s Erlang that’s going to win, but it’s going to be a language with many of its features.
First, Erlang is a functional language, with no object-oriented structures. Pretty much every coder in the world has been trained in, and is familiar with, the OO paradigm. For a language to gain traction it’s going to need to support this. Object orientation is quite compatible with Erlang’s concurrency model, and shouldn’t be too hard to support.
It also needs a decent library. Right now, the Erlang library ecosystem is, well, sparse.
Finally it needs wide adoption.
So, gods of the machines, I want something that’s got OCaml’s functional OO, Erlang’s concurrency and distribution, Python’s syntax and Python’s standard library. And I want you to bribe people to use it.
If you can do all this, not only will we be able to support multicore, but we might also, finally, be able to build a large IT system that actually works.