I have very little nostalgia for the old Design Museum building. Its location near Tower Bridge was always a real effort to get to, and while the building was an attractive modernist icon, it always felt small, very much one of London’s “minor” museums – not befitting London’s reputation as a global design powerhouse. On 21 November it reopened at a new location in Kensington, and I visited on the opening weekend.
There is a new version of gunicorn, 19.0, which has a couple of significant changes, including some interesting workers (gthread and gaiohttp) and actually responding to signals properly, which will make it work with Heroku.
The HTTP RFC, 2616, is now officially obsolete. It has been replaced by a bunch of RFCs from 7230 to 7235, covering different parts of the specification. The new RFCs look loads better, and it’s worth having a look through them to get familiar with them.
Some kind person has produced a recommended set of SSL directives for common webservers, which provide an A+ on the SSL Labs test, while still supporting older IEs. We’ve struggled to find a decent config for SSL that provides broad browser support, whilst also having the best levels of encryption, so this is very useful.
A few people are still struggling with Git. There are lots of git tutorials around the Internet, but this one from Git Tower looks like it might be the best for the complete beginner. You know it’s for noobs, of course, because they make a client for the Mac 🙂
I haven’t seen a lot of noise about this, but the EU has outlawed pre-ticked checkboxes. We have always recommended that these are not used, since they are evil UX, but now there’s an argument that might persuade everyone.
Here is a really nice post about splitting user stories. I think we are pretty good at this anyhow, but this is a nice way of describing the approach.
@monkchips gave a talk at IBM Impact about the effect of Mobile First. I think we’re on the right page with most of these things, but it’s interesting to see mobile called-out as one of the key drivers for these changes.
I’d not come across the REST Cookbook before, but here is a decent summary of how to treat PUT vs POST when designing RESTful APIs.
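As a rough illustration of the usual distinction (a minimal in-memory sketch, not code from the REST Cookbook; the NoteStore class and the /notes URIs are made up for the example): POST asks the server to create a new resource and choose its URI, while PUT stores a representation at a URI the client already knows, so repeating the same PUT leaves the state unchanged.

```python
import itertools


class NoteStore:
    """Hypothetical in-memory stand-in for a collection resource such as /notes."""

    def __init__(self):
        self._notes = {}
        self._ids = itertools.count(1)

    def post(self, body):
        """POST /notes: the server picks the URI, so every call creates a new note."""
        note_id = next(self._ids)
        self._notes[note_id] = body
        return 201, f"/notes/{note_id}"

    def put(self, note_id, body):
        """PUT /notes/<id>: the client names the URI; repeating the call is harmless."""
        created = note_id not in self._notes
        self._notes[note_id] = body
        return (201 if created else 200), f"/notes/{note_id}"

    def count(self):
        return len(self._notes)
```

Sending the same POST twice leaves two notes behind, whereas sending the same PUT twice leaves one, which is the property that makes PUT safe to retry.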
Fastly have produced a spectacularly detailed article about how to get tracking cookies working with Varnish. This is very relevant to consumer facing projects.
This post from ThoughtWorks is absolutely spot on, and I think it accurately describes an important aspect of testing: The Software Testing Cupcake.
As an example of how to make unit tests less fragile, this is a decent description of how to isolate tests, which is a key technique.
A nice implementation of “sudo mode” for Django. This ensures the user has recently entered their password, and is suitable for protecting particularly valuable assets in a web application like profile views or stored card payments.
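The idea can be sketched without Django at all. This is a minimal illustration of the concept, not the linked package's API: the Session class, the view function and the 15 minute window are all assumptions made for the example.

```python
import time

# Assumed window; a real deployment would pick whatever suits the asset.
SUDO_WINDOW_SECONDS = 15 * 60


class Session:
    """Hypothetical session object that remembers the last password check."""

    def __init__(self):
        self.last_password_check = None

    def confirm_password(self, now=None):
        # Called after the user has successfully re-entered their password.
        self.last_password_check = time.time() if now is None else now

    def in_sudo_mode(self, now=None):
        now = time.time() if now is None else now
        if self.last_password_check is None:
            return False
        return now - self.last_password_check <= SUDO_WINDOW_SECONDS


def view_stored_cards(session, now=None):
    """A sensitive view: only served if the password was entered recently."""
    if not session.in_sudo_mode(now):
        return "re-enter password"
    return "card list"
```

The `now` parameter exists only so the time window can be exercised in tests without waiting.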
If you are using Redis directly from Python, rather than through Django’s cache wrappers, then HOT Redis looks useful. This provides atomic operations for compound Python types stored within Redis.
Recently, we were faced with the task of writing an API-first web application in order to support future mobile platform development. Here’s a summary of the project from the point of view of one of the developers.
For the first couple of iterations, we had problems demonstrating the project progress to the customer at the end of iteration meetings. The customer on this project was extremely understanding and reasonably tech-savvy but despite that, he remained uninterested in the progress of the API and became quite concerned by the lack of UI progress. Although we were busy writing and testing the API code sitting just beneath the surface, letting the customer watch our test suite run would have achieved nothing. It was frustrating to find that, when there was nothing for the customer to click around on, we couldn’t get the level of engagement and collaboration we would typically achieve. In the end, we had to rely on the wireframes from the design process which the customer had signed off on to inform our technical decisions and, to allay the customer’s fears, we ended up throwing together some user interfaces which lacked any functionality purely to give the illusion of progress.
On the plus side, once we had written enough of our API to know that it was fit for purpose, development on the front-end began and progressed very rapidly; most of the back-end validation was already in place, end-points were well defined, and the comprehensive integration tests we’d written served as a decent how-to-use manual for our API.
Developing the application API-first took more work and more lines of code than it would have required if implemented as a typical post-back website.
Each interface had to be judged by its general usefulness rather than by its suitability for one particular bit of functionality alluded to by our wireframes or specification. Any view that called upon a complex or esoteric query had to instead be implemented using querystring filters or a peculiar non-generic endpoint.
In a typical postback project with private, application-specific endpoints, we’d be able to pick and choose the HTTP verbs relevant to the template we’re implementing; however, our generic API required considerably more thought. For each resource and collection, we had to carefully think about the permissions structure for each HTTP method, and the various circumstances in which the endpoint might be used.
We wrote around 4000 lines of integration test code just to pin down the huge combination of HTTP methods and user permissions; however, I sincerely doubt that all of those combinations are required by the web application. Had we not put in the extra effort, however, we’d have risked making our API too restrictive for future potential consumers.
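To give a feel for why the test count grows so quickly, here is a small table-driven sketch. The roles, the policy table and the status codes are hypothetical, not the project's real permission structure.

```python
import itertools

METHODS = ("GET", "POST", "PUT", "DELETE")
ROLES = ("anonymous", "member", "admin")

# Hypothetical policy for a single resource: which roles may use which method.
ALLOWED = {
    "GET": {"anonymous", "member", "admin"},
    "POST": {"member", "admin"},
    "PUT": {"member", "admin"},
    "DELETE": {"admin"},
}


def expected_status(method, role):
    """Status an integration test would assert for this method and role."""
    if role in ALLOWED[method]:
        return 200
    # Unauthenticated callers are asked to log in; known users are refused.
    return 401 if role == "anonymous" else 403


def permission_cases():
    """Every (method, role, expected status) combination for one resource."""
    return [
        (method, role, expected_status(method, role))
        for method, role in itertools.product(METHODS, ROLES)
    ]
```

Even this toy policy yields twelve cases for one resource; multiply by every resource and collection and the volume of integration test code is easy to believe.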
In terms of future maintainability, I’d say that each new generic endpoint will require a comparable amount of otherwise-unnecessary consideration and testing of permissions and HTTP methods.
Having such an explicitly documented split between the front and back end was actually very beneficial. The front end and back-end were developed and tested based on the API we’d designed and documented. For over a month, I worked solely on the back-end and my colleague worked solely on the front and we found this division of labour was an incredibly efficient way to work. By adhering to the HTTP 1.1 specification, using the full range of available HTTP verbs and response codes, and to our endpoint specification, we required far less interpersonal coordination than would typically be the case.
The two major issues we found with generic CRUD endpoints were (1) performing a complex data query, and (2) updating multiple resources in a single transaction.
To a certain extent we managed to solve the first problem using querystrings, with keys representing fields on the resource. For all other cases, and also to solve the second problem, we used an underused yet still perfectly valid REST resource archetype: the controller, used to model a procedural concept.
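A minimal sketch of the querystring approach, using only the standard library. The applicant records and the whitelist of filterable fields are invented for the example, and a real Django REST Framework project would normally do this inside a view or filter backend rather than by hand.

```python
from urllib.parse import parse_qs

# Hypothetical collection; in the real application these would be model rows.
APPLICANTS = [
    {"id": 1, "name": "Ada", "status": "accepted", "city": "York"},
    {"id": 2, "name": "Grace", "status": "pending", "city": "London"},
    {"id": 3, "name": "Linus", "status": "accepted", "city": "London"},
]

# Only fields listed here may be used as querystring keys (an assumption).
FILTERABLE_FIELDS = {"status", "city"}


def filter_collection(items, querystring):
    """Apply ?field=value filters; unknown keys are ignored."""
    params = parse_qs(querystring)
    result = items
    for field, values in params.items():
        if field not in FILTERABLE_FIELDS:
            continue
        wanted = set(values)
        result = [item for item in result if str(item.get(field)) in wanted]
    return result
```

A request such as /applicants?status=accepted&city=York would then narrow the collection on both fields, while repeating a key (city=York&city=London) matches either value.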
We used controller endpoints on a number of occasions to accommodate things like /invitations/3/accept (“accept” represents the controller) which would update the invitation instance and other related user instances, as well as sending email notifications.
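In outline, a controller like this bundles several updates behind one procedural call. The sketch below uses plain dictionaries in place of models and a list in place of a mail queue; the field names and status codes are assumptions, not the project's code.

```python
# Hypothetical stand-ins for the invitation and user tables and the mail queue.
invitations = {3: {"user_id": 10, "status": "pending"}}
users = {10: {"email": "sam@example.com", "active": False}}
outbox = []


def accept_invitation(invitation_id):
    """Handler for POST /invitations/<id>/accept; returns an HTTP status code."""
    invitation = invitations.get(invitation_id)
    if invitation is None:
        return 404
    if invitation["status"] != "pending":
        # Accepting twice conflicts with the current state of the resource.
        return 409
    user = users[invitation["user_id"]]
    # One call updates the invitation, the related user, and notifies by email.
    invitation["status"] = "accepted"
    user["active"] = True
    outbox.append((user["email"], "Your invitation has been accepted"))
    return 200
```

In a real back end the three updates would sit inside a single database transaction, so that a failure part-way through leaves nothing half-applied.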
Where we needed to support searching, we added procedures to collections, of the form /applicants/search, to which we returned members of the collection (in this example “applicants”) which passed a case-insensitive containment test based on the given key.
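The containment test itself is simple. This is a sketch under stated assumptions: the records, the choice of the "name" field and the function name are all illustrative.

```python
# Hypothetical members of the "applicants" collection.
APPLICANTS = [
    {"id": 1, "name": "Ada Lovelace"},
    {"id": 2, "name": "Grace Hopper"},
    {"id": 3, "name": "Adam Smith"},
]


def search_applicants(applicants, key, field="name"):
    """Return members whose field contains the key, ignoring case."""
    needle = key.casefold()
    return [
        applicant
        for applicant in applicants
        if needle in str(applicant.get(field, "")).casefold()
    ]
```

Using casefold() rather than lower() makes the comparison behave sensibly for non-English text as well.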
API-first required extra implementation effort and a carefully-considered design. We found it was far easier and more efficient to implement the back end as a generic, decoupled component than through the typical creation process (model -> unit test -> url -> view -> template -> integration test), with the front-end being created completely independently.
In the end, we wrote more lines of code and far more lines of integration tests. The need to stringently adhere to the HTTP specification for our public API really drove home the benefits of using methods and status codes.
In case you’re curious, we used Marionette to build the front-end, and Django REST Framework to build the back end.
About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.
A couple of us went to QCon London last week, which as usual had some excellent speakers and some cutting edge stuff. QCon bills itself as “enterprise software development conference designed for team leads, architects and project management”, but it has a reputation for being an awful lot more interesting than that. In particular it covers a lot of cutting-edge work in architecture.
Scale, scale, scale
What that means in 2010 is scale, scale, scale – how do you service a bazillion people? In summary, nobody really has a clue. There were presentations from Facebook, Skype, BBC, Sky and others on how they’ve scaled out, as well as presentations on various architectural patterns that lend themselves to scale.
Everyone has done it differently using solutions tailored to their specific problem-space, pretty much all using Open Source technology but generally building something in-house to help them manage scale. This is unfortunate – it would be lovely to have a silver bullet for the scale problem.
From the academics there is a strong consensus that functional languages are the way forward, with loads of people championing Erlang. I’m a big fan of Erlang myself, and we’ve got a few Erlang coders here at Isotoma.
There was also some interesting stuff on other functional approaches to concurrency, in Haskell specifically and in general. One of the great benefits of functional languages is their ability to defer execution through lazy evaluation, which showed some remarkable performance benefits compared with more traditional data synchronisation approaches. I’d have to wave my hands to explain it better, sorry.
Erlang is being used in production in some big scale-outs now too: the BBC are using CouchDB, which they gave a glowing report.
Skype are using Postgres (our preferred RDBMS here) and achieving remarkable scale using pretty simple technologies like pgbouncer. The architect speaking for Skype said one of their databases had 60 billion rows, spread over 64 servers, and that it was performing fine. That’s a level of scale that’s outside what you’d normally consider sane.
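To put those numbers in perspective, and to show the general shape of spreading rows over servers, here is a small sketch. The hash-based routing is a generic technique chosen for illustration; the talk as reported here did not say how Skype actually assigns rows to servers.

```python
import hashlib

SHARD_COUNT = 64
TOTAL_ROWS = 60_000_000_000

# Average load per server if the rows were spread evenly.
ROWS_PER_SHARD = TOTAL_ROWS // SHARD_COUNT


def shard_for(key):
    """Map a key to a shard number using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT
```

An even spread works out at 937,500,000 rows per server, so each individual database is still very large by ordinary standards.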
They did need a dedicated team of seriously clever people though – and that’s one of the themes from all the really big shops who talked, that they needed large, dedicated teams of very highly-paid engineers. Serious scale right now is not an off-the-shelf option.
Erlang starred in one of the other big themes being discussed, NoSQL databases. We’ve had our own experience with these here, specifically using Oracle’s dbXML, with not fantastic results. XML is really not suited to large scale performance unfortunately. Some of the other databases being talked about now, though, are Cassandra from Facebook, CouchDB, and Voldemort from LinkedIn.
None of these are silver bullets either, though. Many of them do very little heavy lifting for you: often your application needs custom consistency or transaction handling, or you get unpredictable caching (i.e. “eventual consistency”). You need to architect around your users’ actual requirements; you can’t use an off-the-shelf architecture and deploy it for everyone.
The need to design around your users was put very eloquently by Udi Dahan in his Command-Query Responsibility Segregation talk. This was excellent, and it was pleasant to discover that an architecture we’d already derived ourselves from first principles (which I can’t talk about yet) had an actual name and everything! In particular he concentrated on divining User Intent rather than throwing in your normal GUI toolkit for building UIs – he took data grids to pieces, and championed the use of asynchronous notification. The idea of a notification stream as part of a call-centre automation system, rather than hitting F5 to reload repeatedly, was particularly well told.
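For readers who haven't met the pattern, here is a very small sketch of the split. It is my own illustration, not Udi Dahan's example or our architecture, and the ticket domain and class names are invented: commands name what the user intends and go to a write side, which pushes notifications to a separate read-side view.

```python
class TicketCommands:
    """Write side: accepts commands that state the user's intent."""

    def __init__(self, subscribers):
        self._status = {}
        self._subscribers = subscribers

    def open_ticket(self, ticket_id):
        self._status[ticket_id] = "open"
        self._publish(("opened", ticket_id))

    def resolve_ticket(self, ticket_id):
        if self._status.get(ticket_id) != "open":
            raise ValueError("only open tickets can be resolved")
        self._status[ticket_id] = "resolved"
        self._publish(("resolved", ticket_id))

    def _publish(self, event):
        # Push the change to every interested view as it happens.
        for notify in self._subscribers:
            notify(event)


class OpenTicketView:
    """Read side: a view kept current by notifications rather than polling."""

    def __init__(self):
        self.open_ids = set()

    def on_event(self, event):
        kind, ticket_id = event
        if kind == "opened":
            self.open_ids.add(ticket_id)
        elif kind == "resolved":
            self.open_ids.discard(ticket_id)
```

Wiring it up is a matter of passing the view's on_event method as a subscriber; queries then read open_ids directly and never touch the write side.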
DevOps, Agile and Kanban
Some of the other tracks were particularly relevant to us. The DevOps movement attempts to make it easier for development and operations teams to work closely together. For anyone who has worked in this industry this will be a familiar issue – development and ops have different definitions of success, and different expectations from their customers. When these come into conflict, everyone gets hurt.
There was a great presentation from Simon Stewart of webdriver fame about his role as a Software Engineer in Test at Google, where they have around one SET to 7 or 8 developers to help productionise the software, provide a proper test plan and generally improve the productivity and quality of code by applying ops and automated testing principles to development.
One of the things we’ve experienced a lot here over the last year, as we’ve grown, is that there are a lot of bottlenecks, pinch points and pain in areas outside development too. Agile addresses a lot of the issues in a development team, but doesn’t address any of the rest of the process of going from nothing to running software in production. We’ve experienced this with pain in QA, productionisation, documentation, project management, specification – in fact every area outside actual coding!
Lean Kanban attempts to address this, with methods adopted from heavy industry. I’m not going to talk about it here, but there’s definitely a role for this kind of process management, if you can get your customer on-side.
Training and Software Craftsmanship
Finally, what I think was the most interesting talk of the conference, and one directly relevant to my current work: Jason Gorman gave a fantastic talk about a training scheme he is running with the BBC to improve software craftsmanship using peer-review. I’ll be trying this out at Isotoma, and I’ll blog about it too!
Some of you may know Tim Bray. He’s been a major player in some important technologies of the present (XML) and the future (Atom). He also has a really good blog.
He’s posted a good summary of some of the big issues in software and systems architecture. These are some of the points that occupy anyone involved in longer-term technology strategy, and it’s sobering to see them listed together like that. These are very exciting times to be in technology – but it’s probably easier now than it has ever been to back the wrong horse.
A lot of these issues are ones that we struggle with here at Isotoma, and as Chief Software Architect it’s my job to try and anticipate some of these trends, for the benefit of our clients. This seems like a good opportunity to respond to Tim, and to show how we’re thinking about technology strategy.
(Apologies if I lapse at times into gobbledegook. Some of the things I’ll talk about are just plain technical. I’ll try and link them appropriately, so at least there’s some context.)
Up till not too damn long ago, for a big serious software project you could pick Java or .NET or, if you really liked pain, C++. Today you’d be nuts not to look seriously at PHP, Python, and Ruby. What’s the future mind-share of all these things? I have no idea, but that decision is being made collectively by the community right now.
He’s absolutely right, but obviously we’ve done rather more than look seriously. We’ve been a pretty much pure Python shop right from the outset. We use some PHP, when it makes sense, but Python is our clear choice, and it’s one we’re more than happy with. It’s a significant competitive advantage for us, in all sorts of ways.
Python has delivered for us in developer productivity, and on a larger scale it’s delivered in elegance – it scales very well as a developer language. Also, perhaps unlike Ruby, it scales very well in terms of performance, so I’m comfortable building very large systems in Python.
No, I don’t think relational databases are going away anytime soon. But I think that SQL’s brain-lock on the development community for the past couple of decades has been actively harmful, and I’m glad that it’s now OK to look at alternatives.
Will the non-relational alternatives carve out a piece of the market? I suspect so, but that decision is being made by the community, right now.
Brain-lock is right. It’s been the case for twenty years that for every single IT project your architect would open his toolbox and would pull out an RDBMS.
Relational Databases are still suited to a whole suite of applications and classes of problem. When you have highly structured data and very tight performance criteria they’re going to be a good choice for a long time to come. But for many of the problems they’ve been used to solve they are terminally ill-suited.
We’ve been using ZODB as part of Zope since 2004 (and I used it myself for several years before that). ZODB has some excellent characteristics for whole classes of problem that an RDBMS has problems with. It’s a lot more flexible, and its hierarchical nature provides a natural fit for web projects.
More recently we’ve been making heavy use of DB XML, Oracle’s Open Source XML database. This is a fantastic product, and it’s a much better model for most of the applications we build. A good example, oddly, would be Forkd, which we built using a traditional RDBMS. If we’d had then the experience of DB XML that we have now, there’s no question that we’d have used it for Forkd. Fitting recipes into a relational database is an exercise in ultimately pointless contortion.
I’m very confident XML databases are going to be huge.
CORBA is dead. DCOM is dead. WS-* is coughing its way down the slope to dusty death. REST, they say, is the way to go. Which I believe, actually. Still, there’s not much yet in tooling or best practices or received wisdom or blue-suit consultants or the other apparatus of a mainstream technology.
So what are they going to be teaching the kids, a few years hence, the right way is to build an application across a network full of heterogeneous technology? That’s being worked out by the community, right now.
The lack of tooling and blue-suit consultants (ignore my blue suit for a second) is, I think, a good thing. REST is not a technology stack, it’s an architectural style. It pares down the network programming model to fit the harsh realities of a stateless, highly concurrent, open system. We’re big fans of REST, and it’s a natural fit for how we work.
It’s not the whole story though, and there are a whole bunch of recurring problems in RESTful interfaces that are awaiting smart people to solve them. There’s some good work going on with URI Templates and PATCH, and of course Atom, which I think are part of the solution.
Some relatively common orchestrations are horribly contorted in REST too, and it wouldn’t surprise me if we see some tooling here, to handle specific cases such as lock acquisition and release.
Moore’s law is still holding, but the processors get wider not faster. Now that the best and the brightest have spent a decade building and debugging threading frameworks in Java and .NET, it’s increasingly starting to look like threading is a bad idea; don’t go there. I’ve personally changed my formerly-pro-threading position on this 180º since joining Sun four years ago.
We still haven’t figured out the right way for ordinary people to program many-core processors; check out the inconclusive results of my Wide Finder project last year. (By the way, I’ve now got an Internet-facing T2000 all of my own and will be re-launching Wide Finder as soon as I get some data staged on it; come one, come all).
And I can’t even repeat my crack about the right answer being worked out right now, because I’m not actually sure that anyone has a grip on it just yet. But we’re sure enough at an inflection point.
We’re a lot further down this particular inflection curve than most, I think. We make heavy use of Twisted, a single-threaded cooperatively multitasking network programming system that specifically addresses the threading problem.
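For anyone who hasn't seen the style, cooperative multitasking on a single thread looks roughly like this. The sketch uses asyncio from the Python standard library as a stand-in, so it is not Twisted code and none of these names come from Twisted; the worker and main functions are invented for the example.

```python
import asyncio


async def worker(name, steps, log):
    """Do a little work, then hand control back to the event loop."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        # Explicit yield point: nothing else runs until the task gives way.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers share one thread and one list, with no locks needed.
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log
```

Running asyncio.run(main()) interleaves the two workers at their yield points. Because a task can only be interrupted where it chooses to give way, shared data such as the log needs no locking, which is the main attraction over threads.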
I don’t think it’s the whole answer, though, and nor is Erlang, which Tim championed in his Wide Finder project, with fascinating results.
Erlang has some marvellous attributes when it comes to large scale concurrent systems, and I’m very impressed with it. But adopting Erlang throws too much away, I think, losing the large-scale structural advantages of the Object Oriented approach that is pretty much the default for software architecture today.
Perhaps something like Stackless is the longer term solution here. An OO, message-passing, naturally distributed language using Python syntax and standard library but with some core functional changes (variables not being variable, for example) may be the answer.
Or maybe even Mozart, which solves a lot of these problems too. It’s the current first-year MIT language [update: this is probably a lie, see comments], so expect to hear more of it in time.
Tim is right though, nobody really knows the answer here. All we know is that it certainly isn’t traditional multi-threaded programming, a la Java or C++.
Used to be, it was Java EE or Perl or ASP.NET. Now all of a sudden it’s PHP and then Rails and a bunch of other frameworks bubbling up over the horizon; not a month goes by that I don’t see a bit of buzz over something that includes the term “Rails-like”.
It seems obvious to me that pretty soon there’s going to be a Rails++ that combines the good ideas from RoR with some others that will be obvious once we see them.
Also, that some of those “Rails-like” frameworks, even if they’re not a huge step forward, will get some real market share because they’ll have some combination of minor advantages.
Once again, I can’t say it’s being worked out right now, because for right now I see a pretty uniform picture of Rails’ market share advancing steadily. It won’t last.
We use a couple of Rails-like frameworks ourselves, TurboGears being the most obviously MVC. The big idea in Rails, and similar frameworks, is the combination of MVC with an Object Relational layer. Since, as I’ve said, I don’t think the Relational stuff is needed at all, there’s an obvious first place where Rails and friends should look. Ditch the RDBMS.
Second, MVC maps pretty well to a lot of applications, and it’s a natural architectural style for a lot of people. MVC isn’t the only architectural style, though, and it’s not necessarily the best fit for every application. The well-documented problems at Twitter, for example, I think just show a poor fit between MVC and Twitter’s fascinating (and well chosen) Jabber back-end. I know for certain, I’d not have used anything Rails-like for that.
I think it’s likely that the notional “Rails++” will not be MVC, nor will it have an Object Relational layer. I think Rails, and its imitators, are just not suited long-term to the challenges of scale and distribution. That said, they clearly work well for a whole host of small and medium-sized projects right now.
Servers, they’re easy to understand. Blue-suited salesmen sell them to CIOs a few hundred thousand dollars’ worth at a time, they get loaded into data centers where they suck up too much power and HVAC.
Well, unless you’re gonna do your storage and compute and load-balancing and so on out in the cloud. Are you? The CIOs and data-center guys are wrestling this problem to the ground right now.
And as for software, used to be you shipped binaries on magnetic media and charged ’em a right-to-use license. Nope, nowadays it’s open-source and they download it for free and you charge them a support contract. Nope, that was last century; maybe the software’s all going to be out there in the cloud and you never download anything, just pay to use what’s there.
Personally, I don’t think any of those models are actually going to go away. But which works best where? The market’s working that out, right now.
Obviously we’re Open Source throughout, which takes us a long long way down this road already. We’ve got one secret squirrel project that’s successfully deployed using a massive Amazon EC2 back-end too, but I can’t say more about it.
Let’s just say the massive economic advantages of the cloud are so conclusive that this is an obvious bet. Something like Google’s AppEngine is only a first step down this road, but it’s visionary and appropriate. And it’s in Python 😉
As I wrote a couple of months ago: how long can the public and private sector IT management continue to go on ignoring the fact that in OS X and Ubuntu, there are not one but two alternatives to the Windows desktop that are more reliable, more secure, more efficient, and cheaper? More or less everybody now has a friend or relative that’s on Mac or Linux and is going to be wondering why their desktop can’t be that slick.
What’s going to happen? I don’t know, but it’s going to be dramatic once we get to the tipping point, and I think we’re approaching it right now.
We use Ubuntu throughout, on all our desktops and laptops. Well, nearly. The machine that runs Sage for accounting is Windows. And our front-end guys need Windows or OS X to run Photoshop and Flash. But everything else? Ubuntu works really well, and saves us an absolute fortune.
Will It Always Be Like This?
You know, just maybe. Our mastery of the IT technologies is still in its brawny youth, with lots of low-hanging fruit to be snatched and big advances to be made. And these days, with the advent of blogs and unconferences and all those new communication channels, our thought leaders are busy chattering at each other about all these problems all the time, 24/7/365. The gap between the leading edge and technology that’s actually deployed in the enterprise is as wide as it’s ever been and to me, that feels like a recipe for permanent disruption. Cowabunga!
Our industry has the greatest community of practice that has existed, perhaps, in the history of mankind. Every profession has its conferences and papers and journals, but only in our part of IT is it normal to share and discuss all of our work, all of the time, even to the extent of giving away the very code we write.
I can’t see an end to this cycle of innovation yet – it’s just too damned valuable to everyone concerned. Cowabunga indeed 🙂
There are a few camps in the world of Software Architects. It’s not a question of technology choices, although it may seem that way. The truth is that technology changes so fast it would be foolish to hang one’s beliefs on it. In a career in software you know for certain that today’s great technology is going to be completely obsolete in twenty years’ time. Being open to new technologies is the only way to be certain you will not become obsolete yourself.
Instead, these camps are really centred around a set of core beliefs about the purpose of architecture, and what the unstated underlying goals are. Architecture is ultimately a set of economic decisions – every decision carries costs and risks, and these must be balanced. How one weighs these factors in the consideration of an architecture is affected by these underlying goals.
Here at Isotoma we’re pretty firmly in the camp of what I guess you could term the “new pragmatists”. This approach is typified by things such as using REST versus SOAP for system communication, for example, although not limited to it.
Briefly, REST is an approach to communication between distributed systems. It tries to leverage the successful patterns inherent in good Internet application design. We use it very successfully, and it fits well with our technology choices. This is not an accident – the technology is chosen to fit the overall approach.
The “toolset” guys on the other hand are real believers in their tools and IDEs and autogenerated this and graphically powered that. They want to be able to make building applications easy, which is kind of laudable, but misguided. Every added bit of ‘easiness’ for one person adds layers of unnecessary complexity – and unnecessary complexity is our real enemy, as software architects.
Ryan Tomayko puts this really nicely:
A lot of REST adherents, myself included, have come to appreciate REST only after believing and then eventually giving up on the notion that distributed systems design could ever be easy. You eventually find out that easy is not feasible and go for the next best thing: simple; manageable with reasonable effort and care.
REST was basically dreamed up by a bunch of individual geeks, doesn’t have a standards body and has no big software companies behind it. The “toolset” approach has a whole ecosystem of vendors and manufacturers champing at the bit to make it the de facto standard, so then you can buy their shiny development tools. In previous iterations of the software business the victory of the vendors has been a foregone conclusion. They place the adverts in the publications that IT managers read, and they buy what is advertised.
This time around, amazingly, it looks like instead we are winning, or may even have won. The blame for this can probably be placed on the Internet, and the way it has allowed a community of practice to develop outside the grind of the trade shows and vendor sponsored events that previously dominated the computer business.
So, what’s more amazing, the fact there actually is a grassroots movement in something as obscure as Software Architecture, or the fact it seems to be beating huge organisations with large budgets? I don’t know, but I think it bodes very well for the future.
Over at CNET, Opera’s CTO has written an article that rather sums up my own feelings on the battle between Microsoft’s XML-based document formats and OpenOffice’s. He also introduces something that I think is little short of revolutionary.
Both of them are frankly impenetrable XML. Just taking their binary format and turning it into tags does not make it interoperable. Interoperability in formats, as in any protocols, is gained by providing a level of abstraction. Just providing what is effectively a list of the properties of internal objects is no good: the choice of objects to solve the problem (i.e. in the solution space in architectural terms) is effectively arbitrary. In your protocols, file formats and published interfaces you need to provide some level of abstraction, so you say things about the problem space.
I am a fan of LaTeX which does provide this level of abstraction. LaTeX templates and documents deal with structure and with core concepts in typesetting such as kerning and leading. These things are necessary if you are to publish a “real” book, such as a scholarly work.
However, LaTeX is hard. Really hard. And for the sort of document generally generated in the workplace it’s overkill. In fact most documents produced in the workplace are never actually read by anyone, so the less effort involved in production the better.
Mr Lie’s suggestion of HTML and CSS3 is a really interesting one – HTML is accreting semantic information rapidly, with things such as the microformats movement. Allowing a document to provide multi-dimensional semantic information (is it a book? is it a plane? it’s both!) while also being viewable ‘natively’ in the browser is very exciting. More importantly, it knocks all the Microsoft vs OpenDocument stuff into a cocked hat. Who gives a damn what ISO eventually certify, if you’ve got something so eminently practical you can actually use?
The proof of the pudding of course, is in the eating, and Mr Lie seems to have done just this, producing a book using Prince.
This really feels like a major step forward in convergence to me, and a proper pragmatic step too, that leads to actual results rather than a load of hand-waving at conferences. It makes me want to go and write a book!
A truly excellent article on scaling Agile development through architecture is over at Agile Journal. We’ve been discovering a lot of these rules ourselves as our project size and complexity increase, and I can recommend them all.
It’s still difficult though – we’ve certainly found that sometimes it’s virtually impossible to predict all the key points of an architecture early. The old agile stalwarts of good test coverage and lots-and-lots-of-refactoring have to come in.