Author Archives: Doug Winter

Containerisation: tips for using Kubernetes with AWS

Containers have been a key part of developer toolkits for many years, but they are only now becoming common in production. Driving this adoption, in part, is the maturity of production-grade tooling and systems.

The leading container management product is Docker, but on its own Docker does not provide enough to deploy into production, which has led to a new product category: Container Orchestration.

The leading product in this space is Kubernetes, developed initially by Google and then released as open source software in 2015. Kubernetes differs from some of the competing container orchestration products in its design philosophy: it is committed to open source (with components like iptables, nginx and etcd as core moving parts) and it is entirely API-first in its design.

Our experience is that Kubernetes is ridiculously easy to deploy and manage and has many benefits over straight virtualisation for deploying mixed workloads, particularly in a public cloud environment.

Our services

We are working towards becoming a Kubernetes Certified Service Provider and are actively delivering Kubernetes solutions for customers, primarily on AWS. If you are interested in our consulting or implementation services please just drop us a line.

Why containers?

The primary benefits are cost and management effort. Cost, because expensive compute resource can be efficiently shared between multiple workloads. Management, because the container paradigm packages up an application with its dependencies in a way that facilitates flexible release and deployment.

A container cluster of 2 or 3 computers can host dozens of containers, all delivering different workloads. The Kubernetes software can scale containers within this cluster and can scale the overall cluster up and down depending on the needs of the workloads. This allows the cluster to be downsized to a minimum size when load is low. It also means containers that require very low levels of resources can remain in service without needing to take a whole virtual machine.

Management time benefits enormously because of the packaging of applications with their dependencies. It allows you to share compute resource even when the workloads have conflicting dependencies – a very common problem. It allows upgrades to progress incrementally across your estate, rather than requiring big-bang upgrades of underlying software, which can be risky.

Finally it also allows you to safely upgrade the underlying operating system of your cluster without downtime. Workloads are automatically migrated around the cluster as nodes are taken out of service and new, upgraded, nodes are brought in.  We’ve done this a bunch of times now and it is honestly kind of magic.

There are other benefits to do with ease of access, granular access control and federation, and I might deal with those in later posts.

Tips

Here are a few tips if you are considering getting started with Kubernetes.

Domains

Buy a new top level domain for every cluster. This makes your URLs so much nicer, and it really isn’t that expensive! 🙂

AWS accounts

We consider best practice to be a MASTER account, where your user accounts sit, and then one sub account for your production environment, with further sub accounts for pre-production environments. Note that you can run staging sites in your production cluster – this pattern should become much more common, since you are not staging the cluster, but staging the sites.

A staging cluster is only needed to test cluster-wide upgrades and changes.

Security

When all your sites are in a single cluster, and behind a single AWS ELB (yes, you can do this), it makes things such as Web Application Firewall automation and IP restricted ELBs more cost-effective. These things only need to be applied once to provide benefit across your estate.

Role-Based Access Control

This is a relatively new feature of Kubernetes, but it is solid and well-designed. I’d recommend turning this on from day one, so the capabilities are available to you later.

Flannel and Calico, or Weave

Similarly, I'd recommend enabling an overlay network from day one. These are easily deployed into an AWS Kubernetes cluster using the kops tool, and they provide advanced network capabilities if you ever need them in the future.

Namespaces

Use namespaces to subdivide your estate into logical partitions. production and staging are an obvious distinction, but you may well have user groups where namespaces make a sensible boundary for applying access control.

Tooling

Currently, integrating Kubernetes configuration with CloudFormation configuration means writing some custom tooling. Bite the bullet and dedicate some time to doing this well. I'm expecting Kubernetes to become a first-class citizen within AWS at some point, but until then you are going to need to own your devops tooling and do a good job of it.
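
To make that concrete, here is a minimal sketch of the kind of glue tooling we mean, written in TypeScript against the AWS SDK and js-yaml: it reads the outputs of a CloudFormation stack and injects one of them into a Kubernetes Service manifest that is piped to kubectl. The stack name, output key, annotation and manifest shape are placeholders for illustration – your own tooling will inevitably look different.

    // Sketch of custom glue between CloudFormation and Kubernetes. The stack
    // name, output key, annotation and manifest below are illustrative
    // placeholders, not a recommended layout.
    import * as AWS from "aws-sdk";
    import * as yaml from "js-yaml";
    import { execFileSync } from "child_process";

    async function deploy(): Promise<void> {
      const cfn = new AWS.CloudFormation({ region: "eu-west-1" });

      // Read the outputs exported by an (assumed) infrastructure stack.
      const { Stacks } = await cfn
        .describeStacks({ StackName: "cluster-infra" })
        .promise();
      const outputs = Stacks?.[0]?.Outputs ?? [];
      const certArn = outputs.find(
        o => o.OutputKey === "WildcardCertificateArn"
      )?.OutputValue;

      // Render a Kubernetes Service for the ingress controller, wiring the
      // ACM certificate from CloudFormation into the ELB via an annotation.
      const service = {
        apiVersion: "v1",
        kind: "Service",
        metadata: {
          name: "ingress-nginx",
          namespace: "ingress",
          annotations: {
            "service.beta.kubernetes.io/aws-load-balancer-ssl-cert": certArn,
          },
        },
        spec: {
          type: "LoadBalancer",
          selector: { app: "ingress-nginx" },
          ports: [{ name: "https", port: 443, targetPort: 443 }],
        },
      };

      // Pipe the rendered manifest straight into kubectl.
      execFileSync("kubectl", ["apply", "-f", "-"], {
        input: yaml.dump(service),
        stdio: ["pipe", "inherit", "inherit"],
      });
    }

    deploy().catch(err => { console.error(err); process.exit(1); });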

Resource records

Create Route53 ALIAS records for all your exposed endpoints (which could be just the single ELB for your ingress controller), and use these in your CloudFront distributions. This makes upgrades a lot easier!
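
As an illustration only – the zone IDs and DNS names below are placeholders, not real values – an ALIAS record UPSERT with the AWS SDK for JavaScript looks roughly like this:

    // Sketch: UPSERT a Route53 ALIAS record pointing a stable name at the
    // ingress ELB. All identifiers below (zone IDs, DNS names) are placeholders.
    import * as AWS from "aws-sdk";

    const route53 = new AWS.Route53();

    async function aliasToElb(): Promise<void> {
      await route53.changeResourceRecordSets({
        HostedZoneId: "Z_MY_CLUSTER_ZONE",          // the zone for your cluster's domain
        ChangeBatch: {
          Changes: [{
            Action: "UPSERT",
            ResourceRecordSet: {
              Name: "ingress.mycluster.example.",   // the name CloudFront points at
              Type: "A",
              AliasTarget: {
                DNSName: "my-ingress-elb-123456.eu-west-1.elb.amazonaws.com.",
                HostedZoneId: "Z_ELB_REGION_ZONE",  // the ELB's own (region-specific) zone ID
                EvaluateTargetHealth: false,
              },
            },
          }],
        },
      }).promise();
    }

    aliasToElb().catch(console.error);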

SOMA in use during the 2017 Edinburgh Festival

Live video mixing with the BBC: Lessons learned

In this post I am going to reflect on some of the more interesting aspects of this project and the lessons they might provide for other projects.

BBC R&D logo

This post is one of a series talking about our work on the SOMA video mixing application for the BBC. The previous posts in the series are:

  1. Building a live television video mixing application for the browser
  2. The challenges of mixing live video streams over IP networks
  3. Rapid user research on an Agile project
  4. Compositing and mixing video in the browser
  5. Taming the Async Beast with FRP and RxJS
  6. RxJS: An object lesson in terrible good software
  7. Video: The future of TV broadcasting
  8. Integrating UX with agile development

In my view there are three broad areas where this project has some interesting lessons.

Novel domains

First is the novel domain.

This isn't unfamiliar – we often work in novel domains that we have little to no knowledge of. It is the nature of a technical agency, in fact – while there are some domains we've worked in for many years, such as healthcare and education, there are always novel businesses with entirely new subjects to wrap our heads around. (To give you some idea, a few recent examples include store-and-forward television broadcasting, horse racing odds, medical curricula, epilepsy diagnosis, clustering automation and datacentre hardware provisioning.)

Over the years this has been the thing that I have most enjoyed out of every aspect of our work. Plunging into an entirely new subject with a short amount of time to understand it and make a useful contribution is exhilarating.

Although it might sound a bit insane to throw a team who know nothing about a domain at a problem, what we’re very good at is designing and building products. As long as our customers can provide the domain expertise, we can bring the product build. It is easier for us to learn the problem domain than it is for a domain expert to learn how to build great products.

The greatest challenge with a new domain is the assumptions. We all have these in our work – the things we think are so well understood that we don’t even mention them. These are a terrible trap for software developers, because we can spend weeks building completely the wrong thing with no idea that we’re doing so.

We were very lucky in this respect to be working with a technical organisation within the BBC: Research & Development. They were aware of this risk and did a very good job of arranging our briefing, which included a visit to a vision mixing gallery. This is the kind of exercise that delivers a huge amount in tacit understanding, and allows us to ask the really stupid questions in the right setting.

I think of the core problem as a “Rumsfeld“. Although he got a lot of criticism for these comments I think they’re bizarrely insightful. There really are unknown unknowns, and what the hell do you do about them? You can often sense that they exist, but how do you turn them into known unknowns?

For many of these issues the challenge is not the answer, which is obvious once it has been found, but facilitating the conversation to produce the answer. It can be a long and frustrating process, but critical to success.

I'd encourage everyone to get the software team into the existing environment of the target stakeholder groups, to understand at a fundamental level what they need.

The Iron Triangle

The timescale for this project was extraordinarily difficult – nine weeks from a standing start. In addition, much of the scope was quite fixed – we were largely building core functionality that, if missing, would have rendered the application useless. On top of that, we wanted to achieve the level of finish for the UX that we generally deliver.

This was extremely ambitious, and in retrospect we bit off more than we could reasonably chew.

Time is the greatest enemy of software projects because of the challenges in estimation. For reasons covered in a different blog post, estimation for software projects is somewhere between an ineffable art reserved only for the angels, and completely impossible.

Triangle with sides labeled Quality, Scope and Time

When estimates are impossible, time becomes an even greater challenge. One of the truisms of our industry is the "Iron Triangle" of time, scope and quality. Like a good Chinese buffet, you can only choose two. If you want a fixed time and scope, it is quality that will suffer.

Building good software takes thought and planning. Also, the first version of a component is rarely the best – it takes time to assemble, then consider it, and then perhaps shape it into something near its final form.

Quality is, itself, an aggregate property. Haste lowers the standards for each part and so, by a process of multiplication, lowers the overall quality of the product far more. The only way to achieve a very high quality for the finished product is for every single part to be of similarly high quality. This is generally our goal.

However. Whisper it. It is possible to “manage” quality, if you understand your process and know the goal. Different kinds of testing can provide different levels of certainty of code quality. Manual testing, when done exhaustively, can substitute in some cases for perfection in code.

We therefore managed our quality, and I think actually did well here.

Asynchronous integration components had to be as close to perfect as possible, because any bugs would result in a general lack of stability that would be impossible to trace. The only way to build these is carefully, with a design, and the only way to test them is exhaustively, with unit and integration tests.

On the other hand, there were a lot of aspects of the UI where it was crucial that they performed and looked excellent, but the code could be rougher around the edges, and could just be hacked out. This was my area of the application, and my goal was to deliver features as fast as possible with just-acceptable quality. Some of the code was quite embarrassing, but we got the project over the line on time, with the scope intact, and it all worked. This was sufficient for those areas.

Experimental technologies

I often talk about our approach using the concept of an innovation curve, and our position on it (I think I stole the idea from Ian Jindal – thanks Ian!).

Imagine a curve where the X axis is "how innovative your technologies are" and the Y axis is "pain".

In practical terms this can be translated into “how likely I am to find the answer to my problems on Stack Overflow“.

At the very left, everything has been seen and done before, so there is no challenge from novelty – but you are almost certainly not making the most of available technologies.

At the far right, you are hand crafting your software from individual photons and you have to conduct high-energy physics experiments to debug your code. You are able to mould the entire universe to your whim – but it takes forever and costs a fortune.

There is no correct place to sit on this curve – where you sit is a strategic (and emotional) decision that depends on the forces at play in your particular situation.

Isotoma endeavours to be somewhere on the shoulder of the curve. The software we build generally needs to last 5+ years, so we can't pick flash-in-the-pan technologies that will be gone in 18 months. But similarly we need to use relatively recent technology so it doesn't become obsolete too soon. This is sometimes called "leading edge" – almost bleeding edge, but not so close you get cut. With careful choice of tools it is possible to maintain a position like this successfully.

This BBC project was off to the right of this curve, far closer to the bleeding edge than we’d normally choose, and we definitely suffered.

Some of the technologies we had to use had some serious issues:

  1. To use IP Studio, a properly cutting-edge product developed internally by BBC R&D, we routinely had to read the C++ source code of the product to find answers to integration questions.
  2. We needed dozens of coordinated asynchronous streams running, for which we used RxJS. This was interesting enough to justify two posts on this blog on its own.
  3. WebRTC, which was the required delivery mechanism for the video, is absolutely not ready for this use case. The specification is unclear, browser implementation is incomplete and it is fundamentally unsuited at this time to synchronised video delivery.
  4. The video compositing technology in browsers actually works quite well, but it was entirely new to us and it took considerable time to gain sufficient expertise to do a good job. Also, browser implementations still have surprisingly sharp edges (only 16 WebGL contexts are allowed! Why 16? I dunno.)

Any one of these issues could have sunk our project, so I am very proud that we shipped good software despite facing all four.

Lessons learned? Task allocation is the key to this one, I think.

One person, Alex, devoted his time to the IP Studio and WebRTC work for pretty much the entire project, and Ricey concentrated on video mixing.

Rather than try and skill up several people, concentrate the learning in a single brain. Although this is generally a terrible idea (because then you have a hard dependency on a single individual for a particular part of the codebase), in this case it was the only way through, and it worked.

Also, don’t believe any documentation, or in fact anything written in any human languages. When working on the bleeding edge you must “Use The Source, Luke”. Go to the source code and get your head around it. Everything else lies.

Summary

I am proud, justifiably I think, that we delivered this project successfully. It was used at the Edinburgh Festival, and actual live television was mixed using our product, despite all the constraints above.

The lessons?

  1. Spend the time and effort to make sure your entire team understand the tacit requirements of the problem domain and the stakeholders.
  2. Have an approach to managing appropriate quality that delivers the scope and timescale, if these are heavily constrained.
  3. Understand your position on the innovation curve and choose a strategic approach to managing this.

The banner image at the top of the article, taken by Chris Northwood, shows SOMA in use during the 2017 Edinburgh Festival.

Screenshot of SOMA vision mixer

Compositing and mixing video in the browser

This blog post is the 4th part of our ongoing series working with the BBC Research & Development team. If you’re new to this project, you should start at the beginning!

BBC R&D logo

Like all vision mixers, SOMA (Single Operator Mixing Application) has a "preview" and "transmission" monitor. Preview is used to see how different inputs will appear when composed together – in our case, a video input, a "lower third" graphic such as a caption which fades in and out, and finally a "DOG" such as a channel or event identifier shown in the top corner throughout a broadcast.

When switching between video feeds SOMA offers a fast cut between inputs or a slower mix between the two. As and when edit decisions are made, the resulting output is shown in the transmission monitor.

The problem with software

However, one difference with SOMA is that all the composition and mixing is simulated. SOMA is used to build a set of edit decisions which can be replayed later by a broadcast-quality renderer. The transmission monitor is not just a view of the output after the effects have been applied, because the actual rendering of the edit decisions hasn't happened yet. The app needs to provide an accurate simulation of what the edit decisions will look like.

The task of building this required breaking down how output is composed – during a mix both the old and new input sources are visible, so six inputs are required.

VideoContext to the rescue

Enter VideoContext, a video scheduling and compositing library created by BBC R&D. This allowed us to represent each monitor as a graph of nodes, with video nodes playing each input into transition nodes allowing mix and opacity to be varied over time, and a compositing node to bring everything together, all implemented using WebGL to offload video processing to the GPU.
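
As a rough sketch of what such a graph looks like in code – the API names here are from memory of the VideoContext documentation, so treat them as approximate rather than definitive – a single monitor with a crossfade between two inputs might be wired up like this:

    // Approximate sketch of a VideoContext node graph for one monitor:
    // two video inputs feeding a crossfade transition, composited to a canvas.
    // Check the calls against the version of VideoContext you are using.
    import VideoContext from "videocontext";

    const canvas = document.getElementById("preview") as HTMLCanvasElement;
    const ctx = new VideoContext(canvas);

    // One node per input source.
    const cameraA = ctx.video("camera-a.mp4");
    const cameraB = ctx.video("camera-b.mp4");
    cameraA.start(0);
    cameraB.start(0);

    // A transition node whose "mix" property moves from A to B over time.
    const crossfade = ctx.transition(VideoContext.DEFINITIONS.CROSSFADE);
    cameraA.connect(crossfade);
    cameraB.connect(crossfade);
    crossfade.connect(ctx.destination);

    // Schedule a two-second mix starting at t=5s.
    crossfade.transition(5.0, 7.0, 0.0, 1.0, "mix");

    ctx.play();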

The flexible nature of this library allowed us to plug in our own WebGL scripts to cut the lower third and DOG graphics out using chroma-keying (where a particular colour is declared to be transparent – normally green), and with a small patch to allow VideoContext to use streaming video we were off and going.

Devils in the details

The details of how edits work were as fiddly as expected: tracking the mix between two video inputs versus the opacity of two overlays appeared to be similar problems but required different solutions. The nature of the VideoContext graph meant we also had to keep track of which node was current, rather than always connecting the current input to the same node. We put a lot of unit tests around this to ensure it works as it should, now and in future.

By comparison a seemingly tricky problem of what to do if a new edit decision was made while a mix was in progress was just a case of swapping out the new input, to avoid the old input reappearing unexpectedly.

QA testing revealed a subtler problem: when switching to a new input, the video takes a few tens of milliseconds to start. Cutting immediately causes a distracting flicker as a couple of blank frames are rendered – waiting until the video is ready adds a slight delay, but this is significantly less distracting.
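
A minimal sketch of that "wait until ready" idea, using the standard HTMLMediaElement readyState and loadeddata event rather than our actual project code:

    // Only perform the cut once the incoming <video> element has decoded
    // enough to render a frame, avoiding a flash of blank frames.
    function cutWhenReady(video: HTMLVideoElement, performCut: () => void): void {
      // readyState >= HAVE_CURRENT_DATA means at least one frame is available.
      if (video.readyState >= HTMLMediaElement.HAVE_CURRENT_DATA) {
        performCut();
        return;
      }
      video.addEventListener("loadeddata", () => performCut(), { once: true });
    }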

Later in the project a new requirement emerged to re-frame videos within the application and the decision to use VideoContext paid off as we could add an effect node into the graph to crop and scale the video input before mixing.

And finally

VideoContext made the mixing and compositing operations a lot easier than they would have been otherwise. Towards the end we even added an image source (for paused VTs) using the new experimental Chrome feature captureStream, and that worked really well.

Having made it all work, the obvious point of possible concern is performance, and overall it holds up pretty well. We needed half-a-dozen or so VideoContexts running at once, and this was effective on a powerful machine. Many more and the computer really starts struggling.

Even a few years ago attempting this in the browser would have been madness, so it's great to see so much progress in something so challenging, opening up a whole new range of software that can work in the browser!

Read part 5 of this project with BBC R&D where Developer Alex Holmes talks about Taming async with FRP and RxJS.

The challenges of mixing live video streams over IP networks

Welcome to our second post on the work we’re doing with BBC Research & Development. If you’ve not read the first post, you should go read that first 😉

Introducing IP Studio

BBC R&D logo

The first part of the infrastructure we’re working with here is something called IP Studio. In essence this is a platform for discovering, connecting and transforming video streams in a generic way, using IP networking – the standard on which pretty much all Internet, office and home networks are based.

Up until now video cameras have used very simple standards such as SDI to move video around. Even though SDI is digital, it's just point-to-point – you connect the camera to something using a cable, and there it is. The reason for the remarkable success of IP networks, however, is their ability to connect things together over a generic set of components, routing between the devices they connect. Your web browser can get messages to and from this blog over the Internet using a range of intervening machines, which is actually pretty clever.

Doing this with video is obviously in some senses well-understood – we’ve all watched videos online. There are some unique challenges with doing this for live television though!

Why live video is different

First, you can’t have any buffering: this is live. It’s unacceptable for everyone watching TV to see a buffering message because the production systems aren’t quick enough.

Second is quality. These are 4K streams, not typical internet video resolution. 4K streams have (roughly) 4000 horizontal pixels compared to the (roughly) 2000 of a 1080p stream (weirdly, 1080p, 720p etc. are named for their vertical pixels instead). Double the pixels in each dimension means roughly four times as many pixels overall, so they need about four times as much bandwidth – which even in 2017 is quite a lot. Specialist networking kit and a lot of processing power is required.

Third is the unique requirements of production – we’re not just transmitting a finished, pre-prepared video, but all the components from which to make one: multiple cameras, multiple audio feeds, still images, pre-recorded video. Everything you need to create the finished live product. This means that to deliver a final product you might need ten times as much source material – which is well beyond the capabilities of any existing systems.

IP Studio addresses this with a cluster of powerful servers sitting on a very high speed network. It allows engineers to connect together “nodes” to form processing “pipelines” that deliver video suitable for editing. This means capturing the video from existing cameras (using SDI) and transforming them into a format which will allow them to be mixed together later.

It’s about time

That sounds relatively straightforward, except for one thing: time. When you work with live signals on traditional analogue or point-to-point digital systems, then live means, well, live. There can be transmission delays in the equipment but they tend to be small and stable. A system based on relatively standard hardware and operating systems (IP Studio uses Linux, naturally) is going to have all sorts of variable delays in it, which need to be accommodated.

IP Studio is therefore based on "flows" comprising "grains". Each grain has a quantum of payload (for example a video frame) and timing information. The timing information allows multiple flows to be combined into a final output where everything happens appropriately in synchronisation. This might sound easy but is fiendishly difficult – some flows will arrive later than others, so systems need to hold some back until everything is running to time.
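
To illustrate the idea – this is not IP Studio's actual data model, just a toy sketch in TypeScript – you can think of a grain as payload plus timing, and synchronisation as holding grains back until every flow can contribute:

    // Illustrative only: a grain carries its payload plus the timing needed
    // to line flows up.
    interface Grain {
      flowId: string;          // which flow this grain belongs to
      originTimestamp: number; // when the payload was captured
      duration: number;        // how long the grain covers
      payload: Uint8Array;     // e.g. one encoded video frame or a block of audio
    }

    // A toy synchroniser: buffer grains per flow and only emit once every
    // flow has delivered something.
    class FlowSync {
      private buffers = new Map<string, Grain[]>();

      constructor(private flowIds: string[]) {
        flowIds.forEach(id => this.buffers.set(id, []));
      }

      push(grain: Grain): Grain[] | null {
        this.buffers.get(grain.flowId)?.push(grain);
        // Hold back until every flow has at least one buffered grain.
        if (this.flowIds.some(id => (this.buffers.get(id) ?? []).length === 0)) {
          return null;
        }
        // Emit the oldest buffered grain from each flow; a real implementation
        // would also check the origin timestamps actually line up.
        return this.flowIds.map(id => this.buffers.get(id)!.shift()!);
      }
    }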

To add to the complexity, we need two versions of the stream, one at 4k and one at a lower resolution.

Don’t forget the browser

Within the video mixer we’re building, we need the operator to be able to see their mixing decisions (cutting, fading etc.) happening in front of them in real time. We also need to control the final transmitted live output. There’s no way a browser in 2017 is going to show half-a-dozen 4k streams at once (and it would be a waste to do so). This means we are showing lower resolution 480p streams in the browser, while sending the edit decisions up to the output rendering systems which will process the 4k streams, before finally reducing them to 1080p for broadcast.

So we’ve got half-a-dozen 4k streams, and 480p equivalents, still images, pre-recorded video and audio, all being moved around in near-real-time on a cluster of commodity equipment from which we’ll be delivering live television!

Read part 3 of this project with BBC R&D where we delve into rapid user research on an Agile project.

MediaCity UK offices

Building a live television video mixing application for the browser

BBC R&D logo

This is the first in a series of posts about some work we are doing with BBC Research & Development.

The BBC now has, in the lab, the capability to deliver live television using high-end commodity equipment direct to broadcast, over standard IP networks. What we’ve been tasked with is building one of the key front-end applications – the video mixer. This will enable someone to mix an entire live television programme, at high quality, from within a standard web-browser on a normal laptop.

In this series of posts we’ll be talking in great depth about the design decisions, implementation technologies and opportunities presented by these platforms.

What is video mixing?

Video editing used to be a specialist skill requiring very expensive, specialist equipment. Like most things this has changed because of commodity, high-powered computers and now anyone can edit video using modestly priced equipment and software such as the industry standard Adobe Premiere. This has fed the development of services such as YouTube where 300 hours of video are uploaded every minute.

"Video mixing" is the activity of taking the various videos and stills in your source material and mixing them together to produce a single, linear output. It can involve showing single sources, cutting and fading between them, compositing them together, showing still images and graphics and running effects over them. Sound can be similarly manipulated. Anyone who has used Premiere, even to edit their family videos, will have some idea of the options involved.

Live television is a very different problem

First you need to produce high-fidelity output in real time. If you’ve ever used something like Premiere you’ll know that when you finally render your output it can take quite a long time – it can easily spend an hour rendering 20 minutes of output. That would be no good if you were broadcasting live! This means the technology used is very different – you can’t just use commodity hardware, you need specialist equipment that can work with these streams in realtime.

Second, the capacity for screw-ups is immensely higher. Any mistakes in a live broadcast are immediately apparent, and potentially tricky to correct. It is a high-stress environment, even for experienced operators.

Finally, the range of things you might choose to do is much more limited, because you have little time to set anything up. This means live television tends to use a far smaller 'palette' of mixing operations.

Even so, a live broadcast might require half a dozen people for a modest production. You need someone to set up the cameras and control them, a sound engineer to get the sound right, someone to mix the audio, a vision mixer, a VT operator (to run any pre-recorded videos you insert – perhaps the titles and credits) and someone to set up the still image overlays (for example, names and logos).

If that sounds bad, imagine a live broadcast away from the studio – the Outside Broadcast. All the people and equipment need to be on site, hence the legendary "OB Van".


Inside one of those vans is the equipment and people needed to run a live broadcast for TV. They’d normally transmit the final output directly to air by satellite – which is why you generally see a van with a massive dish on it nearby. This equipment runs into millions and millions of pounds and can’t be deployed on a whim. When you only have a few channels of course you don’t need many vans…

The Internet Steamroller

The Internet is changing all of this. Services like YouTube Live and Facebook Live mean that anyone with a phone and decent coverage can run their own outside broadcast. Where once you needed a TV network and millions of pounds of equipment now anyone can do it. Quality is poor and there are few options for mixing, but it is an amazingly powerful tool for citizen journalism and live reporting.

Also, the constraints of "channels" are going away. Where once there was no point owning more OB vans than you had channels, now you could run dozens of live feeds simultaneously over the Internet. As the phone becomes the first screen and the TV in the corner turns into just another display, many of the constraints that we have taken for granted look more and more anachronistic.

These new technologies provide an opportunity, but also some significant challenges. The major one is standards – there is a large ecosystem of manufacturers and suppliers whose equipment needs to interoperate. The standards used, such as SDI (Serial Digital Interface) have been around for decades and are widely supported. Moving to an Internet-based standard needs cooperation across the industry.

BBC R&D has been actively working towards this with their IP Studio project, and the standards for Networked Media that they are developing with industry.

Read part 2 of this project with BBC R&D where I’ll describe some of the technologies involved, and how we’re approaching the project.

Our Plants Need Watering Part II

This is the second post in a series doing a deep dive into Internet of Things implementation.  If you didn’t read the first post, Our Plants Need Watering Part I, then you should read that first.

This post talks about one of the most important decisions you'll make in an IoT project: which microcontroller to use. There are lots of factors, and some of them are quite fractal – but that said, I think I can make some concrete recommendations based on what I've learned so far on this project that might help you with your next IoT project.

This post gets really technical, I'm afraid – there's no way of comparing microcontrollers without getting into the weeds.

There are thousands of microcontrollers on the market, and they are all different. How you choose the one you want depends on a whole range of factors – there is no one-size-fits-all answer.

Inside a microcontroller

A microcontroller is a single chip that provides all the parts you require to connect software and hardware together. You can think of it as a tiny, complete, computer with CPU, RAM, storage and IO. That is where the resemblance ends though, because each of these parts is quite different from the computers you might be used to.

 

CPU

The Central Processing Unit (CPU) takes software instructions and executes them. This is the bit that controls the rest of the microcontroller, and runs your software.

Microcontroller CPUs come in all shapes and sizes, and this governs the performance and capabilities of the complete package. Mostly, though, the impact of your CPU choice is smaller than you might think – toolchains and libraries protect you from most of the differences between CPU platforms.

Really it is price and performance that matter most, unless you need very specific capabilities. If you want to do floating-point calculations or high-speed video or image processing, then you're going to select a platform with those specific capabilities.

Flash Memory

The kind of computers we are used to dealing with have hard disks for permanent storage. Microcontrollers generally do not have access to hard disks. Instead they have what is called "flash" memory. This is permanent – it persists even if power is disconnected. The name comes from the way the memory is erased, "like a camera flash", and it's colloquially known as just "flash".

You need enough flash to store your code. The amount of flash available varies tremendously. For example the Atmel ATtiny25 has a whole 2KB of flash whereas the Atmel ATSAM4SD32 has 2MB.

Determining how big your code will be is an important consideration, and often depends on the libraries you need to use. Some quotidian things we take for granted in the macro world, like C's venerable printf function, are too big in their normal form to fit onto many microcontrollers.

Static RAM (SRAM)

Flash is not appropriate for storing data that changes, so your working data needs somewhere else to go. This is generally SRAM. You will need enough SRAM to hold all your changeable data.

The amount of SRAM available varies widely. The ATtiny25 has a whole 128 bytes (far less than the first computer I ever programmed, the ZX81, and that was 35 years ago!). At the other end of the scale the ATSAM4SD32 has 160K, and can support separate RAM chips if you need them.

I/O Pins

Microcontrollers need to talk to the outside world, and they do this via their I/O pins. You are going to need to count the pins you need, which will depend on the devices you plan to connect your microcontroller to.

Simple things like buttons, switches, LEDs and so forth can use I/O pins on an individual basis in software, and this is a common use case. Rarely do you build anything that doesn’t use a switch, button or an LED.

If you are going to talk digital protocols, however, you might well want hardware support for those protocols. This means you might consider things like I²C, RS232 or SPI.

A good example of this is plain old serial. Serial is a super-simple protocol that dates back to the dark ages of computing. One bit at a time is sent over a single pin, and these are assembled together into characters. Serial support needs a bit of buffering, some timing control and possibly some flow control, but that’s it.

The ATtiny range of microcontrollers has no hardware support for serial, so if you want even to print text to your computer's serial port you will need to do it in software on the microcontroller. This is slow, unreliable and takes up valuable flash. It does work, though, at slow speeds – timing gets unreliable pretty quickly when doing things in software.

At the other end you have things like the SAM3X8E, based on the ARM Cortex-M3, which has a UART and 3 USARTs – hardware support for high-speed (well, 115200 baud) connections to several devices simultaneously and reliably.

Packaging

There are loads of different packaging formats for integrated circuits. Just check out the list on Wikipedia. Note that when you are developing your product you are likely to use a “development board”, where the microcontroller is already mounted on something that makes it easy to work with.

Here is a dev board for the STM32 ARM microprocessor:

(screwdriver shown for scale).

You can see the actual microprocessor here on the board:

Everything else on the board is to make it easier to work with that CPU – for example adding protection (so you don’t accidentally fry it), making the pins easier to connect, adding debug headers and also a USB interface with a programmer unit, so it is easy to program the chip from a PC.

For small-scale production use, “through hole” packages like DIP can be worked with easily on a breadboard, or soldered by hand. For example, here is a complete microcontroller, the LPC1114FN28:

Some others, like "chip carriers", can fit into mounts that you can use relatively easily, and finally there are "flat packages", which you would struggle to solder by hand:

Development support

It is all very well choosing a microcontroller that will work in production – but you need to get your software written first. This means you want a “dev board” that comes with the microcontroller helpfully wired up so you can use it easily.

There are dev boards available for every major platform, and mostly they are really quite cheap.

Here are some examples I’ve collected over the last few years:

The board at the bottom there is an Arduino Due, which I've found really useful. The white box connected to it is an Atmel debug device, which gives you complete IDE control of the code running on the CPU, including features like breakpoints, watchpoints, stepping and so forth.

Personally, I think you should find a dev board that fits your needs first, and then choose a microcontroller that is sufficiently similar. A workable development environment is absolutely your number one goal!

Frameworks, toolchains and libraries

This is another important consideration – you want it to be as easy as possible to write your code, whilst getting full access to the capabilities of the equipment you’ve chosen.

Arduino

Arduino deserves a special mention here, as a spectacularly accessible way into programming microcontrollers. There is a huge range of Arduino, and Arduino-compatible, devices, starting at only a few pounds and going right up to some pretty high-powered equipment.

Most Arduino boards have a standard layout allowing “shields” to be easily attached to them, giving easy standardised access to additional equipment.

The great advantage of Arduino is that you can get started very easily. The disadvantage is that you aren’t using equipment you could go into production with directly. It is very much a hobbyist solution (although I would love to hear of production devices using Arduino code).

Other platforms

Other vendors have their own IDEs and toolchains – many of which are quite expensive. Of the ones I have tried, Atmel Studio is the best by far. First, it is free – which is pretty important. Second, it uses the gcc toolchain, which makes debugging a lot easier for the general programmer. Finally, the IDE itself is really quite good.

Next time I’ll walk through building some simple projects on a couple of platforms and talk about using the Wifi module in earnest.

 

Internet Security Threats – When DDoS Attacks

On Friday evening an unknown entity launched one of the largest Distributed Denial of Service (DDoS) attacks yet recorded, against Dyn, a DNS provider. Dyn provides DNS for some of the Internet's most popular services, and those services duly suffered problems. Twitter, Github and others were unavailable for hours, particularly in the US.

DDoS attacks happen a lot, and are generally uninteresting. What is interesting about this one is:

  1. the devices used to mount the attack
  2. the similarity with the “Krebs attack” last month
  3. the motive
  4. the potential identity of the attacker

Together these signal that we are entering a new phase in development of the Internet, one with some worrying ramifications.

The devices

Unlike most other kinds of “cyber” attack, DDoS attacks are brute force – they rely on sending more traffic than the recipient can handle. Moving packets around the Internet costs money so this is ultimately an economic contest – whoever spends more money wins. The way you do this cost-effectively, of course, is to steal the resources you use to mount the attack. A network of compromised devices like this is called a “botnet“.

Most computers these days are relatively well-protected – basic techniques like default-on firewalls and automated patching have hugely improved their security. There is a new class of device though, generally called the Internet of Things (IoT) which have none of these protections.

IoT devices demonstrate a “perfect storm” of security problems:

  1. Everything on them is written in the low-level ‘C’ programming language. ‘C’ is fast and small (important for these little computers) but it requires a lot of skill to write securely. Skill that is not always available
  2. Even if the vendors fix a security problem, how does the fix get onto the deployed devices in the wild? These devices rarely have the capability to patch themselves, so the vendors need to ship updates to householders, and provide a mechanism for upgrades – and the customer support this entails
  3. Nobody wants to patch these devices themselves anyway. Who wants to go round their house manually patching their fridge, toaster and smoke alarm?
  4. Because of their minimal user interfaces (making them difficult to operate if something goes wrong), they often have default-on [awful] debug software running. Telnet to a high port and you can get straight in to administer them
  5. They rarely have any kind of built in security software
  6. They have crap default passwords that nobody ever changes

To see how shockingly bad these things are, follow Matthew Garrett on Twitter. He takes IoT devices to pieces to see how easy they are to compromise. Mostly he can get into them within a few minutes. Remarkably, one of the most secure IoT devices he's found so far was a Barbie doll.

That most of these devices are far worse than a Barbie doll should give everyone pause for thought. Then imagine the dozens of them so many of us have scattered around our house.  Multiply that by the millions of people with connected devices and it should be clear this is a serious problem.

Matthew has written on this himself, and he’s identified this as an economic problem of incentives. There is nobody who has an incentive to make these devices secure, or to fix them afterwards. I think that is fair, as far as it goes, but I would note that ten years ago we had exactly the same problem with millions of unprotected Windows computers on the Internet that, it seemed, nobody cared about.

The Krebs attack

A few weeks ago, someone launched a remarkably similar attack on the security researcher Brian Krebs. Again the attackers are unknown, and they launched the attack using a global network of IoT devices.

Given the similarities in the attack on Krebs and the attack on Dyn, it is probable that both of these attacks were undertaken by the same party. This doesn’t, by itself, tell us very much.

It is common for botnets to be owned by criminal organisations that hire them out by the hour. They often have online payment gateways, telephone customer support and operate basically like normal businesses.

So, if this botnet is available for hire then the parties who hired it might be different. However, there is one other similarity which makes this a lot spookier – the lack of an obvious commercial motive.

The motive

Mostly DDoS attacks are either (a) political or (b) extortion. In both cases the identity of the attackers is generally known, in some sense. For political DDoS attacks ("hacktivism") the targets have often recently been in the news, and are generally quite aware of why they're being attacked.

Extortion using DDoS attacks is extremely common – anyone who makes money on the Internet will have received threats and been attacked, and many will have paid out to prevent or stop a DDoS. Banks, online gaming, DNS providers, VPN providers and ecommerce sites are all common targets – many of them so common that they have experienced operations teams in place who know how to handle these things.

To my knowledge no threats were made to Dyn or Krebs before the attacks and nobody tried to get money out of them to stop them.

What they have in common is their state-of-the-art protection. Brian Krebs was hosted by Akamai, a very well-respected content delivery company who have huge resources – and for whom protecting against DDoS is a line of business. Dyn host the DNS for some of the world's largest Internet firms, and similarly are able to deploy huge resources to combat DDoS.

This looks an awful lot like someone testing out their botnet on some very well protected targets, before using it in earnest.

The identity of the attacker

It looks likely therefore that there are two possibilities for the attacker. Either it is (a) a criminal organisation looking to hire out their new botnet or (b) a state actor.

If it is a criminal organisation then right now they have the best botnet in the world. Nobody is able to combat this effectively.  Anyone who owns this can hire it out to the highest bidder, who can threaten to take entire countries off the Internet – or entire financial institutions.

A state actor is potentially as disturbing. Given the targets were in the US it is unlikely to be a western government that controls this botnet – but it could be one of dozens from North Korea to Israel, China, Russia, India, Pakistan or others.

As with many weapons, a botnet is most effective if used as a threat, and we may never know whether it is being used that way – or who the victims might be.

What should you do?

For an individual, DDoS attacks aren't the only risk from a compromised device. Anyone who can compromise one of these devices can get into your home network, which should give everyone pause – think about the private information you casually keep on your home computers.

So, take some care in the IoT devices you buy, and buy from reputable vendors who are likely to be taking care over their products. Unfortunately the devices most likely to be secure are also likely to be the most expensive.

One of the greatest things about the IoT is how cheap these devices are, and the capability they can provide at this low price. Many classes of device don’t necessarily even have reliable vendors working in that space. Being expensive and well made is no long-term protection – devices routinely go out of support after a few years and become liabilities.

Anything beyond this is going to require concerted effort on a number of fronts. Home router vendors need to build in capabilities for detecting compromised devices and disconnecting them. ISPs need to take more responsibility for the traffic coming from their networks. Until being compromised causes devices to malfunction for their owner there will be no incentives to improve them.

It is likely that the ultimate fix for this will be Moore’s Law – the safety net our entire industry has relied on for decades. Many of the reasons for IoT vulnerabilities are to do with their small amounts of memory and low computing power. When these devices can run more capable software they can also have the management interfaces and automated patching we’ve become used to on home computers.

 

The economics of innovation

One of the services we provide is innovation support. We help companies of all sizes when they need help with the concrete parts of developing new digital products or services for their business, or making significant changes to their existing products.

A few weeks ago the Royal Swedish Academy of Sciences awarded the Nobel Prize for Economics to Oliver Hart and Bengt Holmström for their work in contract theory. This prompted me to look at some of Holmström's previous work (for my sins, I find economics fascinating), and I came across his 1998 paper Agency Costs and Innovation. This is so relevant to some of my recent experiences that I wanted to share it.

Imagine you have a firm or a business unit and you have decided that you need to innovate.

This is a pretty common situation – you know strategically that your existing product is starting to lose traction. Maybe you can see commoditisation approaching in your sector. Or perhaps, as is often the case, you can see the Internet juggernaut bearing down on your traditional business and you know you need to change things up to survive.

What do you do about it? If you've been in this situation, what follows will probably resonate.


This is the classic principal-agent problem in economics: how does a principal (who wants something) incentivise an agent to do what they want? The agent, and the "contracting" being discussed here, could be any kind of contracting, including full-time staff.

A good example of the principal-agent problem is how you pay a surgeon. You want to reward their work, but you can't observe everything they do. The outcome of surgery depends on team effort, not just an individual. They have things they need to do other than surgery – developing standards, mentoring junior staff and so forth. Finally, the activity itself is inherently very high risk, which means surgeons will make mistakes no matter how competent they are. All of this means their salary would be at risk, so you would need to pay huge bonuses to encourage them to undertake the work at all.

In fact, firms will commonly try to innovate using their existing teams, who are delivering the existing product. These teams understand their market. They know the capabilities and constraints of existing systems. They have domain expertise and would seem to be the ideal place to go.

However, these teams have a whole range of tasks available to them (just as with our surgeon above), and choices in how they allocate their time. This is the "multitasking effect", and it is particularly problematic for innovative tasks.

My personal experience of this is that, when people have a choice between R&D-type work and "normal work", they will choose to do the normal work (all the while complaining that their work isn't interesting enough, of course).


This leads large firms to have separate R&D divisions – it allows R&D investment decisions to be made between options with some homogeneity of risk, which means incentives are more balanced.

However, large firms have a problem with bureaucratisation, and this is a particular problem when you wish to innovate.


Together this leads to a problem we've come across a number of times, where large firms have strong market incentives to spend on innovation – but find that their own internal incentive systems make this extremely challenging.

If you are experiencing these sorts of problems please do give us a call and see how we can help.

I am indebted to Kevin Bryan’s excellent A Fine Theorem blog for introducing me to Holmström’s work.

 

A new Isotoma Whitepaper: Chatbots

Over the last six months we’ve had a lot of interest from customers in the emerging area of chatbots, particularly ones using Facebook Messenger as a platform.

While bots have been around, in some form or other, for a very long time, the Facebook Messenger platform has catapulted them into prominence. Access to one billion of the world's consumers is a tempting prospect for many businesses.

In our new whitepaper we review the ecosystem that is emerging around chatbots and provide a guide to some of the factors you should consider if you are thinking about building and deploying them.


The contents include:

  • The history of chat interfaces
  • What conversational interfaces can do, and why
  • Natural Language Processing
  • Features provided by chatbot platforms
  • An in-depth review of eight of the top chatbot platforms
  • Recommendations for next steps, and a look to the future

Please, download the whitepaper, and let us know what you think.

 

Our plants need watering, part I

Here at Isotoma Towers we’ve recently started filling our otherwise spartan office with plants. Plants are lovely but they do require maintenance, and in particular they need timely watering.

Plants.

Since we’re all about automation here, we decided to use this as a test case for building some Internet of Things (IoT) devices.  One of my colleagues pointed out this great moisture sensor from Catnip (right).

This forms the basis of our design.

Catnip I2C soil moisture sensor

There are lots and lots of choices for how to build something like this, and this blog post is going to talk about design decisions.  See below the fold for more.

 
