Category Archives: Agile

SOMA in use during the 2017 Edinburgh Festival

Live video mixing with the BBC: Lessons learned

In this post I am going to reflect on some of the more interesting aspects of this project and the lessons they might provide for other projects.

This post is one of a series talking about our work on the SOMA video mixing application for the BBC. The previous posts in the series are:

  1. Building a live television video mixing application for the browser
  2. The challenges of mixing live video streams over IP networks
  3. Rapid user research on an Agile project
  4. Compositing and mixing video in the browser
  5. Taming the Async Beast with FRP and RxJS
  6. RxJS: An object lesson in terrible good software
  7. Video: The future of TV broadcasting
  8. Integrating UX with agile development

In my view there are three broad areas where this project has some interesting lessons.

Novel domains

First is the novel domain.

This isn’t unfamiliar – we often work in novel domains that we have little to no knowledge of. It is the nature of a technical agency, in fact – while we have some domains that we’ve worked in for many years, such as healthcare and education, there are always novel businesses with entirely new subjects to wrap our heads around. (To give you some idea, a few recent examples include store-and-forward television broadcasting, horse racing odds, medical curricula, epilepsy diagnosis, clustering automation and datacentre hardware provisioning.)

Over the years this has been the thing that I have most enjoyed out of every aspect of our work. Plunging into an entirely new subject with a short amount of time to understand it and make a useful contribution is exhilarating.

Although it might sound a bit insane to throw a team who know nothing about a domain at a problem, what we’re very good at is designing and building products. As long as our customers can provide the domain expertise, we can bring the product build. It is easier for us to learn the problem domain than it is for a domain expert to learn how to build great products.

The greatest challenge with a new domain is the assumptions. We all have these in our work – the things we think are so well understood that we don’t even mention them. These are a terrible trap for software developers, because we can spend weeks building completely the wrong thing with no idea that we’re doing so.

We were very lucky in this respect to be working with a technical organisation within the BBC: Research & Development. They were aware of this risk and did a very good job of arranging our briefing, which included a visit to a vision mixing gallery. This is the kind of exercise that delivers a huge amount in tacit understanding, and allows us to ask the really stupid questions in the right setting.

I think of the core problem as a “Rumsfeld”. Although he got a lot of criticism for these comments, I think they’re bizarrely insightful. There really are unknown unknowns, and what the hell do you do about them? You can often sense that they exist, but how do you turn them into known unknowns?

For many of these issues the challenge is not the answer, which is obvious once it has been found, but facilitating the conversation to produce the answer. It can be a long and frustrating process, but critical to success.

I’d encourage everyone to try and get the software team into the existing environment of the target stakeholder groups to try and understand at a fundamental level what they need.

The Iron Triangle

The timescale for this project was extraordinarily difficult – nine weeks from a standing start. In addition, much of the scope was quite fixed – we were largely building core functionality that, if missing, would have rendered the application useless. On top of that, we wanted to achieve the level of UX finish that we generally deliver.

This was extremely ambitious, and in retrospect we bit off more than we could reasonably chew.

Time is the greatest enemy of software projects because of the challenges in estimation. For reasons covered in a different blog post, estimation for software projects is somewhere between an ineffable art reserved only for the angels, and completely impossible.

[Diagram: a triangle with sides labelled Quality, Scope and Time]

When estimates are impossible, time becomes an even greater challenge. One of the truisms of our industry is the “Iron Triangle” of time, scope and quality. Like a good Chinese buffet, you can only choose two. If you want a fixed time and scope, it is quality that will suffer.

Building good software takes thought and planning. Also, the first version of a component is rarely the best – it takes time to assemble, then consider it, and then perhaps shape it into something near its final form.

Quality is, itself, an aggregate property. Haste lowers the standard of each part and so, by a process of multiplication, lowers the overall quality of the product far more. The only way to achieve a very high quality finished product is for every single part to be of similarly high quality. This is generally our goal.

However. Whisper it. It is possible to “manage” quality, if you understand your process and know the goal. Different kinds of testing can provide different levels of certainty of code quality. Manual testing, when done exhaustively, can substitute in some cases for perfection in code.

We therefore managed our quality, and I think actually did well here.

Asynchronous integration components had to be as close to perfect as possible, because any bugs would result in a general lack of stability that would be impossible to trace. The only way to build these is carefully, with a design, and the only way to test them is exhaustively, with unit and integration tests.

On the other hand, there were a lot of aspects of the UI where it was crucial that they performed and looked excellent, but the code could be rougher around the edges, and could just be hacked out. This was my area of the application, and my goal was to deliver features as fast as possible with just acceptable quality. Some of the code was quite embarrassing but we got the project over the line in the time, with the scope, and it all worked. This was sufficient for those areas.

Experimental technologies

I often talk about our approach using the concept of an innovation curve, and our position on it (I think I stole the idea from Ian Jindal – thanks Ian!).

Imagine a curve where the X axis is “how innovative your technologies are” and the Y axis is “pain”.

In practical terms this can be translated into “how likely I am to find the answer to my problems on Stack Overflow”.

At the very left, everything has been seen and done before, so there is no challenge from novelty – but you are almost certainly not making the most of available technologies.

At the far right, you are hand crafting your software from individual photons and you have to conduct high-energy physics experiments to debug your code. You are able to mould the entire universe to your whim – but it takes forever and costs a fortune.

There is no correct place to sit on this curve – where you sit is a strategic (and emotional) decision that depends on the forces at play in your particular situation.

Isotoma endeavours to be somewhere on the shoulder of the curve. The software we build generally needs to last 5+ years, so we can’t pick flash-in-the-pan technologies that will be gone in 18 months. But similarly we need to pick relatively recent technologies, so our work doesn’t become obsolete. This is sometimes called “leading edge” – almost bleeding edge, but not so close you get cut. With careful choice of tools it is possible to maintain a position like this successfully.

This BBC project was off to the right of this curve, far closer to the bleeding edge than we’d normally choose, and we definitely suffered.

Some of the technologies we had to use had some serious issues:

  1. To use IP Studio, a properly cutting-edge product developed internally by BBC R&D, we routinely had to read the C++ source code of the product to find answers to integration questions.
  2. We needed dozens of coordinated asynchronous streams running, for which we used RxJS. This was interesting enough to justify two posts on this blog on its own.
  3. WebRTC, which was the required delivery mechanism for the video, is absolutely not ready for this use case. The specification is unclear, browser implementation is incomplete and it is fundamentally unsuited at this time to synchronised video delivery.
  4. The video compositing technology in browsers actually works quite well, but it was entirely new to us and it took considerable time to gain sufficient expertise to do a good job. Also, browser implementations still have surprisingly sharp edges (only 16 WebGL contexts are allowed! Why 16? I dunno.)

Any one of these issues could have sunk our project, so I am very proud that we shipped good software in spite of all four.

Lessons learned? Task allocation is the key to this one I think.

One person, Alex, devoted his time to the IP Studio and WebRTC work for pretty much the entire project, and Ricey concentrated on video mixing.

Rather than try and skill up several people, concentrate the learning in a single brain. Although this is generally a terrible idea (because then you have a hard dependency on a single individual for a particular part of the codebase), in this case it was the only way through, and it worked.

Also, don’t believe any documentation, or in fact anything written in any human languages. When working on the bleeding edge you must “Use The Source, Luke”. Go to the source code and get your head around it. Everything else lies.

Summary

I am proud, justifiably I think, that we delivered this project successfully. It was used at the Edinburgh festival and actual real live television was mixed using our product, given all the constraints above.

The lessons?

  1. Spend the time and effort to make sure your entire team understand the tacit requirements of the problem domain and the stakeholders.
  2. Have an approach to managing appropriate quality that delivers the scope and timescale, if these are heavily constrained.
  3. Understand your position on the innovation curve and choose a strategic approach to managing this.

The banner image at the top of the article, taken by Chris Northwood, shows SOMA in use during the 2017 Edinburgh Festival.

Integrating UX with agile development

Incorporating user centred design practices within Agile product development can be a major challenge. Most of us in the user experience field are more familiar with the waterfall “big design up front” methodology. Project managers and developers are also likely to be more comfortable with a discrete UX design phase that is completed before development commences. But this approach tends to be slower, less efficient and more expensive. How does the role of the UX designer change within Agile product development, with its focus on transparency and rapid iteration?

While at Isotoma we’ve always followed our own flavour of Agile product development, UX is still mostly front-loaded in a “discovery” phase, as at most agencies. Our recent vision mixer project for BBC Research & Development, however, required a more integrated approach. The project had a very tight timeframe, requiring overlapping UX and development, with weekly show & tells.

From a UX perspective, it was a positive experience and I’m happy with the result. This post lists some of the techniques and approaches that I think helped integrate UX with Agile. Of course, every project and organisation is different, so there is definitely no one-size-fits-all approach, but hopefully there is something here you can use in your work.

Timesheets: some observations on observation

Just as a throwaway in my post on understanding your team’s progress I said something like “everyone hates timesheets”. And it’s true, they do. They’re onerous, boring and they’re usually seen as invasive, “big brother”-esque, make-work. But, as I also said in that post, good quality time recording is vital to understanding what’s going on within your teams.

Feeling the need

We first started looking at timesheet systems nine or ten years ago when it was becoming abundantly clear that we weren’t making the progress we were expecting on certain projects, but we didn’t know why.

The teams were skilled in the tools they were using, they were diligent, they’d done similar work before, but they just weren’t hitting the velocities that we had come to expect. On top of that, the teams themselves thought they were making good progress. And every which way we approached the problem we were missing the information needed to get to the bottom of the mismatch between expectation and reality.

At that point in the company’s life timesheets were anathema to us; we felt very strongly they indicated a lack of trust, and in a company built entirely on the principles behind the Agile Manifesto… Well… You can see our problem.

Build projects around motivated individuals.
Give them the environment and support they need,
and trust them to get the job done.

But however we cut it we really needed to understand what people were actually doing with their day. We trusted that if people thought they were making good progress then they were, but we definitely knew that we weren’t making the same kind of progress that we had been a year ago on the same types of project. And back then we were often on fixed price projects and billing by the day, so when projects started to overrun our financial performance started to dip and the quality of our code went the same way (for all the reasons I outlined in that previous post).

So we hit on Harvest (at the time one of the poster children of the burgeoning Rails SaaS community) and asked everyone to fill in their sheets for a couple of months so we could generate some data.

We had an all hands meeting, we explained exactly why we were doing it, and we asked, cajoled and bullied people into using it so that at least we had something to work on and perhaps uncover the problems we were hitting.

And of course we found it quickly enough, because accurate timesheets filled in honestly expose exactly what’s going on. By our nature we are both helpful and curious – that’s how we ended up doing what we’re doing. But helpful and curious is easily distracted; a colleague asking for help, an old customer with a quick question, a project manager from another project with an urgent request, the account management team asking “can you just…” And all of this added up. In the worst cases some people were only spending four hours a day on the project they were allocated to; the rest of their time spent helping colleagues and old customers… However, how you cope with these things is probably the subject of another post.

My point here is that once we had that data we realised how valuable it was and knew that we couldn’t go without it again. Our takeaway was that timesheets are a key part of a company’s introspection, and that without good data you don’t know the problem you’re actually trying to solve. And so we had to make timesheets part of our everyday processes.

Loving the alien

Like I said: people hate timesheets. They’re invasive. They’re time-consuming. They feel like you’re being watched, judged. They imply no trust. They’re alien to an agile environment. And the data they produce is a key part of someone else’s reporting, too. So how do you make sure they’re filled in accurately and honestly? And not just in month one, when you first introduce them, but in month fifty-seven, when your business relies on them and you may not be watching quite so closely.

We’ve found the following works for us:

  • Make it crystal clear what they’re for, and what they’re not
  • Make it explicit that timesheets are for tracking the performance of estimates and ensuring that progress can be reported accurately
  • It’s not about how much you do, but how much got done
  • Tie them together with things like iDoneThis, so that people can give context to their timesheets in an informal unstructured manner
  • Make sure that everyone who uses the data throughout the management chain is incentivised to treat it honestly – this means your project managers mustn’t feel the need to manipulate it or, worse, manipulate how it’s entered (we’ve seen this more than once in other organisations)

And Dan, one of our project managers, sends round a gentle chivvying email each evening (filled with the day’s fun facts, of course) to make sure that people actually fill them in.

[Photo by Sabri Tuzcu on Unsplash]

Taming the Async Beast with FRP and RxJS

The Problem

We’ve recently been working on an in-browser vision mixer for the BBC (previous blog posts here, here, here, and here). Live vision mixing involves keeping track of a large number of interdependent data streams. Our application receives timing data for video tapes and live video streams via webrtc data channels and websocket connections, and we’re sending video and audio authoring decisions over other websockets to the live rendering backend.

Many of the data streams we’re handling are interdependent: we don’t want to send an authoring decision to the renderer to cut to a video tape until that tape is loaded and ready to play, so we have to hold the decision back until then; and if the authoring websocket has closed we’ll need to reconnect to it and then retry sending that authoring decision.

Orchestrating interdependent asynchronous data streams is a fundamentally complex problem.

Promises are one popular solution for composing asynchronous operations and safely transforming the results, however they have a number of limitations. The primary issue is that they cannot be cancelled, so we need to handle teardown separately somehow. We could use the excellent fluture or Task Future libraries instead, both of which support cancellation (and are lazy and chainable and fantasy-land compliant), but futures and promises handle one single future value (or error), not a stream of many values. The team working on this project are fans of futures (less so of promises) and were aiming to write the majority of the codebase in a functional style using folktale and ramda (and react-redux), so we wanted a functional, composable way to handle ongoing streams of data that could sit comfortably within the rest of the codebase.

A Solution

After some debate, we decided to use FRP (functional reactive programming) powered by the observable pattern. Having used RxJS (with redux-observable) for smaller projects in the past, we were confident that it could be an elegant solution to our problem. You can find out more about RxJS here and here but, in short, it’s a library that allows subscribers to listen to and transform the output of a data stream as per the observer pattern, and allows the observable (the thing subscribed to) to “complete” its stream when it runs out of data (or whatever), similar to an iterator from the iterator pattern. Observables also allow their subscribers to terminate them at any point, and typically observables will encapsulate teardown logic related to their data source – a websocket, long-poll, webrtc data channel, or similar.
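As a small illustration of that last point, here is a minimal sketch – in the RxJS 5 dot-chaining style used in the examples below, with placeholder names and URL rather than anything from SOMA – of wrapping a websocket in an observable that encapsulates its own teardown:

import { Observable } from 'rxjs/Rx'; // RxJS 5

const authoringMessages$ = Observable.create(observer => {
  const ws = new WebSocket('wss://example.invalid/authoring'); // illustrative URL only
  ws.onmessage = msg => observer.next(JSON.parse(msg.data));
  ws.onerror = err => observer.error(err);
  ws.onclose = () => observer.complete();
  // Teardown logic: runs when the subscriber unsubscribes (or on error/completion)
  return () => ws.close();
});

Whoever subscribes gets the messages; whoever unsubscribes closes the socket, so the connection’s lifetime is tied to the subscription rather than tracked by hand.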

RxJS implements the observer pattern in a functional way that allows developers to compose together observables, just as they’d compose functions or types. RxJS has its roots in functional reactive programming and leverages the power of monadic composition to chain together streams while also ensuring that teardown logic is preserved and handled as you’d expect.

Why FRP and Observables?

The elegance and power of observables are much more easily demonstrated than explained in a wordy paragraph. I’ll run through the basics and let your imagination think through the potential of it all.

A simple RxJS observable looks like this:

Observable.of(1, 2, 3)

It can be subscribed to as follows:

Observable.of(1, 2, 3).subscribe({
  next: val => console.log(`Next: ${val}`),
  error: err => console.error(err),
  complete: () => console.log('Completed!')
});

Which would emit the following to the console:

Next: 1
Next: 2
Next: 3
Completed!

We can also transform the data just as we’d transform values in an array:

Observable.of(1, 2, 3).map(x => x * 2).filter(x => x !== 4).subscribe(...)
2
6
Completed!

Observables can also be asynchronous:

Observable.interval(1000).subscribe(...)
0 [a second passes]
1 [a second passes]
2 [a second passes]
...

Observables can represent event streams:

Observable.fromEvent(window, 'mousemove').subscribe(...)
[Event Object]
[Event Object]
[Event Object]

Which can also be transformed:

Observable.fromEvent(window, 'mousemove')
  .map(ev => [ev.clientX, ev.clientY])
  .subscribe(...)
[211, 120]
[214, 128]
[218, 139]
...

We can cancel the subscriptions which will clean up the event listener:

const subscription = Observable.fromEvent(window, 'mousemove')
  .map(ev => [ev.clientX, ev.clientY])
  .subscribe(...)

subscription.unsubscribe();

Or we can unsubscribe in a dot-chained functional way:

Observable.of(1, 2, 3)
  .take(2)  // After receiving two values, complete the observable early
  .subscribe(...)
1
2
Completed!

Observable.fromEvent(window, 'mousemove')
  .map(ev => [ev.clientX, ev.clientY])
   // Stop emitting when the user clicks
  .takeUntil(Observable.fromEvent(window, 'click'))
  .subscribe(...)

Note that those last examples left no variables lying around. They are entirely self-contained bits of functionality that clean up after themselves.

Many common asynchronous stream use-cases are catered for natively, in such a way that the “operators” (the observable methods e.g. “throttle”, “map”, “delay”, “filter”) take care of all of the awkward state required to track emitted values over time.

Observable.fromEvent(window, 'mousemove')
  .map(...)
  .throttle(1000) // only allow one event through per second
  .subscribe(...);

… and that’s barely scratching the surface.

The Benefits

Many of the benefits of RxJS are the benefits of functional programming: the avoidance of state, and the readability and testability of short, pure functions. By encapsulating the side-effects associated with your application in a generic, composable way, developers can maximise the reusability of the asynchronous logic in their codebase.

By seeing the application as a series of data transformations between the external application interfaces, we can describe those transformations by composing short, pure functions and lazily applying data to them as it is emitted in real-time.

Messy, temporary, imperative variables are replaced by functional closures that give observables access to previously emitted values in a localised way, limiting the amount of application logic and state a developer must hold in their head at any given time.
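For instance, where imperative code would keep a lastPosition variable up to date by hand, an operator like pairwise hands each subscriber the previous and current values together. A small illustrative sketch (not code from the project):

Observable.fromEvent(window, 'mousemove')
  .map(ev => [ev.clientX, ev.clientY])
  .pairwise() // emits [previousPosition, currentPosition] with no mutable bookkeeping
  .map(([prev, curr]) => Math.hypot(curr[0] - prev[0], curr[1] - prev[1]))
  .subscribe(distance => console.log(`Moved ${distance.toFixed(1)}px since last event`));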

Did It Work?

Sort of. We spent a lot of our time in a state of low-level fury at RxJS, so much so that we’ve written up a long list of complaints in another post.

There are some good bits though:

FRP and the observable pattern are both transformative approaches to writing complex asynchronous javascript code, producing fewer bugs and drastically improving the reusability of our codebase.

RxJS operators can encapsulate extremely complex asynchronous operations and elegantly describe dependencies in a terse, declarative way that leaves no state lying around.

In multiple standups throughout the project we’ve enthusiastically raved about how these operators have turned a fundamentally complex part of our implementation into a two line solution. Sure those two lines usually took a long time to craft and get right, but once working, it’s difficult to write many bugs in just two lines of code (when compared to the hundreds of lines of imperative code we’d otherwise need to write if we rolled our own).
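To give a flavour of what that looks like, here is a hypothetical sketch of the tape-cut dependency from the start of this post – hold each cut decision back until that tape reports it is ready, then send it. The stream and function names are illustrative, not the real codebase:

cutDecisions$
  .delayWhen(decision => tapeReady$.filter(tape => tape.id === decision.tapeId).take(1))
  .subscribe(decision => sendAuthoringDecision(decision));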

That said, RxJS is a functional approach to writing code so developers should expect to incur a penalty if they’re new to the paradigm as they go from an imperative, object-oriented approach to system design to a functional, data-flow-driven approach instead. There is also a very steep learning curve required to feel the benefits of RxJS as developers familiarise themselves with the toolbox and the idiosyncrasies.

Would We Use It Again?

Despite the truly epic list of shortcomings, I would still recommend an FRP approach to complex async javascript projects. In future we’ll be trying out most.js to see if it solves the myriad of problems we found with RxJS. If it doesn’t, I’d consider implementing an improved Observable that keeps its hands off my errors.

It’s also worth mentioning that we used RxJS with react-redux to handle all redux side-effects. We used redux-observable to achieve this and it was terrific. We’ll undoubtedly be using redux-observable again.
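For anyone curious, an “epic” in redux-observable is just a function from a stream of actions to a stream of actions. The canonical ping/pong example from its documentation (not code from this project) looks like this:

const pingEpic = action$ =>
  action$.ofType('PING')
    .delay(1000)               // asynchronous side-effects live inside the epic
    .mapTo({ type: 'PONG' });  // which emits new actions back into the store

Reducers stay pure, and all the awkward asynchronous glue lives in small, composable epics.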

 

Screenshot of SOMA vision mixer

Compositing and mixing video in the browser

This blog post is the 4th part of our ongoing series working with the BBC Research & Development team. If you’re new to this project, you should start at the beginning!

Like all vision mixers, SOMA (Single Operator Mixing Application) has a “preview” and “transmission” monitor. Preview is used to see how different inputs will appear when composed together – in our case, a video input, a “lower third” graphic such as a caption which fades in and out, and finally a “DOG” such as a channel or event identifier shown in the top corner throughout a broadcast.

When switching between video feeds SOMA offers a fast cut between inputs or a slower mix between the two. As and when edit decisions are made, the resulting output is shown in the transmission monitor.

The problem with software

However, one difference with SOMA is that all the composition and mixing is simulated. SOMA is used to build a set of edit decisions which can be replayed later by a broadcast-quality renderer. The transmission monitor is not simply a view of the output after the effects have been applied, because the actual rendering of the edit decisions hasn’t happened yet. The app needs to provide an accurate simulation of what the edit decisions will look like.

The task of building this required breaking down how output is composed – during a mix both the old and new input sources are visible, so six inputs are required.

VideoContext to the rescue

Enter VideoContext, a video scheduling and compositing library created by BBC R&D. This allowed us to represent each monitor as a graph of nodes, with video nodes playing each input into transition nodes allowing mix and opacity to be varied over time, and a compositing node to bring everything together, all implemented using WebGL to offload video processing to the GPU.
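As a rough sketch of what such a graph looks like in code – based on the public VideoContext documentation rather than SOMA itself, so the sources, element ids and parameters here are illustrative:

const canvas = document.querySelector('#preview-monitor'); // hypothetical canvas element
const ctx = new VideoContext(canvas);

const cameraA = ctx.video('camera-a.mp4');   // video source nodes
const cameraB = ctx.video('camera-b.mp4');
const crossfade = ctx.transition(VideoContext.DEFINITIONS.CROSSFADE); // a transition node

cameraA.connect(crossfade);
cameraB.connect(crossfade);
crossfade.connect(ctx.destination);          // the compositing/output end of the graph

cameraA.start(0);
cameraB.start(0);
crossfade.transition(2, 3, 0.0, 1.0, 'mix'); // mix from A to B between t=2s and t=3s
ctx.play();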

The flexible nature of this library allowed us to plug in our own WebGL scripts to cut the lower third and DOG graphics out using chroma-keying (where a particular colour is declared to be transparent – normally green), and with a small patch to allow VideoContext to use streaming video we were off and going.

Devils in the details

The fiddly details of how edits work were as fiddly as expected: tracking the mix between two video inputs versus the opacity of two overlays appeared to be similar problems but required different solutions. The nature of the VideoContext graph meant we also had to keep track of which node was current rather than always connecting the current input to the same node. We put a lot of unit tests around this to ensure it works as it should now and in future.

By comparison, the seemingly tricky problem of what to do if a new edit decision was made while a mix was in progress was just a case of swapping out the new input, to avoid the old input reappearing unexpectedly.

QA testing revealed a subtler problem: when switching to a new input, the video takes a few tens of milliseconds to start. Cutting immediately causes a distracting flicker as a couple of blank frames are rendered – waiting until the video is ready adds a slight delay, but this is significantly less distracting.
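One plausible shape for that fix (an illustrative sketch, not the actual SOMA code) is to defer the cut until the incoming video element reports it has enough data to play:

function cutWhenReady(videoElement, performCut) {
  if (videoElement.readyState >= HTMLMediaElement.HAVE_FUTURE_DATA) {
    performCut(); // already ready: cut immediately
  } else {
    // otherwise wait for the browser to say playback can start, then cut exactly once
    videoElement.addEventListener('canplay', () => performCut(), { once: true });
  }
}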

Later in the project a new requirement emerged to re-frame videos within the application and the decision to use VideoContext paid off as we could add an effect node into the graph to crop and scale the video input before mixing.

And finally

VideoContext made the mixing and compositing operations a lot easier than they would have been otherwise. Towards the end we even added an image source (for paused VTs) using the new experimental Chrome feature captureStream, and that worked really well.
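For the curious, the idea behind that image source is roughly this (a sketch only – the element id is made up, and in 2017 captureStream was still experimental and Chrome-only):

const pausedVt = document.querySelector('#vt-preview');  // a paused <video> holding the VT
const stillFrameStream = pausedVt.captureStream();       // expose it as a MediaStream source
// ...which can then be fed into the preview graph like any other video input.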

After making it all work, the obvious point of possible concern is performance, and overall it works pretty well. We needed to have half a dozen or so VideoContexts running at once, and this was effective on a powerful machine. Many more and the computer really starts struggling.

Even a few years ago attempting this in the browser would have been madness, so it’s great to see so much progress in something so challenging, opening up a whole new range of software that can now work in the browser!

Read part 5 of this project with BBC R&D where Developer Alex Holmes talks about Taming async with FRP and RxJS.

Sign pointing to Usability Lab

Rapid user research on an Agile project

Our timeline to build an in-browser vision mixer for BBC R&D (previously, previously) is extremely tight – just two months. UX and development run concurrently in Agile fashion (a subject for a future blog post), but design was largely done within the first month.

Too frequently for projects on such timescales there is pressure to omit user testing in the interest of expediency. One could say it’s just a prototype, and leave it until the first trials to see how it performs, and hopefully get a chance to work the learnings into a version 2. Or, since we have weekly show & tell sessions with project stakeholders, one could argue complacently that as long as they’re happy with what they’re seeing, the design’s on track.

Why test?

But the stakeholders represent our application’s target users only slightly better than ourselves, which is not very well – they won’t be the ones using it. Furthermore, this project aims to broaden the range of potential operators – from what used to be the domain of highly experienced technicians, to something that could be used by a relative novice within hours. So I wanted to feel confident that even people who aren’t familiar with the project would be able to use it – both experts and novices. I’m not experienced in this field at all, so I was making lots of guesses and assumptions, and I didn’t want to go too far before finding out they were wrong.

One of the best things about working at the BBC is the ingrained culture of user centred design, so there was no surprise at the assumption that I’d be testing paper prototypes by the 2nd week. Our hosts were very helpful in finding participants within days – and with 100s of BBC staff working at MediaCity there is no danger of using people with too much knowledge of the project, or re-using participants. Last but not least, BBC R&D has a fully equipped usability lab – complete with two-way mirror and recording equipment. Overkill for my purposes – I would’ve managed with an ordinary office – but having the separate viewing room helped ensure that I got the entire team observing the sessions without crowding my subject. I’m a great believer in getting everyone on the project team seeing other people interact with and talk about the application.

Paper prototypes

[Image: annotated paper prototyping test script]

Paper prototypes are A3 printouts of the wireframes, each representing a state of the application. After giving a brief description of what the application is used for, I show the page representing the application’s initial state, and change the pages in response to user actions as if it were the screen. (Users point to what they would click.) At first, I ask task-based questions: “add a camera and an audio source”; “create a copy of Camera 2 that’s a close-up”; etc. As we linger on a screen, I’ll probe more about their understanding of the interface: “How would you change the keyboard shortcut for Camera 1?”; “What do you think Undo/Redo would do on this screen?”; “What would happen if you click that?”; and so on. It doesn’t matter that the wireframes are incomplete – when users try to go to parts of the application that haven’t been designed yet, I ask them to describe what they expect to see and be able to do there.

In all, I did paper prototype testing with 6 people in week 2, and with a further 3 people in week 3. (With qualitative testing even very few participants tend to find the major issues.) In keeping with the agile nature of the project, there was no expectation of me producing a report of findings that everyone would read, although I do type up my notes in a shared document to help fix them in my memory. Rather, my learnings go straight into the design – I’m usually champing at the bit to make the changes that seem so obvious after seeing a person struggle, feeling really happy to have caught them so early on. Fortunately, user testing showed that the broad screen layout worked well – the main changes were to button labels, icon designs, and generally improved affordances.

Interactive prototypes

By week 4 my role had transitioned into front-end development, in which I’m responsible for creating static HTML mockups with the final design and CSS, which the developers use as reference markup for the React components. While this isn’t mainstream practice in our industry, I find it has numerous advantages, especially for an Agile project, as it enables me to leave the static, inexact medium of wireframes behind and refine the design and interaction directly within the browser. (I add some dynamic interactivity using jQuery, but this is throwaway code for demo purposes only.)

The other advantage of HTML mockups is that they afford us an opportunity to do interactive user testing using a web browser, well before the production application is stable enough to test. Paper prototyping is fine up to a point, but it has plenty of limitations – for example, you can’t scroll, there are no mouseover events, you can’t resize the screen, etc.

So by week 5 I was able to test nearly all parts of the application, in the browser, with 11 users. (This included two groups of 4, which worked better than I expected – one person manning the mouse and keyboard, but everyone in the group thinking out loud.) It was really good being able to see the difference that interactivity made, such as hover states, and seeing people actually trying to click or drag things rather than just saying what they’d do gave me an added level of confidence in my findings. Again, immediately afterwards, I made several changes that I’m confident improve the application – removing a redundant button that never got clicked, adding labels to some icons, strengthening a primary action by adding an icon, among others. Not to mention fixing numerous technical bugs that came up during testing. (I use Github comments to ensure developers are aware of any HTML changes to components at this stage.)

Never stop testing

Hopefully we’ll have time for another round of testing with the production application. This should give a more faithful representation of the vision mixing workflow, since in the mockups the application is always in the same state, using dummy content. With every test we can feel more confident – and our stakeholders can feel more confident – that what we’re building will meet its goals, and make users productive rather than frustrated. And on a personal level, I’m just relieved that we won’t be launching with any of the embarrassing gotchas that cropped up and got fixed during testing.

Read part 4 of this project working with BBC R&D where we talk about compositing and mixing video in the browser.

The challenges of mixing live video streams over IP networks

Welcome to our second post on the work we’re doing with BBC Research & Development. If you’ve not read the first post, you should go read that first 😉

Introducing IP Studio


The first part of the infrastructure we’re working with here is something called IP Studio. In essence this is a platform for discovering, connecting and transforming video streams in a generic way, using IP networking – the standard on which pretty much all Internet, office and home networks are based.

Up until now video cameras have used very simple standards such as SDI to move video around. Even though SDI is digital, it’s just point-to-point – you connect the camera to something using a cable, and there it is. The reason for the remarkable success of IP networks, however, is their ability to connect things together over a generic set of components, routing between connecting devices. Your web browser can get messages to and from this blog over the Internet using a range of intervening machines, which is actually pretty clever.

Doing this with video is obviously in some senses well-understood – we’ve all watched videos online. There are some unique challenges with doing this for live television though!

Why live video is different

First, you can’t have any buffering: this is live. It’s unacceptable for everyone watching TV to see a buffering message because the production systems aren’t quick enough.

Second is quality. These are 4K streams, not typical internet video resolution. 4K streams have (roughly) 4000 horizontal pixels compared to the (roughly) 2000 of a 1080p stream (weirdly, 1080p, 720p etc. are named for their vertical pixel counts instead). With twice the pixels in each dimension, that’s about four times as many pixels and so about four times as much bandwidth – which even in 2017 is quite a lot. Specialist networking kit and a lot of processing power is required.
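To put rough numbers on that – a back-of-envelope sketch only, since real figures depend on frame rate, bit depth and chroma subsampling:

// Uncompressed bandwidth, assuming 50fps and 20 bits per pixel (10-bit 4:2:2)
const bitsPerPixel = 20;
const fps = 50;
const gbps = (width, height) => (width * height * bitsPerPixel * fps) / 1e9;

console.log(gbps(1920, 1080).toFixed(1)); // ≈ 2.1 Gbit/s for 1080p
console.log(gbps(3840, 2160).toFixed(1)); // ≈ 8.3 Gbit/s for UHD “4K” – four times as much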

Third is the unique requirements of production – we’re not just transmitting a finished, pre-prepared video, but all the components from which to make one: multiple cameras, multiple audio feeds, still images, pre-recorded video. Everything you need to create the finished live product. This means that to deliver a final product you might need ten times as much source material – which is well beyond the capabilities of any existing systems.

IP Studio addresses this with a cluster of powerful servers sitting on a very high speed network. It allows engineers to connect together “nodes” to form processing “pipelines” that deliver video suitable for editing. This means capturing the video from existing cameras (using SDI) and transforming them into a format which will allow them to be mixed together later.

It’s about time

That sounds relatively straightforward, except for one thing: time. When you work with live signals on traditional analogue or point-to-point digital systems, then live means, well, live. There can be transmission delays in the equipment but they tend to be small and stable. A system based on relatively standard hardware and operating systems (IP Studio uses Linux, naturally) is going to have all sorts of variable delays in it, which need to be accommodated.

IP Studio is therefore based on “flows” comprising “grains”. Each grain has a quantum of payload (for example a video frame) and timing information. The timing information allows multiple flows to be combined into a final output where everything happens appropriately in synchronisation. This might sound easy but is fiendishly difficult – some flows will arrive later than others, so systems need to hold back some of them until everything is running to time.
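Purely as an illustration of the idea – this is not the real IP Studio or NMOS wire format – a grain can be pictured as a small envelope of payload plus timestamps, which downstream nodes use to line flows up:

const grain = {
  flowId: 'flow-camera-1',                  // which flow this grain belongs to (made-up id)
  originTimestamp: '1500000000:200000000',  // seconds:nanoseconds when the frame was captured
  syncTimestamp: '1500000000:200000000',    // used to align this grain with grains in other flows
  payload: new Uint8Array(0)                // stand-in for one video frame or a chunk of audio
};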

To add to the complexity, we need two versions of the stream, one at 4k and one at a lower resolution.

Don’t forget the browser

Within the video mixer we’re building, we need the operator to be able to see their mixing decisions (cutting, fading etc.) happening in front of them in real time. We also need to control the final transmitted live output. There’s no way a browser in 2017 is going to show half-a-dozen 4k streams at once (and it would be a waste to do so). This means we are showing lower resolution 480p streams in the browser, while sending the edit decisions up to the output rendering systems which will process the 4k streams, before finally reducing them to 1080p for broadcast.

So we’ve got half-a-dozen 4k streams, and 480p equivalents, still images, pre-recorded video and audio, all being moved around in near-real-time on a cluster of commodity equipment from which we’ll be delivering live television!

Read part 3 of this project with BBC R&D where we delve into rapid user research on an Agile project.

MediaCity UK offices

Building a live television video mixing application for the browser

This is the first in a series of posts about some work we are doing with BBC Research & Development.

The BBC now has, in the lab, the capability to deliver live television using high-end commodity equipment direct to broadcast, over standard IP networks. What we’ve been tasked with is building one of the key front-end applications – the video mixer. This will enable someone to mix an entire live television programme, at high quality, from within a standard web-browser on a normal laptop.

In this series of posts we’ll be talking in great depth about the design decisions, implementation technologies and opportunities presented by these platforms.

What is video mixing?

Video editing used to be a specialist skill requiring very expensive, specialist equipment. Like most things this has changed because of commodity, high-powered computers and now anyone can edit video using modestly priced equipment and software such as the industry standard Adobe Premiere. This has fed the development of services such as YouTube where 300 hours of video are uploaded every minute.

“Video Mixing” is the activity of getting the various different videos and stills in your source material and mixing them together to produce a single, linear output. It can involve showing single sources, cutting and fading between them, compositing them together, showing still images and graphics and running effects over them. Sound can be similarly manipulated. Anyone who has used Premiere, even to edit their family videos, will have some idea of the options involved.

Live television is a very different problem

First you need to produce high-fidelity output in real time. If you’ve ever used something like Premiere you’ll know that when you finally render your output it can take quite a long time – it can easily spend an hour rendering 20 minutes of output. That would be no good if you were broadcasting live! This means the technology used is very different – you can’t just use commodity hardware, you need specialist equipment that can work with these streams in realtime.

Second, the capacity for screw-ups is immensely higher. Any mistakes in a live broadcast are immediately apparent, and potentially tricky to correct. It is a high-stress environment, even for experienced operators.

Finally, the range of things you might choose to do is much more limited, because you can spend little time setting it up. This means live television tends to use a far smaller ‘palette’ of mixing operations.

Even then, a live broadcast might require half a dozen people even for a modest production. You need someone to set up the cameras and control them, a sound engineer to get the sound right, someone to mix the audio, a vision mixer, a VT Operator (to run any pre-recorded videos you insert – perhaps the titles and credits) and someone to set up the still image overlays (for example, names and logos).

If that sounds bad, imagine a live broadcast away from the studio – the Outside Broadcast. All the people and equipment need to be on site, hence the legendary “OB Van”:

[Photo: an outside broadcast van]

Inside one of those vans is the equipment and people needed to run a live broadcast for TV. They’d normally transmit the final output directly to air by satellite – which is why you generally see a van with a massive dish on it nearby. This equipment runs into millions and millions of pounds and can’t be deployed on a whim. When you only have a few channels of course you don’t need many vans…

The Internet Steamroller

The Internet is changing all of this. Services like YouTube Live and Facebook Live mean that anyone with a phone and decent coverage can run their own outside broadcast. Where once you needed a TV network and millions of pounds of equipment now anyone can do it. Quality is poor and there are few options for mixing, but it is an amazingly powerful tool for citizen journalism and live reporting.

Also, the constraints of “channels” are going. Where once there was no point owning more OB Vans than you have channels, now you could run dozens of live feeds simultaneously over the Internet. As the phone becomes the first screen and the TV in the corner turns into just another display many of the constraints that we have taken for granted look more and more anachronistic.

These new technologies provide an opportunity, but also some significant challenges. The major one is standards – there is a large ecosystem of manufacturers and suppliers whose equipment needs to interoperate. The standards used, such as SDI (Serial Digital Interface) have been around for decades and are widely supported. Moving to an Internet-based standard needs cooperation across the industry.

BBC R&D has been actively working towards this with their IP Studio project, and the standards they are developing with industry for Networked Media.

Read part 2 of this project with BBC R&D where I’ll describe some of the technologies involved, and how we’re approaching the project.

Links

There is a new version of gunicorn, 19.0, which has a couple of significant changes, including some interesting workers (gthread and gaiohttp) and actually responding to signals properly, which will make it work with Heroku.

The HTTP RFC, 2616, is now officially obsolete. It has been replaced by a bunch of RFCs from 7230 to 7235, covering different parts of the specification. The new RFCs look loads better, and it’s worth having a look through them to get familiar with them.

Some kind person has produced a recommended set of SSL directives for common webservers, which provide an A+ on the SSL Labs test, while still supporting older IEs. We’ve struggled to find a decent config for SSL that provides broad browser support, whilst also having the best levels of encryption, so this is very useful.

A few people are still struggling with Git.  There are lots of git tutorials around the Internet, but this one from Git Tower looks like it might be the best for the complete beginner. You know it’s for noobs, of course, because they make a client for the Mac 🙂

I haven’t seen a lot of noise about this, but the EU has outlawed pre-ticked checkboxes.  We have always recommended that these are not used, since they are evil UX, but now there’s an argument that might persuade everyone.

Here is a really nice post about splitting user stories. I think we are pretty good at this anyhow, but this is a nice way of describing the approach.

@monkchips gave a talk at IBM Impact about the effect of Mobile First. I think we’re on the right page with most of these things, but it’s interesting to see mobile called-out as one of the key drivers for these changes.

I’d not come across the REST Cookbook before, but here is a decent summary of how to treat PUT vs POST when designing RESTful APIs.

Fastly have produced a spectacularly detailed article about how to get tracking cookies working with Varnish.  This is very relevant to consumer facing projects.

This post from ThoughtWorks is absolutely spot on, and I think accurately describes an important aspect of testing: The Software Testing Cupcake.

As an example of how to make unit tests less fragile, this is a decent description of how to isolate tests, which is a key technique. The examples are Ruby, but the principle is valid everywhere.

Still on unit testing, Facebook have open sourced a Javascript unit testing framework called Jest. It looks really very good.

A nice implementation of “sudo mode” for Django. This ensures the user has recently entered their password, and is suitable for protecting particularly valuable assets in a web application like profile views or stored card payments.

If you are using Redis directly from Python, rather than through Django’s cache wrappers, then HOT Redis looks useful. This provides atomic operations for compound Python types stored within Redis.

Just Enough Documentation: an ideal web design process

A recent project reminded me how much I like the “just enough documentation” approach, and how inefficiently web design is often done.

Tasked with redesigning an Agile project management web application (to improve usability but not re-brand), I used the following process:

1. Paper sketching for a day or so to get my ideas straight about page and widget layout. This is the best time to explore wildly different alternatives, and make most of the big decisions.

2. Wireframes in Omnigraffle of the key pages. Since the visual style wasn’t changing, I could do high-fidelity wireframes closely resembling the final design. I work on a large format, so there’s ample space to overlay notes and state changes, as well as alternative ideas. The wireframes were shared with the client on a continuous basis for feedback.

Lots of things weren’t wireframed, including less critical pages, and many dynamic states.

3. I did only two pages in Photoshop, in which I did just enough to give me my final measurements, and the few image elements that needed cutting out. I refined designs a little from the wireframes, but most of this I left to the next stage. The wireframes and this stage together took about 5 days.

4. HTML and CSS. Because in my design I was switching from a fixed-width layout to a liquid layout, you can only really appraise the design once you implement it, so it was imperative to get to this stage as quickly as possible. It is also far, far more efficient to fine-tune margins, typography and colours in CSS than in Photoshop. Furthermore, I could quickly mock up most dynamic interactions – of which this app has plenty – in jQuery (which I had not used before but was ridiculously easy to pick up.) How much more effective than doing so in wireframes, where you (and the client) have to imagine how it will “feel” to the user and you can never be sure whether it will work?

By far the bulk of my time – about 2 weeks – was spent working in HTML and CSS. Pages or elements that weren’t wireframed were much quicker to design directly as HTML and CSS, without the duplication of effort. New ideas inevitably arise during this stage, and are immediately incorporated. I occasionally went back to paper or Photoshop or Omnigraffle to work out small details, but there was no need to keep those design documents “up to date” – they had served their purpose. The client was able to appraise the designs exactly as it renders in the browser, and the resulting artifact was also ready to be integrated into their back-end.

Not only could the client and I be confident about how the design will actually look once implemented, I could also ensure that accessibility measures were built in, as well as the ability for it to be easily re-skinned. (As the app is sometimes white-labelled.)

Conclusion

If I reflect on the design process on many past projects, most seem ridiculously inefficient. Where UX designers had to exhaustively wireframe every single page and every possible interaction, with complex state diagrams for all the dynamic elements. And to keep the wireframes up to date throughout the course of the project. Where designers had to mock up dozens of pages in Photoshop, with everything finalised to the last pixel (but usually omitting hover effects and never considering liquid layout or browser differences). All the foregoing needing to be done before a client will sign it off and a single line of markup can be written. The front-end developer reduced to an automaton incapable of taste or decisions, who just had to implement the Photoshop mockups to the pixel, and all interactions exactly as specified in the wireframes. It was usually impossible to refine aspects of design or interaction at this stage (everything having been signed off), let alone adding new ideas, despite it being the most fluid.

Obviously this won’t suit all projects or clients. For example, I had the advantage of not starting from scratch, and a client who was prepared to work in this way. It was a relatively small project with few stakeholders. On some projects there may be good reasons for an exhaustively documented waterfall process. But very often in my experience it was simply because it was the only way the agency knew how to work, due to the excessive compartmentalisation of skills, and an inadequate understanding of the meaning of web design. And my, what a lot of time and money it wastes.