Tag Archives: BBC

Screenshot of SOMA vision mixer

Compositing and mixing video in the browser

This blog post is the 4th part of our ongoing series working with the BBC Research & Development team. If you’re new to this project, you should start at the beginning!

BBC R&D logoLike all vision mixers, SOMA (Single Operator Mixing Application) has a “preview” and “transmission” monitor. Preview is used to see how different inputs will appear when composed together – in our case, a video input, a “lower third” graphic such as a caption which fades in and out, and finally a “DOG” such as a channel or event identifier shown in the top corner throughout a broadcast.

When switching between video feeds SOMA offers a fast cut between inputs or a slower mix between the two. As and when edit decisions are made, the resulting output is shown in the transmission monitor.

The problem with software

However, one difference with SOMA is that all the composition and mixing is simulated. SOMA is used to build a set of edit decisions which can be replayed later by a broadcast quality renderer. The transmission monitor is not just a view of the output after the effects have been applied as the actual rendering of the edit decisions hasn’t happened yet. The app needs to provide an accurate simulation of what the edit decision will look like.

The task of building this required breaking down how output is composed – during a mix both the old and new input sources are visible, so six inputs are required.

VideoContext to the rescue

Enter VideoContext, a video scheduling and compositing library created by BBC R&D. This allowed us to represent each monitor as a graph of nodes, with video nodes playing each input into transition nodes allowing mix and opacity to be varied over time, and a compositing node to bring everything together, all implemented using WebGL to offload video processing to the GPU.

The flexible nature of this library allowed us to plug in our own WebGL scripts to cut the lower third and DOG graphics out using chroma-keying (where a particular colour is declared to be transparent – normally green), and with a small patch to allow VideoContext to use streaming video we were off and going.

Devils in the details

The fiddly details of how edits work were as fiddly as expected: tracking the mix between two video inputs versus the opacity of two overlays appeared to be similar problems but required different solutions. The nature of the VideoContext graph meant we also had to keep track of which node was current rather than always connecting the current input to the same node. We put a lot of unit tests around this to ensure it works as it should now and in future.

By comparison a seemingly tricky problem of what to do if a new edit decision was made while a mix was in progress was just a case of swapping out the new input, to avoid the old input reappearing unexpectedly.

QA testing revealed a subtler problem that when switching to a new input the video takes a few tens of milliseconds to start. Cutting immediately causes a distracting flicker as a couple of blank frames are rendered – waiting until the video is ready adds a slight delay but this is significantly less distracting.

Later in the project a new requirement emerged to re-frame videos within the application and the decision to use VideoContext paid off as we could add an effect node into the graph to crop and scale the video input before mixing.

And finally

VideoContext made the mixing and compositing operations a lot easier than they would have been otherwise. Towards the end we even added an image source (for paused VTs) using the new experimental Chrome feature captureStream, and that worked really well.

After making it all work the obvious point of possible concern is performance, and overall it works pretty well.  We needed to have half-a-dozen or so VideoContexts running at once and this was effective on a powerful machine.  Many more and the computer really starts struggling.

Even a few years ago attempting this in the browser would have been madness, so its great to see such a lot of progress in something so challenging, and opening up a whole new range of software to work in the browser!

 

MediaCity UK offices

Building a live television video mixing application for the browser

BBC R&D logoThis is the first in a series of posts about some work we are doing with BBC Research & Development.

The BBC now has, in the lab, the capability to deliver live television using high-end commodity equipment direct to broadcast, over standard IP networks. What we’ve been tasked with is building one of the key front-end applications – the video mixer. This will enable someone to mix an entire live television programme, at high quality, from within a standard web-browser on a normal laptop.

In this series of posts we’ll be talking in great depth about the design decisions, implementation technologies and opportunities presented by these platforms.

What is video mixing?

Video editing used to be a specialist skill requiring very expensive, specialist equipment. Like most things this has changed because of commodity, high-powered computers and now anyone can edit video using modestly priced equipment and software such as the industry standard Adobe Premiere. This has fed the development of services such as YouTube where 300 hours of video are uploaded every minute.

“Video Mixing” is the activity of getting the various different videos and stills in your source material and mixing them together to produce a single, linear output. It can involve showing single sources, cutting and fading between them, compositing them together, showing still images and graphics and running effects over them. Sound can be similarly manipulated. Anyone who has used Premiere, even to edit their family videos, will have some idea of the options involved.

Live television is a very different problem

First you need to produce high-fidelity output in real time. If you’ve ever used something like Premiere you’ll know that when you finally render your output it can take quite a long time – it can easily spend an hour rendering 20 minutes of output. That would be no good if you were broadcasting live! This means the technology used is very different – you can’t just use commodity hardware, you need specialist equipment that can work with these streams in realtime.

Second the capacity for screw up is immensely higher. Any mistakes in a live broadcast are immediately apparent, and potentially tricky to correct. It is a high-stress environment, even for experienced operators.

Finally, the range of things you might choose to do is much more limited, because you can spend little time setting it up. This means live television tends to use a far smaller ‘palette’ of mixing operations.

Even then, a live broadcast might require half a dozen people even for a modest production. You need someone to set up the cameras and control them, a sound engineer to get the sound right, someone to mix the audio, a vision mixer, a VT Operator (to run any pre-recorded videos you insert – perhaps the titles and credits) and someone to set up the still image overlays (for example, names and logos).

If that sounds bad, imagine a live broadcast away from the studio – the Outside Broadcast. All the people and equipment needs to be on site, hence the legendary “OB Van”:

P1040518

Inside one of those vans is the equipment and people needed to run a live broadcast for TV. They’d normally transmit the final output directly to air by satellite – which is why you generally see a van with a massive dish on it nearby. This equipment runs into millions and millions of pounds and can’t be deployed on a whim. When you only have a few channels of course you don’t need many vans…

The Internet Steamroller

The Internet is changing all of this. Services like YouTube Live and Facebook Live mean that anyone with a phone and decent coverage can run their own outside broadcast. Where once you needed a TV network and millions of pounds of equipment now anyone can do it. Quality is poor and there are few options for mixing, but it is an amazingly powerful tool for citizen journalism and live reporting.

Also, the constraints of “channels” are going. Where once there was no point owning more OB Vans than you have channels, now you could run dozens of live feeds simultaneously over the Internet. As the phone becomes the first screen and the TV in the corner turns into just another display many of the constraints that we have taken for granted look more and more anachronistic.

These new technologies provide an opportunity, but also some significant challenges. The major one is standards – there is a large ecosystem of manufacturers and suppliers whose equipment needs to interoperate. The standards used, such as SDI (Serial Digital Interface) have been around for decades and are widely supported. Moving to an Internet-based standard needs cooperation across the industry.

BBC R&D has been actively working towards this with their IP Studio  project, and the standards they are developing with industry for Networked Media.

Read part 2 of this project with BBC R&D where I’ll describe some of the technologies involved, and how we’re approaching the project.