Screenshot of SOMA vision mixer

Compositing and mixing video in the browser

This blog post is the 4th part of our ongoing series working with the BBC Research & Development team. If you’re new to this project, you should start at the beginning!

BBC R&D logo

Like all vision mixers, SOMA (Single Operator Mixing Application) has a “preview” and “transmission” monitor. Preview is used to see how different inputs will appear when composed together – in our case, a video input, a “lower third” graphic such as a caption which fades in and out, and finally a “DOG” such as a channel or event identifier shown in the top corner throughout a broadcast.

When switching between video feeds SOMA offers a fast cut between inputs or a slower mix between the two. As and when edit decisions are made, the resulting output is shown in the transmission monitor.

The problem with software

However, one difference with SOMA is that all the composition and mixing is simulated. SOMA is used to build a set of edit decisions which can be replayed later by a broadcast-quality renderer. The transmission monitor is not simply a view of the output after the effects have been applied, because the actual rendering of the edit decisions hasn’t happened yet. The app therefore needs to provide an accurate simulation of what each edit decision will look like.

The task of building this required breaking down how output is composed – during a mix both the old and new input sources are visible, so six inputs are required.

VideoContext to the rescue

Enter VideoContext, a video scheduling and compositing library created by BBC R&D. This allowed us to represent each monitor as a graph of nodes: video nodes playing each input into transition nodes, which allow mix and opacity to be varied over time, and a compositing node to bring everything together. It’s all implemented using WebGL, offloading video processing to the GPU.
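As a rough illustration of what such a graph looks like in code, here’s a minimal sketch against the public VideoContext API (assuming the npm “videocontext” package) – the source URLs, timings and canvas selector are placeholders of ours, and the definition name is worth checking against the library’s documentation:

```typescript
// Minimal sketch of a monitor graph: two sources -> crossfade -> destination.
// Source URLs, timings and the canvas selector are illustrative only.
import VideoContext from "videocontext";

const canvas = document.querySelector<HTMLCanvasElement>("#monitor")!;
const ctx = new VideoContext(canvas);

// One node per input (in SOMA's case these were streaming inputs, not files).
const cameraA = ctx.video("camera-a.mp4");
const cameraB = ctx.video("camera-b.mp4");
cameraA.start(0);
cameraB.start(0);

// A transition node whose "mix" property can be animated over time.
const crossfade = ctx.transition(VideoContext.DEFINITIONS.CROSSFADE);
cameraA.connect(crossfade);
cameraB.connect(crossfade);

// Mix from camera A to camera B between t=5s and t=7s.
crossfade.transition(5, 7, 0.0, 1.0, "mix");

crossfade.connect(ctx.destination);
ctx.play();
```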

The flexible nature of this library allowed us to plug in our own WebGL scripts to cut out the lower third and DOG graphics using chroma-keying (where a particular colour – normally green – is declared to be transparent), and with a small patch to allow VideoContext to use streaming video we were up and running.
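For the chroma-key, the custom WebGL script slots in as a VideoContext effect definition – essentially a fragment shader plus uniform properties. The sketch below shows the general idea only; the exact definition format, property names and threshold values here are our own illustration and should be checked against the VideoContext documentation:

```typescript
// Illustrative chroma-key effect definition: pixels close to the key colour
// (green by default) become transparent. Thresholds are placeholder values.
const chromaKeyDefinition = {
  title: "Chroma key",
  description: "Makes pixels close to the key colour transparent",
  vertexShader: `
    attribute vec2 a_position;
    attribute vec2 a_texCoord;
    varying vec2 v_texCoord;
    void main() {
      gl_Position = vec4(vec2(2.0, 2.0) * a_position - vec2(1.0, 1.0), 0.0, 1.0);
      v_texCoord = a_texCoord;
    }`,
  fragmentShader: `
    precision mediump float;
    uniform sampler2D u_image;
    uniform vec3 keyColour;
    uniform float threshold;
    varying vec2 v_texCoord;
    void main() {
      vec4 colour = texture2D(u_image, v_texCoord);
      // The further a pixel is from the key colour, the more opaque it stays.
      float dist = length(colour.rgb - keyColour);
      float alpha = smoothstep(threshold, threshold * 1.5, dist);
      gl_FragColor = vec4(colour.rgb, colour.a * alpha);
    }`,
  properties: {
    keyColour: { type: "uniform", value: [0.0, 1.0, 0.0] }, // green
    threshold: { type: "uniform", value: 0.3 },
  },
  inputs: ["u_image"],
};

// The definition then becomes just another node in the graph, e.g.:
// const keyed = ctx.effect(chromaKeyDefinition);
// lowerThird.connect(keyed); keyed.connect(compositor);
```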

Devils in the details

The fiddly details of how edits work were as fiddly as expected: tracking the mix between two video inputs versus the opacity of two overlays looked like similar problems but required different solutions. The nature of the VideoContext graph also meant we had to keep track of which node was current, rather than always connecting the current input to the same node. We put a lot of unit tests around this to ensure it works as it should, now and in the future.

By comparison, the seemingly tricky problem of what to do if a new edit decision was made while a mix was in progress turned out to be just a case of swapping out the new input, to avoid the old input reappearing unexpectedly.

QA testing revealed a subtler problem: when switching to a new input, the video takes a few tens of milliseconds to start. Cutting immediately causes a distracting flicker as a couple of blank frames are rendered – waiting until the video is ready adds a slight delay, but it is significantly less distracting.
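One way to implement that wait – a sketch against the standard media element API, not SOMA’s actual code – is to defer the cut until the element reports it has a decodable frame:

```typescript
// Defer a cut until the incoming video has a frame ready, to avoid rendering
// blank frames. Plain HTMLVideoElement API; function names are illustrative.
function cutWhenReady(video: HTMLVideoElement, performCut: () => void): void {
  if (video.readyState >= HTMLMediaElement.HAVE_CURRENT_DATA) {
    performCut(); // a frame is already decodable – cut immediately
  } else {
    video.addEventListener("canplay", () => performCut(), { once: true });
  }
}
```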

Later in the project a new requirement emerged to re-frame videos within the application, and the decision to use VideoContext paid off: we could simply add an effect node into the graph to crop and scale the video input before mixing.

And finally

VideoContext made the mixing and compositing operations a lot easier than they would have been otherwise. Towards the end we even added an image source (for paused VTs) using the new experimental Chrome feature captureStream, and that worked really well.
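As a rough sketch of the idea (the element IDs and wiring here are ours, not the project’s): captureStream() turns whatever a media element is currently showing into a MediaStream, so a paused VT can be treated as a still source:

```typescript
// Sketch only: grab the output of a paused video element as a MediaStream.
// captureStream() was experimental and Chrome-only at the time.
const pausedVt = document.querySelector<HTMLVideoElement>("#paused-vt")!;

// Not in every TypeScript lib definition yet, hence the structural cast.
const stream: MediaStream = (pausedVt as unknown as {
  captureStream(): MediaStream;
}).captureStream();

// The stream can then be previewed or fed on as a source while the VT stays paused.
const stillPreview = document.querySelector<HTMLVideoElement>("#still-preview")!;
stillPreview.srcObject = stream;
void stillPreview.play();
```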

After making it all work, the obvious concern is performance, and overall it works pretty well. We needed half a dozen or so VideoContexts running at once, which was fine on a powerful machine; many more and the computer really starts to struggle.

Even a few years ago attempting this in the browser would have been madness, so it’s great to see so much progress in something so challenging, opening up a whole new range of software that can now run in the browser!


“What CMS?” is the wrong question

When I meet new customers, I often relate a story about CMS work.

It goes like this:

A decade or so ago, CMS projects used to make up the majority of Isotoma’s output.
These days the number is much lower. There are a few reasons for this but the main one is that back in the day, CMS work used to be hard. The projects were complex, expensive and fragile.
As an illustration of how much things have changed, the other day I built a website for my sister’s business using Squarespace. It took longer for me to upload photos to the gallery than it did to plan, build, populate and deploy the entire site.

This peppy anecdote disguises the fact that there is still complexity in CMS work; projects that involve content management can stymie organisations and put their digital roadmap back years – but almost all of this complexity resides outside of CMS platform choice.
This post looks into why that is.

“What CMS should I use?” is a boring question.

People still tend to start with the question “What platform are we going to use?” because, historically, it’s been a really important one. Back in the day, progress was slow and the costs of being wrong were astronomically high.

These days though, compared to some of the other decisions you need to make, platform choice is a relative doddle.

Why do I say this? Because the CMS market is commoditised, modern and highly competitive.

  • There are free/open source solutions that are as good as (or better than) anything you pay through the nose for
  • There’s a huge number of agencies who will compete with each other for your business, and this competition keeps prices consistently low
  • The majority of features you could ever want from a content management system are now standardised and distributed across the marketplace. Anyone telling you otherwise is selling you something
  • As features increase, costs shrink. As costs shrink, the traditional organisational worries about sunk costs and having a product that’s wrong become less important

Indeed from a given list of modern, open source content management systems, you’d have to be going some to get a bad fit for 99% of organisations.

So having made a broad and eye-catching statement like that, let me now run through a brief list of actual interesting questions to ask about your CMS project.

1. What is my content strategy?

Back in the day, agencies would be asked by their customers “Will the section headings be editable by admins?” and then we’d go to quite a lot of trouble to make sure that the section headings were in fact editable by admins.

What agencies should have been doing instead was investigating why the section headings needed to be editable in the first place.

If you nail the content strategy – if you develop content and a structure that speaks to the requirements of your actual customers – then the need to make structural changes to your site should be, while not removed entirely, at least put on a slower and more predictable timetable.

2. Who is driving this project?

This is somewhat related to content strategy but deserves its own section. One of the reasons that a content strategy is needed is that the fundamental question “Who is the audience for this site?” can have multiple, arguably correct answers depending on who inside the organisation you ask.

And sometimes the reason that you have multiple answers to the question is because two (or more!) departments or individuals within the customer’s organisation have competing opinions about the fundamentals of the project: what it’s for, what the outcomes and priorities should be… etc.

Resolving these tensions can be difficult, even for members of the customer’s team. For employees of an outside agency the difficulty increases exponentially; worst of all is when an attempt is made to solve a problem like this with the top-down application of a technology choice. Drupal has many skills, but senior management negotiation is not one of them.

3. Who are the audiences for this new CMS?

Is this a site made to make the company look attractive to customers? Or is this a site that is designed to make an internal task easier for the admins of the site? Or both? None of these answers are wrong but failing to ask the question can result in tens of thousands of pounds being spent on something of little demonstrable value to the customer.

So in summary then…

Asking the above questions is a far better use of your time in the run up to starting a project than asking questions about Wagtail vs WordPress.

You can take this one step further and use the same approach in selecting an agency: when you first engage with them, do they talk about the platform and technology choice, or about how they’re going to help you increase your reach or implement a content strategy? Or how they’ll help effect change within your organisation?

We’re on the Government G-Cloud 9 Marketplace

Good news, everyone! Isotoma are pleased to announce that our services are now available to public sector bodies for procurement via the G-Cloud 9 portal.

This means that you can find Isotoma’s services on the Digital Marketplace including cloud hosting, software and support. The Digital Marketplace is the new online platform that all public sector organisations can use to find and buy UK government approved cloud-based services.

We already deliver our services to organisations around the world. With this new accreditation, Isotoma is ready to deliver our best-in-industry services to even more public sector bodies.

Here’s an outline of the Isotoma services available on the G-Cloud 9 Digital Marketplace. Don’t hesitate to get in touch if we can provide more info.

Sign pointing to Usability Lab

Rapid user research on an Agile project

Our timeline to build an in-browser vision mixer for BBC R&D (previously, previously) is extremely tight – just 2 months. UX and development run concurrently in Agile fashion (a subject for a future blog post), but design was largely done within the first month.

Too frequently for projects on such timescales there is pressure to omit user testing in the interest of expediency. One could say it’s just a prototype, and leave it until the first trials to see how it performs, and hopefully get a chance to work the learnings into a version 2. Or, since we have weekly show & tell sessions with project stakeholders, one could argue complacently that as long as they’re happy with what they’re seeing, the design’s on track.

Why test?

But the stakeholders represent our application’s target users only slightly better than we do ourselves – which is not very well, since they won’t be the ones using it. Furthermore, this project aims to broaden the range of potential operators – from what used to be the domain of highly experienced technicians to something that could be used by a relative novice within hours. So I wanted to feel confident that even people who aren’t familiar with the project would be able to use it – both experts and novices. I’m not experienced in this field at all, so I was making lots of guesses and assumptions, and I didn’t want to go too far before finding out they were wrong.

One of the best things about working at the BBC is the ingrained culture of user centred design, so there was no surprise at the assumption that I’d be testing paper prototypes by the 2nd week. Our hosts were very helpful in finding participants within days – and with 100s of BBC staff working at MediaCity there is no danger of using people with too much knowledge of the project, or re-using participants. Last but not least, BBC R&D has a fully equipped usability lab – complete with two-way mirror and recording equipment. Overkill for my purposes – I would’ve managed with an ordinary office – but having the separate viewing room helped ensure that I got the entire team observing the sessions without crowding my subject. I’m a great believer in getting everyone on the project team seeing other people interact with and talk about the application.

Paper prototypes

Annotated paper prototyping test script

Paper prototypes are A3 printouts of the wireframes, each representing a state of the application. After giving a brief description of what the application is used for, I show the page representing the application’s initial state, and change the pages in response to user actions as if it were the screen. (Users point to what they would click.) At first, I ask task-based questions: “add a camera and an audio source”; “create a copy of Camera 2 that’s a close-up”; etc. As we linger on a screen, I’ll probe more about their understanding of the interface: “How would you change the keyboard shortcut for Camera 1?”; “What do you think Undo/Redo would do on this screen?”; “What would happen if you click that?”; and so on. It doesn’t matter that the wireframes are incomplete – when users try to go to parts of the application that haven’t been designed yet, I ask them to describe what they expect to see and be able to do there.

In all, I did paper prototype testing with 6 people in week 2, and with a further 3 people in week 3. (With qualitative testing, even very few participants tend to find the major issues.) In keeping with the agile nature of the project, there was no expectation of me producing a report of findings that everyone would read, although I do type up my notes in a shared document to help fix them in my memory. Rather, my learnings go straight into the design – I’m usually champing at the bit to make the changes that seem so obvious after seeing a person struggle, feeling really happy to have caught them so early on. Fortunately, user testing showed that the broad screen layout worked well – the main changes were to button labels, icon designs, and generally improved affordances.

Interactive prototypes

By week 4 my role had transitioned into front-end development, in which I’m responsible for creating static HTML mockups with the final design and CSS, which the developers use as reference markup for the React components. While this isn’t mainstream practice in our industry, I find it has numerous advantages, especially for an Agile project, as it enables me to leave the static, inexact medium of wireframes behind and refine the design and interaction directly within the browser. (I add some dynamic interactivity using jQuery, but this is throwaway code for demo purposes only.)

The other advantage of HTML mockups is that they afford us an opportunity to do interactive user testing in a web browser, well before the production application is stable enough to test. Paper prototyping is fine up to a point, but it has plenty of limitations – for example, you can’t scroll, there are no mouseover events, you can’t resize the screen, and so on.

So by week 5 I was able to test nearly all parts of the application, in the browser, with 11 users. (This included two groups of 4, which worked better than I expected – one person manning the mouse and keyboard, but everyone in the group thinking out loud.) It was really good to see the difference that interactivity made, such as hover states, and seeing people actually trying to click or drag things rather than just saying what they’d do gave me an added level of confidence in my findings. Again, immediately afterwards I made several changes that I’m confident improve the application – removing a redundant button that never got clicked, adding labels to some icons, and strengthening a primary action by adding an icon, among others. Not to mention fixing numerous technical bugs that came up during testing. (I use GitHub comments to ensure developers are aware of any HTML changes to components at this stage.)

Never stop testing

Hopefully we’ll have time for another round of testing with the production application. This should give a more faithful representation of the vision mixing workflow, since in the mockups the application is always in the same state, using dummy content. With every test we can feel more confident – and our stakeholders can feel more confident – that what we’re building will meet its goals, and make users productive rather than frustrated. And on a personal level, I’m just relieved that we won’t be launching with any of the embarrassing gotchas that cropped up and got fixed during testing.

Read part 4 of this project working with BBC R&D where we talk about compositing and mixing video in the browser.

The challenges of mixing live video streams over IP networks

Welcome to our second post on the work we’re doing with BBC Research & Development. If you’ve not read the first post, you should go read that first 😉

Introducing IP Studio

BBC R&D logo

The first part of the infrastructure we’re working with here is something called IP Studio. In essence this is a platform for discovering, connecting and transforming video streams in a generic way, using IP networking – the standard on which pretty much all Internet, office and home networks are based.

Up until now video cameras have used very simple standards such as SDI to move video around. Even though SDI is digital, it’s just point-to-point – you connect the camera to something using a cable, and there it is. The reason for the remarkable success of IP networks, however, is their ability to connect things together over a generic set of components, routing between connected devices. Your web browser can get messages to and from this blog over the Internet using a range of intervening machines, which is actually pretty clever.

Doing this with video is obviously in some senses well-understood – we’ve all watched videos online. There are some unique challenges with doing this for live television though!

Why live video is different

First, you can’t have any buffering: this is live. It’s unacceptable for everyone watching TV to see a buffering message because the production systems aren’t quick enough.

Second is quality. These are 4K streams, not typical internet video resolution. 4K streams have (roughly) 4000 horizontal pixels, compared to (roughly) 2000 for a 1080p stream (weirdly, 1080p, 720p etc. are named for their vertical pixels instead). This means they need about four times as much bandwidth – which even in 2017 is quite a lot. Specialist networking kit and a lot of processing power are required.
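To put rough numbers on that – our own back-of-the-envelope figures, assuming uncompressed 10-bit 4:2:2 video at 50 frames per second, which real production formats and compression will change:

```typescript
// Back-of-the-envelope bandwidth for uncompressed 10-bit 4:2:2 video at 50 fps.
// Assumptions are ours; real formats, frame rates and compression will differ.
const bitsPerPixel = 20; // 10 bits luma + 10 bits chroma shared across two pixels (4:2:2)
const framesPerSecond = 50;

const gbitPerSecond = (width: number, height: number): number =>
  (width * height * bitsPerPixel * framesPerSecond) / 1e9;

console.log(gbitPerSecond(1920, 1080).toFixed(2)); // ≈ 2.07 Gbit/s for 1080p
console.log(gbitPerSecond(3840, 2160).toFixed(2)); // ≈ 8.29 Gbit/s for UHD – four times as much
```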

Third is the unique requirements of production – we’re not just transmitting a finished, pre-prepared video, but all the components from which to make one: multiple cameras, multiple audio feeds, still images, pre-recorded video. Everything you need to create the finished live product. This means that to deliver a final product you might need ten times as much source material – which is well beyond the capabilities of any existing systems.

IP Studio addresses this with a cluster of powerful servers sitting on a very high speed network. It allows engineers to connect together “nodes” to form processing “pipelines” that deliver video suitable for editing. This means capturing the video from existing cameras (using SDI) and transforming it into a format that will allow the streams to be mixed together later.

It’s about time

That sounds relatively straightforward, except for one thing: time. When you work with live signals on traditional analogue or point-to-point digital systems, then live means, well, live. There can be transmission delays in the equipment but they tend to be small and stable. A system based on relatively standard hardware and operating systems (IP Studio uses Linux, naturally) is going to have all sorts of variable delays in it, which need to be accommodated.

IP Studio is therefore based on “flows” comprising “grains”. Each grain has a quantum of payload (for example a video frame) and timing information. The timing information allows multiple flows to be combined into a final output where everything happens appropriately in synchronisation. This might sound easy but is fiendishly difficult – some flows will arrive later than others, so systems need to hold some back until everything is running to time.
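A minimal sketch of the idea – our own illustration, not IP Studio’s actual data model or API – is that each grain carries a timestamp, and a combiner only emits a set of grains once every flow has caught up to the same point in time:

```typescript
// Illustrative only – not IP Studio's real grain format or synchronisation code.
interface Grain<T> {
  flowId: string;          // which flow this grain belongs to
  originTimestamp: number; // when the payload was captured, on a shared clock
  payload: T;              // e.g. one video frame or a chunk of audio samples
}

// Hold grains in per-flow queues and emit a combined set only when every flow
// has reached the target time; the slowest flow sets the pace. A real system
// would also handle missing and hopelessly late grains.
function takeSynchronisedSet<T>(queues: Map<string, Grain<T>[]>): Grain<T>[] | null {
  const heads = [...queues.values()].map((q) => q[0]);
  if (heads.some((g) => g === undefined)) return null; // a flow is still behind
  const target = Math.max(...heads.map((g) => g.originTimestamp));

  const set: Grain<T>[] = [];
  for (const queue of queues.values()) {
    // Discard grains that are too old to be used at the target time.
    while (queue.length > 0 && queue[0].originTimestamp < target) queue.shift();
    if (queue.length === 0) return null;
    set.push(queue.shift()!);
  }
  return set;
}
```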

To add to the complexity, we need two versions of the stream, one at 4K and one at a lower resolution.

Don’t forget the browser

Within the video mixer we’re building, we need the operator to be able to see their mixing decisions (cutting, fading etc.) happening in front of them in real time. We also need to control the final transmitted live output. There’s no way a browser in 2017 is going to show half-a-dozen 4K streams at once (and it would be a waste to do so). This means we are showing lower resolution 480p streams in the browser, while sending the edit decisions up to the output rendering systems which will process the 4K streams, before finally reducing them to 1080p for broadcast.
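The edit decisions themselves can stay tiny compared with the video they describe. A hypothetical shape – invented here for illustration, not the project’s real wire format – might look like this:

```typescript
// Hypothetical edit decision sent from the browser to the back-end renderer.
// Field names and structure are illustrative, not the project's real format.
interface EditDecision {
  timestamp: number;   // when the decision takes effect, on the shared clock
  action: "cut" | "mix";
  fromSource: string;  // identifiers of the flows being switched between
  toSource: string;
  durationMs?: number; // only meaningful for a mix
}

const decision: EditDecision = {
  timestamp: 1_496_400_000_000,
  action: "mix",
  fromSource: "camera-1",
  toSource: "camera-2",
  durationMs: 1500,
};
```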

So we’ve got half-a-dozen 4K streams, and 480p equivalents, still images, pre-recorded video and audio, all being moved around in near-real-time on a cluster of commodity equipment from which we’ll be delivering live television!

Read part 3 of this project with BBC R&D where we delve into rapid user research on an Agile project.

Why the hell is my project over-running?

As a project manager, I rank a C+ at best. That’s why I work in sales where the damage I can do is more limited. However, where I differ from a lot of not-very-good project managers is that I know exactly how bad I am and am happy to share my hard won experience with you all. Lucky old you, eh?

The topic for today is the most-asked question in agencies up and down the land: Why is my project over-running and what can I do about it? (It’s kind of a companion piece to Andy’s previous post “How do I know my developers are on track?”)

One of the reasons why this question is so often asked is that overrunning is a defining feature of so many digital projects. Managing overrun is often one of the key tasks of any senior management team within an agency. This blog post is not an attempt to exhaustively catalogue all the reasons why it happens. Instead I thought it’d be useful to discuss some of the less frequently discussed ones.

Let’s start with an easy one:

The estimate wasn’t an estimate

There are as many ways to generate an estimate as there are uses for that estimate. Every organisation develops its own idiosyncratic approach and – fun! – multiple approaches to estimation can coexist within one organisation at the same time.

This tendency often leads to confusion or amnesia about what the estimate was for in the first place. Two extreme examples here are an internal estimate of developer effort and an estimate of price meant for an external, customer audience. The developers might estimate, say, a 6 week build for a given feature. At the same time, for reasons of political expediency, you might only want to charge for 4 of those weeks.

There’s nothing necessarily wrong or even surprising about a calculation like that, but it’s vitally important that you have the organisational memory to remember that the devs need 6 weeks in the resource planning and reporting areas of your business. In a month’s time when you’re looking back on the project and reporting its profit/loss, what tools are available for you to track this information beyond your fallible, over-worked brain?

How much are you losing to churn?

There are countless posts on the damaging effect that context-switching can have on a development team. If you’re delivering work into a team of generalist developers who are having to switch from one project to another over the course of a day, then you’re going to be having an impact on productivity from the get go. I can think of countless examples from my day to day life where an hour of timesheeted development somehow managed to cost a whole day of my precious budget.

So we all instinctively know this – but what I see agencies do again and again is fail both to develop tools and processes that compensate for this tendency *and* to build any of it into their planning. Spending time and resource developing ways to automate the build/deploy process, for example, can save a ton of tooth-grinding right at the most stressful moment of the ‘task-switching’ process.

The only other solution is to plan projects sensitively enough that the plans take into account the workload of other projects and/or assign an amount of ‘dead time’ to a developer on the assumption that they’re going to be doing nothing profitable for a proportion of the project.

From a commercial point of view, this is a more or less indefensible position. From a practical point of view it borders on the hilariously implausible. (Show me an agency with a resource planning horizon of more than 4 weeks and I will show you the authors of the greatest work of fiction the world has ever seen.)

And speaking of which…

Do your developers have an outside-context problem?

This is really common across the industry but is particularly true where you’ve got a team of middle-weight generalists – which is where dev teams in marcoms agencies naturally tend to gravitate.

The problem goes like this: The client asks for something new. You’ve never built it before but you’ve built something like it. The team diligently estimate the project based on the thing you’ve built before and everything is going fine until, suddenly, horribly, it isn’t. The differences that seemed so small from a distance are, up close, suddenly yawning and irreconcilable. This thing isn’t like the other thing at all. This thing is like nothing the team have seen before. It’s an outside context problem; outside the context of the team to estimate, deliver or support.

And now, not only are your estimates wrong, but you’re two thirds of the way through the budget and three quarters of the way through the allotted time.

There are a number of related issues here – and they really need to be unpicked in a longer post – but briefly it can be incredibly difficult to spot these things before you’re in their grip.

The obvious short-term preventative here is to review progress against budgets religiously and to give your teams enough space and time to estimate properly in the first place. However, there is a longer-term, harder-to-spot solution too.

You need to foster an internal atmosphere where teams feel empowered enough to speak up when things are going wrong, and are sensitive enough to human weakness to know when “I don’t know how to estimate this” is the only correct answer.

In my experience, even senior devs can have a tendency to nail themselves to the crosses they manufacture in the project initiation stages. (I’ve seen management help make these crosses too.) In a healthy company there should be no need for this.

Get help

Watch now, reader, as I pivot effortlessly into an advert for my company’s services: mid-sized, mid-weight dev teams are, by their nature, prone to making the kinds of mistakes outlined above; it’s the downside of a fast-moving, flexible team.

As a software agency of some 13 years’ standing, we have seen pretty much all the ways that a digital project can go wrong, and we’re happy to use this knowledge to help make sure that yours doesn’t. Our team of developers and project managers is experienced and mature enough to give accurate, actionable advice that will hold good for the duration of the project.

If you want to give us a shout, email me, Joe, or pop along to our contact page.

MediaCity UK offices

Building a live television video mixing application for the browser

BBC R&D logo

This is the first in a series of posts about some work we are doing with BBC Research & Development.

The BBC now has, in the lab, the capability to deliver live television using high-end commodity equipment direct to broadcast, over standard IP networks. What we’ve been tasked with is building one of the key front-end applications – the video mixer. This will enable someone to mix an entire live television programme, at high quality, from within a standard web-browser on a normal laptop.

In this series of posts we’ll be talking in great depth about the design decisions, implementation technologies and opportunities presented by these platforms.

What is video mixing?

Video editing used to be a specialist skill requiring very expensive, specialist equipment. Like most things this has changed because of commodity, high-powered computers and now anyone can edit video using modestly priced equipment and software such as the industry standard Adobe Premiere. This has fed the development of services such as YouTube where 300 hours of video are uploaded every minute.

“Video mixing” is the activity of taking the various videos and stills in your source material and mixing them together to produce a single, linear output. It can involve showing single sources, cutting and fading between them, compositing them together, showing still images and graphics and running effects over them. Sound can be manipulated in a similar way. Anyone who has used Premiere, even to edit their family videos, will have some idea of the options involved.

Live television is a very different problem

First you need to produce high-fidelity output in real time. If you’ve ever used something like Premiere you’ll know that when you finally render your output it can take quite a long time – it can easily spend an hour rendering 20 minutes of output. That would be no good if you were broadcasting live! This means the technology used is very different – you can’t just use commodity hardware; you need specialist equipment that can work with these streams in real time.

Second, the capacity for screw-ups is immensely higher. Any mistakes in a live broadcast are immediately apparent, and potentially tricky to correct. It is a high-stress environment, even for experienced operators.

Finally, the range of things you might choose to do is much more limited, because you have little time to set things up. This means live television tends to use a far smaller ‘palette’ of mixing operations.

Even then, a live broadcast might require half a dozen people for a modest production. You need someone to set up the cameras and control them, a sound engineer to get the sound right, someone to mix the audio, a vision mixer, a VT operator (to run any pre-recorded videos you insert – perhaps the titles and credits) and someone to set up the still image overlays (for example, names and logos).

If that sounds bad, imagine a live broadcast away from the studio – the Outside Broadcast. All the people and equipment need to be on site, hence the legendary “OB Van”:

Photo of an OB van

Inside one of those vans is the equipment and people needed to run a live broadcast for TV. They’d normally transmit the final output directly to air by satellite – which is why you generally see a van with a massive dish on it nearby. This equipment runs into millions and millions of pounds and can’t be deployed on a whim. When you only have a few channels of course you don’t need many vans…

The Internet Steamroller

The Internet is changing all of this. Services like YouTube Live and Facebook Live mean that anyone with a phone and decent coverage can run their own outside broadcast. Where once you needed a TV network and millions of pounds of equipment now anyone can do it. Quality is poor and there are few options for mixing, but it is an amazingly powerful tool for citizen journalism and live reporting.

Also, the constraints of “channels” are going. Where once there was no point owning more OB Vans than you have channels, now you could run dozens of live feeds simultaneously over the Internet. As the phone becomes the first screen and the TV in the corner turns into just another display many of the constraints that we have taken for granted look more and more anachronistic.

These new technologies provide an opportunity, but also some significant challenges. The major one is standards – there is a large ecosystem of manufacturers and suppliers whose equipment needs to interoperate. The standards used, such as SDI (Serial Digital Interface) have been around for decades and are widely supported. Moving to an Internet-based standard needs cooperation across the industry.

BBC R&D has been actively working towards this with their IP Studio project, and the standards they are developing with industry for Networked Media.

Read part 2 of this project with BBC R&D where I’ll describe some of the technologies involved, and how we’re approaching the project.

Heavens to Betsy! We did a podcast!

Launching our new podcast channel…

while True:
on topic ramblings of a technical agency

Over the past few months we’ve been busy recording the first few episodes of our new podcast that we hope you’ll find fascinating. We’re excited to announce the launch of our channel with 4 episodes ready for you to listen to right now!

In this series we’ll be talking to all the various bits that make up the Isotoma team; from user experience, design, and consultancy, through coding and development, and on to quality assurance and testing. We’re covering a huge range of topics with experts and guests, and of course with our usual mix of technical knowledge and (occasional) humour.

Listen & subscribe to get the new podcast each month!

iTunes // SoundCloud // RSS // Pocket Casts // Stitcher

Navigating the maze of testing virtual reality – Part 2

This is the second and final post about QA testing in virtual reality. If you didn’t read the first post, Navigating the maze of testing virtual reality – Part 1, then you should read that first.

Test Execution

It is hard to describe what it is like wearing the headset for the first time. It’s a mixture of disorientation, fascination and being a little overwhelmed – trying to remember the test script you were following, using an Xbox controller for movement and still testing the actual functionality, all at the same time. It took a good day of testing to get into the routine of following the process below:

  • Read test script
  • Put headset on
  • Trigger level as facilitator
  • Navigate through the maze
  • Remove headset
  • Mark test steps as pass/fail

Highlights and lessons learnt during the testing phase

High level test scripts
Initially these were written in quite a lot of detail, but we soon realised they needed to be more high level. This saved time compared with writing detailed step-by-step test scripts. The key was to keep things general within the test scripts: although the tester needs to be aware enough to raise issues that are cosmetically incorrect, the scripts do not have to spell out the specific details to check. Keeping the test scripts general also reduces the amount of maintenance they require.

2D Map
The addition of a 2D map view showing exactly what was being presented to the Oculus user was an advantage during the testing phase. This actually comes with the Oculus, so we did not have to build it ourselves. It was the only view that allowed us to take screenshots, as we couldn’t do this within the Oculus itself. We couldn’t capture everything this way, but it could be used for at least half of the issues we found. It also allowed some of the tests to be run without the headset, where 3D wasn’t essential, which gave us a break from wearing it.

Facilitator map view
This was a requirement which we found really helpful during testing, and we would recommend building one for future projects. It gave us an overview of each level as a whole, including the following:

  • Maze starting point
  • Maze end point
  • Location of objects within the maze
  • Location of the subject as they move around the maze in real time 

Two testers are better than one
During the testing cycle we were able to test in parallel on this project. Obviously this means higher costs than a traditional project; however, a lot of time is saved by the first tester not having to run through the test scripts beforehand, or having to repeatedly take the headset off and lose their current position in the maze. We found that the second tester worked well supporting the first, with advantages that should be seriously considered during estimation and scoping on any virtual reality project:

  • Read through and mark pass/fail on the test scripts while the first is wearing the headset
  • Guide the first tester through the maze using the facilitator map view
  • Second pair of eyes on the maze using the 2D map view
  • Take screenshots of defects using the 2D map view
  • Write detailed defects as the first tester described them during the execution. This ensures defects are more accurate as you remember all of the information.

Glasses vs Contact Lenses

The Oculus can accommodate people wearing glasses, although it makes taking the headset on and off really annoying. Even so, we found glasses preferable, as using the Oculus really dries your eyes out, which is a lot more noticeable and uncomfortable when wearing contact lenses.

Rift Face
We found that after extensive use ‘Rift Face’ markings would appear on your face outlining exactly where the Oculus had been sat.

Unexpected issues we came across during the testing phase

Time
Test runs took a lot longer than estimated. Being unable to skip to a specific part of a maze meant that tests relating to the end of a maze, for example, required us to work through the beginning of the maze as a prerequisite. Testing each level wasn’t just functional testing of the maze; it included several other items:

  • Recording of the results (As in timings, turns taken)
  • Audio (Instructions, background)
  • Controls
  • Visual effects

Scheduled break times needed to be planned in as the recommendation for using an Oculus is 10-15 mins of break time after every 30 minutes which means for each hour of testing you perform you would need to allow 1 hour 20 minutes!

Adverse Effects 
We found that virtual reality affects each person differently. One of us had a few minutes of nausea which improved with each use, and disappeared after using the Oculus for a couple of hours. The other one of us had much worse nausea, experiencing around an hour of feeling ill after their first half hour of testing on the Oculus. This only eased slightly over the course of the entire testing cycle. Other side effects experienced were dizziness and fatigue. These were all despite the fact that guidance around motion sickness had been taken into account during development.

Monotony
After the initial excitement of testing virtual reality wears off, the repetitiveness soon becomes challenging. Having to re-run one specific level ten times in a row can get extremely dull. We found we had to be vigilant so as to not subconsciously rush through the level as this could easily result in missing defects.

Recreating Defects
Although we had the use of the 2D map, it was extremely difficult to recreate defects for the developer. When working alone, the tester would have to lift their headset to take a screenshot, which often meant the view had changed and screenshots were not as accurate as we would have liked. To overcome this we had to put as much detail into our defect tickets as possible.

User Acceptance Testing

As mentioned previously, our client was heavily involved in the whole process. They have their own Oculus Rift, which meant that specific QA/UAT times did not have to be scheduled, reducing time pressures. It also meant that whilst the client was testing one phase we were able to progress to the next phase of development without affecting the UAT environment. Each time we released to the client they worked on an ad hoc basis, testing each individual ticket rather than doing it all at once. This allowed the client to test based on priorities.

The agile methodology worked really well as we were constantly getting feedback regarding user experience. This meant we could address these areas quite early on rather than having to rip out chunks at a later stage.

How do we regression test … ?

We haven’t yet tried and tested this. In an ideal world there would be an automated test suite created to handle it. As the project is still in flight we are not in a position to automate this yet, and will review it at a later date.

Conclusion

In conclusion, working with the Oculus Rift has been a fun yet challenging project. We have really only dipped a toe in the water of virtual reality testing, but overall it has been a positive experience from which we have learnt a great deal.

Until we gain more experience and develop our approach, it will be more time consuming and cost more than testing non-virtual-reality software. We have adapted our usual practices for the whole of the test phase, and even though we are pleased with our new process for this type of project, we are not out of the maze yet – it is still very much a work in progress…

Written by Test Analysts Lydia Hewitt and Robin Angland.

Navigating the maze of testing virtual reality – Part 1

Let’s start at the beginning

We are currently working on a virtual reality project for the Oculus Rift. The scope is a set of mazes where the user is given the task of successfully navigating various routes whilst finding objects within the maze. Results are recorded for time, accuracy and route recall.

There will also be a desktop app displaying facilitator screens, which will run in parallel so that a controller can coordinate the levels for the user.

Sounds like fun and something new to get our teeth into, but the main question was how do we go about testing it when ‘current Oculus Rift testing experience within our whole QA team = 0’?
We did some research (googled it) and found a lack of information and experience on the internet and in our usual sources. We then decided to write this blog to hopefully provide some guidance for other testers in similar situations.

We decided that we would use Agile for this project with several iterations. There is a high priority focus on the user experience as well as functionality during QA. The requirements would be based around user stories with QA test scripting being driven by the acceptance criteria from those stories.

Let’s see what we’ve got to work with

We were able to use our existing tool set on this project. Requirements and defect tracking are stored in Jira.

Test scripting, planning and execution are done within Smartbear QA Complete.
As the Oculus is restricted to development and use on Windows machines, both the development and testing of the project were completed on a machine running Windows 10.

Before providing any estimates we did our homework. We were given wireframes which were extremely detailed. We were very lucky that our stakeholder was heavily involved with the project from the start, which gave us the advantage of even more detailed requirements and fewer assumptions. As virtual reality is very new to us, it was very important to take on board as much information as possible in the planning phase, as we didn’t have previous experience to rely on to predict defect patterns.

Another bonus was that the developer was able to give a demonstration of ‘what was built so far’ before any test scripting was undertaken. This not only gave us direction for our test scripting, but also a general idea of potential execution timings. We also had access to some of the mini demos on the Oculus, including ‘Showdown’ and ‘Invasion!’, which gave us a sense of what it was like to set up the Oculus and wear the headset.

Planning and estimation

Now that we were aware of the hardware and software we were to use, and had read all of the documentation available, we were ready to start the estimation process. We used story points for a bottom-up approach. Points of consideration for estimating:

  • Figuring out how to design the tests
  • Use of a headset and Xbox controller (Rather than keyboard and monitors)
  • Visual walk through of each maze level – We deemed this as high priority due to the User Experience
  • Opinions of experts
  • Average run times

As part of the planning phase we designed the acceptance criteria for each user story and this assisted us in doing initial estimates for the testing. We evolved this as the project developed and were able to refine our estimates to gauge test size and potential execution times.

Potential risks and obstacles during planning and our solutions to mitigate them

How do you ensure physical safety of the user?
It was obvious that we required a separate area for testing. It just so happened that we had a small office away from the hustle and bustle of the main open plan working area which was perfect.

Operating system
As the program was developed on a Windows PC, the setup was already covered in our new ‘office’. The obstacle was actually us testers getting used to Windows again: as standard we use Linux (Ubuntu), so we needed to refamiliarise ourselves with Windows.

Where do you start when writing the test scripts?
This was just a concern. As with other projects, test scripting revolved around what was to be delivered, broken up into each maze level. The choice to split it by level came down to the fact that we thought it would flow better than splitting by role. This is exactly how we would have started had the project been 2D.

How do you even run a test script whilst wearing a headset?
It was decided that the best approach was to make the test steps short and easy to remember, so that we could reduce the number of times we would have to take the headset off to refer to the script. Updating the actual run of the steps would have to be done retrospectively.

Planning releases

As there were many ‘unknowns’ in the early stages, releases were heavily based on when each development phase was completed rather than working towards specific dates. It worked well for us as it meant functionality was delivered in one piece rather than having to leave bits behind in another sprint. In the future we would plan the release dates especially as we now have the experience to know how long each phase will take us. The Agile approach allowed flexibility to realign what was being delivered per release.

Test Scripting

Even though the user experience was a high priority focus for QA testing, the functionality was still our main priority.

Did each level do exactly what it was required to do? 

The user experience could be tested from our own personal perspective, but you must remember that each person is different and it would be impossible to test all aspects for every user. We decided it would be advisable to stick to testing the most obvious and basic user actions whilst running through the mazes.

Test coverage was led by the acceptance criteria, which were driven by the user stories. However, we did need to consider a few extra factors:

  • Audio instructions
  • Background audio
  • Speed/smoothness of movement
  • Visual – Is it realistic?
  • Visual – Has the entire viewable space been covered (As in no missed areas)
  • How the controller works for movement and button presses – Could a novice use this without any prior training?

Whilst test scripting we used the following testing techniques:

  • Exploratory testing
  • Use case and scenario-based testing
  • Equivalence partitioning
  • Boundary value analysis
  • Negative testing

Development

As we were using Agile, there was a lot of Developer/Tester interaction. A project for virtual reality just wouldn’t work in a waterfall methodology as the developer and tester relationship is more collaborative. The project was released to QA iteratively. Defects were resolved and released back to QA for retest.

One piece of advice is to have more than one headset: in our case we had one between us, meaning development and QA could not happen at the same time.

Coming up in Part 2…

In Navigating the maze of testing virtual reality – Part 2 we look at test execution and lessons learnt during testing, as well as unexpected issues we encountered along the way.

Written by Test Analysts Lydia Hewitt and Robin Angland.