
The Challenges of Mixing Live Video Streams over IP Networks

Published: 7 June 2017

Welcome to our second post on the work we're doing with software development agency Isotoma. They're helping us produce a browser-based vision mixing interface to support low-cost production run by a single operator. You can catch up on the background in the links below. Isotoma's Doug Winter has the latest installment in the series.

Introducing IP Studio

The first part of the infrastructure we're working with here is something called IP Studio. In essence this is a platform for discovering, connecting and transforming video streams in a generic way, using IP networking: the standard on which pretty much all Internet, office and home networks are based.

Up until now video cameras have used very simple standards such as SDI to move video around. Even though SDI is digital, it's just point-to-point: you connect the camera to something using a cable, and there it is. The reason for the remarkable success of IP networks, however, is their ability to connect things together over a generic set of components, routing between connecting devices. Your web browser can get messages to and from this blog over the Internet using a range of intervening machines, which is actually pretty clever.

Doing this with video is obviously in some senses well understood: we've all watched videos online. There are some unique challenges in doing this for live television, though!

Why live video is different

First, you can't have any buffering: this is live. It's unacceptable for everyone watching TV to see a buffering message because the production systems aren't quick enough.

Second is quality. These are 4K streams, not typical internet video resolution. A 4K stream has (roughly) 4000 horizontal pixels compared to the (roughly) 2000 of a 1080p stream (weirdly, 1080p, 720p etc. are named for their vertical pixels instead). This means it needs about four times as much bandwidth, which even in 2017 is quite a lot.
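The arithmetic behind that bandwidth claim can be sketched in a few lines of Python. The frame rate and bits-per-pixel figures below are illustrative assumptions (real production formats use, say, 10-bit 4:2:2 sampling and compression), but the four-times ratio between the two resolutions holds regardless:

```python
# Rough uncompressed bandwidth comparison between 1080p and 4K (UHD).
# Assumed figures: 8 bits per colour channel, 3 channels, 25 fps.
def raw_bitrate(width, height, fps=25, bits_per_pixel=24):
    """Raw bits per second for an uncompressed stream."""
    return width * height * fps * bits_per_pixel

hd = raw_bitrate(1920, 1080)    # about 1.2 Gbit/s uncompressed
uhd = raw_bitrate(3840, 2160)   # about 5.0 Gbit/s uncompressed
print(uhd / hd)                 # 4.0: four times the bandwidth
```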

Specialist networking kit and a lot of processing power are required.

Third is the unique requirements of production. We're not just transmitting a finished, pre-prepared video, but all the components from which to make one: multiple cameras, multiple audio feeds, still images, pre-recorded video. Everything you need to create the finished live product. This means that to deliver a final product you might need ten times as much source material, which is well beyond the capabilities of any existing systems.

IP Studio addresses this with a cluster of powerful servers sitting on a very high speed network. It allows engineers to connect together "nodes" to form processing "pipelines" that deliver video suitable for editing. This means capturing video from the existing cameras (over SDI) and transforming the streams into a format which will allow them to be mixed together later.
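The node/pipeline idea can be modelled as a chain of stages, each applying one transformation to whatever flows through it. This is a hypothetical sketch, not IP Studio's actual API; all the names here are invented for illustration:

```python
# Toy model of nodes chained into a processing pipeline.
class Node:
    """One processing stage: wraps a single transformation."""
    def __init__(self, transform):
        self.transform = transform

class Pipeline:
    """Runs a frame through each node in order."""
    def __init__(self, *nodes):
        self.nodes = nodes

    def process(self, frame):
        for node in self.nodes:
            frame = node.transform(frame)
        return frame

# e.g. capture from SDI, then convert into a mixable format
pipeline = Pipeline(
    Node(lambda f: {**f, "captured": True}),
    Node(lambda f: {**f, "format": "mixable"}),
)
result = pipeline.process({"source": "camera-1"})
```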

It's about time

That sounds relatively straightforward, except for one thing: time. When you work with live signals on traditional analogue or point-to-point digital systems, then live means, well, live. There can be transmission delays in the equipment but they tend to be small and stable. A system based on relatively standard hardware and operating systems (IP Studio uses Linux, naturally) is going to have all sorts of variable delays in it, which need to be accommodated.

IP Studio is therefore based on "flows" comprising "grains". Each grain has a quantum of payload (for example a video frame) and timing information. The timing information allows multiple flows to be combined into a final output where everything happens appropriately in synchronisation. This might sound easy but is fiendishly difficult: some flows will arrive later than others, so the system needs to hold back some of them until everything is running to time.

To add to the complexity, we need two versions of each stream: one at 4K and one at a lower resolution.
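A toy model of grains and hold-back might look like the following. The class and field names are invented for illustration, and it simplifies heavily (it assumes grains arrive in timestamp order within each flow, and that timestamps align exactly across flows):

```python
from collections import namedtuple

# A grain: a quantum of payload (e.g. one video frame) plus timing.
Grain = namedtuple("Grain", ["flow_id", "timestamp", "payload"])

class Synchroniser:
    """Holds back grains from early flows until every flow has one."""

    def __init__(self, flow_ids):
        self.buffers = {fid: [] for fid in flow_ids}

    def push(self, grain):
        self.buffers[grain.flow_id].append(grain)

    def pop_synced(self):
        # Only emit when every flow has a grain waiting; otherwise the
        # early flows stay held back and we return None.
        if any(not buf for buf in self.buffers.values()):
            return None
        return [buf.pop(0) for buf in self.buffers.values()]

sync = Synchroniser(["camera-1", "audio-1"])
sync.push(Grain("camera-1", 0, b"frame-0"))
print(sync.pop_synced())    # None: the audio flow hasn't arrived yet
sync.push(Grain("audio-1", 0, b"samples-0"))
synced = sync.pop_synced()  # one grain per flow, same timestamp
```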

Don't forget the browser

Within the video mixer we're building, we need the operator to be able to see their mixing decisions (cutting, fading etc.) happening in front of them in real time. We also need to control the final transmitted live output. There's no way a browser in 2017 is going to show half a dozen 4K streams at once (and it would be a waste to do so).

This means we show lower-resolution 480p streams in the browser, while sending the edit decisions up to the output rendering systems, which process the 4K streams before finally reducing them to 1080p for broadcast.
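To make that split concrete, here is one hypothetical shape such an edit-decision message might take. The field names and JSON encoding are invented for illustration; the real system's protocol may differ entirely:

```python
import json

# Hypothetical edit-decision message sent from the browser up to the
# rendering system; all field names are illustrative.
decision = {
    "action": "cut",            # e.g. "cut" or "fade"
    "source": "camera-2",       # which stream to switch to
    "timestamp": 1496828400.0,  # when the decision takes effect
}
message = json.dumps(decision)
```

The point of the design is that the browser only ever handles the lightweight 480p proxies, while the renderer applies the same decision to the full-resolution 4K originals.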

So we've got half a dozen 4K streams, their 480p equivalents, still images, pre-recorded video and audio, all being moved around in near-real-time on a cluster of commodity equipment from which we'll be delivering live television!


Useful links
