Tech Talk

29 Sep 2022

Crash Course On Live Streaming Encoders

by Alex Zids, COO of Bettercast

The success or failure of a live stream is often decided by the quality of the stream itself. Of course, the same is functionally true of any video-based production; bad audio, low-resolution footage and mountains of lag are just as likely to kill a Netflix special as a live streamed event. However, with live streaming, it all needs to be right on the night. You can’t go back for a reshoot later.

Fundamental to all of those factors (as well as dozens of others) is the encoder. The process of taking raw footage and converting it into a digital stream that can be broadcast around the world is perhaps the most pivotal step in the whole production; it is effectively the technology that live streaming is built upon. Get something wrong here and failure is almost guaranteed.

What is a live stream encoder?

Given the audience of this magazine, it’s probably fair to assume that the overwhelming majority of people reading this will already know the answer to the question posed by the sub-heading above. However, for the benefit of those just coming into the live streaming industry, we will very briefly summarise it. Or, to be more accurate, Phil Cluff of MUX did in a recent interview with Bettercast CEO Ben Powell on the BetterPODcast:

“We kind of think about it in three phases. Firstly, a stream comes to us live through an RTMP stream; a very old protocol from the Flash video days that has stuck around in spite of its many modern competitors. There, you’re sending up one high-quality stream from your encoder on site.”

“That’s then picked up by our video processing stack. The first thing we do is transcode that. We’re taking that inbound stream and we’re creating a bunch of different qualities of that content with different bitrates and resolutions. We’re doing that because people’s internet connections change all the time and they might not have the same bandwidth at the end of a stream as they did at the start. As an example, I genuinely once lived somewhere where the Wi-Fi router was sat on top of the microwave, so the Wi-Fi would just drop dead whenever someone was making porridge.”

“The next step is packaging. We cut each of the different versions of content into chunks of anything between two and six seconds, always in a regular cadence. So, if it starts as two seconds, it will be two seconds all the way to the end. We then host and deliver those through content delivery networks to ensure reach and quality across large footprints of users. Every request that goes to MUX goes through one or many content delivery networks, which then feed those video chunks into your player.”
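To make those three phases a little more concrete, the following is a minimal sketch of the transcode and packaging steps, driving ffmpeg from Python. It is not how MUX’s stack actually works; the ingest URL, rendition ladder and output names are purely illustrative assumptions.

# A rough sketch of the transcode-and-package phases described above.
# The RTMP ingest URL, rendition ladder and output paths are hypothetical.
import subprocess

INGEST_URL = "rtmp://localhost/live/streamkey"  # hypothetical ingest point

# A small adaptive-bitrate ladder: the one inbound stream is transcoded
# into several qualities so players can switch as viewers' bandwidth changes.
RENDITIONS = [
    {"name": "1080p", "size": "1920x1080", "bitrate": "5000k"},
    {"name": "720p",  "size": "1280x720",  "bitrate": "3000k"},
    {"name": "360p",  "size": "640x360",   "bitrate": "800k"},
]

SEGMENT_SECONDS = 2  # fixed cadence: every chunk is the same length

for r in RENDITIONS:
    # Transcode to this quality, then package it as HLS chunks of a regular
    # two-second duration, ready to be hosted and delivered via a CDN.
    cmd = [
        "ffmpeg", "-i", INGEST_URL,
        "-c:v", "libx264", "-s", r["size"], "-b:v", r["bitrate"],
        "-c:a", "aac",
        "-f", "hls",
        "-hls_time", str(SEGMENT_SECONDS),
        "-hls_playlist_type", "event",
        f"out_{r['name']}.m3u8",
    ]
    subprocess.Popen(cmd)  # each rendition runs as its own process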

Of course, not all encoders are created equal. Firstly, there’s the difference between hardware and software encoders. Beyond that, there are the inherent differences between platforms created with different mentalities and priorities. Even though they’re all encoders and achieve the same ultimate purpose, how they achieve it is what separates them and can spell either simple success or devastating failure for a live stream.

Buffering…

Perhaps the most important part of the encoding process is chunking. The size of each chunk determines how much latency there is in the stream – basically, how far behind live events the footage being shown on the stream is. It also affects buffering – the time the player needs to download and queue up the next chunks before it can keep playing.

“Generally, a player is going to have a few chunks in buffer at any one time,” explained Phil. “Apple’s iOS will have three, so if your chunk size is two seconds, iOS natively adds six seconds on top of that as the minimum buffer. It’s tuneable in a lot of players, though not iOS specifically. If you have a bigger chunk size, you’re going to have more latency.”
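As a rough worked example (ignoring encoding, network and CDN delays entirely), the minimum latency added by the player’s buffer is simply the chunk length multiplied by the number of chunks it holds:

# Back-of-the-envelope latency from the player buffer alone; the figures
# below mirror Phil's iOS example of three buffered two-second chunks.
def min_buffer_latency(chunk_seconds: float, chunks_buffered: int) -> float:
    return chunk_seconds * chunks_buffered

print(min_buffer_latency(2.0, 3))  # 6.0 seconds, the iOS minimum Phil describes
print(min_buffer_latency(6.0, 3))  # 18.0 seconds once chunks grow to six seconds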

So, while longer chunks will certainly reduce buffering, as there is more time for the next chunk to download and be queued up in the player, they also create significantly more latency. This point was vividly illustrated in a previous article about WebRTC with the example of watching a major football match and reading about each goal scored on Twitter long before it has been shown on the stream. There are solutions to these issues, but that’s a whole article in its own right.

Encoders in Bettercast

We could compare the many, many encoders available and try to come to a conclusion about which is best. However, aside from the fact that this article would take about an hour to read, our conclusions might not be the same as yours. A good encoder for one type of event might not work so well for another. Instead, we can explain Bettercast’s choice.

“We built our very first version of Bettercast on IVS, purely because it was incredibly simple and relatively low-cost,” said Bettercast CEO Ben Powell in the same podcast that featured Phil Cluff. IVS is Amazon’s Interactive Video Service, which is a capable and widely used platform.

“We have moved to MUX and I’m really happy to have done that, not only because of the simplicity of what you guys are doing, but for the level of data feedback we get from your systems,” he added.

Phil, who is the Product Lead at MUX, listed off some of the many ways MUX and IVS differ. In fairness, he praised the flexibility of the Amazon platform, as well as its integration with AWS Media Services and other aspects of Amazon Web Services. However, MUX went for a plug-and-play approach.

“Compared to IVS, we see ourselves as providing much more of an ecosystem. Any live stream you do in MUX is instantly available as an on-demand asset, for example. You don’t have to download it from S3 and process it. It’s all just in that ecosystem.”

One of the key differences – and arguably the main reason why Bettercast shifted to MUX – is the data feedback Ben mentioned above.

“The first product we went and built was and still is a data platform,” said Phil. “This is a platform to understand how well your video is performing. That’s used by some of the biggest streaming platforms in the world. It’s even been used on the Super Bowl for the last three years.”

“It helps people to understand the experience viewers are having when they watch a video: are they experiencing buffering? How long does the video take to start? Is the video low quality? Measuring all these sorts of things isn’t common in video applications, so the first product that we built was one to enable people to monitor their video applications. That now informs everything we build. We use that data to make decisions. Data is critical to the feedback loop of building better video.”
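As a hedged illustration of the kind of calculation such a platform might run – not MUX Data’s actual implementation, and with an entirely hypothetical event format – startup time and rebuffering ratio can be derived from timestamped player events:

# A sketch of the sort of playback metrics a data platform might derive
# from raw player events. The event format here is hypothetical.
from typing import Dict, List

def playback_metrics(events: List[Dict]) -> Dict[str, float]:
    """Derive startup time and rebuffering ratio from timestamped events."""
    start = next(e["t"] for e in events if e["type"] == "play_requested")
    first_frame = next(e["t"] for e in events if e["type"] == "playing")
    stalled = sum(e["duration"] for e in events if e["type"] == "rebuffer")
    watched = events[-1]["t"] - first_frame
    return {
        "startup_seconds": first_frame - start,
        "rebuffer_ratio": stalled / watched if watched else 0.0,
    }

# Example session: 1.2 s to first frame, one 3-second stall in 10 minutes.
session = [
    {"type": "play_requested", "t": 0.0},
    {"type": "playing", "t": 1.2},
    {"type": "rebuffer", "t": 200.0, "duration": 3.0},
    {"type": "ended", "t": 601.2},
]
print(playback_metrics(session))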
