How to Improve Your Video Performance

Video performance optimization is hard. Here are 7 patterns that will help.

Video performance optimization, namely maximizing the quality of your users’ video experience on web and mobile, is hard. It is impacted by a myriad of variables: geography, devices, encoding settings, network connectivity, and implementation quirks of various platforms and libraries. There’s no silver bullet to make your videos faster. Any solution has to be contextualized to your use case. There’s some detective work involved, but at least we’ll help you get started with this step-by-step guide.

I work at Apollo 350, and we specialize in deploying video solutions for our clients. We have, in some shape or form, helped the majority of our clients solve problems around video performance.

More often than not in video, the challenge isn’t so much applying fixes as it is accurately identifying the problems in the first place. We’ve seen it all, and we’ve noticed patterns. Over time, we’ve developed a comprehensive, step-by-step playbook for assessing video performance.

The goal is to turn:

“We think our users are experiencing issues with video.”

Into:

“Since we launched internationally, the bulk of our users have been consuming 1.5 fewer videos per session, likely due to a 150% increase in initial load time, and a 300% increase in buffer events. These, in turn, are caused by an average download time increase of 1 second per segment, likely due to a 20% decrease in cache hit rate and file sizes 2-3x the industry standard.”

Once you have a detailed assessment, fixing the problems is often trivial, or at least very tractable. For this reason, we take this initial diagnostic very seriously. And while tools like Mux are great for getting a high-level picture of how you’re performing on different platforms, we find that you never have the full picture unless you, at least to some degree, measure it yourself.

Here are 7 battle-tested steps we’ve taken time and time again to pinpoint and solve video performance issues, both for ourselves and our clients. This guide is geared towards Video On Demand, but most of the lessons here can be applied to Live Video and RTC as well.

1. Use an Adaptive Streaming Format

Adaptive streaming gives you the ability to switch renditions in response to variations in users’ network conditions, provides a solution for live streaming, and works on a variety of platforms. The two most common formats are HLS (HTTP Live Streaming) and DASH (Dynamic Adaptive Streaming over HTTP). Here is an overview of each, with a minimal playback setup sketched after the list:

  • HLS: The most common format; works on all devices; supported natively on mobile and in Safari; controlled by Apple
  • DASH: An open standard; not supported on mobile Safari
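
To make this concrete, here’s a minimal sketch of a web playback setup that prefers native HLS and falls back to an MSE-based player. It assumes the open-source hls.js library; the manifest URL is a placeholder.

```typescript
// Minimal player bootstrap: use native HLS where available (Safari / iOS),
// fall back to hls.js (MSE-based) everywhere else.
import Hls from "hls.js";

const MANIFEST_URL = "https://cdn.example.com/video/master.m3u8"; // placeholder

function attachStream(video: HTMLVideoElement): void {
  if (video.canPlayType("application/vnd.apple.mpegurl")) {
    // Native HLS support: just point the element at the manifest.
    video.src = MANIFEST_URL;
  } else if (Hls.isSupported()) {
    // MSE-based playback via hls.js for Chrome, Firefox, Edge, etc.
    const hls = new Hls();
    hls.loadSource(MANIFEST_URL);
    hls.attachMedia(video);
  } else {
    console.error("No HLS playback path available in this browser.");
  }
}

attachStream(document.querySelector("video")!);
```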

2. Know What You’re Measuring

This might seem so obvious that it shouldn’t even count as a step, but it’s actually the most important, and in some ways the trickiest. Here are some guidelines:

  • Measure percentiles, not averages. Everywhere, but especially in video, outliers can really throw off your data. Local caching can paint an unrealistically good picture, while a few long downloads can skew things negatively. For most measurements, we find that the 95th percentile gives us a solid sense of the worst experience the vast majority of users will see (see the sketch after this list).
  • Know what you mean by “initial load time”. Forget video download time. All that matters is when a user believes a video should start playing compared to when it actually starts playing. Typically this is going to be (a) at page load, (b) upon clicking play, or (c) upon scrolling the video into view.
  • Not all buffering is the same. Users can tolerate buffering as long as it’s expected. Make sure you are able to tell the difference between spontaneous buffering (which is bad) and buffering as a result of a user interaction like seeking (which is usually forgivable).
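
As a quick illustration, here’s a nearest-rank percentile helper; the sample numbers are made up, but they show how a couple of cache-miss outliers distort an average while p50 and p95 each stay meaningful.

```typescript
// Nearest-rank percentile over collected timing samples (e.g. load times in ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Two slow outliers pull the average to ~2150 ms, which describes nobody.
const loadTimesMs = [420, 380, 9500, 510, 460, 700, 390, 8200, 450, 530];
console.log(`p50: ${percentile(loadTimesMs, 50)} ms`); // typical experience
console.log(`p95: ${percentile(loadTimesMs, 95)} ms`); // near-worst experience
```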

3. Measure It Yourself, Fix, and Measure Again

Most video players and plugins these days give you fairly granular access to player events or hooks. We’ve had success creating a performance tracking service on both web and native mobile that monitors a cascade of loading events and how long each one took. After collecting this data, we can identify the bottlenecks in the system, make the necessary changes, and verify through data that the changes worked. Here’s what we specifically track to triage the initial video start time (a web instrumentation sketch follows the list):

  1. Player Mount: The moment the component or view that contains the player is instantiated.
  2. Play Start Intent: The moment we know the user’s intent is to watch the video. If the video is supposed to autoplay, this could be when the page loads. In a manual play scenario, it’s when the user clicks or taps play. Depending on the implementation, this could happen before Player Mount.
  3. Manifest Downloaded: Both DASH and HLS have a concept of manifests (a small, ~1 KB file with video metadata). This should be downloaded and parsed within 100ms on all but the slowest networks.
  4. First Segment Downloaded: This is the most common bottleneck. It shouldn’t take more than 1s in most cases. If it’s much longer than that, either (a) your default bitrate files are too big or (b) your video segments are missing the CDN cache.
  5. Playback Begins: This should be very quick (~100ms) after the first segment downloads. If it’s longer, check that (a) your player isn’t waiting to download additional segments before playing, and (b) the player itself isn’t taking too long to process the video data (this can happen with non-native HLS or DASH players, and may mean it’s worth looking into a different player vendor).

There are subtle differences between every video implementation. Some players need to fetch additional metadata, or render ads. Sometimes the steps that impact playback the most, like an ad call timing out, aren’t even related to the core video content.

4. Limit Your Streams’ Sizes

Modern transcoding tools like AWS’s MediaConvert give you turn-key options for configuring encode jobs, but don’t necessarily provide you with the most cost-effective options by default. 

To audit your outputs, try the following steps:

  1. Take note of all of your active encoding outputs. For adaptive formats like DASH and HLS, there are likely several.
  2. Pick a single segment of video, and download it at each output bitrate.
  3. Set up VMAF with FFmpeg. This guide has some information to get started.
  4. Run VMAF once for each output, comparing it against the highest-quality output. VMAF will give you a score from 0 to 100 indicating how close the two videos are perceptually (more on this below).
  5. Working from the lowest quality up, consider removing any output that scores less than 5 points above the previous output you kept. Typically, the video quality difference between those outputs will be barely perceptible.

For example, if we had 8 outputs whose VMAF scores were 98.6, 97.1, 96.9, 96.0, 90.1, 89.9, 84.3, and 66.2, we would only keep the ones scoring 96.0, 89.9, 84.3, and 66.2. It’s at the higher qualities that you can prune a lot of bandwidth with minimal quality loss; a scripted version of this scoring-and-pruning pass is sketched below.
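
Here’s a Node/TypeScript sketch of steps 3–5. It assumes an ffmpeg build with libvmaf enabled; the file names are placeholders, and the JSON log layout can vary between libvmaf versions.

```typescript
import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";

// Score a rendition against the highest-bitrate output using FFmpeg's libvmaf
// filter. The first input is the distorted file, the second is the reference.
function vmafScore(distorted: string, reference: string): number {
  execFileSync("ffmpeg", [
    "-i", distorted, "-i", reference,
    "-lavfi", "libvmaf=log_fmt=json:log_path=vmaf.json",
    "-f", "null", "-",
  ]);
  // pooled_metrics.vmaf.mean is where recent libvmaf builds put the summary;
  // older builds may use a different layout.
  return JSON.parse(readFileSync("vmaf.json", "utf8")).pooled_metrics.vmaf.mean;
}

// Greedy prune from the lowest score upward: keep an output only when it is at
// least 5 VMAF points above the last output we kept.
function prune(scores: number[]): number[] {
  const kept: number[] = [];
  for (const s of [...scores].sort((a, b) => a - b)) {
    if (kept.length === 0 || s - kept[kept.length - 1] >= 5) kept.push(s);
  }
  return kept;
}

console.log(prune([98.6, 97.1, 96.9, 96.0, 90.1, 89.9, 84.3, 66.2]));
// -> [66.2, 84.3, 89.9, 96.0], matching the example above
```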

We discovered that a recent client was serving 1080p video at several different bitrates, with video segment file sizes ranging from 5.8 MB to 15 MB. What was striking was that the 5.8 MB segment was only 2% lower in quality than the 15 MB one, and yet on some devices took 5 seconds less to download!

What do we mean by 2% lower quality? We mean that, if a large sampling of randomly selected viewers were to rate the video quality out of 100, on average the rating for the smaller video would be 2% lower. The ML models underlying VMAF were trained on a large variety of samples with actual people rating videos of varying quality.


[Sample frames from the same segment at three bitrates: 8,500 kbps, 13.7 MB, VMAF 98.8; 6,000 kbps, 9.6 MB, VMAF 96.5; 3,500 kbps, 5.9 MB, VMAF 90.1]

The samples above represent a more common, less extreme example, where above a certain threshold, a linear increase in bitrate does not correspond to a linear increase in quality.

5. Validate Your Streams

There are tools that can analyze your streams and point out common problems. This is an easy step that goes a long way. For instance, use Apple’s free MediaStreamValidator to ensure your manifests are formatted correctly. Below are common warnings you may see, all of which are actionable (a quick spot-check sketch follows the list):

  • The server MUST deliver playlists using gzip content‑encoding
  • You MUST include the AVERAGE‑BANDWIDTH attribute
  • Master playlists that are delivered over cellular networks MUST contain a variant whose peak BANDWIDTH is less than or equal to 192kb/s
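
As a complement to a full validator run, two of these warnings are easy to spot-check over plain HTTP. Here’s a minimal sketch (placeholder URL; Node 18+ for the global fetch):

```typescript
const MASTER_URL = "https://cdn.example.com/video/master.m3u8"; // placeholder

async function spotCheck(url: string): Promise<void> {
  const res = await fetch(url, { headers: { "accept-encoding": "gzip" } });
  // Warning 1: playlists should be delivered with gzip content-encoding.
  // Note: some HTTP clients consume this header during transparent decompression.
  console.log("content-encoding:", res.headers.get("content-encoding") ?? "none");

  const body = await res.text();
  // Warning 2: every variant line should carry the AVERAGE-BANDWIDTH attribute.
  const variants = body.split("\n").filter((l) => l.startsWith("#EXT-X-STREAM-INF"));
  const missing = variants.filter((l) => !l.includes("AVERAGE-BANDWIDTH"));
  console.log(`${missing.length} of ${variants.length} variants missing AVERAGE-BANDWIDTH`);
}

spotCheck(MASTER_URL).catch(console.error);
```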

6. Audit Your Edge Caching

In both VOD and Live Streaming, you should expect a cache hit rate of >90% for any actual video segment data (.ts or .mp4 files); a quick way to sample this is sketched after the list. If your hit rate is lower, you should:

  • Ensure your default TTL is high enough that segments stay cached between requests
  • Ensure you are not forwarding query params or cookies to the origin, as these become part of the cache key and can result in cache misses
  • If need be, create a separate cache behavior specifically for segment files so that you can cache them more aggressively
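
Here’s a sampling sketch, assuming CloudFront’s x-cache header (“Hit from cloudfront” / “Miss from cloudfront”); other CDNs expose similar information under different header names, and the segment URLs are placeholders.

```typescript
const SEGMENT_URLS = [
  "https://cdn.example.com/video/seg_0001.ts", // placeholders: sample a spread
  "https://cdn.example.com/video/seg_0002.ts", // of real segment URLs instead
  "https://cdn.example.com/video/seg_0003.ts",
];

async function sampleHitRate(urls: string[]): Promise<number> {
  let hits = 0;
  for (const url of urls) {
    // HEAD avoids downloading the segment; use GET if your CDN caches them differently.
    const res = await fetch(url, { method: "HEAD" });
    if ((res.headers.get("x-cache") ?? "").toLowerCase().includes("hit")) hits++;
  }
  return hits / urls.length;
}

sampleHitRate(SEGMENT_URLS).then((r) =>
  console.log(`cache hit rate: ${(r * 100).toFixed(0)}%`)
);
```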

In addition, a common pitfall occurs when clients move into an international market but do not extend their edge locations beyond the US. With most CDNs, broader edge coverage is a premium option you can choose (CloudFront calls this Price Class).

7. Stress Test

Auditing your CDN will give you an average picture of your cache efficiency, but it’s a good idea to also test a worst-case scenario. Here’s how we’ve typically done that (a minimal segment-pulling sketch follows the list):

  1. If you don’t already have one, set up a QA environment where you can host a stream alongside your production storage. Set up a clone of your production CDN distribution as well.
  2. Copy a video with all of its assets to this environment.
  3. Set up a load testing tool. We’ve had a lot of luck with Locust, but there are many others.
  4. (optional) Provision some servers from which to run your load test. Locust supports workers that run on separate machines, which lets you run heavier tests.
  5. Install a tool specifically meant for testing video streams. These can read manifest files and download the respective segments as though they’re video players. We’re big fans of this one.
  6. Run it! Locust gives you an easy-to-use UI for starting and stopping a stress test. We’ve settled on ~2000 total users and ~500 simultaneous requests as a good starting point.
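
Locust itself is Python, so to stay consistent with the other sketches here, below is a hypothetical Node/TypeScript driver showing the core idea: each virtual user fetches the media playlist and then pulls its segments the way a simple player would. The URL and user count are placeholders to tune for your own test.

```typescript
// Run with Node 18+ as an ES module (top-level await, global fetch).
const PLAYLIST_URL = "https://qa-cdn.example.com/video/stream.m3u8"; // placeholder
const VIRTUAL_USERS = 100; // scale toward your target

async function playOnce(): Promise<void> {
  // Fetch the media playlist and extract segment URIs (non-comment lines).
  const playlist = await (await fetch(PLAYLIST_URL)).text();
  const segments = playlist
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l !== "" && !l.startsWith("#"))
    .map((l) => new URL(l, PLAYLIST_URL).toString()); // resolve relative URIs

  // Download each segment in order, like a player filling its buffer.
  for (const seg of segments) {
    const t0 = performance.now();
    await fetch(seg).then((r) => r.arrayBuffer());
    console.log(`${seg}: ${(performance.now() - t0).toFixed(0)} ms`);
  }
}

// Launch all virtual users at once; a real tool adds ramp-up and statistics.
await Promise.all(Array.from({ length: VIRTUAL_USERS }, playOnce));
```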

Conclusion

These 7 tips will get you a long way, but they’re also just the beginning. Components like ad libraries and peer-to-peer chat come with their own challenges. But regardless of a client’s specific needs, we always start with the steps presented here.