Netflix and the Art of Content-Adaptive Encoding
When Netflix speaks, everyone in the streaming video industry tends to listen. In a recent blog post titled “Per-Title Encode Optimization,” Netflix announced that it is re-encoding its entire collection of videos using bitrates appropriate to the content of each particular video.
This is a laudable step, but it is not the whole story in encoding optimization. Read on to see what’s going on under the hood, and why there is still a need to optimize videos on a frame-by-frame basis rather than with a per-title formula.
Following industry standards, Netflix had previously encoded its videos according to a fixed bitrate ladder, in which the encoding bitrate was determined strictly by frame resolution, not by content. A fixed bitrate ladder typically produces good-quality encodings, but Netflix noted what EuclidIQ had discovered in its own fundamental research into video quality and content optimization: a large video collection (such as Netflix’s video library) contains a “very high diversity of signal characteristics.” As a result, the bitrate from the fixed ladder was sometimes too high (wasting storage and transmission bits) and sometimes too low (producing poor-quality encodings).
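To make the idea of a fixed bitrate ladder concrete, here is a minimal sketch. The resolutions and bitrates are hypothetical values chosen for illustration, not Netflix's actual ladder, and `fixed_bitrate_kbps` is an illustrative name.

```python
# Illustrative fixed bitrate ladder: bitrate depends only on frame resolution.
# The rungs below are hypothetical values, not any provider's real ladder.
FIXED_LADDER_KBPS = {
    (640, 360): 750,
    (1280, 720): 3000,
    (1920, 1080): 5800,
}


def fixed_bitrate_kbps(width, height):
    """Return the bitrate for a resolution, ignoring the content entirely."""
    return FIXED_LADDER_KBPS[(width, height)]


# A static cartoon and a fast action film get the same bitrate at 1080p,
# because nothing about the content enters the lookup.
cartoon_kbps = fixed_bitrate_kbps(1920, 1080)
action_kbps = fixed_bitrate_kbps(1920, 1080)
```

Because the lookup takes only width and height, the same bitrate can be too generous for simple content and too tight for complex content.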
Netflix’s solution, as outlined in its blog post, is to determine – for each video in its collection – a unique bitrate ladder “tailored to its specific complexity characteristics.” One could perform such a process manually, but to handle the thousands of videos in its collection, Netflix will need an automated process that uses a reliable measure of video quality to determine the best bitrate ladder for a given video.
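One simple way such an automated selection could work is sketched below. This is an assumption for illustration, not Netflix's published procedure: the function `pick_bitrate`, the trial bitrates, and the quality scores (on an assumed 0 to 100 scale) are all hypothetical.

```python
def pick_bitrate(trials, target_quality):
    """Return the lowest trial bitrate (kbps) whose score meets the target.

    trials: list of (bitrate_kbps, quality_score) pairs from test encodes.
    Returns None if no trial reaches the target.
    """
    passing = [kbps for kbps, score in trials if score >= target_quality]
    return min(passing) if passing else None


# Hypothetical quality scores for two titles at the same resolution.
simple_title = [(1000, 88), (2000, 94), (4000, 97)]
complex_title = [(1000, 62), (2000, 78), (4000, 91)]

simple_kbps = pick_bitrate(simple_title, target_quality=90)
complex_kbps = pick_bitrate(complex_title, target_quality=90)
```

With these made-up numbers, the simple title reaches the target at a lower bitrate than the complex one, which is the kind of per-title difference a fixed ladder cannot capture.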
In the blog post, Netflix mentioned that it has collaborated with academic researchers to craft a perceptual quality metric called video multi-method assessment fusion (VMAF) and hinted that the VMAF metric will be involved in its “per-title encode optimization” (PTEO) process.
The fact that Netflix is considering a PTEO process is validation of EuclidIQ’s fundamental approach to video optimization, which we call content-adaptive encoding. Others call it content-aware encoding, but the premise is the same: video encoding is adapted based on video content to improve compression performance.
PTEO adapts encoding bitrate and frame resolution to video content as a preprocessing step, prior to encoding. But the challenge with preprocessing techniques, as detailed in our previous blog post, is that they tend to make general assumptions about the video data that can be inaccurate for some or all parts of the video.
An example would be deriving one encoding bitrate for a two-hour movie and applying it both to a talking-heads scene and to an action scene in that movie. With PTEO, a single encoding bitrate is unlikely to be appropriate for every scene, even if that bitrate is adapted to the movie’s overall content.
In other words, not only is there a “high diversity of signal characteristics” across movies within Netflix’s library, there is likely a high diversity of signal characteristics within the same movie. Thus, a PTEO-adjusted encoding bitrate may still be too high in some parts of a video (resulting in wasted bits) or too low in others (resulting in poor quality).
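The mismatch described above can be illustrated with a small sketch. The scene names, the per-scene bitrate needs, and the function `compare_to_single_bitrate` are hypothetical and serve only to show the arithmetic.

```python
def compare_to_single_bitrate(scene_needs_kbps, title_kbps):
    """Split scenes into over-served and under-served for one title bitrate.

    scene_needs_kbps: mapping of scene name -> bitrate the scene needs
    (hypothetical figures). Returns (wasted, shortfall) dicts in kbps.
    """
    wasted = {}
    shortfall = {}
    for scene, need in scene_needs_kbps.items():
        if title_kbps > need:
            wasted[scene] = title_kbps - need      # bits spent for no gain
        elif title_kbps < need:
            shortfall[scene] = need - title_kbps   # quality likely suffers
    return wasted, shortfall


# One per-title bitrate applied to three very different scenes.
scenes = {"talking heads": 1500, "car chase": 6000, "landscape pan": 3000}
wasted, shortfall = compare_to_single_bitrate(scenes, title_kbps=3000)
```

In this made-up case, the single bitrate suits only one of the three scenes: the talking-heads scene is over-served and the car chase is under-served.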
We commend Netflix for bringing industry attention to the issues surrounding video quality optimization – and the encoding improvements noted in its blog post indicate that PTEO is a step in the right direction – but we’d like to challenge the industry to think beyond a one-size-fits-all approach to encoding long-form content.
From EuclidIQ’s perspective, the solution to this challenge is to adapt encoding to the content on a per-frame basis. Our IQ264 technology employs perceptual quality optimization to improve H.264 encoding performance from frame to frame. IQ264 operates within the encoder, interfacing with the prediction and quantization steps of the encoding process to improve the perceptual quality of video frames as they are encoded.
IQ264 produces perceptually better encoded frames and perceptually better reference frames for future encoded frames in the video. The Netflix PTEO process, while it does adapt encoding parameters to the video data, does not address perceptual quality on a frame-by-frame basis.
The logical next step in further video optimization is clear: adapt the encoding itself to the video data. This is precisely what IQ264 is designed to do. To find out more about IQ264, contact us at firstname.lastname@example.org.