Efficient Bitrate Ladder Construction for Content-Optimized Adaptive Video Streaming

Abstract

One of the challenges faced by many video providers is the heterogeneity of network specifications, user requirements, and content compression performance. The universal solution of a fixed bitrate ladder is inadequate for ensuring a high quality of user experience without re-buffering or annoying compression artifacts. However, a content-tailored solution, based on extensively encoding across all resolutions and over a wide quality range, is highly expensive in terms of computational, financial, and energy costs. Motivated by this, we propose an approach that exploits machine learning to predict a content-optimized bitrate ladder. The method extracts spatio-temporal features from the uncompressed content, trains machine-learning models to predict the Pareto front parameters, and, based on these predictions, builds the ladder within a defined bitrate range. This significantly reduces the number of encodes required per sequence. The presented results, based on 100 HEVC-encoded sequences, demonstrate a reduction in the number of encodes required of 89.06% compared to an exhaustive search and 61.46% compared to an interpolation-based method, at the cost of an average Bjøntegaard Delta Rate difference of 1.78% relative to the exhaustive approach. Finally, a hybrid method is introduced that selects either the proposed or the interpolation-based method depending on the sequence features. This results in an overall 83.83% reduction in required encodings at the cost of an average Bjøntegaard Delta Rate difference of 1.26%.
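As a rough illustration of the exhaustive baseline the abstract compares against, the sketch below takes rate-quality points measured at several resolutions, keeps only the Pareto-optimal ones, and picks one point per target bitrate. It is not the proposed method, which predicts the Pareto front parameters from content features instead of measuring every point. The names `Encode`, `pareto_front`, and `build_ladder`, the quality scale, and all sample values are assumptions made for this example.

```python
from typing import NamedTuple


class Encode(NamedTuple):
    """One hypothetical encode: a resolution, its bitrate, and a quality score."""
    resolution: str
    bitrate_kbps: float
    quality: float  # assumed: higher is better


def pareto_front(encodes):
    """Keep encodes that no other encode matches or beats on both bitrate and quality."""
    front = []
    best_quality = float("-inf")
    # Walk in ascending bitrate; on bitrate ties, visit the highest quality first.
    for e in sorted(encodes, key=lambda e: (e.bitrate_kbps, -e.quality)):
        if e.quality > best_quality:
            front.append(e)
            best_quality = e.quality
    return front


def build_ladder(front, target_bitrates_kbps):
    """For each target bitrate, pick the highest-quality front point not exceeding it."""
    ladder = []
    for target in target_bitrates_kbps:
        candidates = [e for e in front if e.bitrate_kbps <= target]
        if candidates:
            # The front is sorted by bitrate with strictly increasing quality,
            # so the last candidate is the best one within the budget.
            ladder.append(candidates[-1])
    return ladder


# Made-up sample measurements, for illustration only.
sample_encodes = [
    Encode("540p", 500, 30.0), Encode("540p", 1000, 33.0), Encode("540p", 2000, 35.0),
    Encode("1080p", 500, 27.0), Encode("1080p", 1000, 32.0),
    Encode("1080p", 2000, 37.0), Encode("1080p", 4000, 40.0),
]
sample_ladder = build_ladder(pareto_front(sample_encodes), [600, 1500, 3000, 5000])
```

In this sketch every point in `sample_encodes` stands for one real encode, which is the cost the abstract describes as prohibitive; a prediction-based approach would aim to obtain the front with far fewer of them.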
