
    Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders

    The next-generation Versatile Video Coding (VVC) standard introduces a new Multi-Type Tree (MTT) block partitioning structure that supports Binary-Tree (BT) and Ternary-Tree (TT) splits in both vertical and horizontal directions. This new approach leads to five possible splits at each block depth and thereby improves the coding efficiency of VVC over that of the preceding High Efficiency Video Coding (HEVC) standard, which only supports Quad-Tree (QT) partitioning with a single split per block depth. However, MTT has also brought a considerable increase in encoder computational complexity. In this paper, a two-stage learning-based technique is proposed to tackle the complexity overhead of MTT in VVC intra encoders. In our scheme, the input block is first processed by a Convolutional Neural Network (CNN) to predict its spatial features through a vector of probabilities describing the partition at each 4x4 edge. Subsequently, a Decision Tree (DT) model leverages this vector of spatial features to predict the most likely splits for each block. Finally, based on this prediction, only the N most likely splits are evaluated by the Rate-Distortion (RD) process of the encoder. In order to train our CNN and DT models on a wide range of image contents, we also propose a public VVC frame partitioning dataset based on an existing image dataset encoded with the VVC reference software encoder. Our proposal relying on the top-3 configuration reaches 46.6% complexity reduction for a negligible bitrate increase of 0.86%. A top-2 configuration enables a higher complexity reduction of 69.8% for a 2.57% bitrate loss. These results demonstrate a better trade-off between VTM intra coding efficiency and complexity reduction compared to state-of-the-art solutions.
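
    A minimal Python sketch of the top-N pre-selection step described above, assuming hypothetical cnn_model and dt_model objects with scikit-learn-style predict/predict_proba interfaces; it illustrates the idea only and is not the authors' implementation:

        import numpy as np

        # The six possible outcomes at a block: no split, quad-tree, and the
        # four MTT splits (binary/ternary, horizontal/vertical).
        SPLIT_MODES = ["no_split", "qt", "bt_h", "bt_v", "tt_h", "tt_v"]

        def candidate_splits(block, cnn_model, dt_model, n=3):
            """Return the N most likely split modes to forward to the RD search."""
            # Stage 1: the CNN outputs a vector of probabilities describing the
            # partition at each 4x4 edge of the input block.
            edge_probs = cnn_model.predict(block[np.newaxis, ...])[0]

            # Stage 2: the decision tree maps these spatial features to a
            # probability distribution over the split modes of this block.
            split_probs = dt_model.predict_proba(edge_probs.reshape(1, -1))[0]

            # Keep only the N most likely splits; only these candidates are
            # evaluated by the encoder's rate-distortion process.
            best = np.argsort(split_probs)[::-1][:n]
            return [SPLIT_MODES[i] for i in best]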

    CTU Depth Decision Algorithms for HEVC: A Survey

    High Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools, at the cost of increased encoding time complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with a predetermined size of up to 64x64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also increases the time complexity due to the larger number of ways to find the optimal partitioning. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches during CTU partitioning by exploiting the correlation within the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the information available within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods such as support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
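
    As an illustration of the "neighboring" statistics approaches surveyed here, the following Python sketch limits the depth search range of the current CTU using the depths of its already-coded left and above neighbors; the +/-1 margin and the defaults are assumptions, not taken from any particular method:

        def depth_range_from_neighbors(left_depth, above_depth, max_depth=3):
            """Limit the CU depth range searched for the current CTU."""
            depths = [d for d in (left_depth, above_depth) if d is not None]
            if not depths:
                # No coded neighbors (e.g. first CTU of a frame): search all depths.
                return 0, max_depth
            # Assume the current CTU's best depth lies close to its neighbors'.
            lo = max(0, min(depths) - 1)
            hi = min(max_depth, max(depths) + 1)
            return lo, hi

        # Example: if both neighbors were coded at depth 2, only depths 1..3 are
        # searched instead of the full 0..3 range.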

    Fast Intra-frame Coding Algorithm for HEVC Based on TCM and Machine Learning

    High Efficiency Video Coding (HEVC) is the latest video coding standard. Compared with the previous standard, H.264/AVC, it can reduce the bit-rate by around 50% while maintaining the same perceptual quality. This compression gain is achieved mainly by supporting larger Coding Unit (CU) sizes and more prediction modes. However, since the encoder needs to traverse all possible choices to find the best way of encoding the data, this large flexibility in block size and prediction modes causes a tremendous increase in encoding time. In HEVC, intra-frame coding is an important basis, and it is used in all configurations. Therefore, fast algorithms are required to alleviate the computational complexity of HEVC intra-frame coding. In this thesis, a fast intra-frame coding algorithm based on machine learning is proposed to predict CU decisions, so that the computational complexity can be significantly reduced with negligible loss in coding efficiency. Machine learning models such as the Bayes decision rule and the Support Vector Machine (SVM) are used as decision makers, while the Laplacian Transparent Composite Model (LPTCM) is selected as the feature extraction tool. In the main version of the proposed algorithm, a set of features named Summation of Binarized Outlier Coefficients (SBOC) is extracted to train the SVM models. An online training structure and a performance control method are introduced to enhance the robustness of the decision makers. When applied to the All Intra Main (AIM) full test and compared with HM 16.3, the main version of the proposed algorithm achieves, on average, a 48% time reduction with a 0.78% BD-rate increase. By adjusting parameter settings, the algorithm can change the trade-off between encoding time and coding efficiency, generating a performance curve that meets different requirements. When different methods are tested on the same machine, the proposed method outperforms all CU-decision-based HEVC fast intra-frame algorithms in the benchmarks.
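
    The sketch below illustrates the kind of SBOC-style feature and SVM decision described in this abstract: DCT coefficients whose magnitude exceeds an outlier threshold are binarized and summed over 4x4 sub-blocks, and a linear SVM then makes the CU split decision. The threshold, feature layout, and model setup are assumptions for illustration, not the thesis's actual design:

        import numpy as np
        from scipy.fft import dctn
        from sklearn.svm import SVC

        def sboc_features(cu, outlier_threshold=64.0):
            """Summation of binarized outlier DCT coefficients over 4x4 sub-blocks."""
            counts = []
            for y in range(0, cu.shape[0], 4):
                for x in range(0, cu.shape[1], 4):
                    coeffs = dctn(cu[y:y + 4, x:x + 4].astype(np.float64), norm="ortho")
                    # Binarize: count coefficients whose magnitude is an "outlier".
                    counts.append(int(np.sum(np.abs(coeffs) > outlier_threshold)))
            return np.array([np.sum(counts), np.max(counts)], dtype=np.float64)

        # An SVM trained (offline or online) on such features then predicts the
        # split / non-split decision for each CU size:
        #   clf = SVC(kernel="linear").fit(train_features, train_labels)
        #   split = bool(clf.predict(sboc_features(cu_pixels).reshape(1, -1))[0])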

    SVM based approach for complexity control of HEVC intra coding

    High Efficiency Video Coding (HEVC) has been adopted by various video applications in recent years. Because of its high computational demand, controlling the complexity of HEVC is of paramount importance for meeting the varying requirements of many applications, including power-constrained video coding, video streaming, and cloud gaming. Most existing complexity control methods are only capable of considering a subset of the decision space, which leads to low coding efficiency. While machine learning methods such as Support Vector Machines (SVM) can be employed for higher-precision decision making, the current SVM-based techniques for HEVC provide a fixed decision boundary, which results in different coding complexities for different video content. Although this might be suitable for complexity reduction, it is not acceptable for complexity control. This paper proposes an adjustable classification approach for Coding Unit (CU) partitioning, which addresses the mentioned problems of complexity control. First, a novel set of features for fast CU partitioning is designed using image processing techniques. Then, a flexible classification method based on SVM is proposed to model the CU partitioning problem. This approach allows the performance-complexity trade-off to be adjusted even after the training phase. Using this model and a novel adaptive thresholding technique, an algorithm is presented that delivers video encoding within the target coding complexity while maximizing coding efficiency. Experimental results demonstrate the superiority of this method over the state-of-the-art methods, with target complexities ranging from 20% to 100%.
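
    A hedged sketch of the adjustable decision boundary idea: instead of the SVM's default zero threshold for the split / non-split decision, a bias shifts the boundary, and a simple feedback rule adapts that bias so the measured coding complexity tracks the target. The update rule and step size are illustrative assumptions, not the paper's adaptive thresholding technique:

        def decide_split(svm, features, bias):
            """Split the CU only if the shifted SVM score favours splitting."""
            score = svm.decision_function(features.reshape(1, -1))[0]
            return (score + bias) > 0.0

        def update_bias(bias, measured_complexity, target_complexity, step=0.05):
            """Nudge the boundary so complexity converges towards the target."""
            if measured_complexity > target_complexity:
                return bias - step   # over budget: split fewer CUs
            return bias + step       # under budget: allow more splits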

    Towards one video encoder per individual: guided High Efficiency Video Coding


    Optimization of the compression efficiency and performance of the Kvazaar HEVC video encoder

    Growing video resolutions have led to an increasing volume of Internet video traffic, which has created a need for more efficient video compression. New video coding standards, such as High Efficiency Video Coding (HEVC), enable a higher level of compression, but the complexity of the corresponding encoder implementations is also higher. Therefore, encoders that are efficient in terms of both compression and complexity are required. In this work, we implement four optimizations in the Kvazaar HEVC encoder: 1) uniform inter and intra cost comparison; 2) a concurrency-oriented SAO implementation; 3) resolution-adaptive thread allocation; and 4) fast cost estimation of coding coefficients. Optimization 1 changes the selection criterion of the prediction mode in fast configurations, which greatly improves coding efficiency. Optimization 2 replaces the implementation of one of the in-loop filters with one that better supports concurrent processing; this removes some dependencies between encoding tasks and provides more opportunities for parallel processing to increase coding speed. Optimization 3 reduces the overhead of thread management by spawning fewer threads when there is not enough work for all available threads. Optimization 4 speeds up the computation of residual coefficient coding costs by switching to a faster but less accurate estimation. The impact of the optimizations is measured with two coding configurations of Kvazaar: the ultrafast preset, which aims for the fastest coding speed, and the veryslow preset, which aims for the best coding efficiency. Together, the introduced optimizations give a 2.8× speedup in the ultrafast configuration and a 3.4× speedup in the veryslow configuration. The trade-off for the speedup with the veryslow preset is a 0.15% bit-rate increase, while with the ultrafast preset the optimizations also improve coding efficiency by 14.39%.
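
    A rough Python sketch of optimization 3 (resolution-adaptive thread allocation): the encoder spawns only as many worker threads as the frame can keep busy instead of one per available core. Treating one 64-pixel CTU row as the unit of parallel work is an assumption for illustration, not Kvazaar's actual scheduling:

        import os

        def threads_for_resolution(height, ctu_size=64, max_threads=None):
            """Pick a thread count bounded by both the CPU and the frame size."""
            if max_threads is None:
                max_threads = os.cpu_count() or 1
            # Number of CTU rows that can, at best, be processed in parallel.
            ctu_rows = (height + ctu_size - 1) // ctu_size
            return max(1, min(max_threads, ctu_rows))

        # Example: a 720-pixel-high frame has 12 CTU rows, so on a 16-core machine
        # only 12 threads are spawned, reducing thread-management overhead.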