Search CORE

2 research outputs found

K-means based clustering and context quantization

Author: Xu Mantao
Publication venue: University of Joensuu
Publication date
Field of study

Efficient Coding of Transform Coefficient Levels in Hybrid Video Coding

Author: Nguyen Tung
Publication venue
Publication date: 01/01/2023
Field of study

All video coding standards of practical importance, such as Advanced Video Coding (AVC), its successor High Efficiency Video Coding (HEVC), and the state-of-the-art Versatile Video Coding (VVC), follow the basic principle of block-based hybrid video coding. In such an architecture, the video pictures are partitioned into blocks. Each block is first predicted by either intra-picture or motion-compensated prediction, and the resulting prediction errors, referred to as residuals, are compressed using transform coding. This thesis deals with the entropy coding of quantization indices for transform coefficients, also referred to as transform coefficient levels, as well as the entropy coding of directly quantized residual samples. The entropy coding of quantization indices is referred to as level coding in this thesis. The presented developments focus on both improving the coding efficiency and reducing the complexity of the level coding for HEVC and VVC. These goals were achieved by modifying the context modeling and the binarization of the level coding. The first development presented in this thesis is a transform coefficient level coding for variable transform block sizes, which was introduced in HEVC. It exploits the fact that non-zero levels are typically concentrated in certain parts of the transform block by partitioning blocks larger than \square{4} samples into \square{4} sub-blocks. Each \square{4} sub-block is then coded similarly to the level coding specified in AVC for \square{4} transform blocks. This sub-block processing improves coding efficiency and has the advantage that the number of required context models is independent of the set of supported transform block sizes. The maximum number of context-coded bins for a transform coefficient level is one indicator for the complexity of the entropy coding. An adaptive binarization of absolute transform coefficient levels using Rice codes is presented that reduces the maximum number of context-coded bins from 15 (as used in AVC) to three for HEVC. Based on the developed selection of an appropriate Rice code for each scanning position, this adaptive binarization achieves virtually the same coding efficiency as the binarization specified in AVC for bit-rate operation points typically used in consumer applications. The coding efficiency is improved for high bit-rate operation points, which are used in more advanced and professional applications. In order to further improve the coding efficiency for HEVC and VVC, the statistical dependencies among the transform coefficient levels of a transform block are exploited by a template-based context modeling developed in this thesis. Instead of selecting the context model for a current scanning position primarily based on its location inside a transform block, already coded neighboring locations inside a local template are utilized. To further increase the coding efficiency achieved by the template-based context modeling, the different coding phases of the initially developed level coding are merged into a single coding phase. As a consequence, the template-based context modeling can utilize the absolute levels of the neighboring frequency locations, which provides better conditional probability estimates and further improves coding efficiency. This template-based context modeling with a single coding phase is also suitable for trellis-coded quantization (TCQ), since TCQ is state-driven and derives the next state from the current state and the parity of the current level. TCQ introduces different context model sets for coding the significance flag depending on the current state. Based on statistical analyses, an extension of the state-dependent context modeling of TCQ is presented, which further improves the coding efficiency in VVC. After that, a method to reduce the complexity of the level coding at the decoder is presented. This method separates the level coding into a coding phase exclusively consisting of context-coded bins and another one consisting of bypass-coded bins only. For retaining the state-dependent context selection, which significantly contributes to the coding efficiency of TCQ, a dedicated parity flag is introduced and coded with context models in the first coding phase. An adaptive approach is then presented that further reduces the worst-case complexity, effectively lowering the maximum number of context-coded bins per transform coefficient to 1.75 without negatively affecting the coding efficiency. In the last development presented in this thesis, a dedicated level coding for transform skip blocks, which often occur in screen content applications, is introduced for VVC. This dedicated level coding better exploits the statistical properties of directly quantized residual samples for screen content. Various modifications to the level coding improve the coding efficiency for this type of content. Examples for these modifications are a binarization with additional context-coded flags and the coding of the sign information with adaptive context models

Institutional Repository of the Freie Universität Berlin