Neural Video Compression with Temporal Layer-Adaptive Hierarchical B-frame Coding
Neural video compression (NVC) is a rapidly evolving video coding research
area, with some models achieving superior coding efficiency compared to the
latest video coding standard, Versatile Video Coding (VVC). In conventional
video coding standards, hierarchical B-frame coding, which uses a
bidirectional prediction structure for higher compression, has been
well studied and exploited. In NVC, however, limited research has investigated
the hierarchical B scheme. In this paper, we propose an NVC model exploiting
hierarchical B-frame coding with temporal layer-adaptive optimization. We first
extend an existing unidirectional NVC model to a bidirectional model, which
achieves a BD-rate gain of -21.13% over the unidirectional baseline model. However,
this model faces challenges when applied to sequences with complex or large
motions, leading to performance degradation. To address this, we introduce
temporal layer-adaptive optimization, incorporating methods such as temporal
layer-adaptive quality scaling (TAQS) and temporal layer-adaptive latent
scaling (TALS). The final model with the proposed methods achieves a
BD-rate gain of -39.86% against the baseline. It also resolves the
challenges in sequences with large or complex motions, with up to -49.13%
additional BD-rate gain over the simple bidirectional extension. This
improvement is attributed to the allocation of more bits to lower temporal
layers, thereby enhancing overall reconstruction quality with fewer total
bits. Since our method
has little dependency on a specific NVC model architecture, it can serve as a
general tool for extending unidirectional NVC models to models with
hierarchical B-frame coding.
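
To make the notion of temporal layers concrete, the following is a minimal sketch of a dyadic hierarchical B-frame coding order, as commonly used in conventional codecs. It is not taken from the paper: the function name `hierarchical_b_order`, the power-of-two GOP assumption, and the convention that the key frames at both ends of the GOP form layer 0 are illustrative assumptions.

```python
from typing import List, Tuple

# (frame_index, temporal_layer, (past_reference, future_reference))
BFrame = Tuple[int, int, Tuple[int, int]]


def hierarchical_b_order(gop_size: int) -> List[BFrame]:
    """List the B-frames between two key frames in coding order.

    Assumption (not from the paper): key frames sit at indices 0 and
    gop_size and form temporal layer 0; gop_size is a power of two.
    """
    if gop_size < 2 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two >= 2")

    order: List[BFrame] = []
    intervals = [(0, gop_size)]  # each interval is bounded by two coded frames
    layer = 1
    while intervals:
        next_intervals = []
        for left, right in intervals:
            mid = (left + right) // 2
            # The middle frame is predicted bidirectionally from both ends.
            order.append((mid, layer, (left, right)))
            if mid - left > 1:
                # Uncoded frames remain on both sides; split for the next layer.
                next_intervals.append((left, mid))
                next_intervals.append((mid, right))
        intervals = next_intervals
        layer += 1
    return order


if __name__ == "__main__":
    for frame, layer, refs in hierarchical_b_order(8):
        print(f"frame {frame}: layer {layer}, references {refs}")
```

For a GOP of 8, frame 4 is coded first from frames 0 and 8, then frames 2 and 6, then the odd frames. Frames in lower layers serve as references for more of the remaining frames, which is consistent with the abstract's observation that allocating more bits to lower temporal layers improves overall quality. A layer-adaptive scheme would select its per-layer parameters using the layer index returned here.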