Layer Selection in Progressive Transmission of Motion-Compensated JPEG2000 Video
MCJ2K (Motion-Compensated JPEG2000) is a video codec based on MCTF (Motion-Compensated Temporal Filtering) and J2K (JPEG2000). MCTF analyzes a sequence of images, generating a collection of temporal sub-bands, which are compressed with J2K. The R/D (Rate-Distortion) performance of MCJ2K is better than that of the MJ2K (Motion JPEG2000) extension, especially when there is a high level of temporal redundancy. MCJ2K codestreams can be served by standard JPIP (J2K Interactive Protocol) servers, because only standard J2K file formats are used. In bandwidth-constrained scenarios, an important issue in MCJ2K is determining how much data from each temporal sub-band must be transmitted to maximize the quality of the reconstructions at the client side. To solve this problem, we have proposed two rate-allocation algorithms which provide reconstructions that are progressive in quality. The first, OSLA (Optimized Sub-band Layers Allocation), determines the best progression of quality layers, but is computationally expensive. The second, ESLA (Estimated-Slope sub-band Layers Allocation), is sub-optimal in most cases, but much faster and more convenient for real-time streaming scenarios. An experimental comparison shows that, even when a straightforward motion compensation scheme is used, the R/D performance of MCJ2K is competitive not only with MJ2K but also with other standard scalable video codecs
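As a rough illustration of the rate-allocation problem described in this abstract, the sketch below orders quality layers from several temporal sub-bands greedily by their rate-distortion slope (distortion reduction per byte). It is not the authors' OSLA or ESLA algorithm; the function name `greedy_layer_order`, the input layout, and the sample numbers are all assumptions made for the example.

```python
from typing import List, Tuple


def greedy_layer_order(
    layers: List[List[Tuple[int, float]]],
) -> List[Tuple[int, int]]:
    """Return a transmission order of (sub_band, layer_index) pairs.

    layers[s] lists, for sub-band s, the (size_in_bytes, distortion_drop)
    of each successive quality layer. Layers of one sub-band must be sent
    in order, so only the next unsent layer of each sub-band is a candidate.
    This is a hypothetical greedy heuristic, not a published algorithm.
    """
    next_idx = [0] * len(layers)
    order: List[Tuple[int, int]] = []
    while True:
        best_band = None
        best_slope = -1.0
        for band, band_layers in enumerate(layers):
            i = next_idx[band]
            if i < len(band_layers):
                size, gain = band_layers[i]
                slope = gain / size  # distortion reduction per byte
                if slope > best_slope:
                    best_slope = slope
                    best_band = band
        if best_band is None:
            break  # every layer of every sub-band has been scheduled
        order.append((best_band, next_idx[best_band]))
        next_idx[best_band] += 1
    return order


# Two sub-bands with two quality layers each (made-up sizes and gains).
example = [
    [(100, 50.0), (100, 10.0)],
    [(50, 40.0), (50, 4.0)],
]
print(greedy_layer_order(example))
```

A client with a limited byte budget would then take the longest prefix of this order that fits, which is what makes the reconstruction progressive in quality.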
Transcoding of H.264/AVC to SVC with motion data refinement
In this paper, we present motion-refined transcoding of H.264/AVC streams to SVC in the transform domain. By accurately taking into account both rate and distortion in the different layers on the one hand, and the SVC inter-layer motion prediction mechanisms on the other hand, the proposed transcoding architecture is able to improve rate-distortion performance over existing approaches. We propose a multilayer control mechanism that trades off performance between the different layers, resulting in 0.5 dB gains in the output SVC base layer
Side-information generation for temporally and spatially scalable Wyner-Ziv codecs
The distributed video coding paradigm enables video codecs to operate with reversed complexity, in which the complexity is shifted from the encoder toward the decoder. Its performance is heavily dependent on the quality of the side information generated by motion estimation at the decoder. We compare the rate-distortion performance of different side-information estimators, for both temporally and spatially scalable Wyner-Ziv codecs. For the temporally scalable codec we compared an established method with a new algorithm that uses a linear-motion model to produce side-information. As a continuation of previous works, in this paper, we propose to use a super-resolution method to upsample the non-key frame, for the spatially scalable codec, using the key frames as reference. We verify the performance of the spatially scalable WZ coding using the state-of-the-art video coding standard H.264/AVC
Flexible distribution of complexity by hybrid predictive-distributed video coding
There is currently limited flexibility for distributing complexity in a video coding system. While rate-distortion-complexity (RDC) optimization techniques have been proposed for conventional predictive video coding with encoder-side motion estimation, they fail to offer truly flexible distribution of complexity between encoder and decoder, since the encoder is always assumed to have more computational resources available than the decoder. On the other hand, distributed video coding solutions with decoder-side motion estimation have been proposed, but hardly any RDC-optimized systems have been developed.
To offer more flexibility for video applications involving multi-tasking or battery-constrained devices, in this paper, we propose a codec combining predictive video coding concepts and techniques from distributed video coding and show the flexibility of this method in distributing complexity. We propose several modes to code frames, and provide complexity analysis illustrating encoder and decoder computational complexity for each mode. Rate distortion results for each mode indicate that the coding efficiency is similar. We describe a method to choose which mode to use for coding each inter frame, taking into account encoder and decoder complexity constraints, and illustrate how complexity is distributed more flexibly
Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures
Distributed video coding (DVC) is a relatively new video coding architecture originating from two fundamental theorems, namely Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs
Neural Video Compression with Temporal Layer-Adaptive Hierarchical B-frame Coding
Neural video compression (NVC) is a rapidly evolving video coding research area, with some models achieving superior coding efficiency compared to the latest video coding standard, Versatile Video Coding (VVC). In conventional video coding standards, hierarchical B-frame coding, which utilizes a bidirectional prediction structure for higher compression, has been well studied and exploited. In NVC, however, limited research has investigated the hierarchical B scheme. In this paper, we propose an NVC model exploiting hierarchical B-frame coding with temporal layer-adaptive optimization. We first extend an existing unidirectional NVC model to a bidirectional model, which achieves -21.13% BD-rate gain over the unidirectional baseline model. However, this model faces challenges when applied to sequences with complex or large motions, leading to performance degradation. To address this, we introduce temporal layer-adaptive optimization, incorporating methods such as temporal layer-adaptive quality scaling (TAQS) and temporal layer-adaptive latent scaling (TALS). The final model with the proposed methods achieves an impressive BD-rate gain of -39.86% against the baseline. It also resolves the challenges in sequences with large or complex motions with up to -49.13% more BD-rate gains than the simple bidirectional extension. This improvement is attributed to the allocation of more bits to lower temporal layers, thereby enhancing overall reconstruction quality with fewer bits. Since our method has little dependency on a specific NVC model architecture, it can serve as a general tool for extending unidirectional NVC models to ones with hierarchical B-frame coding
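To make the notion of a temporal layer concrete, the following minimal sketch assigns a layer index to each frame of a dyadic hierarchical B-frame group of pictures (GOP). It assumes a power-of-two GOP size and the usual convention that layer 0 holds the GOP boundary frames; the function name `temporal_layer` is hypothetical, and nothing here reproduces the TAQS or TALS methods from the abstract.

```python
def temporal_layer(frame_index: int, gop_size: int) -> int:
    """Temporal layer of a frame in a dyadic hierarchical-B GOP.

    Assumes gop_size is a power of two. Frames on GOP boundaries are
    layer 0, the GOP midpoint is layer 1, quarter points are layer 2,
    and so on down to the odd-numbered frames in the highest layer.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    depth = gop_size.bit_length() - 1  # log2(gop_size)
    offset = frame_index % gop_size
    if offset == 0:
        return 0
    # Number of trailing zero bits in the offset within the GOP.
    trailing_zeros = (offset & -offset).bit_length() - 1
    return depth - trailing_zeros


# Layers for frames 0..8 with a GOP of 8 frames.
print([temporal_layer(i, 8) for i in range(9)])
# → [0, 3, 2, 3, 1, 3, 2, 3, 0]
```

Lower layers serve as references for more frames, which is consistent with the abstract's observation that spending more bits on lower temporal layers improves overall reconstruction quality.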