5,200 research outputs found

    One Pass Quality Control and Low Complexity RDO in A Quadtree Based Scalable Image Coder

    Get PDF
    International audienceThis paper presents a joint quality control (QC) and rate distortion optimization (RDO) algorithm applied to a still image codec called Locally Adaptive Resolution (LAR). LAR supports scalability in resolution for both lossy and lossless coding and has low complexity. This algorithm is based on the study of the relationship between compression efficiency and relative parameters. The RDO model is proposed firstly to find suitable parameters. Relying on this optimization, relationships between the distortion of reconstructed image and quantization parameter can be described with a new linear model. This model is used for parametric configuration to control compression distortion. Experimental results show that this algorithm provides an effective solution for an efficient one pass codec with automatic parameters selection and accurate QC. This algorithm could be extended to codecs with similar functions, such as High Efficiency Video Coding (HEVC)

    On Sparse Coding as an Alternate Transform in Video Coding

    Get PDF
    In video compression, specifically in the prediction process, a residual signal is calculated by subtracting the predicted from the original signal, which represents the error of this process. This residual signal is usually transformed by a discrete cosine transform (DCT) from the pixel, into the frequency domain. It is then quantized, which filters more or less high frequencies (depending on a quality parameter). The quantized signal is then entropy encoded usually by a context-adaptive binary arithmetic coding engine (CABAC), and written into a bitstream. In the decoding phase the process is reversed. DCT and quantization in combination are efficient tools, but they are not performing well at lower bitrates and creates distortion and side effect. The proposed method uses sparse coding as an alternate transform which compresses well at lower bitrates, but not well at high bitrates. The decision which transform is used is based on a rate-distortion optimization (RDO) cost calculation to get both transforms in their optimal performance range. The proposed method is implemented in high efficient video coding (HEVC) test model HM-16.18 and high efficient video coding for screen content coding (HEVC-SCC) for test model HM-16.18+SCM-8.7, with a Bjontegaard rate difference (BD-rate) saving, which archives up to 5.5%, compared to the standard

    Image and Video Coding/Transcoding: A Rate Distortion Approach

    Get PDF
    Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding for exploring the best RD performance of a video codec compatible to the newest video coding standard H.264 and for designing computationally efficient down-sampling algorithms with high visual fidelity in the discrete Cosine transform (DCT) domain. RD optimization for video coding in this thesis considers two objectives, i.e., to achieve the best encoding efficiency in terms of minimizing the actual RD cost and to maintain decoding compatibility with the newest video coding standard H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its root in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework is applicable to optimize any hybrid video coding scheme, provided that specific algorithms are designed corresponding to coding syntaxes of a given standard codec, so as to maintain compatibility with the standard. Corresponding to the baseline profile syntaxes and the main profile syntaxes of H.264, respectively, we have proposed three RD algorithms---a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding---with them embedded in the indicated order. Among the three algorithms, the SDQ design is the core, which is developed based on a given entropy coding method. Specifically, two SDQ algorithms have been developed based on the context adaptive variable length coding (CAVLC) in H.264 baseline profile and the context adaptive binary arithmetic coding (CABAC) in H.264 main profile, respectively. Experimental results for the H.264 baseline codec optimization show that for a set of typical testing sequences, the proposed RD method for H.264 baseline coding achieves a better trade-off between rate and distortion, i.e., 12\% rate reduction on average at the same distortion (ranging from 30dB to 38dB by PSNR) when compared with the RD optimization method implemented in H.264 baseline reference codec. Experimental results for optimizing H.264 main profile coding with CABAC show 10\% rate reduction over a main profile reference codec using CABAC, which also suggests 20\% rate reduction over the RD optimization method implemented in H.264 baseline reference codec, leading to our claim of having developed the best codec in terms of RD performance, while maintaining the compatibility with H.264. By investigating trade-off between distortion and complexity, we have also proposed a designing framework for image/video transcoding with spatial resolution reduction, i.e., to down-sample compressed images/video with an arbitrary ratio in the DCT domain. First, we derive a set of DCT-domain down-sampling methods, which can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain. Then, for a pre-selected pixel-domain down-sampling method, we formulate an optimization problem for finding an LTDS to approximate the given pixel-domain method to achieve the best trade-off between visual quality and computational complexity. The problem is then solved by modeling an LTDS with a multi-layer perceptron network and using a structural learning with forgetting algorithm for training the network. Finally, by selecting a pixel-domain reference method with the popular Butterworth lowpass filtering and cubic B-spline interpolation, the proposed framework discovers an LTDS with better visual quality and lower computational complexity when compared with state-of-the-art methods in the literature

    Rate-Distortion Analysis of Multiview Coding in a DIBR Framework

    Get PDF
    Depth image based rendering techniques for multiview applications have been recently introduced for efficient view generation at arbitrary camera positions. Encoding rate control has thus to consider both texture and depth data. Due to different structures of depth and texture images and their different roles on the rendered views, distributing the available bit budget between them however requires a careful analysis. Information loss due to texture coding affects the value of pixels in synthesized views while errors in depth information lead to shift in objects or unexpected patterns at their boundaries. In this paper, we address the problem of efficient bit allocation between textures and depth data of multiview video sequences. We adopt a rate-distortion framework based on a simplified model of depth and texture images. Our model preserves the main features of depth and texture images. Unlike most recent solutions, our method permits to avoid rendering at encoding time for distortion estimation so that the encoding complexity is not augmented. In addition to this, our model is independent of the underlying inpainting method that is used at decoder. Experiments confirm our theoretical results and the efficiency of our rate allocation strategy

    Distributed Representation of Geometrically Correlated Images with Compressed Linear Measurements

    Get PDF
    This paper addresses the problem of distributed coding of images whose correlation is driven by the motion of objects or positioning of the vision sensors. It concentrates on the problem where images are encoded with compressed linear measurements. We propose a geometry-based correlation model in order to describe the common information in pairs of images. We assume that the constitutive components of natural images can be captured by visual features that undergo local transformations (e.g., translation) in different images. We first identify prominent visual features by computing a sparse approximation of a reference image with a dictionary of geometric basis functions. We then pose a regularized optimization problem to estimate the corresponding features in correlated images given by quantized linear measurements. The estimated features have to comply with the compressed information and to represent consistent transformation between images. The correlation model is given by the relative geometric transformations between corresponding features. We then propose an efficient joint decoding algorithm that estimates the compressed images such that they stay consistent with both the quantized measurements and the correlation model. Experimental results show that the proposed algorithm effectively estimates the correlation between images in multi-view datasets. In addition, the proposed algorithm provides effective decoding performance that compares advantageously to independent coding solutions as well as state-of-the-art distributed coding schemes based on disparity learning

    Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures

    Get PDF
    Distributed video coding (DVC) is a relatively new video coding architecture originated from two fundamental theorems namely, Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs
    • …
    corecore