165 research outputs found

    Efficient Scalable Video Coding Based on Matching Pursuits

    Grayscale and colour image Codec based on matching pursuit in the spatio-frequency domain

    This report presents and evaluates a novel approach to scalable lossy colour image coding with Matching Pursuit (MP) performed in a transform domain. The benefits of performing MP in the transform domain are analysed in detail. The main contribution of this work is extending MP with wavelets to colour coding and proposing a coding method. We exploit correlations between image subbands after wavelet transformation in RGB colour space. Then, a new and simple quantisation and coding scheme for the colour MP decomposition, based on Run Length Encoding (RLE) and inspired by the idea of coding indexes in relational databases, is applied. As a final coding step, arithmetic coding is used, assuming uniform distributions of MP atom parameters. The target application is compression at low and medium bit-rates. Coding performance is compared to JPEG 2000, showing the potential to outperform the latter if the arithmetic coder uses data models more sophisticated than the uniform one. The results are presented for grayscale and colour coding of 12 standard test images.

    A Posteriori Quantization of Progressive Matching Pursuit Streams

    This paper proposes a rate-distortion optimal a posteriori quantization scheme for Matching Pursuit coefficients. The a posteriori quantization applies to a Matching Pursuit expansion that has been generated off-line and cannot benefit from any feedback loop to the encoder to compensate for the quantization noise. The redundancy of the Matching Pursuit dictionary provides an indicator of the relative importance of coefficients and atom indices and, subsequently, of the quantization error. It is used to define a universal upper bound on the decay of the coefficients, sorted in decreasing order of magnitude. A new quantization scheme is then derived, where this bound is used as an Oracle for the design of an optimal a posteriori quantizer. The latter turns the entropy-constrained quantization problem for exponentially distributed coefficients into a simple uniform quantization problem. Using simulations with random dictionaries, we show that the proposed exponentially upper-bounded quantization (EUQ) clearly outperforms classical schemes. Building on the ideal Oracle-based approach, a sub-optimal adaptive scheme is then designed that approximates the EUQ but still outperforms competing quantization methods in terms of rate-distortion characteristics. Finally, the proposed quantization method is studied in the context of image coding. It performs similarly to state-of-the-art coding methods (and even better at low rates), while also providing a progressive stream that is very easy to transcode and adapt to changing rate constraints.

    A Posteriori Quantized Matching Pursuit

    This paper studies quantization error in the context of Matching Pursuit coded streams and proposes a new coefficient quantization scheme that takes advantage of the Matching Pursuit properties. The coefficient energy in Matching Pursuit decreases with the iteration number, and the decay rate can be upper-bounded with an exponential curve driven by the redundancy of the dictionary. The redundancy factor is therefore used to design an optimal a posteriori quantization scheme for multi-resolution Matching Pursuit coding. Bits are optimally distributed between successive coefficients according to their relative contribution to the signal representation. The quantization range and the number of quantization steps are therefore reduced as the iteration number increases. Moreover, the quantization scheme selects the optimal number of Matching Pursuit iterations to be coded to satisfy rate constraints. Finally, the new exponentially upper-bounded quantization of Matching Pursuit coefficients clearly outperforms classical uniform quantization methods for both random dictionaries and Gabor dictionaries in the practical case of image coding.
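The idea of shrinking the quantization range with the iteration number can be illustrated with a minimal Python sketch. It is not the scheme from the paper: the decay factor `nu`, the bound `c_max * nu**n`, the fixed number of `levels`, and the function name `quantize_decaying_range` are all assumptions chosen for illustration, and the sketch keeps the number of steps constant rather than reducing it.

```python
import numpy as np

def quantize_decaying_range(coeffs, c_max, nu, levels):
    """Uniformly quantize |coeffs[n]| on [0, c_max * nu**n], keeping the sign.

    c_max, nu and levels are illustrative assumptions, not values from the paper.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    n = np.arange(coeffs.size)
    ranges = c_max * nu ** n              # assumed exponential upper bound per iteration
    step = ranges / levels                # step size shrinks together with the range
    mags = np.minimum(np.abs(coeffs), ranges)  # clip magnitudes to the assumed bound
    idx = np.minimum(np.floor(mags / step), levels - 1).astype(int)
    recon = np.sign(coeffs) * (idx + 0.5) * step  # mid-point reconstruction
    return idx, recon, step

coeffs = [1.0, -0.5, 0.2]
idx, recon, step = quantize_decaying_range(coeffs, c_max=1.0, nu=0.6, levels=8)
```

With a fixed number of levels, later coefficients are quantized with a finer absolute step because their admissible range is smaller; a rate-constrained coder would typically also lower the number of levels over time.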

    Sparse image approximation with application to flexible image coding

    Natural images are often modeled through piecewise-smooth regions. Region edges, which correspond to the contours of the objects, become, in this model, the main information of the signal. Contours are smooth along the direction of the edge and irregular in the perpendicular direction. Modeling edges with the minimum possible number of terms is of key importance for numerous applications, such as image coding, segmentation or denoising. Standard separable bases fail to provide sufficiently sparse representations of contours because they do not see the regularity of edges. To detect this regularity, a new method is needed, based on (possibly redundant) sets of basis functions able to capture the geometry of images. This thesis first presents a study of the features that basis functions should have in order to provide sparse representations of a piecewise-smooth image. This study emphasizes the need for edge-adapted basis functions, capable of accurately capturing the local orientation and anisotropic scaling of image structures. The need for different degrees of anisotropy and different orientations in the basis function set leads to the use of redundant dictionaries. However, redundant dictionaries have the drawback that sparse image decompositions are not unique, and of all the possible decompositions of a signal in a redundant dictionary, only the sparsest is needed. Several algorithms can find sparse decompositions over redundant dictionaries, but most of them do not always guarantee that the optimal approximation has been recovered. To cope with this problem, a mathematical study of the properties of sparse approximations is performed. From this, a test to check whether a given sparse approximation is the sparsest is provided.
The second part of this thesis presents a novel image approximation scheme based on a redundant dictionary. This scheme yields a good approximation of an image with a number of terms much smaller than the dimension of the signal. The dictionary is formed by a combination of anisotropically refined and rotated wavelet-like mother functions and Gaussians. An efficient Full Search Matching Pursuit algorithm is designed to perform the image decomposition in such a dictionary. Finally, a geometric image coding scheme is designed, based on the image approximated over the anisotropic and rotated dictionary of basis functions. The coding performance of this dictionary is studied. Coefficient quantization appears to be of crucial importance in the design of a Matching Pursuit based coding scheme. Thus, a quantization scheme for the MP coefficients has been designed, based on the theoretical energy upper bound of the MP algorithm and on empirical observations of the coefficient distribution and evolution. Thanks to this quantization, our image coder provides low to medium bit-rate image approximations, while allowing on-the-fly resolution switching and several other affine image transformations to be performed directly in the transformed domain.

    Toward sparse and geometry adapted video approximations

    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on the rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require adaptive signal decompositions able to capture the structure and redundancy appearing in video signals. Adaptivity needs to allow for proper modeling of signals so that they can be represented with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Most past and present strategies used to represent video signals do not properly exploit their spatial geometry. As in the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. In this PhD dissertation, different aspects of the use of adaptivity in video representation are studied, with a view to exploiting both aspects of video: its piecewise nature and its geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering.
A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves the coding performance of moving edges in 3D transform based video coding (without motion compensation). At the same time, adaptivity allows redundancy in non-moving video areas to be exploited equally well. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows an optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation cannot be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understanding how temporal and spatial geometry can be jointly exploited. This work builds on some previous results that considered the representation of spatial geometry in video (but not temporal geometry, i.e., without motion). In order to obtain flexible and efficient (sparse) signal representations using redundant dictionaries, the use of highly non-linear decomposition algorithms, like Matching Pursuit, is required. General signal representation using these techniques is still quite unexplored. For this reason, prior to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuits and a geometric dictionary are investigated. A part of this investigation concerns the influence of using a priori models within non-linear approximation algorithms. Dictionaries with a high internal coherence have difficulty yielding optimally sparse signal representations when used with Matching Pursuits.
It is proved, theoretically and empirically, that inserting a priori models into this algorithm improves its capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study of Matching Pursuits concerns the approach used in this work for the decomposition of video frames and images. The technique proposed in this thesis improves on previous work, where the authors had to resort to sub-optimal Matching Pursuit strategies (using Genetic Algorithms), given the size of the function library. In this work, the use of full search strategies is made possible, while approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis clarify many unknowns and are promising, encouraging further research on the subject.

    Colour image coding with wavelets and matching pursuit

    This thesis considers sparse approximation of still images as the basis of a lossy compression system. The Matching Pursuit (MP) algorithm is presented as a method particularly suited for application in lossy scalable image coding. Its multichannel extension, capable of exploiting inter-channel correlations, is found to be an efficient way to represent colour data in RGB colour space. Known problems with MP, namely the high computational complexity of encoding and dictionary design, are tackled by finding an appropriate partitioning of an image. The idea of performing MP in the spatio-frequency domain after a transform such as the Discrete Wavelet Transform (DWT) is explored. The main challenge, though, is to encode the image representation obtained after MP into a bit-stream. Novel approaches for encoding the atomic decomposition of a signal and for quantising colour amplitudes are proposed and evaluated. The image codec that has been built is capable of competing with scalable coders such as JPEG 2000 and SPIHT in terms of compression ratio.

    Image coding using redundant dictionaries

    This chapter discusses the problem of coding images using very redundant libraries of waveforms, also referred to as dictionaries. We start with a discussion of the shortcomings of classical approaches based on orthonormal bases. More specifically, we show why these redundant dictionaries provide an interesting alternative for image representation. We then introduce a special dictionary of 2-D primitives called anisotropic refinement atoms that are well suited for representing edge-dominated images. Using a simple greedy algorithm, we design an image coder that performs very well at low bit rates. We finally discuss its performance and particular features such as geometric adaptivity and rate scalability.
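For readers unfamiliar with the greedy approach, a generic Matching Pursuit loop can be sketched as follows. This is a textbook-style illustration in Python, not the coder described in the chapter: the random dictionary `D`, the test signal `x`, and the function name `matching_pursuit` are assumptions, and the atoms are assumed to be unit-norm columns rather than anisotropic refinement atoms.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP over a dictionary whose columns are assumed to be unit-norm atoms."""
    residual = np.asarray(signal, dtype=float).copy()
    indices, coeffs = [], []
    for _ in range(n_iter):
        corr = dictionary.T @ residual        # inner product with every atom
        k = int(np.argmax(np.abs(corr)))      # best-matching atom
        indices.append(k)
        coeffs.append(float(corr[k]))
        residual = residual - corr[k] * dictionary[:, k]  # remove its contribution
    return indices, coeffs, residual

# Illustrative data: a random redundant dictionary (64 atoms in dimension 16).
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)                # normalize atoms to unit norm
x = 2.0 * D[:, 3] - 1.0 * D[:, 40]            # signal built from two atoms
atoms, c, r = matching_pursuit(x, D, n_iter=10)
```

Because each step subtracts the projection onto a single unit-norm atom, the signal energy splits exactly into the squared coefficients plus the remaining residual energy, which is the property that makes the resulting stream progressive.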

    Geometric Video Approximation Using Weighted Matching Pursuit

    In recent years, many works on geometric image representation have appeared in the literature. Geometric video representation has not received as much attention so far, and only some initial works in the area have been presented. Works on geometric multi-dimensional signal representations have established a close relation with signal expansions on redundant dictionaries. For this purpose, Matching Pursuits (MP) have been shown to be an interesting tool to obtain such expansions. Recently, the most important limitations of MP have been underlined, and alternative algorithms like Weighted-MP have been proposed to address them. This work explores the use of Weighted-MP as a new framework for motion-adaptive geometric video approximations. We study a novel algorithm to decompose video sequences into a few salient video components that jointly represent the geometric and motion content of a scene. Experimental coding results on highly geometric content show that the proposed paradigm has the potential to exploit spatio-temporal geometry adaptation, and that 2D Weighted-MP improves the representation compared to those based on 2D MP. Furthermore, the extracted video components represent relevant visual structures with high saliency. In an example application, such components are effectively used as video descriptors for the joint audio-video analysis of multimedia sequences. Overall, the results are interesting and encourage further research on the application of Weighted-MP for geometric video representations.