735 research outputs found

    State of the art in 2D content representation and compression

    Get PDF
    Livrable D1.3 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D3.1 du projet

    Toward sparse and geometry adapted video approximations

    Get PDF
    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most of past and present strategies used to represent video signals do not exploit properly its spatial geometry. Similarly to the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. Through this PhD dissertation, different aspects on the use of adaptivity in video representation are studied looking toward exploiting both aspects of video: its piecewise nature and the geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering. A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves coding performance of moving edges in 3D transform (without motion compensation) based video coding. Adaptivity allows, at the same time, to equally exploit redundancy in non-moving video areas. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows an optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation can not be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understand the fundamentals of how can temporal and spatial geometry be jointly exploited. This work builds on some previous results that considered the representation of spatial geometry in video (but not temporal, i.e, without motion). In order to obtain flexible and efficient (sparse) signal representations, using redundant dictionaries, the use of highly non-linear decomposition algorithms, like Matching Pursuit, is required. General signal representation using these techniques is still quite unexplored. For this reason, previous to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuits and a geometric dictionary are investigated. A part of this investigation concerns the study on the influence of using a priori models within approximation non-linear algorithms. Dictionaries with a high internal coherence have some problems to obtain optimally sparse signal representations when used with Matching Pursuits. It is proved, theoretically and empirically, that inserting in this algorithm a priori models allows to improve the capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study, on the use of Matching Pursuits, concerns the approach used in this work for the decompositions of video frames and images. The technique proposed in this thesis improves a previous work, where authors had to recur to sub-optimal Matching Pursuit strategies (using Genetic Algorithms), given the size of the functions library. In this work the use of full search strategies is made possible, at the same time that approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis, clarify many unknowns and show to be promising, encouraging to prosecute research on the subject

    Video Coding Using a Deformation Compensation Algorithm Based on Adaptive Matching Pursuit Image Decompositions

    Get PDF
    Today's video codecs employ motion compensated prediction in combination with block matching techniques. These techniques, although achieving some level of adaptivity in their latest versions, continue to rely on the decomposition of frames on a set of artificial primitives: blocks. This paper presents a new approach to video coding. A geometrically adaptive image decomposition scheme using an over-complete basis is used to represent the scene. Using Matching Pursuit (MP), we are able to express local features such as position, anisotropic scale and orientation in terms of a set of spatio-frequential primitives. In order to perform frame prediction, only the changes in the parameters that determine these functions from frame to frame will have to be coded. Such an approach, in addition to being able to catch displacements in images deals as well in a natural way with local scale deformations and local rotations

    A Bayesian Approach to Video Expansions on Parametric Over-Complete 2-D Dictionaries

    Get PDF
    In this work, we explore a framework for the sparse representation of video sequences by means of spati

    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    Full text link
    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share an hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and sometimes its invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding.Comment: 65 pages, 33 figures, 303 reference

    On the joint source and channel coding of atomic image streams

    Get PDF
    This paper presents an error resilient coding scheme for atomic image bitstreams, as generated by Matching Pursuit encoders. A joint source and channel coding algorithm is proposed, that takes benefit of both the flexibility in the image representation, and the progressive nature of the bitstream, in order to finely adapt the channel rate to the relative importance of the bitstream components. An optimization problem is proposed, and a fast search algorithm determines the best rate allocation for given bit budget and loss process parameters. Simulation results show that the unequal error protection is quite efficient, even in very adverse conditions, and it clearly outperforms simple FEC schemes

    Semantic multimedia remote display for mobile thin clients

    Get PDF
    Current remote display technologies for mobile thin clients convert practically all types of graphical content into sequences of images rendered by the client. Consequently, important information concerning the content semantics is lost. The present paper goes beyond this bottleneck by developing a semantic multimedia remote display. The principle consists of representing the graphical content as a real-time interactive multimedia scene graph. The underlying architecture features novel components for scene-graph creation and management, as well as for user interactivity handling. The experimental setup considers the Linux X windows system and BiFS/LASeR multimedia scene technologies on the server and client sides, respectively. The implemented solution was benchmarked against currently deployed solutions (VNC and Microsoft-RDP), by considering text editing and WWW browsing applications. The quantitative assessments demonstrate: (1) visual quality expressed by seven objective metrics, e.g., PSNR values between 30 and 42 dB or SSIM values larger than 0.9999; (2) downlink bandwidth gain factors ranging from 2 to 60; (3) real-time user event management expressed by network round-trip time reduction by factors of 4-6 and by uplink bandwidth gain factors from 3 to 10; (4) feasible CPU activity, larger than in the RDP case but reduced by a factor of 1.5 with respect to the VNC-HEXTILE

    Sparse image approximation with application to flexible image coding

    Get PDF
    Natural images are often modeled through piecewise-smooth regions. Region edges, which correspond to the contours of the objects, become, in this model, the main information of the signal. Contours have the property of being smooth functions along the direction of the edge, and irregularities on the perpendicular direction. Modeling edges with the minimum possible number of terms is of key importance for numerous applications, such as image coding, segmentation or denoising. Standard separable basis fail to provide sparse enough representation of contours, due to the fact that this kind of basis do not see the regularity of edges. In order to be able to detect this regularity, a new method based on (possibly redundant) sets of basis functions able to capture the geometry of images is needed. This thesis presents, in a first stage, a study about the features that basis functions should have in order to provide sparse representations of a piecewise-smooth image. This study emphasizes the need for edge-adapted basis functions, capable to accurately capture local orientation and anisotropic scaling of image structures. The need of different anisotropy degrees and orientations in the basis function set leads to the use of redundant dictionaries. However, redundant dictionaries have the inconvenience of giving no unique sparse image decompositions, and from all the possible decompositions of a signal in a redundant dictionary, just the sparsest is needed. There are several algorithms that allow to find sparse decompositions over redundant dictionaries, but most of these algorithms do not always guarantee that the optimal approximation has been recovered. To cope with this problem, a mathematical study about the properties of sparse approximations is performed. From this, a test to check whether a given sparse approximation is the sparsest is provided. The second part of this thesis presents a novel image approximation scheme, based on the use of a redundant dictionary. This scheme allows to have a good approximation of an image with a number of terms much smaller than the dimension of the signal. This novel approximation scheme is based on a dictionary formed by a combination of anisotropically refined and rotated wavelet-like mother functions and Gaussians. An efficient Full Search Matching Pursuit algorithm to perform the image decomposition in such a dictionary is designed. Finally, a geometric image coding scheme based on the image approximated over the anisotropic and rotated dictionary of basis functions is designed. The coding performances of this dictionary are studied. Coefficient quantization appears to be of crucial importance in the design of a Matching Pursuit based coding scheme. Thus, a quantization scheme for the MP coefficients has been designed, based on the theoretical energy upper bound of the MP algorithm and the empirical observations of the coefficient distribution and evolution. Thanks to this quantization, our image coder provides low to medium bit-rate image approximations, while it allows for on the fly resolution switching and several other affine image transformations to be performed directly in the transformed domain
    • …
    corecore