3,674 research outputs found

    Progressive low bit rate coding of simple 3D objects with Matching Pursuit

    This paper presents a low-rate progressive 3D mesh compression scheme for simple genus-zero 3D objects. The proposed scheme is based on signal representation using redundant expansions. The 3D model representation is constructed using a Matching Pursuit algorithm, with an over-complete dictionary of atoms defined on a sphere. A specific dictionary construction is proposed that is adapted to the characteristics of 3D models. The dictionary is built on both low-frequency atoms and anisotropic refinement to capture singularities of the signal, which lives on a 2-D sphere. The novel coding method is shown to compare favorably with state-of-the-art compression schemes at low bit rate, while providing a flexible and progressive representation. It therefore represents a very interesting alternative for simple 3D model representations, especially in view-dependent or scalable applications.

    Progressive coding of 3D objects based on overcomplete decompositions

    This paper presents a progressive coding scheme for 3D objects, based on an overcomplete decomposition of the 3D model on a sphere. Due to increased freedom in the construction of the bases, redundant expansions have shown interesting approximation properties in the decomposition of signals with multidimensional singularities organized along embedded submanifolds. We propose to map simple 3D models on 2D spheres and then to decompose the signal over a redundant dictionary of oriented and anisotropic atoms that live on the sphere. The signal expansion is computed iteratively with a Matching Pursuit algorithm, which greedily selects the most prominent components of the 3D model. The decomposition therefore inherently represents a progressive stream of atoms, which is advantageously used in the design of scalable representations. An encoder is proposed that compresses the stream of atoms by adaptive coefficient quantization and entropy coding of atom indexes. Experimental results show that the novel coding strategy outperforms state-of-the-art progressive coders in terms of distortion, mostly at low bit rate. Furthermore, since the dictionary is built on structured atoms, the representation simultaneously offers increased flexibility. This enables easy stream manipulations, and we finally illustrate this advantage in the design of a view-dependent transmission scheme.
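The greedy selection described in this abstract can be illustrated with a minimal Matching Pursuit sketch. It is not the authors' encoder: it works on a toy dictionary of three unit-norm vectors in the plane rather than atoms on a sphere, and the names `matching_pursuit`, `reconstruct` and `normalize` are hypothetical helpers introduced only for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy Matching Pursuit over unit-norm atoms (plain Python lists).

    Returns the selected (atom index, coefficient) pairs in selection
    order, plus the final residual.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_atoms):
        # Inner product of the current residual with every atom.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        # Pick the atom with the largest absolute correlation.
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        coeff = scores[best]
        if coeff == 0.0:
            break  # nothing left that any atom can explain
        selected.append((best, coeff))
        # Remove the chosen component from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return selected, residual


def reconstruct(selected, dictionary, length):
    """Sum the selected atoms weighted by their coefficients."""
    out = [0.0] * length
    for idx, coeff in selected:
        out = [o + coeff * a for o, a in zip(out, dictionary[idx])]
    return out


# Toy overcomplete dictionary: three atoms for a two-dimensional signal.
dictionary = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
signal = [3.0, 1.0]
selected, residual = matching_pursuit(signal, dictionary, n_atoms=2)
```

Because atoms are emitted in order of selection, keeping only a prefix of `selected` (for example `reconstruct(selected[:1], dictionary, 2)`) yields a coarser approximation, which is the sense in which such a stream of atoms is progressive.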

    Visual Importance-Biased Image Synthesis Animation

    Present ray tracing algorithms are computationally intensive, requiring hours of computing time for complex scenes. Our previous work developed an overall approach to applying visual attention to progressive and adaptive ray-tracing techniques. The approach facilitates large computational savings by modulating the supersampling rates in an image by the visual importance of the region being rendered. This paper extends the approach by incorporating temporal changes into the models and techniques developed, as it is expected that further efficiency savings can be reaped for animated scenes. Applications for this approach include entertainment, visualisation and simulation.
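The idea of modulating supersampling rates by visual importance can be sketched as follows. The abstract does not give the actual mapping, so the linear rule, the `[0, 1]` importance scale, the sample budget of 1 to 16 and the function name `samples_per_pixel` are all assumptions made for illustration.

```python
def samples_per_pixel(importance, min_spp=1, max_spp=16):
    """Map an importance value in [0, 1] to a supersampling rate.

    Assumed linear rule: unimportant regions get min_spp samples,
    the most important regions get max_spp samples.
    """
    clamped = min(1.0, max(0.0, importance))
    return min_spp + round(clamped * (max_spp - min_spp))


# Hypothetical 2x2 importance map (one value per image region).
importance_map = [[0.0, 0.2], [0.6, 1.0]]
spp_map = [[samples_per_pixel(v) for v in row] for row in importance_map]

# Compare the adaptive sample budget with uniform maximum supersampling.
adaptive_total = sum(sum(row) for row in spp_map)
uniform_total = 16 * sum(len(row) for row in importance_map)
```

In this toy case the adaptive budget is smaller than the uniform one, which is where the computational savings would come from under these assumptions.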

    On unifying sparsity and geometry for image-based 3D scene representation

    Demand has emerged for next-generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low-cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of a few fundamental image structure features (edges, for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotations and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model.
    We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state-of-the-art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model.
    Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and better approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding.

    Scalable Motion-Adaptive Video Coding with Redundant Representations

    This paper presents a scalable video coding scheme (MP3D), based on the use of a redundant 3-D spatio-temporal dictionary of functions. The spatial component of the dictionary consists of directional and anisotropically scaled functions, which form a rich collection of visual primitives. The temporal component is tuned to capture most of the energy along motion trajectories in the video sequences. The MP3D video coder first finds motion trajectories. It then applies a spatio-temporal decomposition using an adaptive approximation algorithm based on Matching Pursuit (MP). The coefficients and the function parameters are quantized and coded in a progressive fashion, under multiple rate constraints, allowing for adaptive decoding by simple bit stream truncation. The motion fields are losslessly coded and transmitted as side information to the decoder. The multi-resolution structure of the dictionary allows for flexible spatial and temporal resolution adaptation. This scheme is shown to yield rate-distortion performance comparable to that of state-of-the-art schemes such as H.264 and MPEG-4. It represents a promising alternative for low and medium rate applications, or as a flexible base layer for higher rate video systems.

    Livrable D2.2 of the PERSEE project : Analyse/Synthese de Texture

    Deliverable D2.2 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D2.2 of the project. Its title: Analyse/Synthese de Textur