
    MASCOT : metadata for advanced scalable video coding tools : final report

    The goal of the MASCOT project was to develop new video coding schemes and tools that provide both increased coding efficiency and extended scalability features compared to the technology available at the beginning of the project. Towards that goal, the following tools were to be used: metadata-based coding tools; new spatiotemporal decompositions; and new prediction schemes. Although the initial goal was to develop a single codec architecture able to combine all the new coding tools foreseen when the project was formulated, it became clear that this would limit the selection of the new tools. The consortium therefore decided to develop two codec frameworks within the project, a standard hybrid DCT-based codec and a 3D wavelet-based codec, which together are able to accommodate all tools developed during the course of the project.

    Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions

    The research presented in this thesis aims to extend the scalability range of wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since temporal redundancy regularly comprises the main portion of the overall redundancy of a video sequence, the techniques that can broadly be termed motion decorrelation techniques play a central role in the overall compression performance. For this reason, scalable motion modelling and coding are of utmost importance, and this thesis identifies and analyses possible solutions. The main contributions of the presented research are grouped into two interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised estimation technique is introduced. The proposed motion model is based on tree structures and allows the high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding at different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Secondly, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and on the creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal subbands, and a motion search in temporal subbands that finds the optimal motion vectors for the layered motion structure. Exhaustive tests on rate-distortion performance in demanding scalable video coding scenarios show the benefits of applying both the developed flexible motion model and the various solutions for scalable motion coding.
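    As a concrete illustration of the precision-limiting idea described above, here is a minimal hedged sketch, not the thesis codec: integer motion-vector magnitudes are split into bit-plane layers so that decoding only the most significant planes yields a coarser, precision-limited motion field. The quarter-pel motion field and the function names are illustrative assumptions.

```python
# Minimal sketch of precision limiting via bit-plane layering of motion
# vectors (illustrative, not the thesis implementation). Signs are kept
# separately; magnitudes are refined plane by plane from MSB to LSB.
import numpy as np

def split_bitplanes(mv, num_planes=8):
    """Split absolute motion-vector values into bit-plane layers (MSB first)."""
    signs = np.sign(mv)
    mags = np.abs(mv).astype(np.uint8)
    planes = [((mags >> p) & 1) for p in range(num_planes - 1, -1, -1)]
    return signs, planes

def reconstruct(signs, planes, num_received):
    """Rebuild a precision-limited motion field from the first layers only."""
    num_planes = len(planes)
    mags = np.zeros(planes[0].shape, dtype=np.int32)
    for i in range(num_received):
        mags |= planes[i].astype(np.int32) << (num_planes - 1 - i)
    return signs * mags

# Toy 4x4 motion field in quarter-pel units.
mv = np.array([[ 3, -5, 12,  0],
               [ 7,  8, -2,  1],
               [-9,  4,  6, 10],
               [ 2, -1,  0,  5]])
signs, planes = split_bitplanes(mv)
coarse = reconstruct(signs, planes, num_received=5)   # drop the 3 LSB planes
print(coarse)  # motion vectors quantized to multiples of 8 quarter-pel units
```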

    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. This observation has not prevented the design of image representations that trade off efficiency against complexity while achieving accurate rendering of smooth regions as well as faithful reproduction of contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. These typically exhibit redundancy to improve sparsity in the transformed domain and, in some cases, invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding. Comment: 65 pages, 33 figures, 303 references
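    To make the notion of a redundant multiscale, multi-orientation decomposition concrete, the sketch below builds an undecimated (hence redundant and translation-invariant) bank of Gabor-like filters at a few scales and orientations. It is a minimal illustration, not any specific transform from the survey; the scale, orientation, and kernel-size choices are assumptions.

```python
# Undecimated multiscale, multi-orientation analysis with Gabor-like kernels.
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(sigma, theta, wavelength, size=15):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    k = env * np.cos(2 * np.pi * xr / wavelength)
    return k - k.mean()                      # zero-mean band-pass kernel

def oriented_decomposition(img, scales=(2.0, 4.0), n_orient=4):
    """Return a dict {(scale, orientation): subband} of same-size subbands."""
    subbands = {}
    for sigma in scales:
        for j in range(n_orient):
            theta = j * np.pi / n_orient
            k = gabor_kernel(sigma, theta, wavelength=4 * sigma)
            subbands[(sigma, theta)] = convolve(img, k, mode='reflect')
    return subbands

img = np.random.rand(64, 64)                 # stand-in for a natural image
bands = oriented_decomposition(img)
print(len(bands), "subbands of shape", next(iter(bands.values())).shape)
# 8 subbands, each 64x64: the representation is 8x redundant.
```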

    Toward sparse and geometry adapted video approximations

    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on the rate-distortion performance of wavelet and oracle-based coding schemes, one can better analyze the coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to allow for proper modeling of signals so that they can be represented with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Most past and present strategies used to represent video signals do not properly exploit its spatial geometry. As in the case of images, a promising approach is the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. This dissertation studies different aspects of the use of adaptivity in video representation, with the aim of exploiting both aspects of video: its piecewise-smooth nature and its geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion-compensated temporal filtering. A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves the coding performance of moving edges in 3D-transform-based video coding (without motion compensation). At the same time, adaptivity makes it possible to exploit redundancy equally well in non-moving video areas. The analogy between motion-compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion-compensated lifted wavelet decompositions, which allows optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation cannot be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understanding how temporal and spatial geometry can be jointly exploited. This work builds on previous results that considered the representation of spatial geometry in video (but not temporal geometry, i.e., without motion). In order to obtain flexible and efficient (sparse) signal representations using redundant dictionaries, highly non-linear decomposition algorithms such as Matching Pursuit are required. General signal representation using these techniques is still quite unexplored. For this reason, prior to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuit and a geometric dictionary are investigated.
Part of this investigation concerns the influence of using a priori models within non-linear approximation algorithms. Dictionaries with high internal coherence struggle to yield optimally sparse signal representations when used with Matching Pursuit. It is shown, theoretically and empirically, that inserting a priori models into this algorithm improves its capacity to obtain sparse signal approximations, particularly when coherent dictionaries are used. Another point discussed in this preliminary study concerns the approach used in this work for the decomposition of video frames and images. The technique proposed in this thesis improves on previous work, where the authors had to resort to sub-optimal Matching Pursuit strategies (using Genetic Algorithms) given the size of the function library. In this work, full-search strategies become possible, while approximation efficiency is significantly improved and computational complexity is reduced. Finally, a-priori-based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis clarify many open questions and prove promising, encouraging further research on the subject.
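    Since the decompositions above are driven by Matching Pursuit, the following minimal sketch shows the baseline greedy loop over a redundant dictionary. The random dictionary stands in for the thesis's geometric video dictionary, and no a priori model is included; it is an illustration, not the thesis implementation.

```python
# Plain Matching Pursuit over a redundant dictionary with unit-norm atoms
# stored as the columns of D.
import numpy as np

def matching_pursuit(signal, D, n_iter=20):
    """Greedy decomposition: at each step pick the atom most correlated
    with the current residual and subtract its contribution."""
    residual = signal.copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_iter):
        correlations = D.T @ residual
        k = np.argmax(np.abs(correlations))
        coeffs[k] += correlations[k]
        residual -= correlations[k] * D[:, k]
    return coeffs, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((128, 512))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
x = D[:, 7] * 2.0 + D[:, 42] * -1.5      # signal that is sparse in the dictionary
coeffs, res = matching_pursuit(x, D, n_iter=10)
print(np.count_nonzero(coeffs), "atoms used, residual norm", np.linalg.norm(res))
```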

    Real-time scalable video coding for surveillance applications on embedded architectures


    Numerical Issues When Using Wavelets

    Wavelets and related multiscale representations pervade all areas of signal processing. The recent inclusion of wavelet algorithms in JPEG 2000, the new still-picture compression standard, testifies to this lasting and significant impact. The success of wavelets stems from the fact that a wavelet basis represents a large class of signals well, and therefore allows us to detect roughly isotropic elements occurring at all spatial scales and locations. As the noise in the physical sciences is often not Gaussian, the modelling, in the wavelet domain, of many kinds of noise (Poisson noise, combinations of Gaussian and Poisson noise, long-memory 1/f noise, non-stationary noise, ...) has also been a key step for the use of wavelets in scientific, medical, or industrial applications [1]. Extensive wavelet packages now exist, both commercial (see for example [2]) and non-commercial (see for example [3, 4]), which allow any researcher, doctor, or engineer to analyze their data using wavelets.
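    As a hedged illustration of handling non-Gaussian noise in the wavelet domain, the sketch below assumes the PyWavelets package (`pywt`): a Poisson-noisy signal is variance-stabilized with the Anscombe transform, its wavelet coefficients are soft-thresholded, and the result is mapped back with a crude algebraic inverse. The test signal, threshold, and inverse are illustrative choices.

```python
# Anscombe stabilization + wavelet soft thresholding for Poisson noise.
import numpy as np
import pywt

rng = np.random.default_rng(1)
clean = 20 + 10 * np.sin(np.linspace(0, 4 * np.pi, 256))
noisy = rng.poisson(clean).astype(float)

stab = 2.0 * np.sqrt(noisy + 3.0 / 8.0)             # Anscombe: ~unit variance
coeffs = pywt.wavedec(stab, 'db4', level=4)
thr = np.sqrt(2 * np.log(stab.size))                # universal threshold, sigma ~ 1
coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode='soft') for c in coeffs[1:]]
denoised = (pywt.waverec(coeffs, 'db4') / 2.0) ** 2 - 3.0 / 8.0  # crude inverse
print("noisy MSE:", np.mean((noisy - clean) ** 2),
      "denoised MSE:", np.mean((denoised - clean) ** 2))
```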

    A toolbox for the lifting scheme on quincunx grids (LISQ)

    A collection of functions written in MATLAB is presented. The functions include second-generation wavelet decomposition and reconstruction tools for images, as well as functions for the computation of moments. The wavelet schemes rely on the lifting scheme of Sweldens and use the splitting of rectangular grids into quincunx grids, also known as red-black ordering. The prediction filters include the Neville filters as well as a nonlinear max-min filter; custom-made filters can be used too. The various functions are described and examples are given. The toolbox comes with utilities for the visualization of data on quincunx grids. The software can be downloaded from a website and is publicly available.
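    The sketch below is a NumPy illustration of the core idea, not the MATLAB toolbox itself: a single predict step on a red-black (quincunx) split, where each "black" pixel is predicted from the average of its four "red" neighbours, leaving detail coefficients on the black sites. Periodic boundaries and the simple first-order predictor are simplifying assumptions (the toolbox also offers higher-order Neville and nonlinear predictors plus update steps).

```python
# One lifting predict step on a quincunx / red-black split of an image.
import numpy as np

def quincunx_predict(img):
    img = img.astype(float)
    y, x = np.indices(img.shape)
    black = (y + x) % 2 == 1                     # checkerboard split
    # Average of the 4 horizontal/vertical neighbours (all red sites when
    # centered on a black site), with periodic wrap-around for simplicity.
    neigh = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
             np.roll(img, 1, 1) + np.roll(img, -1, 1)) / 4.0
    detail = np.where(black, img - neigh, 0.0)   # details live on black sites
    approx = np.where(black, 0.0, img)           # red sites carry the approximation
    return approx, detail, black

img = np.add.outer(np.arange(8), np.arange(8)).astype(float)  # smooth ramp
approx, detail, black = quincunx_predict(img)
print("max |detail| on a smooth ramp:", np.abs(detail).max())
# Small in the interior (the predictor is exact for linear data); nonzero
# only where the periodic wrap-around breaks the ramp.
```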

    First-arrival Travel-Time Tomography using Second Generation Wavelets

    Wavelet decomposition of the slowness model has been proposed as a multiscale strategy for seismic first-arrival time tomography. We propose the introduction of so-called second-generation wavelets, which can be used for any mesh structure and do not require the number of samples to be a power of two in each direction. Moreover, boundary effects can be handled easily. A linearized procedure is used for inverting delayed travel-times, considering either slowness coefficients or wavelet coefficients as unknowns. Ray tracing is performed at each iteration through an eikonal solver, while the linear system at each iteration is solved with an iterative solver such as the LSQR algorithm. We develop wavelet decompositions over constant patches (Haar wavelet) or over linear patches (Battle-Lemarie wavelet) of coefficients at different scales. This decomposition is introduced into the linear system to be solved, and the wavelet coefficients are considered as the unknowns to be inverted. Synthetic examples show that the inversion behaves better, as the wavelet decomposition appears to act as a preconditioner of the linear system. Local discretisation is possible but requires additional implementation work, because artefacts, once built into the model description, never disappear due to the linearized approach. A binary mask operator is designed for each scale grid and can be applied locally, leading to quite different spatial resolutions depending on the expected resolution at a given position in the medium. We show that it is indeed possible to design this binary operator, and we apply it to synthetic examples such as a crosswell experiment inside the Marmousi model. An application to a surface-to-surface experiment has also been performed; the wavelet decomposition shows that we can recover detailed features near the free surface while preventing imprints of the ray coverage at greater depths, where smooth features are obtained instead. In spite of the increased demand on computer resources, the wavelet decomposition appears to be a rather promising alternative for controlling the resolution variation of seismic first-arrival tomography.
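    A hedged sketch of the central idea, inverting for wavelet coefficients rather than slowness samples, is given below on a toy 1D straight-ray problem. The single-level Haar basis, the random ray geometry, and the LSQR damping are illustrative assumptions and do not reproduce the paper's eikonal-based, multiscale setup.

```python
# Travel-time inversion with a wavelet-parameterized slowness model.
import numpy as np
from scipy.sparse.linalg import lsqr

n = 16                                         # slowness samples along a line
rng = np.random.default_rng(2)

# Toy forward operator: each "ray" integrates slowness over a random interval.
G = np.zeros((40, n))
for i in range(G.shape[0]):
    a, b = sorted(rng.integers(0, n, size=2))
    G[i, a:b + 1] = 1.0                        # path length 1 per cell crossed

# Single-level Haar synthesis matrix: slowness = Winv @ [approx; detail].
Winv = np.zeros((n, n))
for k in range(n // 2):
    Winv[2 * k, k] = Winv[2 * k + 1, k] = 1 / np.sqrt(2)      # approx part
    Winv[2 * k, n // 2 + k] = 1 / np.sqrt(2)                  # detail part
    Winv[2 * k + 1, n // 2 + k] = -1 / np.sqrt(2)

true_slowness = 1.0 + 0.2 * (np.arange(n) > n // 2)           # step model
travel_times = G @ true_slowness

# Invert for wavelet coefficients: (G Winv) w = t, then map back to slowness.
w = lsqr(G @ Winv, travel_times, damp=1e-3)[0]
print("recovered slowness:", np.round(Winv @ w, 3))
```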

    Distortion estimates for adaptive lifting transforms with noise

    Multimedia analysis, enhancement and coding methods often resort to adaptive transforms that exploit local characteristics of the input source. Following the signal decomposition stage, the produced transform coefficients and the adaptive transform parameters can be subject to quantization and/or data corruption (e.g. due to transmission or storage limitations). As a result, mismatches between the analysis-side and synthesis-side transform coefficients and adaptive parameters may occur, severely impacting the reconstructed signal and therefore affecting the quality of subsequent analysis, processing and display tasks. Hence, a thorough understanding of the quality degradation ensuing from such mismatches is essential for multimedia applications that rely on adaptive signal decompositions. This paper focuses on lifting-based adaptive transforms, which represent a broad class of adaptive decompositions. By viewing the mismatches in the transform coefficients and the adaptive parameters as perturbations in the synthesis system, we derive analytic expressions for the expected reconstruction distortion. Our theoretical results are experimentally assessed using 1D adaptive decompositions and motion-adaptive temporal decompositions of video signals.
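    The following minimal Monte-Carlo sketch illustrates the mismatch problem empirically rather than through the paper's analytic derivation: a 1D adaptive lifting step whose per-sample predictor choice is flipped with some probability at the synthesis side, with the resulting reconstruction distortion measured directly. The predictor set and the corruption model are illustrative assumptions.

```python
# Adaptive lifting with corrupted adaptive parameters: empirical distortion.
import numpy as np

def analyze(x):
    """One adaptive lifting step: each odd sample is predicted from the better
    (left or right) even neighbour; that choice is the adaptive parameter."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    left, right = even, np.roll(even, -1)
    choice = (np.abs(odd - right) < np.abs(odd - left)).astype(int)  # 0=left, 1=right
    detail = odd - np.where(choice == 1, right, left)
    return even, detail, choice

def synthesize(even, detail, choice):
    left, right = even, np.roll(even, -1)
    odd = detail + np.where(choice == 1, right, left)
    x = np.empty(even.size * 2)
    x[0::2], x[1::2] = even, odd
    return x

rng = np.random.default_rng(3)
x = np.cumsum(rng.standard_normal(256))            # smooth-ish test signal
even, detail, choice = analyze(x)

for p_err in (0.0, 0.05, 0.2):                     # probability of a wrong choice
    flips = rng.random(choice.size) < p_err
    bad_choice = np.where(flips, 1 - choice, choice)
    mse = np.mean((synthesize(even, detail, bad_choice) - x) ** 2)
    print(f"choice-error prob {p_err:.2f}: reconstruction MSE {mse:.4f}")
```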

    MURPHY -- A scalable multiresolution framework for scientific computing on 3D block-structured collocated grids

    We present the derivation, implementation, and analysis of a multiresolution adaptive grid framework for numerical simulations on octree-based 3D block-structured collocated grids with distributed computational architectures. Our approach provides a consistent handling of non-lifted and lifted interpolating wavelets of arbitrary order, demonstrated using second-, fourth-, and sixth-order wavelets combined with standard finite-difference discretization operators. We first validate that the wavelet family used provides strict and explicit error control when coarsening the grid, and show that lifting wavelets increase the grid compression rate while conserving discrete moments across levels. Further, we demonstrate that high-order PDE discretization schemes combined with sufficiently high-order wavelets retain the expected convergence order even at resolution jumps. We then simulate the advection of a scalar to analyze convergence for the temporal evolution of a PDE. The results show that our wavelet-based refinement criterion successfully controls the overall error, while the coarsening criterion is effective at retaining the relevant information on a compressed grid. Our software exploits a block-structured grid data structure for efficient multi-level operations, combined with a parallelization strategy that relies on a one-sided MPI-RMA communication approach with active PSCW synchronization. Using performance tests on up to 16,384 cores, we demonstrate that this leads to highly scalable performance. The associated code is available under a BSD-3 license at https://github.com/vanreeslab/murphy. Comment: submitted to SIAM Journal of Scientific Computing (SISC) on Dec 1
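    As a hedged 1D illustration of a wavelet-based refinement/coarsening criterion of this kind, the sketch below computes detail coefficients of a fourth-order interpolating (Deslauriers-Dubuc style) wavelet and marks for coarsening the blocks whose maximum detail falls below a tolerance. The block size, tolerance, and 1D setting are simplifications of the 3D block-structured framework, not its implementation.

```python
# Interpolating-wavelet details as a grid refinement/coarsening criterion.
import numpy as np

def interpolating_details(x):
    """Predict odd samples by 4-point (cubic) interpolation of even samples;
    the prediction error is the detail coefficient."""
    even, odd = x[0::2], x[1::2]
    pred = (-np.roll(even, 1) + 9 * even + 9 * np.roll(even, -1)
            - np.roll(even, -2)) / 16.0
    return odd - pred

# Smooth signal with one sharp feature: only its neighbourhood stays fine.
x = np.sin(np.linspace(0, 2 * np.pi, 256)) + 0.5 * (np.linspace(0, 1, 256) > 0.7)
details = interpolating_details(x)

block, tol = 16, 1e-3
for b in range(details.size // block):
    d = np.abs(details[b * block:(b + 1) * block]).max()
    action = "coarsen" if d < tol else "keep fine"
    print(f"block {b:2d}: max|detail| = {d:.1e} -> {action}")
```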