14 research outputs found

    Error Concealment for Frame Losses in MDC

    AUTOMATED ESTIMATION, REDUCTION, AND QUALITY ASSESSMENT OF VIDEO NOISE FROM DIFFERENT SOURCES

    Estimating and removing noise from video signals is important to increase either the visual quality of video signals or the performance of video processing algorithms such as compression or segmentation where noise estimation or reduction is a pre-processing step. To estimate and remove noise, effective methods use both spatial and temporal information to increase the reliability of signal extraction from noise. The objective of this thesis is to introduce a video system having three novel techniques to estimate and reduce video noise from different sources, both effectively and efficiently, and to assess video quality without a noise-free reference video. The first (intensity-variances based homogeneity classification) technique estimates visual noise of different types in images and video signals. The noise can be white Gaussian noise, mixed Poissonian-Gaussian (signal-dependent white) noise, or processed (frequency-dependent) noise. The method is based on the classification of intensity-variances of signal patches in order to find homogeneous regions that best represent the noise signal in the input signal. The method assumes that noise is signal-independent in each intensity class. To find homogeneous regions, the method works on the downsampled input image and divides it into patches. Each patch is assigned to an intensity class, whereas outlier patches are rejected. Then the most homogeneous cluster is selected and its noise variance is considered as the peak of noise variance. To account for processed noise, we estimate the degree of spatial correlation. To account for temporal noise variations, a stabilization process is proposed. We show that the proposed method is competitive with the related state of the art in noise estimation. The second technique provides solutions to remove real-world camera noise such as signal-independent, signal-dependent, and frequency-dependent noise.
Firstly, we propose a noise equalization method in the intensity and frequency domains which enables a white Gaussian noise filter to handle real noise. Our experiments confirm the quality improvement under real noise when a white Gaussian noise filter is used with our equalization method. Secondly, we propose a band-limited time-space video denoiser which reduces video noise of different types. This denoiser consists of: 1) intensity-domain noise equalization to account for signal dependency, 2) band-limited anti-blocking time-domain filtering of the current frame using motion-compensated previous and subsequent frames, 3) spatial filtering combined with a noise frequency equalizer to remove residual noise left from temporal filtering, and 4) intensity de-equalization to invert the first step. To decrease the chance of motion blur, temporal weights are calculated using two levels of error estimation: coarse (block-level) and fine (pixel-level). We correct the erroneous motion vectors by creating a homography from reliable motion vectors. To eliminate blockiness in the block-based temporal filter, we propose three ideas: interpolation of block-level error, band-limited filtering by subtracting the back-signal beforehand, and two-band motion compensation. The proposed time-space filter is parallelizable and can be significantly accelerated on a GPU. We show that the proposed method is competitive with the related state of the art in video denoising. The third (sparsity and dominant orientation quality index) technique is a new method to assess the quality of the denoised video frames without a reference (clean frames). In many image and video applications, a quantitative measure of image content, noise, and blur is required to facilitate quality assessment when the ground truth is not available. We propose a fast method to find the dominant orientation of image patches, which is used to decompose them into singular values.
Combining singular values with the sparsity of the patch in the transform domain, we measure the possible image content and noise of the patches and of the whole image. To measure the effect of noise accurately, our method takes both low and high textured patches into account. Before analyzing the patches, we apply a shrinkage in the transform domain to increase the contrast of genuine image structure. We show that the proposed method is useful to select parameters of denoising algorithms automatically in different noise scenarios such as white Gaussian and real noise. Our objective and subjective results confirm the correspondence between the measured quality and the ground truth, and the proposed method rivals related state-of-the-art approaches.
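
The patch-based noise estimation idea in the abstract above (divide the image into patches, reject outlier patches, estimate the noise from the homogeneous ones) can be illustrated with a much simpler stand-in. The sketch below is not the thesis's intensity-variance classification method: it uses no intensity classes, no downsampling, and no correlation or temporal handling. The names `estimate_noise_variance`, `make_test_image`, `patch`, and `reject_factor` are hypothetical and chosen only for this example.

```python
import random
from statistics import median, variance


def estimate_noise_variance(image, patch=8, reject_factor=3.0):
    """Rough noise-variance estimate from homogeneous patches.

    image is a list of rows of floats. patch and reject_factor are
    illustrative assumptions, not values taken from the thesis.
    """
    height, width = len(image), len(image[0])
    patch_vars = []
    # Sample variance of every non-overlapping patch.
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            pixels = [image[top + dy][left + dx]
                      for dy in range(patch) for dx in range(patch)]
            patch_vars.append(variance(pixels))
    # Treat patches far above the median variance as structure (outliers).
    limit = reject_factor * median(patch_vars)
    homogeneous = [v for v in patch_vars if v <= limit]
    # Average the remaining patch variances as the noise estimate.
    return sum(homogeneous) / len(homogeneous)


def make_test_image(size=64, sigma=5.0, seed=0):
    """Synthetic image: flat area plus a checkerboard strip, with Gaussian noise."""
    rng = random.Random(seed)
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            clean = 100.0 if x < 3 * size // 4 else 200.0 * ((x + y) % 2)
            row.append(clean + rng.gauss(0.0, sigma))
        image.append(row)
    return image
```

On the synthetic image, the textured strip produces very large patch variances that are rejected, so the estimate should land close to the true noise variance of `sigma ** 2`. The median-based rejection only works when homogeneous patches are the majority, which is one reason a real estimator needs more care.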

    Motion estimation and signaling techniques for 2D+t scalable video coding

    We describe a fully scalable wavelet-based 2D+t (in-band) video coding architecture. We propose new coding tools specifically designed for this framework, aimed at two goals: reducing the computational complexity at the encoder without sacrificing compression, and improving the coding efficiency, especially at low bitrates. To this end, we focus our attention on motion estimation and motion vector encoding. We propose a fast motion estimation algorithm that works in the wavelet domain and exploits the geometrical properties of the wavelet subbands. We show that the computational complexity grows linearly with the size of the search window, while approaching the performance of a full search strategy. We extend the proposed motion estimation algorithm to work with blocks of variable sizes in order to better capture local motion characteristics, thus improving the rate-distortion behavior. Given this motion field representation, we propose a motion vector coding algorithm that adaptively scales the motion bit budget according to the target bitrate, improving the coding efficiency at low bitrates. Finally, we show how to optimally scale the motion field when the sequence is decoded at reduced spatial resolution. Experimental results illustrate the advantages of each individual coding tool presented in this paper. Based on these simulations, we define the best configuration of coding parameters and we compare the proposed codec with MC-EZBC, a widely used reference codec implementing the t+2D framework.
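
For context on the full search strategy that the abstract above uses as its baseline, here is a minimal sketch of exhaustive block matching in the pixel domain with a sum-of-absolute-differences (SAD) cost. It is not the paper's wavelet-domain algorithm. The function names and the parameters `n` and `radius` are assumptions made for this illustration. A full search evaluates up to `(2 * radius + 1) ** 2` candidates per block, so its cost grows quadratically with the search radius.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between an n x n block of cur at (by, bx)
    and the block of ref displaced by (dy, dx)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, by, bx, n=4, radius=2):
    """Return ((dy, dx), cost) minimising SAD over a square search window.

    The vector (dy, dx) means: the block of cur at (by, bx) is best matched
    by the block of ref at (by + dy, bx + dx).
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost


# Tiny synthetic example: cur is ref shifted by one row and minus two columns.
ref_frame = [[16 * y + x for x in range(12)] for y in range(12)]
cur_frame = [[16 * (y + 1) + (x - 2) for x in range(12)] for y in range(12)]
vector, cost = full_search(cur_frame, ref_frame, by=4, bx=4, n=4, radius=2)
```

In this example the ramp pattern makes the zero-cost match unique, so `vector` is `(1, -2)` and `cost` is `0`.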

    Spatial and Temporal Image Prediction with Magnitude and Phase Representations

    In this dissertation, I develop the theory and techniques for spatial and temporal image prediction with the magnitude and phase representation of the Complex Wavelet Transform (CWT) or the over-complete DCT to solve the problems of image inpainting and motion compensated inter-picture prediction. First, I develop the theory and algorithms of image reconstruction from the analytic magnitude or phase of the CWT. I prove the conditions under which a signal is uniquely specified by its analytic magnitude or phase, propose iterative algorithms for the reconstruction of a signal from its analytic CWT magnitude or phase, and analyze the convergence of the proposed algorithms. Image reconstruction from the magnitude and pseudo-phase of the over-complete DCT is also discussed and demonstrated. Second, I propose simple geometrical models of the CWT magnitude and phase to describe edges and structured textures and develop a spatial image prediction (inpainting) algorithm based on those models and the iterative image reconstruction mentioned above. Piecewise smooth signals, structured textures and their mixtures can be predicted successfully with the proposed algorithm. Simulation results show that the proposed algorithm achieves appealing visual quality with low computational complexity. Finally, I propose a novel temporal (inter-picture) image predictor for hybrid video coding. The proposed predictor enables successful predictive coding during fades, blended scenes, temporally decorrelated noise, and many other temporal evolutions that are beyond the capability of the traditional motion compensated prediction methods. The proposed predictor estimates the transform magnitude and phase of the desired motion compensated prediction by exploiting the temporal and spatial correlations of the transform coefficients. For the case of implementation in standard hybrid video coders, the over-complete DCT is chosen over the CWT. 
Better coding performance is achieved with the state-of-the-art H.264/AVC video encoder equipped with the proposed predictor. The proposed predictor is also successfully applied to image registration.

    Fuzzy techniques for noise removal in image sequences and interval-valued fuzzy mathematical morphology

    Image sequences play an important role in today's world. They provide us with a lot of information. Videos are, for example, used for traffic observations, surveillance systems, autonomous navigation and so on. Due to bad acquisition, transmission or recording, the sequences are however usually corrupted by noise, which hampers the functioning of many image processing techniques. A preprocessing module to filter the images often becomes necessary. After an introduction to fuzzy set theory and image processing, in the first main part of the thesis, several fuzzy logic based video filters are proposed: one filter for grayscale video sequences corrupted by additive Gaussian noise and two color extensions of it, and two grayscale filters and one color filter for sequences affected by random-valued impulse noise. In the second main part of the thesis, interval-valued fuzzy mathematical morphology is studied. Mathematical morphology is a theory intended for the analysis of spatial structures that has found application in, e.g., edge detection, object recognition, pattern recognition, image segmentation, and image magnification. In the thesis, an overview is given of the evolution from binary mathematical morphology over the different grayscale morphology theories to interval-valued fuzzy mathematical morphology and the interval-valued image model. Additionally, the basic properties of the interval-valued fuzzy morphological operators are investigated. Next, the decomposition of the interval-valued fuzzy morphological operators is also investigated. We investigate the relationship between the cut of the result of such an operator applied on an interval-valued image and structuring element and the result of the corresponding binary operator applied on the cut of the image and structuring element.
These results are first of all interesting because they provide a link between interval-valued fuzzy mathematical morphology and binary mathematical morphology, but such conversion into binary operators also reduces the computation. Finally, the reverse problem is also tackled, i.e., the construction of interval-valued morphological operators from the binary ones. Using the results from a more general study in which the construction of an interval-valued fuzzy set from a nested family of crisp sets is investigated, increasing binary operators (e.g. the binary dilation) are extended to interval-valued fuzzy operators.
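
The link between cuts and binary operators discussed in the abstract above has a well-known classical analogue that is easy to check numerically: for grayscale dilation with a flat structuring element, thresholding the dilated signal gives the same set as dilating the thresholded signal. The sketch below demonstrates only this classical case on a 1D signal. It does not implement the interval-valued fuzzy operators studied in the thesis, and all function names are hypothetical.

```python
def gray_dilate(signal, offsets):
    """Flat grayscale dilation: maximum over the shifted neighbourhood.

    offsets must contain 0 so every neighbourhood is non-empty.
    """
    n = len(signal)
    return [max(signal[i - o] for o in offsets if 0 <= i - o < n)
            for i in range(n)]


def binary_dilate(mask, offsets):
    """Binary dilation: True where any neighbour in the mask is True."""
    n = len(mask)
    return [any(mask[i - o] for o in offsets if 0 <= i - o < n)
            for i in range(n)]


def cut(signal, level):
    """Threshold set (cut) of a signal at the given level."""
    return [value >= level for value in signal]


def cuts_commute(signal, offsets):
    """Check that cut(dilate(f)) equals dilate(cut(f)) at every level of f."""
    dilated = gray_dilate(signal, offsets)
    return all(cut(dilated, level) == binary_dilate(cut(signal, level), offsets)
               for level in set(signal))


example_signal = [0, 3, 1, 0, 0, 5, 2, 0]
example_offsets = [-1, 0, 1]
```

The identity holds because a maximum reaches a level exactly when at least one of its arguments does, so `cuts_commute(example_signal, example_offsets)` evaluates to `True`.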

    Toward sparse and geometry adapted video approximations

    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most of past and present strategies used to represent video signals do not exploit properly its spatial geometry. Similarly to the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. Through this PhD dissertation, different aspects on the use of adaptivity in video representation are studied looking toward exploiting both aspects of video: its piecewise nature and the geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering. 
A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves coding performance of moving edges in 3D transform (without motion compensation) based video coding. At the same time, adaptivity allows redundancy in non-moving video areas to be exploited equally. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows an optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation cannot be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understanding the fundamentals of how temporal and spatial geometry can be jointly exploited. This work builds on some previous results that considered the representation of spatial geometry in video (but not temporal, i.e., without motion). In order to obtain flexible and efficient (sparse) signal representations using redundant dictionaries, the use of highly non-linear decomposition algorithms, like Matching Pursuit, is required. General signal representation using these techniques is still quite unexplored. For this reason, prior to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuit and a geometric dictionary are investigated. A part of this investigation concerns the influence of using a priori models within non-linear approximation algorithms. Dictionaries with a high internal coherence have difficulty obtaining optimally sparse signal representations when used with Matching Pursuit.
It is proved, theoretically and empirically, that inserting a priori models into this algorithm improves its capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study on the use of Matching Pursuit concerns the approach used in this work for the decomposition of video frames and images. The technique proposed in this thesis improves on a previous work, where the authors had to resort to sub-optimal Matching Pursuit strategies (using Genetic Algorithms), given the size of the function library. In this work the use of full search strategies is made possible, while approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis clarify many unknowns, are promising, and encourage further research on the subject.
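
Since the abstract above relies on Matching Pursuit over redundant dictionaries, a minimal sketch of the plain greedy algorithm may help. This is the textbook version with a full search over a tiny dictionary of unit-norm atoms. It includes none of the a priori models, geometric dictionaries, or regularity constraints developed in the thesis, and the names `matching_pursuit`, `dot`, and `example_dictionary` are assumptions for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy sparse approximation of signal over unit-norm atoms.

    Returns (coeffs, residual), where coeffs[k] is the accumulated weight
    of dictionary[k] and residual is what remains unexplained.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Full search: correlate the residual with every atom.
        correlations = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        # Keep the best atom's projection and subtract it from the residual.
        coeffs[k] += correlations[k]
        residual = [r - correlations[k] * a
                    for r, a in zip(residual, dictionary[k])]
    return coeffs, residual


# Redundant dictionary in the plane: two axis atoms plus one diagonal atom.
s = 1.0 / math.sqrt(2.0)
example_dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
```

For the signal `[3.0, 3.0]`, a single iteration selects the diagonal atom and leaves a residual that is zero up to rounding, whereas the two axis atoms alone would need two iterations. Each iteration never increases the residual energy, because it removes the projection onto a unit-norm atom.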