49 research outputs found

    Learning-based Wavelet-like Transforms For Fully Scalable and Accessible Image Compression

    Full text link
    The goal of this thesis is to improve the existing wavelet transform with the aid of machine learning techniques, so as to enhance coding efficiency of wavelet-based image compression frameworks, such as JPEG 2000. In this thesis, we first propose to augment the conventional base wavelet transform with two additional learned lifting steps -- a high-to-low step followed by a low-to-high step. The high-to-low step suppresses aliasing in the low-pass band by using the detail bands at the same resolution, while the low-to-high step aims to further remove redundancy from detail bands by using the corresponding low-pass band. These two additional steps reduce redundancy (notably aliasing information) amongst the wavelet subbands, and also improve the visual quality of reconstructed images at reduced resolutions. To train these two networks in an end-to-end fashion, we develop a backward annealing approach to overcome the non-differentiability of the quantization and cost functions during back-propagation. Importantly, the two additional networks share a common architecture, named a proposal-opacity topology, which is inspired and guided by a specific theoretical argument related to geometric flow. This particular network topology is compact and with limited non-linearities, allowing a fully scalable system; one pair of trained network parameters are applied for all levels of decomposition and for all bit-rates of interest. By employing the additional lifting networks within the JPEG2000 image coding standard, we can achieve up to 17.4% average BD bit-rate saving over a wide range of bit-rates, while retaining the quality and resolution scalability features of JPEG2000. Built upon the success of the high-to-low and low-to-high steps, we then study more broadly the extension of neural networks to all lifting steps that correspond to the base wavelet transform. The purpose of this comprehensive study is to understand what is the most effective way to develop learned wavelet-like transforms for highly scalable and accessible image compression. Specifically, we examine the impact of the number of learned lifting steps, the number of layers and the number of channels in each learned lifting network, and kernel support in each layer. To facilitate the study, we develop a generic training methodology that is simultaneously appropriate to all lifting structures considered. Experimental results ultimately suggest that to improve the existing wavelet transform, it is more profitable to augment a larger wavelet transform with more diverse high-to-low and low-to-high steps, rather than developing deep fully learned lifting structures

    Scalable video compression with optimized visual performance and random accessibility

    Full text link
    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video

    Discrete Wavelet Transforms

    Get PDF
    The discrete wavelet transform (DWT) algorithms have a firm position in processing of signals in several areas of research and industry. As DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book: Discrete Wavelet Transforms: Algorithms and Applications reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA-implementation, lifting based algorithm for VLSI implementation, comparison between DWT and FFT based OFDM and modified SPIHT codec. Part II addresses image processing algorithms such as multiresolution approach for edge detection, low bit rate image compression, low complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses watermaking DWT algorithms. Finally, Part IV describes shift invariant DWTs, DC lossless property, DWT based analysis and estimation of colored noise and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications

    Fixed-analysis adaptive-synthesis filter banks

    Get PDF
    Subband/Wavelet filter analysis-synthesis filters are a major component in many compression algorithms. Such compression algorithms have been applied to images, voice, and video. These algorithms have achieved high performance. Typically, the configuration for such compression algorithms involves a bank of analysis filters whose coefficients have been designed in advance to enable high quality reconstruction. The analysis system is then followed by subband quantization and decoding on the synthesis side. Decoding is performed using a corresponding set of synthesis filters and the subbands are merged together. For many years, there has been interest in improving the analysis-synthesis filters in order to achieve better coding quality. Adaptive filter banks have been explored by a number of authors where by the analysis filters and synthesis filters coefficients are changed dynamically in response to the input. A degree of performance improvement has been reported but this approach does require that the analysis system dynamically maintain synchronization with the synthesis system in order to perform reconstruction. In this thesis, we explore a variant of the adaptive filter bank idea. We will refer to this approach as fixed-analysis adaptive-synthesis filter banks. Unlike the adaptive filter banks proposed previously, there is no analysis synthesis synchronization issue involved. This implies less coder complexity and more coder flexibility. Such an approach can be compatible with existing subband wavelet encoders. The design methodology and a performance analysis are presented.Ph.D.Committee Chair: Smith, Mark J. T.; Committee Co-Chair: Mersereau, Russell M.; Committee Member: Anderson, David; Committee Member: Lanterman, Aaron; Committee Member: Rosen, Gail; Committee Member: Wardi, Yora

    3D Wavelet Transformation for Visual Data Coding With Spatio and Temporal Scalability as Quality Artifacts: Current State Of The Art

    Get PDF
    Several techniques based on the three–dimensional (3-D) discrete cosine transform (DCT) have been proposed for visual data coding. These techniques fail to provide coding coupled with quality and resolution scalability, which is a significant drawback for contextual domains, such decease diagnosis, satellite image analysis. This paper gives an overview of several state-of-the-art 3-D wavelet coders that do meet these requirements and mainly investigates various types of compression techniques those exists, and putting it all together for a conclusion on further research scope

    Efficient compression of motion compensated residuals

    Get PDF
    EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Solutions to non-stationary problems in wavelet space.

    Get PDF

    Multiresolution image models and estimation techniques

    Get PDF

    Nondyadic and nonlinear multiresolution image approximations

    Get PDF
    This thesis focuses on the development of novel multiresolution image approximations. Specifically, we present two kinds of generalization of multiresolution techniques: image reduction for arbitrary scales, and nonlinear approximations using other metrics than the standard Euclidean one. Traditional multiresolution decompositions are restricted to dyadic scales. As first contribution of this thesis, we develop a method that goes beyond this restriction and that is well suited to arbitrary scale-change computations. The key component is a new and numerically exact algorithm for computing inner products between a continuously defined signal and B-splines of any order and of arbitrary sizes. The technique can also be applied for non-uniform to uniform grid conversion, which is another approximation problem where our method excels. Main applications are resampling and signal reconstruction. Although simple to implement, least-squares approximations lead to artifacts that could be reduced if nonlinear methods would be used instead. The second contribution of the thesis is the development of nonlinear spline pyramids that are optimal for lp-norms. First, we introduce a Banach-space formulation of the problem and show that the solution is well defined. Second, we compute the lp-approximation thanks to an iterative optimization algorithm based on digital filtering. We conclude that l1-approximations reduce the artifacts that are inherent to least-squares methods; in particular, edge blurring and ringing. In addition, we observe that the error of l1-approximations is sparser. Finally, we derive an exact formula for the asymptotic Lp-error; this result justifies using the least-squares approximation as initial solution for the iterative optimization algorithm when the degree of the spline is even; otherwise, one has to include an appropriate correction term. The theoretical background of the thesis includes the modelisation of images in a continuous/discrete formalism and takes advantage of the approximation theory of linear shift-invariant operators. We have chosen B-splines as basis functions because of their nice properties. We also propose a new graphical formalism that links B-splines, finite differences, differential operators, and arbitrary scale changes
    corecore