8 research outputs found

    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    Full text link
    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain, and sometimes invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding.
    Comment: 65 pages, 33 figures, 303 references
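    To make the "sparse representation over a redundant dictionary" idea concrete, here is a minimal matching-pursuit sketch; it assumes a generic random overcomplete dictionary rather than any of the multiscale oriented dictionaries surveyed, and the sizes and atom count are illustrative only.

```python
# Minimal matching-pursuit sketch for sparse coding over a redundant
# dictionary, a toy instance of the greedy search algorithms the survey
# alludes to. The dictionary D is random (hypothetical), not one of the
# multiscale oriented dictionaries from the paper.
import numpy as np

def matching_pursuit(x, D, n_atoms=10):
    """Greedily pick dictionary atoms (columns of D) to approximate x."""
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        correlations = D.T @ residual          # inner products with all atoms
        k = np.argmax(np.abs(correlations))    # best-matching atom
        coeffs[k] += correlations[k]
        residual -= correlations[k] * D[:, k]  # remove its contribution
    return coeffs, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))             # overcomplete: 256 atoms in R^64
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
x = rng.standard_normal(64)
coeffs, residual = matching_pursuit(x, D, n_atoms=20)
print(np.count_nonzero(coeffs), np.linalg.norm(residual))
```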

    Image representation and compression using steered Hermite transforms

    Get PDF

    A common framework for rate and distortion based scaling of highly scalable compressed video

    Full text link

    Transforms for prediction residuals in video coding

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 135-140).
    Typically the same transform, the 2-D Discrete Cosine Transform (DCT), is used to compress both image intensities in image coding and prediction residuals in video coding. Major prediction residuals include the motion-compensated prediction residual, the resolution enhancement residual in scalable video coding, and the intra prediction residual in intra-frame coding. The 2-D DCT is efficient at decorrelating images, but the spatial characteristics of prediction residuals can be significantly different from those of images, and developing transforms adapted to the characteristics of prediction residuals can improve their compression efficiency. In this thesis, we explore the differences between the characteristics of images and prediction residuals by analyzing their local anisotropic characteristics, and we develop transforms adapted to the local anisotropic characteristics of some types of prediction residuals. The analysis shows that local regions in images have 2-D anisotropic characteristics, while many regions in several types of prediction residuals have 1-D anisotropic characteristics. Based on this insight, we develop 1-D transforms for these residuals. We perform experiments to evaluate the potential gains achievable from using these transforms within the H.264 codec, and the experimental results indicate that these transforms can increase the compression efficiency of these residuals.
    By Fatih Kamışlı. Ph.D.
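    As a rough illustration of the 1-D-versus-2-D point (not the thesis's actual H.264-integrated transform set), the sketch below compares how a separable 2-D DCT and a row-wise 1-D DCT compact the energy of a hypothetical residual block with horizontal 1-D structure; the block contents and the four-coefficient energy measure are arbitrary choices.

```python
# Toy comparison of a separable 2-D DCT against a row-wise 1-D DCT on a
# residual block with strong 1-D (horizontal) structure, illustrating why
# 1-D transforms can compact such residuals better.
import numpy as np
from scipy.fft import dct

def dct2(block):
    """Separable 2-D DCT-II, orthonormal."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def dct_rows(block):
    """1-D DCT-II applied independently to each row."""
    return dct(block, axis=1, norm='ortho')

# Hypothetical residual block: energy concentrated along one horizontal line,
# i.e. strongly 1-D anisotropic, as the thesis observes for many residuals.
block = np.zeros((8, 8))
block[3, :] = np.linspace(1.0, 2.0, 8)

for name, coeffs in [("2-D DCT", dct2(block)), ("row-wise 1-D DCT", dct_rows(block))]:
    mags = np.sort(np.abs(coeffs).ravel())[::-1]
    # Fraction of energy captured by the 4 largest-magnitude coefficients.
    print(name, (mags[:4] ** 2).sum() / (mags ** 2).sum())
```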

    Learning-based Wavelet-like Transforms For Fully Scalable and Accessible Image Compression

    Full text link
    The goal of this thesis is to improve the existing wavelet transform with the aid of machine learning techniques, so as to enhance the coding efficiency of wavelet-based image compression frameworks such as JPEG 2000. In this thesis, we first propose to augment the conventional base wavelet transform with two additional learned lifting steps -- a high-to-low step followed by a low-to-high step. The high-to-low step suppresses aliasing in the low-pass band by using the detail bands at the same resolution, while the low-to-high step aims to further remove redundancy from the detail bands by using the corresponding low-pass band. These two additional steps reduce redundancy (notably aliasing information) amongst the wavelet subbands, and also improve the visual quality of reconstructed images at reduced resolutions. To train these two networks in an end-to-end fashion, we develop a backward annealing approach to overcome the non-differentiability of the quantization and cost functions during back-propagation. Importantly, the two additional networks share a common architecture, named a proposal-opacity topology, which is inspired and guided by a specific theoretical argument related to geometric flow. This particular network topology is compact and has limited non-linearities, allowing a fully scalable system; one pair of trained network parameters is applied for all levels of decomposition and for all bit-rates of interest. By employing the additional lifting networks within the JPEG 2000 image coding standard, we can achieve up to 17.4% average BD bit-rate saving over a wide range of bit-rates, while retaining the quality and resolution scalability features of JPEG 2000.
    Building upon the success of the high-to-low and low-to-high steps, we then study more broadly the extension of neural networks to all lifting steps that correspond to the base wavelet transform. The purpose of this comprehensive study is to understand the most effective way to develop learned wavelet-like transforms for highly scalable and accessible image compression. Specifically, we examine the impact of the number of learned lifting steps, the number of layers and channels in each learned lifting network, and the kernel support in each layer. To facilitate the study, we develop a generic training methodology that is simultaneously appropriate to all lifting structures considered. Experimental results ultimately suggest that to improve the existing wavelet transform, it is more profitable to augment a larger wavelet transform with more diverse high-to-low and low-to-high steps than to develop deep, fully learned lifting structures.
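    A minimal linear sketch of the lifting structure being augmented is shown below; the predict/update pair is a standard 5/3-style lifting step, and the high_to_low function is only a fixed-filter stand-in for the learned aliasing-suppression network described in the abstract (its weight parameter is made up, and periodic boundaries are used for brevity).

```python
# Minimal lifting sketch: a 5/3-style predict/update pair producing low-pass
# (L) and detail (H) bands, followed by a toy extra "high-to-low" correction
# that uses H to adjust L. In the thesis these extra steps are small learned
# networks; the fixed linear filter here is only a stand-in.
import numpy as np

def lifting_analysis(x):
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    # Predict: detail = odd samples minus a prediction from even neighbours.
    H = odd - 0.5 * (even + np.roll(even, -1))
    # Update: low-pass = even samples plus a correction from the details.
    L = even + 0.25 * (H + np.roll(H, 1))
    return L, H

def high_to_low(L, H, weight=0.1):
    # Toy "high-to-low" step: subtract a smoothed version of the detail band
    # from the low-pass band (stand-in for the learned network; 'weight' is
    # a made-up parameter, not from the thesis).
    smoothed = 0.5 * (H + np.roll(H, 1))
    return L - weight * smoothed

x = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * np.random.default_rng(1).standard_normal(64)
L, H = lifting_analysis(x)
L_corrected = high_to_low(L, H)
print(L.shape, H.shape, np.abs(L - L_corrected).max())
```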

    Wavelets and Subband Coding

    Get PDF
    First published in 1995, Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding. The book developed the theory in both continuous and discrete time, and presented important applications. During the past decade, it filled a useful need in explaining a new view of signal processing based on flexible time-frequency analysis and its applications. Since 2007, the authors retain the copyright and allow open access to the book.
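    The simplest instance of the subband-coding framework the book develops is a two-channel filter bank; below is a minimal Haar analysis/synthesis sketch with periodic boundaries that verifies perfect reconstruction.

```python
# Two-channel Haar analysis/synthesis filter bank: the simplest instance of
# subband coding. Orthonormal Haar filters, even-length input assumed.
import numpy as np

def haar_analysis(x):
    x = x.astype(float)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # low-pass (approximation) band
    high = (even - odd) / np.sqrt(2.0)   # high-pass (detail) band
    return low, high

def haar_synthesis(low, high):
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    x = np.empty(2 * low.size)
    x[0::2], x[1::2] = even, odd
    return x

x = np.random.default_rng(2).standard_normal(128)
low, high = haar_analysis(x)
x_rec = haar_synthesis(low, high)
print(np.allclose(x, x_rec))   # perfect reconstruction: True
```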

    Orientation Adaptive Subband Coding of Images

    No full text
    In the subband coding of images, the directionality of image features has thus far been exploited very little. The proposed subband coding scheme utilizes the orientation of local image features to avoid the highly objectionable Gibbs-like phenomena observed at reconstructed image edges with conventional subband schemes at low bit rates. At comparable bit rates, the subjective image quality obtained by our orientation-adaptive scheme is considerably enhanced over a conventional separable subband coding scheme, as well as other separable approaches such as the JPEG compression standard.
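    As an illustration of the kind of side information such an orientation-adaptive scheme needs (not the paper's actual method), the sketch below estimates a dominant local orientation per block from the structure tensor; the block size and the synthetic test image are arbitrary choices.

```python
# Toy per-block orientation estimate via the structure tensor, the kind of
# local directionality measure an orientation-adaptive subband scheme could
# use to steer its filters along edges.
import numpy as np

def block_orientation(img, block=8):
    gy, gx = np.gradient(img.astype(float))        # image gradients
    h, w = img.shape
    angles = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            sl = (slice(i * block, (i + 1) * block),
                  slice(j * block, (j + 1) * block))
            jxx = (gx[sl] ** 2).sum()
            jyy = (gy[sl] ** 2).sum()
            jxy = (gx[sl] * gy[sl]).sum()
            # Dominant gradient orientation (edge normal), in radians.
            angles[i, j] = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    return angles

# Synthetic image with a diagonal step edge.
yy, xx = np.mgrid[0:64, 0:64]
img = (xx + yy > 64).astype(float)
angles = block_orientation(img)
print(np.degrees(angles[3, 4]))   # edge-crossing block: about 45 degrees
```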