111 research outputs found


    Colorization is the process of adding color to a black-and-white image. The main task in colorization-based compression is to automatically extract a small set of representative pixels in the encoder. In other words, the encoder selects the pixels required for the colorization process, called representative pixels (RP), and keeps color information only for these RP. The position vectors and chrominance values of the RP set are sent to the decoder together with the luminance channel, which is compressed by conventional compression techniques. The decoder then restores the color information for the remaining pixels using colorization methods.
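
The encoder/decoder split described in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names `encode` and `decode`, the regular-grid choice of RP (the abstract's encoder selects them automatically by a method not given here), and the nearest-match rule used as a crude stand-in for a real colorization method.

```python
def encode(luma, chroma, step=2):
    """Keep chroma only at representative pixels (RP).

    Assumption: RP are taken on a regular grid. A real encoder would
    choose them automatically; that selection rule is not given here.
    """
    rp = {}
    for y in range(0, len(luma), step):
        for x in range(0, len(luma[0]), step):
            rp[(y, x)] = chroma[y][x]
    return luma, rp  # the luminance channel is transmitted in full


def decode(luma, rp, sigma=10.0):
    """Restore chroma at every pixel by copying it from the best-matching RP.

    "Best" means the smallest sum of scaled luminance difference and
    city-block distance, a simple placeholder for a colorization method.
    """
    restored = []
    for y, row in enumerate(luma):
        restored_row = []
        for x, value in enumerate(row):
            def cost(item):
                (ry, rx), _ = item
                return abs(luma[ry][rx] - value) / sigma + abs(ry - y) + abs(rx - x)
            _, colour = min(rp.items(), key=cost)
            restored_row.append(colour)
        restored.append(restored_row)
    return restored


# A 4x4 image with a dark left half and a bright right half.
luma = [[50, 50, 200, 200] for _ in range(4)]
chroma = [[(10, 20), (10, 20), (90, 30), (90, 30)] for _ in range(4)]

sent_luma, rp = encode(luma, chroma)
restored = decode(sent_luma, rp)
```

In this toy case four chrominance samples stand in for sixteen, and the luminance channel guides the decoder to the right region for each pixel.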

    Image Compression by Learning to Minimize the Total Error

    Transform recipes for efficient cloud photo enhancement

    Cloud image processing is often proposed as a solution to the limited computing power and battery life of mobile devices: it allows complex algorithms to run on powerful servers with virtually unlimited energy supply. Unfortunately, this overlooks the time and energy cost of uploading the input and downloading the output images. When transfer overhead is accounted for, processing images on a remote server becomes less attractive and many applications do not benefit from cloud offloading. We aim to change this in the case of image enhancements that preserve the overall content of an image. Our key insight is that, in this case, the server can compute and transmit a description of the transformation from input to output, which we call a transform recipe. At equivalent quality, our recipes are much more compact than JPEG images: this reduces the client's download. Furthermore, recipes can be computed from highly compressed inputs, which significantly reduces the data uploaded to the server. The client reconstructs a high-fidelity approximation of the output by applying the recipe to its local high-quality input. We demonstrate our results on 168 images and 10 image processing applications, showing that our recipes form a compact representation for a diverse set of image filters. With an equivalent transmission budget, they provide higher-quality results than JPEG-compressed input/output images, with a gain of the order of 10 dB in many cases. We demonstrate the utility of recipes on a mobile phone by profiling the energy consumption and latency for both local and cloud computation: a transform recipe-based pipeline runs 2-4x faster and uses 2-7x less energy than local or naive cloud computation.
    Sponsors: Qatar Computing Research Institute; United States Defense Advanced Research Projects Agency (Agreement FA8750-14-2-0009); Stanford University, Stanford Pervasive Parallelism Laboratory; Adobe Systems.
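
The recipe idea in this abstract (the server describes the input-to-output transformation, the client applies it to its own high-quality input) can be sketched as follows. The per-block affine model, the names `fit_recipe`, `apply_recipe` and `enhance`, and the quantization used as a stand-in for heavy compression are all assumptions chosen for illustration, not the representation used in the paper.

```python
def fit_recipe(inp, out, block=4):
    """Server side: least-squares fit of out = gain * inp + offset per block."""
    recipe = []
    for start in range(0, len(inp), block):
        xs, ys = inp[start:start + block], out[start:start + block]
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        var = sum((x - mean_x) ** 2 for x in xs)
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        gain = cov / var if var > 0 else 1.0  # flat block: fall back to offset only
        recipe.append((gain, mean_y - gain * mean_x))
    return recipe


def apply_recipe(inp, recipe, block=4):
    """Client side: apply each block's affine model to the local input."""
    return [recipe[i // block][0] * v + recipe[i // block][1]
            for i, v in enumerate(inp)]


def enhance(pixels):
    """Hypothetical server-side filter: a simple contrast boost."""
    return [1.5 * p + 10.0 for p in pixels]


high_quality = [12.0, 37.0, 41.0, 90.0, 120.0, 133.0, 150.0, 201.0]
# Coarse quantization stands in for the highly compressed upload.
degraded = [float(round(p / 16) * 16) for p in high_quality]

recipe = fit_recipe(degraded, enhance(degraded))  # computed on the server
result = apply_recipe(high_quality, recipe)       # reconstructed on the client
```

Here the recipe is two numbers per block instead of one value per pixel, and because the toy filter is exactly affine, the client's reconstruction matches what the filter would have produced on the high-quality input.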

    Sparse modelling of natural images and compressive sensing

    This thesis concerns the study of the statistics of natural images and compressive sensing for two main objectives: 1) to extend our understanding of the regularities exhibited by natural images of the visual world we regularly view around us, and 2) to incorporate this knowledge into image processing applications. Previous work on image statistics has uncovered remarkable behavior of the distributions obtained from filtering natural images. Typically we observe high kurtosis, non-Gaussian distributions with sharp central cusps, which are called sparse in the literature. These results have become an accepted fact through empirical findings using zero mean filters on many different databases of natural scenes. The observations have played an important role in computational and biological applications, where researchers have sought to understand visual processes through studying the statistical properties of the objects that are being observed. Interestingly, such results on sparse distributions also share elements with the emerging field of compressive sensing. This is a novel sampling protocol where one seeks to measure a signal in already compressed format through randomised projections, while the recovery algorithm consists of searching for a constrained solution with the sparsest transformed coefficients. In view of prior art, we extend our knowledge of image statistics from the monochrome domain into the colour domain. We study sparse response distributions of filters constructed on colour channels and observe the regularity of the distributions across diverse datasets of natural images. Several solutions to image processing problems emerge from the incorporation of colour statistics as prior information. We give a Bayesian treatment to the problem of colorizing natural gray images, and formulate image compression schemes using elements of compressive sensing and sparsity. We also propose a denoising algorithm that utilises the sparse filter responses as a regularisation function for the effective attenuation of Gaussian and impulse noise in images. The results emanating from this body of work illustrate how the statistics of natural images, when incorporated with Bayesian inference and sparse recovery, can have deep implications for image processing applications.
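
The compressive sensing protocol mentioned in this abstract (measure through randomised projections, then search for a sparse solution) can be illustrated with a small sketch. Matching pursuit is used here as a simple greedy stand-in for the sparse recovery step; it is an assumption for illustration and not the recovery algorithm of the thesis. The sizes, the seed and the function names are likewise hypothetical.

```python
import random


def measure(A, x):
    """Take m random projections of x: y = A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def matching_pursuit(A, y, n_iter=10, tol=1e-9):
    """Greedy sparse recovery: repeatedly pick the column that best
    explains the residual and move its contribution into the estimate."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    norms2 = [sum(v * v for v in c) for c in cols]
    x_hat = [0.0] * n
    residual = list(y)
    for _ in range(n_iter):
        if sum(v * v for v in residual) ** 0.5 < tol:
            break  # measurements are fully explained
        corr = [sum(c * r for c, r in zip(col, residual)) for col in cols]
        j = max(range(n), key=lambda k: abs(corr[k]) / norms2[k] ** 0.5)
        step = corr[j] / norms2[j]
        x_hat[j] += step
        residual = [r - step * c for r, c in zip(residual, cols[j])]
    return x_hat


rng = random.Random(0)
n, m = 20, 8  # signal length, number of measurements (m < n)
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]

x = [0.0] * n
x[7] = 3.0  # a 1-sparse signal

y = measure(A, x)
x_hat = matching_pursuit(A, y)
```

A 1-sparse signal is used because the first greedy step then provably selects the correct column; for less sparse signals, matching pursuit carries no such guarantee and an l1-based solver would be the more usual choice.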

    Densely-sampled light field reconstruction

    In this chapter, we motivate the use of densely-sampled light fields as the representation which can bring the required density of light rays for the correct recreation of 3D visual cues such as focus and continuous parallax and can serve as an intermediary between light field sensing and light field display. We consider the problem of reconstructing such a representation from few camera views and approach it in a sparsification framework. More specifically, we demonstrate that the light field is well structured in the set of so-called epipolar images and can be sparsely represented by a dictionary of directional and multi-scale atoms called shearlets. We present the corresponding regularization method, along with its main algorithm and speed-accelerating modifications. Finally, we illustrate its applicability for the cases of holographic stereograms and light field compression. (Accepted version; peer reviewed.)

    IST Austria Thesis

    Modern computer vision systems rely heavily on statistical machine learning models, which typically require large amounts of labeled data to be learned reliably. Moreover, computer vision research has very recently widely adopted techniques for representation learning, which further increase the demand for labeled data. However, for many important practical problems there is a relatively small amount of labeled data available, so it is problematic to leverage the full potential of representation learning methods. One way to overcome this obstacle is to invest substantial resources into producing large labeled datasets. Unfortunately, this can be prohibitively expensive in practice. In this thesis we focus on an alternative way of tackling the aforementioned issue. We concentrate on methods that make use of weakly labeled or even unlabeled data. Specifically, the first half of the thesis is dedicated to the semantic image segmentation task. We develop a technique that achieves competitive segmentation performance and only requires annotations in the form of global image-level labels instead of dense segmentation masks. Subsequently, we present a new methodology that further improves segmentation performance by leveraging a small amount of additional feedback from a human annotator. By using our methods, practitioners can greatly reduce the amount of data annotation effort required to learn modern image segmentation models. In the second half of the thesis we focus on methods for learning from unlabeled visual data. We study a family of autoregressive models for modeling the structure of natural images and discuss potential applications of these models. Moreover, we conduct an in-depth study of one of these applications, where we develop a state-of-the-art model for the probabilistic image colorization task.

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in the graph spectral domain. In this article, we give an overview of recent graph spectral techniques in GSP specifically for image and video processing. The topics covered include image compression, image restoration, image filtering and image segmentation.
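
The idea of treating an image patch as a signal on a graph can be illustrated with a minimal sketch. The 4-connected grid, the Gaussian weight function, the parameter values and the function names are assumptions chosen for illustration. The filter shown is a first-order polynomial of the graph Laplacian, which acts as a low-pass filter in the graph spectral domain without requiring an explicit eigendecomposition.

```python
import math


def grid_graph(patch, sigma=10.0):
    """Weighted edges {(i, j): w} of a 4-connected pixel grid.

    Weights shrink across large intensity differences, so the graph
    reflects the image structure (a common choice, assumed here).
    """
    h, w = len(patch), len(patch[0])
    edges = {}
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    d = patch[y][x] - patch[ny][nx]
                    edges[(y * w + x, ny * w + nx)] = math.exp(-d * d / sigma ** 2)
    return edges


def laplacian_apply(edges, signal):
    """Compute L @ signal for the combinatorial Laplacian L = D - W."""
    out = [0.0] * len(signal)
    for (i, j), weight in edges.items():
        diff = weight * (signal[i] - signal[j])
        out[i] += diff
        out[j] -= diff
    return out


def smoothness(edges, signal):
    """Laplacian quadratic form: sum over edges of w_ij * (s_i - s_j)^2."""
    return sum(weight * (signal[i] - signal[j]) ** 2
               for (i, j), weight in edges.items())


def lowpass(edges, signal, alpha=0.1):
    """One application of the graph filter (I - alpha * L)."""
    l_signal = laplacian_apply(edges, signal)
    return [s - alpha * l for s, l in zip(signal, l_signal)]


patch = [[10, 12, 11], [9, 13, 10], [11, 10, 12]]
signal = [float(v) for row in patch for v in row]  # pixels as a graph signal

edges = grid_graph(patch)
filtered = lowpass(edges, signal)
```

With edge weights at most 1 and node degree at most 4, the Laplacian eigenvalues are bounded by 8, so `alpha=0.1` keeps the filter stable and each application reduces the Laplacian quadratic form while preserving the mean of the patch.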

    IST Austria Thesis

    Deep neural networks have established a new standard for data-dependent feature extraction pipelines in the computer vision literature. Despite their remarkable performance in the standard supervised learning scenario, i.e. when models are trained with labeled data and tested on samples that follow a similar distribution, neural networks have been shown to struggle with more advanced generalization abilities, such as transferring knowledge across visually different domains, or generalizing to new unseen combinations of known concepts. In this thesis we argue that, in contrast to the usual black-box behavior of neural networks, leveraging more structured internal representations is a promising direction for tackling such problems. In particular, we focus on two forms of structure. First, we tackle modularity: we show that (i) compositional architectures are a natural tool for modeling reasoning tasks, in that they efficiently capture their combinatorial nature, which is key for generalizing beyond the compositions seen during training. We investigate how to learn such models, both formally and experimentally, for the task of abstract visual reasoning. Then, we show that (ii) in some settings, modularity allows us to efficiently break down complex tasks into smaller, easier modules, thereby improving computational efficiency; we study this behavior in the context of generative models for colorization, as well as for small-object detection. Secondly, we investigate the inherently layered structure of representations learned by neural networks, and analyze its role in the context of transfer learning and domain adaptation across visually dissimilar domains.