
    Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration

    Recent work on discrete generative priors, in the form of codebooks, has shown exciting performance for image reconstruction and restoration, as the discrete prior space spanned by the codebooks increases the robustness against diverse image degradations. Nevertheless, these methods require separate training of codebooks for different image categories, which limits their use to specific categories (e.g., faces or architecture) and prevents them from handling arbitrary natural images. In this paper, we propose AdaCode for learning image-adaptive codebooks for class-agnostic image restoration. Instead of learning a single codebook for each image category, we learn a set of basis codebooks. For a given input image, AdaCode learns a weight map with which we compute a weighted combination of these basis codebooks for adaptive image restoration. Intuitively, AdaCode is a more flexible and expressive discrete generative prior than previous work. Experimental results demonstrate that AdaCode achieves state-of-the-art performance on image reconstruction and restoration tasks, including image super-resolution and inpainting.
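    To make the weighted-codebook idea above concrete, the following is a minimal PyTorch sketch of quantizing encoder features against an image-adaptive codebook formed as a weighted combination of basis codebooks. All names and shapes (n_basis, codebook_size, the softmax weight map) are illustrative assumptions, not the authors' implementation.

        # Minimal sketch of an image-adaptive codebook: a per-position weighted
        # combination of basis codebooks, followed by nearest-code quantization.
        # Shapes and hyperparameters are assumptions for illustration only.
        import torch
        import torch.nn as nn

        class AdaptiveCodebook(nn.Module):
            def __init__(self, n_basis=4, codebook_size=256, dim=64):
                super().__init__()
                # A set of basis codebooks: (n_basis, codebook_size, dim)
                self.basis = nn.Parameter(torch.randn(n_basis, codebook_size, dim))

            def forward(self, feats, weight_map):
                # feats:      (B, dim, H, W) encoder features
                # weight_map: (B, n_basis, H, W) softmax weights predicted from the image
                B, C, H, W = feats.shape
                # Per-position adaptive codebook: weighted sum over the basis codebooks
                # -> (B, H, W, codebook_size, dim)
                adaptive = torch.einsum('bnhw,nkd->bhwkd', weight_map, self.basis)
                f = feats.permute(0, 2, 3, 1)                       # (B, H, W, dim)
                # Nearest-code quantization against the adaptive codebook
                dist = ((f.unsqueeze(-2) - adaptive) ** 2).sum(-1)  # (B, H, W, K)
                idx = dist.argmin(-1)                               # (B, H, W)
                quantized = torch.gather(
                    adaptive, 3, idx[..., None, None].expand(-1, -1, -1, 1, C)
                ).squeeze(3)                                        # (B, H, W, dim)
                return quantized.permute(0, 3, 1, 2), idx

        # Example: quantize random features with a weight map standing in for the
        # network-predicted one
        cb = AdaptiveCodebook()
        feats = torch.randn(1, 64, 16, 16)
        weights = torch.softmax(torch.randn(1, 4, 16, 16), dim=1)
        quantized, indices = cb(feats, weights)   # quantized: (1, 64, 16, 16)

    In AdaCode itself the weight map is learned from the input image; here it is supplied directly to keep the sketch self-contained.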

    Towards Joint Super-Resolution and High Dynamic Range Image Reconstruction

    The main objective of digital image and video camera systems is to reproduce a real-world scene in such a way that high visual quality is obtained. A crucial aspect in this regard is, naturally, the quality of the hardware components of the camera device. There are, however, always some undesired limitations imposed by the sensor of the camera. To begin with, the dynamic range of light intensities that the sensor can capture in its non-saturated region is much smaller than the dynamic range of most common daylight scenes. Secondly, the achievable spatial resolution of the camera is limited, especially for video capture with a high frame rate. Signal-processing algorithms can be used to fuse the information from a sequence of images into one enhanced image. Thus, the dynamic range limitation can be overcome, and the spatial resolution can be improved. This thesis discusses different methods that utilize data from a set of multiple images that exhibit photometric diversity, spatial diversity, or both. For the case where the images are differently exposed, photometric alignment is performed prior to reconstructing an image of a higher dynamic range. For the case where there is spatial diversity, a Super-Resolution reconstruction method is applied, in which an inverse problem is formulated and solved to obtain a high-resolution reconstruction. In either case, as well as for the promising combination of the two methods, the problem formulation should consider how the scene information is perceived by humans. Incorporating the properties of the human visual system into novel mathematical formulations for joint high dynamic range and high-resolution image reconstruction is the main contribution of the thesis, in particular of the published papers that are included. The potential usefulness of high dynamic range image reconstruction on the one hand, and of Super-Resolution image reconstruction on the other, is demonstrated. Finally, the combination of the two is discussed and results from simulations are given.
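    As a concrete illustration of posing multi-frame Super-Resolution as an inverse problem, here is a toy NumPy sketch: low-resolution frames are modelled as shifted and downsampled versions of an unknown high-resolution image, and the reconstruction minimizes a regularized least-squares objective. The forward model (known integer shifts followed by 2x2 box downsampling), the plain gradient-descent solver, and all variable names are simplifying assumptions rather than the thesis' actual formulation, and no perceptual (human-vision) weighting is included.

        # Toy multi-frame super-resolution as a linear inverse problem:
        #   min_x  sum_k ||A_k x - y_k||^2 + lam * ||x||^2
        # with A_k = (2x2 box downsampling) o (integer shift). Assumptions only.
        import numpy as np

        def forward(x, dy, dx):
            # Shift the high-res image by (dy, dx), then downsample 2x by box averaging
            s = np.roll(np.roll(x, dy, axis=0), dx, axis=1)
            return s.reshape(s.shape[0] // 2, 2, s.shape[1] // 2, 2).mean(axis=(1, 3))

        def adjoint(y, dy, dx):
            # Adjoint of `forward`: replicate each pixel 2x2 (divided by 4), shift back
            up = np.repeat(np.repeat(y, 2, axis=0), 2, axis=1) / 4.0
            return np.roll(np.roll(up, -dx, axis=1), -dy, axis=0)

        # Simulate a few photometrically aligned, spatially shifted low-res frames
        rng = np.random.default_rng(0)
        truth = rng.random((32, 32))
        shifts = [(0, 0), (1, 0), (0, 1), (1, 1)]
        frames = [forward(truth, dy, dx) + 0.01 * rng.standard_normal((16, 16))
                  for dy, dx in shifts]

        # Gradient descent on the regularized least-squares objective
        x = np.zeros_like(truth)
        lam, step = 1e-3, 0.5
        for _ in range(200):
            grad = lam * x
            for (dy, dx), y in zip(shifts, frames):
                grad += adjoint(forward(x, dy, dx) - y, dy, dx)
            x -= step * grad

    In practice the shifts would be estimated by registration, the forward model would include the camera blur, and the regularizer and data term could be weighted according to visual perception, as the thesis proposes.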

    A Reverse Hierarchy Model for Predicting Eye Fixations

    A range of psychological and physiological evidence suggests that early visual attention works in a coarse-to-fine way, which lays the basis for the reverse hierarchy theory (RHT). This theory states that attention propagates from the top level of the visual hierarchy, which processes the gist and abstract information of the input, to the bottom level, which processes local details. Inspired by the theory, we develop a computational model for saliency detection in images. First, the original image is downsampled to different scales to constitute a pyramid. Then, saliency on each layer is obtained by image super-resolution reconstruction from the layer above, and is defined as the unpredictability of this coarse-to-fine reconstruction. Finally, saliency on each layer of the pyramid is fused into stochastic fixations through a probabilistic model, where attention initiates from the top layer and propagates downward through the pyramid. Extensive experiments on two standard eye-tracking datasets show that the proposed method achieves results competitive with state-of-the-art models.
    Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
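    The coarse-to-fine "unpredictability" idea can be sketched in NumPy as follows: saliency at each pyramid level is the absolute error between that level and a prediction reconstructed from the level above, and the per-level maps are accumulated at full resolution. Box downsampling, nearest-neighbour upsampling, and simple summation stand in for the paper's super-resolution reconstruction and probabilistic fixation model, so this is an assumption-laden illustration rather than the authors' method.

        # Coarse-to-fine saliency as reconstruction unpredictability (illustrative
        # stand-in; the paper uses super-resolution reconstruction and a
        # probabilistic fixation model instead of these simple operators).
        import numpy as np

        def down2(img):
            # 2x box downsampling (crop to even size first)
            h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
            img = img[:h, :w]
            return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

        def up2(img, shape):
            # Nearest-neighbour upsampling to a target shape
            up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
            pad_h = max(shape[0] - up.shape[0], 0)
            pad_w = max(shape[1] - up.shape[1], 0)
            up = np.pad(up, ((0, pad_h), (0, pad_w)), mode='edge')
            return up[:shape[0], :shape[1]]

        def saliency_map(gray, n_levels=4):
            # Build the pyramid from fine (level 0) to coarse (level n_levels - 1)
            pyr = [gray]
            for _ in range(n_levels - 1):
                pyr.append(down2(pyr[-1]))
            # Unpredictability at each level = |level - reconstruction from level above|
            sal = np.zeros_like(gray)
            for lvl in range(n_levels - 1):
                pred = up2(pyr[lvl + 1], pyr[lvl].shape)
                err = np.abs(pyr[lvl] - pred)
                # Bring the per-level map back to full resolution and accumulate
                for up_lvl in range(lvl, 0, -1):
                    err = up2(err, pyr[up_lvl - 1].shape)
                sal += err
            return sal / sal.max() if sal.max() > 0 else sal

        # Example on a random grayscale "image"
        sal = saliency_map(np.random.default_rng(0).random((64, 64)))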