443 research outputs found

    A Deep Journey into Super-resolution: A survey

    Full text link
    Deep convolutional networks based super-resolution is a fast-growing field with numerous practical applications. In this exposition, we extensively compare 30+ state-of-the-art super-resolution Convolutional Neural Networks (CNNs) over three classical and three recently introduced challenging datasets to benchmark single image super-resolution. We introduce a taxonomy for deep-learning based super-resolution networks that groups existing methods into nine categories including linear, residual, multi-branch, recursive, progressive, attention-based and adversarial designs. We also provide comparisons between the models in terms of network complexity, memory footprint, model input and output, learning details, the type of network losses and important architectural differences (e.g., depth, skip-connections, filters). The extensive evaluation performed, shows the consistent and rapid growth in the accuracy in the past few years along with a corresponding boost in model complexity and the availability of large-scale datasets. It is also observed that the pioneering methods identified as the benchmark have been significantly outperformed by the current contenders. Despite the progress in recent years, we identify several shortcomings of existing techniques and provide future research directions towards the solution of these open problems.Comment: Accepted in ACM Computing Survey

    Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks

    Full text link
    Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.Comment: The code and datasets are available at http://vllab.ucmerced.edu/wlai24/LapSRN

    Super-Resolution with Deep Adaptive Image Resampling

    Full text link
    Deep learning based methods have recently pushed the state-of-the-art on the problem of Single Image Super-Resolution (SISR). In this work, we revisit the more traditional interpolation-based methods, that were popular before, now with the help of deep learning. In particular, we propose to use a Convolutional Neural Network (CNN) to estimate spatially variant interpolation kernels and apply the estimated kernels adaptively to each position in the image. The whole model is trained in an end-to-end manner. We explore two ways to improve the results for the case of large upscaling factors, and propose a recursive extension of our basic model. This achieves results that are on par with state-of-the-art methods. We visualize the estimated adaptive interpolation kernels to gain more insight on the effectiveness of the proposed method. We also extend the method to the task of joint image filtering and again achieve state-of-the-art performance

    A Multi-Scale and Multi-Depth Convolutional Neural Network for Remote Sensing Imagery Pan-Sharpening

    Full text link
    Pan-sharpening is a fundamental and significant task in the field of remote sensing imagery processing, in which high-resolution spatial details from panchromatic images are employed to enhance the spatial resolution of multi-spectral (MS) images. As the transformation from low spatial resolution MS image to high-resolution MS image is complex and highly non-linear, inspired by the powerful representation for non-linear relationships of deep neural networks, we introduce multi-scale feature extraction and residual learning into the basic convolutional neural network (CNN) architecture and propose the multi-scale and multi-depth convolutional neural network (MSDCNN) for the pan-sharpening of remote sensing imagery. Both the quantitative assessment results and the visual assessment confirm that the proposed network yields high-resolution MS images that are superior to the images produced by the compared state-of-the-art methods

    Attention Based Real Image Restoration

    Full text link
    Deep convolutional neural networks perform better on images containing spatially invariant degradations, also known as synthetic degradations; however, their performance is limited on real-degraded photographs and requires multiple-stage network modeling. To advance the practicability of restoration algorithms, this paper proposes a novel single-stage blind real image restoration network (R2^2Net) by employing a modular architecture. We use a residual on the residual structure to ease the flow of low-frequency information and apply feature attention to exploit the channel dependencies. Furthermore, the evaluation in terms of quantitative metrics and visual quality for four restoration tasks i.e. Denoising, Super-resolution, Raindrop Removal, and JPEG Compression on 11 real degraded datasets against more than 30 state-of-the-art algorithms demonstrate the superiority of our R2^2Net. We also present the comparison on three synthetically generated degraded datasets for denoising to showcase the capability of our method on synthetics denoising. The codes, trained models, and results are available on https://github.com/saeed-anwar/R2Net.Comment: arXiv admin note: substantial text overlap with arXiv:1904.0739

    Super-Resolution via Deep Learning

    Full text link
    The recent phenomenal interest in convolutional neural networks (CNNs) must have made it inevitable for the super-resolution (SR) community to explore its potential. The response has been immense and in the last three years, since the advent of the pioneering work, there appeared too many works not to warrant a comprehensive survey. This paper surveys the SR literature in the context of deep learning. We focus on the three important aspects of multimedia - namely image, video and multi-dimensions, especially depth maps. In each case, first relevant benchmarks are introduced in the form of datasets and state of the art SR methods, excluding deep learning. Next is a detailed analysis of the individual works, each including a short description of the method and a critique of the results with special reference to the benchmarking done. This is followed by minimum overall benchmarking in the form of comparison on some common dataset, while relying on the results reported in various works

    Structural Residual Learning for Single Image Rain Removal

    Full text link
    To alleviate the adverse effect of rain streaks in image processing tasks, CNN-based single image rain removal methods have been recently proposed. However, the performance of these deep learning methods largely relies on the covering range of rain shapes contained in the pre-collected training rainy-clean image pairs. This makes them easily trapped into the overfitting-to-the-training-samples issue and cannot finely generalize to practical rainy images with complex and diverse rain streaks. Against this generalization issue, this study proposes a new network architecture by enforcing the output residual of the network possess intrinsic rain structures. Such a structural residual setting guarantees the rain layer extracted by the network finely comply with the prior knowledge of general rain streaks, and thus regulates sound rain shapes capable of being well extracted from rainy images in both training and predicting stages. Such a general regularization function naturally leads to both its better training accuracy and testing generalization capability even for those non-seen rain configurations. Such superiority is comprehensively substantiated by experiments implemented on synthetic and real datasets both visually and quantitatively as compared with current state-of-the-art methods

    Lightweight Pyramid Networks for Image Deraining

    Full text link
    Existing deep convolutional neural networks have found major success in image deraining, but at the expense of an enormous number of parameters. This limits their potential application, for example in mobile devices. In this paper, we propose a lightweight pyramid of networks (LPNet) for single image deraining. Instead of designing a complex network structures, we use domain-specific knowledge to simplify the learning process. Specifically, we find that by introducing the mature Gaussian-Laplacian image pyramid decomposition technology to the neural network, the learning problem at each pyramid level is greatly simplified and can be handled by a relatively shallow network with few parameters. We adopt recursive and residual network structures to build the proposed LPNet, which has less than 8K parameters while still achieving state-of-the-art performance on rain removal. We also discuss the potential value of LPNet for other low- and high-level vision tasks.Comment: submitted to IEEE Transactions on Neural Networks and Learning System

    Deep Learning on Image Denoising: An overview

    Full text link
    Deep learning techniques have received much attention in the area of image denoising. However, there are substantial differences in the various types of deep learning methods dealing with image denoising. Specifically, discriminative learning based on deep learning can ably address the issue of Gaussian noise. Optimization models based on deep learning are effective in estimating the real noise. However, there has thus far been little related research to summarize the different deep learning techniques for image denoising. In this paper, we offer a comparative study of deep techniques in image denoising. We first classify the deep convolutional neural networks (CNNs) for additive white noisy images; the deep CNNs for real noisy images; the deep CNNs for blind denoising and the deep CNNs for hybrid noisy images, which represents the combination of noisy, blurred and low-resolution images. Then, we analyze the motivations and principles of the different types of deep learning methods. Next, we compare the state-of-the-art methods on public denoising datasets in terms of quantitative and qualitative analysis. Finally, we point out some potential challenges and directions of future research

    Perceptual Video Super Resolution with Enhanced Temporal Consistency

    Full text link
    With the advent of perceptual loss functions, new possibilities in super-resolution have emerged, and we currently have models that successfully generate near-photorealistic high-resolution images from their low-resolution observations. Up to now, however, such approaches have been exclusively limited to single image super-resolution. The application of perceptual loss functions on video processing still entails several challenges, mostly related to the lack of temporal consistency of the generated images, i.e., flickering artifacts. In this work, we present a novel adversarial recurrent network for video upscaling that is able to produce realistic textures in a temporally consistent way. The proposed architecture naturally leverages information from previous frames due to its recurrent architecture, i.e. the input to the generator is composed of the low-resolution image and, additionally, the warped output of the network at the previous step. Together with a video discriminator, we also propose additional loss functions to further reinforce temporal consistency in the generated sequences. The experimental validation of our algorithm shows the effectiveness of our approach which obtains images with high perceptual quality and improved temporal consistency.Comment: Major revision and improvement of the manuscript: New network architecture, new loss function and extended experiment
    • …
    corecore