A Deep Journey into Super-resolution: A survey
Super-resolution based on deep convolutional networks is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation shows consistent and rapid growth in
accuracy over the past few years, accompanied by a corresponding increase in
model complexity and in the availability of large-scale datasets. It is also
observed that the pioneering methods identified as the benchmark have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Surveys
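Benchmarking of the kind described above conventionally reports peak signal-to-noise ratio (PSNR) against ground-truth high-resolution images. A minimal sketch of this standard metric (generic, not specific to any one surveyed method):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher is better; most single-image super-resolution tables report PSNR (often alongside SSIM) on the luminance channel.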
Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
Convolutional neural networks have recently demonstrated high-quality
reconstruction for single image super-resolution. However, existing methods
often require a large number of network parameters and entail heavy
computational loads at runtime for generating high-accuracy super-resolution
results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution
Network for fast and accurate image super-resolution. The proposed network
progressively reconstructs the sub-band residuals of high-resolution images at
multiple pyramid levels. In contrast to existing methods that involve the
bicubic interpolation for pre-processing (which results in large feature maps),
the proposed method directly extracts features from the low-resolution input
space and thereby entails low computational loads. We train the proposed
network with deep supervision using a robust Charbonnier loss function and
achieve high-quality image reconstruction. Furthermore, we utilize recursive
layers to share parameters both across and within pyramid levels,
and thus drastically reduce the number of parameters. Extensive quantitative
and qualitative evaluations on benchmark datasets show that the proposed
algorithm performs favorably against the state-of-the-art methods in terms of
run-time and image quality.
Comment: The code and datasets are available at
http://vllab.ucmerced.edu/wlai24/LapSRN
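The Charbonnier loss mentioned above is a differentiable, robust variant of the L1 penalty. A minimal numpy sketch of the penalty itself, independent of the authors' training code:

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier penalty: sqrt(diff^2 + eps^2), averaged over all pixels.

    Behaves like L1 for large errors but is smooth near zero, which makes it
    less prone to the over-smoothing associated with plain L2 loss.
    """
    diff = pred - target
    return np.mean(np.sqrt(diff * diff + eps * eps))
```

Note that the loss floors at `eps` rather than zero for identical inputs; in training this constant offset has no effect on gradients.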
Super-Resolution with Deep Adaptive Image Resampling
Deep learning based methods have recently pushed the state-of-the-art on the
problem of Single Image Super-Resolution (SISR). In this work, we revisit the
more traditional interpolation-based methods that predate deep learning, now
revisited with its help. In particular, we propose to use a
Convolutional Neural Network (CNN) to estimate spatially variant interpolation
kernels and apply the estimated kernels adaptively to each position in the
image. The whole model is trained in an end-to-end manner. We explore two ways
to improve the results for the case of large upscaling factors, and propose a
recursive extension of our basic model. This achieves results that are on par
with state-of-the-art methods. We visualize the estimated adaptive
interpolation kernels to gain more insight on the effectiveness of the proposed
method. We also extend the method to the task of joint image filtering and
again achieve state-of-the-art performance.
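The core operation described above, applying a spatially variant kernel at each image position, can be sketched without the CNN that predicts the kernels. A toy numpy version (loop-based for clarity; the kernel tensor here stands in for the network's output):

```python
import numpy as np

def apply_adaptive_kernels(image, kernels):
    """Apply a distinct k x k kernel at every position of a 2-D image.

    image:   (H, W) array
    kernels: (H, W, k, k) array, one interpolation kernel per output pixel
             (in the paper's setting these would be predicted by a CNN)
    """
    h, w = image.shape
    k = kernels.shape[2]
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")  # replicate borders
    out = np.empty((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + k, j:j + k]
            out[i, j] = np.sum(patch * kernels[i, j])
    return out
```

With an identity kernel (a centered delta) at every position, the input is reproduced exactly; spatially varying the kernels is what distinguishes this from an ordinary convolution.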
A Multi-Scale and Multi-Depth Convolutional Neural Network for Remote Sensing Imagery Pan-Sharpening
Pan-sharpening is a fundamental and significant task in the field of remote
sensing imagery processing, in which high-resolution spatial details from
panchromatic images are employed to enhance the spatial resolution of
multi-spectral (MS) images. As the transformation from a low-resolution MS
image to a high-resolution MS image is complex and highly non-linear, and
inspired by the powerful capacity of deep neural networks to represent
non-linear relationships, we introduce multi-scale feature extraction and residual learning
into the basic convolutional neural network (CNN) architecture and propose the
multi-scale and multi-depth convolutional neural network (MSDCNN) for the
pan-sharpening of remote sensing imagery. Both the quantitative assessment
results and the visual assessment confirm that the proposed network yields
high-resolution MS images that are superior to the images produced by the
compared state-of-the-art methods.
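For context, a classical non-CNN pan-sharpening baseline, the Brovey transform (not the MSDCNN described above), makes the structure of the task explicit: each upsampled MS band is rescaled so that its per-pixel intensity matches the panchromatic image.

```python
import numpy as np

def brovey_pansharpen(ms, pan, eps=1e-8):
    """Brovey-transform pan-sharpening (classical baseline, for illustration).

    ms:  (H, W, B) multispectral image already upsampled to pan resolution
    pan: (H, W) panchromatic image
    Each band is multiplied by the ratio of the pan intensity to the mean
    MS intensity, injecting the pan image's spatial detail into every band.
    """
    intensity = ms.mean(axis=2)                 # per-pixel mean over bands
    ratio = pan / (intensity + eps)             # detail-injection ratio
    return ms * ratio[:, :, None]
```

CNN-based methods such as the one above aim to learn this low-to-high-resolution mapping rather than impose a fixed ratio rule.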
Attention Based Real Image Restoration
Deep convolutional neural networks perform better on images containing
spatially invariant degradations, also known as synthetic degradations;
however, their performance is limited on real degraded photographs, which
typically require multi-stage network modeling. To advance the practicability of restoration
algorithms, this paper proposes a novel single-stage blind real image
restoration network (RNet) by employing a modular architecture. We use a
residual on the residual structure to ease the flow of low-frequency
information and apply feature attention to exploit the channel dependencies.
Furthermore, evaluations in terms of quantitative metrics and visual quality
on four restoration tasks, i.e., denoising, super-resolution, raindrop
removal, and JPEG compression, across 11 real degraded datasets and against
more than 30 state-of-the-art algorithms demonstrate the superiority of our
RNet. We also present a comparison on three synthetically generated degraded
datasets for denoising to showcase the capability of our method on synthetic
noise.
The codes, trained models, and results are available on
https://github.com/saeed-anwar/R2Net.
Comment: arXiv admin note: substantial text overlap with arXiv:1904.0739
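The feature attention mentioned above follows the general squeeze-and-excitation pattern: pool each channel to a scalar, pass it through a small bottleneck, and gate the channels with the result. A hedged numpy sketch (layer shapes and weights here are illustrative, not RNet's actual parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w1, b1, w2, b2):
    """Squeeze-and-excitation style channel attention on (C, H, W) features.

    Global-average-pool each channel, pass the result through a two-layer
    bottleneck (w1/b1 reduce, w2/b2 expand), and rescale each channel by the
    resulting sigmoid gate, letting the network exploit channel dependencies.
    """
    c = features.shape[0]
    squeezed = features.reshape(c, -1).mean(axis=1)   # (C,) channel descriptors
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)      # ReLU bottleneck
    gate = sigmoid(w2 @ hidden + b2)                  # (C,) gates in (0, 1)
    return features * gate[:, None, None]
```

In a trained network the weights are learned; here zero weights simply yield a uniform 0.5 gate.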
Super-Resolution via Deep Learning
The recent phenomenal interest in convolutional neural networks (CNNs) has
made it inevitable for the super-resolution (SR) community to explore their
potential. The response has been immense: in the three years since the
pioneering work, so many methods have appeared that a comprehensive survey is
warranted. This paper surveys the SR literature in the context of
comprehensive survey. This paper surveys the SR literature in the context of
deep learning. We focus on the three important aspects of multimedia - namely
image, video, and higher dimensions, especially depth maps. In each case, the
relevant benchmarks are first introduced in the form of datasets and
state-of-the-art non-deep-learning SR methods. Next is a detailed analysis of the
individual works, each including a short description of the method and a
critique of the results with special reference to the benchmarking done. This
is followed by a minimal overall benchmarking in the form of comparisons on
common datasets, relying on the results reported in the various works.
Structural Residual Learning for Single Image Rain Removal
To alleviate the adverse effect of rain streaks in image processing tasks,
CNN-based single image rain removal methods have been recently proposed.
However, the performance of these deep learning methods largely relies on the
covering range of rain shapes contained in the pre-collected training
rainy-clean image pairs. This makes them prone to overfitting the training
samples and unable to generalize well to practical rainy images with complex
and diverse rain streaks. To address this generalization issue, this study
proposes a new network architecture that enforces the output residual of the
network to possess intrinsic rain structures. This structural residual setting
guarantees that the rain layer extracted by the network complies with prior
knowledge of general rain streaks, so that plausible rain shapes can be
extracted from rainy images in both the training and prediction stages. This
regularization naturally leads to better training accuracy and better
generalization at test time, even for unseen rain configurations. This
superiority is comprehensively substantiated, both visually and
quantitatively, by experiments on synthetic and real datasets in comparison
with current state-of-the-art methods.
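One common way the deraining literature encodes "intrinsic rain structure" is the convolutional streak model: a rain layer is a sparse raindrop map filtered by a small directional kernel. A toy numpy sketch of that prior (our illustration, not this paper's exact parameterisation):

```python
import numpy as np

def render_rain_layer(sparse_map, streak_kernel):
    """Render a rain layer by filtering a sparse raindrop map with a small
    directional streak kernel (cross-correlation; identical to convolution
    for the symmetric kernels typically used for streaks)."""
    kh, kw = streak_kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(sparse_map, ((ph, ph), (pw, pw)))
    h, w = sparse_map.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += streak_kernel[i, j] * padded[i:i + h, j:j + w]
    return out
```

Constraining a network's output residual to images expressible this way restricts the extracted rain layer to streak-like shapes, which is the spirit of the structural residual described above.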
Lightweight Pyramid Networks for Image Deraining
Existing deep convolutional neural networks have found major success in image
deraining, but at the expense of an enormous number of parameters. This limits
their potential application, for example in mobile devices. In this paper, we
propose a lightweight pyramid of networks (LPNet) for single image deraining.
Instead of designing complex network structures, we use domain-specific
knowledge to simplify the learning process. Specifically, we find that by
introducing the mature Gaussian-Laplacian image pyramid decomposition
technology to the neural network, the learning problem at each pyramid level is
greatly simplified and can be handled by a relatively shallow network with few
parameters. We adopt recursive and residual network structures to build the
proposed LPNet, which has less than 8K parameters while still achieving
state-of-the-art performance on rain removal. We also discuss the potential
value of LPNet for other low- and high-level vision tasks.
Comment: Submitted to IEEE Transactions on Neural Networks and Learning
Systems
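The Gaussian-Laplacian pyramid decomposition that LPNet builds on can be sketched with a toy numpy version (2x2 block averaging and nearest-neighbour upsampling stand in for the usual Gaussian filtering, and image sides are assumed divisible by 2^levels):

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """Decompose an image into band-pass levels plus a low-pass residual."""
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        pyramid.append(current - upsample(low))  # band-pass detail level
        current = low
    pyramid.append(current)  # coarsest low-pass level
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition: upsample and add the details back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current
```

Because each detail level is defined as the exact difference from its upsampled low-pass version, reconstruction is lossless; a shallow sub-network per level then only has to handle one frequency band, which is what keeps LPNet's parameter count small.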
Deep Learning on Image Denoising: An overview
Deep learning techniques have received much attention in the area of image
denoising. However, there are substantial differences in the various types of
deep learning methods dealing with image denoising. Specifically,
discriminative learning based on deep learning can ably address the issue of
Gaussian noise. Optimization models based on deep learning are effective in
estimating the real noise. However, there has thus far been little related
research to summarize the different deep learning techniques for image
denoising. In this paper, we offer a comparative study of deep learning
techniques for image denoising. We first classify deep convolutional neural
networks (CNNs) into four groups: those for images with additive white noise,
those for real noisy images, those for blind denoising, and those for hybrid
noisy images, i.e., combinations of noisy, blurred, and low-resolution images.
Then, we analyze the motivations and principles of the different types of deep
learning methods. Next, we compare the state-of-the-art methods on public
denoising datasets in terms of quantitative and qualitative analysis. Finally,
we point out some potential challenges and directions for future research.
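The "additive white noise" setting above is the standard synthetic degradation model y = x + n with zero-mean Gaussian n of known standard deviation. A minimal sketch:

```python
import numpy as np

def add_gaussian_noise(image, sigma, seed=0):
    """Degrade an image with additive white Gaussian noise: y = x + n,
    n ~ N(0, sigma^2) i.i.d. per pixel -- the standard synthetic setting
    for benchmarking Gaussian denoisers (the seed is for reproducibility)."""
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma, size=image.shape)
```

Real camera noise is signal-dependent and spatially correlated, which is exactly why the survey distinguishes CNNs for real noisy images from those trained on this synthetic model.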
Perceptual Video Super Resolution with Enhanced Temporal Consistency
With the advent of perceptual loss functions, new possibilities in
super-resolution have emerged, and we currently have models that successfully
generate near-photorealistic high-resolution images from their low-resolution
observations. Up to now, however, such approaches have been exclusively limited
to single image super-resolution. The application of perceptual loss functions
on video processing still entails several challenges, mostly related to the
lack of temporal consistency of the generated images, i.e., flickering
artifacts. In this work, we present a novel adversarial recurrent network for
video upscaling that is able to produce realistic textures in a temporally
consistent way. The proposed architecture naturally leverages information from
previous frames due to its recurrent architecture, i.e. the input to the
generator is composed of the low-resolution image and, additionally, the warped
output of the network at the previous step. Together with a video
discriminator, we also propose additional loss functions to further reinforce
temporal consistency in the generated sequences. The experimental validation of
our algorithm shows the effectiveness of our approach which obtains images with
high perceptual quality and improved temporal consistency.
Comment: Major revision and improvement of the manuscript: new network
architecture, new loss function, and extended experiments
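The temporal-consistency idea above, penalising disagreement between the current output and the previous output warped into the current frame, can be sketched in numpy with a toy integer-shift warp standing in for optical-flow warping (our simplification, not the paper's actual loss):

```python
import numpy as np

def warp(frame, flow):
    """Warp a frame by a global integer (dy, dx) motion -- a toy stand-in
    for per-pixel optical-flow warping."""
    dy, dx = flow
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def temporal_consistency_loss(current_output, previous_output, flow):
    """Mean squared disagreement between the current output and the
    previous output warped into the current frame; zero when the two
    outputs agree after motion compensation (no flickering)."""
    aligned = warp(previous_output, flow)
    return np.mean((current_output - aligned) ** 2)
```

Feeding the warped previous output back into the generator, as the recurrent architecture above does, gives the network the information it needs to keep this disagreement small.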