A Deep Journey into Super-resolution: A survey
Deep convolutional networks based super-resolution is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation performed shows consistent and rapid
growth in accuracy over the past few years, along with a corresponding increase
in model complexity and in the availability of large-scale datasets. It is also
observed that the pioneering methods identified as benchmarks have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Survey
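The accuracy figures such benchmarks compare are typically reported as PSNR. As a minimal sketch of the metric (the function name `psnr` and the toy images are my own, not from the survey):

```python
import numpy as np

def psnr(sr, hr, max_val=255.0):
    """Peak signal-to-noise ratio between a super-resolved and a ground-truth image."""
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A reconstruction that is off by exactly 1 everywhere:
# PSNR = 10*log10(255^2 / 1) ~ 48.13 dB
hr = np.full((4, 4), 100.0)
sr = hr + 1.0
print(round(psnr(sr, hr), 2))
```

Higher is better; a constant per-pixel error of 1 on an 8-bit scale already sits around 48 dB, which is why small dB gains between the surveyed models are meaningful.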
Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network
Recently, several deep learning-based image super-resolution methods have
been developed by stacking massive numbers of layers. However, this leads to
large model sizes and high computational complexity, so recursive
parameter-sharing methods have also been proposed. Nevertheless, their designs
do not fully exploit the potential of the recursive operation. In this
paper, we propose a novel, lightweight, and efficient super-resolution method
that maximizes the usefulness of the recursive architecture by introducing a
block state-based recursive network. By utilizing the block state, the
recursive part of our model can easily track the status of the current image
features. We show the benefits of the proposed method in terms of model size,
speed, and efficiency. In addition, we show that our method outperforms the
other state-of-the-art methods.
Comment: The code is available at https://github.com/idearibosome/tf-bsrn-s
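One way to read the block-state idea above (my own minimal numpy sketch, not the authors' architecture): a single shared block is applied recursively, while a separate state tensor is carried between recursions, letting each step condition on what earlier recursions produced.

```python
import numpy as np

rng = np.random.default_rng(0)
W_f = rng.standard_normal((8, 8)) * 0.1   # shared feature weights, reused every recursion
W_s = rng.standard_normal((8, 8)) * 0.1   # shared state-update weights

def recursive_block(features, state, steps=4):
    """Apply ONE shared block `steps` times, carrying a block state across recursions."""
    for _ in range(steps):
        features = np.tanh(features @ W_f + state)  # state modulates the feature update
        state = np.tanh(state @ W_s + features)     # state tracks current features
    return features, state

x = rng.standard_normal(8)
out, final_state = recursive_block(x, np.zeros(8))
```

Because the weights are shared across recursions, the parameter count is independent of the recursion depth, which is the source of the model-size savings the abstract claims.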
SESR: Single Image Super Resolution with Recursive Squeeze and Excitation Networks
Single image super resolution is a very important computer vision task with
a wide range of applications. In recent years, the depth of super-resolution
models has been constantly increasing, but the small gains in performance have
come at the cost of huge amounts of computation and memory. In this work, in
order to make super resolution models more effective, we propose a novel
single image super resolution method via recursive squeeze and excitation
networks (SESR). By introducing the squeeze and excitation module, SESR can
model the interdependencies and relationships between channels, which makes
our model more efficient. In addition, the recursive structure and progressive
reconstruction method in our model minimize the layers and parameters and
enable SESR to train multi-scale super resolution simultaneously in a single
model. Evaluated on four benchmark test sets, our model proves superior to
state-of-the-art methods in terms of speed and accuracy.
Comment: Preprint version with 6 pages for ICPR1
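The squeeze-and-excitation recalibration referred to above can be sketched in a few lines of numpy (a generic SE block, not SESR's exact layer; the reduction ratio `r` and the weight shapes are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_recalibrate(feat, W1, W2):
    """feat: (C, H, W) feature map. Squeeze to per-channel statistics,
    excite to per-channel weights in (0, 1), then rescale each channel."""
    squeezed = feat.mean(axis=(1, 2))                       # global average pool -> (C,)
    excited = sigmoid(np.maximum(squeezed @ W1, 0.0) @ W2)  # bottleneck MLP -> (C,)
    return feat * excited[:, None, None]                    # channel-wise recalibration

C, r = 16, 4
rng = np.random.default_rng(1)
W1 = rng.standard_normal((C, C // r)) * 0.1
W2 = rng.standard_normal((C // r, C)) * 0.1
feat = rng.standard_normal((C, 6, 6))
out = se_recalibrate(feat, W1, W2)
```

The sigmoid keeps every channel weight in (0, 1), so the block can only attenuate channels relative to one another; this is how channel interdependencies steer the representation at very small parameter cost.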
SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network
Deep learning based single image super-resolution (SR) methods have
evolved rapidly over the past few years and have yielded state-of-the-art
performance over conventional methods. Since these methods usually minimize
the l1 loss between the output SR image and the ground truth image, they yield
a very high peak signal-to-noise ratio (PSNR), which is inversely related to
these losses. Unfortunately, minimizing these losses inevitably leads to
blurred edges due to the averaging of plausible solutions. Recently, SRGAN was
proposed to avoid this averaging effect by minimizing perceptual losses
instead of the l1 loss, and it yielded perceptually better SR images (images
with sharp edges) at the price of a lower PSNR. In this paper, we propose
SREdgeNet, an edge-enhanced single image SR network inspired by conventional
SR theory, in which the averaging effect is avoided not by changing the loss
but by changing the SR network's properties while keeping the same l1 loss.
Our SREdgeNet consists of three sequential deep neural network modules: the
first module is any state-of-the-art SR network, and we selected a variant of
EDSR. The second module is any edge detection network taking the output of the
first SR module as input, and we propose DenseEdgeNet for this module. Lastly,
the third module merges the outputs of the first and second modules to yield
an edge-enhanced SR image, and we propose MergeNet for this module.
Qualitatively, our proposed method yields images with sharper edges than other
state-of-the-art SR methods. Quantitatively, our SREdgeNet yields
state-of-the-art performance in terms of structural similarity (SSIM) while
maintaining comparable PSNR for x8 enlargement.
Comment: 10 pages, 9 figure
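A toy version of the three-stage pipeline above (SR, edge detection, merge), with plain Sobel filtering and a fixed additive blend standing in for the learned DenseEdgeNet and MergeNet; purely illustrative, not the paper's networks:

```python
import numpy as np

def sobel_edges(img):
    """Gradient magnitude via 3x3 Sobel filters (stand-in for the learned edge network)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.hypot(gx, gy)

# stage 1: pretend `sr` came from an SR network (here: a sharp vertical step)
sr = np.zeros((8, 8)); sr[:, 4:] = 1.0
# stage 2: edge map responds only where the step is
edges = sobel_edges(sr)
# stage 3: a fixed additive blend stands in for MergeNet
merged = sr[1:-1, 1:-1] + 0.1 * edges
```

The point of the third stage is that the edge map injects high-frequency information exactly where the l1-trained SR output tends to be over-smoothed.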
Triple Attention Mixed Link Network for Single Image Super Resolution
Single image super resolution is of great importance as a low-level computer
vision task. Recent approaches with deep convolutional neural networks have
achieved impressive performance. However, existing architectures have
limitations due to their less sophisticated structure and weaker
representational power. In this work, to significantly enhance the feature
representation, we propose the Triple Attention mixed link Network (TAN),
which consists of 1) three different aspects (i.e., kernel, spatial, and
channel) of attention mechanisms and 2) a fusion of both powerful residual and
dense connections (i.e., mixed link). Specifically, the multi-kernel network
learns multiple hierarchical representations under different receptive fields.
The output features are recalibrated by the effective kernel and channel
attentions and fed into the next layer partly through residual and partly
through dense connections, which filters the information and enables the
network to learn more powerful representations. The features finally pass
through the spatial attention in the reconstruction network, which generates a
fusion of local and global information, letting the network restore more
details and improving the quality of the reconstructed images. Thanks to the
diverse feature recalibrations and the advanced information flow topology, our
proposed model is strong enough to perform against the state-of-the-art
methods on the benchmark evaluations.
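The "mixed link" connectivity above, where features reach the next layer partly by residual addition and partly by dense concatenation, can be caricatured as follows (my own illustration of the connectivity pattern, not TAN itself; the channel split point is an assumption):

```python
import numpy as np

def mixed_link_step(features, new_feat, split=4):
    """Combine a layer's output with its input: the first `split` channels are
    added in place (residual path); the remaining channels are concatenated
    (dense path), so the dense part of the representation grows layer by layer."""
    residual = features[:split] + new_feat[:split]
    dense = np.concatenate([features[split:], new_feat[split:]], axis=0)
    return np.concatenate([residual, dense], axis=0)

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 4, 4))   # 8-channel feature map entering the layer
y = rng.standard_normal((8, 4, 4))   # the layer's raw output
z = mixed_link_step(x, y)            # residual part stays 4 channels; dense part grows to 8
```

The residual half keeps gradients flowing and reuses features cheaply, while the dense half preserves earlier features verbatim, which is the "filtering plus accumulation" behaviour the abstract describes.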
Efficient Deep Neural Network for Photo-realistic Image Super-Resolution
Recent progress in the deep learning-based models has improved
photo-realistic (or perceptual) single-image super-resolution significantly.
However, despite their powerful performance, many methods are difficult to
apply to real-world applications because of the heavy computational
requirements. To facilitate the use of a deep model under such demands, we
focus on keeping the network efficient while maintaining its performance. In
detail, we design an architecture that implements a cascading mechanism on a
residual network to boost the performance with limited resources via
multi-level feature fusion. In addition, our proposed model adopts group
convolution and recursive scheme in order to achieve extreme efficiency. We
further improve the perceptual quality of the output by employing the
adversarial learning paradigm and a multi-scale discriminator approach. The
performance of our method is investigated through extensive internal
experiments and benchmarks on various datasets. Our results show that our
models outperform recent methods of similar complexity, for both traditional
pixel-based and perception-based tasks.
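The cascading mechanism described above, fusing the outputs of all earlier residual blocks at each level, might look roughly like this (a generic sketch under my own simplifications: plain averaging stands in for the learned 1x1 fusion convolutions, and the block is a toy residual unit):

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 8
block_W = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(3)]

def block(x, W):
    return np.tanh(x @ W) + x            # a toy residual block

def cascade(x):
    """After each block, fuse the outputs of ALL previous levels
    (averaging stands in for a learned 1x1 fusion convolution)."""
    collected = [x]
    for W in block_W:
        out = block(collected[-1], W)
        collected.append(out)
        fused = np.mean(collected, axis=0)   # multi-level feature fusion
        collected[-1] = fused                # the next block sees the fused features
    return collected[-1]

y = cascade(rng.standard_normal(dim))
```

Fusing every level at every step gives later blocks direct access to shallow features without the cost of dense concatenation, which is how the design boosts performance under a tight resource budget.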
Adapting Image Super-Resolution State-of-the-arts and Learning Multi-model Ensemble for Video Super-Resolution
Recently, image super-resolution has been widely studied and achieved
significant progress by leveraging the power of deep convolutional neural
networks. However, there has been limited advancement in video super-resolution
(VSR) due to the complex temporal patterns in videos. In this paper, we
investigate how to adapt state-of-the-art methods of image super-resolution for
video super-resolution. The proposed adapting method is straightforward. The
information among successive frames is well exploited, while the overhead on
the original image super-resolution method is negligible. Furthermore, we
propose a learning-based method to ensemble the outputs from multiple
super-resolution models. Our methods show superior performance and rank second
in the NTIRE2019 Video Super-Resolution Challenge Track 1.
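A minimal version of the multi-model ensembling above, blending several SR outputs with learned per-model weights (ordinary least squares stands in for the paper's learning-based ensemble; all names here are mine):

```python
import numpy as np

def fit_ensemble_weights(outputs, target):
    """outputs: list of model predictions; solve least squares for blend weights."""
    A = np.stack([o.ravel() for o in outputs], axis=1)   # (pixels, models)
    w, *_ = np.linalg.lstsq(A, target.ravel(), rcond=None)
    return w

rng = np.random.default_rng(4)
target = rng.standard_normal(64)
m1 = target + 0.1 * rng.standard_normal(64)   # two imperfect "model" outputs
m2 = target - 0.1 * rng.standard_normal(64)
w = fit_ensemble_weights([m1, m2], target)
blended = w[0] * m1 + w[1] * m2

err_single = np.mean((m1 - target) ** 2)
err_blend = np.mean((blended - target) ** 2)
```

Since each individual output lies in the span the least-squares blend optimizes over, the blended error can never exceed the best single model's error on the fitting data; the learned ensemble generalizes this idea with a trained combiner.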
Enhancing Perceptual Loss with Adversarial Feature Matching for Super-Resolution
Single image super-resolution (SISR) is an ill-posed problem with an
indeterminate number of valid solutions. Solving this problem with neural
networks would require access to extensive experience, either presented as a
large training set over natural images or a condensed representation from
another pre-trained network. Perceptual loss functions, which belong to the
latter category, have achieved breakthrough success in SISR and several other
computer vision tasks. While perceptual loss plays a central role in the
generation of photo-realistic images, it also produces undesired pattern
artifacts in the super-resolved outputs. In this paper, we show that the root
cause of these pattern artifacts can be traced back to a mismatch between the
pre-training objective of perceptual loss and the super-resolution objective.
To address this issue, we propose to augment the existing perceptual loss
formulation with a novel content loss function that uses the latent features of
a discriminator network to filter the unwanted artifacts across several levels
of adversarial similarity. Further, our modification has a stabilizing effect
on non-convex optimization in adversarial training. The proposed approach
offers notable gains in perceptual quality based on an extensive human
evaluation study and competitive reconstruction fidelity when tested on
objective evaluation metrics.
Comment: Accepted for publication in the International Joint Conference on
Neural Networks (IJCNN) 202
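The proposed content term, matching the discriminator's latent features between the super-resolved and ground-truth images across several levels, reduces to something like the following (a schematic with a made-up two-layer "discriminator"; the names, depths, and weights are all mine):

```python
import numpy as np

rng = np.random.default_rng(5)
D_weights = [rng.standard_normal((16, 16)) * 0.1 for _ in range(2)]

def disc_features(x):
    """Return the activations of every 'discriminator' layer (the latent features)."""
    feats = []
    for W in D_weights:
        x = np.maximum(x @ W, 0.0)   # ReLU layer
        feats.append(x)
    return feats

def feature_matching_loss(sr, hr):
    """L1 distance between discriminator features at every level, averaged over levels."""
    f_sr, f_hr = disc_features(sr), disc_features(hr)
    return np.mean([np.mean(np.abs(a - b)) for a, b in zip(f_sr, f_hr)])

hr = rng.standard_normal(16)
sr = hr + 0.01 * rng.standard_normal(16)
loss = feature_matching_loss(sr, hr)
```

Unlike a fixed pre-trained feature extractor, the discriminator's features are trained on the super-resolution task itself, which is how this term avoids the objective mismatch the abstract identifies.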
Super-Resolution with Deep Adaptive Image Resampling
Deep learning based methods have recently pushed the state-of-the-art on the
problem of Single Image Super-Resolution (SISR). In this work, we revisit the
more traditional interpolation-based methods, that were popular before, now
with the help of deep learning. In particular, we propose to use a
Convolutional Neural Network (CNN) to estimate spatially variant interpolation
kernels and apply the estimated kernels adaptively to each position in the
image. The whole model is trained in an end-to-end manner. We explore two ways
to improve the results for the case of large upscaling factors, and propose a
recursive extension of our basic model. This achieves results that are on par
with state-of-the-art methods. We visualize the estimated adaptive
interpolation kernels to gain more insight into the effectiveness of the
proposed method. We also extend the method to the task of joint image
filtering and again achieve state-of-the-art performance.
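Applying a per-position interpolation kernel, as described above, looks roughly like this (the kernels here are fixed box filters for illustration; in the paper a CNN predicts a different kernel for each position):

```python
import numpy as np

def adaptive_resample(img, kernels):
    """img: (H, W); kernels: (H-2, W-2, 3, 3), one 3x3 kernel per output position.
    Each output pixel is a weighted sum of its 3x3 neighbourhood using ITS OWN kernel."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernels[i, j])
    return out

rng = np.random.default_rng(6)
img = rng.standard_normal((6, 6))
kernels = np.full((4, 4, 3, 3), 1.0 / 9.0)   # every position: a 3x3 box (mean) filter
out = adaptive_resample(img, kernels)
```

With identical kernels this degenerates to ordinary convolution; the spatially variant case lets the network sharpen near edges and smooth in flat regions, which is what the visualized kernels in the paper are meant to reveal.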
Image Super-Resolution via Dual-State Recurrent Networks
Advances in image super-resolution (SR) have recently benefited significantly
from rapid developments in deep neural networks. Inspired by these recent
discoveries, we note that many state-of-the-art deep SR architectures can be
reformulated as a single-state recurrent neural network (RNN) with finite
unfoldings. In this paper, we explore new structures for SR based on this
compact RNN view, leading us to a dual-state design, the Dual-State Recurrent
Network (DSRN). Compared to its single-state counterparts that operate at a
fixed spatial resolution, DSRN exploits both low-resolution (LR) and
high-resolution (HR) signals jointly. Recurrent signals are exchanged between
these states in both directions (both LR to HR and HR to LR) via delayed
feedback. Extensive quantitative and qualitative evaluations on benchmark
datasets and on a recent challenge demonstrate that the proposed DSRN performs
favorably against state-of-the-art algorithms in terms of both memory
consumption and predictive accuracy.
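The dual-state recurrence above, an LR-resolution state and an HR-resolution state exchanging signals at every unfolding, can be caricatured as follows (nearest-neighbour resampling stands in for DSRN's learned transition operators; this is my simplification, not the paper's model):

```python
import numpy as np

def upsample2(x):    # nearest-neighbour 2x upsampling (LR -> HR direction)
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def downsample2(x):  # 2x decimation (HR -> LR direction)
    return x[::2, ::2]

def dsrn_step(lr_state, hr_state):
    """One unfolding: each state is updated from itself AND from the other state,
    resampled to its own resolution (the delayed-feedback exchange)."""
    new_lr = np.tanh(lr_state + downsample2(hr_state))
    new_hr = np.tanh(hr_state + upsample2(lr_state))
    return new_lr, new_hr

lr = np.random.default_rng(7).standard_normal((4, 4))
hr = upsample2(lr)              # initialise the HR state from the LR state
for _ in range(3):              # a finite number of unfoldings
    lr, hr = dsrn_step(lr, hr)
```

Because both updates read the *previous* value of the other state, information flows LR-to-HR and HR-to-LR with a one-step delay, which is the "delayed feedback" the abstract refers to.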