The 2018 PIRM Challenge on Perceptual Image Super-resolution
This paper reports on the 2018 PIRM challenge on perceptual super-resolution
(SR), held in conjunction with the Perceptual Image Restoration and
Manipulation (PIRM) workshop at ECCV 2018. In contrast to previous SR
challenges, our evaluation methodology jointly quantifies accuracy and
perceptual quality, therefore enabling perceptual-driven methods to compete
alongside algorithms that target PSNR maximization. Twenty-one participating
teams introduced algorithms that substantially improved upon the existing
state-of-the-art methods in perceptual SR, as confirmed by a human opinion
study. We also analyze popular image quality measures and draw conclusions
regarding which of them correlates best with human opinion scores. We conclude
with an analysis of the current trends in perceptual SR, as reflected from the
leading submissions.
Comment: Workshop and Challenge on Perceptual Image Restoration and Manipulation in conjunction with ECCV 2018; webpage: https://www.pirm2018.org
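Methods in the challenge were ranked by a perceptual index combining two no-reference quality measures; as commonly defined for PIRM 2018, PI = ½((10 − Ma) + NIQE). A minimal sketch of the combination:

```python
def perceptual_index(ma_score: float, niqe_score: float) -> float:
    """Perceptual index (lower is better), as defined for PIRM 2018:
    PI = 0.5 * ((10 - Ma) + NIQE), combining Ma et al.'s measure
    (higher = better) with NIQE (lower = better)."""
    return 0.5 * ((10.0 - ma_score) + niqe_score)
```

For example, a perceptually strong method scoring Ma ≈ 8.5 and NIQE ≈ 3.5 would obtain PI = 2.5.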
Perception-oriented Single Image Super-Resolution via Dual Relativistic Average Generative Adversarial Networks
Residual and dense neural networks have greatly advanced image
super-resolution (SR) and produced many impressive results. However, although
adding layers and connections generally improves performance, the accompanying
growth in model parameters hinders the practical deployment of SR algorithms.
Furthermore, algorithms supervised by L1/L2 losses can achieve strong scores on
traditional metrics such as PSNR and SSIM, yet they produce blurry,
over-smoothed outputs lacking high-frequency details, i.e., a high perceptual
index (PI). To address these issues, this paper develops a perception-oriented
single-image SR algorithm based on dual relativistic average generative
adversarial networks. In the generator, a novel residual channel attention
block recalibrates the significance of individual channels, further increasing
feature-expression capability. The parameters of the convolutional layers
within each block are shared to enlarge the receptive field while keeping the
number of tunable parameters unchanged. The feature maps are upsampled by
sub-pixel convolution to obtain the reconstructed high-resolution images. The
discriminator consists of two relativistic average discriminators operating in
the pixel domain and the feature domain, respectively, fully exploiting the
prior that half of the data in a mini-batch are fake. Weighted combinations of
perceptual and adversarial losses supervise the generator to balance
perceptual quality against objective fidelity. Experimental results and
ablation studies show that the proposed algorithm rivals state-of-the-art SR
algorithms both perceptually (PI minimization) and objectively (PSNR
maximization) with fewer parameters.
Comment: Re-submitted after code review
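The relativistic average (RaGAN) objective used by such discriminators scores each sample against the mean critic output of the opposite class. A minimal NumPy sketch of the two losses (function names are mine, not the paper's):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ragan_d_loss(c_real, c_fake):
    """Relativistic average discriminator loss.

    c_real, c_fake: raw critic outputs C(.) on mini-batches of real and
    generated samples. The critic judges how much more realistic a real
    sample is than the *average* fake (and vice versa), exploiting the
    prior that half of the mini-batch is fake.
    """
    d_real = sigmoid(c_real - c_fake.mean())   # real vs. average fake
    d_fake = sigmoid(c_fake - c_real.mean())   # fake vs. average real
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def ragan_g_loss(c_real, c_fake):
    """Generator loss: the symmetric objective, pushing fakes above reals."""
    d_real = sigmoid(c_real - c_fake.mean())
    d_fake = sigmoid(c_fake - c_real.mean())
    return -np.mean(np.log(d_fake)) - np.mean(np.log(1.0 - d_real))
```

When the critic separates the classes well (real scores far above fake scores), the discriminator loss approaches zero; the generator loss is minimized by the opposite arrangement.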
A Deep Journey into Super-resolution: A Survey
Deep convolutional networks based super-resolution is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation shows consistent and rapid growth in
accuracy over the past few years, along with a corresponding increase in model
complexity and the availability of large-scale datasets. It is also
observed that the pioneering methods identified as the benchmark have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Surveys
Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization
Image demosaicing and super-resolution are two important tasks in the color
imaging pipeline. So far, they have mostly been studied independently in the
open literature of deep learning; little is known about the potential benefit
of formulating a joint demosaicing and super-resolution (JDSR) problem. In this
paper, we propose an end-to-end optimization solution to the JDSR problem and
demonstrate its practical significance in computational imaging. Our technical
contributions are two-fold. On network design, we have developed a
Residual-Dense Squeeze-and-Excitation Network (RDSEN) supported by a
pre-demosaicing network (PDNet) as the pre-processing step. We address the
issue of spatio-spectral attention for color-filter-array (CFA) data and
discuss how to achieve better information flow by concatenating Residual-Dense
Squeeze-and-Excitation Blocks (RDSEBs) for JDSR. Experimental results have
shown that significant PSNR/SSIM gain can be achieved by RDSEN over previous
network architectures including state-of-the-art RCAN. On perceptual
optimization, we propose to leverage the latest ideas including relativistic
discriminator and pre-excitation perceptual loss function to further improve
the visual quality of textured regions in reconstructed images. Our extensive
experiment results have shown that Texture-enhanced Relativistic average
Generative Adversarial Network (TRaGAN) can produce both subjectively more
pleasant images and objectively lower perceptual distortion scores than
standard GAN for JDSR. Finally, we have verified the benefit of JDSR to
high-quality image reconstruction from real-world Bayer-pattern data collected
by the NASA Mars Curiosity rover.
Comment: IEEE Transactions on Computational Imaging
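The squeeze-and-excitation mechanism inside such blocks gates each channel by a learned, globally pooled statistic. A minimal NumPy sketch of that channel recalibration (shapes and names are illustrative, not the authors' implementation):

```python
import numpy as np

def squeeze_excitation(feat, w1, w2):
    """Channel attention as in a Squeeze-and-Excitation block (sketch).

    feat: feature maps of shape (C, H, W)
    w1:   (C//r, C) reduction weights; w2: (C, C//r) expansion weights,
          where r is the channel-reduction ratio.
    """
    z = feat.mean(axis=(1, 2))              # squeeze: global average pool -> (C,)
    s = np.maximum(w1 @ z, 0.0)             # excitation: FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))     # FC + sigmoid -> per-channel gates
    return feat * s[:, None, None]          # recalibrate channels
```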
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement
Adversarial loss in a conditional generative adversarial network (GAN) is not
designed to directly optimize evaluation metrics of a target task, and thus,
may not always guide the generator in a GAN to generate data with improved
metric scores. To overcome this issue, we propose a novel MetricGAN approach
with an aim to optimize the generator with respect to one or multiple
evaluation metrics. Moreover, based on MetricGAN, the metric scores of the
generated data can also be arbitrarily specified by users. We tested the
proposed MetricGAN on a speech enhancement task, which is particularly suitable
to verify the proposed approach because there are multiple metrics measuring
different aspects of speech signals. Moreover, these metrics are generally
complex and cannot be fully optimized by Lp or conventional adversarial
losses.
Comment: Accepted by the Thirty-sixth International Conference on Machine Learning (ICML) 2019
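MetricGAN's core idea can be summarized in two least-squares objectives: the discriminator learns to regress the black-box metric's (normalized) score, and the generator is pushed toward a user-specified target score. A sketch under those assumptions (function names are mine):

```python
import numpy as np

def metricgan_d_loss(d_pred, metric_score):
    """Train D as a learned surrogate of the black-box metric:
    D(enhanced, clean) should regress to the normalized metric score."""
    return np.mean((d_pred - metric_score) ** 2)

def metricgan_g_loss(d_pred_on_g, target_score=1.0):
    """Train G so the surrogate assigns the user-specified target score
    (target_score = 1.0 asks for the best achievable metric value)."""
    return np.mean((d_pred_on_g - target_score) ** 2)
```

Because the generator only needs gradients from the surrogate D, the underlying metric itself can remain non-differentiable, which is what makes black-box scores like PESQ or STOI optimizable in this framework.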
Efficient Deep Neural Network for Photo-realistic Image Super-Resolution
Recent progress in the deep learning-based models has improved
photo-realistic (or perceptual) single-image super-resolution significantly.
However, despite their powerful performance, many methods are difficult to
apply to real-world applications because of the heavy computational
requirements. To facilitate the use of a deep model under such demands, we
focus on keeping the network efficient while maintaining its performance. In
detail, we design an architecture that implements a cascading mechanism on a
residual network to boost the performance with limited resources via
multi-level feature fusion. In addition, our proposed model adopts group
convolution and recursive scheme in order to achieve extreme efficiency. We
further improve the perceptual quality of the output by employing the
adversarial learning paradigm and a multi-scale discriminator approach. The
performance of our method is investigated through extensive internal
experiments and benchmarks on various datasets. Our results show that our
models outperform recent methods of similar complexity on both traditional
pixel-based and perception-based tasks.
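The parameter saving from group convolution, one of the efficiency devices mentioned above, is easy to quantify: with g groups, each filter sees only C_in/g input channels, dividing the layer's weight count by g. A quick sketch (not the paper's code):

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2-D convolution with square k x k kernels
    (bias omitted). Each of the c_out filters convolves c_in/groups
    input channels, so the total is divided by `groups`."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k

dense = conv_params(64, 64, 3)               # 36864 weights
grouped = conv_params(64, 64, 3, groups=4)   # 9216 weights: 4x fewer
```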
PCA-SRGAN: Incremental Orthogonal Projection Discrimination for Face Super-resolution
Generative Adversarial Networks (GANs) have been employed for face
super-resolution, but they easily introduce distorted facial details and
remain weak at recovering realistic texture. To further improve the
performance of GAN-based models on super-resolving face images, we propose
PCA-SRGAN, which performs cumulative discrimination in the orthogonal
projection space spanned by the PCA projection matrix of face data. By feeding
the principal-component projections, ranging from structure to details, into
the discriminator, the difficulty of discrimination is greatly alleviated and
the generator can be enhanced to reconstruct clearer contours and finer
texture, ultimately helping to achieve high perceptual quality with low
distortion. This incremental orthogonal projection discrimination ensures a
precise coarse-to-fine optimization procedure and avoids dependence on
perceptual regularization. We conduct experiments on the CelebA and FFHQ face
datasets. Qualitative visual results and quantitative evaluation demonstrate
that our model clearly outperforms related works.
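The orthogonal projection space referred to above is the PCA eigenbasis of vectorized face images; feeding the leading k components first yields the coarse-to-fine schedule. A minimal NumPy illustration (my own sketch, not the authors' code):

```python
import numpy as np

def pca_basis(faces):
    """PCA basis of vectorized face images, ordered from coarse structure
    (leading components) to fine detail (trailing components).

    faces: (N, D) matrix, one flattened image per row.
    """
    centered = faces - faces.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt

def project(image, basis, k):
    """Project one flattened image onto the first k principal components,
    as in an incremental (coarse-to-fine) discrimination schedule."""
    return basis[:k] @ image
```

Starting the discriminator on low-k projections and growing k over training is one way to realize the "structure to details" progression the abstract describes.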
Learning a Deep Convolution Network with Turing Test Adversaries for Microscopy Image Super Resolution
Adversarially trained deep neural networks have significantly improved
performance of single image super resolution, by hallucinating photorealistic
local textures, thereby greatly reducing the perception difference between a
real high resolution image and its super resolved (SR) counterpart. However,
application to medical imaging requires preservation of diagnostically relevant
features while refraining from introducing any diagnostically confusing
artifacts. We propose using a deep convolutional super resolution network
(SRNet) trained for (i) minimising reconstruction loss between the real and SR
images, and (ii) maximally confusing learned relativistic visual Turing test
(rVTT) networks that discriminate between (a) pairs of real and SR images (T1)
and (b) pairs of patches selected from regions of interest in the real and SR
images (T2). The adversarial losses of T1 and T2, backpropagated through the
SRNet, help it learn to reconstruct pathorealism in the regions of interest,
such as white blood cells (WBCs) in peripheral blood smears or epithelial
cells in the histopathology of cancerous biopsy tissues, as experimentally
demonstrated here.
Experiments performed for measuring signal distortion loss using peak signal to
noise ratio (pSNR) and structural similarity (SSIM) with variation of SR scale
factors, impact of rVTT adversarial losses, and impact on reporting using SR on
a commercially available artificial intelligence (AI) digital pathology system
substantiate our claims.
Comment: To appear in the Proceedings of the 2019 IEEE International Symposium on Biomedical Imaging (ISBI 2019)
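The distortion measure reported above, pSNR, is standard; a short reference sketch for completeness:

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between a real high-resolution
    image and its super-resolved counterpart."""
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```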
Use of Generative Adversarial Network Algorithm in Super-Resolution Images to Increase the Quality of Digital Elevation Models Based on ALOS PALSAR Data
Digital elevation models provide altimetric information about a surface to be mapped. While global models of low and medium spatial resolution are freely available from several space agencies, the high-resolution models used at scales of 1:25,000 and larger are scarce and expensive. Here we address this limitation by applying deep learning algorithms coupled with single-image super-resolution techniques to digital elevation models, obtaining higher-spatial-quality versions from lower-resolution inputs. The GAN-based (Generative Adversarial Network-based) methodology developed here improves the initial spatial resolution of low-resolution images. In the geospatial data context, for example, these algorithms can be applied to digital elevation models and satellite images. The methodological approach uses a dataset built from SRTM (Shuttle Radar Topography Mission) digital elevation models (30 meters of spatial resolution) and ALOS PALSAR models (12.5 meters of spatial resolution), created to enable this study, to promote the emergence of new research groups in the area, and to allow comparison between the results obtained. We found that increasing the number of iterations improved the performance of the generated model and the quality of the generated images. Furthermore, visual analysis of the generated image against the high- and low-resolution ones showed great similarity between the first two.
PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report
This paper reviews the first challenge on efficient perceptual image
enhancement with the focus on deploying deep learning models on smartphones.
The challenge consisted of two tracks. In the first one, participants were
solving the classical image super-resolution problem with a bicubic downscaling
factor of 4. The second track was aimed at real-world photo enhancement, and
the goal was to map low-quality photos from the iPhone 3GS device to the same
photos captured with a DSLR camera. The target metric used in this challenge
combined the runtime, PSNR scores and solutions' perceptual results measured in
the user study. To ensure the efficiency of the submitted models, we
additionally measured their runtime and memory requirements on Android
smartphones. The proposed solutions significantly improved the baseline
results, defining the state-of-the-art for image enhancement on smartphones.
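The combined target metric blended runtime, PSNR, and user-study results. The challenge's exact per-track weights and baselines are not reproduced here; purely as an illustration, a linear combiner of that shape might look like:

```python
def challenge_score(psnr_gain, perceptual_gain, speedup,
                    w_psnr=1.0, w_perc=1.0, w_time=1.0):
    """Hypothetical combined score rewarding fidelity (PSNR gain over a
    baseline), user-study perceptual quality, and runtime speed-up. The
    actual challenge weights and normalizations differ per track; these
    placeholder weights are NOT the official ones."""
    return w_psnr * psnr_gain + w_perc * perceptual_gain + w_time * speedup
```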