9,756 research outputs found
NTIRE 2020 Challenge on Image and Video Deblurring
Motion blur is one of the most common degradation artifacts in dynamic scene
photography. This paper reviews the NTIRE 2020 Challenge on Image and Video
Deblurring. In this challenge, we present the evaluation results from 3
competition tracks as well as the proposed solutions. Track 1 aims to develop
single-image deblurring methods focusing on restoration quality. In Track 2,
the image deblurring methods are executed on a mobile platform to find the
balance between running speed and restoration accuracy. Track 3 targets
developing video deblurring methods that exploit the temporal relation between
input frames. The three tracks had 163, 135, and 102 registered participants,
respectively, and 9, 4, and 7 teams competed in the final testing phase. The
winning methods demonstrate the state-of-the-art performance on image and video
deblurring tasks.
Comment: To be published in the CVPR 2020 Workshop (New Trends in Image Restoration and Enhancement).
NTIRE 2020 Challenge on Image Demoireing: Methods and Results
This paper reviews the Challenge on Image Demoireing that was part of the New
Trends in Image Restoration and Enhancement (NTIRE) workshop, held in
conjunction with CVPR 2020. Demoireing is the difficult task of removing moire
patterns from an image to reveal the underlying clean image. The challenge was
divided into two tracks. Track 1 targeted the single image demoireing problem,
which seeks to remove moire patterns from a single image. Track 2 focused on
the burst demoireing problem, where a set of degraded moire images of the same
scene were provided as input, with the goal of producing a single demoired
image as output. The methods were ranked in terms of their fidelity, measured
using the peak signal-to-noise ratio (PSNR) between the ground truth clean
images and the restored images produced by the participants' methods. The
tracks had 142 and 99 registered participants, respectively, with a total of 14
and 6 submissions in the final testing stage. The entries span the current
state-of-the-art in image and burst image demoireing problems.
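Since both tracks rank methods by fidelity in PSNR, a minimal pure-Python sketch of the metric may be useful (assuming 8-bit images stored as nested lists of numbers; the helper name `psnr` is hypothetical):

```python
import math

def psnr(ref, out, peak=255.0):
    """Peak signal-to-noise ratio between two same-size images.

    ref, out: H x W nested lists of pixel values; peak: maximum pixel value.
    """
    ref_flat = [p for row in ref for p in row]
    out_flat = [p for row in out for p in row]
    # mean squared error over all pixels
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, out_flat)) / len(ref_flat)
    if mse == 0:
        return float("inf")  # identical images: unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher is better; a perfectly restored image yields infinite PSNR.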
Super-Resolution with Deep Adaptive Image Resampling
Deep learning based methods have recently pushed the state-of-the-art on the
problem of Single Image Super-Resolution (SISR). In this work, we revisit the
more traditional interpolation-based methods, that were popular before, now
with the help of deep learning. In particular, we propose to use a
Convolutional Neural Network (CNN) to estimate spatially variant interpolation
kernels and apply the estimated kernels adaptively to each position in the
image. The whole model is trained in an end-to-end manner. We explore two ways
to improve the results for the case of large upscaling factors, and propose a
recursive extension of our basic model. This achieves results that are on par
with state-of-the-art methods. We visualize the estimated adaptive
interpolation kernels to gain more insight on the effectiveness of the proposed
method. We also extend the method to the task of joint image filtering and
again achieve state-of-the-art performance.
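The central operation, applying a different interpolation kernel at every pixel position, can be sketched in pure Python as follows (the helper name `apply_adaptive_kernels` is hypothetical; in the paper the per-pixel kernels are estimated by a CNN, whereas here they are simply passed in):

```python
def apply_adaptive_kernels(image, kernels, k=1):
    """Apply a distinct (2k+1)x(2k+1) kernel at each pixel position.

    image: H x W nested list of floats.
    kernels: kernels[y][x] is a flat list of (2k+1)**2 weights for pixel (y, x).
    """
    H, W = len(image), len(image[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    sy = min(max(y + dy, 0), H - 1)  # clamp at image borders
                    sx = min(max(x + dx, 0), W - 1)
                    w = kernels[y][x][(dy + k) * (2 * k + 1) + (dx + k)]
                    acc += w * image[sy][sx]
            out[y][x] = acc
    return out
```

With an identity kernel (weight 1 at the center, 0 elsewhere) at every position, the output reproduces the input, which is a handy sanity check for any spatially variant resampler.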
Structural Residual Learning for Single Image Rain Removal
To alleviate the adverse effect of rain streaks in image processing tasks,
CNN-based single image rain removal methods have been recently proposed.
However, the performance of these deep learning methods largely relies on the
covering range of rain shapes contained in the pre-collected training
rainy-clean image pairs. This makes them easily trapped into the
overfitting-to-the-training-samples issue and cannot finely generalize to
practical rainy images with complex and diverse rain streaks. To address this
generalization issue, this study proposes a new network architecture that
enforces the output residual of the network to possess intrinsic rain
structures. Such a structural residual setting guarantees that the rain layer
extracted by the network complies with the prior knowledge of general rain
streaks, and thus regularizes the rain shapes that can be extracted from rainy
images in both the training and prediction stages. This general regularization
naturally leads to better training accuracy and better generalization at test
time, even for unseen rain configurations. This superiority is comprehensively
substantiated, both visually and quantitatively, by experiments on synthetic
and real datasets in comparison with current state-of-the-art methods.
Recurrent Back-Projection Network for Video Super-Resolution
We propose a novel architecture for the problem of video super-resolution.
We integrate spatial and temporal contexts from continuous video frames using a
recurrent encoder-decoder module, that fuses multi-frame information with the
more traditional, single frame super-resolution path for the target frame. In
contrast to most prior work where frames are pooled together by stacking or
warping, our model, the Recurrent Back-Projection Network (RBPN) treats each
context frame as a separate source of information. These sources are combined
in an iterative refinement framework inspired by the idea of back-projection in
multiple-image super-resolution. This is aided by explicitly representing
estimated inter-frame motion with respect to the target, rather than explicitly
aligning frames. We propose a new video super-resolution benchmark, allowing
evaluation at a larger scale and considering videos in different motion
regimes. Experimental results demonstrate that our RBPN is superior to existing
methods on several datasets.
Comment: To appear in CVPR 2019.
Learning Spatial Pyramid Attentive Pooling in Image Synthesis and Image-to-Image Translation
Image synthesis and image-to-image translation are two important generative
learning tasks. Remarkable progress has been made by learning Generative
Adversarial Networks (GANs)~\cite{goodfellow2014generative} and
cycle-consistent GANs (CycleGANs)~\cite{zhu2017unpaired} respectively. This
paper presents a method of learning Spatial Pyramid Attentive Pooling (SPAP)
which is a novel architectural unit and can be easily integrated into both
generators and discriminators in GANs and CycleGANs. The proposed SPAP
integrates Atrous spatial pyramid~\cite{chen2018deeplab}, a proposed cascade
attention mechanism and residual connections~\cite{he2016deep}. It leverages
the advantages of the three components to facilitate effective end-to-end
generative learning: (i) the capability of fusing multi-scale information by
ASPP; (ii) the capability of capturing relative importance between both spatial
locations (especially multi-scale context) or feature channels by attention;
(iii) the capability of preserving information and enhancing optimization
feasibility by residual connections. Coarse-to-fine and fine-to-coarse SPAP are
studied and intriguing attention maps are observed in both tasks. In
experiments, the proposed SPAP is tested in GANs on the Celeba-HQ-128
dataset~\cite{karras2017progressive}, and tested in CycleGANs on the
Image-to-Image translation datasets, including the Cityscapes
dataset~\cite{cordts2016cityscapes} and the Facades and Aerial Maps
datasets~\cite{zhu2017unpaired}, obtaining better performance in both cases.
Comment: 12 pages.
Reverse Attention for Salient Object Detection
Benefiting from the rapid development of deep learning techniques, salient
object detection has achieved remarkable progress recently. However, two major
challenges still hinder its application on embedded devices: low-resolution
output and heavy model weight. To this end, this paper presents an accurate yet
compact deep network for efficient salient object detection. More specifically,
given a coarse saliency prediction in the deepest layer, we first employ
residual learning to learn side-output residual features for saliency
refinement, which can be achieved with very limited convolutional parameters
while maintaining accuracy. Secondly, we further propose reverse attention to
guide such side-output residual learning in a top-down manner. By erasing the
currently predicted salient regions from side-output features, the network can
eventually explore the missing object parts and details, which results in high
resolution and accuracy. Experiments on six benchmark datasets demonstrate that
the proposed approach compares favorably against state-of-the-art methods, with
advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
Comment: ECCV 2018.
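The erasing step described above amounts to weighting side-output features by one minus the sigmoid-activated coarse prediction. A minimal sketch, assuming single-channel maps stored as nested lists and a hypothetical `reverse_attention` helper:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reverse_attention(features, coarse_pred):
    """Suppress already-detected salient regions so the side branch
    focuses on the parts the coarse prediction missed.

    features, coarse_pred: H x W nested lists (logits for coarse_pred).
    """
    return [[f * (1.0 - sigmoid(p)) for f, p in zip(frow, prow)]
            for frow, prow in zip(features, coarse_pred)]
```

Where the coarse prediction is confidently salient (large logit), the weight approaches zero and the feature is erased; where it is uncertain or background, the feature passes through largely intact.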
Hyperspectral Image Super-resolution via Deep Progressive Zero-centric Residual Learning
This paper explores the problem of hyperspectral image (HSI) super-resolution
that merges a low resolution HSI (LR-HSI) and a high resolution multispectral
image (HR-MSI). The cross-modality distribution of the spatial and spectral
information makes the problem challenging. Inspired by the classic wavelet
decomposition-based image fusion, we propose a novel \textit{lightweight} deep
neural network-based framework, namely progressive zero-centric residual
network (PZRes-Net), to address this problem efficiently and effectively.
Specifically, PZRes-Net learns a high resolution and \textit{zero-centric}
residual image, which contains high-frequency spatial details of the scene
across all spectral bands, from both inputs in a progressive fashion along the
spectral dimension. The resulting residual image is then superimposed onto
the up-sampled LR-HSI in a \textit{mean-value invariant} manner, leading to a
coarse HR-HSI, which is further refined by exploring the coherence across all
spectral bands simultaneously. To learn the residual image efficiently and
effectively, we employ spectral-spatial separable convolution with dense
connections. In addition, we propose zero-mean normalization implemented on the
feature maps of each layer to realize the zero-mean characteristic of the
residual image. Extensive experiments over both real and synthetic benchmark
datasets demonstrate that our PZRes-Net outperforms state-of-the-art methods to
a \textit{significant} extent in terms of both 4 quantitative metrics and
visual quality, e.g., our PZRes-Net improves the PSNR by more than 3dB, while
saving 2.3$\times$ parameters and consuming 15$\times$ fewer FLOPs. The code is
publicly available at https://github.com/zbzhzhy/PZRes-Net.
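The zero-mean normalization described above can be sketched as subtracting the spatial mean from a feature map (pure Python on a single map; the paper applies it to the feature maps of each layer, and the helper name `zero_mean` is hypothetical):

```python
def zero_mean(fmap):
    """Subtract the spatial mean so the feature map is zero-centric.

    fmap: H x W nested list of floats.
    """
    vals = [v for row in fmap for v in row]
    m = sum(vals) / len(vals)
    return [[v - m for v in row] for row in fmap]
```

A residual built from such maps sums to zero, so superimposing it onto the up-sampled LR-HSI leaves the mean intensity unchanged, which is exactly the mean-value invariant property the abstract refers to.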
AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results
This paper reviews the AIM 2020 challenge on efficient single image
super-resolution with focus on the proposed solutions and results. The
challenge task was to super-resolve an input image with a magnification factor
x4 based on a set of prior examples of low and corresponding high resolution
images. The goal is to devise a network that reduces one or several aspects
such as runtime, parameter count, FLOPs, activations, and memory consumption
while at least maintaining the PSNR of MSRResNet. The track had 150 registered
participants, and 25 teams submitted final results, which together gauge the
state-of-the-art in efficient single image super-resolution.
Deep Generative Filter for Motion Deblurring
Removing blur caused by camera shake in images has always been a challenging
problem in computer vision due to its ill-posed nature. Motion blur caused by
the relative motion between the camera and the object in 3D space induces a
spatially varying blurring effect over the entire image. In this
paper, we propose a novel deep filter based on Generative Adversarial Network
(GAN) architecture integrated with global skip connection and dense
architecture in order to tackle this problem. Our model, while bypassing the
process of blur kernel estimation, significantly reduces the test time which is
necessary for practical applications. The experiments on the benchmark datasets
prove the effectiveness of the proposed method which outperforms the
state-of-the-art blind deblurring algorithms both quantitatively and
qualitatively.