Learning Deformable Kernels for Image and Video Denoising
Most of the classical denoising methods restore clear results by selecting
and averaging pixels in the noisy input. Instead of relying on hand-crafted
selecting and averaging strategies, we propose to explicitly learn this process
with deep neural networks. Specifically, we propose deformable 2D kernels for
image denoising where the sampling locations and kernel weights are both
learned. The proposed kernel naturally adapts to image structures and can
effectively reduce oversmoothing artifacts. Furthermore, we develop 3D
deformable kernels for video denoising to more efficiently sample pixels across
the spatial-temporal space. Our method is able to solve the misalignment issues
of large motion in dynamic scenes. To better train our video denoising
model, we introduce a trilinear sampler and a new regularization term. We
demonstrate that the proposed method performs favorably against the
state-of-the-art image and video denoising approaches on both synthetic and
real-world data.
Comment: 10 pages
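The learned selecting-and-averaging process can be pictured as a weighted average over samples taken at learned fractional offsets, with bilinear interpolation making the sampling locations usable at sub-pixel positions. A minimal NumPy sketch, one pixel at a time (the function names and per-pixel interface are illustrative, not the authors' implementation):

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample img at fractional coordinates (y, x) via bilinear interpolation."""
    h, w = img.shape
    y, x = np.clip(y, 0, h - 1), np.clip(x, 0, w - 1)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

def deformable_denoise_pixel(img, py, px, offsets, weights):
    """Denoise one pixel as a weighted average of samples taken at learned
    fractional offsets around it (offsets and weights would both come from
    a network). Weights are normalized so flat regions pass through exactly."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wk * bilinear_sample(img, py + dy, px + dx)
               for wk, (dy, dx) in zip(w, offsets))
```

Normalizing the weights guarantees that constant regions are reproduced exactly; letting the sampling locations follow image structure is what reduces oversmoothing across edges.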
Attention Mechanism Enhanced Kernel Prediction Networks for Denoising of Burst Images
Deep learning based image denoising methods have been extensively
investigated. In this paper, attention mechanism enhanced kernel prediction
networks (AME-KPNs) are proposed for burst image denoising, in which nearly
cost-free attention modules are adopted to first refine the feature maps and
then make full use of the inter-frame and intra-frame redundancies within
the whole image burst. The proposed AME-KPNs output per-pixel
spatially adaptive kernels, residual maps, and corresponding weight maps:
the predicted kernels roughly restore clean pixels at their corresponding
locations via an adaptive convolution operation, and the residuals are then
weighted and summed to compensate for the limited receptive field of the
predicted kernels. Simulations and real-world experiments
are conducted to illustrate the robustness of the proposed AME-KPNs in burst
image denoising.
Comment: accepted by ICASSP 202
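The two-stage restoration described above, per-pixel adaptive convolution followed by weighted residual compensation, can be sketched for a single output pixel (the names, the simple averaging over frames, and the interface are assumptions for illustration, not the AME-KPN architecture itself):

```python
import numpy as np

def kpn_restore_pixel(burst, ky, kx, kernels, residuals, res_weights):
    """Restore one pixel of an N-frame burst: each predicted per-pixel k x k
    kernel filters a patch of its frame, the filtered values are averaged,
    and weighted residuals are then added to compensate for the kernels'
    limited receptive field. Illustrative sketch, not the authors' code."""
    n, k = len(burst), kernels.shape[-1]
    r = k // 2
    out = 0.0
    for i in range(n):
        patch = burst[i][ky - r:ky + r + 1, kx - r:kx + r + 1]
        out += np.sum(patch * kernels[i])
    out /= n
    # residual compensation: weighted sum of predicted residual maps
    out += sum(w * res[ky, kx] for w, res in zip(res_weights, residuals))
    return out
```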
Quadratic video interpolation
Video interpolation is an important problem in computer vision, which helps
overcome the temporal limitation of camera sensors. Existing video
interpolation methods usually assume uniform motion between consecutive frames
and use linear models for interpolation, which cannot accurately approximate
the complex motion of the real world. To address this issue, we propose a
quadratic video interpolation method which exploits the acceleration
information in videos. This method allows prediction with curvilinear
trajectories and variable velocity, and generates more accurate interpolation
results. For high-quality frame synthesis, we develop a flow reversal layer to
estimate flow fields starting from the unknown target frame to the source
frame. In addition, we present techniques for flow refinement. Extensive
experiments demonstrate that our approach performs favorably against the
existing linear models on a wide variety of video datasets.
Comment: NeurIPS 2019, project website:
https://sites.google.com/view/xiangyuxu/qvi_nips1
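The quadratic motion model is compact enough to state exactly: given the optical flows from frame 0 to frames 1 and -1, per-pixel velocity and acceleration follow, and with them the flow to any intermediate time t in (0, 1). A sketch of the formula (variable names are mine):

```python
import numpy as np

def quadratic_flow(f_0_1, f_0_m1, t):
    """Quadratic motion model: from the flows of frame 0 to frames 1 and -1,
    recover per-pixel velocity and acceleration and predict the flow from
    frame 0 to time t:
        f_{0->t} = (f_{0->1} + f_{0->-1}) / 2 * t^2
                 + (f_{0->1} - f_{0->-1}) / 2 * t
    A linear model would instead use t * f_{0->1}, ignoring acceleration."""
    half_accel = (f_0_1 + f_0_m1) / 2.0  # 0.5 * a
    velocity = (f_0_1 - f_0_m1) / 2.0    # v0
    return half_accel * t ** 2 + velocity * t
```

For uniform motion the two flows are opposite, the quadratic term vanishes, and the model reduces to the linear one; any asymmetry between them is interpreted as acceleration.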
Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition
A key challenge in video enhancement and action recognition is to fuse useful
information from neighboring frames. Recent works suggest establishing accurate
correspondences between neighboring frames before fusing temporal information.
However, the generated results heavily depend on the quality of correspondence
estimation. In this paper, we propose a more robust solution: \emph{sampling
and fusing multi-level features} across neighboring frames to generate the
results. Based on this idea, we introduce a new module to improve the
capability of 3D convolution, namely, learnable sampling 3D convolution
(\emph{LS3D-Conv}). We add learnable 2D offsets to 3D convolution, which
sample locations on the spatial feature maps across frames; the offsets can
be learned for specific tasks. \emph{LS3D-Conv} can flexibly replace the 3D
convolution layers in existing 3D networks, yielding new architectures that
learn the sampling at multiple feature levels. The experiments on video
interpolation, video super-resolution, video denoising, and action recognition
demonstrate the effectiveness of our approach.
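The idea of attaching learnable 2D offsets to the taps of a 3D convolution can be sketched for a single output location with a toy 3x3x3 kernel (the helper and argument layout are illustrative assumptions):

```python
import numpy as np

def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2D feature map at fractional (y, x)."""
    h, w = fmap.shape
    y, x = np.clip(y, 0, h - 1), np.clip(x, 0, w - 1)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * fmap[y0, x0] + (1 - wy) * wx * fmap[y0, x1]
            + wy * (1 - wx) * fmap[y1, x0] + wy * wx * fmap[y1, x1])

def ls3d_conv_point(clip, cy, cx, weights, offsets):
    """One output value of a toy 3x3x3 'learnable sampling' 3D convolution:
    each tap samples its frame at the regular grid location plus a learned
    2D offset (bilinearly interpolated), so temporal fusion no longer
    assumes content is spatially aligned across frames."""
    out = 0.0
    for t in range(3):
        for i in range(3):
            for j in range(3):
                dy, dx = offsets[t, i, j]
                out += weights[t, i, j] * bilinear(
                    clip[t], cy + i - 1 + dy, cx + j - 1 + dx)
    return out
```

With all offsets zero this reduces to a standard 3D convolution, which is why such a module can drop into existing 3D networks.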
Unified Dynamic Convolutional Network for Super-Resolution with Variational Degradations
Deep Convolutional Neural Networks (CNNs) have achieved remarkable results on
Single Image Super-Resolution (SISR). While early works consider only a single
degradation, recent studies also include multiple degrading effects to better
reflect real-world cases. However, most of the works assume a fixed combination
of degrading effects, or even train an individual network for different
combinations. Instead, a more practical approach is to train a single network
for wide-ranging and variational degradations. To fulfill this requirement,
this paper proposes a unified network to accommodate both inter-image
(cross-image) and intra-image (spatial) variations.
Different from existing works, we incorporate dynamic convolution, a far
more flexible alternative for handling such variations. In the non-blind
SISR setting, our Unified Dynamic Convolutional Network for Variational
Degradations (UDVD) is evaluated on both synthetic and real images with an
extensive set of variations. The qualitative results demonstrate the
effectiveness of UDVD over various existing works. Extensive experiments show
that our UDVD achieves favorable or comparable performance on both synthetic
and real images.
Comment: CVPR 202
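The contrast with standard convolution is that the kernel is no longer shared across locations: a kernel-generating branch (omitted here) emits one kernel per pixel, and a filtering step applies it. A hedged sketch of that filtering step only (interface and names are mine, not UDVD's):

```python
import numpy as np

def dynamic_conv2d(img, kernel_field):
    """Spatially dynamic convolution: unlike a standard conv, whose single
    kernel is shared by every location, each output pixel is filtered by
    its own k x k kernel from kernel_field (shape h x w x k x k). This is
    what lets one network adapt to degradations that vary across images
    and within an image."""
    h, w = img.shape
    k = kernel_field.shape[-1]
    r = k // 2
    pad = np.pad(img, r, mode="edge")
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(pad[y:y + k, x:x + k] * kernel_field[y, x])
    return out
```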
Deep Bilateral Retinex for Low-Light Image Enhancement
Low-light images, i.e. the images captured in low-light conditions, suffer
from very poor visibility caused by low contrast, color distortion and
significant measurement noise. Low-light image enhancement is about improving
the visibility of low-light images. As the measurement noise in low-light
images is usually significant yet complex, with spatially varying
characteristics, how to handle the noise effectively is an important yet
challenging problem in low-light image enhancement. Based on the Retinex
decomposition of natural images, this paper proposes a deep learning method for
low-light image enhancement with a particular focus on handling the measurement
noise. The basic idea is to train a neural network to generate a set of
pixel-wise operators for simultaneously predicting the noise and the
illumination layer, where the operators are defined in the bilateral space.
Such an integrated approach allows us to have an accurate prediction of the
reflectance layer in the presence of significant spatially-varying measurement
noise. Extensive experiments on several benchmark datasets have shown that the
proposed method is highly competitive with the state-of-the-art methods and has
a significant advantage over them when processing images captured in extremely
low-light conditions.
Comment: 15 pages
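Assuming the common noisy-Retinex observation model I = R * L + N (reflectance times illumination plus noise), the enhancement reduces to recovering the reflectance once the illumination and noise layers are predicted. A toy sketch of that final recovery step (not the paper's bilateral-space operators):

```python
import numpy as np

def recover_reflectance(obs, illum_est, noise_est, eps=1e-6):
    """Given a low-light observation obs = R * L + N and estimates of the
    illumination layer L and noise layer N (in the paper, jointly predicted
    by a network via bilateral-space operators), recover the reflectance
    R = (obs - N) / L as the enhanced output."""
    return np.clip((obs - noise_est) / np.maximum(illum_est, eps), 0.0, 1.0)
```

Estimating noise and illumination jointly matters because dividing by a small L amplifies any noise left in the observation, which is exactly the failure mode in extremely dark regions.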
Deep Learning on Image Denoising: An overview
Deep learning techniques have received much attention in the area of image
denoising. However, there are substantial differences in the various types of
deep learning methods dealing with image denoising. Specifically,
discriminative learning based on deep learning can ably address the issue of
Gaussian noise. Optimization models based on deep learning are effective in
estimating the real noise. However, there has thus far been little related
research to summarize the different deep learning techniques for image
denoising. In this paper, we offer a comparative study of deep learning
techniques for image denoising. We first classify deep convolutional neural
networks (CNNs) into four categories: CNNs for additive white noisy images,
CNNs for real noisy images, CNNs for blind denoising, and CNNs for hybrid
noisy images, i.e., combinations of noisy, blurred, and low-resolution images.
Then, we analyze the motivations and principles of the different types of deep
learning methods. Next, we compare the state-of-the-art methods on public
denoising datasets in terms of quantitative and qualitative analysis. Finally,
we point out some potential challenges and directions for future research.
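The "discriminative learning for Gaussian noise" family the overview refers to typically relies on residual learning: the network is trained to predict the noise layer, and the clean estimate is the input minus that prediction. A minimal sketch, with an oracle predictor standing in for a trained CNN:

```python
import numpy as np

def residual_denoise(noisy, noise_predictor):
    """Residual learning for Gaussian denoising: a model predicts the noise
    layer v from the noisy input y, and the clean estimate is x = y - v.
    `noise_predictor` is a stand-in for a trained CNN; with an oracle
    predictor the recovery is exact."""
    return noisy - noise_predictor(noisy)
```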