Learning Optimization-inspired Image Propagation with Control Mechanisms and Architecture Augmentations for Low-level Vision
In recent years, building deep learning models from an optimization perspective
has become a promising direction for solving low-level vision problems. The
main idea of most existing approaches is to straightforwardly combine numerical
iterations with manually designed network architectures to generate image
propagations for specific kinds of optimization models. However, these
heuristic learning models often lack mechanisms to control the propagation and
rely heavily on architecture engineering. To mitigate these issues, this
paper proposes a unified optimization-inspired deep image propagation framework
to aggregate Generative, Discriminative and Corrective (GDC for short)
principles for a variety of low-level vision tasks. Specifically, we first
formulate low-level vision tasks using a generic optimization objective and
construct our fundamental propagative modules from three different viewpoints,
i.e., the solution could be obtained/learned 1) in a generative manner, 2) based
on a discriminative metric, and 3) with domain-knowledge correction. By designing
control mechanisms to guide image propagations, we then obtain convergence
guarantees of GDC for both fully- and partially-defined optimization
formulations. Furthermore, we introduce two architecture augmentation
strategies (i.e., normalization and automatic search) to respectively enhance
the propagation stability and task/data-adaptation ability. Extensive experiments
on different low-level vision applications demonstrate the effectiveness and
flexibility of GDC.
Comment: 15 pages
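The interplay of the three GDC principles can be illustrated with a deliberately simple numerical sketch: a gradient step on a fidelity term plays the generative role, local smoothing stands in for a learned corrective module, and a residual metric acts as the discriminative control that decides when to stop the propagation. This is purely illustrative and not the paper's learned architecture.

```python
import numpy as np

def gdc_propagate(y, steps=20, lr=0.4, tol=1e-4):
    """Toy sketch of Generative-Discriminative-Corrective propagation.
    Generative: gradient step on the fidelity term ||x - y||^2.
    Corrective: mild local averaging as a stand-in for domain knowledge.
    Discriminative: a residual metric that controls (stops) the propagation."""
    x = np.zeros_like(y)
    for _ in range(steps):
        x_new = x - lr * 2.0 * (x - y)                       # generative update
        x_new = 0.5 * x_new + 0.25 * (np.roll(x_new, 1)
                                      + np.roll(x_new, -1))  # corrective update
        if np.abs(x_new - x).max() < tol:                    # discriminative control
            break
        x = x_new
    return x
```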
The Secrets of Non-Blind Poisson Deconvolution
Non-blind image deconvolution has been studied for several decades but most
of the existing work focuses on blur instead of noise. In photon-limited
conditions, however, the excessive amount of shot noise makes traditional
deconvolution algorithms fail. In searching for reasons why these methods fail,
we present a systematic analysis of the Poisson non-blind deconvolution
algorithms reported in the literature, covering both classical and deep
learning methods. We compile a list of five "secrets" highlighting the do's and
don'ts when designing algorithms. Based on this analysis, we build a
proof-of-concept method by combining the five secrets. We find that the new
method performs on par with some of the latest methods while outperforming some
older ones.
Comment: Under submission at Transactions on Computational Imaging
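The photon-limited forward model underlying this line of work is commonly written as y ~ Poisson(α · (k * x)), where α sets the photon count and hence the shot-noise strength. A minimal simulation of such an observation (assuming circular convolution via the FFT, a common simplification) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def photon_limited_observation(x, kernel, photon_level):
    """Simulate a photon-limited blurred observation:
    y ~ Poisson(photon_level * (kernel * x)) / photon_level.
    Lower photon_level means stronger shot noise."""
    # circular convolution via FFT (a simplifying boundary assumption)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(kernel, s=x.shape)))
    blurred = np.clip(blurred, 0.0, None)   # Poisson rates must be nonnegative
    return rng.poisson(photon_level * blurred) / photon_level
```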
Image Restoration
This book presents a sample of recent contributions by researchers from around the world in the field of image restoration. The book consists of 15 chapters organized in three main sections (Theory, Applications, Interdisciplinarity). The topics cover several aspects of the theory of image restoration, but the book is also an occasion to highlight new research topics arising from the emergence of original imaging devices. From these devices arise some genuinely challenging image reconstruction/restoration problems that open the way to new fundamental scientific questions, closely related to the world we interact with.
Data-Driven Image Restoration
Every day many images are taken by digital cameras, and people demand
visually accurate and pleasing results. Noise and blur degrade images
captured by modern cameras, and high-level vision tasks (such as
segmentation, recognition, and tracking) require high-quality images.
Therefore, image restoration, specifically image deblurring and image
denoising, is a critical preprocessing step.
A fundamental problem in image deblurring is to reliably recover
distinct spatial frequencies that have been suppressed by the blur
kernel. Existing image deblurring techniques often rely on generic
image priors that help recover only part of the frequency spectrum,
such as the frequencies near the high end. To address this, we pose
the following specific questions: (i) Does class-specific
information offer an advantage over existing generic priors for
image quality restoration? (ii) If a class-specific prior exists,
how should it be encoded into a deblurring framework to recover
attenuated image frequencies? Throughout this work, we devise a
class-specific prior based on the band-pass filter responses and
incorporate it into a deblurring strategy. Specifically, we show
that the subspace of band-pass filtered images and their
intensity distributions serve as useful priors for recovering
image frequencies.
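One simple way to realize a band-pass response of the kind used here is a difference of two blurs at different scales, which passes frequencies between the two blur cutoffs. The sketch below uses box blurs and wrap-around borders purely for self-containment; the thesis's actual filter bank may differ.

```python
import numpy as np

def box_blur(img, r):
    """Box blur of radius r via shifted sums (wrap-around borders)."""
    out = np.zeros_like(img, dtype=float)
    n = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            n += 1
    return out / n

def band_pass(img, r_small=1, r_large=3):
    """Difference of two box blurs approximates a band-pass filter:
    it keeps frequencies between the two blur scales. The subspace of
    such responses is one way to encode a class-specific prior."""
    return box_blur(img, r_small) - box_blur(img, r_large)
```

A constant image contains no mid-band frequencies, so its band-pass response is identically zero.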
Next, we present a novel image denoising algorithm that uses an
external, category-specific image database. In contrast to existing
noisy-image restoration algorithms, our method selects clean image
“support patches” similar to the noisy patch from an external
database. We employ a content-adaptive distribution model for each
patch, deriving the parameters of the distribution from the support
patches. Our objective function is composed of a Gaussian fidelity
term that imposes category-specific information and a low-rank term
that robustly encourages similarity between the noisy patch and the
support patches.
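A standard building block for such low-rank terms is singular-value soft-thresholding, the proximal operator of the nuclear norm, applied to a matrix whose columns are the noisy patch and its support patches. The sketch below shows only this generic operator, not the thesis's exact objective.

```python
import numpy as np

def svt(patch_matrix, tau):
    """Singular-value soft-thresholding: the proximal operator of the
    nuclear norm. Shrinking the singular values by tau pulls the stacked
    noisy/support patches toward a common low-rank structure."""
    U, s, Vt = np.linalg.svd(patch_matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt
```

With tau = 0 the operator is the identity; a tau above the largest singular value annihilates the matrix.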
Finally, we propose to learn a fully convolutional network model
that consists of a Chain of Identity Mapping Modules (CIMM) for
image denoising. The CIMM structure possesses two distinctive
features that are important for the noise-removal task. First,
each residual unit employs identity mappings as the skip
connections and receives pre-activated input, preserving the
gradient magnitude propagated in both the forward and backward
directions. Second, by utilizing dilated kernels for the
convolution layers in the residual branch, each neuron in the
last convolution layer of each module can observe the full
receptive field of the first layer.
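The receptive-field claim follows from the standard growth rule for stride-1 dilated convolutions: each layer with kernel size k and dilation d adds (k - 1) * d pixels. A small helper (illustrative arithmetic, not the CIMM architecture itself) makes the effect of growing dilations concrete:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions:
    each layer adds (k - 1) * d pixels. Growing dilations let a few
    layers cover the module's full first-layer input."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf
```

Three 3x3 layers with dilations 1, 2, 4 already see 15 pixels, versus 7 without dilation.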
Neural Gradient Regularizer
Owing to its significant success, the prior imposed on gradient maps has
consistently been a subject of great interest in the field of image processing.
Total variation (TV), one of the most representative regularizers, is known for
its ability to capture the sparsity of gradient maps. Nonetheless, TV and its
variants often underestimate the gradient maps, leading to the weakening of
edges and details whose gradients should not be zero in the original image.
Recently, total deep variation (TDV) has been introduced, assuming the sparsity
of feature maps, which provides a flexible regularization learned from
large-scale datasets for a specific task. However, TDV requires retraining when
the image or task changes, limiting its versatility. In this paper, we propose
a neural gradient regularizer (NGR) that expresses the gradient map as the
output of a neural network. Unlike existing methods, NGR does not rely on the
sparsity assumption, thereby avoiding the underestimation of gradient maps. NGR
is applicable to various image types and different image processing tasks,
functioning in a zero-shot learning fashion, making it a versatile and
plug-and-play regularizer. Extensive experimental results demonstrate the
superior performance of NGR over state-of-the-art counterparts for a range of
different tasks, further validating its effectiveness and versatility.
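The sparsity bias that NGR is designed to avoid is easy to see in the classical regularizer itself. Anisotropic TV is simply the l1 norm of horizontal and vertical finite differences, so it is minimized by driving most gradients to exactly zero, which is what weakens edges and fine details:

```python
import numpy as np

def total_variation(img):
    """Anisotropic total variation: the l1 norm of horizontal and
    vertical finite differences of the image."""
    dh = np.abs(np.diff(img, axis=1)).sum()  # horizontal gradients
    dv = np.abs(np.diff(img, axis=0)).sum()  # vertical gradients
    return dh + dv
```

A constant image has zero TV, while any edge contributes its full height to the penalty, so TV-regularized solutions systematically shrink gradient magnitudes.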
Motion deblurring of faces
Face analysis is a core part of computer vision, in which remarkable progress
has been observed in the past decades. Current methods achieve recognition and
tracking with invariance to fundamental modes of variation such as
illumination, 3D pose, expressions. Notwithstanding, a much less standing mode
of variation is motion deblurring, which however presents substantial
challenges in face analysis. Recent approaches either make oversimplifying
assumptions, e.g. in cases of joint optimization with other tasks, or fail to
preserve the highly structured shape/identity information. Therefore, we
propose a data-driven method that encourages identity preservation. The
proposed model includes two parallel streams (sub-networks): the first deblurs
the image, the second implicitly extracts and projects the identity of both the
sharp and the blurred image in similar subspaces. We devise a method for
creating realistic motion blur by averaging a variable number of frames to
train our model. The averaged images originate from the 2MF2 dataset of 10
million facial frames, which we introduce for this task. Considering deblurring
as an intermediate step, we utilize the deblurred outputs to conduct a thorough
experimentation on high-level face analysis tasks, i.e. landmark localization
and face verification. The experimental evaluation demonstrates the superiority
of our method.
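The frame-averaging strategy for synthesizing realistic motion blur can be sketched in a few lines. Here a single frame shifted by a hypothetical linear motion stands in for real consecutive video frames; the paper averages actual frames, which captures non-linear motion as well.

```python
import numpy as np

def synthesize_motion_blur(frame, n_frames=5, shift=1):
    """Synthesize motion blur by averaging shifted copies of one frame,
    mimicking the averaging of consecutive video frames. `shift` pixels
    of horizontal motion per step is an illustrative assumption."""
    stack = [np.roll(frame, i * shift, axis=1) for i in range(n_frames)]
    return np.mean(stack, axis=0)
```

Averaging preserves total intensity while spreading each pixel's energy along the motion path, which is exactly the blur a long exposure would produce.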