Deep Algorithm Unrolling for Blind Image Deblurring
Blind image deblurring remains a topic of enduring interest. Learning-based
approaches, especially those that employ neural networks, have emerged to
complement traditional model-based methods and in many cases achieve vastly
enhanced performance. That said, neural network approaches are generally
designed empirically, and their underlying structures are difficult to interpret.
In recent years, a promising technique called algorithm unrolling has been
developed that has helped connect iterative algorithms such as those for sparse
coding to neural network architectures. However, such connections have not been
made yet for blind image deblurring. In this paper, we propose a neural network
architecture based on this idea. We first present an iterative algorithm that
may be considered as a generalization of the traditional total-variation
regularization method in the gradient domain. We then unroll the algorithm to
construct a neural network for image deblurring which we refer to as Deep
Unrolling for Blind Deblurring (DUBLID). Key algorithm parameters are learned
with the help of training images. Our proposed deep network DUBLID achieves
significant practical performance gains while enjoying interpretability at the
same time. Extensive experimental results show that DUBLID outperforms many
state-of-the-art methods while also being computationally faster.
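To make the unrolling idea concrete, the following is a minimal, hypothetical
sketch in PyTorch (not the authors' DUBLID code): an ISTA-style iteration for
sparsity-regularized deconvolution in the gradient domain is unrolled for a
fixed number of iterations, and the per-iteration step sizes, thresholds, and
filters become parameters trained from example images. The names UnrolledISTA,
blur, and blur_adj are illustrative.

    import torch
    import torch.nn as nn

    def soft_threshold(x, lam):
        # Proximal operator of the l1 norm: the sparsity-promoting step.
        return torch.sign(x) * torch.clamp(x.abs() - lam, min=0.0)

    class UnrolledISTA(nn.Module):
        """Each 'layer' is one ISTA iteration with its own learned parameters."""
        def __init__(self, num_iters=10, channels=2):
            super().__init__()
            # channels=2: horizontal and vertical image gradients.
            self.steps = nn.Parameter(torch.full((num_iters,), 0.1))
            self.thresholds = nn.Parameter(torch.full((num_iters,), 0.01))
            # Learned stand-ins for the (unknown) blur operator and its adjoint.
            self.blur = nn.Conv2d(channels, channels, 5, padding=2, bias=False)
            self.blur_adj = nn.Conv2d(channels, channels, 5, padding=2, bias=False)

        def forward(self, y):
            # y: blurred image gradients, shape (N, 2, H, W).
            x = y.clone()
            for k in range(len(self.steps)):
                grad = self.blur_adj(self.blur(x) - y)  # data-fidelity gradient
                x = soft_threshold(x - self.steps[k] * grad, self.thresholds[k])
            return x

Because every operation above is differentiable, the whole unrolled network can
be trained end to end, which is what makes each learned layer interpretable as
one algorithm iteration.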
FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation
In this paper, we propose a state-of-the-art video denoising algorithm based
on a convolutional neural network architecture. Until recently, video denoising
with neural networks had been a largely underexplored domain, and existing
methods could not compete with the performance of the best patch-based methods.
The approach we introduce in this paper, called FastDVDnet, shows similar or
better performance than other state-of-the-art competitors with significantly
lower computing times. In contrast to other existing neural network denoisers,
our algorithm exhibits several desirable properties, such as fast runtimes and
the ability to handle a wide range of noise levels with a single network model.
The characteristics of its architecture make it possible to avoid using a
costly motion compensation stage while achieving excellent performance. The
combination of its denoising performance and lower computational load
makes this algorithm attractive for practical denoising applications. We
compare our method with different state-of-the-art algorithms, both visually
and with respect to objective quality metrics.
Comment: Code for this algorithm and its results can be found at
https://github.com/m-tassano/fastdvdnet.
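As a rough illustration of how a cascaded design can avoid explicit motion
compensation, here is a hedged sketch; the block internals are simplified
stand-ins for the U-Net-style blocks the paper describes, and all names are
illustrative. A shared first-stage block denoises overlapping frame triplets,
and a second stage fuses the three intermediate estimates, so temporal
information is used without any optical-flow step.

    import torch
    import torch.nn as nn

    class DenoiseBlock(nn.Module):
        """Maps three RGB frames plus a noise map to one denoised frame."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3 * 3 + 1, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 3, 3, padding=1),
            )

        def forward(self, f0, f1, f2, noise_map):
            return self.net(torch.cat([f0, f1, f2, noise_map], dim=1))

    class CascadedVideoDenoiser(nn.Module):
        def __init__(self):
            super().__init__()
            self.stage1 = DenoiseBlock()  # shared across the three triplets
            self.stage2 = DenoiseBlock()

        def forward(self, frames, noise_map):
            # frames: five consecutive noisy frames, each of shape (N, 3, H, W).
            d0 = self.stage1(frames[0], frames[1], frames[2], noise_map)
            d1 = self.stage1(frames[1], frames[2], frames[3], noise_map)
            d2 = self.stage1(frames[2], frames[3], frames[4], noise_map)
            # The second stage fuses the intermediate estimates; no flow needed.
            return self.stage2(d0, d1, d2, noise_map)

Feeding a noise map to every block is also how a single model can cover a wide
range of noise levels.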
Face Recognition in Low Quality Images: A Survey
Low-resolution face recognition (LRFR) has received increasing attention over
the past few years. It has wide applications in real-world environments where
high-resolution or high-quality images are hard to capture. One of the
biggest demands for LRFR technologies is video surveillance. As the number
of surveillance cameras in cities increases, the captured videos will need
to be processed automatically. However, those videos or images are usually
captured with large standoffs, arbitrary illumination conditions, and diverse
angles of view. Faces in these images are generally small in size. Several
studies addressing this problem have employed techniques such as
super-resolution, deblurring, or learning a relationship between different
resolution domains. In this paper, we provide a comprehensive review of
approaches to low-resolution face recognition over the past five years. First,
a general problem definition is given. Then, a systematic analysis of the works
on this topic is presented by category. In addition to describing the methods,
we also focus on datasets and experiment settings. We further address related
work on unconstrained low-resolution face recognition and compare it with
results that use synthetic low-resolution data. Finally, we summarize the
general limitations and speculate on priorities for future efforts.
Comment: There are some mistakes in this paper that may mislead the reader, and
we will not have a new version in the short term. We will resubmit once it is
corrected.
Real-world Underwater Enhancement: Challenges, Benchmarks, and Solutions
Underwater image enhancement is such an important low-level vision task with
many applications that numerous algorithms have been proposed in recent years.
These algorithms, developed under various assumptions, demonstrate success in
different respects on different data sets and metrics. In this work, we set up
an undersea image-capturing system and construct a large-scale
Real-world Underwater Image Enhancement (RUIE) data set divided into three
subsets. The three subsets target three challenging aspects of enhancement,
i.e., image visibility quality, color casts, and higher-level
detection/classification, respectively. We conduct extensive and systematic
experiments on RUIE to evaluate the effectiveness and limitations of various
algorithms to enhance visibility and correct color casts on images with
hierarchical categories of degradation. Moreover, underwater image enhancement
in practice usually serves as a preprocessing step for mid-level and high-level
vision tasks. We thus exploit the object detection performance on enhanced
images as a brand new task-specific evaluation criterion. The findings from
these evaluations not only confirm what is commonly believed, but also suggest
promising solutions and new directions for visibility enhancement, color
correction, and object detection on real-world underwater images.
Connecting Image Denoising and High-Level Vision Tasks via Deep Learning
Image denoising and high-level vision tasks are usually handled independently
in the conventional practice of computer vision, and their connection is
fragile. In this paper, we cope with the two jointly and explore the mutual
influence between them with the focus on two questions, namely (1) how image
denoising can help improve high-level vision tasks, and (2) how the semantic
information from high-level vision tasks can be used to guide image denoising.
First, for image denoising, we propose a convolutional neural network in which
convolutions are conducted at various spatial resolutions via downsampling and
upsampling operations in order to fuse and exploit contextual information at
different scales. Second, we propose a deep neural network solution that
cascades two modules for image denoising and various high-level tasks,
respectively, and uses the joint loss for updating only the denoising network
via back-propagation. We experimentally show that, on the one hand, the
proposed denoiser has the generality to overcome the performance degradation of
different high-level vision tasks. On the other hand, with the guidance of
high-level vision information, the denoising network produces more visually
appealing results. Extensive experiments demonstrate the benefit of exploiting
image semantics simultaneously for image denoising and high-level vision tasks
via deep learning. The code is available online:
https://github.com/Ding-Liu/DeepDenoising
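A minimal sketch of the cascaded training scheme described above, with
hypothetical names (denoiser, task_net, alpha): the joint loss mixes a
reconstruction term and a high-level task term, while gradients update only
the denoiser and the high-level module stays frozen.

    import torch
    import torch.nn as nn

    def joint_training_step(denoiser, task_net, optimizer, noisy, clean, labels,
                            alpha=1.0):
        # Freeze the high-level module; only the denoiser is updated.
        task_net.eval()
        for p in task_net.parameters():
            p.requires_grad_(False)

        denoised = denoiser(noisy)
        recon_loss = nn.functional.mse_loss(denoised, clean)
        task_loss = nn.functional.cross_entropy(task_net(denoised), labels)
        loss = recon_loss + alpha * task_loss  # alpha is an illustrative weight

        optimizer.zero_grad()
        loss.backward()  # gradients flow through task_net into the denoiser only
        optimizer.step()
        return loss.item()

The task term steers the denoiser toward outputs the fixed high-level network
can still interpret, which is the mechanism behind the reported mutual benefit.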
Blind Motion Deblurring with Cycle Generative Adversarial Networks
Blind motion deblurring is one of the most basic and challenging problems in
image processing and computer vision. It aims to recover a sharp image from its
blurred version knowing nothing about the blur process. Many existing methods
use Maximum A Posteriori (MAP) or Expectation Maximization (EM) frameworks to
deal with this kind of problem, but they cannot handle the high-frequency
features of natural images well. Most recently, deep neural networks have been
emerging as a powerful tool for image deblurring. In this paper, we show that
an encoder-decoder architecture gives better results for image deblurring
tasks. In addition, we propose a novel end-to-end learning model that refines a
generative adversarial network with several novel training strategies to tackle
the deblurring problem. Experimental results show that our model captures
high-frequency features well, and results on benchmark datasets show that the
proposed model achieves competitive performance.
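For concreteness, here is a minimal encoder-decoder sketch of the kind of
generator the abstract refers to; the layer sizes are illustrative, not the
paper's configuration. The encoder compresses the blurred input into a compact
representation and the decoder upsamples it back to a sharp estimate; in a GAN
setting this network would be trained against a separate discriminator.

    import torch.nn as nn

    class EncoderDecoderGenerator(nn.Module):
        def __init__(self):
            super().__init__()
            # Encoder: strided convolutions downsample the blurred image.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            )
            # Decoder: transposed convolutions upsample back to image size.
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
                nn.Tanh(),
            )

        def forward(self, blurred):
            return self.decoder(self.encoder(blurred))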
Segmentation-Aware Image Denoising without Knowing True Segmentation
Several recent works discussed application-driven image restoration neural
networks, which are capable of not only removing noise in images but also
preserving their semantic-aware details, making them suitable for various
high-level computer vision tasks as a pre-processing step. However, such
approaches require extra annotations for their high-level vision tasks, in
order to train the joint pipeline using hybrid losses. Yet the availability of
those annotations is often limited to a few image sets, potentially restricting
the general applicability of these methods to unseen and unannotated
images. Motivated by that, we propose a segmentation-aware
image denoising model dubbed U-SAID, based on a novel unsupervised approach
with a pixel-wise uncertainty loss. U-SAID does not need any ground-truth
segmentation map, and thus can be applied to any image dataset. It generates
denoised images with comparable or even better quality, and the denoised
results show stronger robustness for subsequent semantic segmentation tasks,
when compared to either its supervised counterpart or classical
"application-agnostic" denoisers. Moreover, we demonstrate the superior
generalizability of U-SAID in three ways, by plugging in its "universal" denoiser
without fine-tuning: (1) denoising unseen types of images; (2) denoising as
pre-processing for segmenting unseen noisy images; and (3) denoising for unseen
high-level tasks. Extensive experiments demonstrate the effectiveness,
robustness and generalizability of the proposed U-SAID over various popular
image sets.
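The abstract does not spell out the pixel-wise uncertainty loss, so the
following is only one plausible reading, not the authors' exact definition:
take the per-pixel entropy of a segmentation module's softmax output on the
denoised image as the "uncertainty" and penalize it, which requires no
ground-truth masks.

    import torch
    import torch.nn.functional as F

    def pixelwise_uncertainty_loss(seg_logits):
        # seg_logits: (N, num_classes, H, W), produced by a segmentation head
        # applied to the denoised image; no ground-truth segmentation is used.
        probs = F.softmax(seg_logits, dim=1)
        entropy = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1)
        return entropy.mean()

    # Assumed overall objective (beta is a hypothetical weighting factor):
    #   loss = mse(denoised, clean) + beta * pixelwise_uncertainty_loss(seg(denoised))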
Bridging the Gap Between Computational Photography and Visual Recognition
What is the current state-of-the-art for image restoration and enhancement
applied to degraded images acquired under less than ideal circumstances? Can
the application of such algorithms as a pre-processing step improve image
interpretability for manual analysis or automatic visual recognition of scene
content? While there have been important advances in the area of
computational photography to restore or enhance the visual quality of an image,
the capabilities of such techniques have not always translated in a useful way
to visual recognition tasks. Consequently, there is a pressing need for the
development of algorithms that are designed for the joint problem of improving
visual appearance and recognition, which will be an enabling factor for the
deployment of visual recognition tools in many real-world scenarios. To address
this, we introduce the UG^2 dataset as a large-scale benchmark composed of
video imagery captured under challenging conditions, and two enhancement tasks
designed to test algorithmic impact on visual quality and automatic object
recognition. Furthermore, we propose a set of metrics to evaluate the joint
improvement of such tasks as well as individual algorithmic advances, including
a novel psychophysics-based evaluation regime for human assessment and a
realistic set of quantitative measures for object recognition performance. We
introduce six new algorithms for image restoration or enhancement, which were
created as part of the IARPA sponsored UG^2 Challenge workshop held at CVPR
2018. Under the proposed evaluation regime, we present an in-depth analysis of
these algorithms and a host of deep learning-based and classic baseline
approaches. From the observed results, it is evident that we are in the early
days of building a bridge between computational photography and visual
recognition, leaving many opportunities for innovation in this area.
Comment: CVPR Prize Challenge: http://www.ug2challenge.or
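To illustrate what a joint, task-driven evaluation of this kind might look
like, here is a hedged sketch with entirely illustrative names (enhance,
recognize, quality_metric); it is not the UG^2 protocol itself. An enhancement
algorithm is scored both by a visual-quality measure and by the change it
induces in a fixed recognizer's accuracy.

    def joint_evaluation(frames, labels, enhance, recognize, quality_metric):
        # Accuracy of the fixed recognizer on the raw, degraded frames.
        raw_acc = sum(recognize(f) == y for f, y in zip(frames, labels)) / len(frames)
        # Accuracy and visual quality after enhancement.
        enhanced = [enhance(f) for f in frames]
        enh_acc = sum(recognize(f) == y for f, y in zip(enhanced, labels)) / len(frames)
        quality = sum(quality_metric(f) for f in enhanced) / len(enhanced)
        return {"recognition_gain": enh_acc - raw_acc, "visual_quality": quality}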
Noiseprint: a CNN-based camera model fingerprint
Forensic analyses of digital images rely heavily on the traces of in-camera
and out-camera processes left on the acquired images. Such traces represent a
sort of camera fingerprint. If one is able to recover them by suppressing the
high-level scene content and other disturbances, a number of forensic tasks can
be easily accomplished. A notable example is the PRNU pattern, which can be
regarded as a device fingerprint, and has received great attention in
multimedia forensics. In this paper we propose a method to extract a camera
model fingerprint, called noiseprint, where the scene content is largely
suppressed and model-related artifacts are enhanced. This is obtained by means
of a Siamese network, which is trained with pairs of image patches coming from
the same (label +1) or different (label -1) cameras. Although noiseprints can
be used for a large variety of forensic tasks, here we focus on image forgery
localization. Experiments on several datasets widespread in the forensic
community show noiseprint-based methods to provide state-of-the-art
performance.
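To make the training signal concrete, here is a generic contrastive
formulation consistent with the +1/-1 pair labels described above; the
authors' exact loss may differ. Residuals extracted by the shared Siamese CNN
from same-model patches are pulled together, while residuals from different
models are pushed at least a margin apart.

    import torch
    import torch.nn.functional as F

    def siamese_contrastive_loss(residual_a, residual_b, label, margin=1.0):
        # residual_a, residual_b: noiseprint-style residuals from the shared CNN,
        # shape (N, C, H, W); label: +1 for same camera model, -1 otherwise.
        d = F.pairwise_distance(residual_a.flatten(1), residual_b.flatten(1))
        same = (label > 0).float()
        loss = same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)
        return loss.mean()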
A Deep Journey into Super-resolution: A survey
Deep convolutional network-based super-resolution is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation performed shows consistent and rapid growth
in accuracy over the past few years, along with a corresponding boost
in model complexity and the availability of large-scale datasets. It is also
observed that the pioneering methods identified as the benchmark have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Surveys