Wide & deep learning for spatial & intensity adaptive image restoration
Most existing deep learning-based image restoration methods aim to
remove degradation with a uniform spatial distribution and constant intensity,
making insufficient use of prior knowledge about the degradation. Here we guide
deep neural networks to suppress complex image degradation whose intensity
varies spatially, by utilizing prior knowledge extracted from the degraded images.
Specifically, we propose an efficient multi-frame image
restoration network (DparNet) with a wide & deep architecture, which integrates
degraded images and prior knowledge of the degradation to reconstruct images with
ideal clarity and stability. The degradation prior is learned directly from the
degraded images in the form of a key degradation parameter matrix, with no
requirement for any off-site knowledge. The wide & deep architecture of DparNet
enables the learned parameters to directly modulate the final restoration
result, boosting spatial & intensity adaptive image restoration. We
demonstrate the proposed method on two representative image restoration
applications: image denoising and suppression of atmospheric turbulence effects
in images. Two large datasets, containing 109,536 and 49,744 images
respectively, were constructed to support our experiments. The experimental
results show that our DparNet significantly outperforms SoTA methods in both
restoration performance and network efficiency. More importantly, by utilizing
the learned degradation parameters via wide & deep learning, we improve the
PSNR of image restoration by 0.6–1.1 dB with less than a 2% increase in model
parameters and computational complexity. Our work suggests that degraded
images may hide key information about the degradation process, which can be
utilized to boost spatial & intensity adaptive image restoration.
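The wide & deep modulation described in the abstract can be sketched as follows. Everything below is a hypothetical stand-in for the paper's actual network layers: a "deep" branch restores the degraded image, while a shallow "wide" branch turns the learned degradation parameter matrix into a per-pixel modulation of the final result.

```python
# Hedged sketch of wide & deep modulation. The branch functions are
# illustrative placeholders, not DparNet's real layers.

def deep_branch(image):
    # Placeholder "restoration": pretend the network halves pixel values.
    return [[0.5 * px for px in row] for row in image]

def wide_branch(param_matrix):
    # Placeholder mapping from degradation intensity to a correction term.
    return [[0.5 * p for p in row] for row in param_matrix]

def dparnet_like(image, param_matrix):
    restored = deep_branch(image)
    modulation = wide_branch(param_matrix)
    # The wide branch directly modulates the final result, pixel by pixel,
    # so the correction adapts to spatially varying degradation intensity.
    return [[r + m for r, m in zip(rr, mr)]
            for rr, mr in zip(restored, modulation)]

degraded = [[1.0, 2.0], [3.0, 4.0]]
params = [[0.0, 1.0], [2.0, 3.0]]   # spatially varying degradation intensity
out = dparnet_like(degraded, params)
```

Because the parameter matrix enters through its own shallow branch, varying the degradation intensity changes the output without retraining the deep branch.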
Robust Machine Learning In Computer Vision
Deep neural networks have been shown to be successful in various computer vision tasks such as image classification and object detection. Although deep neural networks have exceeded human performance in many tasks, robustness and reliability remain concerns when deploying deep learning models. On the one hand, degraded images and videos hurt the performance of computer vision tasks. On the other hand, under adversarial attacks, deep neural networks can break down completely. Motivated by this vulnerability, I analyze and develop image restoration and adversarial defense algorithms toward a vision of robust machine learning in computer vision.
In this dissertation, I study two types of degradation that make deep neural networks vulnerable. The first part of the dissertation focuses on face recognition at long range, whose performance is severely degraded by atmospheric turbulence. The theme is improving the performance and robustness of various tasks in face recognition systems, such as facial keypoint localization, feature extraction, and image restoration. The second part focuses on defending against adversarial attacks in the image classification task. The theme is exploring adversarial defense methods that achieve good standard accuracy, robustness to adversarial attacks with known threat models, and good generalization to other, unseen attacks.
Non-blind Image Restoration Based on Convolutional Neural Network
Blind image restoration processors based on convolutional neural networks
(CNNs) are intensively researched because of their high performance. However,
they are highly sensitive to perturbations of the degradation model: they
easily fail to restore an image whose degradation model differs slightly
from the one they were trained on. In this paper, we propose a non-blind
CNN-based image restoration processor that aims to be robust against
perturbations of the degradation model compared to blind restoration
processors. Experimental comparisons demonstrate that the proposed non-blind
CNN-based image restoration processor restores images more robustly than
existing blind CNN-based image restoration processors.

Comment: Accepted by IEEE 7th Global Conference on Consumer Electronics, 201
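The blind vs. non-blind distinction the abstract draws can be illustrated with a toy degradation model (a per-pixel gain). The degradation and both "restorers" below are made-up stand-ins, not the paper's processors: the point is only that a non-blind restorer is told the degradation model and inverts it, while a blind one implicitly assumes the model it was trained on.

```python
# Hedged toy illustration of blind vs. non-blind restoration.
# "Degradation" here is just scaling each pixel by a gain.

def degrade(image, gain):
    return [px * gain for px in image]

def blind_restore(image):
    # A blind restorer bakes in the gain it was trained on (0.5 here),
    # so it breaks when the true degradation is perturbed.
    return [px / 0.5 for px in image]

def non_blind_restore(image, gain):
    # A non-blind restorer receives the degradation model and inverts it.
    return [px / gain for px in image]

clean = [1.0, 2.0, 3.0]
perturbed = degrade(clean, 0.4)     # gain perturbed from 0.5 to 0.4

blind_out = blind_restore(perturbed)               # wrong assumed gain
non_blind_out = non_blind_restore(perturbed, 0.4)  # recovers the input
```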
Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos
Despite rapid advances in face recognition, there remains a clear gap between
the performance of still image-based face recognition and video-based face
recognition, due to the vast difference in visual quality between the domains
and the difficulty of curating diverse large-scale video datasets. This paper
addresses both of those challenges, through an image to video feature-level
domain adaptation approach, to learn discriminative video frame
representations. The framework utilizes large-scale unlabeled video data to
reduce the gap between different domains while transferring discriminative
knowledge from large-scale labeled still images. Given a face recognition
network that is pretrained in the image domain, the adaptation is achieved by
(i) distilling knowledge from the network to a video adaptation network through
feature matching, (ii) performing feature restoration through synthetic data
augmentation and (iii) learning a domain-invariant feature through a domain
adversarial discriminator. We further improve performance through a
discriminator-guided feature fusion that boosts high-quality frames while
eliminating those degraded by video domain-specific factors. Experiments on the
YouTube Faces and IJB-A datasets demonstrate that each module contributes to
our feature-level domain adaptation framework and substantially improves video
face recognition performance to achieve state-of-the-art accuracy. We
demonstrate qualitatively that the network learns to suppress diverse artifacts
in videos such as pose, illumination or occlusion without being explicitly
trained for them.

Comment: Accepted for publication at International Conference on Computer
Vision (ICCV) 201
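The discriminator-guided feature fusion described above can be sketched as a quality-weighted average of per-frame features. The softmax weighting and the example scores below are assumptions for illustration; the abstract does not specify the exact fusion rule.

```python
import math

# Hedged sketch: fuse per-frame face features into one video-level
# feature, weighting each frame by a discriminator-derived quality
# score so that degraded frames contribute almost nothing.

def fuse_features(frame_features, quality_scores):
    # Softmax over quality scores -> per-frame fusion weights.
    exps = [math.exp(s) for s in quality_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_features))
            for d in range(dim)]

features = [[1.0, 0.0], [0.0, 1.0]]  # two frames, 2-D features
scores = [5.0, -5.0]                 # frame 0 judged high quality
fused = fuse_features(features, scores)
```

With these scores, the fused feature is dominated by the high-quality frame, which matches the abstract's goal of suppressing frames degraded by video-specific factors.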
DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
This paper presents a novel iterative deep learning framework and applies it
to document enhancement and binarization. Unlike traditional methods, which
predict the binary label of each pixel of the input image, we train the neural
network to learn the degradations in document images and produce uniform
versions of the degraded inputs, which allows the network to refine its
output iteratively. Two different iterative methods are studied in this
paper: recurrent refinement (RR), which uses the same trained neural network in
each iteration for document enhancement, and stacked refinement (SR), which uses
a stack of different neural networks for iterative output refinement. Given the
learned uniform and enhanced image, the binarization map can easily be obtained
by a global or local threshold. Experimental results on several public
benchmark datasets show that our proposed methods produce a new, clean version
of the degraded image that is suitable for visualization, and yield promising
binarization results using the global Otsu threshold on the enhanced images
learned iteratively by the neural network.

Comment: Accepted by Pattern Recognition
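The global Otsu threshold used for the final binarization step is a standard algorithm and can be sketched in plain Python: it picks the threshold that maximizes the between-class variance of the grayscale histogram (a minimal version over 8-bit values; a real implementation would operate on the enhanced image array).

```python
# Minimal Otsu's method: choose the threshold that maximizes the
# between-class variance of the grayscale histogram.

def otsu_threshold(pixels):
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0.0
    for t in range(256):
        weight_bg += hist[t]          # background = values <= t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated intensity clusters: the threshold falls between them,
# splitting dark text pixels from the bright background.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
```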