When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey
With the widespread application of artificial intelligence (AI), the
perception, understanding, decision-making, and control capabilities of
autonomous systems have improved significantly in recent years. When
autonomous systems must deliver both accuracy and transferability, several AI
methods, such as adversarial learning, reinforcement learning (RL), and
meta-learning, show strong performance. Here, we review the learning-based
approaches in autonomous systems from the perspectives of accuracy and
transferability. Accuracy means that a well-trained model performs well during
the testing phase, where the test set shares the same task or data
distribution as the training set. Transferability means that a well-trained
model retains good accuracy when transferred to other testing domains.
Firstly, we introduce some basic concepts of transfer learning and then
present some preliminaries of adversarial learning, RL, and meta-learning.
Secondly, we review accuracy, transferability, or both to show the advantages
of adversarial learning, such as generative adversarial networks (GANs), in
typical computer vision tasks in autonomous systems, including image style
transfer, image super-resolution, image deblurring/dehazing/rain removal,
semantic segmentation, depth estimation, pedestrian detection, and person
re-identification (re-ID). Then, we further review the performance of RL and
meta-learning in terms of accuracy, transferability, or both in autonomous
systems, covering pedestrian tracking, robot navigation, and robotic
manipulation. Finally, we discuss several challenges and future topics for
using adversarial learning, RL, and meta-learning in autonomous systems.
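To make the accuracy/transferability distinction concrete, the following minimal PyTorch sketch (the model and data loaders are hypothetical placeholders, not from the survey) evaluates one trained model on an in-distribution test set and on a shifted target domain:

```python
import torch

def evaluate(model, loader, device="cpu"):
    """Top-1 accuracy of a trained classifier on one test loader."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.numel()
    return correct / total

# Accuracy: the test set shares the task/data distribution of training.
# acc_source = evaluate(model, source_test_loader)
# Transferability: the same weights evaluated on a shifted domain.
# acc_target = evaluate(model, target_test_loader)
# A small gap (acc_source - acc_target) indicates good transferability.
```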
Learning Dual Convolutional Neural Networks for Low-Level Vision
In this paper, we propose a general dual convolutional neural network
(DualCNN) for low-level vision problems, e.g., super-resolution,
edge-preserving filtering, deraining and dehazing. These problems usually
involve the estimation of two components of the target signals: structures and
details. Motivated by this, our proposed DualCNN consists of two parallel
branches, which respectively recovers the structures and details in an
end-to-end manner. The recovered structures and details can generate the target
signals according to the formation model for each particular application. The
DualCNN is a flexible framework for low-level vision tasks and can be easily
incorporated into existing CNNs. Experimental results show that the DualCNN can
be effectively applied to numerous low-level vision tasks with favorable
performance against the state-of-the-art methods.
Comment: CVPR 201
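A minimal sketch of the two-branch idea in PyTorch (layer widths, depths, and the additive formation model are illustrative assumptions; the paper instantiates the formation model per application):

```python
import torch
import torch.nn as nn

class DualCNN(nn.Module):
    """Two parallel branches recover structures and details; an
    application-specific formation model combines them (addition here)."""
    def __init__(self, channels=3, width=64, detail_depth=8):
        super().__init__()
        # Shallow branch for the smooth structure component.
        self.structure = nn.Sequential(
            nn.Conv2d(channels, width, 9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 1), nn.ReLU(inplace=True),
            nn.Conv2d(width, channels, 5, padding=2))
        # Deeper branch for the high-frequency details.
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(detail_depth):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, channels, 3, padding=1)]
        self.detail = nn.Sequential(*layers)

    def forward(self, x):
        # Formation model: additive, as in super-resolution; other tasks
        # would combine the two components differently.
        return self.structure(x) + self.detail(x)

out = DualCNN()(torch.randn(1, 3, 64, 64))  # -> (1, 3, 64, 64)
```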
UG^2 Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments
The UG^2 challenge at IEEE CVPR 2019 aims to evoke a comprehensive
discussion and exploration of how low-level vision techniques can benefit
high-level automatic visual recognition in various scenarios. In its second
track, we focus on object and face detection in poor-visibility environments
caused by bad weather (haze, rain) and low-light conditions. While existing
enhancement methods are empirically expected to help the high-level end task,
this is observed to not always be the case in practice. To provide a more
thorough examination and fair comparison, we introduce three benchmark sets
collected in real-world hazy, rainy, and low-light conditions, respectively,
with objects/faces annotated. To the best of our knowledge, this is the first
and currently largest effort of its kind. Baseline results obtained by cascading
existing enhancement and detection models are reported, indicating the highly
challenging nature of our new data as well as the large room for further
technical innovations. We expect a large participation from the broad research
community to address these challenges together.
Comment: A summary paper on datasets, fact sheets, baseline results, challenge results, and winning methods in the UG^2 Challenge (Track 2). More materials are provided at http://www.ug2challenge.org/index.htm
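The cascaded baselines amount to a simple two-stage pipeline; a hedged sketch in PyTorch, where `enhancer` and `detector` stand in for the off-the-shelf models the report actually evaluates:

```python
import torch

def cascade_baseline(image, enhancer, detector):
    """Baseline scheme from the report: run an off-the-shelf enhancement
    model, then feed its output to an off-the-shelf detector."""
    with torch.no_grad():
        enhanced = enhancer(image)       # e.g., a dehazing or low-light model
        detections = detector(enhanced)  # e.g., a face/object detector
    return enhanced, detections

# The benchmark's observation: detections on `enhanced` are not always
# better than on the raw degraded `image`, hence the room for innovation.
```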
Deep joint rain and haze removal from single images
Rain removal from a single image is a challenge which has been studied for a
long time. In this paper, a novel convolutional neural network based on wavelet
and dark channel is proposed. On the one hand, we observe that rain streaks
correspond to the high-frequency components of the image. Therefore, the Haar
wavelet transform is a good choice to separate the rain streaks from the
background to some extent. More specifically, the LL subband of a rainy image
is more inclined to express the background information, while the LH, HL, and
HH subbands tend to represent the rain streaks and the edges. On the other
hand, the accumulation of rain streaks over long distances makes the rainy
image resemble a haze veil. We extract the dark channel of the rainy image as
a feature map in the network. By learning this mapping between the dark
channels of the input and output images, we achieve haze removal in an
indirect way. All of the parameters are optimized by back-propagation.
Experiments on both synthetic and real-world datasets reveal that our method
outperforms other state-of-the-art methods from a qualitative and quantitative
perspective.
Comment: 6 pages
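Both hand-crafted inputs are easy to reproduce; a sketch using PyWavelets and SciPy (the 15-pixel dark-channel patch size is an assumption):

```python
import numpy as np
import pywt
from scipy.ndimage import minimum_filter

def haar_subbands(gray):
    """Single-level Haar DWT: LL carries mostly background; LH, HL, and HH
    carry the rain streaks and edges."""
    LL, (LH, HL, HH) = pywt.dwt2(gray, "haar")
    return LL, LH, HL, HH

def dark_channel(rgb, patch=15):
    """Per-pixel channel minimum followed by a local minimum filter; used
    as an extra feature map that correlates with the haze veil."""
    return minimum_filter(rgb.min(axis=2), size=patch)

rainy = np.random.rand(128, 128, 3)           # stand-in for a rainy image
subbands = haar_subbands(rainy.mean(axis=2))  # four 64x64 subbands
dc = dark_channel(rainy)                      # (128, 128) feature map
```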
Rain O'er Me: Synthesizing real rain to derain with data distillation
We present a supervised technique for learning to remove rain from images
without using synthetic rain software. The method is based on a two-stage data
distillation approach: 1) A rainy image is first paired with a coarsely
derained version using a simple filtering technique ("rain-to-clean"). 2)
Then a clean image is randomly matched with the rainy soft-labeled pair.
Through a shared deep neural network, the rain that is removed from the first
image is then added to the clean image to generate a second pair
("clean-to-rain"). The neural network simultaneously learns to map both images
such that high resolution structure in the clean images can inform the
deraining of the rainy images. Demonstrations show that this approach can
address those visual characteristics of rain not easily synthesized by software
in the usual way.
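A hedged sketch of the two-stage pairing (the Gaussian filter stands in for the paper's simple filtering step, and the function name is mine):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_training_pairs(rainy, clean, sigma=2.0):
    """Stage 1 ("rain-to-clean"): pair a rainy image with a coarsely
    derained version from a simple filter. Stage 2 ("clean-to-rain"): add
    the extracted rain layer to a randomly matched clean image, yielding a
    second, synthetic pair built from real rain."""
    coarse = gaussian_filter(rainy, sigma=(sigma, sigma, 0))  # crude derain
    rain_layer = rainy - coarse                # the rain that was removed
    pseudo_rainy = np.clip(clean + rain_layer, 0.0, 1.0)
    return (rainy, coarse), (pseudo_rainy, clean)

rainy = np.random.rand(64, 64, 3)   # placeholders for real images in [0, 1]
clean = np.random.rand(64, 64, 3)
pair1, pair2 = make_training_pairs(rainy, clean)
```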
Bridging the Gap Between Computational Photography and Visual Recognition
What is the current state-of-the-art for image restoration and enhancement
applied to degraded images acquired under less-than-ideal circumstances? Can
the application of such algorithms as a pre-processing step improve image
interpretability for manual analysis or automatic visual recognition of scene
content? While there have been important advances in the area of
computational photography to restore or enhance the visual quality of an image,
the capabilities of such techniques have not always translated in a useful way
to visual recognition tasks. Consequently, there is a pressing need for the
development of algorithms that are designed for the joint problem of improving
visual appearance and recognition, which will be an enabling factor for the
deployment of visual recognition tools in many real-world scenarios. To address
this, we introduce the UG^2 dataset as a large-scale benchmark composed of
video imagery captured under challenging conditions, and two enhancement tasks
designed to test algorithmic impact on visual quality and automatic object
recognition. Furthermore, we propose a set of metrics to evaluate the joint
improvement of such tasks as well as individual algorithmic advances, including
a novel psychophysics-based evaluation regime for human assessment and a
realistic set of quantitative measures for object recognition performance. We
introduce six new algorithms for image restoration or enhancement, which were
created as part of the IARPA sponsored UG^2 Challenge workshop held at CVPR
2018. Under the proposed evaluation regime, we present an in-depth analysis of
these algorithms and a host of deep learning-based and classic baseline
approaches. From the observed results, it is evident that we are in the early
days of building a bridge between computational photography and visual
recognition, leaving many opportunities for innovation in this area.
Comment: CVPR Prize Challenge: http://www.ug2challenge.org
Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration
In this paper, we study the design of deep neural networks for image
restoration tasks. We propose a novel style of residual connection dubbed "dual
residual connection", which exploits the potential of paired operations, e.g.,
up- and down-sampling or convolution with large- and small-size kernels. We
design a modular block implementing this connection style; it is equipped with
two containers to which arbitrary paired operations are inserted. Adopting the
"unraveled" view of the residual networks proposed by Veit et al., we point out
that a stack of the proposed modular blocks allows the first operation in a
block interact with the second operation in any subsequent blocks. Specifying
the two operations in each of the stacked blocks, we build a complete network
for each individual task of image restoration. We experimentally evaluate the
proposed approach on five image restoration tasks using nine datasets. The
results show that the proposed networks with properly chosen paired operations
outperform previous methods on almost all of the tasks and datasets.
Comment: i) Accepted to CVPR 2019; ii) Code, trained models, and additional
results for visual comparison will be provided at
https://github.com/liu-vis/DualResidualNetwork
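A sketch of one such modular block (the channel width, the choice of large-/small-kernel convolutions as the paired operations, and the exact wiring are my assumptions; the authors' repository above is authoritative):

```python
import torch
import torch.nn as nn

class DualResidualBlock(nn.Module):
    """A modular block with two containers for paired operations and two
    interleaved residual streams, so the first operation of a block can
    interact with the second operation of subsequent blocks."""
    def __init__(self, ch=32):
        super().__init__()
        self.op1 = nn.Sequential(  # container 1: e.g., large-kernel conv
            nn.Conv2d(ch, ch, 5, padding=2), nn.ReLU(inplace=True))
        self.op2 = nn.Sequential(  # container 2: the paired small-kernel conv
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, x, res):
        skip = x
        x = self.op1(x) + res   # first residual stream
        res = x                 # exposed to op2 of later blocks
        x = self.op2(x) + skip  # second residual stream
        return x, res

x = res = torch.randn(1, 32, 64, 64)
for block in (DualResidualBlock(), DualResidualBlock()):
    x, res = block(x, res)
```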
Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning
Single image super-resolution (SR), which refers to reconstructing a
higher-resolution (HR) image from an observed low-resolution (LR) image, has
received substantial attention due to its tremendous application potential.
Despite the breakthroughs of recently proposed SR methods using convolutional
neural networks (CNNs), their generated results usually fail to preserve
structural (high-frequency) details. In this paper, regarding global boundary
context and residual context as complementary information for enhancing
structural details in image restoration, we develop a contextualized multi-task
learning framework to address the SR problem. Specifically, our method first
extracts convolutional features from the input LR image and applies one
deconvolutional module to interpolate the LR feature maps in a content-adaptive
way. Then, the resulting feature maps are fed into two branched sub-networks.
During the neural network training, one sub-network outputs salient image
boundaries and the HR image, and the other sub-network outputs the local
residual map, i.e., the residual difference between the generated HR image and
ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and
BSD200), our extensive evaluations demonstrate the effectiveness of our SR
method on achieving both higher restoration quality and computational
efficiency compared with several state-of-the-art SR approaches. The source
code and some SR results can be found at:
http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/
Comment: To appear in Transactions on Multimedia 201
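A simplified sketch of the described pipeline (channel widths, depths, and the 2x upscaling factor are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ContextualSR(nn.Module):
    """Shared LR features -> learned deconvolution upsampling -> two
    branches: one predicts the HR image plus salient boundaries, the other
    a local residual map that refines structural detail."""
    def __init__(self, ch=3, width=64, scale=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True))
        # Content-adaptive interpolation of the LR feature maps.
        self.up = nn.ConvTranspose2d(width, width, scale * 2,
                                     stride=scale, padding=scale // 2)
        self.hr_branch = nn.Conv2d(width, ch + 1, 3, padding=1)  # HR + edges
        self.res_branch = nn.Conv2d(width, ch, 3, padding=1)     # residual

    def forward(self, lr):
        f = self.up(self.features(lr))
        hr_and_edges = self.hr_branch(f)
        hr, edges = hr_and_edges[:, :-1], hr_and_edges[:, -1:]
        return hr + self.res_branch(f), edges

sr, edges = ContextualSR()(torch.randn(1, 3, 32, 32))  # sr: (1, 3, 64, 64)
```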
Rain Streak Removal for Single Image via Kernel Guided CNN
Rain streak removal is an important issue and has recently been investigated
extensively. Existing methods, especially the newly emerged deep learning
methods, can remove rain streaks well in many cases. However, the essential
factor in the generative procedure of the rain streaks, i.e., the motion blur
that leads to their line-pattern appearance, has been neglected by deep
learning deraining approaches, resulting in over-deraining or under-deraining.
In this paper, we propose a novel rain streak removal
approach using a kernel guided convolutional neural network (KGCNN), achieving
the state-of-the-art performance with simple network architectures. We first
model the rain streak interference with its motion blur mechanism. Then, our
framework starts with learning the motion blur kernel, which is determined by
two factors including angle and length, by a plain neural network, denoted as
parameter net, from a patch of the texture component. Next, after a
dimensionality stretching operation, the learned motion blur kernel is
stretched into a degradation map with the same spatial size as the rainy patch.
The stretched degradation map together with the texture patch is subsequently
input into a derain convolutional network, which is a typical ResNet
architecture and trained to output the rain streaks with the guidance of the
learned motion blur kernel. Experiments conducted on extensive synthetic and
real data demonstrate the effectiveness of the proposed method, which preserves
the texture and the contrast while removing the rain streaks.
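A hedged sketch of the two-stage pipeline (network sizes are assumptions, and "dimensionality stretching" is rendered here as tiling the two predicted kernel parameters across the patch):

```python
import torch
import torch.nn as nn

class ParameterNet(nn.Module):
    """Plain CNN regressing the motion-blur kernel's angle and length from
    a patch of the texture component."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))

    def forward(self, patch):
        return self.body(patch)  # (B, 2): predicted angle and length

def stretch(params, h, w):
    """Dimensionality stretching: tile the kernel parameters into a
    degradation map with the same spatial size as the rainy patch."""
    return params[:, :, None, None].expand(-1, -1, h, w)

patch = torch.randn(4, 1, 64, 64)                  # texture-component patches
deg_map = stretch(ParameterNet()(patch), 64, 64)   # (4, 2, 64, 64)
derain_input = torch.cat([patch, deg_map], dim=1)  # fed to the derain ResNet,
                                                   # which outputs rain streaks
```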
Joint Transmission Map Estimation and Dehazing using Deep Networks
Single image haze removal is an extremely challenging problem due to its
inherent ill-posed nature. Several prior-based and learning-based methods have
been proposed in the literature to solve this problem and they have achieved
superior results. However, most existing methods assume a constant
atmospheric light model and tend to follow a two-step procedure: prior-based
estimation of the transmission map, followed by calculation of the dehazed
image using the closed-form solution. In this paper, we relax the
constant atmospheric light assumption and propose a novel unified single image
dehazing network that jointly estimates the transmission map and performs
dehazing. In other words, our new approach provides an end-to-end learning
framework, where the inherent transmission map and dehazed result are learned
directly from the loss function. Extensive experiments on synthetic and real
datasets with challenging hazy images demonstrate that the proposed method
achieves significant improvements over the state-of-the-art methods.
Comment: This paper has been accepted in IEEE-TCSVT
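For reference, dehazing methods of this kind build on the atmospheric scattering model I = J*t + A*(1 - t); a sketch of inverting it given network-estimated quantities (variable names and the transmission floor are mine):

```python
import torch

def recover_dehazed(hazy, transmission, atmospheric_light, t_min=0.1):
    """Invert I = J * t + A * (1 - t) for the scene radiance J, given an
    estimated transmission map t and atmospheric light A (kept as a tensor,
    since the constant-A assumption is relaxed)."""
    t = transmission.clamp(min=t_min)  # avoid blow-up where haze is dense
    return (hazy - atmospheric_light * (1.0 - t)) / t

hazy = torch.rand(1, 3, 128, 128)   # placeholder hazy image in [0, 1]
t = torch.rand(1, 1, 128, 128)      # network-estimated transmission map
A = torch.rand(1, 3, 1, 1)          # network-estimated atmospheric light
dehazed = recover_dehazed(hazy, t, A)
```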