Dropout Sampling for Robust Object Detection in Open-Set Conditions
Dropout Variational Inference, or Dropout Sampling, has been recently
proposed as an approximation technique for Bayesian Deep Learning and evaluated
for image classification and regression tasks. This paper investigates the
utility of Dropout Sampling for object detection for the first time. We
demonstrate how label uncertainty can be extracted from a state-of-the-art
object detection system via Dropout Sampling. We evaluate this approach on a
large synthetic dataset of 30,000 images, and a real-world dataset captured by
a mobile robot in a versatile campus environment. We show that this uncertainty
can be utilized to increase object detection performance under the open-set
conditions that are typically encountered in robotic vision. A Dropout Sampling
network is shown to achieve a 12.3% increase in recall (for the same precision
score as a standard network) and a 15.1% increase in precision (for the same
recall score as the standard network).
Comment: to appear in IEEE International Conference on Robotics and Automation 2018 (ICRA 2018)
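To illustrate how label uncertainty might be read out of repeated dropout-enabled forward passes, here is a minimal sketch. It is not the paper's implementation: the abstract does not give the procedure, so averaging the per-pass softmax vectors, using Shannon entropy as the uncertainty score, and the names `mean_probabilities`, `entropy`, `accept_detection`, and the threshold `max_entropy` are all assumptions made for illustration. The score vectors are made-up numbers, not results.

```python
import math

def mean_probabilities(samples):
    """Average per-class probability vectors from several stochastic forward passes."""
    n = len(samples)
    k = len(samples[0])
    return [sum(s[c] for s in samples) / n for c in range(k)]

def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def accept_detection(samples, max_entropy):
    """Keep a detection only if the averaged label distribution is confident enough.

    max_entropy is a hypothetical rejection threshold, not a value from the paper.
    """
    return entropy(mean_probabilities(samples)) <= max_entropy

# Hypothetical softmax outputs from three dropout-enabled passes over the same box.
consistent = [[0.9, 0.05, 0.05], [0.85, 0.1, 0.05], [0.95, 0.03, 0.02]]
conflicting = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]

print(accept_detection(consistent, max_entropy=0.7))
print(accept_detection(conflicting, max_entropy=0.7))
```

Passes that agree on a label give a low-entropy average and are kept; passes that disagree average out toward a uniform distribution and are rejected, which is the kind of behaviour one would want for unknown objects under open-set conditions.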
Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection
There has been a recent emergence of sampling-based techniques for estimating
epistemic uncertainty in deep neural networks. While these methods can be
applied to classification or semantic segmentation tasks by simply averaging
samples, this is not the case for object detection, where detection sample
bounding boxes must be accurately associated and merged. A weak merging
strategy can significantly degrade the performance of the detector and yield an
unreliable uncertainty measure. This paper provides the first in-depth
investigation of the effect of different association and merging strategies. We
compare different combinations of three spatial and two semantic affinity
measures with four clustering methods for MC Dropout with a Single Shot
Multi-Box Detector. Our results show that the correct choice of
affinity-clustering combination can greatly improve the effectiveness of the
classification and spatial uncertainty estimation and the resulting object
detection performance. We base our evaluation on a new mix of datasets that
emulate near open-set conditions (semantically similar unknown classes),
distant open-set conditions (semantically dissimilar unknown classes) and the
common closed-set conditions (only known classes).
Comment: to appear in IEEE International Conference on Robotics and Automation 2019 (ICRA 2019)
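The association-and-merging step described above can be sketched with one possible affinity-clustering combination. The abstract does not list its three spatial measures or four clustering methods, so the choices here are assumptions: intersection over union (IoU) as the spatial affinity, a simple greedy sequential clustering, and coordinate averaging as the merge. The function names and the `min_iou` threshold are hypothetical, and semantic affinity is left out.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cluster_boxes(boxes, min_iou):
    """Greedily put each box in the first cluster whose first member overlaps it enough."""
    clusters = []
    for box in boxes:
        for cluster in clusters:
            if iou(cluster[0], box) >= min_iou:
                cluster.append(box)
                break
        else:
            # No existing cluster matched, so this box starts a new one.
            clusters.append([box])
    return clusters

def merge_cluster(cluster):
    """Merge a cluster into one box by averaging each coordinate."""
    n = len(cluster)
    return tuple(sum(b[i] for b in cluster) / n for i in range(4))

# Hypothetical boxes sampled over several stochastic passes: three near one object, one elsewhere.
samples = [(10, 10, 50, 50), (12, 11, 52, 49), (200, 200, 240, 260), (11, 9, 49, 51)]
clusters = cluster_boxes(samples, min_iou=0.5)
merged = [merge_cluster(c) for c in clusters]
print(len(clusters), merged)
```

In this sketch the spread of box coordinates within a cluster could serve as a rough spatial uncertainty, and a threshold that is too loose or too strict would join or split observations incorrectly, which is the failure mode the abstract warns about.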
Towards Safe Autonomous Driving: Capture Uncertainty in the Deep Neural Network For Lidar 3D Vehicle Detection
To assure that an autonomous car is driving safely on public roads, its
object detection module should not only work correctly, but show its prediction
confidence as well. Previous object detectors driven by deep learning do not
explicitly model uncertainties in the neural network. We tackle this
problem by presenting practical methods to capture uncertainties in a 3D
vehicle detector for Lidar point clouds. The proposed probabilistic detector
represents reliable epistemic uncertainty and aleatoric uncertainty in
classification and localization tasks. Experimental results show that the
epistemic uncertainty is related to the detection accuracy, whereas the
aleatoric uncertainty is influenced by vehicle distance and occlusion. The
results also show that we can improve the detection performance by 1%-5% by
modeling the aleatoric uncertainty.
Comment: Accepted for presentation at the 21st IEEE International Conference on Intelligent Transportation Systems (ITSC 2018)
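The abstract does not say how the aleatoric term is modeled. One common formulation, assumed here purely for illustration, lets the network predict a log-variance alongside each regression output and trains with a Gaussian negative log-likelihood. The function name `attenuated_l2_loss` and its arguments are hypothetical.

```python
import math

def attenuated_l2_loss(target, predicted_mean, predicted_log_var):
    """Gaussian negative log-likelihood, up to a constant, with a predicted log-variance.

    A large predicted variance down-weights the squared error but is penalised
    by the log-variance term, so the network cannot claim uncertainty for free.
    """
    residual = target - predicted_mean
    return 0.5 * math.exp(-predicted_log_var) * residual ** 2 + 0.5 * predicted_log_var

# Hypothetical regression target and prediction with a residual of 2.0.
confident = attenuated_l2_loss(5.0, 3.0, predicted_log_var=0.0)
uncertain = attenuated_l2_loss(5.0, 3.0, predicted_log_var=math.log(4.0))
print(confident, uncertain)
```

With a large residual, predicting a higher variance lowers the loss; with a zero residual it raises the loss. That trade-off is what would let such a term track input-dependent noise, for example from distance or occlusion.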
Uncertainty Estimation in One-Stage Object Detection
Environment perception is the task for intelligent vehicles on which all
subsequent steps rely. A key part of perception is to safely detect other road
users such as vehicles, pedestrians, and cyclists. With modern deep learning
techniques, huge progress has been made in this field over the last few years. However,
such deep-learning-based object detection models cannot predict how certain
they are in their predictions, potentially hampering the performance of later
steps such as tracking or sensor fusion. We present viable approaches to
estimate uncertainty in a one-stage object detector, while improving the
detection performance of the baseline approach. The proposed model is evaluated
on a large scale automotive pedestrian dataset. Experimental results show that
the uncertainty output by our system is coupled with detection accuracy and
the occlusion level of pedestrians.
ModDrop: adaptive multi-modal gesture recognition
We present a method for gesture detection and localisation based on
multi-scale and multi-modal deep learning. Each visual modality captures
spatial information at a particular spatial scale (such as motion of the upper
body or a hand), and the whole system operates at three temporal scales. Key to
our technique is a training strategy which exploits: i) careful initialization
of individual modalities; and ii) gradual fusion involving random dropping of
separate channels (dubbed ModDrop) for learning cross-modality correlations
while preserving uniqueness of each modality-specific representation. We
present experiments on the ChaLearn 2014 Looking at People Challenge gesture
recognition track, in which we placed first out of 17 teams. Fusing multiple
modalities at several spatial and temporal scales leads to a significant
increase in recognition rates, allowing the model to compensate for errors of
the individual classifiers as well as noise in the separate channels.
Furthermore, the proposed ModDrop training technique ensures robustness of the
classifier to missing signals in one or several channels to produce meaningful
predictions from any number of available modalities. In addition, we
demonstrate the applicability of the proposed fusion scheme to modalities of
arbitrary nature by experiments on the same dataset augmented with audio.
Comment: 14 pages, 7 figures
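The random dropping of separate channels during fusion training can be sketched as follows. This is a simplified illustration, not the authors' training code: the modality names, the per-modality drop probability `drop_prob`, and the function `drop_modalities` are assumptions, and the gradual fusion schedule and network itself are omitted.

```python
import random

def drop_modalities(features, drop_prob, rng):
    """Zero out each modality's feature vector independently with probability drop_prob."""
    dropped = {}
    for name, vector in features.items():
        if rng.random() < drop_prob:
            # Replace the whole channel with zeros to simulate a missing signal.
            dropped[name] = [0.0] * len(vector)
        else:
            dropped[name] = list(vector)
    return dropped

# Hypothetical per-modality features for one training sample.
sample = {"video": [0.2, 0.7, 0.1], "audio": [0.5, 0.4]}
rng = random.Random(0)
print(drop_modalities(sample, drop_prob=0.5, rng=rng))
```

Applying a function like this to each training sample before the fusion layers would expose the classifier to inputs with one or more channels missing, which is the condition the abstract says the trained model should tolerate.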