Dropout Distillation for Efficiently Estimating Model Confidence
We propose an efficient way to output better calibrated uncertainty scores
from neural networks. The Distilled Dropout Network (DDN) makes standard
(non-Bayesian) neural networks more introspective by adding a new training loss
which prevents them from being overconfident. Our method is more efficient than
Bayesian neural networks or model ensembles which, despite providing more
reliable uncertainty scores, are more cumbersome to train and slower to test.
We evaluate DDN on the task of image classification on the CIFAR-10 dataset
and show that our calibration results are competitive even when compared to 100
Monte Carlo samples from a dropout network, while also increasing the
classification accuracy. We also propose a way to improve calibration within
the state-of-the-art Faster R-CNN object detection framework and show, using
the COCO dataset, that DDN helps train better calibrated object detectors.
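As a rough illustration of the Monte Carlo dropout baseline that DDN is compared against, the sketch below averages the softmax outputs of many stochastic forward passes with dropout kept active at test time. The toy network, its layer sizes, and the sample count are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): the MC-dropout baseline that DDN is
# compared against -- average the softmax outputs of T stochastic forward
# passes with dropout left active at test time.
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Toy classifier with dropout; stands in for the CIFAR-10 model (assumed)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.features(x)

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, num_samples: int = 100):
    """Average class probabilities over num_samples dropout samples."""
    model.train()          # keep dropout stochastic at test time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )
    return probs.mean(dim=0)   # predictive distribution used for calibration

# usage: a batch of CIFAR-10-sized inputs
x = torch.randn(8, 3, 32, 32)
p = mc_dropout_predict(SmallNet(), x, num_samples=100)
print(p.shape)  # torch.Size([8, 10]); each row sums to 1
```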
Dropout distillation
Dropout is a popular stochastic regularization technique for deep neural networks that works by randomly dropping (i.e. zeroing) units from the network during training. This randomization process makes it possible to implicitly train an ensemble of exponentially many networks sharing the same parametrization, which should be averaged at test time to deliver the final prediction. A typical workaround for this intractable averaging operation consists of scaling the layers undergoing dropout randomization. This simple rule, called 'standard dropout', is efficient, but might degrade the accuracy of the prediction. In this work we introduce a novel approach, coined 'dropout distillation', that allows us to train a predictor that better approximates the intractable, but preferable, averaging process, while keeping its computational cost under control. We are thus able to construct models that are as efficient as standard dropout, or even more efficient, while being more accurate. Experiments on standard benchmark datasets demonstrate the validity of our method, yielding consistent improvements over conventional dropout.
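To make the averaging-versus-scaling trade-off concrete, here is a minimal sketch of the general dropout-distillation idea: fit a deterministic student to a Monte Carlo estimate of the teacher's dropout-ensemble average. The KL objective and the helper names are assumptions made for illustration, not the exact formulation of the paper.

```python
# Minimal sketch (illustrative, not the paper's code): distil the Monte Carlo
# dropout average of a teacher into a deterministic student prediction.
import torch
import torch.nn.functional as F

def mc_dropout_average(teacher, x, num_samples=20):
    """Monte Carlo estimate of the intractable dropout ensemble average."""
    teacher.train()                                  # keep dropout stochastic
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(teacher(x), dim=-1) for _ in range(num_samples)]
        )
    return probs.mean(dim=0)

def distillation_loss(student, teacher, x):
    """KL divergence from the student's deterministic ('standard dropout')
    prediction to the sampled ensemble average it should approximate."""
    target = mc_dropout_average(teacher, x)
    student.eval()                                   # deterministic scaled path
    log_q = F.log_softmax(student(x), dim=-1)
    return F.kl_div(log_q, target, reduction="batchmean")
```

In a training loop one would still call `backward()` on this loss; `eval()` only switches the student's dropout to its deterministic, scaled behaviour and does not block gradients.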
Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection
There has been a recent emergence of sampling-based techniques for estimating
epistemic uncertainty in deep neural networks. While these methods can be
applied to classification or semantic segmentation tasks by simply averaging
samples, this is not the case for object detection, where detection sample
bounding boxes must be accurately associated and merged. A weak merging
strategy can significantly degrade the performance of the detector and yield an
unreliable uncertainty measure. This paper provides the first in-depth
investigation of the effect of different association and merging strategies. We
compare different combinations of three spatial and two semantic affinity
measures with four clustering methods for MC Dropout with a Single Shot
Multi-Box Detector. Our results show that the correct choice of
affinity-clustering combination can greatly improve the effectiveness of the
classification and spatial uncertainty estimation and the resulting object
detection performance. We base our evaluation on a new mix of datasets that
emulate near open-set conditions (semantically similar unknown classes),
distant open-set conditions (semantically dissimilar unknown classes) and the
common closed-set conditions (only known classes). Comment: to appear in IEEE
International Conference on Robotics and Automation 2019 (ICRA 2019).
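For intuition, the sketch below shows one simple association-and-merging scheme of the kind the paper evaluates: detections gathered from all MC-dropout forward passes are greedily clustered by IoU (a spatial affinity), and each cluster is merged by averaging its boxes and class distributions, with the box variance as a crude spatial uncertainty. The threshold, the greedy rule, and the variance read-out are illustrative assumptions, not the specific strategies compared in the paper.

```python
# Hedged sketch: one possible association-and-merging scheme for detections
# from multiple MC-dropout forward passes -- greedy clustering by IoU followed
# by averaging boxes and class scores within each cluster.
import numpy as np

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def cluster_and_merge(detections, iou_thr=0.5):
    """detections: list of (box, class_probs) pooled over all dropout samples.
    Returns per-cluster (mean box, mean class distribution, box variance)."""
    clusters = []
    for box, probs in detections:
        for c in clusters:
            if iou(box, c[0][0]) >= iou_thr:   # compare to cluster's first box
                c.append((box, probs))
                break
        else:
            clusters.append([(box, probs)])
    merged = []
    for c in clusters:
        boxes = np.array([b for b, _ in c])
        probs = np.array([p for _, p in c])
        merged.append((boxes.mean(0), probs.mean(0), boxes.var(0)))
    return merged
```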
Adversarial Dropout for Recurrent Neural Networks
Successful application of recurrent neural networks (RNNs) to sequential data,
such as text and speech, requires improved generalization performance. Dropout
techniques for RNNs were introduced to respond to these demands, but we
conjecture that dropout on RNNs can be further improved by adopting the
adversarial concept. This paper investigates ways to improve the
dropout for RNNs by utilizing intentionally generated dropout masks.
Specifically, the guided dropout used in this research is called adversarial
dropout, which adversarially disconnects neurons that are dominantly used to
predict correct targets over time. Our analysis showed that our regularizer,
which measures the gap between the original and the reconfigured RNNs, was an
upper bound on the gap between the training and inference phases of random
dropout. We demonstrated that minimizing our regularizer improved the
effectiveness of the dropout for RNNs on sequential MNIST tasks,
semi-supervised text classification tasks, and language modeling tasks. Comment: published in AAAI 2019.
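Sketched below is one way the described regularizer could be computed, simplified to a single feed-forward layer rather than the per-timestep RNN formulation: starting from a random mask, flip the budgeted fraction of units whose gradient most increases the prediction divergence, then penalize the divergence under that adversarial mask. The `encode`/`head` split, the KL divergence, and the flip budget are assumptions made for illustration, not the authors' exact procedure.

```python
# Hedged sketch of an adversarial-dropout-style regularizer (simplified to one
# feed-forward layer; the model's encode()/head() methods are hypothetical).
import torch
import torch.nn.functional as F

def adversarial_dropout_penalty(model, x, p=0.5, flip_budget=0.05):
    """KL gap between predictions under a random mask and an adversarial mask."""
    hidden = model.encode(x)                       # features before dropout
    base_mask = (torch.rand_like(hidden) > p).float()

    mask = base_mask.clone().requires_grad_(True)
    p_base = F.log_softmax(model.head(hidden * base_mask), dim=-1).detach()
    p_mask = F.log_softmax(model.head(hidden * mask), dim=-1)
    div = F.kl_div(p_mask, p_base.exp(), reduction="batchmean")
    grad, = torch.autograd.grad(div, mask)

    # Flip the budgeted fraction of units where flipping increases divergence.
    k = max(1, int(flip_budget * mask.numel()))
    flip_gain = grad * (1.0 - 2.0 * base_mask)     # gain from flipping 0 <-> 1
    idx = flip_gain.flatten().topk(k).indices
    adv_mask = base_mask.flatten().clone()
    adv_mask[idx] = 1.0 - adv_mask[idx]
    adv_mask = adv_mask.view_as(base_mask)

    p_adv = F.log_softmax(model.head(hidden * adv_mask), dim=-1)
    return F.kl_div(p_adv, p_base.exp(), reduction="batchmean")
```

In training, a penalty of this kind would typically be added to the usual cross-entropy loss with a weighting coefficient.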