Probabilistic Adaptive Computation Time
We present a probabilistic model with discrete latent variables that control
the computation time in deep learning models such as ResNets and LSTMs. A prior
on the latent variables expresses the preference for faster computation. The
amount of computation for an input is determined via amortized maximum a
posteriori (MAP) inference. MAP inference is performed using a novel stochastic
variational optimization method. The recently proposed Adaptive Computation
Time mechanism can be seen as an ad-hoc relaxation of this model. We
demonstrate training using the general-purpose Concrete relaxation of discrete
variables. Evaluation on ResNet shows that our method matches the
speed-accuracy trade-off of Adaptive Computation Time, while allowing for
evaluation with a simple deterministic procedure that has a lower memory
footprint.
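The Concrete relaxation mentioned above replaces samples of a discrete latent variable with differentiable soft samples, so the model can be trained by backpropagation. A minimal sketch of drawing such a relaxed sample (the function name `sample_concrete` and the NumPy setting are illustrative, not the paper's implementation):

```python
import numpy as np

def sample_concrete(logits, temperature, rng):
    """Draw a relaxed one-hot sample from the Concrete (Gumbel-Softmax)
    distribution. As temperature -> 0, samples approach discrete one-hot
    draws from softmax(logits); higher temperatures give smoother samples."""
    # Gumbel(0, 1) noise via the inverse-CDF trick.
    gumbel = -np.log(-np.log(rng.uniform(size=len(logits))))
    y = (logits + gumbel) / temperature
    y = y - y.max()  # subtract max for numerical stability
    expy = np.exp(y)
    return expy / expy.sum()

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])
sample = sample_concrete(logits, temperature=0.5, rng=rng)
```

The returned vector lies on the probability simplex, so downstream computation-time gates can consume it exactly as they would a one-hot choice, while gradients flow through the softmax.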
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Modern deep learning systems successfully solve many perception tasks such as
object pose estimation when the input image is of high quality. However, in
challenging imaging conditions such as on low-resolution images or when the
image is corrupted by imaging artifacts, current systems degrade considerably
in accuracy. While a loss in performance is unavoidable, we would like our
models to quantify their uncertainty in order to achieve robustness against
images of varying quality. Probabilistic deep learning models combine the
expressive power of deep learning with uncertainty quantification. In this
paper, we propose a novel probabilistic deep learning model for the task of
angular regression. Our model uses von Mises distributions to predict a
distribution over object pose angle. Because a single von Mises distribution
makes strong assumptions about the shape of the predicted distribution, we
extend the basic model to predict a mixture of von Mises distributions. We show how to
learn a mixture model using a finite and infinite number of mixture components.
Our model allows for likelihood-based training and efficient inference at test
time. We demonstrate on a number of challenging pose estimation datasets that
our model produces calibrated probability predictions and competitive or
superior point estimates compared to the current state of the art.
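The likelihood-based training described above amounts to minimizing the negative log-likelihood of observed angles under the predicted (mixture of) von Mises distributions. A minimal sketch under the standard von Mises density p(θ) = exp(κ cos(θ − μ)) / (2π I₀(κ)); the function names and the log-sum-exp mixture form are illustrative assumptions, not the paper's code:

```python
import numpy as np

def von_mises_log_pdf(theta, mu, kappa):
    """log p(theta) for a von Mises distribution with mean direction mu
    and concentration kappa; np.i0 is the modified Bessel function I0."""
    return kappa * np.cos(theta - mu) - np.log(2 * np.pi * np.i0(kappa))

def mixture_nll(theta, weights, mus, kappas):
    """Mean negative log-likelihood of angles theta (shape (N,)) under a
    mixture of K von Mises components, using log-sum-exp for stability."""
    # (N, K) matrix of per-component log-densities plus log mixture weights.
    logps = np.log(weights) + von_mises_log_pdf(theta[:, None], mus, kappas)
    m = logps.max(axis=1, keepdims=True)
    return -(m.squeeze(1) + np.log(np.exp(logps - m).sum(axis=1))).mean()

theta = np.array([0.1, 1.2, -0.5])
nll = mixture_nll(theta,
                  weights=np.array([0.7, 0.3]),
                  mus=np.array([0.0, np.pi / 2]),
                  kappas=np.array([2.0, 5.0]))
```

A network head would output `weights` (via softmax), `mus`, and `kappas` (via a positive activation) per image, and this loss would be minimized over a dataset of annotated pose angles.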