Neural Likelihoods via Cumulative Distribution Functions
We leverage neural networks as universal approximators of monotonic functions
to build a parameterization of conditional cumulative distribution functions
(CDFs). By the application of automatic differentiation with respect to
response variables and then to parameters of this CDF representation, we are
able to build black box CDF and density estimators. A suite of families is
introduced as alternative constructions for the multivariate case. At one
extreme, the simplest construction is a competitive density estimator against
state-of-the-art deep learning methods, although it does not provide an easily
computable representation of multivariate CDFs. At the other extreme, we have a
flexible construction from which multivariate CDF evaluations and
marginalizations can be obtained by a simple forward pass in a deep neural net,
but where the computation of the likelihood scales exponentially with
dimensionality. Alternatives in between the extremes are discussed. We evaluate
the different representations empirically on a variety of tasks involving tail
area probabilities, tail dependence and (partial) density estimation.
Comment: 10 page
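The core idea in this abstract, a CDF that is monotone in the response by construction and a density obtained by differentiating it with respect to the response, can be illustrated with a very small sketch. This is not the paper's architecture: it assumes a single hidden layer of sigmoid units with positive slopes and softmax mixing weights (a mixture of logistic CDFs), all parameter values are made up, and the derivative is written out by hand where a real implementation would rely on automatic differentiation.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    # Maps any real number to a strictly positive one.
    return math.log1p(math.exp(z))

def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]

# Hypothetical, hand-picked parameters (a trained model would learn these).
RAW_SLOPES = [-0.5, 0.3, 1.2]   # passed through softplus so slopes in y are > 0
X_WEIGHTS = [0.8, -0.4, 0.1]    # unconstrained dependence on the covariate x
OFFSETS = [-1.0, 0.0, 1.5]
RAW_MIX = [0.2, -0.1, 0.4]      # passed through softmax so weights sum to 1

def _units(y, x):
    slopes = [softplus(r) for r in RAW_SLOPES]
    mix = softmax(RAW_MIX)
    for p, a, b, c in zip(mix, slopes, X_WEIGHTS, OFFSETS):
        yield p, a, sigmoid(a * y + b * x + c)

def cdf(y, x):
    # F(y | x): non-decreasing in y because every slope and weight is positive.
    return sum(p * s for p, _, s in _units(y, x))

def density(y, x):
    # f(y | x) = dF/dy, via the chain rule through each sigmoid unit.
    return sum(p * a * s * (1.0 - s) for p, a, s in _units(y, x))

if __name__ == "__main__":
    x = 0.3
    # Upper tail-area probability P(Y > 2 | x) from a single CDF evaluation.
    print("P(Y > 2 | x):", 1.0 - cdf(2.0, x))
    print("density at y = 0:", density(0.0, x))
```

Positivity of the slopes gives monotonicity, and the softmax weights force the limits 0 and 1, so the output is a valid conditional CDF; tail-area probabilities then cost one evaluation each.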
Lightweight Probabilistic Deep Networks
Even though probabilistic treatments of neural networks have a long history,
they have not found widespread use in practice. Sampling approaches are often
too slow already for simple networks. The size of the inputs and the depth of
typical CNN architectures in computer vision only compound this problem.
Uncertainty in neural networks has thus been largely ignored in practice,
despite the fact that it may provide important information about the
reliability of predictions and the inner workings of the network. In this
paper, we introduce two lightweight approaches to making supervised learning
with probabilistic deep networks practical: First, we suggest probabilistic
output layers for classification and regression that require only minimal
changes to existing networks. Second, we employ assumed density filtering and
show that activation uncertainties can be propagated in a practical fashion
through the entire network, again with minor changes. Both probabilistic
networks retain the predictive power of the deterministic counterpart, but
yield uncertainties that correlate well with the empirical error induced by
their predictions. Moreover, the robustness to adversarial examples is
significantly increased.
Comment: To appear at CVPR 201
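The propagation of activation uncertainties mentioned in this abstract can be sketched for two layer types. The abstract does not give formulas, so the following assumes the common setup of independent Gaussian activations described by a mean and a variance per unit, and uses the standard closed-form moments of a rectified Gaussian; the function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
import math

def _phi(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _Phi(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def linear_moments(weights, bias, mean, var):
    # Moments of W a + b for inputs with independent components
    # (assumption): means combine linearly, variances with squared weights.
    out_mean = [sum(w * m for w, m in zip(row, mean)) + b
                for row, b in zip(weights, bias)]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in weights]
    return out_mean, out_var

def relu_moments(mean, var):
    # Moment matching: mean and variance of max(0, a) for a ~ N(mean, var),
    # applied independently to every unit.
    out_mean, out_var = [], []
    for mu, v in zip(mean, var):
        if v <= 0.0:
            # No uncertainty: the unit behaves deterministically.
            out_mean.append(max(mu, 0.0))
            out_var.append(0.0)
            continue
        sigma = math.sqrt(v)
        alpha = mu / sigma
        m1 = mu * _Phi(alpha) + sigma * _phi(alpha)               # E[relu(a)]
        m2 = (mu * mu + v) * _Phi(alpha) + mu * sigma * _phi(alpha)  # E[relu(a)^2]
        out_mean.append(m1)
        out_var.append(max(m2 - m1 * m1, 0.0))
    return out_mean, out_var

if __name__ == "__main__":
    # Hypothetical two-unit layer with a noisy input.
    W = [[1.0, 2.0], [0.0, -1.0]]
    b = [0.5, 0.0]
    mean, var = linear_moments(W, b, [1.0, 2.0], [0.1, 0.2])
    print(relu_moments(mean, var))
```

Each layer maps a (mean, variance) pair to a new pair, so a single deterministic pass replaces sampling, which is the kind of cost saving the abstract argues for.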