Causal Dependence Tree Approximations of Joint Distributions for Multiple Random Processes
We investigate approximating joint distributions of random processes with
causal dependence tree distributions. Such distributions are particularly
useful in providing parsimonious representation when there exists causal
dynamics among processes. By extending the results by Chow and Liu on
dependence tree approximations, we show that the best causal dependence tree
approximation is the one which maximizes the sum of directed informations on
its edges, where best is defined in terms of minimizing the KL-divergence
between the original and the approximate distribution. Moreover, we describe a
low-complexity algorithm to efficiently pick this approximate distribution.
Comment: 9 pages, 15 figures
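The selection rule in this abstract (choose the directed tree whose edges carry the largest total directed information) can be illustrated with a small sketch. This is not the paper's low-complexity algorithm. It is a brute-force search that is only practical for a handful of processes, and it assumes the pairwise directed-information values have already been estimated. The function names and the `weights` matrix layout are hypothetical.

```python
from itertools import product


def _reaches_root(parents, root):
    """Return True if following parent links from every node ends at the root."""
    n = len(parents)
    for node in parents:
        steps = 0
        while node != root:
            node = parents[node]
            steps += 1
            if steps > n:  # walked longer than any simple path: there is a cycle
                return False
    return True


def best_directed_tree(weights):
    """Brute-force the directed spanning tree with the largest total edge weight.

    weights[i][j] is an assumed, precomputed directed-information estimate
    for the edge i -> j. Returns (total_weight, parents), where parents[j]
    is the parent of node j and the root maps to None.
    """
    n = len(weights)
    best_total, best_parents = float("-inf"), None
    for root in range(n):
        others = [j for j in range(n) if j != root]
        # Try every assignment of one parent to each non-root node.
        for choice in product(range(n), repeat=len(others)):
            parents = {root: None}
            parents.update(zip(others, choice))
            if any(parents[j] == j for j in others):
                continue  # a node cannot be its own parent
            if not _reaches_root(parents, root):
                continue  # assignment contains a cycle, so it is not a tree
            total = sum(weights[parents[j]][j] for j in others)
            if total > best_total:
                best_total, best_parents = total, parents
    return best_total, best_parents
```

Because directed information is asymmetric, the weight matrix is not symmetric in general, which is why the search ranges over rooted, directed trees rather than undirected spanning trees as in the original Chow and Liu setting.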
Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
The paper characterizes classes of functions for which deep learning can be
exponentially better than shallow learning. Deep convolutional networks are a
special case of these conditions, though weight sharing is not the main reason
for their exponential advantage.
Dropout Inference in Bayesian Neural Networks with Alpha-divergences
To obtain uncertainty estimates with real-world Bayesian deep learning
models, practical inference approximations are needed. Dropout variational
inference (VI) for example has been used for machine vision and medical
applications, but VI can severely underestimate model uncertainty.
Alpha-divergences are alternative divergences to VI's KL objective, which are
able to avoid VI's uncertainty underestimation. But these are hard to use in
practice: existing techniques can only use Gaussian approximating
distributions, and require existing models to be changed radically, thus are of
limited use for practitioners. We propose a re-parametrisation of the
alpha-divergence objectives, deriving a simple inference technique which,
together with dropout, can be easily implemented with existing models by simply
changing the loss of the model. We demonstrate improved uncertainty estimates
and accuracy compared to VI in dropout networks. We study our model's epistemic
uncertainty far away from the data using adversarial images, showing that these
can be distinguished from non-adversarial images by examining our model's
uncertainty.