
    Lightweight Probabilistic Deep Networks

    Even though probabilistic treatments of neural networks have a long history, they have not found widespread use in practice. Sampling approaches are often too slow even for simple networks, and the size of the inputs and the depth of typical CNN architectures in computer vision only compound this problem. Uncertainty in neural networks has thus been largely ignored in practice, despite the fact that it may provide important information about the reliability of predictions and the inner workings of the network. In this paper, we introduce two lightweight approaches to making supervised learning with probabilistic deep networks practical: First, we suggest probabilistic output layers for classification and regression that require only minimal changes to existing networks. Second, we employ assumed density filtering and show that activation uncertainties can be propagated in a practical fashion through the entire network, again with minor changes. Both probabilistic networks retain the predictive power of their deterministic counterpart, but yield uncertainties that correlate well with the empirical error induced by their predictions. Moreover, the robustness to adversarial examples is significantly increased.
    Comment: To appear at CVPR 2018
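    Below is a minimal sketch, not the authors' code, of the first idea: a probabilistic output layer for regression that predicts a mean and a variance and is trained with the Gaussian negative log-likelihood instead of squared error. The layer sizes, the softplus parameterization, and the toy data are illustrative assumptions.

```python
# Hypothetical sketch of a probabilistic regression output layer: the head
# predicts a per-input mean and variance, trained with Gaussian NLL.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbabilisticRegressionHead(nn.Module):
    def __init__(self, in_features: int):
        super().__init__()
        self.mean = nn.Linear(in_features, 1)    # predictive mean
        self.scale = nn.Linear(in_features, 1)   # raw scale, made positive below

    def forward(self, h):
        mu = self.mean(h)
        var = F.softplus(self.scale(h)) + 1e-6   # keep the variance positive
        return mu, var

def gaussian_nll(mu, var, y):
    # Negative log-likelihood of y under N(mu, var), averaged over the batch.
    return 0.5 * (torch.log(var) + (y - mu) ** 2 / var).mean()

# Usage: attach the head to any existing feature extractor.
features = torch.randn(32, 64)           # stand-in for CNN features
head = ProbabilisticRegressionHead(64)
mu, var = head(features)
loss = gaussian_nll(mu, var, torch.randn(32, 1))
loss.backward()
```

    Because only the output head changes, any existing feature extractor can be reused, which is what makes the approach lightweight.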

    Kernel Bayes' rule

    A nonparametric kernel-based method for realizing Bayes' rule is proposed, based on representations of probabilities in reproducing kernel Hilbert spaces (RKHS). Probabilities are uniquely characterized by the mean of the canonical map to the RKHS. The prior and conditional probabilities are expressed in terms of RKHS functions of an empirical sample: no explicit parametric model is needed for these quantities. The posterior is likewise an RKHS mean of a weighted sample. An estimator for the expectation of a function of the posterior is derived, and rates of consistency are shown. Some representative applications of the kernel Bayes' rule are presented, including Bayesian computation without likelihood and filtering with a nonparametric state-space model.
    Comment: 27 pages, 5 figures
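    As a rough illustration of the weighted-sample posterior representation, the numpy sketch below implements a conditional kernel mean embedding, a simpler building block related to (but not the same as) the full kernel Bayes' rule estimator. The Gaussian kernel, bandwidth, regularizer, and toy model are all assumptions made for the example.

```python
# Conditional kernel mean embedding (illustrative, not the paper's estimator):
# the embedding of P(X | Y = y_obs) is a weighted sum of feature maps of the
# sample, so E[f(X) | Y = y_obs] is approximated by a weighted sample average.
import numpy as np

def gram(a, b, bandwidth=1.0):
    # Gaussian kernel Gram matrix k(a_i, b_j).
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                # latent variable of interest
y = x + 0.3 * rng.normal(size=n)      # noisy observation of x

# Embedding weights: w = (G_Y + n * lam * I)^{-1} k_Y(y_obs)
y_obs, lam = 1.5, 1e-3
G_y = gram(y, y)
k_star = gram(y, np.array([y_obs]))[:, 0]
w = np.linalg.solve(G_y + n * lam * np.eye(n), k_star)

# Expectations under the approximate posterior are weighted sample averages;
# normalizing the weights is a common heuristic for a point estimate.
post_mean = w @ x / w.sum()
print(post_mean)   # close to the true posterior mean of X given Y = 1.5
```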

    Sequential Bayesian inference for static parameters in dynamic state space models

    A method for sequential Bayesian inference of the static parameters of a dynamic state space model is proposed. The method is based on the observation that many dynamic state space models have a relatively small number of static parameters (or hyper-parameters), so that in principle the posterior can be computed and stored on a discrete grid of practical size, which can be tracked dynamically. Moreover, the approach can use any existing methodology that computes the filtering and prediction distributions of the state process; the Kalman filter and its extensions to non-linear/non-Gaussian settings are used in this paper. The method is illustrated on several applications: a linear Gaussian model, a binomial model, a stochastic volatility model, and the highly non-linear univariate non-stationary growth model. Performance is compared to both existing on-line and off-line methods.
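    A minimal sketch of the grid idea, under the assumption of a local-level model with a single unknown static parameter (the observation noise variance): one Kalman filter is run per grid point, and the grid posterior is tracked through the accumulated one-step predictive likelihoods. The model, grid, and flat prior are illustrative choices.

```python
# Grid-based sequential inference for a static parameter r (obs. noise
# variance) in a local-level model: x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r).
import numpy as np

rng = np.random.default_rng(1)
T, q_true, r_true = 200, 0.1, 0.5
x = np.cumsum(np.sqrt(q_true) * rng.normal(size=T))   # random-walk state
y = x + np.sqrt(r_true) * rng.normal(size=T)          # noisy observations

grid = np.linspace(0.1, 2.0, 40)     # discrete grid over the static parameter r
log_post = np.zeros_like(grid)       # log posterior over the grid (flat prior)
m = np.zeros_like(grid)              # filter mean, one per grid point
P = np.ones_like(grid)               # filter variance, one per grid point

for t in range(T):
    # Kalman prediction and update, vectorized over the grid.
    P_pred = P + q_true
    S = P_pred + grid                # innovation variance under each candidate r
    log_post += -0.5 * (np.log(2 * np.pi * S) + (y[t] - m) ** 2 / S)
    K = P_pred / S
    m = m + K * (y[t] - m)
    P = (1 - K) * P_pred

post = np.exp(log_post - log_post.max())
post /= post.sum()
print("posterior mean of r:", post @ grid)   # should be near r_true = 0.5
```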

    Bayesian Conditional Density Filtering

    We propose a Conditional Density Filtering (C-DF) algorithm for efficient online Bayesian inference. C-DF adapts MCMC sampling to the online setting, sampling from approximations to conditional posterior distributions obtained by propagating surrogate conditional sufficient statistics (a function of the data and parameter estimates) as new data arrive. These quantities eliminate the need to store or process the entire dataset simultaneously and offer a number of desirable features, often including reduced memory requirements and runtime, improved mixing, and state-of-the-art parameter inference and prediction. These improvements are demonstrated through several illustrative examples, including an application to high-dimensional compressed regression. Finally, we show that C-DF samples converge to the target posterior distribution asymptotically as sampling proceeds and more data arrive.
    Comment: 41 pages, 7 figures, 12 tables
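    The following is a minimal sketch in the spirit of C-DF, not the paper's algorithm, for a Gaussian model with unknown mean and variance: sufficient statistics are propagated as data arrive, and a few Gibbs-style draws from the conditional posteriors are taken at each step. The (non-informative) priors and step counts are illustrative assumptions.

```python
# Online conditional sampling via propagated sufficient statistics (n, sum,
# sum of squares) for y ~ N(mu, sig2); no past data are stored or revisited.
import numpy as np

rng = np.random.default_rng(2)
mu_true, sigma_true = 2.0, 1.5
n_stat, s1, s2 = 0, 0.0, 0.0          # surrogate sufficient statistics
mu, sig2 = 0.0, 1.0                   # current parameter draws

for t in range(2000):
    obs = mu_true + sigma_true * rng.normal()     # a new observation arrives
    n_stat += 1; s1 += obs; s2 += obs * obs       # update the statistics only

    for _ in range(3):                            # a few conditional draws
        # mu | sig2, stats ~ N(s1/n, sig2/n)  (flat prior on mu)
        mu = rng.normal(s1 / n_stat, np.sqrt(sig2 / n_stat))
        # sig2 | mu, stats ~ Inv-Gamma(n/2, SS/2), SS = s2 - 2*mu*s1 + n*mu^2
        ss = s2 - 2 * mu * s1 + n_stat * mu * mu
        sig2 = 1.0 / rng.gamma(n_stat / 2, 2.0 / max(ss, 1e-9))

print(mu, np.sqrt(sig2))              # draws should approach (2.0, 1.5)
```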