9,752 research outputs found

    A Scale Mixture Perspective of Multiplicative Noise in Neural Networks

    Corrupting the input and hidden layers of deep neural networks (DNNs) with multiplicative noise, often drawn from the Bernoulli distribution (or 'dropout'), provides regularization that has significantly contributed to deep learning's success. However, understanding how multiplicative corruptions prevent overfitting has been difficult due to the complexity of a DNN's functional form. In this paper, we show that when a Gaussian prior is placed on a DNN's weights, applying multiplicative noise induces a Gaussian scale mixture, which can be reparameterized to circumvent the problematic likelihood function. Analysis can then proceed by using a type-II maximum likelihood procedure to derive a closed-form expression revealing how regularization evolves as a function of the network's weights. Results show that multiplicative noise forces weights to become either sparse or invariant to rescaling. We find our analysis has implications for model compression, as it naturally reveals a weight-pruning rule that starkly contrasts with the commonly used signal-to-noise ratio (SNR). While the SNR prunes weights with large variances, seeing them as noisy, our approach recognizes their robustness and retains them. We empirically demonstrate that our approach has a strong advantage over the SNR heuristic and is competitive with retraining using soft targets produced by a teacher model.
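    The SNR heuristic that this abstract argues against is simple to state concretely. Below is a minimal numpy sketch of SNR-based pruning over hypothetical posterior weight statistics; the synthetic means, standard deviations, and threshold are illustrative assumptions, and the paper's own variance-tolerant pruning rule is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior statistics for ten weights (e.g. from a
# variational approximation): means and standard deviations.
mu = rng.normal(0.0, 1.0, size=10)
sigma = rng.uniform(0.05, 1.0, size=10)

# Signal-to-noise-ratio heuristic: a weight is deemed "noisy" when its
# mean is small relative to its spread, so low-SNR weights are pruned.
snr = np.abs(mu) / sigma
keep = snr > 1.0  # the cutoff of 1.0 is an arbitrary illustration

print("SNR: ", np.round(snr, 2))
print("kept:", keep)
```

    Under the paper's analysis, a large sigma can instead signal robustness to multiplicative noise, which is why the two rules can retain opposite sets of weights.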

    Rapid deconvolution of low-resolution time-of-flight data using Bayesian inference

    The deconvolution of low-resolution time-of-flight data has numerous advantages, including the ability to extract additional information from the experimental data. We augment the well-known Lucy-Richardson deconvolution algorithm using various Bayesian prior distributions and show that a prior on the second differences of the signal outperforms the standard Lucy-Richardson algorithm, accelerating the rate of convergence by more than a factor of four while preserving the peak amplitude ratios of a similar fraction of the total peaks. A novel stopping criterion and a boosting mechanism are implemented to ensure that these methods converge to a similar final entropy and that local minima are avoided. An improvement in mass resolution by a factor of two allows more accurate quantification of the spectra. The general method is demonstrated in this paper through the deconvolution of fragmentation peaks of the 2,5-dihydroxybenzoic acid matrix and the benzyltriphenylphosphonium thermometer ion, following femtosecond ultraviolet laser desorption.
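    For reference, the baseline algorithm being augmented is only a few lines. The following is a minimal sketch of plain (unregularized) Richardson-Lucy deconvolution for a 1-D spectrum with a known instrument response; the synthetic two-peak demo, kernel width, and iteration count are illustrative assumptions, and the paper's Bayesian priors, stopping criterion, and boosting are not reproduced.

```python
import numpy as np

def richardson_lucy(data, kernel, n_iter=200):
    """Plain Richardson-Lucy deconvolution of a non-negative 1-D signal."""
    kernel = kernel / kernel.sum()
    kernel_flip = kernel[::-1]
    estimate = np.full_like(data, data.mean(), dtype=float)
    for _ in range(n_iter):
        blurred = np.convolve(estimate, kernel, mode="same")
        ratio = data / np.maximum(blurred, 1e-12)  # guard against divide-by-zero
        estimate *= np.convolve(ratio, kernel_flip, mode="same")
    return estimate

# Demo: two overlapping peaks blurred by a Gaussian instrument response.
t = np.arange(200, dtype=float)
truth = np.exp(-0.5 * ((t - 80) / 2.0) ** 2) + 0.6 * np.exp(-0.5 * ((t - 95) / 2.0) ** 2)
g = np.exp(-0.5 * (np.arange(-25, 26) / 6.0) ** 2)
blurred = np.convolve(truth, g / g.sum(), mode="same")
restored = richardson_lucy(blurred, g)
```

    In regularized variants, a prior such as the second-difference one typically contributes an extra multiplicative factor inside the loop; the exact form used in the paper is not shown here.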

    Bayesian P-Splines to investigate the impact of covariates on Multiple Sclerosis clinical course

    This paper proposes suitable statistical tools to address heterogeneity in repeated measures within a Multiple Sclerosis (MS) longitudinal study. Due to unobservable sources of heterogeneity, modelling the effect of covariates on MS severity is a difficult task. Bayesian P-splines are suggested for modelling nonlinear smooth effects of covariates within generalized additive models. Based on a pooled MS data set, we show how extending Bayesian P-splines to mixed-effects models (Lang and Brezger, 2001) represents an attractive statistical approach for investigating the role of prognostic factors in affecting individual change in disability.
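    The core of a P-spline is a rich B-spline basis combined with a difference penalty on adjacent coefficients; in the Bayesian formulation, the penalty corresponds to a random-walk prior whose posterior mode coincides with the penalized fit. Below is a minimal sketch of that penalized fit on synthetic data (requires SciPy >= 1.8 for BSpline.design_matrix); the knot grid, smoothing parameter, and data are illustrative assumptions, and the paper's mixed-model extension and posterior sampling are not reproduced.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, x.size)

# Cubic B-spline basis on an equidistant knot grid.
k = 3
inner = np.linspace(0.0, 1.0, 20)
knots = np.r_[[0.0] * k, inner, [1.0] * k]
B = BSpline.design_matrix(x, knots, k).toarray()

# Second-order difference penalty (Eilers & Marx); in the Bayesian view
# this corresponds to a second-order random-walk prior on the coefficients.
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
lam = 5.0  # smoothing parameter, fixed here for illustration
beta = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
fit = B @ beta
```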

    Risk, Unexpected Uncertainty, and Estimation Uncertainty: Bayesian Learning in Unstable Settings

    Recently, evidence has emerged that humans approach learning using Bayesian updating rather than (model-free) reinforcement algorithms in a six-arm restless bandit problem. Here, we investigate what this implies for human appreciation of uncertainty. In our task, a Bayesian learner distinguishes three equally salient levels of uncertainty. First, the Bayesian learner perceives irreducible uncertainty, or risk: even knowing the payoff probabilities of a given arm, the outcome remains uncertain. Second, there is (parameter) estimation uncertainty, or ambiguity: payoff probabilities are unknown and need to be estimated. Third, the outcome probabilities of the arms change: these sudden jumps are referred to as unexpected uncertainty. We document how the three levels of uncertainty evolved during the course of our experiment and how each affected the learning rate. We then zoom in on estimation uncertainty, which has been suggested to be a driving force in exploration, in spite of evidence of widespread aversion to ambiguity. Our data corroborate the latter. We discuss neural evidence that foreshadowed the ability of humans to distinguish between the three levels of uncertainty. Finally, we investigate the boundaries of the human capacity to implement Bayesian learning. We repeat the experiment with different instructions, reflecting varying levels of structural uncertainty. Under this fourth notion of uncertainty, choices were no better explained by Bayesian updating than by (model-free) reinforcement learning. Exit questionnaires revealed that participants remained unaware of the presence of unexpected uncertainty and failed to acquire the right model with which to implement Bayesian updating.
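    For a single Bernoulli arm, the first two uncertainty levels fall out of a conjugate Beta-Bernoulli update. The sketch below is a toy illustration under an assumed payoff probability, not the paper's six-arm restless task: risk is the outcome variance that remains even at the point estimate, while estimation uncertainty is the posterior variance of the payoff probability itself; tracking unexpected uncertainty (sudden jumps) would additionally require, e.g., discounting the counts or a change-point model.

```python
import numpy as np

rng = np.random.default_rng(2)

a, b = 1.0, 1.0   # uniform Beta prior on the arm's payoff probability
true_p = 0.7      # assumed, unknown to the learner
for _ in range(50):
    reward = rng.random() < true_p
    a, b = a + reward, b + (1 - reward)   # conjugate Beta-Bernoulli update

p_hat = a / (a + b)
risk = p_hat * (1 - p_hat)                            # irreducible outcome variance
estimation = (a * b) / ((a + b) ** 2 * (a + b + 1))   # posterior variance of p

print(f"p_hat={p_hat:.2f}  risk={risk:.3f}  estimation={estimation:.4f}")
```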

    Examining the Robustness of Competing Explanations of Slow Growth in African Countries

    This research challenges previous findings regarding the robustness of the African growth dummy by expanding the list of variables to include those suggested by Easterly and Levine (1998) and Sachs and Warner (1997b). Using the Bayesian Averaging of Classical Estimates approach, this paper concludes that the African growth dummy does not appear to be robustly related to growth. This supports the interpretation that the presence of the African dummy in other studies results from misspecification. The paper also contributes to the debate on growth strategies for Africa by assessing the robustness of divergent perspectives offered in the recent literature.
    Keywords: growth, Africa, model specification, robustness
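    Bayesian Averaging of Classical Estimates weighs every OLS specification by an approximate marginal likelihood and then reads off the posterior inclusion probability of each regressor. The toy sketch below uses the standard BACE weight, proportional to n^(-k/2) * SSE^(-n/2) (Sala-i-Martin, Doppelhofer and Miller), on synthetic data in which a stand-in "africa" dummy has no true effect; the variable names, data, and sample size are illustrative assumptions, not the paper's data set.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n = 80
names = ["africa", "invest", "school", "trade"]
X = rng.normal(size=(n, 4))
y = 1.0 + 0.8 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=n)

def log_weight(cols):
    """Log of the BACE model weight: -(k/2)ln(n) - (n/2)ln(SSE)."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    sse = np.sum((y - Z @ beta) ** 2)
    return -(Z.shape[1] / 2) * np.log(n) - (n / 2) * np.log(sse)

# Enumerate all subsets of the four regressors (intercept always included).
models = [c for r in range(5) for c in combinations(range(4), r)]
logw = np.array([log_weight(c) for c in models])
w = np.exp(logw - logw.max())
w /= w.sum()

# Posterior inclusion probability: total weight of models with the dummy.
incl = sum(wi for wi, c in zip(w, models) if 0 in c)
print(f"inclusion probability of '{names[0]}': {incl:.2f}")
```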