
    Bayesian Dropout

    Dropout has recently emerged as a powerful and simple method for training neural networks, preventing co-adaptation by stochastically omitting neurons. Dropout is currently not grounded in explicit modelling assumptions, which so far has precluded its adoption in Bayesian modelling. Using Bayesian entropic reasoning, we show that dropout can be interpreted as optimal inference under constraints. We demonstrate this on an analytically tractable regression model, providing a Bayesian interpretation of its mechanism for regularizing and preventing co-adaptation as well as its connection to other Bayesian techniques. We also discuss two general approximate techniques for applying Bayesian dropout to general models, one based on an analytical approximation and the other on stochastic variational techniques. These techniques are then applied to a Bayesian logistic regression problem and are shown to improve performance as the model becomes more misspecified. Our framework establishes dropout as a theoretically justified and practical tool for statistical modelling, allowing Bayesians to tap into the benefits of dropout training. Comment: 21 pages, 3 figures. Manuscript prepared 2014 and awaiting submission.
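    The stochastic omission the abstract describes can be sketched in a few lines of NumPy; the keep probability and layer sizes below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, p=0.5, train=True):
    """Inverted dropout: zero each unit with probability p and rescale
    the survivors so the expected activation matches test time."""
    if not train:
        return h
    mask = rng.random(h.shape) >= p   # keep each unit with prob 1 - p
    return h * mask / (1.0 - p)

h = np.ones((4, 8))                   # a batch of hidden activations
out = dropout_forward(h, p=0.5)
print(out.mean())                     # 1.0 in expectation
```

    At test time the mask is dropped and, thanks to the inverted scaling, no rescaling of the weights is needed.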

    A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

    Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This grounding of dropout in approximate Bayesian inference suggests an extension of the theoretical results, offering insights into the use of dropout with RNN models. We apply this new variational-inference-based dropout technique to LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks. The new approach outperforms existing techniques and, to the best of our knowledge, improves on the single-model state of the art in language modelling with the Penn Treebank (73.4 test perplexity). This extends our arsenal of variational tools in deep learning. Comment: Added clarifications; published in NIPS 2016.
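    The practical difference from standard dropout is that the variational approach samples one dropout mask per sequence and reuses it at every timestep, on both the input and the recurrent connections. A minimal vanilla-RNN sketch of that idea (sizes, rates, and weights are illustrative; the paper's actual models are LSTMs and GRUs):

```python
import numpy as np

rng = np.random.default_rng(1)

def variational_rnn(x_seq, W_x, W_h, p=0.25):
    """Run a vanilla RNN over a sequence, applying the *same* dropout
    masks to input and recurrent connections at every timestep."""
    n_hidden = W_h.shape[0]
    h = np.zeros(n_hidden)
    mask_x = (rng.random(x_seq.shape[1]) >= p) / (1 - p)  # one mask per sequence
    mask_h = (rng.random(n_hidden) >= p) / (1 - p)        # one mask per sequence
    for x_t in x_seq:
        h = np.tanh(W_x @ (x_t * mask_x) + W_h @ (h * mask_h))
    return h

T, n_in, n_hid = 5, 3, 4
h_final = variational_rnn(rng.normal(size=(T, n_in)),
                          rng.normal(size=(n_hid, n_in)),
                          rng.normal(size=(n_hid, n_hid)))
print(h_final.shape)  # (4,)
```

    Resampling the masks at every timestep, by contrast, is the variant reported to fail on recurrent layers.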

    Data mining as a tool for environmental scientists

    Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and more recently Bayesian Decision Networks have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.

    Generalized structured additive regression based on Bayesian P-splines

    Generalized additive models (GAMs) for modelling nonlinear effects of continuous covariates are now well established tools for the applied statistician. In this paper we develop Bayesian GAMs and extensions to generalized structured additive regression based on one- or two-dimensional P-splines as the main building block. The approach extends previous work by Lang and Brezger (2003) for Gaussian responses. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is either based on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. Our approach covers the most common univariate response distributions, e.g. the Binomial, Poisson or Gamma distribution, as well as multicategorical responses. For the first time, we present Bayesian semiparametric inference for the widely used multinomial logit models. As we will demonstrate through two applications on the forest health status of trees and a space-time analysis of health insurance data, the approach allows realistic modelling of complex problems. We consider the enormous flexibility and extendability of our approach as a main advantage of Bayesian inference based on MCMC techniques compared to more traditional approaches. Software for the methodology presented in the paper is provided within the public domain package BayesX.
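    The building block of P-splines is a difference penalty on adjacent basis coefficients. A minimal sketch of that penalty in action, here with an identity basis (i.e. Whittaker smoothing) rather than a full B-spline basis, on synthetic data; the smoothing parameter and data are illustrative:

```python
import numpy as np

# The fit minimizes ||y - f||^2 + lam * ||D2 @ f||^2, where D2 is the
# second-order difference matrix central to the P-spline penalty.
n = 50
x = np.linspace(0, 1, n)
rng = np.random.default_rng(2)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

D2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2, n) second-difference matrix
lam = 5.0
f_hat = np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)

rough = lambda f: np.sum(np.diff(f, 2) ** 2)
print(rough(f_hat) < rough(y))         # True: the penalty shrinks roughness
```

    In the paper's full setting the identity basis is replaced by a B-spline design matrix and the penalty becomes a Gaussian random-walk prior on the coefficients, sampled via MCMC rather than solved in one step.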

    Bayesian Learning and Predictability in a Stochastic Nonlinear Dynamical Model

    Bayesian inference methods are applied within a Bayesian hierarchical modelling framework to the problems of joint state and parameter estimation, and of state forecasting. We explore and demonstrate the ideas in the context of a simple nonlinear marine biogeochemical model. A novel approach is proposed to the formulation of the stochastic process model, in which ecophysiological properties of plankton communities are represented by autoregressive stochastic processes. This approach captures the effects of changes in plankton communities over time, and it allows the incorporation of literature metadata on individual species into prior distributions for process model parameters. The approach is applied to a case study at Ocean Station Papa, using Particle Markov chain Monte Carlo computational techniques. The results suggest that, by drawing on objective prior information, it is possible to extract useful information about model state and a subset of parameters, and even to make useful long-term forecasts, based on sparse and noisy observations.
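    The autoregressive representation of time-varying ecophysiological properties can be sketched as a simple AR(1) process; the coefficients below are illustrative, not the study's values:

```python
import numpy as np

rng = np.random.default_rng(3)

def ar1_path(n, mu=0.0, phi=0.9, sigma=0.2):
    """Simulate an AR(1) process theta_t = mu + phi*(theta_{t-1} - mu) + eps_t,
    the kind of autoregressive prior used for slowly drifting plankton traits."""
    theta = np.empty(n)
    theta[0] = mu
    for t in range(1, n):
        theta[t] = mu + phi * (theta[t - 1] - mu) + rng.normal(scale=sigma)
    return theta

path = ar1_path(200)
print(path.min(), path.max())
```

    With |phi| < 1 the process is mean-reverting, which is what lets literature-derived priors on mu and sigma anchor an otherwise drifting parameter.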

    Measuring stellar differential rotation with high-precision space-borne photometry

    We introduce a method of measuring a lower limit to the amplitude of surface differential rotation from high-precision, evenly sampled photometric time series. It is applied to main-sequence late-type stars whose optical flux modulation is dominated by starspots. An autocorrelation of the time series was used to select stars that allow an accurate determination of starspot rotation periods. A simple two-spot model was applied together with a Bayesian information criterion to preliminarily select intervals of the time series showing evidence of differential rotation with starspots of almost constant area. Finally, the significance of the differential rotation detection and a measurement of its amplitude and uncertainty were obtained by an a posteriori Bayesian analysis based on a Markov chain Monte Carlo approach. We applied our method to the Sun and eight other stars for which previous spot modelling had been performed to compare our results with previous ones. We find that autocorrelation is a simple method for selecting stars with a coherent rotational signal that is a prerequisite for successfully measuring differential rotation through spot modelling. For a proper Markov chain Monte Carlo analysis, it is necessary to take into account the strong correlations among different parameters that exist in spot modelling. For the planet-hosting star Kepler-30, we derive a lower limit to the relative amplitude of the differential rotation of ΔP/P = 0.0523 ± 0.0016. We confirm that the Sun as a star in the optical passband is not suitable for measuring differential rotation owing to the rapid evolution of its photospheric active regions. In general, our method performs well in comparison to more sophisticated and time-consuming approaches. Comment: Accepted to Astronomy and Astrophysics, 15 pages, 13 figures, 4 tables and an Appendix.
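    The first step described above, estimating a starspot rotation period from evenly sampled photometry via the autocorrelation function, can be sketched on synthetic data; the injected period, noise level, and search window are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(0.0, 200.0, 0.5)         # days, even sampling
period = 12.3
flux = np.cos(2 * np.pi * t / period) + rng.normal(scale=0.2, size=t.size)

f = flux - flux.mean()
acf = np.correlate(f, f, mode="full")[f.size - 1:]
acf /= acf[0]                          # normalize so acf[0] == 1

# Take the highest ACF peak in a plausible period window as the estimate.
lags = np.arange(acf.size) * 0.5
sel = (lags > 2.0) & (lags < 20.0)
p_est = lags[sel][np.argmax(acf[sel])]
print(p_est)                           # near the injected 12.3-day period
```

    A coherent spot signal yields a strong, repeating ACF peak; rapidly evolving active regions, as on the Sun, wash the peak out, which is why the ACF also serves as a selection criterion.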

    Bayesian computational methods

    If, in the mid-1980s, one had asked the average statistician about the difficulties of using Bayesian Statistics, his/her most likely answer would have been "Well, there is this problem of selecting a prior distribution and then, even if one agrees on the prior, the whole Bayesian inference is simply impossible to implement in practice!" The same question asked in the 21st century does not produce the same reply, but rather a much less serious complaint about the lack of generic software (besides WinBUGS)! The last 15 years have indeed seen a tremendous change in the way Bayesian Statistics is perceived, both by mathematical statisticians and by applied statisticians, and the impetus behind this change has been a prodigious leap forward in computational abilities. The availability of very powerful approximation methods has in turn freed Bayesian modelling, in terms of both model scope and prior modelling. As discussed below, a most successful illustration of this gained freedom can be seen in Bayesian model choice, which was only emerging at the beginning of the MCMC era for lack of appropriate computational tools. In this chapter, we will first present the most standard computational challenges met in Bayesian Statistics (Section 2), and then relate these problems with computational solutions. Of course, this chapter is only a terse introduction to the problems and solutions related to Bayesian computations. For more complete references, see Robert and Casella (1999, 2004) and Liu (2001), among others. We also refrain from providing an introduction to Bayesian Statistics per se and, for comprehensive coverage, refer the reader to Robert (2001), (again) among others.
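    As a concrete instance of the computational tools the chapter surveys, a random-walk Metropolis sampler takes only a few lines; the target below is an illustrative stand-in for a posterior, and the step size is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)

def metropolis(log_post, x0, n_iter=20000, step=0.5):
    """Random-walk Metropolis: propose a local move, then accept it
    with probability min(1, pi(proposal) / pi(current))."""
    chain = np.empty(n_iter)
    x, lp = x0, log_post(x0)
    for i in range(n_iter):
        prop = x + rng.normal(scale=step)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Illustrative target: a standard normal "posterior".
chain = metropolis(lambda x: -0.5 * x * x, x0=0.0)
print(chain[2000:].mean(), chain[2000:].std())  # roughly 0 and 1
```

    Only the log-posterior up to a constant is needed, which is precisely what makes MCMC so liberating for prior and model choice.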

    Models beyond the Dirichlet process

    Bayesian nonparametric inference is a relatively young area of research and it has recently undergone a strong development. Most of its success can be explained by the considerable degree of flexibility it ensures in statistical modelling, if compared to parametric alternatives, and by the emergence of new and efficient simulation techniques that make nonparametric models amenable to concrete use in a number of applied statistical problems. Since its introduction in 1973 by T.S. Ferguson, the Dirichlet process has emerged as a cornerstone in Bayesian nonparametrics. Nonetheless, in some cases of interest for statistical applications the Dirichlet process is not an adequate prior choice and alternative nonparametric models need to be devised. In this paper we provide a review of Bayesian nonparametric models that go beyond the Dirichlet process.
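    The Dirichlet process the review builds on admits a short constructive sketch via stick breaking (Sethuraman's representation); the truncation level and concentration parameter below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def stick_breaking(alpha, n_atoms):
    """Truncated stick-breaking construction of Dirichlet process weights:
    v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k} (1 - v_j)."""
    v = rng.beta(1.0, alpha, size=n_atoms)
    return v * np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])

w = stick_breaking(alpha=2.0, n_atoms=100)
print(w.sum())  # approaches 1 as the truncation level grows
```

    The models reviewed in the paper generalize exactly this construction, e.g. by changing the Beta(1, alpha) stick-breaking distribution.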