
    Bayesian Structural Inference for Hidden Processes

    We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian Structural Inference (BSI) relies on a set of candidate unifilar HMM (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological epsilon-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be epsilon-machines, irrespective of estimated transition probabilities. Properties of epsilon-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite the process's internal structure being only indirectly present in the data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare point estimation based on the full posterior distribution over candidate models with that based on the single maximum a posteriori model, and show that the former more accurately reflects uncertainty in the estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
    Comment: 20 pages, 11 figures, 1 table; supplementary materials, 15 pages, 11 figures, 6 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/bsihp.ht
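
    The topology-comparison step admits a compact illustration. Since a uHMM's state path is fixed once the start state and the symbol sequence are given, each candidate topology reduces to per-state transition counts, and with Dirichlet priors on the outgoing transition probabilities the marginal likelihood of a topology is available in closed form. The Python sketch below is a simplified reading of that idea rather than the paper's implementation; the uniform Dirichlet prior, the uniform prior over topologies, and all function names are assumptions.

        import numpy as np
        from scipy.special import gammaln

        def log_marginal_likelihood(counts, alpha=1.0):
            """Log evidence for one candidate topology: a product of
            Dirichlet-multinomial marginal likelihoods, one per state.
            counts[s] holds the outgoing transition counts of state s."""
            logml = 0.0
            for c in counts:
                c = np.asarray(c, dtype=float)
                a = np.full_like(c, alpha)           # symmetric Dirichlet prior
                logml += (gammaln(a.sum()) - gammaln(a.sum() + c.sum())
                          + np.sum(gammaln(a + c) - gammaln(a)))
            return logml

        def topology_posterior(counts_by_model):
            """Posterior over candidate topologies under a uniform prior."""
            logp = np.array([log_marginal_likelihood(c) for c in counts_by_model])
            w = np.exp(logp - logp.max())            # stabilize before normalizing
            return w / w.sum()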

    Conjugate Bayes for probit regression via unified skew-normal distributions

    Regression models for dichotomous data are ubiquitous in statistics. Besides being useful for inference on binary responses, these methods also serve as building blocks in more complex formulations, such as density regression, nonparametric classification and graphical models. Within the Bayesian framework, inference proceeds by updating the priors for the coefficients, typically set to be Gaussian, with the likelihood induced by probit or logit regressions for the responses. In this updating, the apparent absence of a tractable posterior has motivated a variety of computational methods, including Markov chain Monte Carlo routines and algorithms that approximate the posterior. Despite being routinely implemented, Markov chain Monte Carlo strategies face mixing or time-inefficiency issues in large-p, small-n studies, whereas approximate routines fail to capture the skewness typically observed in the posterior. This article proves that, under Gaussian priors, the posterior distribution of the probit coefficients has a unified skew-normal kernel. This novel result allows efficient Bayesian inference for a wide class of applications, especially in large-p, small-to-moderate-n studies where state-of-the-art computational methods face notable issues. These advances are outlined in a genetic study, and further motivate the development of a wider class of conjugate priors for probit models, along with methods to obtain independent and identically distributed samples from the unified skew-normal posterior.
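
    The conjugacy result comes with a constructive representation that makes i.i.d. sampling concrete: the posterior splits into a Gaussian term plus a linear transform of a multivariate truncated normal. The following Python sketch follows that additive form under a N(xi, Omega) prior; the naive rejection step used for the truncated normal is an assumption that is viable only for small n, and a specialized truncated-normal sampler would be substituted in practice.

        import numpy as np

        rng = np.random.default_rng(0)

        def sun_probit_samples(X, y, xi, Omega, n_draws=1000):
            """i.i.d. draws from the unified skew-normal posterior of probit
            coefficients under a N(xi, Omega) prior (sketch of the additive
            representation; rejection sampling limits this to small n)."""
            n, p = X.shape
            Xbar = (2 * y - 1)[:, None] * X              # rows (2*y_i - 1) * x_i
            omega = np.sqrt(np.diag(Omega))              # prior scales
            Omega_bar = Omega / np.outer(omega, omega)   # prior correlation
            S = Xbar @ Omega @ Xbar.T + np.eye(n)
            s = np.sqrt(np.diag(S))
            Gamma = S / np.outer(s, s)
            Delta = Omega_bar @ (omega[:, None] * Xbar.T) / s
            gamma = (Xbar @ xi) / s
            DGi = Delta @ np.linalg.inv(Gamma)
            V0 = Omega_bar - DGi @ Delta.T               # Gaussian-part covariance
            draws = []
            while len(draws) < n_draws:
                u1 = rng.multivariate_normal(np.zeros(n), Gamma)
                if np.all(u1 > -gamma):                  # truncation below at -gamma
                    u0 = rng.multivariate_normal(np.zeros(p), V0)
                    draws.append(xi + omega * (u0 + DGi @ u1))
            return np.array(draws)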

    A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

    Restricted Boltzmann machines (RBMs) are energy-based neural networks commonly used as building blocks for deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely connected systems with weak interactions, which originates in spin-glass theory. While the TAP approach has been extensively studied for fully visible binary spin systems, our construction generalizes to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support. In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and show interesting features of unsupervised learning that could not be directly observed with sampling. Additionally, we demonstrate how to use our TAP-based framework to leverage trained RBMs as joint priors in denoising problems.
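
    The deterministic core of such a framework is a set of TAP self-consistency equations: naive mean-field updates corrected by an Onsager term involving the squared weights and the unit variances, iterated to a fixed point. Below is a minimal Python sketch for a {0,1} binary RBM in that second-order form; the exact update equations and the damping constant are assumptions following the general shape of second-order TAP expansions, not details taken from the paper.

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def tap_magnetizations(W, a, b, mv, mh, n_iter=50, damping=0.5):
            """Damped fixed-point iteration of second-order TAP equations
            for a binary RBM with visible bias a and hidden bias b."""
            W2 = W ** 2
            for _ in range(n_iter):
                var_v = mv * (1.0 - mv)                  # Bernoulli variances
                mh_new = sigmoid(b + W.T @ mv - (mh - 0.5) * (W2.T @ var_v))
                mh = damping * mh + (1.0 - damping) * mh_new
                var_h = mh * (1.0 - mh)
                mv_new = sigmoid(a + W @ mh - (mv - 0.5) * (W2 @ var_h))
                mv = damping * mv + (1.0 - damping) * mv_new
            return mv, mh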

    Functional approximations to posterior densities: a neural network approach to efficient sampling

    The performance of Monte Carlo integration methods like importance sampling or Markov chain Monte Carlo procedures greatly depends on the choice of the importance or candidate density. Usually, such a density has to be "close" to the target density in order to yield numerically accurate results with efficient sampling. Neural networks seem to be natural importance or candidate densities, as they have a universal approximation property and are easy to sample from. That is, conditional upon the specification of the neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly using auxiliary variables. A key step in the proposed class of methods is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models which include a mixture of normal distributions, a Bayesian instrumental variable regression problem with weak instruments and near-identification, and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
    Keywords: Markov chain Monte Carlo; Bayesian inference; importance sampling; neural networks
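
    Whatever approximator supplies the candidate density, the importance-sampling mechanics are the same: draw from the candidate, weight by the ratio of target to candidate, and monitor the effective sample size. The Python sketch below substitutes a generic Student-t proposal for the neural-network candidate purely to show those mechanics; the toy bimodal target and all names are assumptions.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)

        def importance_estimate(log_target, prop_sample, prop_logpdf, n=10_000):
            """Self-normalized importance-sampling estimate of the target
            mean, with the effective sample size as a quality diagnostic."""
            x = prop_sample(n)
            logw = log_target(x) - prop_logpdf(x)
            w = np.exp(logw - logw.max())
            w /= w.sum()                          # self-normalized weights
            ess = 1.0 / np.sum(w ** 2)            # effective sample size
            return np.sum(w * x), ess

        # toy bimodal target (unnormalized is fine); wide Student-t proposal
        log_target = lambda x: np.logaddexp(stats.norm.logpdf(x, -2.0, 0.5),
                                            stats.norm.logpdf(x, 2.0, 0.8))
        mean, ess = importance_estimate(
            log_target,
            prop_sample=lambda n: stats.t.rvs(df=3, scale=3.0, size=n,
                                              random_state=rng),
            prop_logpdf=lambda x: stats.t.logpdf(x, df=3, scale=3.0))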

    Functional Approximations to Likelihoods/Posterior Densities: A Neural Network Approach to Efficient Sampling

    The performance of Monte Carlo integration methods like importance sampling or Markov chain Monte Carlo procedures depends greatly on the choice of the importance or candidate density. Such a density must typically be "close" to the target density to yield numerically accurate results with efficient sampling. Neural networks are natural importance or candidate densities since they have a universal approximation property and are easy to sample from. That is, conditional upon the specified neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly with auxiliary variables. We propose such a class of methods, a key step for which is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models that includes a mixture of normal distributions, a Bayesian instrumental-variable regression problem with weak instruments and near-identification, and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
    Keywords: Markov chain Monte Carlo; importance sampling; neural networks; Bayesian inference

    Information Anatomy of Stochastic Equilibria

    A stochastic nonlinear dynamical system generates information, as measured by its entropy rate. Some of this information (the ephemeral information) is dissipated, and some (the bound information) is actively stored and so affects future behavior. We derive analytic expressions for the ephemeral and bound informations in the limit of small time discretization for two classical systems that exhibit dynamical equilibria: first-order Langevin equations (i) where the drift is the gradient of a potential function and the diffusion matrix is invertible, and (ii) with a linear drift term (Ornstein-Uhlenbeck) but a noninvertible diffusion matrix. In both cases, the bound information is sensitive only to the drift, while the ephemeral information is sensitive only to the diffusion matrix and not to the drift. Notably, this information anatomy changes discontinuously as any of the diffusion coefficients vanishes, indicating that it is very sensitive to the noise structure. We then calculate the information anatomy of the stochastic cusp catastrophe and of particles diffusing in a heat bath in the overdamped limit, both examples of stochastic gradient descent on a potential landscape. Finally, we use our methods to calculate and compare approximations for the so-called time-local predictive information for adaptive agents.
    Comment: 35 pages, 3 figures, 1 table; http://csc.ucdavis.edu/~cmg/compmech/pubs/iase.ht
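
    For reference, the decomposition invoked here is the standard information anatomy of a stationary process: the entropy rate splits into an ephemeral component, random even given past and future, and a bound component, shared with the future but not determined by the past. In the notation of that literature (a reminder, not a result of this paper):

        h_\mu = r_\mu + b_\mu, \qquad
        r_\mu = H[X_0 \mid X_{-\infty:0},\, X_{1:\infty}], \qquad
        b_\mu = I[X_0 : X_{1:\infty} \mid X_{-\infty:0}]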

    Neural network based approximations to posterior densities: a class of flexible sampling methods with applications to reduced rank models

    Likelihoods and posteriors of econometric models with strong endogeneity and weak instruments may exhibit rather non-elliptical contours in the parameter space. This feature also holds for cointegration models when near non-stationarity occurs and determining the number of cointegrating relations is a nontrivial issue, and in mixture processes where the modes are relatively far apart. The performance of Monte Carlo integration methods like importance sampling or Markov chain Monte Carlo procedures greatly depends in all these cases on the choice of the importance or candidate density. Such a density has to be "close" to the target density in order to yield numerically accurate results with efficient sampling. Neural networks seem to be natural importance or candidate densities, as they have a universal approximation property and are easy to sample from. That is, conditionally upon the specification of the neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly using auxiliary variables. A key step in the proposed class of methods is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models which include a mixture of normal distributions, a Bayesian instrumental variable regression problem with weak instruments and near non-identification, a cointegration model with near non-stationarity, and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
    Keywords: Markov chain Monte Carlo; Bayesian inference; neural networks; importance sampling
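
    The same fitted candidate density can also drive an independence-chain Metropolis-Hastings sampler, the other use of the approximation these abstracts mention. Again a generic Python sketch: the Student-t proposal stands in for the fitted neural network, and all names are assumptions.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)

        def independence_mh(log_target, prop_sample, prop_logpdf, n=5000):
            """Independence-chain Metropolis-Hastings: proposals come from a
            fixed candidate density and are accepted with probability
            min(1, w(y)/w(x)), where w = target / candidate."""
            x = prop_sample()
            lw = log_target(x) - prop_logpdf(x)
            chain = np.empty(n)
            for t in range(n):
                y_prop = prop_sample()
                lw_prop = log_target(y_prop) - prop_logpdf(y_prop)
                if np.log(rng.random()) < lw_prop - lw:   # MH accept step
                    x, lw = y_prop, lw_prop
                chain[t] = x
            return chain

        # toy usage: normal target, Student-t candidate density
        chain = independence_mh(
            lambda x: stats.norm.logpdf(x, 1.0, 0.7),
            prop_sample=lambda: stats.t.rvs(df=3, scale=2.0, random_state=rng),
            prop_logpdf=lambda x: stats.t.logpdf(x, df=3, scale=2.0))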