Bayesian Structural Inference for Hidden Processes
We introduce a Bayesian approach to discovering patterns in structurally
complex processes. The proposed method of Bayesian Structural Inference (BSI)
relies on a set of candidate unifilar HMM (uHMM) topologies for inference of
process structure from a data series. We employ a recently developed exact
enumeration of topological epsilon-machines. (A sequel then removes the
topological restriction.) This subset of the uHMM topologies has the added
benefit that inferred models are guaranteed to be epsilon-machines,
irrespective of estimated transition probabilities. Properties of
epsilon-machines and uHMMs allow for the derivation of analytic expressions for
estimating transition probabilities, inferring start states, and comparing the
posterior probability of candidate model topologies, despite process internal
structure being only indirectly present in data. We demonstrate BSI's
effectiveness in estimating a process's randomness, as reflected by the Shannon
entropy rate, and its structure, as quantified by the statistical complexity.
We also compare using the posterior distribution over candidate models and the
single, maximum a posteriori model for point estimation and show that the
former more accurately reflects uncertainty in estimated values. We apply BSI
to in-class examples of finite- and infinite-order Markov processes, as well as
to an out-of-class, infinite-state hidden process.
Comment: 20 pages, 11 figures, 1 table; supplementary materials, 15 pages, 11 figures, 6 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/bsihp.ht
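The model-comparison step can be illustrated with a generic ingredient it shares with simpler Markov-model inference: under conjugate Dirichlet priors on each state's outgoing transition probabilities, the marginal likelihood of a candidate topology is available in closed form (Dirichlet-multinomial). A minimal sketch, not the paper's exact BSI expressions (which additionally handle start-state inference and uHMM structure); the counts and alpha below are illustrative:

```python
import math

def log_evidence(counts, alpha=1.0):
    """Closed-form log marginal likelihood of transition counts under a
    symmetric Dirichlet(alpha) prior on each state's outgoing
    transition distribution (Dirichlet-multinomial)."""
    total = 0.0
    for row in counts:                      # one row of counts per state
        k, n = len(row), sum(row)
        total += math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
        for c in row:
            total += math.lgamma(alpha + c) - math.lgamma(alpha)
    return total

# Illustrative counts for two candidate topologies fit to the same
# 100-symbol binary series: a 1-state (i.i.d.) model vs. a 2-state model.
counts_1state = [[60, 40]]
counts_2state = [[50, 10], [10, 30]]
log_post_odds = log_evidence(counts_2state) - log_evidence(counts_1state)
# Positive log odds favor the 2-state topology under equal model priors.
```

Comparing candidate topologies by their log evidence in this way automatically penalizes the extra parameters of larger topologies.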
Conjugate Bayes for probit regression via unified skew-normal distributions
Regression models for dichotomous data are ubiquitous in statistics. Besides
being useful for inference on binary responses, these methods serve also as
building blocks in more complex formulations, such as density regression,
nonparametric classification and graphical models. Within the Bayesian
framework, inference proceeds by updating the priors for the coefficients,
typically set to be Gaussians, with the likelihood induced by probit or logit
regressions for the responses. In this updating, the apparent absence of a
tractable posterior has motivated a variety of computational methods, including
Markov Chain Monte Carlo routines and algorithms which approximate the
posterior. Despite being routinely implemented, Markov Chain Monte Carlo
strategies face mixing or time-inefficiency issues in large p and small n
studies, whereas approximate routines fail to capture the skewness typically
observed in the posterior. This article proves that the posterior distribution
for the probit coefficients has a unified skew-normal kernel, under Gaussian
priors. Such a novel result allows efficient Bayesian inference for a wide
class of applications, especially in large p and small-to-moderate n studies
where state-of-the-art computational methods face notable issues. These
advances are outlined in a genetic study, and further motivate the development
of a wider class of conjugate priors for probit models along with methods to
obtain independent and identically distributed samples from the unified
skew-normal posterior.
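The object the result concerns can be written down directly: under a Gaussian prior, the probit posterior kernel is the prior density times a product of normal-CDF likelihood terms, and the paper's contribution is showing that this kernel is exactly unified skew-normal. A minimal sketch of that kernel, assuming a simple isotropic prior; the synthetic data and tau2 below are illustrative, not from the paper's genetic study:

```python
import math
import numpy as np

def log_phi(z):
    """Elementwise log of the standard normal CDF (naive; adequate away
    from the extreme left tail)."""
    return np.array([math.log(0.5 * math.erfc(-zi / math.sqrt(2.0))) for zi in z])

def probit_log_posterior(beta, X, y, tau2):
    """Unnormalized log posterior of probit coefficients under an
    isotropic N(0, tau2*I) Gaussian prior -- the kernel the paper
    proves to be a unified skew-normal density."""
    z = (2 * y - 1) * (X @ beta)             # sign-adjusted linear predictor
    log_lik = log_phi(z).sum()               # product of Phi terms -> sum of logs
    log_prior = -0.5 * (beta @ beta) / tau2  # Gaussian prior kernel (log)
    return log_lik + log_prior

# Illustrative synthetic binary-response data
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (X @ np.array([1.0, -0.5, 0.0]) + rng.normal(size=20) > 0).astype(int)
lp = probit_log_posterior(np.zeros(3), X, y, tau2=4.0)
```

At beta = 0 every Phi term equals 1/2 and the prior kernel is 0, so the log posterior reduces to n*log(1/2), a quick sanity check on the implementation.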
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are energy-based neural networks which
are commonly used as building blocks for deep neural architectures. In this
work, we derive a deterministic framework for the
training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
(TAP) mean-field approximation of widely-connected systems with weak
interactions coming from spin-glass theory. While the TAP approach has been
extensively studied for fully-visible binary spin systems, our construction is
generalized to latent-variable models, as well as to arbitrarily distributed
real-valued spin systems with bounded support. In our numerical experiments, we
demonstrate the effective deterministic training of our proposed models and are
able to show interesting features of unsupervised learning which could not be
directly observed with sampling. Additionally, we demonstrate how to utilize
our TAP-based framework for leveraging trained RBMs as joint priors in
denoising problems.
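The flavor of the deterministic iteration can be sketched for the simplest case: for a binary {0,1} RBM, second-order (TAP-style) mean-field equations update visible and hidden magnetizations by a damped fixed point, with an Onsager-type correction beyond naive mean field. This is an illustrative sketch under simplifying assumptions (binary units, small random weights), not the paper's generalized real-valued construction:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tap_fixed_point(W, a, b, iters=200, damp=0.5):
    """Damped fixed-point iteration of second-order (TAP-style) mean-field
    equations for a binary {0,1} RBM with weights W, visible biases a,
    hidden biases b. Returns visible and hidden magnetizations."""
    mv = np.full(a.shape, 0.5)
    mh = np.full(b.shape, 0.5)
    for _ in range(iters):
        var_h = mh * (1 - mh)            # hidden-unit variances
        mv_new = sigmoid(a + W @ mh - (mv - 0.5) * (W**2 @ var_h))
        var_v = mv_new * (1 - mv_new)    # visible-unit variances
        mh_new = sigmoid(b + W.T @ mv_new - (mh - 0.5) * ((W.T)**2 @ var_v))
        mv = damp * mv + (1 - damp) * mv_new
        mh = damp * mh + (1 - damp) * mh_new
    return mv, mh

# Illustrative small random RBM (weak interactions, where TAP is justified)
rng = np.random.default_rng(1)
W = 0.1 * rng.normal(size=(6, 4))
mv, mh = tap_fixed_point(W, 0.1 * rng.normal(size=6), 0.1 * rng.normal(size=4))
```

The damping keeps the iteration stable; in the weak-interaction regime the fixed point is the deterministic stand-in for Gibbs-sampled statistics.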
Functional approximations to posterior densities: a neural network approach to efficient sampling
The performance of Monte Carlo integration methods like importance sampling or Markov Chain Monte Carlo procedures greatly depends on the choice of the importance or candidate density. Usually, such a density has to be "close" to the target density in order to yield numerically accurate results with efficient sampling. Neural networks seem to be natural importance or candidate densities, as they have a universal approximation property and are easy to sample from. That is, conditional upon the specification of the neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly using auxiliary variables. A key step in the proposed class of methods is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models which include a mixture of normal distributions, a Bayesian instrumental variable regression problem with weak instruments and near-identification, and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
Keywords: Markov chain Monte Carlo; Bayesian inference; importance sampling; neural networks
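The mechanics described above can be sketched with a stand-in candidate density: the paper fits a neural network to the target, but a plain Gaussian candidate already shows how importance weights and the resulting estimate are formed. The bimodal target below is illustrative, not one of the paper's test models:

```python
import numpy as np

def log_target(x):
    """Unnormalized bimodal target: equal mixture of N(-2, 0.5^2) and
    N(2, 0.5^2), constants dropped (harmless after weight normalization)."""
    return np.logaddexp(-0.5 * ((x + 2) / 0.5) ** 2,
                        -0.5 * ((x - 2) / 0.5) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 3.0, size=50_000)            # candidate density N(0, 3^2)
logw = log_target(x) - (-0.5 * (x / 3.0) ** 2)   # log importance weights p/q
w = np.exp(logw - logw.max())                    # stabilize before exponentiating
w /= w.sum()                                     # self-normalized weights
mean_est = np.sum(w * x)                         # IS estimate of E[x] (true: 0)
ess = 1.0 / np.sum(w**2)                         # effective sample size
```

The effective sample size is the usual diagnostic of how "close" the candidate is to the target; a poorly matched candidate collapses it, which is exactly the problem the neural-network construction addresses.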
Functional Approximations to Likelihoods/Posterior Densities: A Neural Network Approach to Efficient Sampling
The performance of Monte Carlo integration methods like importance-sampling or Markov-Chain Monte-Carlo procedures depends greatly on the choice of the importance- or candidate-density. Such a density must typically be "close" to the target density to yield numerically accurate results with efficient sampling. Neural networks are natural importance- or candidate-densities since they have a universal approximation property and are easy to sample from. That is, conditional upon the specified neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly with auxiliary variables. We propose such a class of methods, a key step for which is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models that includes a mixture of normal distributions, a Bayesian instrumental-variable regression problem with weak instruments and near-identification, and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
Keywords: Markov chain Monte Carlo; importance sampling; neural networks; Bayesian inference
Information Anatomy of Stochastic Equilibria
A stochastic nonlinear dynamical system generates information, as measured by
its entropy rate. Some---the ephemeral information---is dissipated and
some---the bound information---is actively stored and so affects future
behavior. We derive analytic expressions for the ephemeral and bound
informations in the limit of small-time discretization for two classical
systems that exhibit dynamical equilibria: first-order Langevin equations (i)
where the drift is the gradient of a potential function and the diffusion
matrix is invertible and (ii) with a linear drift term (Ornstein-Uhlenbeck) but
a noninvertible diffusion matrix. In both cases, the bound information is
sensitive only to the drift, while the ephemeral information is sensitive only
to the diffusion matrix and not to the drift. Notably, this information anatomy
changes discontinuously as any of the diffusion coefficients vanishes,
indicating that it is very sensitive to the noise structure. We then calculate
the information anatomy of the stochastic cusp catastrophe and of particles
diffusing in a heat bath in the overdamped limit, both examples of stochastic
gradient descent on a potential landscape. Finally, we use our methods to
calculate and compare approximations for the so-called time-local predictive
information for adaptive agents.
Comment: 35 pages, 3 figures, 1 table; http://csc.ucdavis.edu/~cmg/compmech/pubs/iase.ht
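The first class of systems, a first-order Langevin equation whose drift is the gradient of a potential, can be simulated directly. A minimal Euler-Maruyama sketch on a double-well (cusp-like) potential; it does not compute the paper's small-time-discretization information quantities, only the kind of stochastic dynamical equilibrium they are derived for (parameters illustrative):

```python
import numpy as np

def euler_maruyama(grad_V, x0, dt, n_steps, D, rng):
    """Simulate the overdamped Langevin equation
    dx = -grad_V(x) dt + sqrt(2 D) dW by the Euler-Maruyama scheme."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = rng.normal(size=n_steps)
    for t in range(n_steps):
        x[t + 1] = x[t] - grad_V(x[t]) * dt + np.sqrt(2 * D * dt) * noise[t]
    return x

# Double-well potential V(x) = x^4/4 - x^2/2, so grad_V(x) = x^3 - x;
# the trajectory hops stochastically between the minima at x = +/-1.
path = euler_maruyama(lambda x: x**3 - x, x0=1.0, dt=1e-3,
                      n_steps=20_000, D=0.1, rng=np.random.default_rng(2))
```

The ephemeral/bound decomposition in the paper is taken in the limit of small dt of exactly such discretized trajectories, which is why the discretization step appears explicitly in the analytic expressions.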
Neural network based approximations to posterior densities: a class of flexible sampling methods with applications to reduced rank models
Likelihoods and posteriors of econometric models with strong endogeneity and weak instruments may exhibit rather non-elliptical contours in the parameter space. This feature also holds for cointegration models when near non-stationarity occurs and determining the number of cointegrating relations is a nontrivial issue, and in mixture processes where the modes are relatively far apart. The performance of Monte Carlo integration methods like importance sampling or Markov Chain Monte Carlo procedures greatly depends in all these cases on the choice of the importance or candidate density. Such a density has to be `close' to the target density in order to yield numerically accurate results with efficient sampling. Neural networks seem to be natural importance or candidate densities, as they have a universal approximation property and are easy to sample from. That is, conditionally upon the specification of the neural network, sampling can be done either directly or using a Gibbs sampling technique, possibly using auxiliary variables. A key step in the proposed class of methods is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models which include a mixture of normal distributions, a Bayesian instrumental variable regression problem with weak instruments and near non-identification, a cointegration model with near non-stationarity and a two-regime growth model for US recessions and expansions. These examples involve experiments with non-standard, non-elliptical posterior distributions. The results indicate the feasibility of the neural network approach.
Keywords: Markov chain Monte Carlo; Bayesian inference; neural networks; importance sampling