Sparse Nonparametric Graphical Models
We present some nonparametric methods for graphical modeling. In the discrete
case, where the data are binary or drawn from a finite alphabet, Markov random
fields are already essentially nonparametric, since the cliques can take only a
finite number of values. Continuous data are different. The Gaussian graphical
model is the standard parametric model for continuous data, but it makes
distributional assumptions that are often unrealistic. We discuss two
approaches to building more flexible graphical models. One allows arbitrary
graphs and a nonparametric extension of the Gaussian; the other uses kernel
density estimation and restricts the graphs to trees and forests. Examples of
both methods are presented. We also discuss possible future research directions
for nonparametric graphical modeling.
Comment: Published at http://dx.doi.org/10.1214/12-STS391 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Gaussian Process Structural Equation Models with Latent Variables
In a variety of disciplines such as social sciences, psychology, medicine and
economics, the recorded data are considered to be noisy measurements of latent
variables connected by some causal structure. This corresponds to a family of
graphical models known as the structural equation model with latent variables.
While linear non-Gaussian variants have been well-studied, inference in
nonparametric structural equation models is still underdeveloped. We introduce
a sparse Gaussian process parameterization that defines a non-linear structure
connecting latent variables, unlike common formulations of Gaussian process
latent variable models. The sparse parameterization is given a full Bayesian
treatment without compromising Markov chain Monte Carlo efficiency. We compare
the stability of the sampling procedure and the predictive ability of the model
against the current practice.
Comment: 12 pages, 6 figures
Bayesian nonparametric sparse VAR models
High dimensional vector autoregressive (VAR) models require a large number of
parameters to be estimated and may suffer from inferential problems. We propose a
new Bayesian nonparametric (BNP) Lasso prior (BNP-Lasso) for high-dimensional
VAR models that can improve estimation efficiency and prediction accuracy. Our
hierarchical prior overcomes overparametrization and overfitting issues by
clustering the VAR coefficients into groups and by shrinking the coefficients
of each group toward a common location. Clustering and shrinking effects
induced by the BNP-Lasso prior are well suited for the extraction of causal
networks from time series, since they account for some stylized facts in
real-world networks, namely sparsity, community structures and
heterogeneity in edge intensity. In order to fully capture the richness of
the data and to achieve a better understanding of financial and macroeconomic
risk, it is therefore crucial that the model used to extract the network accounts
for these stylized facts.
Comment: Forthcoming in "Journal of Econometrics" ---- Revised Version of the
paper "Bayesian nonparametric Seemingly Unrelated Regression Models" ----
Supplementary Material available on request
Learning the Structure of Deep Sparse Graphical Models
Deep belief networks are a powerful way to model complex probability
distributions. However, learning the structure of a belief network,
particularly one with hidden units, is difficult. The Indian buffet process has
been used as a nonparametric Bayesian prior on the directed structure of a
belief network with a single infinitely wide hidden layer. In this paper, we
introduce the cascading Indian buffet process (CIBP), which provides a
nonparametric prior on the structure of a layered, directed belief network that
is unbounded in both depth and width, yet allows tractable inference. We use
the CIBP prior with the nonlinear Gaussian belief network so each unit can
additionally vary its behavior between discrete and continuous representations.
We provide Markov chain Monte Carlo algorithms for inference in these belief
networks and explore the structures learned on several image data sets.
Comment: 20 pages, 6 figures, AISTATS 2010, Revise
Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least a known superset of adjustment variables: we call the procedure
S-mint regression. We easily derive that it achieves the same convergence rate as
nonparametric regression: for example, single variable intervention effects
can be estimated with convergence rate $n^{-2/5}$ assuming smoothness with
twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler.
Comment: 40 pages, 14 figures
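The marginal integration idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' S-mint implementation: the toy SEM, the Nadaraya-Watson smoother, the bandwidth, and all function names (`simulate`, `nw_regression`, `marginal_integration`) are assumptions chosen for illustration. The sketch fits a regression surface for E[Y | X = x, S = s] and then averages it over the observed values of the adjustment variable S to approximate E[Y | do(X = x)].

```python
import math
import random


def simulate(n, seed=0):
    """Draw n samples (x, s, y) from a hypothetical toy SEM:
    S ~ N(0, 1), X = 0.5*S + N(0, 1), Y = sin(X) + S + 0.1*N(0, 1).
    Under this SEM the true intervention effect is E[Y | do(X=x)] = sin(x)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        x = 0.5 * s + rng.gauss(0.0, 1.0)
        y = math.sin(x) + s + 0.1 * rng.gauss(0.0, 1.0)
        data.append((x, s, y))
    return data


def nw_regression(data, x, s, h):
    """Nadaraya-Watson estimate of E[Y | X=x, S=s] with a Gaussian
    product kernel and a single bandwidth h (an illustrative choice)."""
    num = 0.0
    den = 0.0
    for xj, sj, yj in data:
        w = math.exp(-0.5 * (((x - xj) / h) ** 2 + ((s - sj) / h) ** 2))
        num += w * yj
        den += w
    if den == 0.0:
        raise ValueError("query point is too far from all observations")
    return num / den


def marginal_integration(data, x, h=0.4):
    """Average the fitted surface over the empirical distribution of S,
    holding X fixed at x."""
    total = sum(nw_regression(data, x, s_i, h) for _, s_i, _ in data)
    return total / len(data)


if __name__ == "__main__":
    sample = simulate(400, seed=1)
    for x0 in (-1.0, 0.0, 1.0):
        print(x0, marginal_integration(sample, x0), math.sin(x0))
```

Averaging over the observed S values, rather than conditioning on S, is what separates the intervention effect from the plain regression of Y on X in this toy setting. The kernel smoother is only one possible choice for the first-stage fit.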
Sparse covariance estimation in heterogeneous samples
Standard Gaussian graphical models (GGMs) implicitly assume that the
conditional independence among variables is common to all observations in the
sample. However, in practice, observations are usually collected from
heterogeneous populations where such an assumption is not satisfied, leading in
turn to nonlinear relationships among variables. To tackle these problems we
explore mixtures of GGMs; in particular, we consider both infinite mixture
models of GGMs and infinite hidden Markov models with GGM emission
distributions. Such models allow us to divide a heterogeneous population into
homogeneous groups, with each cluster having its own conditional independence
structure. The main advantage of considering infinite mixtures is that they
allow us to easily estimate the number of subpopulations in the
sample. As an illustration, we study the trends in exchange rate fluctuations
in the pre-Euro era. This example demonstrates that the models are very
flexible while providing extremely interesting insights into
real-life applications.