Learning an L1-regularized Gaussian Bayesian Network in the Equivalence Class Space
Learning the structure of a graphical model from data is a common task in a wide range of practical applications. In this paper, we focus on Gaussian Bayesian networks, i.e., on continuous data and directed acyclic graphs with a joint probability density of all variables given by a Gaussian. We propose to work in an equivalence class search space, specifically using the k-greedy equivalence search algorithm. This, combined with regularization techniques to guide the structure search, can learn sparse networks close to the one that generated the data. We provide results on some synthetic networks and on modeling the gene network of the two biological pathways regulating the biosynthesis of isoprenoids for the Arabidopsis thaliana plan
Variational Downscaling, Fusion and Assimilation of Hydrometeorological States via Regularized Estimation
Improved estimation of hydrometeorological states from down-sampled
observations and background model forecasts in a noisy environment has been a
subject of growing research in the past decades. Here, we introduce a unified
framework that ties together the problems of downscaling, data fusion and data
assimilation as ill-posed inverse problems. This framework seeks solutions
beyond the classic least squares estimation paradigms by imposing proper
regularization, that is, constraints consistent with the degree of smoothness
and probabilistic structure of the underlying state. We review relevant
regularization methods in derivative space and extend classic formulations of
the aforementioned problems with particular emphasis on hydrologic and
atmospheric applications. Informed by the statistical characteristics of the
state variable of interest, the central results of the paper suggest that
proper regularization can lead to a more accurate and stable recovery of the
true state and hence more skillful forecasts. In particular, using the Tikhonov
and Huber regularization in the derivative space, the promise of the proposed
framework is demonstrated in static downscaling and fusion of synthetic
multi-sensor precipitation data, while a data assimilation numerical experiment
is presented using the heat equation in a variational setting
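As a rough illustration of the Tikhonov regularization in derivative space mentioned in this abstract, the sketch below downscales a coarse 1-D signal by minimizing ||y - Hx||^2 + lam * ||Lx||^2, where L is a first-difference operator. It is a minimal sketch under stated assumptions, not the authors' implementation: the block-averaging observation operator, the 1-D setting, and all function names (`first_difference`, `averaging_operator`, `tikhonov_downscale`) are hypothetical choices made for this example.

```python
import numpy as np


def first_difference(n):
    """Return the (n-1) x n first-order difference operator (assumed choice of L)."""
    return np.diff(np.eye(n), axis=0)


def averaging_operator(n, factor):
    """Return an (n // factor) x n operator that averages blocks of `factor` fine cells.

    This stands in for the down-sampling step; the real sensor model may differ.
    """
    m = n // factor
    H = np.zeros((m, n))
    for i in range(m):
        H[i, i * factor:(i + 1) * factor] = 1.0 / factor
    return H


def tikhonov_downscale(y, H, L, lam):
    """Solve min_x ||y - H x||^2 + lam * ||L x||^2 through the normal equations."""
    A = H.T @ H + lam * (L.T @ L)
    return np.linalg.solve(A, H.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, factor = 64, 4
    x_true = np.sin(np.linspace(0.0, 2.0 * np.pi, n))  # synthetic smooth state
    H = averaging_operator(n, factor)
    y = H @ x_true + 0.01 * rng.standard_normal(n // factor)  # noisy coarse data
    x_hat = tikhonov_downscale(y, H, first_difference(n), lam=0.1)
    print("RMSE:", np.sqrt(np.mean((x_hat - x_true) ** 2)))
```

The penalty on Lx favours smooth fine-scale fields. The Huber regularization named in the abstract would replace the squared penalty with one that is quadratic for small differences and linear for large ones, which requires an iterative solver rather than a single linear solve.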
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments. Comment: 38 pages, 6 figures, 2 tables, applications in topic models and
Bayesian networks are studied. Simulation section is adde
Sparse Linear Identifiable Multivariate Modeling
In this paper we consider sparse and identifiable linear latent variable
(factor) and linear Bayesian network models for parsimonious analysis of
multivariate data. We propose a computationally efficient method for joint
parameter and model inference, and model comparison. It consists of a fully
Bayesian hierarchy for sparse models using slab and spike priors (two-component
delta-function and continuous mixtures), non-Gaussian latent factors and a
stochastic search over the ordering of the variables. The framework, which we
call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and
bench-marked on artificial and real biological data sets. SLIM is closest in
spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in
inference, Bayesian network structure learning and model comparison.
Experimentally, SLIM performs as well as or better than LiNGAM with
comparable computational complexity. We attribute this mainly to the stochastic
search strategy used, and to parsimony (sparsity and identifiability), which is
an explicit part of the model. We propose two extensions to the basic i.i.d.
linear framework: non-linear dependence on observed variables, called SNIM
(Sparse Non-linear Identifiable Multivariate modeling) and allowing for
correlations between latent variables, called CSLIM (Correlated SLIM), for
temporal and/or spatial data. The source code and scripts are available from
http://cogsys.imm.dtu.dk/slim/. Comment: 45 pages, 17 figure
Bayes-optimal Learning of Deep Random Networks of Extensive-width
We consider the problem of learning a target function corresponding to a
deep, extensive-width, non-linear neural network with random Gaussian weights.
We consider the asymptotic limit where the number of samples, the input
dimension and the network width are proportionally large. We propose a
closed-form expression for the Bayes-optimal test error, for regression and
classification tasks. We further compute closed-form expressions for the test
errors of ridge regression, kernel and random features regression. We find, in
particular, that optimally regularized ridge regression, as well as kernel
regression, achieve Bayes-optimal performances, while the logistic loss yields
a near-optimal test error for classification. We further show numerically that
when the number of samples grows faster than the dimension, ridge and kernel
methods become suboptimal, while neural networks achieve test error close to
zero from quadratically many samples
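For readers unfamiliar with the ridge baseline discussed in this abstract, the following minimal sketch shows closed-form ridge regression with the regularization strength picked on a held-out set. It only illustrates what "optimally regularized" can mean in practice; it does not reproduce the paper's asymptotic analysis, and the function names (`ridge_fit`, `select_lambda`) and the validation-grid procedure are assumptions made for this example.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimator: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def select_lambda(X_train, y_train, X_val, y_val, grid):
    """Return the value in `grid` with the lowest validation mean squared error."""
    errors = []
    for lam in grid:
        w = ridge_fit(X_train, y_train, lam)
        errors.append(np.mean((X_val @ w - y_val) ** 2))
    return grid[int(np.argmin(errors))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 400, 50
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.5 * rng.standard_normal(n)  # noisy linear target
    grid = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
    best = select_lambda(X[:300], y[:300], X[300:], y[300:], grid)
    print("selected lambda:", best)
```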