3,053 research outputs found
Robust causal structure learning with some hidden variables
We introduce a new method to estimate the Markov equivalence class of a
directed acyclic graph (DAG) in the presence of hidden variables, in settings
where the underlying DAG among the observed variables is sparse, and there are
a few hidden variables that have a direct effect on many of the observed ones.
Building on the so-called low rank plus sparse framework, we suggest a
two-stage approach which first removes the effect of the hidden variables, and
then estimates the Markov equivalence class of the underlying DAG under the
assumption that there are no remaining hidden variables. This approach is
consistent in certain high-dimensional regimes and performs favourably when
compared to the state of the art, both in terms of graphical structure recovery
and total causal effect estimation
Sparse Linear Identifiable Multivariate Modeling
In this paper we consider sparse and identifiable linear latent variable
(factor) and linear Bayesian network models for parsimonious analysis of
multivariate data. We propose a computationally efficient method for joint
parameter and model inference, and model comparison. It consists of a fully
Bayesian hierarchy for sparse models using slab and spike priors (two-component
delta-function and continuous mixtures), non-Gaussian latent factors and a
stochastic search over the ordering of the variables. The framework, which we
call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and
bench-marked on artificial and real biological data sets. SLIM is closest in
spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in
inference, Bayesian network structure learning and model comparison.
Experimentally, SLIM performs equally well or better than LiNGAM with
comparable computational complexity. We attribute this mainly to the stochastic
search strategy used, and to parsimony (sparsity and identifiability), which is
an explicit part of the model. We propose two extensions to the basic i.i.d.
linear framework: non-linear dependence on observed variables, called SNIM
(Sparse Non-linear Identifiable Multivariate modeling) and allowing for
correlations between latent variables, called CSLIM (Correlated SLIM), for the
temporal and/or spatial data. The source code and scripts are available from
http://cogsys.imm.dtu.dk/slim/.Comment: 45 pages, 17 figure
Deep Markov Random Field for Image Modeling
Markov Random Fields (MRFs), a formulation widely used in generative image
modeling, have long been plagued by the lack of expressive power. This issue is
primarily due to the fact that conventional MRFs formulations tend to use
simplistic factors to capture local patterns. In this paper, we move beyond
such limitations, and propose a novel MRF model that uses fully-connected
neurons to express the complex interactions among pixels. Through theoretical
analysis, we reveal an inherent connection between this model and recurrent
neural networks, and thereon derive an approximated feed-forward network that
couples multiple RNNs along opposite directions. This formulation combines the
expressive power of deep neural networks and the cyclic dependency structure of
MRF in a unified model, bringing the modeling capability to a new level. The
feed-forward approximation also allows it to be efficiently learned from data.
Experimental results on a variety of low-level vision tasks show notable
improvement over state-of-the-arts.Comment: Accepted at ECCV 201
Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs
Directed acyclic graphs (DAGs) are commonly used to represent causal
relationships among random variables in graphical models. Applications of these
models arise in the study of physical, as well as biological systems, where
directed edges between nodes represent the influence of components of the
system on each other. The general problem of estimating DAGs from observed data
is computationally NP-hard, Moreover two directed graphs may be observationally
equivalent. When the nodes exhibit a natural ordering, the problem of
estimating directed graphs reduces to the problem of estimating the structure
of the network. In this paper, we propose a penalized likelihood approach that
directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso
penalties are considered and an efficient algorithm is proposed for estimation
of high dimensional DAGs. We study variable selection consistency of the two
penalties when the number of variables grows to infinity with the sample size.
We show that although lasso can only consistently estimate the true network
under stringent assumptions, adaptive lasso achieves this task under mild
regularity conditions. The performance of the proposed methods is compared to
alternative methods in simulated, as well as real, data examples.Comment: 19 pages, 8 figure
Mixed Cumulative Distribution Networks
Directed acyclic graphs (DAGs) are a popular framework to express
multivariate probability distributions. Acyclic directed mixed graphs (ADMGs)
are generalizations of DAGs that can succinctly capture much richer sets of
conditional independencies, and are especially useful in modeling the effects
of latent variables implicitly. Unfortunately there are currently no good
parameterizations of general ADMGs. In this paper, we apply recent work on
cumulative distribution networks and copulas to propose one one general
construction for ADMG models. We consider a simple parameter estimation
approach, and report some encouraging experimental results.Comment: 11 pages, 4 figure
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments.Comment: 38 pages, 6 figures, 2 tables, applications in topic models and
Bayesian networks are studied. Simulation section is adde
- …