Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least known superset of adjustment variables: we call the procedure
S-mint regression. We easily derive that it achieves the same convergence rate as
nonparametric regression: for example, single variable intervention effects
can be estimated with convergence rate $n^{-2/5}$, assuming smoothness with
twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler. Comment: 40 pages, 14 figures
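The marginal integration idea in the abstract above can be illustrated with a small simulation. The sketch below is not the authors' S-mint estimator: it swaps in a plain Nadaraya-Watson regression, assumes a single known adjustment variable s, and uses a hypothetical data-generating model and bandwidth chosen only for illustration. It fits m(x, s) = E[Y | X = x, S = s] and averages m(x0, S_i) over the observed S_i to approximate E[Y | do(X = x0)].

```python
import numpy as np


def nw_regression(train_x, train_y, query_x, bandwidth):
    """Nadaraya-Watson regression with a product Gaussian kernel."""
    # Scaled pairwise differences, shape (n_query, n_train, n_features).
    diff = (query_x[:, None, :] - train_x[None, :, :]) / bandwidth
    weights = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    return weights @ train_y / weights.sum(axis=1)


def marginal_integration(x, s, y, x_grid, bandwidth=0.3):
    """Estimate E[Y | do(X = x0)] by averaging m(x0, S_i) over the observed S_i."""
    train = np.column_stack([x, s])
    effects = []
    for x0 in x_grid:
        # Hold X fixed at x0 and keep each unit's observed adjustment value.
        query = np.column_stack([np.full(len(x), x0), s])
        effects.append(nw_regression(train, y, query, bandwidth).mean())
    return np.array(effects)


# Hypothetical SEM (an assumption for this sketch): S -> X, S -> Y, X -> Y.
rng = np.random.default_rng(0)
n = 1000
s = rng.normal(size=n)                        # adjustment variable (confounder)
x = 0.5 * s + rng.normal(size=n)              # intervention variable depends on s
y = np.sin(x) + s + 0.1 * rng.normal(size=n)  # response

x_grid = np.array([-1.0, 0.0, 1.0])
truth = np.sin(x_grid)                        # E[Y | do(X = x0)] = sin(x0), since E[S] = 0
adjusted = marginal_integration(x, s, y, x_grid)
naive = nw_regression(x[:, None], y, x_grid[:, None], 0.3)  # ignores s
```

In this simulated setting the naive regression of y on x absorbs part of the effect of s, whereas the adjusted curve targets sin(x0); comparing `adjusted` and `naive` against `truth` shows the difference.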
Beyond Covariation: Cues to Causal Structure
Causal induction has two components: learning about the structure of causal models and learning about causal strength and other quantitative parameters. This chapter argues for several interconnected theses. First, people represent causal knowledge qualitatively, in terms of causal structure; quantitative knowledge is derivative. Second, people use a variety of cues to infer causal structure aside from statistical data (e.g., temporal order, intervention, coherence with prior knowledge). Third, once a structural model is hypothesized, subsequent statistical data are used to confirm, refute, or elaborate the model. Fourth, people are limited in the number and complexity of causal models that they can hold in mind to test, but they can separately learn and then integrate simple models, and revise models by adding and removing single links. Finally, current computational models of learning need further development before they can be applied to human learning.
Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide
applications, ranging from genomics to the social sciences. Nowadays datasets
often have upwards of thousands---sometimes tens or hundreds of thousands---of
variables and far fewer samples. To meet this challenge, we have developed a
new R package called sparsebn for learning the structure of large, sparse
graphical models with a focus on Bayesian networks. While there are many
existing software packages for this task, this package focuses on the unique
setting of learning large networks from high-dimensional data, possibly with
interventions. As such, the methods provided place a premium on scalability and
consistency in a high-dimensional setting. Furthermore, in the presence of
interventions, the methods implemented here achieve the goal of learning a
causal network from data. Additionally, the sparsebn package is fully
compatible with existing software packages for network analysis. Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figures
Penalized Estimation of Directed Acyclic Graphs From Discrete Data
Bayesian networks, with structure given by a directed acyclic graph (DAG),
are a popular class of graphical models. However, learning Bayesian networks
from discrete or categorical data is particularly challenging, due to the large
parameter space and the difficulty in searching for a sparse structure. In this
article, we develop a maximum penalized likelihood method to tackle this
problem. Instead of the commonly used multinomial distribution, we model the
conditional distribution of a node given its parents by multi-logit regression,
in which an edge is parameterized by a set of coefficient vectors with dummy
variables encoding the levels of a node. To obtain a sparse DAG, a group norm
penalty is employed, and a blockwise coordinate descent algorithm is developed
to maximize the penalized likelihood subject to the acyclicity constraint of a
DAG. When interventional data are available, our method constructs a causal
network, in which a directed edge represents a causal relation. We apply our
method to various simulated and real data sets. The results show that our
method is very competitive, compared to many existing methods, in DAG
estimation from both interventional and high-dimensional observational data. Comment: To appear in Statistics and Computing
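To illustrate how a group norm penalty can remove a whole edge at once, the sketch below shows group soft-thresholding, a standard building block for group-penalized estimation. It is an illustration only and not necessarily the update used in the authors' blockwise coordinate descent; the function name and the example coefficient block are hypothetical. Each edge is represented by a block of coefficients, and the operator either shrinks the block or sets it entirely to zero.

```python
import numpy as np


def group_soft_threshold(block, lam):
    """Shrink a coefficient block toward zero; zero it out if its norm is at most lam."""
    norm = np.linalg.norm(block)  # Euclidean (Frobenius) norm of the whole block
    if norm <= lam:
        # The entire block vanishes, which corresponds to deleting the edge.
        return np.zeros_like(block)
    return (1.0 - lam / norm) * block


# Hypothetical coefficient block for one candidate edge (norm = 5).
edge_block = np.array([[3.0, 4.0]])
shrunk = group_soft_threshold(edge_block, lam=1.0)   # scaled by 1 - 1/5
removed = group_soft_threshold(edge_block, lam=6.0)  # norm below lam, block set to zero
```

Because the threshold acts on the norm of the whole block, all coefficients tied to an edge are dropped together, which is what yields sparsity at the level of edges rather than individual coefficients.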
Time as a guide to cause
How do people learn causal structure? In two studies we investigated
the interplay between temporal order, intervention and covariational cues. In
Study 1 temporal order overrode covariation information, leading to spurious
causal inferences when the temporal cues were misleading. In Study 2 both
temporal order and intervention contributed to accurate causal inference, well
beyond that achievable through covariational data alone. Together the studies
show that people use both temporal order and interventional cues to infer
causal structure, and that these cues dominate the available statistical
information. We endorse a hypothesis-driven account of learning, whereby
people use cues such as temporal order to generate initial models, and then
test these models against the incoming covariational data.