18 research outputs found
Counting and Sampling from Markov Equivalent DAGs Using Clique Trees
A directed acyclic graph (DAG) is the most common graphical model for
representing causal relationships among a set of variables. When restricted to
using only observational data, the structure of the ground truth DAG is
identifiable only up to Markov equivalence, based on conditional independence
relations among the variables. Therefore, the number of DAGs equivalent to the
ground truth DAG is an indicator of the causal complexity of the underlying
structure; roughly speaking, it shows how many interventions or how much
additional information is further needed to recover the underlying DAG. In this
paper, we propose a new technique for counting the number of DAGs in a Markov
equivalence class. Our approach is based on the clique tree representation of
chordal graphs. We show that in the case of bounded degree graphs, the proposed
algorithm is polynomial time. We further demonstrate that this technique can be
utilized for uniform sampling from a Markov equivalence class, which provides a
stochastic way to enumerate DAGs in the equivalence class and may be needed for
finding the best DAG or for causal inference given the equivalence class as
input. We also extend our counting and sampling method to the case where prior
knowledge about the underlying DAG is available, and present applications of
this extension in causal experiment design and estimating the causal effect of
joint interventions.
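To make the counted quantity concrete, the following is a minimal brute-force sketch, not the clique tree algorithm the abstract proposes. It assumes the standard characterization that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures, and it enumerates every orientation of the skeleton, so it is exponential in the number of edges and only suitable for tiny graphs. The function names (`is_acyclic`, `v_structures`, `count_mec`) are hypothetical and chosen for illustration.

```python
from itertools import product


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == v:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)


def v_structures(edges):
    """Return the set of colliders a -> c <- b with a and b non-adjacent."""
    edge_set = set(edges)

    def adjacent(x, y):
        return (x, y) in edge_set or (y, x) in edge_set

    found = set()
    for a, c in edges:
        for b, c2 in edges:
            if c2 == c and a < b and not adjacent(a, b):
                found.add((a, c, b))
    return found


def count_mec(nodes, edges):
    """Count DAGs with the same skeleton and v-structures as the input DAG.

    Brute force over all 2^|E| orientations (illustrative assumption only;
    the abstract's method avoids this enumeration).
    """
    target = v_structures(edges)
    count = 0
    for flips in product([False, True], repeat=len(edges)):
        oriented = [(b, a) if flip else (a, b)
                    for (a, b), flip in zip(edges, flips)]
        if is_acyclic(nodes, oriented) and v_structures(oriented) == target:
            count += 1
    return count
```

For example, the chain 0 -> 1 -> 2 has three equivalent DAGs (the two chains and the fork), while the collider 0 -> 1 <- 2 is alone in its class, which matches the intuition that a larger class leaves more to be resolved by interventions.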
Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach
We consider the task of identifying and estimating a parameter of interest in
settings where data is missing not at random (MNAR). In general, such
parameters are not identified without strong assumptions on the missing data
model. In this paper, we take an alternative approach and introduce a method
inspired by data fusion, where information in an MNAR dataset is augmented by
information in an auxiliary dataset subject to missingness at random (MAR). We
show that even if the parameter of interest cannot be identified given either
dataset alone, it can be identified given pooled data, under two complementary
sets of assumptions. We derive an inverse probability weighted (IPW) estimator
for identified parameters, and evaluate the performance of our estimation
strategies via simulation studies. Comment: 21 pages, 4 figures
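As background for the IPW idea mentioned above, here is a minimal sketch of the textbook inverse probability weighted mean under MAR with a single discrete covariate. It is not the data fusion estimator derived in the paper; the record layout `(x, r, y)` and the function name `ipw_mean` are assumptions made for illustration, and the observation probabilities are estimated by simple stratum frequencies.

```python
def ipw_mean(records):
    """Estimate E[Y] from records (x, r, y) under MAR given discrete x.

    x: fully observed discrete covariate
    r: 1 if y is observed, 0 if y is missing
    y: outcome value, or None when r == 0
    Assumes every stratum of x contains at least one observed outcome
    (positivity); otherwise that stratum contributes nothing.
    """
    totals, observed = {}, {}
    for x, r, _ in records:
        totals[x] = totals.get(x, 0) + 1
        observed[x] = observed.get(x, 0) + r
    # Estimated probability of observing y within each stratum of x.
    propensity = {x: observed[x] / totals[x] for x in totals}
    # Each observed outcome is weighted by the inverse of that probability.
    weighted_sum = sum(y / propensity[x] for x, r, y in records if r == 1)
    return weighted_sum / len(records)
```

With records `[(0, 1, 1.0), (0, 1, 3.0), (1, 1, 10.0), (1, 0, None)]`, the observed outcome in stratum `x = 1` receives weight 2 because only half of that stratum is observed, so the estimate is 6.0 rather than the complete-case average of about 4.67.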
Partial Identification of Causal Effects Using Proxy Variables
Proximal causal inference is a recently proposed framework for evaluating the
causal effect of a treatment on an outcome variable in the presence of
unmeasured confounding (Miao et al., 2018a; Tchetgen Tchetgen et al., 2020).
For nonparametric point identification, the framework leverages proxy variables
of unobserved confounders, provided that such proxies are sufficiently relevant
for the latter, a requirement that has previously been formalized as a
completeness condition. Completeness is key to connecting the observed proxy
data to hidden factors via a so-called confounding bridge function,
identification of which is an important step towards proxy-based point
identification of causal effects. However, completeness is well known not to be
empirically testable, which potentially restricts the application of the
proximal causal framework. In this paper, we propose partial identification
methods that do not require completeness and obviate the need for
identification of a bridge function. That is, we establish that proxies of
unobserved confounders can be leveraged to obtain bounds on the causal effect
of the treatment on the outcome even if available information does not suffice
to identify either a bridge function or a corresponding causal effect of
interest. We further establish analogous partial identification results in
related settings where identification hinges upon hidden mediators for which
proxies are available but are not sufficiently rich for point
identification of a bridge function or a corresponding causal effect of
interest.