Pragmatic Causal Inference

Abstract

Data-driven causal inference from real-world multivariate systems can be biased for a number of reasons. These include unmeasured confounding, systematic censoring of observations, data dependence induced by a network of unit interactions, and misspecification of parametric models. This dissertation proposes statistical methods spanning three major steps of the causal inference workflow: discovery of a suitable causal model, which in our case can be visualized via one of several classes of causal graphical models; identification of target causal parameters as functions of the observed data distribution; and estimation of these parameters from finite samples. The overarching goal of these methods is to augment the data scientist's toolkit to tackle the aforementioned challenges in real-world systems in theoretically sound yet practical ways. We provide a continuous optimization procedure for causal discovery in the presence of latent confounders, and a computationally efficient discrete search procedure for discovery and downstream estimation of causal effects in causal graphs encoding interactions between units in a network. For identification, we provide an algorithm that generalizes the state of the art for recovery of target parameters in missing-not-at-random distributions that can be represented graphically via directed acyclic graphs. Finally, for estimation, we provide results on the tangent space of causal graphical models with latent variables, which may be used to improve the efficiency of semiparametric estimators for any target parameter of interest. We also provide novel estimators, including influence-function-based estimators, for the average causal effect of a point exposure on an outcome when there are latent variables in the system.
