6 research outputs found

    Causal Inference Methods For Bias Correction In Data Analyses

    Many problems in the empirical sciences and in rational decision making require causal, rather than associative, reasoning. The field of causal inference is concerned with establishing and quantifying cause-effect relationships to inform interventions, even in the absence of direct experimentation or randomization. With the proliferation of massive datasets, it is crucial that we develop principled approaches to drawing actionable conclusions from imperfect information. Inferring valid causal conclusions is impeded by the fact that data are unstructured and contaminated by different sources of bias. The types of bias considered in this thesis include: confounding bias induced by common causes of observed exposures and outcomes; estimation bias induced by high-dimensional data and the curse of dimensionality; discriminatory bias encoded in data that reflect historical patterns of discrimination and inequality; and missing-data bias, where instantiations of variables are systematically missing. The focus of this thesis is the development of novel causal and statistical methodologies to better understand and resolve these pressing challenges. We draw on methodological insights from both machine learning/artificial intelligence and statistical theory. Specifically, we use ideas from graphical modeling to encode our assumptions about the underlying data-generating mechanisms in a clear and succinct manner. Further, we use ideas from nonparametric and semiparametric theory to enable the use of flexible machine learning models in the estimation of causal effects that are identified as functions of the observed data. This thesis makes four main contributions. First, we bridge the gap between identification and semiparametric estimation of causal effects that are identified in causal graphical models with unmeasured confounders. Second, we use semiparametric inference theory for marginal structural models to give the first general approach to causal sufficient dimension reduction of a high-dimensional treatment. Third, we address conceptual, methodological, and practical gaps in assessing and overcoming disparities in automated decision making using causal inference and constrained optimization. Fourth, we use graphical representations of missing-data mechanisms and provide a complete characterization of identification of the underlying joint distribution when some variables are systematically missing and others are unmeasured.
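    As an illustrative sketch of the confounding bias the abstract describes (not code from the thesis; the data-generating process and all variable names are hypothetical), the following simulation shows a naive associational contrast being biased by a common cause, and covariate adjustment recovering the causal effect:

```python
import numpy as np

# Hypothetical simulation: confounder C causes both treatment A and outcome Y.
# True structural model: Y = 2*A + 1.5*C + noise, so the causal effect of A is 2.
rng = np.random.default_rng(0)
n = 100_000
C = rng.normal(size=n)
A = (rng.uniform(size=n) < 1 / (1 + np.exp(-C))).astype(float)  # A depends on C
Y = 2 * A + 1.5 * C + rng.normal(size=n)

# Naive associational contrast: biased upward, because treated units tend
# to have larger C, which also raises Y.
naive = Y[A == 1].mean() - Y[A == 0].mean()

# Backdoor adjustment (g-formula): compute the treated-vs-untreated contrast
# within strata of C, then average over the marginal distribution of C.
bins = np.quantile(C, np.linspace(0, 1, 21))
idx = np.digitize(C, bins[1:-1])  # stratum index 0..19
effects, weights = [], []
for b in range(20):
    m = idx == b
    if (A[m] == 1).any() and (A[m] == 0).any():
        effects.append(Y[m][A[m] == 1].mean() - Y[m][A[m] == 0].mean())
        weights.append(m.mean())
adjusted = np.average(effects, weights=weights)

print(round(naive, 2), round(adjusted, 2))  # adjusted is close to 2; naive is not
```

    Graphically, this corresponds to blocking the backdoor path A ← C → Y by conditioning on C; when some such common causes are unmeasured, the more general identification and estimation machinery the thesis develops is needed.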

    Deep Causal Learning: Representation, Discovery and Inference

    Causal learning has attracted much attention in recent years because causality reveals the essential relationships between things and indicates how the world evolves. However, traditional causal learning methods face many problems and bottlenecks, such as high-dimensional unstructured variables, combinatorial optimization problems, unknown interventions, unobserved confounders, selection bias, and estimation bias. Deep causal learning, that is, causal learning based on deep neural networks, brings new insights for addressing these problems. While many deep learning-based causal discovery and causal inference methods have been proposed, there is a lack of reviews exploring how the internal mechanisms of deep learning can improve causal learning. In this article, we comprehensively review how deep learning can contribute to causal learning by addressing conventional challenges from three aspects: representation, discovery, and inference. We point out that deep causal learning is important for the theoretical extension and application expansion of causal science and is also an indispensable part of general artificial intelligence. We conclude the article with a summary of open issues and potential directions for future work.

    Semiparametric Inference For Causal Effects In Graphical Models With Hidden Variables

    Identification theory for causal effects in causal models associated with hidden-variable directed acyclic graphs (DAGs) is well studied. However, the corresponding algorithms are underused due to the complexity of estimating the identifying functionals they output. In this work, we bridge the gap between identification and estimation of population-level causal effects involving a single treatment and a single outcome. We derive influence-function-based estimators that exhibit double robustness for the identified effects in a large class of hidden-variable DAGs where the treatment satisfies a simple graphical criterion; this class includes models yielding the adjustment and front-door functionals as special cases. We also provide necessary and sufficient conditions under which the statistical model of a hidden-variable DAG is nonparametrically saturated and implies no equality constraints on the observed data distribution. Further, we derive an important class of hidden-variable DAGs that imply observed data distributions observationally equivalent (up to equality constraints) to fully observed DAGs. In these classes of DAGs, we derive estimators that achieve the semiparametric efficiency bounds for the target of interest when the treatment satisfies our graphical criterion. Finally, we provide a sound and complete identification algorithm that directly yields a weight-based estimation strategy for any identifiable effect in hidden-variable causal models.
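    To illustrate the double robustness property the abstract refers to, here is a minimal sketch of an augmented IPW (AIPW) estimator for the adjustment functional on simulated data. This is not the paper's general estimator for hidden-variable DAGs, just the classical special case; the data-generating process and names are hypothetical. With a deliberately misspecified outcome regression but a correct propensity score, the AIPW estimate remains close to the true effect while the plug-in estimate does not:

```python
import numpy as np

# Hypothetical simulation with a measured confounder C; true ATE of A on Y is 2.
rng = np.random.default_rng(1)
n = 200_000
C = rng.normal(size=n)
p = 1 / (1 + np.exp(-C))                      # true propensity score P(A=1 | C)
A = (rng.uniform(size=n) < p).astype(float)
Y = 2 * A + 1.5 * C + rng.normal(size=n)

# Deliberately misspecified outcome regressions that ignore C entirely ...
mu1_bad = np.full(n, Y[A == 1].mean())
mu0_bad = np.full(n, Y[A == 0].mean())

# ... combined with the correct propensity score. AIPW stays consistent
# because only one of the two nuisance models needs to be correct.
aipw = np.mean(
    mu1_bad - mu0_bad
    + A * (Y - mu1_bad) / p
    - (1 - A) * (Y - mu0_bad) / (1 - p)
)
plugin = np.mean(mu1_bad - mu0_bad)           # misspecified plug-in estimate

print(round(plugin, 2), round(aipw, 2))       # aipw is close to 2; plugin is not
```

    The paper's contribution is to extend influence-function-based estimators of this kind beyond the adjustment and front-door special cases, to any hidden-variable DAG where the treatment satisfies its graphical criterion.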