Semiparametric theory and empirical processes in causal inference
In this paper we review important aspects of semiparametric theory and
empirical processes that arise in causal inference problems. We begin with a
brief introduction to the general problem of causal inference, and go on to
discuss estimation and inference for causal effects under semiparametric
models, which allow parts of the data-generating process to be unrestricted if
they are not of particular interest (i.e., nuisance functions). These models
are very useful in causal problems because the outcome process is often complex
and difficult to model, and there may only be information available about the
treatment process (at best). Semiparametric theory gives a framework for
benchmarking efficiency and constructing estimators in such settings. In the
second part of the paper we discuss empirical process theory, which provides
powerful tools for understanding the asymptotic behavior of semiparametric
estimators that depend on flexible nonparametric estimators of nuisance
functions. These tools are crucial for incorporating machine learning and other
modern methods into causal inference analyses. We conclude by examining related
extensions and future directions for work in semiparametric causal inference.
Causal inference in drug discovery and development
To discover new drugs is to seek and to prove causality. As an emerging approach leveraging human knowledge and creativity, data, and machine intelligence, causal inference holds the promise of reducing cognitive bias and improving decision-making in drug discovery. Although it has been applied across the value chain, the concepts and practice of causal inference remain obscure to many practitioners. This article offers a nontechnical introduction to causal inference, reviews its recent applications, and discusses opportunities and challenges of adopting the causal language in drug discovery and development.
Introduction to the Symposium: Causal Inference and Public Health
Assessing the extent to which public health research findings can be causally interpreted continues to be a critical endeavor. In this symposium, we invited several researchers to review issues related to causal inference in social epidemiology and environmental science and to discuss the importance of external validity in public health. Together, this set of articles provides an integral overview of the strengths and limitations of applying causal inference frameworks and related approaches to a variety of public health problems, for both internal and external validity.
A Primer on Causality in Data Science
Many questions in Data Science are fundamentally causal in that our objective
is to learn the effect of some exposure, randomized or not, on an outcome
of interest. Even studies that are seemingly non-causal, such as those with the
goal of prediction or prevalence estimation, have causal elements, including
differential censoring or measurement. As a result, we, as Data Scientists,
need to consider the underlying causal mechanisms that gave rise to the data,
rather than simply the pattern or association observed in those data. In this
work, we review the 'Causal Roadmap' of Petersen and van der Laan (2014) to
provide an introduction to some key concepts in causal inference. Similar to
other causal frameworks, the steps of the Roadmap include clearly stating the
scientific question, defining the causal model, translating the scientific
question into a causal parameter, assessing the assumptions needed to express
the causal parameter as a statistical estimand, implementing statistical
estimators, including parametric and semi-parametric methods, and interpreting
our findings. We believe that using such a framework in Data Science will
help to ensure that our statistical analyses are guided by the scientific
question driving our research, while avoiding over-interpreting our results. We
focus on the effect of an exposure occurring at a single time point and
highlight the use of targeted maximum likelihood estimation (TMLE) with Super
Learner.
Comment: 26 pages (with references); 4 figures
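As a rough illustration of the step in which a causal parameter is expressed as a statistical estimand, the sketch below estimates an average treatment effect for a single-time-point binary exposure by simple standardization over one discrete covariate. It is not the TMLE with Super Learner approach highlighted in the abstract. The function names, the toy data, and the assumption that the single covariate W captures all confounding are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def standardized_mean(rows, a):
    """Estimate E[Y^a] as sum_w mean(Y | A=a, W=w) * P(W=w).

    rows is a list of (w, a, y) tuples. Assumes (hypothetically) that W is
    the only confounder and that it takes a small number of discrete values.
    """
    n = len(rows)
    stratum_size = defaultdict(int)   # number of units with W = w
    outcome_sum = defaultdict(float)  # sum of Y among units with A = a, W = w
    outcome_n = defaultdict(int)      # number of units with A = a, W = w
    for w, trt, y in rows:
        stratum_size[w] += 1
        if trt == a:
            outcome_sum[w] += y
            outcome_n[w] += 1
    total = 0.0
    for w, n_w in stratum_size.items():
        if outcome_n[w] == 0:
            # Positivity fails: no unit with exposure a in this stratum.
            raise ValueError(f"no units with A={a} in stratum W={w}")
        total += (outcome_sum[w] / outcome_n[w]) * (n_w / n)
    return total

def average_treatment_effect(rows):
    """Standardized estimate of E[Y^1] - E[Y^0]."""
    return standardized_mean(rows, 1) - standardized_mean(rows, 0)

def naive_difference(rows):
    """Unadjusted difference in mean outcome between exposed and unexposed."""
    exposed = [y for _, a, y in rows if a == 1]
    unexposed = [y for _, a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

# Toy data generated as Y = 1*A + 2*W, so the true effect of A is 1.
# Exposure is more common when W = 1, which confounds the naive comparison.
ROWS = [
    (0, 1, 1.0), (0, 0, 0.0), (0, 0, 0.0), (0, 0, 0.0),
    (1, 1, 3.0), (1, 1, 3.0), (1, 1, 3.0), (1, 0, 2.0),
]

if __name__ == "__main__":
    print("naive difference:", naive_difference(ROWS))
    print("standardized ATE:", average_treatment_effect(ROWS))
```

On this toy data the unadjusted comparison mixes the effect of the exposure with the effect of W, while the standardized estimate recovers the effect built into the data. With continuous or high-dimensional covariates the stratum means would have to be replaced by a fitted outcome model.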
Aspects of causal inference.
Observational studies differ from experimental studies in that assignment of subjects to treatments is not randomized but rather occurs due to natural mechanisms, which are usually hidden from researchers. Yet objectives of the two studies are frequently the same: identify the causal – rather than merely associational – relationship between some treatment or exposure and an outcome. The statistical issues that arise in properly analyzing observational data for this goal are numerous and fascinating, and these issues are encompassed in the domain of causal inference. The research presented in this dissertation explores several distinct aspects of causal inference. This dissertation is divided into four chapters. Chapter One gives an introduction to major concepts, underlying assumptions, and analytical frameworks encountered in the domain of causal inference. The next three chapters describe extensive research projects that are linked together by those threads. Chapter Two deals with propensity score techniques and, more specifically, how to specify the propensity score model to achieve the best treatment effect estimates. This chapter not only provides a theoretical proof showing that one particular type of specification is best, but also demonstrates an original method for applying that result. The method presented in Chapter Three has a similar purpose – obtaining precise and accurate estimates of causal effects – but views the challenge through a Bayesian, rather than a frequentist, lens. Here, a hierarchical Bayesian model is developed that is grounded in the framework of causal inference. While Chapters Two and Three focus on scenarios involving causal inference from observational data, Chapter Four presents a method that has been designed to apply equally well to experimental data. 
The intent of the research here is to provide a method for identifying subgroups of the population in which the treatment effect differs from the overall population average treatment effect. Maintaining a central theme of causal inference, the research focuses on avoiding confounding bias while identifying effect modifiers that characterize the subgroups. In all, this dissertation is intended to provide views of causal inference concepts from several distinct angles, demonstrating the complexity and richness of this domain.
Bayesian Structural Causal Inference with Probabilistic Programming
Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in light of observations and experiments. In this dissertation, I explore the Bayesian structural approach to causal inference, in which probability distributions over structural causal models are one such data structure and probabilistic inference in multi-world transformations of those models is the corresponding algorithmic task. Specifically, I demonstrate that this approach has two distinct advantages over the dominant computational paradigm of causal graphical models: (i) it expands the breadth of compatible assumptions; and (ii) it seamlessly integrates with modern Bayesian modeling and inference technologies to facilitate quantification of uncertainty about causal structure and the effects of interventions.
Specifically, doing so allows the emerging and powerful technology of probabilistic programming to be brought to bear on a large and diverse set of causal inference problems. In Chapter 3, I present an example-driven pedagogical introduction to the Bayesian structural approach to causal inference, demonstrating how priors over structural causal models induce joint distributions over observed and latent counterfactual random variables, and how the resulting posterior distributions capture common motifs in causal inference. In particular, I show how various assumptions about latent confounding influence our ability to estimate causal effects from data and I provide examples of common observational and quasi-experimental designs expressed as probabilistic programs. In Chapter 4, I present an advanced application of the Bayesian structural approach for modeling hierarchical relational dependencies with latent confounders, and how to combine such assumptions with flexible Gaussian process models. In Chapter 5, I present a prototype software implementation for causal inference using probabilistic programming, accommodating a broad class of multi-source observational and experimental data. Finally, in Chapter 6, I present Simulation-Based Identifiability, a gradient-based optimization method for determining if any differentiable and bounded prior over structural causal models converges to a unique causal conclusion asymptotically.
Applications of Causality and Causal Inference in Software Engineering
Causal inference is a study of causal relationships between events and the
statistical study of inferring these relationships through interventions and
other statistical techniques. Causal reasoning is any line of work toward
determining causal relationships, including causal inference. This paper
explores the relationship between causal reasoning and various fields of
software engineering. This paper aims to uncover which software engineering
fields are currently benefiting from the study of causal inference and causal
reasoning, as well as which aspects of various problems are best addressed
using this methodology. With this information, this paper also aims to find
future subjects and fields that would benefit from this form of reasoning and
to provide that information to future researchers. This paper follows a
systematic literature review, including the formulation of a search query,
inclusion and exclusion criteria of the search results, clarifying questions
answered by the found literature, and synthesizing the results from the
literature review. Through close examination of the 45 found papers relevant to
the research questions, it was revealed that the majority of causal reasoning
as related to software engineering is related to testing through root cause
localization. Furthermore, most causal reasoning is done informally through an
exploratory process of forming a Causality Graph as opposed to strict
statistical analysis or introduction of interventions. Finally, causal
reasoning is also used as a justification for many tools intended to make the
software more human-readable by providing additional causal information to
logging processes or modeling languages.