Variational Temporal Deconfounder for Individualized Treatment Effect Estimation from Longitudinal Observational Data
Estimating treatment effects, especially individualized treatment effects
(ITE), from observational data is challenging because of confounding bias.
Existing approaches for estimating treatment effects from longitudinal
observational data are usually built upon the strong assumption of
"unconfoundedness", which is hard to fulfill in real-world practice. In this
paper, we propose the Variational Temporal Deconfounder (VTD), an approach that
leverages deep variational embeddings in the longitudinal setting using proxies
(i.e., surrogate variables that stand in for unobserved confounders).
Specifically, VTD leverages observed proxies to learn a hidden embedding that
reflects the true hidden confounders in the observational data. As such, our
VTD method does not rely on the "unconfoundedness" assumption. We test our VTD
method on both synthetic and real-world clinical data, and the results show
that, compared with existing models, our approach is effective when hidden
confounding is the leading source of bias.
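
The abstract gives no implementation details; purely as an illustration, here is a minimal PyTorch sketch of the generic proxy-based variational recipe it describes (an encoder maps observed proxies to a latent confounder, which then conditions the outcome model together with the treatment). The class name, architecture, and objective below are hypothetical assumptions, not the authors' VTD.

    # Hypothetical sketch (not the authors' VTD): an encoder maps observed
    # proxies to a Gaussian posterior over a latent confounder z; z and the
    # treatment then condition the outcome model, following the generic
    # variational recipe.
    import torch
    import torch.nn as nn

    class ProxyDeconfounder(nn.Module):
        def __init__(self, proxy_dim, latent_dim, hidden_dim=64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(proxy_dim, hidden_dim), nn.ReLU())
            self.mu = nn.Linear(hidden_dim, latent_dim)
            self.logvar = nn.Linear(hidden_dim, latent_dim)
            # Outcome head conditions on the inferred confounder and treatment.
            self.outcome = nn.Sequential(
                nn.Linear(latent_dim + 1, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1))

        def forward(self, proxies, treatment):
            h = self.encoder(proxies)
            mu, logvar = self.mu(h), self.logvar(h)
            # Reparameterization trick: sample z from the posterior.
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            y_hat = self.outcome(torch.cat([z, treatment], dim=-1))
            # KL term regularizes the posterior toward a standard normal prior.
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
            return y_hat, kl

A full longitudinal version would presumably make the encoder recurrent over time steps; this single-step sketch only illustrates the structure.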
R-miss-tastic: a unified platform for missing values methods and workflows
Missing values are unavoidable when working with data. Their occurrence is
exacerbated as more data from different sources become available. However, most
statistical models and visualization methods require complete data, and
improper handling of missing data results in information loss or biased
analyses. Since the seminal work of Rubin (1976), there has been a burgeoning
literature on missing values with heterogeneous aims and motivations. This has
resulted in the development of various methods, formalizations, and tools
(including a large number of R packages and Python modules). However, for
practitioners, it remains challenging to decide which method is most suited for
their problem, partially because handling missing data is still not a topic
systematically covered in statistics or data science curricula.
To help address this challenge, we have launched a unified platform:
"R-miss-tastic", which aims to provide an overview of standard missing values
problems, methods, how to handle them in analyses, and relevant implementations
of methodologies. In the same perspective, we have also developed several
pipelines in R and Python to allow for a hands-on illustration of how to handle
missing values in various statistical tasks such as estimation and prediction,
while ensuring reproducibility of the analyses. This will hopefully also
provide some guidance on deciding which method to choose for a specific problem
and data. The objective of this work is not only to comprehensively organize
materials, but also to create standardized analysis workflows, and to provide a
common ground for discussions among the community. This platform is thus suited
for beginners, students, more advanced analysts and researchers.Comment: 38 pages, 9 figure
Improving Evaluation Methods for Causal Modeling
Causal modeling is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. Active communities of researchers in machine learning, statistics, social science, and other fields develop and enhance algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for expanding the standard techniques for evaluating algorithms that construct causal models. Specifically, we argue for the addition of evaluation techniques that use interventional measures rather than structural or observational measures, and that apply those measures to empirical data rather than synthetic data. We survey current practice in evaluation and show that, while the evaluation techniques we advocate are rarely used in practice, they are feasible and produce substantially different results than structural measures and synthetic data do. We also provide a protocol for generating observational-style data sets from experimental data, allowing the creation of a large number of data sets suitable for evaluating causal modeling algorithms. We then perform a large-scale evaluation of seven causal modeling methods over 37 data sets drawn from randomized controlled trials, as well as simulators, real-world computational systems, and observational data sets augmented with a synthetic response variable. We find notable performance differences when comparing across data from different sources, demonstrating the importance of using data from a variety of sources when evaluating any causal modeling method.
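
As a rough illustration of the idea behind deriving observational-style data sets from experimental data, the sketch below subsamples a simulated randomized trial so that treatment assignment becomes correlated with a covariate, inducing confounding. All specifics (the logistic retention rule, linear outcome, and effect size) are assumptions for illustration, not the paper's exact protocol.

    # Illustrative sketch: turn randomized data into an observational-style
    # sample by keeping each unit with a probability that depends on both
    # its covariate x and its treatment t, so treatment and x become
    # correlated in the retained sample.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    x = rng.normal(size=n)                    # pre-treatment covariate
    t = rng.integers(0, 2, size=n)            # randomized treatment
    y = 2.0 * t + x + rng.normal(size=n)      # outcome; true effect is 2.0

    # Retention probability sigmoid((2t - 1) * x): treated units with high x
    # and control units with low x are more likely to be kept.
    p_keep = 1.0 / (1.0 + np.exp(-(2 * t - 1) * x))
    keep = rng.random(n) < p_keep
    x_obs, t_obs, y_obs = x[keep], t[keep], y[keep]

    # The naive difference in means is now biased relative to the true effect.
    naive = y_obs[t_obs == 1].mean() - y_obs[t_obs == 0].mean()
    print(f"true effect: 2.0, naive estimate on confounded sample: {naive:.2f}")

Because the true treatment effect is known from the randomized design, evaluations on such derived data sets can score causal modeling algorithms with interventional measures rather than purely structural ones.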