12 research outputs found

    G-computation and doubly robust standardisation for continuous-time data: A comparison with inverse probability weighting.

    In time-to-event settings, g-computation and doubly robust estimators are based on discrete-time data. However, many biological processes evolve continuously over time. In this paper, we extend the g-computation and doubly robust standardisation procedures to a continuous-time context. Using a simulation study, we compare their performance with that of the well-known inverse-probability-weighting estimator for estimating the hazard ratio and the difference in restricted mean survival time. Under correct model specification, all methods are unbiased, but g-computation and doubly robust standardisation are more efficient than inverse probability weighting. We also analyse two real-world datasets to illustrate the practical implementation of these approaches. We have updated the R package RISCA to facilitate the use and dissemination of these methods.
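As a rough illustration of one estimand named in this abstract, the restricted mean survival time (RMST) up to a horizon tau is the area under the survival curve between 0 and tau. The sketch below computes it from a plain, unadjusted Kaplan-Meier curve in pure Python. It is not the continuous-time g-computation or doubly robust estimator of the paper, and it does not use the RISCA package; the function names and the toy data are assumptions made for illustration only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event = 1, censored = 0)."""
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function between 0 and tau."""
    area, last_t, last_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += (t - last_t) * last_s
        last_t, last_s = t, s
    return area + (tau - last_t) * last_s


# Hypothetical toy data for two exposure groups (not taken from the paper).
treated = rmst([2, 3, 5, 8], [1, 0, 1, 1], tau=6)
control = rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=6)
rmst_difference = treated - control
```

A causal RMST difference would additionally require adjusting both curves for confounders, which is what the weighting and standardisation procedures compared in the paper are for.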

    Counterfactual prediction in causal estimation from real-life data

    The lack of randomisation in real-world data complicates statistical analysis and requires complex methods. The web application Plug-Stat® facilitates this type of analysis by offering intuitive interfaces for non-specialists. This thesis aims to optimise Plug-Stat® by automating the causal estimation step as far as possible. It presents three studies investigating the behaviour of these methods and a fourth proposing a decision-support tool. The first study compares, by simulation, the most common causal inference methods across several adjustment sets. The second compares several machine learning approaches combined with g-computation to avoid bias due to model misspecification. The third presents the development of a g-computation estimator in the presence of right censoring. This new estimator is also combined with a propensity score to form a doubly robust estimator, and the three methods are then compared by simulation. The last study proposes an algorithm that helps check the positivity assumption. Overall, g-computation appears to be a method worth considering for the automation of Plug-Stat®: it does not require a balance assumption, and machine learning avoids specification problems. Finally, positivity checking is automated by means of decision trees, allowing the investigator to redefine the study population.

    The Causal Cookbook: Recipes for Propensity Scores, G-Computation, and Doubly Robust Standardization

    Recent developments in the causal-inference literature have renewed psychologists’ interest in how to improve causal conclusions based on observational data. A lot of the recent writing has focused on concerns of causal identification (under which conditions is it, in principle, possible to recover causal effects?); in this primer, we turn to causal estimation (how do researchers actually turn the data into an effect estimate?) and modern approaches to it that are commonly used in epidemiology. First, we explain how causal estimands can be defined rigorously with the help of the potential-outcomes framework, and we highlight four crucial assumptions necessary for causal inference to succeed (exchangeability, positivity, consistency, and noninterference). Next, we present three types of approaches to causal estimation and compare their strengths and weaknesses: propensity-score methods (in which the independent variable is modeled as a function of controls), g-computation methods (in which the dependent variable is modeled as a function of both controls and the independent variable), and doubly robust estimators (which combine models for both independent and dependent variables). A companion R Notebook is available at github.com/ArthurChatton/CausalCookbook. We hope that this nontechnical introduction not only helps psychologists and other social scientists expand their causal toolbox but also facilitates communication across disciplinary boundaries when it comes to causal inference, a research goal common to all fields of research
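The three families of estimators described in this abstract can be sketched on a toy dataset. The example below is a minimal pure-Python illustration, not the code of the companion R Notebook: it assumes a single binary confounder L, a binary exposure A and a binary outcome Y, uses saturated (stratum-mean) models instead of regressions, and takes the augmented inverse-probability-weighted (AIPW) estimator as one possible doubly robust estimator. The data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical cell counts: (L, A, Y, number of subjects).
CELLS = [
    (0, 1, 1, 10), (0, 1, 0, 10),
    (0, 0, 1, 20), (0, 0, 0, 60),
    (1, 1, 1, 60), (1, 1, 0, 20),
    (1, 0, 1, 10), (1, 0, 0, 10),
]
DATA = [(l, a, y) for l, a, y, n in CELLS for _ in range(n)]


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def fit_outcome_model(data):
    """Outcome model: m[(a, l)] = E[Y | A=a, L=l] (dependent variable given exposure and controls)."""
    groups = defaultdict(list)
    for l, a, y in data:
        groups[(a, l)].append(y)
    return {k: mean(v) for k, v in groups.items()}


def fit_propensity_model(data):
    """Propensity model: e[l] = P(A=1 | L=l) (independent variable given controls)."""
    groups = defaultdict(list)
    for l, a, _ in data:
        groups[l].append(a)
    return {k: mean(v) for k, v in groups.items()}


def naive_difference(data):
    """Unadjusted contrast E[Y | A=1] - E[Y | A=0]."""
    return mean(y for _, a, y in data if a == 1) - mean(y for _, a, y in data if a == 0)


def g_computation(data):
    """Predict both counterfactual outcomes for everyone, then average the contrast."""
    m = fit_outcome_model(data)
    return mean(m[(1, l)] - m[(0, l)] for l, _, _ in data)


def ipw(data):
    """Weight each subject by the inverse probability of the exposure actually received."""
    e = fit_propensity_model(data)
    return mean(a * y / e[l] - (1 - a) * y / (1 - e[l]) for l, a, y in data)


def aipw(data):
    """Doubly robust (AIPW): outcome-model prediction plus a weighted residual correction."""
    m = fit_outcome_model(data)
    e = fit_propensity_model(data)

    def contrast(l, a, y):
        mu1 = m[(1, l)] + a * (y - m[(1, l)]) / e[l]
        mu0 = m[(0, l)] + (1 - a) * (y - m[(0, l)]) / (1 - e[l])
        return mu1 - mu0

    return mean(contrast(l, a, y) for l, a, y in data)


estimates = {
    "naive": naive_difference(DATA),
    "g-computation": g_computation(DATA),
    "IPW": ipw(DATA),
    "AIPW": aipw(DATA),
}
```

On this toy dataset the naive contrast is 0.40, while the three adjusted estimators all give an average treatment effect of 0.25. They coincide only because both models are saturated; with parametric regressions the estimates would generally differ, which is where the trade-offs compared in the primer come in.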

    G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes

    In clinical research, there is growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary and that is able to deal with small samples. Through simulations, we evaluated the performance of several methods, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner. We proposed six different scenarios characterised by various sample sizes, numbers of covariates, and relationships between covariates, exposure statuses, and outcomes. We also illustrated the application of these methods by estimating the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in the two counterfactual worlds, we found that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation combined with the super learner was an effective method for drawing causal inferences, even from small samples.

    G-computation, propensity score-based methods, and targeted maximum likelihood estimator for causal inference with different covariates sets: a comparative simulation study

    Controlling for confounding bias is crucial in causal inference. Distinct methods are currently employed to mitigate the effects of confounding bias. Each requires a set of covariates to be specified, and this set remains difficult to choose, especially since the appropriate choice may differ across methods. We conduct a simulation study to compare the relative performance obtained with four different sets of covariates (those causing the outcome, those causing the treatment allocation, those causing both the outcome and the treatment allocation, and all the covariates) and four methods: g-computation, inverse probability of treatment weighting, full matching, and the targeted maximum likelihood estimator. Our simulations are in the context of a binary treatment, a binary outcome, and baseline confounders. The simulations suggest that considering all the covariates causing the outcome leads to the lowest bias and variance, particularly for g-computation. Considering all the covariates does not decrease the bias but significantly reduces the power. We apply these methods to two clinically relevant real-world examples, thereby illustrating their practical importance. We propose the R package RISCA to encourage the use of g-computation in causal inference.

    Causal and associational language in observational health research: a systematic evaluation

    We estimated the degree to which language used in the high profile medical/public health/epidemiology literature implied causality using language linking exposures to outcomes and action recommendations; examined disconnects between language and recommendations; identified the most common linking phrases; and estimated how strongly linking phrases imply causality. We searched and screened for 1,170 articles from 18 high-profile journals (65 per journal) published from 2010-2019. Based on written framing and systematic guidance, three reviewers rated the degree of causality implied in abstracts and full text for exposure/outcome linking language and action recommendations. Reviewers rated the causal implication of exposure/outcome linking language as None (no causal implication) in 13.8%, Weak 34.2%, Moderate 33.2%, and Strong 18.7% of abstracts. The implied causality of action recommendations was higher than the implied causality of linking sentences for 44.5% or commensurate for 40.3% of articles. The most common linking word in abstracts was "associate" (45.7%). Reviewers’ ratings of linking word roots were highly heterogeneous; over half of reviewers rated "association" as having at least some causal implication