Why Prefer Double Robust Estimates? Illustration with Causal Point Treatment Studies
In point treatment marginal structural models with treatment A, outcome Y and covariates W, causal parameters can be estimated under the assumption of no unobserved confounders. Three estimates can be used: the G-computation, Inverse Probability of Treatment Weighted (IPTW) or Double Robust (DR) estimates. The properties of the IPTW and DR estimates are known under an assumption on the treatment mechanism that we call the Experimental Treatment Assignment (ETA) assumption. We show that the DR estimating function is unbiased when the ETA assumption is violated if the model used to regress Y on A and W is correctly specified. The practical consequence is that the IPTW estimate is biased at finite sample size when the ETA assumption is approximately or theoretically violated, whereas the finite sample bias for the DR estimate is negligible if the model used to regress Y on A and W is correctly specified. This result also implies that estimating point treatment causal parameters using a DR estimating function is more robust than using the G-computation formula. We conclude with a methodology to construct DR estimates for a general data structure and prove that such DR estimates are more robust than their corresponding maximum likelihood estimates.
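The three estimates named in this abstract can be compared on simulated data. The following is a minimal sketch, not the authors' implementation: it assumes a binary covariate W, a binary treatment A, a hypothetical data-generating process with a true effect of 1.0, and saturated (cell-mean) models. The names `simulate`, `fit_outcome`, `fit_treatment` and `g_wrong` are illustrative assumptions. The last two lines plug in a deliberately wrong treatment mechanism to show how the DR estimate leans on the correctly specified regression of Y on A and W while the IPTW estimate does not.

```python
import random

def simulate(n, seed=0):
    """Hypothetical data: W confounds A and Y; the true effect of A on Y is 1.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        p_a = 0.8 if w == 1 else 0.2              # treatment mechanism g(1 | W)
        a = 1 if rng.random() < p_a else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0.0, 1.0)
        data.append((w, a, y))
    return data

def fit_outcome(data):
    """Saturated regression of Y on (A, W): the mean of Y in each (a, w) cell."""
    sums, counts = {}, {}
    for w, a, y in data:
        sums[(a, w)] = sums.get((a, w), 0.0) + y
        counts[(a, w)] = counts.get((a, w), 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

def fit_treatment(data):
    """Saturated estimate of g(1 | W): the proportion treated at each level of W."""
    treated, counts = {}, {}
    for w, a, _ in data:
        treated[w] = treated.get(w, 0) + a
        counts[w] = counts.get(w, 0) + 1
    return {w: treated[w] / counts[w] for w in counts}

def g_computation(data, q):
    """Average the fitted difference Q(1, W) - Q(0, W) over the sample."""
    return sum(q[(1, w)] - q[(0, w)] for w, _, _ in data) / len(data)

def iptw(data, g):
    """Weight each outcome by the inverse probability of the treatment received."""
    total = 0.0
    for w, a, y in data:
        total += a * y / g[w] - (1 - a) * y / (1.0 - g[w])
    return total / len(data)

def double_robust(data, q, g):
    """G-computation term plus inverse-weighted residuals (augmented IPTW form)."""
    total = 0.0
    for w, a, y in data:
        total += q[(1, w)] - q[(0, w)]
        total += a * (y - q[(1, w)]) / g[w]
        total -= (1 - a) * (y - q[(0, w)]) / (1.0 - g[w])
    return total / len(data)

data = simulate(20000)
q_hat = fit_outcome(data)
g_hat = fit_treatment(data)
g_wrong = {0: 0.5, 1: 0.5}   # deliberately misspecified treatment mechanism

gcomp = g_computation(data, q_hat)
iptw_est = iptw(data, g_hat)
dr_est = double_robust(data, q_hat, g_hat)
iptw_wrong = iptw(data, g_wrong)                # biased: ignores confounding by W
dr_wrong = double_robust(data, q_hat, g_wrong)  # still close to the true effect
```

With saturated models the three estimates coincide up to rounding when both models are fitted from the data; richer parametric models would make them differ.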
Locally Efficient Estimation of Nonparametric Causal Effects on Mean Outcomes in Longitudinal Studies
Marginal Structural Models (MSM) have been introduced by Robins (1998a) as a powerful tool for causal inference as they directly model causal curves of interest, i.e. mean treatment-specific outcomes possibly adjusted for baseline covariates. Two estimators of the corresponding MSM parameters of interest have been proposed, see van der Laan and Robins (2002): the Inverse Probability of Treatment Weighted (IPTW) and the Double Robust (DR) estimators. A parametric MSM approach to causal inference has been favored since the introduction of MSM. It relies on correct specification of a parametric MSM to consistently estimate the parameter of interest using the IPTW or DR estimator. In this paper, we develop an alternative nonparametric MSM approach to causal inference that extends the definition of causal parameters of interest. Such an approach is particularly suitable for investigating causal effects in practice as it does not require the assumption of a correctly specified MSM. We first propose a methodology to generate nonparametric parameters of interest for investigating causal curves in which the treatment is longitudinal. We provide insight on how to interpret these parameters in practice and how to choose the parameter of interest that best answers the causal question at hand. We also provide two estimators consistent with this approach, i.e. estimators that do not rely entirely, even indirectly, on correct specification of an MSM: the unique IPTW and locally efficient DR estimators. All results are illustrated with a simulation study in which the practical performances of the DR estimators are evaluated for the first time using longitudinal non-survival data. In the last section, we compare the proposed nonparametric MSM approach to causal inference to the more typical parametric MSM approach and contribute to the general understanding of MSM estimation by addressing the issue of MSM misspecification.
G-computation Estimation of Nonparametric Causal Effects on Time-Dependent Mean Outcomes in Longitudinal Studies
Two approaches to Causal Inference based on Marginal Structural Models (MSM) have been proposed. They provide different representations of causal effects with distinct causal parameters. Initially, a parametric MSM approach to Causal Inference was developed: it relies on correct specification of a parametric MSM. Recently, a new approach based on nonparametric MSM was introduced. This latter approach does not require the assumption of a correctly specified MSM and thus is more realistic if one believes that correct specification of a parametric MSM is unlikely in practice. However, this approach was described only for investigating causal effects on mean outcomes collected at the end of longitudinal studies. In this paper we first generalize the nonparametric MSM approach to the investigation of causal effects on time-dependent outcomes, i.e. for outcomes collected throughout longitudinal studies. This article then develops the G-computation estimation of the corresponding nonparametric MSM parameters and compares its implementation to its analogue in the parametric MSM approach. Finally, we propose new algorithms to address an important computing limitation independent of the MSM approach chosen but inherent to the implementation of the G-computation estimator a) with continuous treatment and/or b) in longitudinal studies with long follow-up and time-dependent outcomes. These new algorithms for the implementation of the G-computation estimator lead to a generalization of nonparametric causal effects and should allow broader application of these methodologies in real life studies. Results are illustrated with two simulation studies.
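The computing burden mentioned in this abstract arises because the G-computation formula sums (or integrates) over every possible covariate history. A common workaround, shown here only as a generic sketch and not as the algorithms proposed in the paper, is to approximate that sum by Monte Carlo simulation. The two "fitted" components below, `p_l1_given_a0` and `q_outcome`, are hypothetical stand-ins with assumed values for a study with two treatment times and one binary time-dependent covariate L1.

```python
import random

def p_l1_given_a0(a0):
    """Assumed fitted P(L1 = 1 | A0 = a0); values are hypothetical."""
    return 0.3 + 0.4 * a0

def q_outcome(a0, l1, a1):
    """Assumed fitted regression E[Y | A0 = a0, L1 = l1, A1 = a1]; hypothetical."""
    return 1.0 * a0 + 2.0 * l1 + 1.0 * a1

def g_formula_exact(a0, a1):
    """Sum the outcome regression over both values of L1 under treatment (a0, a1)."""
    p1 = p_l1_given_a0(a0)
    return (1.0 - p1) * q_outcome(a0, 0, a1) + p1 * q_outcome(a0, 1, a1)

def g_formula_monte_carlo(a0, a1, n_draws=50000, seed=0):
    """Approximate the same quantity by simulating L1 under the intervention."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        l1 = 1 if rng.random() < p_l1_given_a0(a0) else 0
        total += q_outcome(a0, l1, a1)
    return total / n_draws

exact = g_formula_exact(1, 1)
approx = g_formula_monte_carlo(1, 1)
```

With one binary covariate the exact sum has only two terms, so the simulation is unnecessary here; the point is that the Monte Carlo version keeps the same cost per draw as the number of time points or covariate levels grows.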
Cross-validated Bagged Prediction of Survival
In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the context of right-censoring for the prediction of survival. Furthermore, we show how to incorporate bagging into the algorithm to obtain a cross-validated bagged estimator. The method is used for predicting the survival time of patients with diffuse large B-cell lymphoma based on gene expression variables.
Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because MSM parameters provide explicit representations of causal effects. We introduce History-Restricted Marginal Structural Models (HRMSMs) for longitudinal data for the purpose of defining causal parameters which may often be better suited for Public Health research. This new class of MSMs allows investigators to analyze the causal effect of a treatment on an outcome based on a fixed, shorter and user-specified history of exposure compared to MSMs. By default, the latter represents the treatment causal effect of interest based on a treatment history defined by the treatments assigned between the study's start and outcome collection. Beyond allowing a more flexible causal analysis, the proposed HRMSMs also mitigate computing issues related to MSMs as well as statistical power concerns when designing longitudinal studies. We develop three consistent estimators of HRMSM parameters under sufficient model assumptions: the Inverse Probability of Treatment Weighted (IPTW), G-computation and Double Robust (DR) estimators. In addition, we show that the assumptions commonly adopted for identification and consistent estimation of MSM parameters (existence of counterfactuals, consistency, time-ordering and sequential randomization assumptions) also lead to identification and consistent estimation of HRMSM parameters.
Comparison of the Inverse Probability of Treatment Weighted (IPTW) Estimator With a Naïve Estimator in the Analysis of Longitudinal Data With Time-Dependent Confounding: A Simulation Study
A simulation study was conducted to compare estimates from a naïve estimator, using standard conditional regression, and an IPTW (Inverse Probability of Treatment Weighted) estimator, to true causal parameters for a given MSM (Marginal Structural Model). The study was extracted from a larger epidemiological study (Longitudinal Study of Effects of Physical Activity and Body Composition on Functional Limitation in the Elderly, by Tager et al. [accepted, Epidemiology, September 2003]), which examined the causal effects of physical activity and body composition on functional limitation. The simulation emulated the larger study in terms of the exposure and outcome variables of interest, namely physical activity (LTPA), body composition (LNFAT), and physical limitation (PF), but used one time-dependent confounder (HEALTH) to illustrate the effects of estimating causal effects in the presence of time-dependent confounding. In addition to being a time-dependent confounder (i.e. predictor of exposure and outcome over time), HEALTH was also affected by past treatment. Under these conditions, naïve estimators are known to give biased estimates of the causal effects of interest (Robins, 2000). The true causal parameters for LNFAT (-0.61) and LTPA (-0.70) were obtained by assessing the log-odds of functional limitation for a 1-unit increase in LNFAT and participation in vigorous exercise in an ideal experiment in which the counterfactual outcomes were known for every possible combination of LNFAT and LTPA for each subject. Under conditions of moderate confounding, the IPTW estimates for LNFAT and LTPA were -0.62 and -0.94, respectively, versus the naïve estimates of -0.78 and -0.80. For increased levels of confounding of the LNFAT and LTPA variables, the IPTW estimates were -0.60 and -1.28, respectively, and the naïve estimates were -0.85 and -0.87.
The bias of the IPTW estimates, particularly under increased levels of confounding, was explored and linked to violation of particular assumptions regarding the IPTW estimation of causal parameters for the MSM.
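The mechanism behind the naïve estimator's bias can be reproduced in a toy setting. The sketch below is not the simulation from the study: it assumes a hypothetical two-period design with binary treatments A0 and A1, one binary time-dependent confounder L1 that is affected by A0 and predicts A1, and a continuous outcome. Under these assumed parameters the true contrast between "always treated" and "never treated" is 2.8, while conditioning on L1 removes the part of the A0 effect that passes through L1 and targets 2.0 instead.

```python
import random

def simulate(n, seed=1):
    """Hypothetical longitudinal data with a confounder affected by past treatment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a0 = 1 if rng.random() < 0.5 else 0
        l1 = 1 if rng.random() < 0.3 + 0.4 * a0 else 0   # affected by A0
        a1 = 1 if rng.random() < 0.2 + 0.6 * l1 else 0   # predicts later treatment
        y = a0 + a1 + 2.0 * l1 + rng.gauss(0.0, 1.0)
        rows.append((a0, l1, a1, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_contrast(rows):
    """Contrast (1,1) vs (0,0) within levels of L1, averaged over the L1 distribution."""
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[1] == level]
        treated = mean([r[3] for r in stratum if r[0] == 1 and r[2] == 1])
        control = mean([r[3] for r in stratum if r[0] == 0 and r[2] == 0])
        total += (treated - control) * len(stratum) / len(rows)
    return total

def iptw_contrast(rows):
    """Weighted means of Y under (1,1) and (0,0), weights = 1 / P(observed treatments)."""
    p_a0 = mean([r[0] for r in rows])
    p_a1 = {}
    for a0 in (0, 1):
        for level in (0, 1):
            cell = [r[2] for r in rows if r[0] == a0 and r[1] == level]
            p_a1[(a0, level)] = mean(cell)       # estimated P(A1 = 1 | A0, L1)

    def weighted_mean(a0, a1):
        num = den = 0.0
        for r0, level, r1, y in rows:
            if r0 == a0 and r1 == a1:
                g0 = p_a0 if a0 == 1 else 1.0 - p_a0
                g1 = p_a1[(a0, level)] if a1 == 1 else 1.0 - p_a1[(a0, level)]
                weight = 1.0 / (g0 * g1)
                num += weight * y
                den += weight
        return num / den                          # normalized (Hajek-type) weights

    return weighted_mean(1, 1) - weighted_mean(0, 0)

rows = simulate(20000)
true_contrast = 2.8          # implied by the assumed data-generating process
naive_est = naive_contrast(rows)
iptw_est = iptw_contrast(rows)
```

Normalizing the weights within each treatment group is a design choice made here for numerical stability; an unnormalized version would divide by the sample size instead.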
Causal Inference in Epidemiological Studies with Strong Confounding
One of the identifiability assumptions of causal effects defined by marginal structural model (MSM) parameters is the experimental treatment assignment (ETA) assumption. Practical violations of this assumption frequently occur in data analysis, when certain exposures are rarely observed within some strata of the population. The inverse probability of treatment weighted (IPTW) estimator is particularly sensitive to violations of this assumption; however, we demonstrate that this is a problem for all estimators of causal effects. This is due to the fact that the ETA assumption is about information (or lack thereof) in the data. A new class of causal models, causal models for realistic individualized exposure rules (CMRIER), introduced in van der Laan and Petersen (2007), is based on dynamic interventions. CMRIER generalize MSMs, and their parameters remain fully identifiable from the observed data, even when the ETA assumption is violated, if the dynamic interventions are set to be realistic. Examples of such realistic interventions are provided. We argue that causal effects defined by CMRIER may be more appropriate in many situations, particularly those with policy considerations. Through simulation studies, we examine the performance of the IPTW estimator of the CMRIER parameters in contrast to that of the MSM parameters. We also apply the methodology to a real data analysis in air pollution epidemiology to illustrate the interpretation of the causal effects defined by CMRIER.
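Practical ETA violations of the kind described here, where an exposure is rarely observed within a stratum, can be screened for before any estimator is applied. The following is a simple diagnostic sketch, not a method from the paper: the function name `eta_diagnostics`, the threshold `alpha` and the toy records are all assumptions made for illustration. It flags every (stratum, treatment) pair whose empirical treatment probability falls below the threshold.

```python
from collections import Counter

def eta_diagnostics(records, treatments, alpha=0.05):
    """Return (stratum, treatment, proportion) triples with proportion below alpha.

    records    : list of (stratum, treatment) pairs, one per subject
    treatments : every treatment level that should be observable in each stratum
    alpha      : hypothetical cut-off below which a treatment counts as rarely observed
    """
    cell_counts = Counter(records)
    stratum_totals = Counter(stratum for stratum, _ in records)
    flagged = []
    for stratum in sorted(stratum_totals):
        for treatment in treatments:
            proportion = cell_counts[(stratum, treatment)] / stratum_totals[stratum]
            if proportion < alpha:
                flagged.append((stratum, treatment, proportion))
    return flagged

# Toy data: treatment 1 is common among "young" subjects but rare among "old" ones.
records = (
    [("young", 1)] * 40 + [("young", 0)] * 60
    + [("old", 1)] * 2 + [("old", 0)] * 98
)
flagged = eta_diagnostics(records, treatments=(0, 1))
```

A flagged pair corresponds to a very large inverse probability weight, which is one way to see why the IPTW estimator is especially sensitive to such violations.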
- …