Causal survival analysis under competing risks using longitudinal modified treatment policies
Longitudinal modified treatment policies (LMTP) have recently been developed
as a novel method to define and estimate causal parameters that depend on the
natural value of treatment. LMTPs represent an important advancement in causal
inference for longitudinal studies as they allow the non-parametric definition
and estimation of the joint effect of multiple categorical, numerical, or
continuous exposures measured at several time points. We extend the LMTP
methodology to problems in which the outcome is a time-to-event variable
subject to right-censoring and competing risks. We present identification
results and non-parametric locally efficient estimators that use flexible
data-adaptive regression techniques to alleviate model misspecification bias,
while retaining important asymptotic properties such as $\sqrt{n}$-consistency.
We present an application to the estimation of the effect of the
time-to-intubation on acute kidney injury amongst COVID-19 hospitalized
patients, where death by other causes is taken to be the competing event.
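To make the estimand concrete, here is a minimal sketch of the kind of parameter involved; all notation is introduced here for illustration and is not taken from the paper. Let $A_t$ denote the exposure at time $t$, let $d_t$ be a modified treatment policy that may depend on the natural value of $A_t$ and the observed history, and let $Y^{\bar d}_{\tau}$ indicate occurrence of the event of interest by time $\tau$ had exposures followed $\bar d = (d_1, \ldots, d_\tau)$, with censoring removed and the competing event allowed to occur. One natural target parameter is the counterfactual cumulative incidence

  $\theta(\tau) = \mathbb{P}(Y^{\bar d}_{\tau} = 1),$

which in the application would be the cumulative incidence of acute kidney injury by day $\tau$ under a hypothetical shift of intubation times, with death from other causes as the competing event. The identification results and locally efficient estimators described above target parameters of this type.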
A generalization of moderated statistics to data adaptive semiparametric estimation in high-dimensional biology
The widespread availability of high-dimensional biological data has made the
simultaneous screening of numerous biological characteristics a central
statistical problem in computational biology. While the dimensionality of such
datasets continues to increase, the problem of teasing out the effects of
biomarkers in studies measuring baseline confounders while avoiding model
misspecification remains only partially addressed. Efficient estimators
constructed from data adaptive estimates of the data-generating distribution
provide an avenue for avoiding model misspecification; however, in the context
of high-dimensional problems requiring simultaneous estimation of numerous
parameters, standard variance estimators have proven unstable, resulting in
unreliable Type-I error control under standard multiple testing corrections. We
present a general approach for applying empirical Bayes shrinkage to
asymptotically linear estimators of parameters defined
in the nonparametric model. The proposal applies existing shrinkage estimators
to the estimated variance of the influence function, allowing for increased
inferential stability in high-dimensional settings. A methodology for
nonparametric variable importance analysis for use with high-dimensional
biological datasets with modest sample sizes is introduced and the proposed
technique is demonstrated to be robust in small samples even when relying on
data adaptive estimators that eschew parametric forms. Use of the proposed
variance moderation strategy in constructing stabilized variable importance
measures of biomarkers is demonstrated by application to an observational study
of occupational exposure. The result is a data adaptive approach for robustly
uncovering stable associations in high-dimensional data with limited sample
sizes.
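As a rough illustration of the variance-moderation idea described above, the following sketch shrinks the estimated variance of the influence function toward a prior value before forming test statistics. It is a minimal sketch under stated assumptions: the function name, the median-based fallback prior, and the fixed prior degrees of freedom are illustrative choices, not the paper's implementation (in practice the prior would be estimated empirically, e.g., in the style of limma's moderated statistics).

import numpy as np
from scipy import stats

def moderated_tests(estimates, infl, d0=3.0, s0_sq=None, null_value=0.0):
    # estimates: length-p array of asymptotically linear point estimates
    # infl: (n, p) array of estimated influence function values, one column per parameter
    n, _ = infl.shape
    var_if = infl.var(axis=0, ddof=1)                  # estimated variance of each influence function
    if s0_sq is None:
        s0_sq = np.median(var_if)                      # crude stand-in for an empirical-Bayes prior
    d = n - 1
    var_if_mod = (d0 * s0_sq + d * var_if) / (d0 + d)  # shrink each variance toward the prior
    se_mod = np.sqrt(var_if_mod / n)                   # moderated standard error of each estimate
    t_mod = (np.asarray(estimates) - null_value) / se_mod
    pvals = 2 * stats.t.sf(np.abs(t_mod), df=d0 + d)   # moderated t reference distribution
    return t_mod, pvals

The design choice mirrored here is that shrinkage is applied only to the estimated variance of the influence function, so the point estimates are left untouched and only the inferential denominator is stabilized.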
Revisiting the propensity score's central role: Towards bridging balance and efficiency in the era of causal machine learning
About forty years ago, in a now-seminal contribution, Rosenbaum & Rubin
(1983) introduced a critical characterization of the propensity score as a
central quantity for drawing causal inferences in observational study settings.
In the decades since, much progress has been made across several research
fronts in causal inference, notably including the re-weighting and matching
paradigms. Focusing on the former and specifically on its intersection with
machine learning and semiparametric efficiency theory, we re-examine the role
of the propensity score in modern methodological developments. As Rosenbaum &
Rubin (1983)'s contribution spurred a focus on the balancing property of the
propensity score, we re-examine the degree to which and how this property plays
a role in the development of asymptotically efficient estimators of causal
effects; moreover, we discuss a connection between the balancing property and
efficient estimation in the form of score equations and propose a score test
for evaluating whether an estimator achieves balance.
Comment: Accepted for publication in a forthcoming special issue of Observational Studies.
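A stylized version of the connection between balance and weighting discussed above, with notation introduced here for illustration: writing $g(W) = \mathbb{P}(A = 1 \mid W)$ for the propensity score, for any bounded function $h$ of the covariates,

  $\mathbb{E}[A\,h(W)/g(W)] = \mathbb{E}[(1-A)\,h(W)/(1-g(W))] = \mathbb{E}[h(W)].$

An estimated propensity score $\hat{g}$ can therefore be probed by checking how far empirical analogues of these moment (score) equations, e.g. $n^{-1}\sum_{i=1}^{n} (A_i/\hat{g}(W_i) - 1)\, h(W_i)$, are from zero. This sketches only the general idea of viewing balance through score equations; the specific score test proposed in the paper may differ.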
A nonparametric framework for treatment effect modifier discovery in high dimensions
Heterogeneous treatment effects are driven by treatment effect modifiers,
pre-treatment covariates that modify the effect of a treatment on an outcome.
Current approaches for uncovering these variables are limited to
low-dimensional data, data with weakly correlated covariates, or data generated
according to parametric processes. We resolve these issues by developing a
framework for defining model-agnostic treatment effect modifier variable
importance parameters applicable to high-dimensional data with arbitrary
correlation structure, deriving one-step, estimating equation, and targeted
maximum likelihood estimators of these parameters, and establishing these
estimators' asymptotic properties. This framework is showcased by defining
variable importance parameters for data-generating processes with continuous,
binary, and time-to-event outcomes with binary treatments, and deriving
accompanying multiply-robust and asymptotically linear estimators. Simulation
experiments demonstrate that these estimators' asymptotic guarantees are
approximately achieved in realistic sample sizes for observational and
randomized studies alike. This framework is applied to gene expression data
collected for a clinical trial assessing the effect of a monoclonal antibody
therapy on disease-free survival in breast cancer patients. Genes predicted to
have the greatest potential for treatment effect modification have previously
been linked to breast cancer. An open-source R package implementing this
methodology, unihtee, is made available on GitHub at
https://github.com/insightsengineering/unihtee
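As a rough sketch of the kind of workflow described above for a continuous outcome and binary treatment, the code below forms a doubly robust (AIPW) pseudo-outcome for the conditional treatment effect and projects it onto each candidate covariate, with influence-function-based standard errors. The function name, the choice of nuisance learners, the omission of cross-fitting, and the particular projection parameter are illustrative assumptions and are not taken from the unihtee implementation.

import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

def tem_vip_sketch(W, A, Y):
    # W: (n, p) covariate matrix; A: length-n 0/1 treatment array; Y: length-n continuous outcome
    n, p = W.shape
    # Nuisance estimates (cross-fitting omitted for brevity; learners are placeholders).
    g = RandomForestClassifier().fit(W, A).predict_proba(W)[:, 1]
    g = np.clip(g, 0.01, 0.99)                                    # guard against extreme weights
    Q1 = RandomForestRegressor().fit(W[A == 1], Y[A == 1]).predict(W)
    Q0 = RandomForestRegressor().fit(W[A == 0], Y[A == 0]).predict(W)
    # AIPW pseudo-outcome: a doubly robust signal for the conditional treatment effect.
    tau = Q1 - Q0 + A / g * (Y - Q1) - (1 - A) / (1 - g) * (Y - Q0)
    results = []
    for j in range(p):
        wj = W[:, j] - W[:, j].mean()
        var_wj = np.var(W[:, j])
        psi = np.mean(wj * tau) / var_wj                          # projection of the effect onto W_j
        # Approximate influence function for psi; nuisance contributions enter through tau.
        infl = (wj * (tau - tau.mean()) - psi * wj ** 2) / var_wj
        se = infl.std(ddof=1) / np.sqrt(n)                        # standard error for psi
        results.append((psi, se))
    return results

In practice one would use the unihtee package itself, which, as described above, also covers binary and time-to-event outcomes and provides one-step, estimating equation, and targeted maximum likelihood estimators.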
- …