Selective machine learning of doubly robust functionals
While model selection is a well-studied topic in parametric and nonparametric
regression or density estimation, selection of possibly high-dimensional
nuisance parameters in semiparametric problems is far less developed. In this
paper, we propose a selective machine learning framework for making inferences
about a finite-dimensional functional defined on a semiparametric model, when
the latter admits a doubly robust estimating function and several candidate
machine learning algorithms are available for estimating the nuisance
parameters. We introduce two new selection criteria for bias reduction in
estimating the functional of interest, each based on a novel definition of
pseudo-risk for the functional that embodies the double robustness property and
thus is used to select the pair of learners that is nearest to fulfilling this
property. We establish an oracle property for a multi-fold cross-validation
version of the new selection criteria which states that our empirical criteria
perform nearly as well as an oracle with a priori knowledge of the pseudo-risk
for each pair of candidate learners. We also describe a smooth approximation to
the selection criteria which allows for valid post-selection inference.
Finally, we apply the approach to model selection of a semiparametric estimator
of average treatment effect given an ensemble of candidate machine learners to
account for confounding in an observational study.
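As a rough illustration of what double robustness means for the average treatment effect, the sketch below uses the standard augmented inverse-probability-weighted (AIPW) estimating function. It is not the selection criterion proposed in the paper. The function name `aipw_ate`, the simulated data, and the deliberately misspecified nuisance estimates are all assumptions made for illustration only.

```python
import numpy as np


def aipw_ate(y, a, pi_hat, mu1_hat, mu0_hat):
    """AIPW estimate of the average treatment effect E[Y(1)] - E[Y(0)].

    y       : observed outcomes
    a       : binary treatment indicator (0/1)
    pi_hat  : estimated propensity scores P(A = 1 | X)
    mu1_hat : estimated outcome regression E[Y | A = 1, X]
    mu0_hat : estimated outcome regression E[Y | A = 0, X]
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    psi = (mu1_hat - mu0_hat
           + a * (y - mu1_hat) / pi_hat
           - (1 - a) * (y - mu0_hat) / (1 - pi_hat))
    return psi.mean()


# Simulated observational data (hypothetical); the true effect is 2.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)                 # a single confounder
pi_true = 1 / (1 + np.exp(-x))         # true propensity score
a = rng.binomial(1, pi_true)
y = 2 * a + x + rng.normal(size=n)

# Correct outcome model, wrong propensity model (constant 0.5).
est_outcome_ok = aipw_ate(y, a, np.full(n, 0.5), 2 + x, x)

# Correct propensity model, wrong outcome model (identically zero).
est_propensity_ok = aipw_ate(y, a, pi_true, np.zeros(n), np.zeros(n))

# Both nuisance models wrong: no protection against confounding bias.
est_both_wrong = aipw_ate(y, a, np.full(n, 0.5), np.zeros(n), np.zeros(n))

print(est_outcome_ok, est_propensity_ok, est_both_wrong)
```

In this simulation the first two estimates land close to the true effect, while the third is visibly biased. That gap is why the choice of the pair of nuisance learners matters.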
Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness and sensitivity analysis
While estimation of the marginal (total) causal effect of a point exposure on
an outcome is arguably the most common objective of experimental and
observational studies in the health and social sciences, in recent years,
investigators have also become increasingly interested in mediation analysis.
Specifically, upon evaluating the total effect of the exposure, investigators
routinely wish to make inferences about the indirect pathway of the effect of
the exposure, which operates through a mediator variable occurring after the
exposure and before the outcome, and about the direct pathway, which does not.
Although powerful semiparametric methodologies that produce doubly robust and
highly efficient estimates of the marginal total causal effect have been
developed for observational studies, similar methods for mediation analysis are
currently lacking. Thus, this paper develops a general semiparametric framework
for obtaining inferences about so-called marginal natural direct and indirect
causal effects, while appropriately accounting for a large number of
pre-exposure confounding factors for the exposure and the mediator variables.
Our analytic framework is particularly appealing, because it gives new insights
on issues of efficiency and robustness in the context of mediation analysis. In
particular, we propose new multiply robust locally efficient estimators of the
marginal natural indirect and direct causal effects, and develop a novel doubly
robust sensitivity analysis framework for the assumption of ignorability of the
mediator variable.
Comment: Published at http://dx.doi.org/10.1214/12-AOS990 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
- …