Deductive semiparametric estimation in Double-Sampling Designs with application to PEPFAR
Non-ignorable dropout is common in studies with long follow-up time, and it
can bias study results unless handled carefully. A double-sampling design
allocates additional resources to pursue a subsample of the dropouts and find
out their outcomes, which can address potential biases due to non-ignorable
dropout. It is desirable to construct semiparametric estimators for the
double-sampling design because of their robustness properties. However,
obtaining such semiparametric estimators remains a challenge due to the
requirement of the analytic form of the efficient influence function (EIF), the
derivation of which can be ad hoc and difficult for the double-sampling design.
Recent work has shown how the derivation of the EIF can be made deductive and
computerizable using the functional derivative representation of the EIF in
nonparametric models. This approach, however, requires deriving the mixture of
a continuous distribution and a point mass, which can itself be challenging for
complicated problems such as the double-sampling design. We propose
semiparametric estimators for the survival probability in double-sampling
designs by generalizing the deductive and computerizable estimation approach.
In particular, we propose to build the semiparametric estimators based on a
discretized support structure, which approximates the possibly continuous
observed data distribution and circumvents the derivation of the mixture
distribution. Our approach is deductive in the sense that it is expected to
produce semiparametric locally efficient estimators within finite steps without
knowledge of the EIF. We apply the proposed estimators to estimating the
mortality rate in a double-sampling design component of the President's
Emergency Plan for AIDS Relief (PEPFAR) program. We evaluate the impact of
double-sampling selection criteria on the mortality rate estimates.
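The role of the double-sampled dropouts can be illustrated with a minimal sketch. This is not the semiparametric estimator proposed in the abstract; it is a plain inverse-probability-weighted (Hajek-type) estimate under strong simplifying assumptions: a binary death indicator at a fixed time point, no censoring, and dropouts selected for double sampling completely at random with a known probability. The function name, the record keys (`dropout`, `sampled`, `died`), and `sampling_prob` are all hypothetical.

```python
def double_sampling_mortality(records, sampling_prob):
    """Weighted mortality estimate under a simplified double-sampling design.

    records: iterable of dicts with hypothetical keys
        'dropout' (bool), 'sampled' (bool), 'died' (bool).
        'died' is only read when the outcome was actually observed.
    sampling_prob: assumed known probability that a dropout is double-sampled.
    """
    if not 0.0 < sampling_prob <= 1.0:
        raise ValueError("sampling_prob must be in (0, 1]")

    weighted_deaths = 0.0
    total_weight = 0.0
    for r in records:
        if not r["dropout"]:
            weight = 1.0                  # outcome observed without extra effort
        elif r["sampled"]:
            weight = 1.0 / sampling_prob  # stands in for the unsampled dropouts
        else:
            continue                      # dropout with unknown outcome
        weighted_deaths += weight * (1.0 if r["died"] else 0.0)
        total_weight += weight

    if total_weight == 0.0:
        raise ValueError("no records with observed outcomes")
    return weighted_deaths / total_weight
```

Ignoring the dropouts entirely would amount to giving every observed record weight 1, which is where the bias from non-ignorable dropout enters in this toy setting.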
Discussions
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/111979/1/j.1751-5823.2011.00145.x.pd
Semiparametric theory and empirical processes in causal inference
In this paper we review important aspects of semiparametric theory and
empirical processes that arise in causal inference problems. We begin with a
brief introduction to the general problem of causal inference, and go on to
discuss estimation and inference for causal effects under semiparametric
models, which allow parts of the data-generating process to be unrestricted if
they are not of particular interest (i.e., nuisance functions). These models
are very useful in causal problems because the outcome process is often complex
and difficult to model, and there may only be information available about the
treatment process (at best). Semiparametric theory gives a framework for
benchmarking efficiency and constructing estimators in such settings. In the
second part of the paper we discuss empirical process theory, which provides
powerful tools for understanding the asymptotic behavior of semiparametric
estimators that depend on flexible nonparametric estimators of nuisance
functions. These tools are crucial for incorporating machine learning and other
modern methods into causal inference analyses. We conclude by examining related
extensions and future directions for work in semiparametric causal inference.
Response-dependent two-phase sampling designs
This is the peer reviewed version of the following article: McIsaac, M. A. and Cook, R. J. (2014), Response-dependent two-phase sampling designs for biomarker studies. Can J Statistics, 42: 268–284. doi: 10.1002/cjs.11207, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/cjs.11207/references. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.
Two-phase sampling designs are developed and investigated for use in the context of a rheumatology
study where interest lies in the association between a biomarker with an expensive assay and
disease progression. We derive optimal phase-II stratum-specific sampling probabilities for analyses
from parametric maximum likelihood (ML), mean score (MS), inverse probability weighted
(IPW), and augmented IPW (AIPW) estimating equations. The easy-to-implement optimally efficient
design for the MS estimator is found to be asymptotically optimal for the IPW and AIPW
estimators we consider, and is shown to result in efficiency gains over balanced and simple random
sampling even when analyses are likelihood-based. We further demonstrate the robustness
of this optimal design and show that it results in very efficient estimation even when the model or
parameters used in its derivation are misspecified.
Natural Sciences and Engineering Research Council of Canada (RGPIN 155849); Canadian Institutes for Health Research (FRN 13887
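For readers unfamiliar with the IPW and AIPW estimators named in the abstract, the following sketch shows the simplest textbook instance: estimating the mean of a variable that is measured only for units selected into phase II. The article itself concerns estimating equations for an association parameter and optimal sampling probabilities, neither of which is reproduced here. The function name and the arguments `y`, `r`, `pi`, and `m` are hypothetical, and the phase-II selection probabilities are assumed known by design.

```python
def aipw_mean(y, r, pi, m):
    """Augmented IPW estimate of a mean under two-phase sampling.

    y:  phase-II measurements (entries may be None when r[i] == 0)
    r:  phase-II selection indicators (1 if measured, 0 otherwise)
    pi: selection probabilities, assumed known from the design, each > 0
    m:  working predictions of y from phase-I data; all zeros gives plain IPW
    """
    n = len(y)
    if n == 0 or not (len(r) == len(pi) == len(m) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for yi, ri, pii, mi in zip(y, r, pi, m):
        ipw_term = yi / pii if ri else 0.0   # inverse-probability-weighted outcome
        augmentation = (ri - pii) / pii * mi  # has mean zero when pi is correct
        total += ipw_term - augmentation
    return total / n
```

Setting `m` to zeros recovers the IPW (Horvitz-Thompson) mean. Good working predictions in `m` reduce variance without affecting consistency, as long as the selection probabilities are correct.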
Augmented two-step estimating equations with nuisance functionals and complex survey data
Statistical inference in the presence of nuisance functionals with complex
survey data is an important topic in social and economic studies. The Gini
index, Lorenz curves and quantile shares are among the commonly encountered
examples. The nuisance functionals are usually handled by a plug-in
nonparametric estimator and the main inferential procedure can be carried out
through a two-step generalized empirical likelihood method. Unfortunately, the
resulting inference is not efficient and the nonparametric version of
Wilks' theorem breaks down even under simple random sampling. We propose an
augmented estimating equations method with nuisance functionals and complex
surveys. The second-step augmented estimating functions obey the Neyman
orthogonality condition and automatically handle the impact of the first-step
plug-in estimator, and the resulting estimator of the main parameters of
interest is invariant to the first step method. More importantly, the
generalized empirical likelihood based Wilks' theorem holds for the main
parameters of interest under the design-based framework for commonly used
survey designs, and the maximum generalized empirical likelihood estimators
achieve the semiparametric efficiency bound. The performance of the proposed
methods is demonstrated through simulation studies and an application using
the dataset from the New York City Social Indicators Survey.
Comment: 43 page
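As a concrete instance of the quantities mentioned in the abstract, the sketch below computes a survey-weighted plug-in point estimate of the Gini index from the trapezoid area under the weighted Lorenz curve. It covers the point estimate only. It does not implement the augmented estimating equations or the generalized empirical likelihood inference proposed in the paper. The function name and arguments are hypothetical, and the weights are assumed to be positive survey design weights.

```python
def weighted_gini(incomes, weights):
    """Plug-in Gini index from a survey-weighted Lorenz curve.

    incomes: non-negative values, one per sampled unit
    weights: positive survey weights (assumed supplied by the design)
    """
    if len(incomes) != len(weights) or len(incomes) == 0:
        raise ValueError("incomes and weights must be non-empty and of equal length")

    pairs = sorted(zip(incomes, weights))          # order units by income
    total_weight = sum(w for _, w in pairs)
    total_income = sum(y * w for y, w in pairs)
    if total_weight <= 0 or total_income <= 0:
        raise ValueError("total weight and weighted total income must be positive")

    cum_income = 0.0
    area = 0.0  # twice the area under the Lorenz curve (trapezoid rule)
    for y, w in pairs:
        prev_share = cum_income / total_income
        cum_income += y * w
        share = cum_income / total_income
        area += (w / total_weight) * (prev_share + share)
    return 1.0 - area
```

With equal incomes the Lorenz curve is the diagonal and the index is 0. Increasing the weight on the zero-income unit in a two-unit sample moves the index toward 1.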