Analysis of Complex Survey Samples
I present software for analysing complex survey samples in R. The sampling scheme can be explicitly described or represented by replication weights. Variance estimation uses either replication or linearisation.
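The two variance approaches named above can be compared on the simplest case, a weighted mean. The following is a minimal Python sketch, not the R software the abstract describes: the function names are hypothetical, and it assumes a single-stratum design sampled with replacement, with a delete-one jackknife standing in for replication weights.

```python
import numpy as np

def weighted_mean(y, w):
    """Weighted (ratio) estimator of the population mean."""
    return np.sum(w * y) / np.sum(w)

def linearisation_variance(y, w):
    """Taylor-linearisation variance of the weighted mean.

    Assumes one stratum and with-replacement sampling (an assumption
    of this sketch, not something stated in the abstract).
    """
    n = len(y)
    theta = weighted_mean(y, w)
    # Influence values of the ratio estimator; they sum to zero.
    z = w * (y - theta) / np.sum(w)
    return n / (n - 1) * np.sum(z ** 2)

def jackknife_variance(y, w):
    """Delete-one jackknife variance of the weighted mean."""
    n = len(y)
    theta = weighted_mean(y, w)
    # Row i of `keep` is a boolean mask that drops observation i.
    keep = ~np.eye(n, dtype=bool)
    replicates = np.array([weighted_mean(y[k], w[k]) for k in keep])
    return (n - 1) / n * np.sum((replicates - theta) ** 2)
```

With equal weights both functions reduce to the familiar s^2/n for a sample mean; with unequal weights they generally differ slightly.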
Robustness of Semiparametric Efficiency in Nearly-Correct Models for Two-Phase Samples
Augmented inverse-probability weighted (AIPW) estimators for incomplete-data models typically do not have full semiparametric efficiency, but do have model-robustness properties not shared by the efficient estimator. We examine the performance of efficient and AIPW estimators when the complete-data model is nearly correctly specified, in the sense that the misspecification is not reliably detectable from the data by any possible diagnostic or test. Asymptotic results for these nearly true models are obtained by representing them as sequences of misspecified models that are mutually contiguous with a correctly specified model. For some least favorable direction of model misspecification the bias in the efficient estimator induced by even this amount of model misspecification is comparable to the extra variability in the AIPW estimator, so that the mean squared error of the efficient estimator is no longer lower, at least in a local asymptotic minimax sense.
An Empirical Process Limit Theorem for Sparsely Correlated Data
We consider data that are dependent, but where most small sets of observations are independent. By extending Bernstein's inequality we prove a strong law of large numbers and an empirical process central limit theorem under bracketing entropy conditions.
Model-robust regression and a Bayesian "sandwich" estimator
We present a new Bayesian approach to model-robust linear regression that
leads to uncertainty estimates with the same robustness properties as the
Huber-White sandwich estimator. The sandwich estimator is known to provide
asymptotically correct frequentist inference, even when standard modeling
assumptions such as linearity and homoscedasticity in the data-generating
mechanism are violated. Our derivation provides a compelling Bayesian
justification for using this simple and popular tool, and it also clarifies
what is being estimated when the data-generating mechanism is not linear. We
demonstrate the applicability of our approach using a simulation study and
health care cost data from an evaluation of the Washington State Basic Health
Plan. Comment: Published at http://dx.doi.org/10.1214/10-AOAS362 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
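For orientation, the classical frequentist Huber-White sandwich estimator that the abstract refers to can be sketched in a few lines of Python. This is the basic (HC0) form for ordinary least squares, not the Bayesian derivation the paper presents; the function name and return layout are hypothetical.

```python
import numpy as np

def ols_with_sandwich(X, y):
    """OLS coefficients with sandwich and model-based covariances.

    X is an (n, p) design matrix (include a column of ones for an
    intercept) and y is a length-n response vector.
    """
    XtX_inv = np.linalg.inv(X.T @ X)           # the "bread"
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # The "meat": sum over i of e_i^2 * x_i x_i^T.
    meat = X.T @ (X * resid[:, None] ** 2)
    sandwich = XtX_inv @ meat @ XtX_inv
    # Usual covariance, valid only under homoscedastic errors.
    model_based = XtX_inv * (resid @ resid) / (len(y) - X.shape[1])
    return beta, sandwich, model_based
```

When the error variance changes with the covariates, the square roots of the diagonal of `sandwich` remain asymptotically valid standard errors, whereas those from `model_based` need not be.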
Survival Analysis in XLISP-Stat. A semiliterate program
This document contains program code and examples of survival analyses in XLISP-Stat, structured using the noweb (Ramsey, '93) literate programming system. It is described as a "semiliterate" program because most of the code was already written before it was converted to use noweb.