Finite sample properties of multiple imputation estimators
Finite sample properties of multiple imputation estimators under the linear
regression model are studied. The exact bias of the multiple imputation
variance estimator is presented. A method of reducing the bias is presented and
simulation is used to make comparisons. We also show that the suggested method
can be used for a general class of linear estimators.
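
For orientation, the variance estimator usually meant in this setting is Rubin's combining rule, which adds the average within-imputation variance to an inflated between-imputation variance. The abstract does not state the formula, so the sketch below assumes that standard rule; the function name `rubin_combine` and its arguments are hypothetical, and the bias-reduced estimator the paper proposes is not shown.

```python
from statistics import mean, variance


def rubin_combine(estimates, within_variances):
    """Combine M completed-data results with Rubin's rule (assumed here).

    estimates        -- point estimate from each imputed data set
    within_variances -- variance estimate from each imputed data set
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations of equal length")
    point = mean(estimates)            # pooled point estimate
    within = mean(within_variances)    # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return point, total
```

Calling `rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` pools three imputations; the finite-sample bias studied in the paper concerns the `total` term.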
Integration of survey data and big observational data for finite population inference using mass imputation
Multiple data sources are becoming increasingly available for statistical
analyses in the era of big data. As an important example in finite-population
inference, we consider an imputation approach to combining a probability sample
with big observational data. Unlike the usual imputation for missing data
analysis, we create imputed values for all elements in the probability
sample. Such mass imputation is attractive in the context of survey data
integration (Kim and Rao, 2012). We extend mass imputation as a tool for data
integration of survey data and big non-survey data. The mass imputation methods
and their statistical properties are presented. The matching estimator of
Rivers (2007) is also covered as a special case. Variance estimation with
mass-imputed data is discussed. The simulation results demonstrate that the proposed
estimators outperform existing competitors in terms of robustness and
efficiency.
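
The idea of mass imputation can be sketched as follows: a prediction model is trained on the big observational data, where the study variable is observed, and then used to impute that variable for every unit of the probability sample, whose design weights carry the inference. This is a minimal illustration under assumptions not taken from the abstract: a single covariate, a simple linear working model, and a weight-normalized mean. All function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def mass_imputation_mean(x_big, y_big, x_prob, weights):
    """Estimate the population mean of y from mass-imputed survey data.

    x_big, y_big -- covariate and outcome observed in the big data source
    x_prob       -- covariate observed in the probability sample (y unobserved)
    weights      -- design weights of the probability sample
    """
    intercept, slope = fit_line(x_big, y_big)
    # Impute y for every unit of the probability sample, not only for missing cases.
    imputed = [intercept + slope * xi for xi in x_prob]
    # Weight-normalized mean (an assumption of this sketch).
    return sum(w * yh for w, yh in zip(weights, imputed)) / sum(weights)
```

Replacing `fit_line` with a nearest neighbor lookup in the big data would give a matching-type imputation instead of a regression-type one.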
Predictive mean matching imputation in survey sampling
Predictive mean matching imputation is popular for handling item nonresponse
in survey sampling. In this article, we study the asymptotic properties of the
predictive mean matching estimator of the population mean. For variance
estimation, the conventional bootstrap inference for matching estimators with
fixed matches has been shown to be invalid due to the nonsmooth nature of
the matching estimator. We propose asymptotically valid replication variance
estimation. The key strategy is to construct replicates of the estimator
directly based on linear terms, instead of individual records of variables.
Extension to nearest neighbor imputation is also discussed. A simulation study
confirms that the new procedure provides valid variance estimation.
Comment: 20 pages, 0 figures, 1 table
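
Predictive mean matching itself can be sketched in a few lines: fit a regression on the respondents, compute the predicted mean for every unit, and give each nonrespondent the observed value of the respondent whose predicted mean is closest. The code below is a minimal single-donor illustration assuming one covariate and a linear working model; names such as `pmm_impute` are hypothetical, and the replication variance estimator proposed in the paper is not shown.

```python
def fit_line(x, y):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def pmm_impute(x, y):
    """Return y with each None replaced by a donor's observed value.

    x -- covariate, observed for every unit
    y -- outcome, with None marking item nonresponse
    """
    respondents = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    intercept, slope = fit_line(
        [xi for xi, _ in respondents], [yi for _, yi in respondents]
    )

    def predict(value):
        return intercept + slope * value

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        # Donor: the respondent whose predicted mean is nearest to this unit's.
        donor = min(respondents, key=lambda r: abs(predict(r[0]) - predict(xi)))
        completed.append(donor[1])  # donate the observed value, not the prediction
    return completed
```

Because the imputed value is an observed response selected by a nearest-match rule, the resulting estimator is not a smooth function of the data, which is the source of the bootstrap difficulty the abstract describes.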