Semiparametric response model with nonignorable nonresponse

Author: Kim Jae Kwang
Uehara Masatoshi
Publication date: 30/10/2018

Dealing with nonignorable nonresponse is often a challenging problem in statistical analysis with missing data. A parametric model is often assumed for the response mechanism, but there is no way to validate that assumption with missing data. We consider a semiparametric response model that relaxes the parametric model assumption in the response mechanism. Two types of efficient estimators, a profile maximum likelihood estimator and a profile calibration estimator, are proposed and their asymptotic properties are investigated. Two extensive simulation studies compare the proposed estimators with some existing methods. We present an application of our method using Korean Labor and Income Panel Survey data.

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Fractional Imputation in Survey Sampling: A Comparative Review

Author: Kim Jae Kwang
Yang Shu
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 27/08/2015

Fractional imputation (FI) is a relatively new method of imputation for handling item nonresponse in survey sampling. In FI, several imputed values with their fractional weights are created for each missing item. Each fractional weight represents the conditional probability of the imputed value given the observed data, and the parameters in the conditional probabilities are often computed by an iterative method such as the EM algorithm. The underlying model for FI can be fully parametric, semiparametric, or nonparametric, depending on the plausibility of assumptions and the data structure. In this paper, we give an overview of FI, introduce key ideas and methods to readers who are new to the FI literature, and highlight some new developments. We also provide guidance on practical implementation of FI and valid inferential tools after imputation. We demonstrate the empirical performance of FI with respect to multiple imputation using a pseudo finite population generated from a sample in the Monthly Retail Trade Survey of the US Census Bureau. Comment: 26 pages, 2 figures

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Crossref
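
The fractional weighting idea in the abstract above can be illustrated with a minimal sketch. It assumes a binary or categorical outcome `y`, a fully observed categorical covariate `x`, and ignorable missingness, so that respondent cell proportions estimate the conditional probabilities directly and no EM iteration is needed. The function names `fractional_impute` and `fi_mean` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fractional_impute(x, y):
    """Return rows (unit, value, fractional_weight); y uses None for missing.

    Assumption: the weight of each imputed value is the respondent
    proportion of that value within the unit's x cell.
    """
    counts = defaultdict(Counter)
    for xi, yi in zip(x, y):
        if yi is not None:
            counts[xi][yi] += 1

    rows = []
    for i, (xi, yi) in enumerate(zip(x, y)):
        if yi is not None:
            rows.append((i, yi, 1.0))  # observed value keeps full weight
            continue
        total = sum(counts[xi].values())
        if total == 0:
            raise ValueError(f"no respondents in cell {xi!r}")
        # One imputed row per observed category, weighted by P(y | x).
        for value, count in sorted(counts[xi].items()):
            rows.append((i, value, count / total))
    return rows


def fi_mean(rows, n):
    """Fractionally imputed estimate of the mean of y over n units."""
    return sum(value * weight for _, value, weight in rows) / n


x = ["a", "a", "a", "b", "b"]
y = [1, 0, None, 1, None]
rows = fractional_impute(x, y)
estimate = fi_mean(rows, len(y))
```

Each missing unit contributes several rows whose weights sum to one, so the completed data set can be analyzed with weighted complete-data methods.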

Predictive mean matching imputation in survey sampling

Author: Kim Jae Kwang
Yang Shu
Publication date: 12/01/2018

Predictive mean matching imputation is popular for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the predictive mean matching estimator of the population mean. For variance estimation, the conventional bootstrap inference for matching estimators with fixed matches has been shown to be invalid due to the nonsmooth nature of the matching estimator. We propose asymptotically valid replication variance estimation. The key strategy is to construct replicates of the estimator directly based on linear terms, instead of individual records of variables. Extension to nearest neighbor imputation is also discussed. A simulation study confirms that the new procedure provides valid variance estimation. Comment: 20 pages, 0 figures, 1 table

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)
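
As a rough illustration of the point estimator discussed in the abstract above, the following sketch performs single-donor predictive mean matching. It assumes one covariate `x`, a simple linear working model fitted on respondents, and one nearest donor per nonrespondent; the helper names `fit_line` and `pmm_impute` are hypothetical. It does not implement the replication variance estimator proposed in the paper.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pmm_impute(x, y):
    """Fill each missing y (None) with the observed y of the respondent
    whose predicted mean is closest to the nonrespondent's predicted mean."""
    respondents = [i for i, yi in enumerate(y) if yi is not None]
    intercept, slope = fit_line(
        [x[i] for i in respondents], [y[i] for i in respondents]
    )
    predicted = [intercept + slope * xi for xi in x]

    completed = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            donor = min(respondents, key=lambda j: abs(predicted[j] - predicted[i]))
            completed[i] = y[donor]  # donate an observed value, not a prediction
    return completed


x = [1.0, 2.0, 3.0, 4.0, 2.1]
y = [2.0, 4.0, 6.0, 8.0, None]
completed = pmm_impute(x, y)
mean_estimate = sum(completed) / len(completed)
```

Because the imputed value is always an observed response, the estimator is a nonsmooth function of the data, which is the feature the abstract identifies as the reason the conventional bootstrap fails.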

Integration of survey data and big observational data for finite population inference using mass imputation

Author: Kim Jae Kwang
Yang Shu
Publication date: 08/07/2018

Multiple data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we consider an imputation approach to combining a probability sample with big observational data. Unlike the usual imputation for missing data analysis, we create imputed values for all elements in the probability sample. Such mass imputation is attractive in the context of survey data integration (Kim and Rao, 2012). We extend mass imputation as a tool for data integration of survey data and big non-survey data. The mass imputation methods and their statistical properties are presented. The matching estimator of Rivers (2007) is also covered as a special case. Variance estimation with mass-imputed data is discussed. The simulation results demonstrate that the proposed estimators outperform existing competitors in terms of robustness and efficiency.

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

A note on multiple imputation for method of moments estimation

Author: Kim Jae Kwang
Yang Shu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 27/08/2015

Multiple imputation is a popular imputation method for general purpose estimation. Rubin (1987) provided an easily applicable formula for the variance estimation of multiple imputation. However, the validity of the multiple imputation inference requires the congeniality condition of Meng (1994), which is not necessarily satisfied for method of moments estimation. This paper presents the asymptotic bias of Rubin's variance estimator when the method of moments estimator is used as a complete-sample estimator in the multiple imputation procedure. A new variance estimator based on over-imputation is proposed to provide asymptotically valid inference for method of moments estimation. Comment: 8 pages, 0 figures

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)
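
For reference, the variance formula of Rubin (1987) mentioned in the abstract above combines the average within-imputation variance with the between-imputation variance. The sketch below shows that standard combining rule only; the function name `rubin_combine` is hypothetical, and the over-imputation variance estimator proposed in the paper is not implemented here.

```python
from statistics import mean, variance


def rubin_combine(estimates, within_variances):
    """Pool M completed-data results with Rubin's combining rules.

    estimates: point estimates, one per imputed data set
    within_variances: their estimated variances, in the same order
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)           # pooled point estimate
    within = mean(within_variances)    # average within-imputation variance
    between = variance(estimates)      # between-imputation variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


pooled, total = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

According to the abstract, this total variance can be asymptotically biased when the complete-sample estimator is a method of moments estimator, which is the case the paper addresses.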

Bayesian Sparse Propensity Score Estimation for Unit Nonresponse

Author: Goh Gyuhyeong
Kim Jae Kwang
Sang Hejian
Publication date: 27/07/2018

Nonresponse weighting adjustment using the propensity score is a popular method for handling unit nonresponse. However, including all available auxiliary variables in the propensity model can lead to inefficient and inconsistent estimation, especially with high-dimensional covariates. In this paper, a new Bayesian method using the spike-and-slab prior is proposed for sparse propensity score estimation. The proposed method is not based on any model assumption on the outcome variable and is computationally efficient. Instead of doing model selection and parameter estimation separately as in many frequentist methods, the proposed method simultaneously selects the sparse response probability model and provides consistent parameter estimation. Some asymptotic properties of the proposed method are presented. The efficiency of this sparse propensity score estimator is further improved by incorporating related auxiliary variables from the full sample. The finite-sample performance of the proposed method is investigated in two limited simulation studies, including a partially simulated real data example from the Korean Labor and Income Panel Survey. Comment: 38 pages, 3 tables

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Imputation estimators for unnormalized models with missing data

Author: Kim Jae Kwang
Matsuda Takeru
Uehara Masatoshi
Publication date: 12/03/2019

Several statistical models are given in the form of unnormalized densities, and calculation of the normalization constant is intractable. We propose estimation methods for such unnormalized models with missing data. The key concept is to combine imputation techniques with estimators for unnormalized models, including noise contrastive estimation and score matching. In addition, we derive asymptotic distributions of the proposed estimators and construct confidence intervals. Simulation results with truncated Gaussian graphical models and an application to real wind direction data reveal that the proposed methods effectively enable statistical inference with unnormalized models from missing data. Comment: To appear (AISTATS 2020)

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Finite sample properties of multiple imputation estimators

Author: Kim Jae Kwang
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 23/06/2004

Finite sample properties of multiple imputation estimators under the linear regression model are studied. The exact bias of the multiple imputation variance estimator is presented. A method of reducing the bias is presented, and simulation is used to make comparisons. We also show that the suggested method can be used for a general class of linear estimators.

arXiv.org e-Print Archive

Crossref