
    Handling Attrition in Longitudinal Studies: The Case for Refreshment Samples

    Panel studies typically suffer from attrition, which reduces sample size and can result in biased inferences. It is impossible to know from the observed panel data alone whether the attrition causes bias. Refreshment samples (new, randomly sampled respondents given the questionnaire at the same time as a subsequent wave of the panel) offer information that can be used to diagnose and adjust for bias due to attrition. We review and bolster the case for the use of refreshment samples in panel studies. We include examples of both a fully Bayesian approach for analyzing the concatenated panel and refreshment data, and a multiple imputation approach for analyzing only the original panel. For the latter, we document a positive bias in the usual multiple imputation variance estimator. We present models appropriate for three waves and two refreshment samples, including nonterminal attrition. We illustrate the three-wave analysis using the 2007-2008 Associated Press-Yahoo! News Election Poll. Comment: Published at http://dx.doi.org/10.1214/13-STS414 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
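
The "usual multiple imputation variance estimator" mentioned in this abstract is commonly understood to mean Rubin's combining rules. The sketch below shows those rules under that assumption; the function name `rubin_combine` and the input values are hypothetical and are not taken from the paper.

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Pool m completed-data estimates and variances with Rubin's rules."""
    q = np.asarray(estimates, dtype=float)   # point estimate from each imputed data set
    u = np.asarray(variances, dtype=float)   # variance estimate from each imputed data set
    m = q.size
    if m < 2:
        raise ValueError("need at least two imputations")
    q_bar = q.mean()                  # pooled point estimate
    u_bar = u.mean()                  # average within-imputation variance
    b = q.var(ddof=1)                 # between-imputation variance
    t = u_bar + (1.0 + 1.0 / m) * b   # total variance
    return q_bar, t

# Hypothetical results from m = 3 imputed data sets.
pooled, total_var = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The abstract's point is that this total variance can be biased upward when only the original panel is analyzed; the sketch shows the estimator itself, not the bias.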

    An Object-Oriented Framework for Statistical Simulation: The R Package simFrame

    Simulation studies are widely used by statisticians to gain insight into the quality of developed methods. Usually some guidelines regarding, e.g., simulation designs, contamination, missing data models or evaluation criteria are necessary in order to draw meaningful conclusions. The R package simFrame is an object-oriented framework for statistical simulation, which allows researchers to make use of a wide range of simulation designs with minimal programming effort. Its object-oriented implementation provides clear interfaces for extensions by the user. Since statistical simulation is an embarrassingly parallel process, the framework supports parallel computing to increase computational performance. Furthermore, an appropriate plot method is selected automatically depending on the structure of the simulation results. In this paper, the implementation of simFrame is discussed in great detail and the functionality of the framework is demonstrated in examples for different simulation designs.

    Multiple imputation for unit-nonresponse versus weighting including a comparison with a nonresponse follow-up study

    The results of a national fear-of-crime survey are compared with results following the use of different nonresponse correction procedures. We compared naive estimates, weighted estimates, estimates after a thorough nonresponse follow-up and estimates after multiple imputation (MI). A strong similarity between the MI and the follow-up estimates was found. This suggests that, if the missing-at-random (MAR) assumptions hold, carefully selected and collected additional data applied in an MI could yield estimates similar to those of a nonresponse follow-up at much lower cost and respondent burden. Keywords: multiple imputation, unit nonresponse, missing data, complex surveys.

    Stop or Continue Data Collection: A Nonignorable Missing Data Approach for Continuous Variables

    We present an approach to inform decisions about nonresponse follow-up sampling. The basic idea is to (i) create completed samples by imputing nonrespondents' data under various assumptions about the nonresponse mechanisms, (ii) take hypothetical samples of varying sizes from the completed samples, and (iii) compute and compare measures of accuracy and cost for different proposed sample sizes. As part of the methodology, we present a new approach for generating imputations for multivariate continuous data with nonignorable unit nonresponse. We fit mixtures of multivariate normal distributions to the respondents' data, and adjust the probabilities of the mixture components to generate nonrespondents' distributions with desired features. We illustrate the approaches using data from the 2007 U.S. Census of Manufactures.
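
The idea of adjusting mixture component probabilities can be illustrated with a minimal sketch. It assumes the component means and covariances have already been fit to the respondents' data; the helper `sample_mixture`, the two components, and both sets of weights are hypothetical values chosen for illustration, not the authors' procedure or data.

```python
import numpy as np

def sample_mixture(weights, means, covs, n, rng):
    """Draw n points from a mixture of multivariate normals."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                  # ensure probabilities sum to 1
    comp = rng.choice(len(weights), size=n, p=weights) # component label for each draw
    draws = np.empty((n, len(means[0])))
    for k in range(len(weights)):
        idx = np.flatnonzero(comp == k)
        if idx.size:
            draws[idx] = rng.multivariate_normal(means[k], covs[k], size=idx.size)
    return draws

rng = np.random.default_rng(0)

# Assumed to be the result of fitting a two-component mixture to respondents.
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]
respondent_weights = [0.8, 0.2]

# Hypothetical nonignorable scenario: nonrespondents fall more often
# in the second component than respondents do.
nonrespondent_weights = [0.3, 0.7]

imputed = sample_mixture(nonrespondent_weights, means, covs, 1000, rng)
print(imputed.mean(axis=0))
```

Repeating the draw under several choices of `nonrespondent_weights` gives completed samples under different assumed nonresponse mechanisms, which is the role step (i) plays in the abstract.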

    Application of the European Customer Satisfaction Index to Postal Services. Structural Equation Models versus Partial Least Squares

    Customer satisfaction and retention are key issues for organizations in today’s competitive marketplace. As such, much research and revenue has been invested in developing accurate ways of assessing consumer satisfaction at both the macro (national) and micro (organizational) level, facilitating comparisons in performance both within and between industries. Since the instigation of the national customer satisfaction indices (CSI), partial least squares (PLS) has been used to estimate the CSI models in preference to structural equation models (SEM) because it does not rely on strict assumptions about the data. However, this choice was based upon some misconceptions about the use of SEMs and does not take into consideration more recent advances in SEM, including estimation methods that are robust to non-normality and missing data. In this paper, both SEM and PLS approaches were compared by evaluating perceptions of the Isle of Man Post Office Products and Customer service using a CSI format. The new robust SEM procedures were found to be advantageous over PLS. Product quality was found to be the only driver of customer satisfaction, while image and satisfaction were the only predictors of loyalty, thus arguing for the specificity of postal services. Keywords: European Customer Satisfaction Index; ECSI; Structural Equation Models; Robust Statistics; Missing Data; Maximum Likelihood

    Imputation of missing values in survey data (Version 1.0)

    Survey data often include missing values. One approach to dealing with missing values is imputation, which aims to produce a complete dataset. However, the process of imputation requires researchers to make various decisions regarding the imputation method to be applied, the number of values to be imputed for each missing value, the selection of predictor variables, the treatment of multivariate nonresponse and the conduct of variance estimation. This survey guideline provides an overview of imputation procedures for missing values. It aims to support the reader with respect to the aforementioned decisions when imputing missing values in survey data.

    Bayesian Estimation Under Informative Sampling

    Bayesian analysis is increasingly popular for use in social science and other application areas where the data are observations from an informative sample. An informative sampling design leads to inclusion probabilities that are correlated with the response variable of interest. Model inference performed on the observed sample taken from the population will be biased for the population generative model under informative sampling since the balance of information in the sample data is different from that for the population. Typical approaches to account for an informative sampling design under Bayesian estimation are often difficult to implement because they require re-parameterization of the hypothesized generating model, or focus on design, rather than model-based, inference. We propose to construct a pseudo-posterior distribution that utilizes sampling weights based on the marginal inclusion probabilities to exponentiate the likelihood contribution of each sampled unit, which weights the information in the sample back to the population. Our approach provides a nearly automated estimation procedure applicable to any model specified by the data analyst for the population and retains the population model parameterization and posterior sampling geometry. We construct conditions on known marginal and pairwise inclusion probabilities that define a class of sampling designs where $L_1$ consistency of the pseudo-posterior is guaranteed. We demonstrate our method on an application concerning the Bureau of Labor Statistics Job Openings and Labor Turnover Survey. Comment: 24 pages, 3 figures
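
To make the weighting idea concrete, here is a minimal sketch for the simplest possible case: a normal mean with known standard deviation and a normal prior, where raising each unit's likelihood contribution to the power of its weight keeps the posterior in closed form. The function name, the normalization of the weights to sum to the sample size, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def pseudo_posterior_normal_mean(y, incl_prob, sigma, prior_mean=0.0, prior_sd=1e3):
    """Pseudo-posterior of a normal mean when each likelihood term is
    raised to a sampling weight w_i proportional to 1 / pi_i."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(incl_prob, dtype=float)
    w = 1.0 / pi
    w = w * y.size / w.sum()   # assumption: weights rescaled to sum to n
    # Weighted normal likelihood times normal prior is again normal.
    precision = 1.0 / prior_sd**2 + w.sum() / sigma**2
    mean = (prior_mean / prior_sd**2 + np.sum(w * y) / sigma**2) / precision
    return mean, np.sqrt(1.0 / precision)

# Toy data: the unit with the larger response was less likely to be sampled,
# so it receives the larger weight.
post_mean, post_sd = pseudo_posterior_normal_mean(
    y=[1.0, 3.0], incl_prob=[0.5, 0.25], sigma=1.0
)
print(post_mean, post_sd)
```

With a nearly flat prior the pseudo-posterior mean is close to the weighted sample mean (about 2.33 here) rather than the unweighted mean of 2.0, which is the sense in which the weights pull the sample information back toward the population.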

    Adaptive Design to Adjust for Unit Nonresponse Using an External Micro-level Benchmark

    Traditional survey design draws a representative sample and implements post-survey weighting adjustments to compensate for nonresponse. When declining survey participation renders respondents nonrepresentative, the effectiveness of post-survey weighting adjustment becomes uncertain. Recent developments to improve respondent representativeness via adaptive data collection design have delivered promising results on bias reduction. This dissertation develops a new adaptive design to improve survey data quality by capitalizing on benchmark data that capture the target population. The basic idea is to adaptively draw samples that lead to representative respondents, and to compensate for nonrespondents by benchmarked imputation procedures. Respondent representativeness is enhanced by the sampling procedure as opposed to data collection, eliminating costs of nonresponse follow-up and inferential complexity due to varying data collection protocols. The new adaptive design consists of benchmarked sequential sampling (BSS) and benchmarked multiple imputation (B-MI) procedures. The new design first improves respondent representativeness by BSS, which conforms either the frame variables alone (BSS-Z) or both frame and survey covariate information (BSS-X) to those of the benchmark. With improved respondent representativeness, the benchmarked multiple imputation recovers the population information, leading to better quality survey estimates that are less susceptible to the unknown nonresponse pattern. This design applies to surveys with rich micro-level auxiliary data and surveys that use respondents of other surveys as sampling frame. The BSS-Z method is demonstrated using the National Health Interview Survey and Behavioral Risk Factor Surveillance System; the BSS-X and the benchmarked MI methods are demonstrated using the American Community Survey, the Current Population Survey, and the Census Planning Database.
The new design of adaptive sampling and imputation is evaluated against the traditional design of fixed sampling and weighting (generalized regression estimator). To assess respondent representativeness, data from the new design are compared to those of the benchmark in marginal, conditional, and descriptive statistics. To assess the quality of the survey inference, a sample mean is calculated along with its root mean square error (RMSE), bias and coverage rate. To assess whether a design is of better value, a cost-effectiveness measure is derived from RMSE and a new cost model. PhD, Survey Methodology, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/137173/1/julialee_1.pd
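
The three inference-quality measures named in this abstract (bias, RMSE and coverage rate) can be computed over repeated simulated samples roughly as follows. This is a generic sketch assuming normal-theory 95% intervals; the function name and the replicate values are hypothetical and do not come from the dissertation.

```python
import numpy as np

def evaluate_estimator(estimates, std_errors, truth, z=1.96):
    """Bias, RMSE and interval coverage of an estimator across replicates."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    bias = est.mean() - truth
    rmse = np.sqrt(np.mean((est - truth) ** 2))
    # A replicate "covers" when its interval est +/- z * se contains the truth.
    covered = (est - z * se <= truth) & (truth <= est + z * se)
    return bias, rmse, covered.mean()

# Hypothetical sample means from four replicates, true population mean 10.
bias, rmse, coverage = evaluate_estimator(
    estimates=[9.0, 11.0, 10.0, 14.0],
    std_errors=[1.0, 1.0, 1.0, 1.0],
    truth=10.0,
)
print(bias, rmse, coverage)
```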