Repeated Cross-Sectional Randomized Response Data Taking Design Change and Self-Protective Responses into Account

Abstract

Randomized response (RR) is an interview technique that can be used to protect the privacy of respondents when sensitive questions are posed. This paper explains how to measure change over time when a binary RR question is posed at several time points. In cross-sectional research settings, new insights often emerge gradually. In our setting, a switch to another RR procedure necessitates the development of a trend model that estimates the effect of the covariate time when the dependent variable is measured by different RR designs. We also demonstrate that it is possible to deal with self-protective responses, thus aligning our trend model with the latest developments in RR data analysis.

Keywords: linear trend, longitudinal data, misclassification, randomized response, repeated cross-sections, self-protective responses

Randomized response (RR) is an interview technique that can be used if sensitive questions are posed and respondents are reluctant to answer directly.

In addition to the RR setting, misclassification probabilities occur in several other fields of research. The one most closely related to RR is the postrandomization method (PRAM; Kooiman, Willenborg, & Gouweleeuw, 1997), which misclassifies values of categorical variables using a computerized process after the data are collected to protect the respondents' privacy. PRAM thus applies RR after data collection. Misclassification also plays a role in medicine and epidemiology, through the probabilities of being correctly classified as a case (sensitivity) or a noncase (specificity).

This paper proposes a model to measure changes over time whenever RR is used to pose sensitive questions at several time points cross-sectionally. The model is illustrated with data from a Dutch repeated cross-sectional study on noncompliance with rules regarding social benefits.
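The misclassification view of RR described above can be made concrete with a minimal sketch. It assumes a binary RR question whose design fixes two known probabilities: the chance that a true "yes" is observed as "yes" and the chance that a true "no" is observed as "yes". The function name, the argument names, and any design values used with it are illustrative assumptions and are not taken from the paper.

```python
def rr_prevalence(observed_yes, n, p_yes_given_yes, p_yes_given_no):
    """Moment estimator of the true prevalence from binary RR data.

    The observed 'yes' probability is modeled as
        lam = p_yes_given_no + (p_yes_given_yes - p_yes_given_no) * pi,
    where pi is the true prevalence. Solving for pi gives the estimator.
    Returns (estimate, standard error).
    """
    d = p_yes_given_yes - p_yes_given_no
    if d == 0:
        raise ValueError("design does not identify the prevalence")
    lam = observed_yes / n                      # observed proportion of 'yes'
    pi_hat = (lam - p_yes_given_no) / d         # invert the misclassification
    se = (lam * (1.0 - lam) / n) ** 0.5 / abs(d)
    return pi_hat, se
```

With a hypothetical design in which a true "yes" is reported as "yes" with probability 5/6 and a true "no" with probability 1/6, an observed "yes" proportion of 0.30 corresponds to an estimated prevalence of 0.20. Note that this simple estimator can fall outside the interval [0, 1] in small samples, which is one reason likelihood-based approaches are often preferred.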
Data have been collected every 2 years since 2000, and given that measures to prevent regulatory noncompliance have been intensified during this period, the question arises as to whether the prevalence of regulatory noncompliance has changed over the years and how the change can be modeled. Treating time as a covariate, we propose a method to measure the effect of this covariate when the dependent variable is measured by RR. Several aspects of the cross-sectional study at hand make it impossible to use standard analysis methods and necessitate a new approach to the analysis of RR data for research questions of this type. First, the fact that RR variables represent misclassified responses on categorical variables precludes the use of, for example, the linear logit model (Agresti, 2002, p. 180) to test for a linear trend. Using the framework o
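One way to picture a linear trend under design change is the following sketch. It is not the authors' model: it assumes a logistic trend logit(pi_t) = a + b * t for the true prevalence, lets each time point carry its own known pair of misclassification probabilities, and maximizes the binomial likelihood by Fisher scoring. Self-protective responses are ignored here, and all names and design values are hypothetical.

```python
import math

def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def observed_yes_prob(pi, p11, p10):
    """P(observed 'yes') for true prevalence pi under one RR design.

    p11 = P(observed yes | true yes), p10 = P(observed yes | true no).
    """
    return p10 + (p11 - p10) * pi

def fit_rr_trend(times, yes_counts, sample_sizes, designs, n_iter=50):
    """Fisher scoring for logit(pi_t) = a + b * t with RR misclassification.

    designs holds one (p11, p10) pair per time point, so the RR design
    may differ between waves. Returns the estimates (a, b).
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for t, y, n, (p11, p10) in zip(times, yes_counts, sample_sizes, designs):
            pi = expit(a + b * t)
            lam = observed_yes_prob(pi, p11, p10)
            g = (p11 - p10) * pi * (1.0 - pi)   # derivative of lam w.r.t. a + b*t
            v = lam * (1.0 - lam)
            r = (y - n * lam) * g / v           # score contribution
            w = n * g * g / v                   # expected information weight
            s0 += r
            s1 += r * t
            i00 += w
            i01 += w * t
            i11 += w * t * t
        det = i00 * i11 - i01 * i01
        # Update by solving the 2x2 system information * step = score.
        a += (i11 * s0 - i01 * s1) / det
        b += (i00 * s1 - i01 * s0) / det
    return a, b
```

Calling `fit_rr_trend` with one count per wave and a design list such as `[(0.8, 0.2)] * 2 + [(5/6, 1/6)] * 3` (hypothetical values) gives an intercept and a slope on the logit scale. The slope `b` plays the role of the time effect, and because the design enters only through the misclassification probabilities, waves measured with different RR procedures contribute to the same trend estimate.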
