Benford's Law As an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel (SOEP)
This paper focuses on fraud detection in surveys, using Socio-Economic Panel (SOEP) data as an example for testing the new methods proposed here. A statistical theorem referred to as Benford's Law states that in many sets of numerical data, the significant digits are not uniformly distributed, as one might expect, but rather adhere to a certain logarithmic probability function. To detect fraud, we derive several requirements that should, according to this law, be fulfilled in the case of survey data. We show that in several SOEP subsamples, Benford's Law holds for the available continuous data. For this analysis, we have developed a measure that reflects the plausibility of the digit distribution in interviewer clusters. We are able to demonstrate that several interviews that were known to have been fabricated, and were therefore deleted from the original user data set, can be detected using this method. Furthermore, in one subsample, we use this method to identify an interviewer who falsified ten interviews and had not previously been detected by the fieldwork organization. In the last section of our paper, we try to explain the deviation from Benford's distribution empirically, and show that several factors can influence the test statistic used. To avoid misinterpretations and false conclusions, it is important to take these factors into account when Benford's Law is applied to survey data.
Keywords: Falsification, data quality, Benford's Law, SOEP
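As a rough illustration of the kind of digit test this abstract describes, the sketch below computes the first-digit probabilities predicted by Benford's Law, P(d) = log10(1 + 1/d), and a chi-square distance between observed and expected first-digit counts for each interviewer cluster. The abstract does not specify the authors' plausibility measure, so the chi-square statistic here is a stand-in, and the function names and the `(interviewer_id, value)` record layout are assumptions made for the example.

```python
import math
from collections import Counter


def benford_expected():
    """First-digit probabilities under Benford's Law: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading significant digit of a nonzero number.

    Uses scientific-notation formatting rather than log10 arithmetic to
    avoid floating-point slips such as 0.3 / 0.1 evaluating below 3.
    """
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Chi-square distance between observed first digits and Benford's Law.

    Assumption: this generic statistic stands in for the paper's own
    measure, which the abstract does not define.
    """
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


def chi_square_by_interviewer(records):
    """Group (interviewer_id, value) pairs (hypothetical layout) and score each cluster."""
    clusters = {}
    for interviewer_id, value in records:
        clusters.setdefault(interviewer_id, []).append(value)
    return {iid: benford_chi_square(vals) for iid, vals in clusters.items()}
```

A larger value indicates a digit distribution further from Benford's Law. As the abstract cautions, other factors can also influence such a statistic, so a high score is a reason to look more closely at a cluster, not evidence of fabrication in itself.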
Respondent Behavior in Panel Studies: A Case Study for Income-Nonresponse by Means of the German Socio-Economic Panel (GSOEP)
Many validation studies deal with item nonresponse and measurement error in earnings data. In this paper we explore respondents' motives for failing to reveal their earnings, using the German Socio-Economic Panel (GSOEP). The GSOEP collects socio-economic information on private households in the Federal Republic of Germany. We explain the evolution of income nonresponse in the GSOEP and demonstrate the importance of distinguishing between refusing to state income and answering "don't know".
Keywords: Respondent behavior; Interviewer effects; Item-Nonresponse; Panel analysis; Multilevel modeling
Respondent Behavior in Panel Studies: A Case Study of the German Socio-Economic Panel (GSOEP)
In the past there have been many empirical studies dealing with the behavior of respondents in interview situations. Most of these refer to data from surveys and describe an interview situation where respondent and interviewer meet only once. The advantage of this study is the possibility of investigating both respondent behavior over a long period of time and the change in behavior in cases where the respondent meets the interviewer several times and becomes familiar with him. For this study we use the database of the German Socio-Economic Panel (GSOEP). The GSOEP is a longitudinal survey containing socio-economic information on private households in the Federal Republic of Germany. It provides a wealth of methodological information about the survey methods utilized and the characteristics of the interviewers.
The paper will focus on the analysis of response styles, item nonresponse and social desirability.
(Translated from the German abstract:) This study deals with respondent behavior and interviewer influences in the Socio-Economic Panel (GSOEP). In contrast to many other studies on respondent and interviewer effects, it is not based on a simple cross-sectional survey. This makes it possible to observe respondent behavior over a longer period of time and also to analyze changes in behavior when respondents and interviewers meet several times over the course of the panel. The analysis covers respondent effects such as response styles, item nonresponse and social desirability.
Respondent behavior in panel studies: a case study for income-nonresponse by means of the German Socio-Economic Panel (GSOEP)
Many validation studies deal with item nonresponse and measurement error in earnings data. In this paper we explore respondents' motives for failing to reveal their earnings, using the German Socio-Economic Panel (GSOEP). The GSOEP collects socio-economic information on private households in the Federal Republic of Germany. We explain the evolution of income nonresponse in the GSOEP and demonstrate the importance of distinguishing between refusing to state income and answering "don't know".
Individual and Neighborhood Determinants of Survey Nonresponse: An Analysis Based on a New Subsample of the German Socio-Economic Panel (SOEP), Microgeographic Characteristics and Survey-Based Interviewer Characteristics
This study examines the phenomenon of nonresponse in the first wave of a refresher sample (subsample H) of the German Socio-Economic Panel Study (SOEP). Our first step is to link additional (commercial) microgeographic data on the immediate neighborhoods of the households visited by interviewers. These additional data (paradata) provide valuable information on respondents and nonrespondents, including milieu or lifestyle, dominant household structure, desire for anonymity, frequency of moves, and other important microgeographic information. This linked information is then used to analyze nonresponse. In a second step, we also use demographic variables for the interviewers from an administrative data set, and, in a third step, we use the results of a special interviewer survey. We use multilevel statistical modeling to examine the influence of neighborhoods and interviewers on non-contacts, inability to participate, and refusals. In our analysis, we find our additional variables useful for understanding and explaining non-contacts and refusals and the inability of some respondents to participate in surveys. These data provide an important basis for filling the information gap on response and nonresponse in panel surveys (and in cross-sectional surveys). However, these effects are negligible in size. Ignoring them does not cause significant biases in statistical inferences drawn from the survey under consideration.
Keywords: Nonresponse, interviewer effects, microgeographic data, multilevel modeling, SOEP
Changing from PAPI to CAPI: A Longitudinal Study of Mode-Effects Based on an Experimental Design
This paper examines the implications of the move to CAPI for data quality by analyzing the conversion from PAPI to CAPI of a subsample of the German Socio-Economic Panel (SOEP), which was carried out within an experimental design. The 2,000 addresses for sample E of the SOEP were split into two subsamples, E1 and E2, with the same structure using twin sample points. Each of the 125 sample points contained 16 addresses (8 for E1 and 8 for E2), which each interviewer had to work in the first wave alternately in PAPI and CAPI mode. In the subsequent waves the PAPI mode was partly replaced by CAPI. With this experimental longitudinal design we are able to control for possible interviewer effects in the analysis of mode effects. The paper assesses whether any mode effects are apparent for the response rate. Within the data, we examine monetary measures such as gross income, as well as item and unit nonresponse rates. We found some minor effects, but our main results show that the shift was made without introducing strong mode effects.
Keywords: CAPI, Mode effects, data quality, interviewer effects
The "Interviewer Panel" of the Socio-Economic Panel: Description and Selected Analyses (original title: Das "Interviewer-Panel" des Sozio-oekonomischen Panels: Darstellung und ausgewählte Analysen)
Measured against ideal survey conditions, statistical data are generally not free of error; in particular, they can be influenced by the way in which they are collected ("survey artifacts"). In this context, interviewer effects play a prominent role in the literature as a source of error (see, e.g., Esser 1984; Hermann 1983; Schanz/Schmidt 1984; Hoag/Allerbeck 1981; Erbslöh/Wiendeck 1974; Sudman/Bradburn 1974; and Hyman et al. 1954). The Socio-Economic Panel (SOEP) survey offers particularly good opportunities for analyzing survey artifacts, because the SOEP is a prospective longitudinal study for which the survey method is recorded case by case and for which the personal characteristics of the interviewers are also known (see Riebschläger 1996; Riebschläger/Wagner 1991). These characteristics in turn form a separate "interviewer panel", and linking it with the actual SOEP data offers unusually good opportunities for identifying survey artifacts. This article presents the SOEP interviewer panel, which has not previously been analyzed as such, and carries out illustrative analyses of interviewer behavior. In future, the interviewer data set will be available by default to all SOEP users for further analyses. It is to be hoped that it will be used intensively.
What Can Be Learned About Nonresponse from the Example of the SOEP? (original title: Was kann man am Beispiel des SOEP bezüglich Nonresponse lernen?)
"This study deals with the nonresponse process in the baseline survey of the Socio-Economic Panel (SOEP). In addition to a detailed description of nonresponse in the first wave, multilevel models are used to explain participation in the interview as a function of respondent, interviewer, and situational characteristics. A distinction is made between the reachability of respondents and their willingness to cooperate, and additionally between initial and follow-up fieldwork. This extension makes it possible to include in the modeling the conversion of respondents who refused during the initial fieldwork." (authors' abstract, translated from German)
"The following study describes the process of non-response in the first wave in the German Socio-Economic Panel (GSOEP). Multilevel statistical modelling is used to explore the influence of characteristics of respondents and interviewers on non-contacts and refusal rates. In addition, a further distinction between first treatment (contact) and followup treatment (contact) allows us to analyse the converted respondents who first decided to refuse but then did participate when contacted again." (author's abstract)