367 research outputs found

Record-Linkage from a Technical Point of View

Author: Rainer Schnell

Record linkage is used for preparing sampling frames, deduplication of lists and combining information on the same object from two different databases. If the identifiers of the same objects in two different databases have error free unique common identifiers like personal identification numbers (PID), record linkage is a simple file merge operation. If the identifiers contain errors, record linkage is a challenging task. In many applications, the files have widely different numbers of observations, for example a few thousand records of a sample survey and a few million records of an administrative database of social security numbers. Available software, privacy issues and future research topics are discussed.

Keywords: Record-Linkage, Data-mining, Privacy preserving protocols

Research Papers in Economics

Avoiding Problems of Traditional Sampling Strategies for Household Surveys in Germany: Some New Suggestions

Author: Rainer Schnell

All of the sampling plans currently in use for general population surveys in Germany suffer from methodological and practical problems. A new sampling plan is thus urgently needed: one with a low cost overhead that can be prepared in a very short time. Germany also lacks a sampling plan covering all institutional populations, immigrants in general, and illegal immigrants in particular. The availability of new databases covering these populations suggests ways of developing, implementing, and testing new sampling plans for population surveys in Germany. One such sampling plan (G-Plan) is proposed here for the first time. The implementation problems of this design must be studied in a number of empirical pretests.

Research Papers in Economics

Multiple imputation for unit-nonresponse versus weighting including a comparison with a nonresponse follow-up study

Author: Rässler Susanne
Schnell Rainer

The results of a national fear of crime survey are compared with results following the use of different nonresponse correction procedures. We compared naive estimates, weighted estimates, estimates after a thorough nonresponse follow-up and estimates after multiple imputation. A strong similarity between the MI and the follow-up estimates was found. This suggests that, if the assumptions of MAR hold, carefully selected and collected additional data applied in a MI could yield similar estimates to a nonresponse follow-up at a much lower price and respondent burden.

Keywords: Multiple Imputation, Unit-nonresponse, missing data, complex surveys

Research Papers in Economics

Record-linkage from a technical point of view

Author: Schnell Rainer
Publication venue: 'Botanic Garden & Botanical Museum Berlin-Dahlem BGBM'
Publication date: 14/01/2014

"Record linkage is used for preparing sampling frames, deduplication of lists and combining information on the same object from two different databases. If the identifiers of the same objects in two different databases have error free unique common identifiers like personal identification numbers (PID), record linkage is a simple file merge operation. If the identifiers contain errors, record linkage is a challenging task. In many applications, the files have widely different numbers of observations, for example a few thousand records of a sample survey and a few million records of an administrative database of social security numbers. Available software, privacy issues and future research topics are discussed." [author's abstract]

SSOAR - Social Science Open Access Repository

Biological variables in social surveys

Author: Schnell Rainer
Publication venue: 'Botanic Garden & Botanical Museum Berlin-Dahlem BGBM'
Publication date: 17/01/2014

"Social scientists have long virtually ignored the biological constraints of human behavior. Yet if the prediction of behavior is considered essential to a social science, neglecting any variable that might influence human behavior is unacceptable. This paper provides examples of important biological variables and describes their measurement in social surveys." (author's abstract)

SSOAR - Social Science Open Access Repository

Biological Variables in Social Surveys

Author: Rainer Schnell

Social scientists have long virtually ignored the biological constraints of human behavior. Yet if the prediction of behavior is considered essential to a social science, neglecting any variable that might influence human behavior is unacceptable. This paper provides examples of important biological variables and describes their measurement in social surveys.

Research Papers in Economics

The Accuracy of Pre-Election Polling of German General Elections

Author: Noack Marcel
Schnell Rainer
Publication date: 01/01/2014

Pre-election polls are the most prominent type of survey. As with any other survey, estimates are only of interest if they do not deviate significantly from the true state of nature. Even though pre-election polls in Germany as well as in other countries repeatedly show noticeably inaccurate results, their failure appears to be quickly forgotten. No comparison considering all available German data on actual election results and the confidence intervals based on pre-election polls has been published. In the study reported here, only 69% of confidence intervals covered the election result, whereas statistically 95% would have to be expected. German pre-election polls, even those conducted just a month ahead, are therefore much less accurate than most introductory statistical textbooks would suggest.

City Research Online

Directory of Open Access Journals

SSOAR - Social Science Open Access Repository

The effect of the refusal avoidance training experiment on final disposition codes in the German ESS-2

Author: Schnell Rainer
Trappmann Mark
Publication venue: Konstanz
Publication date: 27/03/2012

"The implementation of a Refusal Avoidance Training (RAT) within wave 2 of the German part of the European Social Survey (ESS) successfully reduced the amount of reported refusal by nearly 7%. The effect of the reduction was compensated by a nearly equal increase in the proportion of non-contacted designated respondents. This effect may be due to non-random allocation of trained interviewers. Further randomized experiments are necessary to separate the effects of RAT on response rates." (author's abstract)

SSOAR - Social Science Open Access Repository

Multiple imputation for unit-nonresponse versus weighting including a comparison with a nonresponse follow-up study

Author: Rässler Susanne
Schnell Rainer
Publication venue: Konstanz
Publication date: 30/03/2012

"The results of a national fear of crime survey are compared with results following the use of different nonresponse correction procedures. We compared naive estimates, weighted estimates, estimates after a thorough nonresponse follow-up and estimates after multiple imputation. A strong similarity between the MI and the follow-up estimates was found. This suggests that, if the assumptions of MAR hold, carefully selected and collected additional data applied in a MI could yield similar estimates to a nonresponse follow-up at a much lower price and respondent burden." (author's abstract)

SSOAR - Social Science Open Access Repository

Big Data is not the New Oil: Common Misconceptions about Population Data

Author: Christen Peter
Schnell Rainer
Publication date: 02/09/2022

Databases covering all individuals of a population are increasingly used for research and decision-making. The massive size of such databases is often mistaken as a guarantee for valid inferences. However, population data have characteristics that make them challenging to use. Various assumptions on population coverage and data quality are commonly made, including how such data were captured and what types of processing have been applied to them. Furthermore, the full potential of population data can often only be unlocked when such data are linked to other databases. Record linkage often implies subtle technical problems, which are easily missed. We discuss a diverse range of misconceptions relevant for anybody capturing, processing, linking, or analysing population data. Remarkably, many of these misconceptions are due to the social nature of data collections and are therefore missed by purely technical accounts of data processing. Many of these misconceptions are also not well documented in scientific publications. We conclude with a set of recommendations for using population data.

arXiv.org e-Print Archive