
Estimation of the Probit Model from Anonymized Micro Data

Abstract

Scientists' demand for confidential micro data from official sources has prompted discussion of how to anonymize these data so that they can be released to the scientific community. We report results from a German project which exploits various options of anonymization for producing such "scientific-use files". The main concern of the project, however, is whether estimation of stochastic models from these perturbed data is possible and, more importantly, leads to reliable results. In this paper we concentrate on estimation of the probit model under the assumption that only anonymized data are available. In particular, we assume that the binary dependent variable has undergone post-randomization (PRAM) and that the explanatory variables have been perturbed by the addition of noise. We employ a maximum likelihood estimator that is consistent if only the dependent variable has been anonymized by PRAM. The errors-in-variables structure of the regressors is then handled by the simulation extrapolation (SIMEX) estimation procedure, where we compare the performance of quadratic and nonlinear (rational) extrapolation.

Keywords: anonymization, misclassification, noise addition, post-randomization, SIMEX procedure, statistical disclosure.
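
To illustrate how post-randomization of the binary dependent variable enters a probit likelihood, the following is a minimal sketch and not necessarily the estimator used in the paper. It assumes the PRAM transition probabilities are known; the names `p11` (probability that a true 1 is reported as 1), `p00` (probability that a true 0 is reported as 0), and all function names are hypothetical and chosen only for this example. Under these assumptions the probability of observing a 1 is p11 * Phi(x'b) + (1 - p00) * (1 - Phi(x'b)).

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pram_prob_one(xb, p11, p00):
    """P(observed y = 1 | x) when the true y follows a probit with index xb.

    p11 and p00 are the assumed-known PRAM probabilities of leaving a
    true 1 and a true 0 unchanged (hypothetical parameter names).
    """
    phi = std_normal_cdf(xb)
    return p11 * phi + (1.0 - p00) * (1.0 - phi)


def pram_probit_loglik(beta, X, y_obs, p11, p00):
    """Log-likelihood of the observed (post-randomized) binary outcomes.

    X is a list of regressor rows and beta a list of coefficients.
    The logarithm is undefined if a fitted probability hits exactly
    0 or 1, which can happen in the no-PRAM limit p11 = p00 = 1.
    """
    total = 0.0
    for row, y in zip(X, y_obs):
        xb = sum(b * x for b, x in zip(beta, row))
        p = pram_prob_one(xb, p11, p00)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

Maximizing `pram_probit_loglik` over `beta` with a numerical optimizer would give the corrected estimate. Setting `p11 = p00 = 1` recovers the ordinary probit log-likelihood. This sketch does not address the noise added to the regressors, which the abstract handles separately by SIMEX.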
