Panel Regression with Random Noise
The paper explores the effect of measurement errors on the estimation of a linear panel data model. The conventional fixed effects estimator, which ignores measurement errors, is biased. By correcting for the bias, one can construct consistent and asymptotically normal estimators. In addition, the authors derive estimates of the asymptotic variances of these estimators. The paper focuses on multiplicative errors, which are often deliberately added to data in order to minimize their disclosure risk. These can be analyzed in a similar way to additive errors, but with some important and consequential differences.
Keywords: panel regression, multiplicative measurement errors, bias correction, asymptotic variance, disclosure control
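The attenuation and its correction can be illustrated with a small simulation. The sketch below is illustrative only: the data-generating process, the known noise variance, and the moment-based correction are simplified assumptions of mine, not the paper's actual estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta, sigma_u2 = 2000, 5, 2.0, 0.25   # sigma_u2: variance of the multiplicative noise, assumed known

alpha = rng.normal(0.0, 1.0, (N, 1))                  # individual fixed effects
x = rng.normal(1.0, 1.0, (N, T))                      # true regressor
y = alpha + beta * x + rng.normal(0.0, 1.0, (N, T))   # outcome

s = np.sqrt(np.log(1 + sigma_u2))
u = rng.lognormal(-s**2 / 2, s, (N, T))               # multiplicative noise with E[u] = 1, Var[u] = sigma_u2
x_obs = x * u                                          # released (perturbed) regressor

# within (fixed effects) transformation
xd = x_obs - x_obs.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)

# naive fixed effects estimator on the noisy regressor: attenuated towards zero
b_naive = (xd * yd).sum() / (xd ** 2).sum()

# moment-based correction: E[x^2] = E[x_obs^2] / (1 + sigma_u2),
# so subtract the noise contribution from the within sum of squares
m2 = (x_obs ** 2).mean() / (1 + sigma_u2)
denom = (xd ** 2).sum() - N * T * (1 - 1 / T) * sigma_u2 * m2
b_corr = (xd * yd).sum() / denom

print(b_naive, b_corr)   # b_naive is biased below beta = 2; b_corr is close to 2
```

With these parameters the naive within estimator converges to roughly beta * Var(x) / (Var(x) + sigma_u2 * E[x^2]), i.e. about 1.33 instead of 2, while the corrected estimator recovers the true coefficient.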
Privacy-Preserving Classification on Deep Neural Network
Neural networks (NNs) are today increasingly used in machine learning, where they have become deeper and deeper in order to accurately model or classify high-level abstractions of data. Their development, however, also gives rise to important data privacy risks. This observation motivated Microsoft researchers to propose a framework called Cryptonets. The core idea is to combine simplifications of the NN with Fully Homomorphic Encryption (FHE) techniques to obtain both confidentiality of the manipulated data and efficiency of the processing. While efficiency and accuracy are demonstrated when the number of non-linear layers is small, Cryptonets unfortunately becomes ineffective for deeper NNs, which leaves the problem of privacy-preserving classification open in these contexts. This work successfully addresses this problem by combining the original ideas of the Cryptonets solution with the batch normalization principle introduced at ICML 2015 by Ioffe and Szegedy. We experimentally validate the soundness of our approach with a neural network with non-linear layers. When applied to the MNIST database, it matches the accuracy of the best non-secure versions, thus significantly improving on Cryptonets.
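A hedged sketch of why batch normalization helps in this setting (the function names and parameters below are illustrative, not the paper's implementation): at inference time, batch normalization collapses to a fixed affine map, which costs only additions and scalar multiplications under FHE, and it keeps pre-activations centred in the range where Cryptonets-style low-degree polynomial activations, such as squaring, remain good approximations of the usual non-linearities.

```python
import numpy as np

def bn_fold(x, mean, var, gamma, beta, eps=1e-5):
    # inference-time batch norm collapses to scale * x + shift,
    # an affine map that is cheap to evaluate on FHE ciphertexts
    scale = gamma / np.sqrt(var + eps)
    shift = beta - mean * scale
    return scale * x + shift

def square_activation(x):
    # polynomial activation used by Cryptonets in place of ReLU:
    # a single ciphertext-ciphertext multiplication under FHE
    return x * x

rng = np.random.default_rng(1)
h = rng.normal(3.0, 2.0, 1000)   # pre-activations whose statistics have drifted
h_bn = bn_fold(h, h.mean(), h.var(), gamma=1.0, beta=0.0)
print(h_bn.mean(), h_bn.var())   # re-centred to approximately mean 0, variance 1
out = square_activation(h_bn)    # squaring now operates on a well-conditioned range
```

Without the normalization step, the squaring activation amplifies the drift in scale layer after layer, which is one intuition for why plain Cryptonets degrades as networks get deeper.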
Privacy in statistical databases: UNESCO Chair in Data Privacy International Conference, PSD 2008, Istanbul, Turkey, September 24-26, 2008, Proceedings
Privacy in statistical databases is a discipline whose purpose is to provide solutions
to the tension between the increasing social, political and economic
demand for accurate information, and the legal and ethical obligation to protect
the privacy of the various parties involved. Those parties are the respondents
(the individuals and enterprises to which the database records refer), the data
owners (those organizations spending money in data collection) and the users
(the ones querying the database, who would like their queries to stay confidential).
Beyond law and ethics, there are also practical reasons for data collecting
agencies to invest in respondent privacy: if individual respondents feel their privacy
guaranteed, they are likely to provide more accurate responses. Data owner
privacy is primarily motivated by practical considerations: if an enterprise collects
data at its own expense, it may wish to minimize leakage of those data to
other enterprises (even to those with whom joint data exploitation is planned).
Finally, user privacy results in increased user satisfaction, even if it may curtail
the ability of the database owner to profile users.
There are at least two traditions in statistical database privacy, both of which
started in the 1970s: one stems from official statistics, where the discipline is
also known as statistical disclosure control (SDC), and the other originates from
computer science and database technology. In official statistics, the basic concern
is respondent privacy. In computer science, work started with respondent
privacy but, from 2000 onwards, growing attention has been devoted to owner
privacy (privacy-preserving data mining) and user privacy (private information
retrieval). In the last few years, the interest and the achievements of computer
scientists in the topic have substantially increased.
“Privacy in Statistical Databases 2008” (PSD 2008) is held under the sponsorship
of the UNESCO Chair in Data Privacy, which intends to act as a stable
umbrella for the PSD biennial conference series from now on. PSD 2008 is a successor
of PSD 2006, the final conference of the Eurostat-funded CENEX-SDC
project, held in Rome in 2006, and PSD 2004, the final conference of the EU
FP5 CASC project (IST-2000-25069), held in Barcelona in 2004. Proceedings of
PSD 2006 and PSD 2004 were published by Springer in LNCS 4302 and LNCS
3050, respectively. The three PSD conferences held so far are a follow-up of a
series of high-quality technical conferences on SDC which started one decade
ago with “Statistical Data Protection-SDP’98”, held in Lisbon in 1998 and with
proceedings published by OPOCE, and continued with the AMRADS project
SDC Workshop, held in Luxembourg in 2001 and with proceedings published by
Springer in LNCS 2316.