Search CORE

52 research outputs found

Differentially Private Model Selection with Penalized and Constrained Likelihood

Author: Chaudhuri K.
Chaudhuri K.
Chaudhuri K.
Dalenius T.
Duchi J. C.
Fienberg S.
Gaboardi M.
Hardt M.
Lei J.
Rubin D. B.
Smith A.
Tibshirani R.
Uhler C.
Publication venue
Publication date: 14/07/2016
Field of study

In statistical disclosure control, the goal of data analysis is twofold: The released information must provide accurate and useful statistics about the underlying population of interest, while minimizing the potential for an individual record to be identified. In recent years, the notion of differential privacy has received much attention in theoretical computer science, machine learning, and statistics. It provides a rigorous and strong notion of protection for individuals' sensitive information. A fundamental question is how to incorporate differential privacy into traditional statistical inference procedures. In this paper we study model selection in multivariate linear regression under the constraint of differential privacy. We show that model selection procedures based on penalized least squares or likelihood can be made differentially private by a combination of regularization and randomization, and propose two algorithms to do so. We show that our private procedures are consistent under essentially the same conditions as the corresponding non-private procedures. We also find that under differential privacy, the procedure becomes more sensitive to the tuning parameters. We illustrate and evaluate our method using simulation studies and two real data examples

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)

Recommended from our members

Anonymisation of geographical distance matrices via Lipschitz embedding

Author: AS Whittemore
BS Everitt
CD Lloyd
DR Helsel
G Duncan
GR Hjaltason
GT Duncan
H-W Jung
J Bourgain
J Höhne
J Konc
JJ Trinckes
K Emam El
K Emam El
K Emam El
K Emam El
K Emam El
K Kenthapadi
K Riesen
KC Clarke
KH Hampton
L Sweeney
LA Waller
M Kroll
Martin Kroll
MM Merener
MP Armstrong
MP Gutmann
Rainer Schnell
RS Bivand
S Dray
SC Wieland
T Dalenius
Ö Uzuner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

BACKGROUND: Anonymisation of spatially referenced data has received increasing attention in recent years. Whereas the research focus has been on the anonymisation of point locations, the disclosure risk arising from the publishing of inter-point distances and corresponding anonymisation methods have not been studied systematically. METHODS: We propose a new anonymisation method for the release of geographical distances between records of a microdata file-for example patients in a medical database. We discuss a data release scheme in which microdata without coordinates and an additional distance matrix between the corresponding rows of the microdata set are released. In contrast to most other approaches this method preserves small distances better than larger distances. The distances are modified by a variant of Lipschitz embedding. RESULTS: The effects of the embedding parameters on the risk of data disclosure are evaluated by linkage experiments using simulated data. The results indicate small disclosure risks for appropriate embedding parameters. CONCLUSION: The proposed method is useful if published distance information might be misused for the re-identification of records. The method can be used for publishing scientific-use-files and as an additional tool for record-linkage studies

City Research Online

Crossref

Springer - Publisher Connector

PubMed Central