Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

Author: Carroll R. J.
Chaudhuri A.
Duchi J. C.
Fienberg S.
Geyer C. J.
Hunter D. R.
Karwa V.
Kinney S. K.
Lu W.
Morris M.
Raghunathan T. E.
Reiter J. P.
Snijders T. A.
Zhou Y.
Publication date: 23/09/2016

Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the application and usefulness of the proposed techniques in solving the challenging problem of maintaining privacy and supporting open access to network data, so that existing studies can be reproduced and new scientific insights can be discovered by analyzing such data. We use a simple yet effective randomized response mechanism to generate synthetic networks under $\epsilon$-edge differential privacy, and then use likelihood-based inference for missing data and Markov chain Monte Carlo techniques to fit exponential-family random graph models to the generated synthetic networks.

Comment: Updated, 39 pages
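The randomized response mechanism mentioned in the abstract can be illustrated with a minimal sketch. It assumes an undirected graph on nodes `0..n-1` and a single flip probability of 1/(1 + e^ε) for every dyad, which is the standard choice that satisfies ε-edge differential privacy; the function name and signature are hypothetical, and the paper's actual mechanism may differ (for example, by using dyad-specific probabilities).

```python
import math
import random


def randomized_response_graph(n, edges, epsilon, rng=None):
    """Return a noisy copy of an undirected graph on nodes 0..n-1.

    Each dyad (i, j) with i < j keeps its true state with probability
    e^eps / (1 + e^eps) and is flipped otherwise. This is an illustrative
    sketch, not the authors' code.
    """
    rng = rng or random.Random()
    flip_p = 1.0 / (1.0 + math.exp(epsilon))
    true_edges = {frozenset(e) for e in edges}
    noisy = set()
    for i in range(n):
        for j in range(i + 1, n):
            present = frozenset((i, j)) in true_edges
            if rng.random() < flip_p:
                present = not present  # flip edge <-> non-edge
            if present:
                noisy.add((i, j))
    return noisy


if __name__ == "__main__":
    # Hypothetical toy graph; epsilon = 1.0 is an arbitrary example value.
    print(sorted(randomized_response_graph(5, [(0, 1), (1, 2)], 1.0, random.Random(7))))
```

Because the ratio of "keep" to "flip" probabilities is e^ε, changing one edge of the input changes the probability of any output graph by at most that factor. Smaller ε means more flips and a noisier synthetic network.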

arXiv.org e-Print Archive

Research Online

Differentially Private Exponential Random Graphs

Author: A. Goldenberg
A. Hout
C. Dwork
C.J. Geyer
D.R. Hunter
G. Robins
L. Michell
M. Morris
M. Pearson
M.S. Handcock
O. Frank
P.S. Bearman
S.M. Goodreau
T.A.B. Snijders
V. Karwa
Y.M.J. Woo
Publication date: 01/01/2014

We propose methods to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by the social network. The proposed techniques aim at fitting and estimating a wide class of exponential random graph models (ERGMs) in a differentially private manner, and thus offer rigorous privacy guarantees. More specifically, we use the randomized response mechanism to release networks under $\epsilon$-edge differential privacy. To maintain utility for statistical inference, we treat the original graph as missing data and propose a way to use likelihood-based inference and Markov chain Monte Carlo (MCMC) techniques to fit ERGMs to the produced synthetic networks. We demonstrate the usefulness of the proposed techniques on a real data example.

Comment: minor edit
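To see why the released network cannot be analyzed as if it were the original, the sketch below corrects a single statistic, the edge count, for the bias that randomized response introduces. This is a simple method-of-moments correction and not the likelihood-based MCMC procedure the abstract describes; it assumes every dyad was flipped independently with probability p = 1/(1 + e^ε), and the function name is hypothetical.

```python
import math


def debiased_edge_count(n, noisy_edge_count, epsilon):
    """Estimate the true edge count of an undirected n-node graph.

    Under independent flips with probability p, the expected noisy count is
    m * (1 - p) + (N - m) * p, where N = n * (n - 1) / 2 is the number of
    dyads and m is the true count. Solving for m gives the estimator below.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive for the correction to exist")
    p = 1.0 / (1.0 + math.exp(epsilon))
    dyads = n * (n - 1) / 2
    return (noisy_edge_count - dyads * p) / (1.0 - 2.0 * p)
```

The estimate is unbiased but can fall outside the range [0, N] for small ε, which is one reason a full missing-data likelihood approach is attractive for richer ERGM statistics.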

arXiv.org e-Print Archive

Research Online

Noise Infusion as a Confidentiality Protection Measure for Graph-Based Statistics

Author: Abowd John M.
McKinney Kevin L.
Publication venue: DigitalCommons@ILR
Publication date: 28/07/2015

We use the bipartite graph representation of longitudinally linked employer-employee data, and the associated projections onto the employer and employee nodes, respectively, to characterize the set of potential statistical summaries that the trusted custodian might produce. We consider noise infusion as the primary confidentiality protection method. We show that a relatively straightforward extension of the dynamic noise-infusion method used in the U.S. Census Bureau’s Quarterly Workforce Indicators can be adapted to provide the same confidentiality guarantees for the graph-based statistics: all inputs have been modified by a minimum percentage deviation (i.e., no actual respondent data are used) and, as the number of entities contributing to a particular statistic increases, the accuracy of that statistic approaches the unprotected value. Our method also ensures that the protected statistics will be identical in all releases based on the same inputs.
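The three properties named in the abstract (a minimum percentage deviation on every input, accuracy that improves as more entities contribute, and identical output across releases) can be illustrated with a minimal sketch of multiplicative noise infusion. Everything specific here is an assumption for illustration: the uniform distribution of the deviation, the 5% to 15% bounds, the keyed-hash seeding, and the function names. The abstract does not state the distribution or parameters actually used.

```python
import hashlib
import random


def fuzz_factor(entity_id, secret_key, min_dev=0.05, max_dev=0.15):
    """Permanent multiplicative factor for one entity (hypothetical design).

    The factor is derived from a keyed hash, so the same entity always gets
    the same factor, and it is never closer to 1 than min_dev.
    """
    digest = hashlib.sha256(f"{secret_key}:{entity_id}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    magnitude = rng.uniform(min_dev, max_dev)  # assumed uniform, not from the source
    sign = 1 if rng.random() < 0.5 else -1
    return 1.0 + sign * magnitude


def protected_total(contributions, secret_key):
    """Sum of per-entity values, each distorted by that entity's factor."""
    return sum(
        value * fuzz_factor(entity_id, secret_key)
        for entity_id, value in contributions.items()
    )


if __name__ == "__main__":
    # Hypothetical employer-level counts.
    data = {"emp_a": 120.0, "emp_b": 45.0, "emp_c": 300.0}
    print(protected_total(data, "example-key"))
```

Because upward and downward distortions are equally likely, they tend to cancel in a sum over many entities, so the relative error of the total shrinks as the number of contributors grows, while a cell with a single contributor is always off by at least the minimum deviation.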

CiteSeerX

Proceedings from the Synthetic LBD International Seminar

Author: Kinney Saki
Schmutte Ian M
Vilhuber Lars
Publication venue: DigitalCommons@ILR
Publication date: 22/09/2017

On May 9, 2017, we hosted a seminar to discuss the conditions necessary to implement the SynLBD approach with interested parties, with the goal of providing a straightforward toolkit to implement the same procedure on other data. The proceedings summarize the discussions during the workshop.