Differentially Private Exponential Random Graphs
We propose methods to release and analyze synthetic graphs in order to
protect privacy of individual relationships captured by the social network.
Proposed techniques aim at fitting and estimating a wide class of exponential
random graph models (ERGMs) in a differentially private manner, and thus offer
rigorous privacy guarantees. More specifically, we use the randomized response
mechanism to release networks under $\epsilon$-edge differential privacy. To
maintain utility for statistical inference, treating the original graph as
missing, we propose a way to use likelihood based inference and Markov chain
Monte Carlo (MCMC) techniques to fit ERGMs to the produced synthetic networks.
We demonstrate the usefulness of the proposed techniques on a real data
example.
Comment: minor edit
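The randomized response step described in the abstract above can be illustrated with a minimal sketch. It assumes a simple undirected graph on nodes `0..n-1` and flips every dyad independently with probability 1 / (1 + e^ε), a standard way to obtain ε-edge differential privacy. The function name `randomized_response_graph` and its parameters are hypothetical and are not taken from the paper, whose mechanism may differ in detail.

```python
import math
import random
from itertools import combinations


def randomized_response_graph(n, edges, epsilon, rng=None):
    """Release a noisy copy of an undirected graph on nodes 0..n-1.

    Hypothetical helper for illustration only. Each dyad (u, v) is
    flipped independently with probability 1 / (1 + e^epsilon).
    """
    rng = rng or random.Random()
    # Keeping a dyad with probability e^eps / (1 + e^eps) bounds the
    # likelihood ratio between graphs differing in one edge by e^eps.
    flip_prob = 1.0 / (1.0 + math.exp(epsilon))
    original = {frozenset(e) for e in edges}
    released = set()
    for u, v in combinations(range(n), 2):
        present = frozenset((u, v)) in original
        if rng.random() < flip_prob:
            present = not present  # report the opposite of the truth
        if present:
            released.add((u, v))
    return released


# Example call: a 5-node path released at epsilon = 1.0.
noisy_edges = randomized_response_graph(
    5, [(0, 1), (1, 2), (2, 3), (3, 4)], epsilon=1.0, rng=random.Random(7)
)
```

Because every dyad is perturbed, the released graph of a sparse network is typically much denser than the original for small ε, which is why the abstract treats the original graph as missing data when fitting the model rather than fitting directly to the noisy graph.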
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. {\em Differential privacy} has recently emerged
as the {\em de facto} standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of
frequent {\em itemsets} cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantee of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
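The idea of combining pattern selection and privacy in one MCMC sampler can be sketched with a generic Metropolis-Hastings walk whose stationary distribution is the exponential-mechanism distribution, proportional to exp(ε · score / (2Δ)). This is not the paper's algorithm: the names `mh_sample`, `toy_support`, and `toy_neighbors` are hypothetical, the "patterns" are just integers on a path, and the neighborhood relation is assumed to be symmetric.

```python
import math
import random


def mh_sample(neighbors, score, epsilon, sensitivity, start, steps, rng=None):
    """Metropolis-Hastings walk targeting exp(epsilon * score / (2 * sensitivity)).

    Hypothetical sketch: `neighbors(x)` must return a non-empty list and be
    symmetric (y in neighbors(x) exactly when x in neighbors(y)).
    """
    rng = rng or random.Random()

    def log_weight(x):
        return epsilon * score(x) / (2.0 * sensitivity)

    current = start
    for _ in range(steps):
        nbrs = neighbors(current)
        proposal = rng.choice(nbrs)  # uniform proposal over neighbors
        # Hastings correction for neighborhoods of different sizes.
        log_accept = (
            log_weight(proposal) - log_weight(current)
            + math.log(len(nbrs)) - math.log(len(neighbors(proposal)))
        )
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            current = proposal
    return current


# Toy state space: five "patterns" on a path, with made-up support counts.
toy_support = {0: 1, 1: 3, 2: 10, 3: 4, 4: 2}


def toy_neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < 5]


sampled = mh_sample(toy_neighbors, toy_support.get, epsilon=1.0,
                    sensitivity=1.0, start=0, steps=500,
                    rng=random.Random(3))
```

A chain run for a finite number of steps only approximates its stationary distribution, so a sketch like this does not by itself deliver an exact ε-differential-privacy guarantee. The abstract states that the paper establishes its own privacy and utility guarantees.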
Private Graphon Estimation for Sparse Graphs
We design algorithms for fitting a high-dimensional statistical model to a
large, sparse network without revealing sensitive information of individual
members. Given a sparse input graph $G$, our algorithms output a
node-differentially-private nonparametric block model approximation. By
node-differentially-private, we mean that our output hides the insertion or
removal of a vertex and all its adjacent edges. If $G$ is an instance of the
network obtained from a generative nonparametric model defined in terms of a
graphon $W$, our model guarantees consistency, in the sense that as the number
of vertices tends to infinity, the output of our algorithm converges to $W$ in
an appropriate version of the $L_2$ norm. In particular, this means we can
estimate the sizes of all multi-way cuts in $G$.
Our results hold as long as $W$ is bounded, the average degree of $G$ grows
at least like the log of the number of vertices, and the number of blocks goes
to infinity at an appropriate rate. We give explicit error bounds in terms of
the parameters of the model; in several settings, our bounds improve on or
match known nonprivate results.
Comment: 36 pages
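The verbal definition of node differential privacy in the abstract above corresponds to the standard formal statement below. The symbols $\mathcal{A}$ (the randomized algorithm), $\epsilon$ (the privacy parameter), and $S$ (a set of outputs) are notation assumed here for illustration and do not appear in the abstract.

```latex
% G \sim G' means G' is obtained from G by inserting or removing
% one vertex together with all edges incident to it.
\Pr[\mathcal{A}(G) \in S] \;\le\; e^{\epsilon}\,\Pr[\mathcal{A}(G') \in S]
\quad \text{for all node-neighboring } G \sim G' \text{ and all measurable } S.
```

Since removing one vertex can change up to $n-1$ edges, this requirement is considerably stronger than the edge-level guarantee used in the ERGM entries of this list.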
Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models
Motivated by a real-life problem of sharing social network data that contain
sensitive personal information, we propose a novel approach to release and
analyze synthetic graphs in order to protect privacy of individual
relationships captured by the social network while maintaining the validity of
statistical results. A case study using a version of the Enron e-mail corpus
dataset demonstrates the application and usefulness of the proposed techniques
in solving the challenging problem of maintaining privacy \emph{and} supporting
open access to network data, both to ensure reproducibility of existing studies
and to discover new scientific insights that can be obtained by analyzing such
data. We use a simple yet effective randomized response mechanism to generate
synthetic networks under $\epsilon$-edge differential privacy, and then use
likelihood based inference for missing data and Markov chain Monte Carlo
techniques to fit exponential-family random graph models to the generated
synthetic networks.
Comment: Updated, 39 pages
Detecting Communities under Differential Privacy
Complex networks usually exhibit community structure: groups of nodes that
share many links with other nodes in the same group and relatively few with
nodes in the rest of the network. This feature captures valuable information about
the organization and even the evolution of the network. Over the last decade, a
great number of algorithms for community detection have been proposed to deal
with the increasingly complex networks. However, the problem of doing this in a
private manner is rarely considered. In this paper, we solve this problem under
differential privacy, a prominent privacy concept for releasing private data.
We analyze the major challenges behind the problem and propose several schemes
to tackle them from two perspectives: input perturbation and algorithm
perturbation. We choose the Louvain method as the back-end community detection
algorithm for input perturbation schemes and propose LouvainDP, which runs the
Louvain algorithm on a noisy super-graph. For algorithm perturbation, we design
ModDivisive, which uses the exponential mechanism with modularity as the score. We
have thoroughly evaluated our techniques on real graphs of different sizes and
verified that they outperform the state of the art.
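As a rough illustration of using the exponential mechanism with modularity as the score, the sketch below picks one partition from a small candidate list with probability proportional to exp(ε · Q / (2Δ)). It is not ModDivisive itself: the function names are hypothetical, the candidate list is supplied by hand, and the sensitivity Δ of modularity is left as a caller-supplied assumption rather than a value taken from the paper.

```python
import math
import random


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ l_c / m - (d_c / (2m))^2 ]."""
    m = len(edges)
    intra = {}   # l_c: number of edges inside community c
    degree = {}  # d_c: total degree of nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2.0 * m)) ** 2
               for c, d in degree.items())


def exponential_mechanism(candidates, score, epsilon, sensitivity, rng=None):
    """Sample a candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity)). `sensitivity` is an
    assumption the caller must justify for the chosen score."""
    rng = rng or random.Random()
    scores = [score(c) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity))
               for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}

chosen = exponential_mechanism(
    [split, merged], lambda p: modularity(toy_edges, p),
    epsilon=1.0, sensitivity=1.0, rng=random.Random(11),
)
```

Enumerating every partition is infeasible on real graphs, so a practical scheme has to restrict or sample the candidate space. The sketch only shows how a quality score such as modularity plugs into the mechanism.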