Novel differentially private mechanisms for graphs
In this paper, we introduce new methods for releasing
differentially private graphs. Our techniques are based
on a new way to distribute noise among edge weights. More
precisely, we rely on the addition of noise whose amplitude
is edge-calibrated and optimize the distribution of the privacy
budget among subsets of edges. The generic privacy framework
that we propose can capture all privacy notions introduced so
far in the literature to release graphs in a differentially private
manner. Furthermore, experimental results on real datasets show
that our methods outperform the standard existing techniques, in
particular in terms of the preservation of utility. In addition, these
experiments show that our mechanisms guarantee epsilon-differential
privacy for a reasonable level of privacy epsilon, while preserving the
spectral information of the input graph.
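
As a rough illustration of edge-calibrated noise with a budget split among subsets of edges, the sketch below adds Laplace noise to each edge weight, using a per-group epsilon. It is not the mechanism of the paper: the function names, the `groups` and `budgets` dictionaries, and the neighboring relation (two graphs differ in the weight of a single edge by at most `sensitivity`) are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_edge_weights(weights, groups, budgets, sensitivity=1.0, seed=None):
    """Add Laplace noise to every edge weight.

    weights:  dict mapping edge -> weight
    groups:   dict mapping edge -> group label (hypothetical partition of edges)
    budgets:  dict mapping group label -> epsilon assigned to that group
    Assumption: neighboring graphs differ in one edge weight by at most
    `sensitivity`, so an edge in group g is protected at level budgets[g].
    """
    rng = random.Random(seed)
    noisy = {}
    for edge, weight in weights.items():
        epsilon = budgets[groups[edge]]
        scale = sensitivity / epsilon  # smaller epsilon -> more noise
        noisy[edge] = weight + laplace_noise(scale, rng)
    return noisy


weights = {("a", "b"): 3.0, ("b", "c"): 1.0, ("a", "c"): 5.0}
groups = {("a", "b"): "core", ("b", "c"): "core", ("a", "c"): "periphery"}
budgets = {"core": 0.5, "periphery": 2.0}
noisy_weights = perturb_edge_weights(weights, groups, budgets, seed=0)
```

Here the "core" edges receive more noise than the "periphery" edge because their group is given a smaller epsilon. How to choose the groups and their budgets is exactly the optimization the abstract refers to, and is not attempted here.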
Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models
Motivated by a real-life problem of sharing social network data that contain
sensitive personal information, we propose a novel approach to release and
analyze synthetic graphs in order to protect privacy of individual
relationships captured by the social network while maintaining the validity of
statistical results. A case study using a version of the Enron e-mail corpus
dataset demonstrates the application and usefulness of the proposed techniques
in solving the challenging problem of maintaining privacy and supporting
open access to network data to ensure reproducibility of existing studies and
discovering new scientific insights that can be obtained by analyzing such
data. We use a simple yet effective randomized response mechanism to generate
synthetic networks under epsilon-edge differential privacy, and then use
likelihood based inference for missing data and Markov chain Monte Carlo
techniques to fit exponential-family random graph models to the generated
synthetic networks.
Comment: Updated, 39 pages
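
A minimal sketch of a randomized response mechanism over edges, assuming the textbook variant in which every potential edge of an undirected simple graph is kept with probability e^epsilon / (1 + e^epsilon) and flipped otherwise. The paper may use a different or per-dyad flipping scheme; the function name and the node labelling 0..n-1 are assumptions for this example.

```python
import itertools
import math
import random


def randomized_response_graph(n, edges, epsilon, seed=None):
    """Return a synthetic edge set for an undirected graph on nodes 0..n-1.

    Each node pair reports its true state (edge / no edge) with probability
    e^eps / (1 + e^eps) and the opposite state otherwise. Changing a single
    edge of the input changes the output distribution by a factor of at
    most e^eps, which is the epsilon-edge differential privacy condition.
    """
    rng = random.Random(seed)
    keep_prob = 1.0 / (1.0 + math.exp(-epsilon))  # equals e^eps / (1 + e^eps)
    true_edges = {frozenset(e) for e in edges}
    synthetic = set()
    for u, v in itertools.combinations(range(n), 2):
        present = frozenset((u, v)) in true_edges
        if rng.random() >= keep_prob:  # flip with probability 1 - keep_prob
            present = not present
        if present:
            synthetic.add((u, v))
    return synthetic


synthetic_edges = randomized_response_graph(5, [(0, 1), (1, 2), (3, 4)], epsilon=1.0, seed=42)
```

Because the flipping probabilities are known, a downstream model fit can in principle account for the noise, which is the role the abstract assigns to likelihood-based inference for missing data.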
Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies
Privacy definitions provide ways of trading off the privacy of individuals
in a statistical database for the utility of downstream analysis of the data.
In this paper, we present Blowfish, a class of privacy definitions inspired by
the Pufferfish framework, that provides a rich interface for this trade-off. In
particular, we allow data publishers to extend differential privacy using a
policy, which specifies (a) secrets, or information that must be kept secret,
and (b) constraints that may be known about the data. While the secret
specification allows increased utility by lessening protection for certain
individual properties, the constraint specification provides added protection
against an adversary who knows correlations in the data (arising from
constraints). We formalize policies and present novel algorithms that can
handle general specifications of sensitive information and certain count
constraints. We show that there are reasonable policies under which our privacy
mechanisms for k-means clustering, histograms and range queries introduce
significantly less noise than their differentially private counterparts. We
quantify the privacy-utility trade-offs for various policies analytically and
empirically on real datasets.
Comment: Full version of the paper at SIGMOD'14, Snowbird, Utah, US
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantee of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
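
One way to see how pattern selection and differential privacy can share an MCMC framework is a Metropolis-Hastings chain whose stationary distribution weights each candidate pattern by exp(epsilon * support / (2 * sensitivity)), the distribution used by the exponential mechanism. The sketch below is a toy under strong assumptions and not the algorithm of the paper: patterns are opaque labels with precomputed supports (the `supports` dictionary is hypothetical), proposals are uniform over all candidates instead of moving between neighboring patterns, and a finite number of steps only approximates the target distribution.

```python
import math
import random


def mh_sample_pattern(supports, epsilon, steps=1000, sensitivity=1.0, seed=None):
    """Draw one pattern with probability approximately proportional to
    exp(epsilon * support / (2 * sensitivity)) via Metropolis-Hastings."""
    rng = random.Random(seed)
    patterns = sorted(supports)
    current = rng.choice(patterns)
    for _ in range(steps):
        proposal = rng.choice(patterns)  # symmetric (uniform) proposal
        log_ratio = epsilon * (supports[proposal] - supports[current]) / (2.0 * sensitivity)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current


# Hypothetical supports: number of database graphs containing each pattern.
supports = {"triangle": 40, "path3": 35, "star4": 5}
chosen = mh_sample_pattern(supports, epsilon=0.5, steps=500, seed=7)
```

With a uniform proposal the chain never needs the output of a non-private miner, only the ability to evaluate the support of the current and proposed patterns. A practical sampler over a real pattern space would need a local proposal and an efficient way to count neighboring patterns, as the abstract notes.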