32 research outputs found
Differentially Private Data Analysis of Social Networks via Restricted Sensitivity
We introduce the notion of restricted sensitivity as an alternative to global
and smooth sensitivity to improve accuracy in differentially private data
analysis. The definition of restricted sensitivity is similar to that of global
sensitivity except that instead of quantifying over all possible datasets, we
take advantage of any beliefs about the dataset that a querier may have, to
quantify over a restricted class of datasets. Specifically, given a query f and
a hypothesis H about the structure of a dataset D, we show generically how to
transform f into a new query f_H whose global sensitivity (over all datasets
including those that do not satisfy H) matches the restricted sensitivity of
the query f. Moreover, if the belief of the querier is correct (i.e., D is in
H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be
inaccurate.
We demonstrate the usefulness of this notion by considering the task of
answering queries regarding social-networks, which we model as a combination of
a graph and a labeling of its vertices. In particular, while our generic
procedure is computationally inefficient, for the specific definition of H as
graphs of bounded degree, we exhibit efficient ways of constructing f_H using
different projection-based techniques. We then analyze two important query
classes: subgraph counting queries (e.g., number of triangles) and local
profile queries (e.g., number of people who know a spy and a computer-scientist
who know each other). We demonstrate that the restricted sensitivity of such
queries can be significantly lower than their smooth sensitivity. Thus, using
restricted sensitivity we can maintain privacy whether or not D is in H, while
providing more accurate results in the event that H holds true
Detecting Communities under Differential Privacy
Complex networks usually expose community structure with groups of nodes
sharing many links with the other nodes in the same group and relatively few
with the nodes of the rest. This feature captures valuable information about
the organization and even the evolution of the network. Over the last decade, a
great number of algorithms for community detection have been proposed to deal
with the increasingly complex networks. However, the problem of doing this in a
private manner is rarely considered. In this paper, we solve this problem under
differential privacy, a prominent privacy concept for releasing private data.
We analyze the major challenges behind the problem and propose several schemes
to tackle them from two perspectives: input perturbation and algorithm
perturbation. We choose Louvain method as the back-end community detection for
input perturbation schemes and propose the method LouvainDP which runs Louvain
algorithm on a noisy super-graph. For algorithm perturbation, we design
ModDivisive using exponential mechanism with the modularity as the score. We
have thoroughly evaluated our techniques on real graphs of different sizes and
verified their outperformance over the state-of-the-art
Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies
Privacy definitions provide ways for trading-off the privacy of individuals
in a statistical database for the utility of downstream analysis of the data.
In this paper, we present Blowfish, a class of privacy definitions inspired by
the Pufferfish framework, that provides a rich interface for this trade-off. In
particular, we allow data publishers to extend differential privacy using a
policy, which specifies (a) secrets, or information that must be kept secret,
and (b) constraints that may be known about the data. While the secret
specification allows increased utility by lessening protection for certain
individual properties, the constraint specification provides added protection
against an adversary who knows correlations in the data (arising from
constraints). We formalize policies and present novel algorithms that can
handle general specifications of sensitive information and certain count
constraints. We show that there are reasonable policies under which our privacy
mechanisms for k-means clustering, histograms and range queries introduce
significantly lesser noise than their differentially private counterparts. We
quantify the privacy-utility trade-offs for various policies analytically and
empirically on real datasets.Comment: Full version of the paper at SIGMOD'14 Snowbird, Utah US