Formal Verification of Differential Privacy for Interactive Systems
Differential privacy is a promising approach to privacy preserving data
analysis with a well-developed theory for functions. Despite recent work on
implementing systems that aim to provide differential privacy, the problem of
formally verifying that these systems have differential privacy has not been
adequately addressed. This paper presents the first results towards automated
verification of source code for differentially private interactive systems. We
develop a formal probabilistic automaton model of differential privacy for
systems by adapting prior work on differential privacy for functions. The main
technical result of the paper is a sound proof technique based on a form of
probabilistic bisimulation relation for proving that a system modeled as a
probabilistic automaton satisfies differential privacy. The novelty lies in the
way we track quantitative privacy leakage bounds using a relation family
instead of a single relation. We illustrate the proof technique on a
representative automaton motivated by PINQ, an implemented system that is
intended to provide differential privacy. To make our proof technique easier to
apply to realistic systems, we prove a form of refinement theorem and apply it
to show that a refinement of the abstract PINQ automaton also satisfies our
differential privacy definition. Finally, we begin the process of automating
our proof technique by providing an algorithm for mechanically checking a
restricted class of relations from the proof technique.
Comment: 65 pages with 1 figur
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC)
sampling-based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantees of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
Proving Differential Privacy with Shadow Execution
Recent work on formal verification of differential privacy shows a trend
toward usability and expressiveness -- generating correctness proofs of
sophisticated algorithms while minimizing the annotation burden on programmers.
Sometimes, combining those two requires substantial changes to program logics:
one recent paper is able to verify Report Noisy Max automatically, but it
involves a complex verification system using customized program logics and
verifiers.
In this paper, we propose a new proof technique, called shadow execution, and
embed it into a language called ShadowDP. ShadowDP uses shadow execution to
generate proofs of differential privacy with very few programmer annotations
and without relying on customized logics and verifiers. In addition to
verifying Report Noisy Max, we show that it can verify a new variant of Sparse
Vector that reports the gap between some noisy query answers and the noisy
threshold. Moreover, ShadowDP reduces the complexity of verification: for all
of the algorithms we have evaluated, type checking and verification together
take at most 3 seconds, while prior work takes minutes on the same algorithms.
Comment: 23 pages, 12 figures, PLDI'1
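For orientation, here is a minimal Python sketch of Report Noisy Max, the mechanism the abstract above names as a verification target. It is not taken from the paper and does not use ShadowDP. The function names, the sensitivity-1 assumption, and the conservative Laplace scale of 2/epsilon are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def report_noisy_max(query_answers, epsilon, rng=None):
    """Return the index of the largest noisy answer (hypothetical helper).

    Assumes every query has sensitivity 1; scale 2/epsilon is a
    conservative choice for that setting, not the paper's parameter.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not query_answers:
        raise ValueError("query_answers must be non-empty")
    rng = rng or random.Random()
    scale = 2.0 / epsilon
    noisy = [a + laplace_noise(scale, rng) for a in query_answers]
    # Only the index is released, never the noisy values themselves.
    return max(range(len(noisy)), key=noisy.__getitem__)
```

The mechanism releases only the winning index, which is why the privacy cost does not grow with the number of queries; that argument is what makes the proof hard to automate.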
Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies
Privacy definitions provide ways of trading off the privacy of individuals
in a statistical database for the utility of downstream analysis of the data.
In this paper, we present Blowfish, a class of privacy definitions inspired by
the Pufferfish framework, that provides a rich interface for this trade-off. In
particular, we allow data publishers to extend differential privacy using a
policy, which specifies (a) secrets, or information that must be kept secret,
and (b) constraints that may be known about the data. While the secret
specification allows increased utility by lessening protection for certain
individual properties, the constraint specification provides added protection
against an adversary who knows correlations in the data (arising from
constraints). We formalize policies and present novel algorithms that can
handle general specifications of sensitive information and certain count
constraints. We show that there are reasonable policies under which our privacy
mechanisms for k-means clustering, histograms and range queries introduce
significantly less noise than their differentially private counterparts. We
quantify the privacy-utility trade-offs for various policies analytically and
empirically on real datasets.
Comment: Full version of the paper at SIGMOD'14 Snowbird, Utah US
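As a point of comparison, the "differentially private counterpart" for a histogram query is commonly the Laplace mechanism. The Python sketch below shows that baseline only; it is not a Blowfish mechanism and is not drawn from the paper. The function names and the add/remove-one-record neighboring model (sensitivity 1) are assumptions made for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_histogram(records, categories, epsilon, rng=None):
    """Noisy count per category (hypothetical baseline helper).

    Assumes neighboring datasets differ by adding or removing one
    record, so the histogram has sensitivity 1 and scale 1/epsilon.
    Records outside `categories` are ignored.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {c: counts[c] + laplace_noise(scale, rng) for c in categories}
```

A policy that protects less than full record presence would, per the abstract, permit less noise than this baseline; how much less depends on the policy and is not modeled here.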