Probabilistic Couplings For Probabilistic Reasoning
This thesis explores proofs by coupling from the perspective of formal verification. Long employed in probability theory and theoretical computer science, these proofs construct couplings between the output distributions of two probabilistic processes. Couplings can imply various probabilistic relational properties, guarantees that compare two runs of a probabilistic computation.
To give a formal account of this clean proof technique, we first show that proofs in the program logic pRHL (probabilistic Relational Hoare Logic) describe couplings. We formalize couplings that establish various probabilistic properties, including distribution equivalence, convergence, and stochastic domination. Then we deepen the connection between couplings and pRHL by giving a proofs-as-programs interpretation: a coupling proof encodes a probabilistic product program, whose properties imply relational properties of the original two programs. We design the logic xpRHL (product pRHL) to build the product program, with extensions to model more advanced constructions including shift coupling and path coupling.
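As a small illustration of how a coupling can establish stochastic domination (this is a textbook-style sketch, not code from the thesis, and it does not use pRHL), two Bernoulli samplers can be run on one shared uniform draw. The function names and parameters below are hypothetical.

```python
import random

def coupled_bernoullis(p, q, rng):
    """Sample (X, Y) with X ~ Bernoulli(p) and Y ~ Bernoulli(q).

    Both outputs reuse the same uniform draw; this sharing is the coupling.
    """
    u = rng.random()
    return int(u < p), int(u < q)

def check_domination(p, q, trials=10_000, seed=0):
    """Empirically check that X <= Y on every coupled draw when p <= q."""
    rng = random.Random(seed)
    samples = [coupled_bernoullis(p, q, rng) for _ in range(trials)]
    always_ordered = all(x <= y for x, y in samples)
    mean_x = sum(x for x, _ in samples) / trials  # estimates p
    mean_y = sum(y for _, y in samples) / trials  # estimates q
    return always_ordered, mean_x, mean_y
```

Because X <= Y holds on every coupled draw while each marginal keeps its own distribution, the coupling witnesses that Bernoulli(q) stochastically dominates Bernoulli(p) whenever p <= q.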
We then develop an approximate version of probabilistic coupling, based on approximate liftings. It is known that the existence of an approximate lifting implies differential privacy, a relational notion of statistical privacy. We propose a corresponding proof technique---proof by approximate coupling---inspired by the logic apRHL, a version of pRHL for building approximate liftings. Drawing on ideas from existing privacy proofs, we extend apRHL with novel proof rules for constructing new approximate couplings. We give approximate coupling proofs of privacy for the Report-noisy-max and Sparse Vector mechanisms, well-known algorithms from the privacy literature with notoriously subtle privacy proofs, and produce the first formalized proof of privacy for these algorithms in apRHL.
Finally, we enrich the theory of approximate couplings with several more sophisticated constructions: a principle for showing accuracy-dependent privacy, a generalization of the advanced composition theorem from differential privacy, and an optimal approximate coupling relating two subsets of samples. We also show equivalences between approximate couplings and other existing definitions. These ingredients support the first formalized proof of privacy for the Between Thresholds mechanism, an extension of the Sparse Vector mechanism.
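For readers unfamiliar with the Report-noisy-max mechanism named in this abstract, the following is a minimal sketch of its usual textbook form, not the formalization from the thesis. It assumes each query has sensitivity 1 and uses a conservative Laplace noise scale of 2/epsilon; the function names are hypothetical.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def report_noisy_max(query_answers, epsilon, rng):
    """Return the index of the largest noisy answer.

    Assumption: every query has sensitivity 1. Only the index is
    released, never the noisy values themselves.
    """
    noisy = [a + laplace(2.0 / epsilon, rng) for a in query_answers]
    return max(range(len(noisy)), key=noisy.__getitem__)
```

The subtlety the abstract alludes to is that the privacy cost does not grow with the number of queries, even though fresh noise is added to each one.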
Sample Complexity Bounds on Differentially Private Learning via Communication Complexity
In this work we analyze the sample complexity of classification by
differentially private algorithms. Differential privacy is a strong and
well-studied notion of privacy introduced by Dwork et al. (2006) that ensures
that the output of an algorithm leaks little information about the data point
provided by any of the participating individuals. Sample complexity of private
PAC and agnostic learning was studied in a number of prior works starting with
(Kasiviswanathan et al., 2008) but a number of basic questions still remain
open, most notably whether learning with privacy requires more samples than
learning without privacy.
We show that the sample complexity of learning with (pure) differential
privacy can be arbitrarily higher than the sample complexity of learning
without the privacy constraint or the sample complexity of learning with
approximate differential privacy. Our second contribution and the main tool is
an equivalence between the sample complexity of (pure) differentially private
learning of a concept class $C$ (or $SCDP(C)$) and the randomized one-way
communication complexity of the evaluation problem for concepts from $C$. Using
this equivalence we prove the following bounds:
1. $SCDP(C) = \Omega(LDim(C))$, where $LDim(C)$ is Littlestone's (1987)
dimension characterizing the number of mistakes in the online-mistake-bound
learning model. Known bounds on $LDim(C)$ then imply that $SCDP(C)$ can be much
higher than the VC-dimension of $C$.
2. For any $t$, there exists a class $C$ such that $LDim(C) = 2$ but
$SCDP(C) \geq t$.
3. For any $t$, there exists a class $C$ such that the sample complexity of
(pure) $\alpha$-differentially private PAC learning is $\Omega(t/\alpha)$ but
the sample complexity of the relaxed $(\alpha,\beta)$-differentially private
PAC learning is $O(\log(1/\beta)/\alpha)$. This resolves an open problem of
Beimel et al. (2013b). Comment: Extended abstract appears in Conference on Learning Theory (COLT)
201
Hypothesis Testing Interpretations and Renyi Differential Privacy
Differential privacy is a de facto standard in data privacy, with
applications in the public and private sectors. A way to explain differential
privacy, which is particularly appealing to statisticians and social scientists,
is by means of its statistical hypothesis testing interpretation. Informally,
one cannot effectively test whether a specific individual has contributed her
data by observing the output of a private mechanism---any test cannot have both
high significance and high power.
In this paper, we identify some conditions under which a privacy definition
given in terms of a statistical divergence satisfies a similar interpretation.
These conditions are useful to analyze the distinguishability power of
divergences and we use them to study the hypothesis testing interpretation of
some relaxations of differential privacy based on Renyi divergence. This
analysis also results in an improved conversion rule between these definitions
and differential privacy.
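The hypothesis testing interpretation described above can be made concrete for standard (epsilon, delta)-differential privacy. The sketch below uses the well-known trade-off between type I and type II errors for that definition; it does not cover the Rényi-based relaxations studied in the paper, and the function name is hypothetical.

```python
import math

def max_power(alpha, epsilon, delta=0.0):
    """Upper bound on the power of any test at significance level alpha.

    The test tries to tell apart the outputs of an (epsilon, delta)-DP
    mechanism on two neighbouring datasets. With type II error beta:
        alpha + e^eps * beta >= 1 - delta
        e^eps * alpha + beta >= 1 - delta
    and power = 1 - beta.
    """
    bound_a = math.exp(epsilon) * alpha + delta
    bound_b = 1.0 - math.exp(-epsilon) * (1.0 - delta - alpha)
    return max(0.0, min(1.0, bound_a, bound_b))
```

For example, with epsilon = 1 and delta = 0, a test at significance 0.05 has power at most e * 0.05, roughly 0.136, so it cannot have both high significance and high power.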
Advanced Probabilistic Couplings for Differential Privacy
Differential privacy is a promising formal approach to data privacy, which
provides a quantitative bound on the privacy cost of an algorithm that operates
on sensitive information. Several tools have been developed for the formal
verification of differentially private algorithms, including program logics and
type systems. However, these tools do not capture fundamental techniques that
have emerged in recent years, and cannot be used for reasoning about
cutting-edge differentially private algorithms. Existing techniques fail to
handle three broad classes of algorithms: 1) algorithms where privacy depends
on accuracy guarantees, 2) algorithms that are analyzed with the advanced
composition theorem, which shows slower growth in the privacy cost, 3)
algorithms that interactively accept adaptive inputs.
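To see the "slower growth in the privacy cost" that the advanced composition theorem provides, the standard bounds can be compared numerically. This is a sketch of the usual textbook statement, not of the paper's logic; the function names and the `delta_slack` parameter name are hypothetical.

```python
import math

def basic_composition(epsilon, delta, k):
    """Privacy cost of k adaptive (epsilon, delta)-DP steps, basic bound."""
    return k * epsilon, k * delta

def advanced_composition(epsilon, delta, k, delta_slack):
    """Advanced composition bound for the same k steps.

    delta_slack > 0 is the extra failure probability traded for the
    square-root dependence on k.
    """
    eps_total = (math.sqrt(2.0 * k * math.log(1.0 / delta_slack)) * epsilon
                 + k * epsilon * (math.exp(epsilon) - 1.0))
    return eps_total, k * delta + delta_slack
```

For k = 1000 steps with epsilon = 0.01 each, the basic bound gives a total epsilon of 10, while the advanced bound with `delta_slack` = 1e-6 gives roughly 1.76.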
We address these limitations with a new formalism extending apRHL, a
relational program logic that has been used for proving differential privacy of
non-interactive algorithms, and incorporating aHL, a (non-relational) program
logic for accuracy properties. We illustrate our approach through a single
running example, which exemplifies the three classes of algorithms and explores
new variants of the Sparse Vector technique, a well-studied algorithm from the
privacy literature. We implement our logic in EasyCrypt, and formally verify
privacy. We also introduce a novel coupling technique called optimal
subset coupling that may be of independent interest.
On the Differential Privacy of Bayesian Inference
We study how to communicate findings of Bayesian inference to third parties,
while preserving the strong guarantee of differential privacy. Our main
contributions are four different algorithms for private Bayesian inference on
probabilistic graphical models. These include two mechanisms for adding noise
to the Bayesian updates, either directly to the posterior parameters, or to
their Fourier transform so as to preserve update consistency. We also utilise a
recently introduced posterior sampling mechanism, for which we prove bounds for
the specific but general case of discrete Bayesian networks; and we introduce a
maximum-a-posteriori private mechanism. Our analysis includes utility and
privacy bounds, with a novel focus on the influence of graph structure on
privacy. Worked examples and experiments with Bayesian naïve Bayes and
Bayesian linear regression illustrate the application of our mechanisms. Comment: AAAI 2016, Feb 2016, Phoenix, Arizona, United States
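The idea of adding noise directly to posterior parameters can be sketched for the simplest case, a Beta-Bernoulli model. This is an illustrative assumption, not necessarily the authors' exact mechanism: the names, the neighbouring relation (one record replaced) and the clamping step are all hypothetical choices.

```python
import random

def laplace(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_beta_posterior(data, epsilon, prior=(1.0, 1.0), rng=None):
    """Return noisy Beta posterior parameters for 0/1 data.

    Assumption: replacing one record changes the count pair
    (ones, zeros) by L1 distance at most 2, so Laplace noise of
    scale 2/epsilon is added to each count.
    """
    rng = rng or random.Random()
    ones = sum(data)
    zeros = len(data) - ones
    scale = 2.0 / epsilon
    # Clamping at zero is post-processing and keeps the parameters valid.
    noisy_ones = max(0.0, ones + laplace(scale, rng))
    noisy_zeros = max(0.0, zeros + laplace(scale, rng))
    return prior[0] + noisy_ones, prior[1] + noisy_zeros
```

Independently perturbed counts need not sum to the dataset size, which is one reason a mechanism might instead add noise in a transformed domain to preserve update consistency, as the abstract describes.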