
    Probabilistic Couplings For Probabilistic Reasoning

    This thesis explores proofs by coupling from the perspective of formal verification. Long employed in probability theory and theoretical computer science, these proofs construct couplings between the output distributions of two probabilistic processes. Couplings can imply various probabilistic relational properties: guarantees that compare two runs of a probabilistic computation. To give a formal account of this clean proof technique, we first show that proofs in the program logic pRHL (probabilistic Relational Hoare Logic) describe couplings. We formalize couplings that establish various probabilistic properties, including distribution equivalence, convergence, and stochastic domination. Then we deepen the connection between couplings and pRHL by giving a proofs-as-programs interpretation: a coupling proof encodes a probabilistic product program, whose properties imply relational properties of the original two programs. We design the logic xpRHL (product pRHL) to build the product program, with extensions to model more advanced constructions, including shift coupling and path coupling. We then develop an approximate version of probabilistic coupling, based on approximate liftings. It is known that the existence of an approximate lifting implies differential privacy, a relational notion of statistical privacy. We propose a corresponding proof technique, proof by approximate coupling, inspired by the logic apRHL, a version of pRHL for building approximate liftings. Drawing on ideas from existing privacy proofs, we extend apRHL with novel proof rules for constructing new approximate couplings. We give approximate coupling proofs of privacy for the Report-noisy-max and Sparse Vector mechanisms, well-known algorithms from the privacy literature with notoriously subtle privacy proofs, and produce the first formalized proof of privacy for these algorithms in apRHL. Finally, we enrich the theory of approximate couplings with several more sophisticated constructions: a principle for showing accuracy-dependent privacy, a generalization of the advanced composition theorem from differential privacy, and an optimal approximate coupling relating two subsets of samples. We also show equivalences between approximate couplings and other existing definitions. These ingredients support the first formalized proof of privacy for the Between Thresholds mechanism, an extension of the Sparse Vector mechanism.
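    For readers who have not seen the construction, the two definitions at the heart of these proofs can be sketched as follows (notation ours, not necessarily the thesis's). A coupling is a joint distribution with the right marginals, and a relational property is certified by a coupling supported inside the relation:

        % coupling of \mu_1 \in \mathcal{D}(A) and \mu_2 \in \mathcal{D}(B):
        \mu \in \mathcal{D}(A \times B) \quad\text{with}\quad \pi_1(\mu) = \mu_1,\ \pi_2(\mu) = \mu_2

        % lifting of a relation R \subseteq A \times B:
        \mu_1 \sim_R \mu_2 \iff \exists\,\text{coupling } \mu \text{ of } \mu_1, \mu_2 \text{ with } \operatorname{supp}(\mu) \subseteq R

    Taking $R$ to be equality yields distribution equivalence, and taking $R = \{(a,b) : a \le b\}$ yields stochastic domination, two of the properties mentioned above.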

    Sample Complexity Bounds on Differentially Private Learning via Communication Complexity

    In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy, introduced by Dwork et al. (2006), that ensures that the output of an algorithm leaks little information about the data point provided by any participating individual. The sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with Kasiviswanathan et al. (2008), but a number of basic questions remain open, most notably whether learning with privacy requires more samples than learning without privacy. We show that the sample complexity of learning with (pure) differential privacy can be arbitrarily higher than the sample complexity of learning without the privacy constraint, or of learning with approximate differential privacy. Our second contribution, and the main tool, is an equivalence between the sample complexity of (pure) differentially private learning of a concept class $C$, denoted $SCDP(C)$, and the randomized one-way communication complexity of the evaluation problem for concepts from $C$. Using this equivalence we prove the following bounds: 1. $SCDP(C) = \Omega(LDim(C))$, where $LDim(C)$ is Littlestone's (1987) dimension, which characterizes the number of mistakes in the online mistake-bound learning model. Known bounds on $LDim(C)$ then imply that $SCDP(C)$ can be much higher than the VC dimension of $C$. 2. For every $t$, there exists a class $C$ such that $LDim(C) = 2$ but $SCDP(C) \geq t$. 3. For every $t$, there exists a class $C$ such that the sample complexity of (pure) $\alpha$-differentially private PAC learning is $\Omega(t/\alpha)$, but the sample complexity of the relaxed $(\alpha,\beta)$-differentially private PAC learning is $O(\log(1/\beta)/\alpha)$. This resolves an open problem of Beimel et al. (2013b). (Extended abstract in the Conference on Learning Theory (COLT), 2014.)
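    For reference, the two notions being separated can be stated in one line (our notation, matching the abstract's $\alpha$ and $\beta$): a randomized algorithm $M$ is $(\alpha,\beta)$-differentially private if, for all datasets $D, D'$ differing in one individual's record and all output sets $S$,

        \Pr[M(D) \in S] \;\le\; e^{\alpha} \cdot \Pr[M(D') \in S] + \beta

    Pure differential privacy is the case $\beta = 0$. Result 3 above shows that the PAC sample complexity gap between $\beta = 0$ and $\beta > 0$ can be made arbitrarily large: $\Omega(t/\alpha)$ versus $O(\log(1/\beta)/\alpha)$.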

    Hypothesis Testing Interpretations and Renyi Differential Privacy

    Differential privacy is a de facto standard in data privacy, with applications in the public and private sectors. One way to explain differential privacy, which is particularly appealing to statisticians and social scientists, is by means of its statistical hypothesis testing interpretation. Informally, one cannot effectively test whether a specific individual has contributed her data by observing the output of a private mechanism: no test can have both high significance and high power. In this paper, we identify conditions under which a privacy definition given in terms of a statistical divergence satisfies a similar interpretation. These conditions are useful for analyzing the distinguishing power of divergences, and we use them to study the hypothesis testing interpretation of some relaxations of differential privacy based on Rényi divergence. This analysis also yields an improved conversion rule between these definitions and differential privacy.
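    To make the objects concrete, the standard definitions (our notation, not necessarily the paper's exact statements) are as follows. The Rényi divergence of order $\lambda > 1$ is

        D_{\lambda}(P \,\|\, Q) \;=\; \frac{1}{\lambda - 1} \log\, \mathbb{E}_{x \sim Q}\!\left[ \left( \frac{P(x)}{Q(x)} \right)^{\lambda} \right]

    and a mechanism $M$ satisfies $(\lambda, \epsilon)$-Rényi differential privacy (Mironov, 2017) if $D_{\lambda}(M(D) \,\|\, M(D')) \le \epsilon$ for all neighboring $D, D'$. The hypothesis testing interpretation referenced above says that, under $(\epsilon, \delta)$-differential privacy, any test of "individual present" against "individual absent" obeys

        \Pr[\text{type I error}] + e^{\epsilon} \cdot \Pr[\text{type II error}] \;\ge\; 1 - \delta

    so significance and power cannot both be high when $\epsilon$ and $\delta$ are small.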

    Advanced Probabilistic Couplings for Differential Privacy

    Differential privacy is a promising formal approach to data privacy, which provides a quantitative bound on the privacy cost of an algorithm that operates on sensitive information. Several tools have been developed for the formal verification of differentially private algorithms, including program logics and type systems. However, these tools do not capture fundamental techniques that have emerged in recent years, and cannot be used for reasoning about cutting-edge differentially private algorithms. Existing techniques fail to handle three broad classes of algorithms: 1) algorithms where privacy depends on accuracy guarantees, 2) algorithms analyzed with the advanced composition theorem, which gives slower growth in the privacy cost, and 3) algorithms that interactively accept adaptive inputs. We address these limitations with a new formalism extending apRHL, a relational program logic that has been used for proving differential privacy of non-interactive algorithms, and incorporating aHL, a (non-relational) program logic for accuracy properties. We illustrate our approach through a single running example, which exemplifies all three classes of algorithms and explores new variants of the Sparse Vector technique, a well-studied algorithm from the privacy literature. We implement our logic in EasyCrypt and formally verify privacy. We also introduce a novel coupling technique called optimal subset coupling that may be of independent interest.
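    For orientation, here is a minimal Python sketch of the basic Sparse Vector technique (the AboveThreshold mechanism, in the textbook presentation of Dwork and Roth, 2014) that the paper's running example builds on; the code and parameter names are ours, not the paper's:

        import numpy as np

        def above_threshold(queries, database, threshold, epsilon):
            # Answer a stream of sensitivity-1 queries, reporting only the
            # index of the first query whose noisy value exceeds a noisy
            # threshold. The whole interaction is epsilon-DP.
            noisy_threshold = threshold + np.random.laplace(scale=2.0 / epsilon)
            for i, q in enumerate(queries):
                # Fresh Laplace noise of scale 4/epsilon for every query.
                noisy_answer = q(database) + np.random.laplace(scale=4.0 / epsilon)
                if noisy_answer >= noisy_threshold:
                    return i  # first noisy above-threshold query; halt here
            return None  # no query exceeded the threshold

    The subtlety is that the privacy cost covers the whole interaction rather than accumulating per query: noise is paid for once on the threshold and once on the single reported answer. Seemingly harmless variants, such as also releasing the noisy answers of rejected queries, silently break the proof, which is exactly why machine-checked verification is attractive here.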

    On the Differential Privacy of Bayesian Inference

    We study how to communicate findings of Bayesian inference to third parties while preserving the strong guarantee of differential privacy. Our main contributions are four different algorithms for private Bayesian inference on probabilistic graphical models. These include two mechanisms for adding noise to the Bayesian updates, either directly to the posterior parameters or to their Fourier transform so as to preserve update consistency. We also utilise a recently introduced posterior sampling mechanism, for which we prove bounds for the specific but general case of discrete Bayesian networks, and we introduce a maximum-a-posteriori private mechanism. Our analysis includes utility and privacy bounds, with a novel focus on the influence of graph structure on privacy. Worked examples and experiments with Bayesian naïve Bayes and Bayesian linear regression illustrate the application of our mechanisms. (AAAI 2016, Feb 2016, Phoenix, Arizona, United States.)
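    As a toy illustration of the first class of mechanisms (noise added directly to the posterior parameters), though not the paper's exact construction: in a Beta-Bernoulli model the posterior is determined by the counts of ones and zeros, and replacing one record changes each count by at most 1, so the Laplace mechanism applies directly. A minimal Python sketch, with names of our choosing:

        import numpy as np

        def private_beta_posterior(data, epsilon, prior=(1.0, 1.0)):
            # Release a private Beta posterior for binary data. Replacing one
            # record changes each count by at most 1 (L1 sensitivity 2 for the
            # pair), so Laplace noise of scale 2/epsilon per count is epsilon-DP.
            ones = float(np.sum(data))
            zeros = float(len(data) - ones)
            noisy_ones = ones + np.random.laplace(scale=2.0 / epsilon)
            noisy_zeros = zeros + np.random.laplace(scale=2.0 / epsilon)
            a0, b0 = prior
            # Clamp so the released pair is still a valid Beta parameterization.
            return (max(a0 + noisy_ones, 1e-3), max(b0 + noisy_zeros, 1e-3))

    The clamping step is the usual cost of output perturbation: the noisy counts can go negative, so they are projected back onto valid parameters before release.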