
    Strong Hardness of Privacy from Weak Traitor Tracing

    A central problem in differential privacy is to accurately answer a large family Q of statistical queries over a data universe X. A statistical query on a dataset D ∈ X^n asks "what fraction of the elements of D satisfy a given predicate p on X?" Ignoring computational constraints, it is possible to accurately answer exponentially many queries on an exponential size universe while satisfying differential privacy (Blum et al., STOC '08). Dwork et al. (STOC '09) and Boneh and Zhandry (CRYPTO '14) showed that if both Q and X are of polynomial size, then there is an efficient differentially private algorithm that accurately answers all the queries. They also proved that if Q and X are both exponentially large, then under a plausible assumption, no efficient algorithm exists. We show that, under the same assumption, if either the number of queries or the data universe is of exponential size, then there is no differentially private algorithm that answers all the queries. Specifically, we prove that if one-way functions and indistinguishability obfuscation exist, then: 1) For every n, there is a family Q of Õ(n^7) queries on a data universe X of size 2^d such that no poly(n, d) time differentially private algorithm takes a dataset D ∈ X^n and outputs accurate answers to every query in Q. 2) For every n, there is a family Q of 2^d queries on a data universe X of size Õ(n^7) such that no poly(n, d) time differentially private algorithm takes a dataset D ∈ X^n and outputs accurate answers to every query in Q. In both cases, the result is nearly quantitatively tight, since there is an efficient differentially private algorithm that answers Ω̃(n^2) queries on an exponential size data universe, and one that answers exponentially many queries on a data universe of size Ω̃(n^2).
Our proofs build on the connection between hardness results in differential privacy and traitor-tracing schemes (Dwork et al., STOC '09; Ullman, STOC '13). We prove our hardness results for a polynomial size query set (resp., data universe) by showing that they follow from the existence of a special type of traitor-tracing scheme with very short ciphertexts (resp., secret keys) but very weak security guarantees, and then constructing such a scheme.
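The statistical-query setting above can be made concrete with a short sketch. The snippet below is a textbook Laplace-mechanism answer to a single statistical query, not any construction from the paper; all function names are illustrative. Since a statistical query has sensitivity 1/n, Laplace noise of scale 1/(ε·n) suffices for ε-differential privacy.

```python
import math
import random

def statistical_query(dataset, predicate):
    """Fraction of dataset elements satisfying the predicate."""
    return sum(1 for x in dataset if predicate(x)) / len(dataset)

def private_answer(dataset, predicate, epsilon):
    """Laplace-mechanism answer: the query has sensitivity 1/n, so
    Laplace noise with scale 1/(epsilon * n) gives epsilon-DP."""
    n = len(dataset)
    true_answer = statistical_query(dataset, predicate)
    # Sample Laplace(0, 1/(epsilon * n)) noise via the inverse CDF.
    u = random.random() - 0.5
    scale = 1.0 / (epsilon * n)
    noise = -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))
    return true_answer + noise
```

For a single query the noise is tiny; the abstracts above concern the much harder regime of answering very many queries at once.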

    Order-Revealing Encryption and the Hardness of Private Learning

    An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient (ε, δ)-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest.
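To illustrate the interface only: a toy order-preserving scheme in which encryption is a keyed strictly increasing function of the plaintext, so a single public numeric comparison of ciphertexts reveals plaintext order. This is NOT secure (it leaks far more than order) and is not the paper's construction; all names are illustrative.

```python
import hashlib
import hmac

def _increment(key, i):
    """Keyed pseudorandom positive increment (toy PRF from HMAC-SHA256)."""
    digest = hmac.new(key, str(i).encode(), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest[:4], "big")

def encrypt(key, m):
    """Toy order-preserving encryption of a non-negative integer m:
    the ciphertext is the sum of the first m+1 keyed increments, a
    strictly increasing function of m. Insecure; interface demo only."""
    return sum(_increment(key, i) for i in range(m + 1))

def compare(c1, c2):
    """Public comparison: -1, 0, or 1 for the order of the plaintexts."""
    return (c1 > c2) - (c1 < c2)
```

Note how every ciphertext (even a malformed one) compares consistently with every other, which is the flavor of the "strongly correct comparison" property the abstract refers to.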

    Preventing False Discovery in Interactive Data Analysis is Hard

    We show that, under a standard hardness assumption, there is no computationally efficient algorithm that, given n samples from an unknown distribution, can give valid answers to n^{3+o(1)} adaptively chosen statistical queries. A statistical query asks for the expectation of a predicate over the underlying distribution, and an answer to a statistical query is valid if it is "close" to the correct expectation over the distribution. Our result stands in stark contrast to the well-known fact that exponentially many statistical queries can be answered validly and efficiently if the queries are chosen non-adaptively (no query may depend on the answers to previous queries). Moreover, a recent work by Dwork et al. shows how to accurately answer exponentially many adaptively chosen statistical queries via a computationally inefficient algorithm, and how to answer a quadratic number of adaptive queries via a computationally efficient algorithm. The latter result implies that our result is tight up to a linear factor in n. Conceptually, our result demonstrates that achieving statistical validity alone can be a source of computational intractability in adaptive settings. For example, in the modern large collaborative research environment, data analysts typically choose a particular approach based on previous findings. False discovery occurs if a research finding is supported by the data but not by the underlying distribution. While the study of preventing false discovery in Statistics is decades old, to the best of our knowledge our result is the first to demonstrate a computational barrier. In particular, our result suggests that the perceived difficulty of preventing false discovery in today's collaborative research environment may be inherent.
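The danger of adaptivity can be seen in a small, self-contained simulation (illustrative only, not the paper's construction). Answering queries with plain empirical means is fine non-adaptively, but one adaptively chosen follow-up query already yields an invalid answer: every query below has true expectation 0.5 under the data distribution, yet the adaptive query's empirical answer lands far above it.

```python
import random

def adaptive_overfit_demo(n=100, d=1000, seed=0):
    """Each coordinate of a sample is an independent fair coin, so every
    query below has true expectation 0.5. Adaptively combining the
    coordinates that looked biased in-sample produces a query whose
    empirical answer is far above 0.5 -- an invalid answer."""
    rng = random.Random(seed)
    # n samples, each a vector of d fair coin flips.
    data = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n)]
    # Round 1 (d non-adaptive queries): empirical mean of each coordinate.
    means = [sum(x[j] for x in data) / n for j in range(d)]
    # Round 2 (one adaptive query, chosen using the previous answers):
    # majority vote over the coordinates that looked biased upward.
    chosen = [j for j in range(d) if means[j] > 0.5]
    def majority(x):
        return 1 if 2 * sum(x[j] for j in chosen) > len(chosen) else 0
    return sum(majority(x) for x in data) / n
```

With the defaults, the returned empirical answer is well above the true expectation of 0.5, illustrating why valid answers to adaptive queries require more than empirical means.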

    Hardness of Non-Interactive Differential Privacy from One-Way Functions

    A central challenge in differential privacy is to design computationally efficient non-interactive algorithms that can answer large numbers of statistical queries on a sensitive dataset. That is, we would like to design a differentially private algorithm that takes a dataset D ∈ X^n consisting of some small number of elements n from some large data universe X, and efficiently outputs a summary that allows a user to efficiently obtain an answer to any query in some large family Q. Ignoring computational constraints, this problem can be solved even when X and Q are exponentially large and n is just a small polynomial; however, all algorithms with remotely similar guarantees run in exponential time. There have been several results showing that, under the strong assumption of indistinguishability obfuscation (iO), no efficient differentially private algorithm exists when X and Q can be exponentially large. However, there are no strong separations between information-theoretic and computationally efficient differentially private algorithms under any standard complexity assumption. In this work we show that, if one-way functions exist, there is no general purpose differentially private algorithm that works when X and Q are exponentially large, and n is an arbitrary polynomial. In fact, we show that this result holds even if X is just subexponentially large (assuming only polynomially-hard one-way functions). This result solves an open problem posed by Vadhan in his recent survey.
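For contrast with the hardness result, the easy regime of a polynomially small universe admits a simple non-interactive summary: release a Laplace-noised histogram once, then answer any statistical query from the summary alone. This is a textbook sketch with illustrative names, not the algorithms referenced above; it assumes add/remove neighboring datasets, under which the histogram has L1 sensitivity 1.

```python
import math
import random

def noisy_histogram_summary(dataset, universe_size, epsilon, seed=None):
    """Non-interactive DP summary for a small universe X = {0,...,universe_size-1}:
    add Laplace(1/epsilon) noise to each count and release the noisy fractions."""
    rng = random.Random(seed)
    n = len(dataset)
    hist = [0.0] * universe_size
    for x in dataset:
        hist[x] += 1
    scale = 1.0 / epsilon  # adding/removing one element changes one count by 1

    def lap():
        u = rng.random() - 0.5
        return -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))

    return [(c + lap()) / n for c in hist]

def answer_query(summary, predicate):
    """Answer a statistical query from the released summary, without the data."""
    return sum(f for x, f in enumerate(summary) if predicate(x))
```

The summary answers all 2^{|X|} counting queries at once, but its size and running time scale with |X| — exactly what becomes infeasible when the universe is exponentially large, as in the abstract above.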
