
    Differentially Private Release and Learning of Threshold Functions

    We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon,\delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+ o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon,\delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions between releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy. Comment: 43 pages
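The two definitions in the abstract above (a threshold function and the interior point problem) can be made concrete with a small, non-private sketch. It only illustrates the definitions and does not implement any algorithm from the paper. The names `threshold`, `threshold_answer`, and `is_interior_point` are hypothetical, and reading the "answer" to a threshold query as the fraction of database elements on which the threshold evaluates to 1 is an assumption made here.

```python
from typing import Callable, Sequence


def threshold(x: int) -> Callable[[int], int]:
    """Return c_x, where c_x(y) = 1 if y <= x and 0 otherwise."""
    return lambda y: 1 if y <= x else 0


def threshold_answer(database: Sequence[int], x: int) -> float:
    """Exact (non-private) answer to the query c_x on a database.

    Assumption: the answer is the fraction of elements y with c_x(y) = 1.
    """
    c_x = threshold(x)
    return sum(c_x(y) for y in database) / len(database)


def is_interior_point(database: Sequence[int], candidate: int) -> bool:
    """Check that candidate lies between the smallest and largest elements."""
    return min(database) <= candidate <= max(database)


if __name__ == "__main__":
    db = [2, 7, 4]
    print(threshold(5)(3), threshold(5)(6))
    print(threshold_answer(db, 4))
    print(is_interior_point(db, 4), is_interior_point(db, 9))
```

Note that an interior point need not be an element of the database; any value between the minimum and the maximum qualifies. A private algorithm would have to output such a point without depending too strongly on any single element, which this sketch makes no attempt to do.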

    Learning Coverage Functions and Private Release of Marginals

    We study the problem of approximating and learning coverage functions. A function $c: 2^{[n]} \rightarrow \mathbf{R}^{+}$ is a coverage function, if there exists a universe $U$ with non-negative weights $w(u)$ for each $u \in U$ and subsets $A_1, A_2, \ldots, A_n$ of $U$ such that $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that for any $\gamma,\delta>0$, given random and uniform examples of an unknown coverage function $c$, finds a function $h$ that approximates $c$ within factor $1+\gamma$ on all but $\delta$-fraction of the points in time $\mathrm{poly}(n,1/\gamma,1/\delta)$. This is the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithms are based on several new structural properties of coverage functions. Using the results in (Feldman and Kothari, 2014), we also show that coverage functions are learnable agnostically with excess $\ell_1$-error $\epsilon$ over all product and symmetric distributions in time $n^{\log(1/\epsilon)}$. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time $2^{\tilde{O}(n^{1/3})}$ (Klivans and Servedio, 2004). As an application of our learning results, we give simple differentially-private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any $k \leq n$, we obtain private release of $k$-way marginals with average error $\bar{\alpha}$ in time $n^{O(\log(1/\bar{\alpha}))}$.
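The definition of a coverage function in the abstract above can be evaluated directly. The following is a minimal sketch of that definition only, not of the learning or private-release algorithms; the function name `coverage` and the toy universe are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set


def coverage(
    subsets: Sequence[Set[Hashable]],
    weights: Dict[Hashable, float],
    S: Iterable[int],
) -> float:
    """Evaluate c(S): the total weight of all universe elements
    covered by the union of A_i for i in S."""
    covered: Set[Hashable] = set()
    for i in S:
        covered |= subsets[i]
    return sum(weights[u] for u in covered)


if __name__ == "__main__":
    # Hypothetical universe U = {a, b, c, d} with non-negative weights.
    A = [{"a", "b"}, {"b", "c"}, {"d"}]
    w = {"a": 1.0, "b": 2.0, "c": 0.5, "d": 4.0}
    print(coverage(A, w, {0, 1}))
    print(coverage(A, w, {0}) + coverage(A, w, {1}))
```

In this toy instance the element `b` is counted once in `c({0, 1})` but twice in `c({0}) + c({1})`, which is the diminishing-returns behaviour that makes coverage functions submodular.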

    All-Payer Claims Database Development Manual: Establishing a Foundation for Health Care Transparency and Informed Decision Making

    With support from the Gary and Mary West Health Policy Center, the APCD Council has developed a manual for states to develop all-payer claims databases. Titled All-Payer Claims Database Development Manual: Establishing a Foundation for Health Care Transparency and Informed Decision Making, the manual is a first-of-its-kind resource that provides states with detailed guidance on the common data standards, collection, aggregation, and analysis involved in establishing these databases.

    Advanced Probabilistic Couplings for Differential Privacy

    Differential privacy is a promising formal approach to data privacy, which provides a quantitative bound on the privacy cost of an algorithm that operates on sensitive information. Several tools have been developed for the formal verification of differentially private algorithms, including program logics and type systems. However, these tools do not capture fundamental techniques that have emerged in recent years, and cannot be used for reasoning about cutting-edge differentially private algorithms. Existing techniques fail to handle three broad classes of algorithms: 1) algorithms where privacy depends on accuracy guarantees, 2) algorithms that are analyzed with the advanced composition theorem, which shows slower growth in the privacy cost, 3) algorithms that interactively accept adaptive inputs. We address these limitations with a new formalism extending apRHL, a relational program logic that has been used for proving differential privacy of non-interactive algorithms, and incorporating aHL, a (non-relational) program logic for accuracy properties. We illustrate our approach through a single running example, which exemplifies the three classes of algorithms and explores new variants of the Sparse Vector technique, a well-studied algorithm from the privacy literature. We implement our logic in EasyCrypt, and formally verify privacy. We also introduce a novel coupling technique called optimal subset coupling that may be of independent interest.

    Order-Revealing Encryption and the Hardness of Private Learning

    An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient $(\epsilon, \delta)$-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest. Comment: 28 pages

    Key information sets and unistats : overview and next steps


    Education Maintenance Allowances awarded in Wales, 2012/13
