
    Random Differential Privacy

    We propose a relaxed privacy definition called {\em random differential privacy} (RDP). Differential privacy requires that adding any new observation to a database will have a small effect on the output of the data-release procedure. Random differential privacy requires that adding a {\em randomly drawn new observation} to a database will have a small effect on the output. We show an analog of the composition property of differentially private procedures which applies to our new definition. We show how to release an RDP histogram, and we show that RDP histograms are much more accurate than histograms obtained using ordinary differential privacy. We finally show an analog of the global sensitivity framework for the release of functions under our privacy definition.
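
    For orientation, here is a hedged paraphrase of the two definitions being contrasted (notation ours; the exact statement of RDP, including how the random neighbor is drawn, is in the paper):

```latex
% Ordinary \epsilon-differential privacy: for EVERY pair of neighboring
% databases D, D' and every measurable event S,
\[
  \Pr[A(D) \in S] \;\le\; e^{\epsilon}\,\Pr[A(D') \in S].
\]
% Random differential privacy (paraphrased): the same inequality is only
% required to hold with probability at least 1-\gamma over a database and
% a randomly drawn new observation, both sampled from the data distribution.
```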

    Property Testing for Differential Privacy

    We consider the problem of property testing for differential privacy: with black-box access to a purportedly private algorithm, can we verify its privacy guarantees? In particular, we show that any privacy guarantee that can be efficiently verified is also efficiently breakable, in the sense that there exist two databases between which we can efficiently distinguish. We give lower bounds on the query complexity of verifying pure differential privacy, approximate differential privacy, random pure differential privacy, and random approximate differential privacy. We also give algorithmic upper bounds. The lower bounds obtained in this work are infeasible for the scale of parameters that are typically considered reasonable in the differential privacy literature, even when we suppose that the verifier has access to an (untrusted) description of the algorithm. A central message of this work is that verifying privacy requires compromise by either the verifier or the algorithm owner. Either the verifier has to be satisfied with a weak privacy guarantee, or the algorithm owner has to compromise on side information or access to the algorithm. Comment: Allerton, 201
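
    To make the black-box setting concrete, the following is a rough sketch (ours, not the paper's tester) of estimating an empirical privacy level on one fixed pair of neighboring databases; `mechanism`, the databases, and the sample budget are all assumptions of this sketch, and the paper's lower bounds concern how many such queries any sound verifier must make.

```python
import math
from collections import Counter

def empirical_epsilon(mechanism, db1, db2, n_samples=100_000):
    """Crude black-box probe of pure DP on ONE pair of neighboring databases.

    Runs a discrete-output mechanism repeatedly on db1 and db2, estimates the
    output distributions empirically, and returns the largest observed
    log-probability ratio. A real verifier must handle all neighboring pairs
    and give confidence guarantees, which is where the query-complexity
    lower bounds bite.
    """
    counts1 = Counter(mechanism(db1) for _ in range(n_samples))
    counts2 = Counter(mechanism(db2) for _ in range(n_samples))
    eps_hat = 0.0
    for outcome in set(counts1) | set(counts2):
        p1, p2 = counts1[outcome] / n_samples, counts2[outcome] / n_samples
        if p1 > 0 and p2 > 0:
            eps_hat = max(eps_hat, abs(math.log(p1 / p2)))
    return eps_hat
```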

    A Random Matrix Approach to Differential Privacy and Structure Preserved Social Network Graph Publishing

    Online social networks are being increasingly used for analyzing various societal phenomena such as epidemiology, information dissemination, marketing and sentiment flow. Popular analysis techniques such as clustering and influential node analysis require the computation of eigenvectors of the real graph's adjacency matrix. Recent de-anonymization attacks on the Netflix and AOL datasets show that open access to such graphs poses privacy threats. Among the various privacy-preserving models, differential privacy provides the strongest privacy guarantees. In this paper we propose a privacy-preserving mechanism for publishing social network graph data, which satisfies differential privacy guarantees by combining random matrix theory with differential privacy. The key idea is to project each row of an adjacency matrix to a low-dimensional space using the random projection approach and then perturb the projected matrix with random noise. We show that, compared to existing approaches for differentially private approximation of eigenvectors, our approach is computationally efficient, preserves the utility and satisfies differential privacy. We evaluate our approach on social network graphs of Facebook, Live Journal and Pokec. The results show that even for high values of noise variance ($\sigma = 1$) the clustering quality given by normalized mutual information gain is as low as 0.74. For influential node discovery, the proposed approach is able to correctly recover 80% of the most influential nodes. We also compare our results with an approach presented in [43], which directly perturbs the eigenvectors of the original data with Laplacian noise. The results show that this approach requires a large random perturbation in order to preserve differential privacy, which leads to poor estimation of eigenvectors for large social networks.
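
    A minimal sketch of the project-then-perturb step described above, using NumPy; the projection dimension k, the Gaussian noise, and its scale sigma are placeholders here, not the paper's calibration.

```python
import numpy as np

def perturbed_projection(adj, k=100, sigma=1.0, seed=0):
    """Project each row of an n x n adjacency matrix into k dimensions with a
    random Gaussian projection, then perturb the projected matrix with noise."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    proj = rng.normal(0.0, 1.0 / np.sqrt(k), size=(n, k))  # random projection matrix
    sketch = adj @ proj                                     # n x k low-dimensional sketch
    return sketch + rng.normal(0.0, sigma, size=(n, k))     # noise on the sketch
```

    Clustering or influential-node analysis would then use spectra computed from the noisy sketch rather than from the raw adjacency matrix.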

    Differential Privacy for Sets in Euclidean Space

    As multi-agent systems become more numerous and more data-driven, novel forms of privacy are needed in order to protect data types that are not accounted for by existing privacy frameworks. In this paper, we present a new form of privacy for set-valued data which extends the notion of differential privacy to sets which users want to protect. While differential privacy is typically defined in terms of probability distributions, we show that it is more natural here to define privacy for sets over their capacity functionals, which capture the probability of a random set intersecting some other set. In terms of sets' capacity functionals, we provide a novel definition of differential privacy for set-valued data. Based on this definition, we introduce the Laplacian Perturbation Mechanism (so named because it applies random perturbations to sets), and show that it provides $\epsilon$-differential privacy as prescribed by our definition. These theoretical results are supported by numerical results, demonstrating the practical applicability of the developments made. Comment: 14 pages, 3 figures; Submitted to ACC 201
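
    For reference, the capacity functional mentioned above is the standard one from random set theory (notation ours; the paper's privacy condition is stated in terms of it):

```latex
% Capacity functional of a random set X, evaluated on a test set K:
\[
  T_X(K) \;=\; \Pr\bigl[\,X \cap K \neq \emptyset\,\bigr].
\]
% The paper's definition constrains how much T_X may change between
% adjacent set-valued inputs; see the paper for the exact condition.
```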

    Approximate Relational Hoare Logic for Continuous Random Samplings

    Approximate relational Hoare logic (apRHL) is a logic for formal verification of the differential privacy of databases written in the programming language pWHILE. Strictly speaking, however, this logic deals only with discrete random samplings. In this paper, we define the graded relational lifting of the subprobabilistic variant of the Giry monad, which describes differential privacy. We extend the logic apRHL with this graded lifting to deal with continuous random samplings, and we give a generic method for deriving apRHL proof rules for continuous random samplings.
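
    As background (our schematic paraphrase, not taken from this paper), an apRHL judgment relates two pWHILE programs, and differential privacy is obtained by relating a program to itself:

```latex
% Schematic form of an apRHL judgment:
\[
  \vdash c_1 \sim_{\epsilon,\delta} c_2 : \Psi \Longrightarrow \Phi
\]
% Read: from any pair of initial memories related by \Psi, the output
% distributions of c_1 and c_2 are related by an (\epsilon,\delta)-lifting
% of \Phi. Taking c_1 = c_2 = c, \Psi = database adjacency, and \Phi =
% equality of results yields (\epsilon,\delta)-differential privacy of c;
% the paper's contribution is a graded lifting that also covers continuous
% samplings via the subprobabilistic Giry monad.
```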

    Lower Bounds for Locally Private Estimation via Communication Complexity

    We develop lower bounds for estimation under local privacy constraints (including differential privacy and its relaxations to approximate or Rényi differential privacy) by showing an equivalence between private estimation and communication-restricted estimation problems. Our results apply to arbitrarily interactive privacy mechanisms, and they also give sharp lower bounds for all levels of differential privacy protection, that is, privacy mechanisms with privacy levels $\varepsilon \in [0, \infty)$. As a particular consequence of our results, we show that the minimax mean-squared error for estimating the mean of a bounded or Gaussian random vector in $d$ dimensions scales as $\frac{d}{n} \cdot \frac{d}{\min\{\varepsilon, \varepsilon^2\}}$. Comment: To appear in Conference on Learning Theory 201
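
    A standard locally private baseline that the stated rate can be compared against; this is a generic Laplace-based local mechanism, not the paper's construction, and the clipping bound B is an assumption of the sketch.

```python
import numpy as np

def naive_local_laplace_mean(X, epsilon, B=1.0, seed=0):
    """Locally private mean estimation baseline (not the paper's mechanism).

    Each row of X is one user's vector with entries clipped to [-B, B].
    Between any two possible inputs the L1 distance is at most 2*B*d, so
    adding per-coordinate Laplace noise of scale 2*B*d/epsilon gives
    epsilon-LDP for the whole vector; the analyst then averages the reports.
    This naive scheme is roughly a factor of d worse than the minimax rate
    quoted in the abstract, which is the kind of gap the lower bounds pin down.
    """
    rng = np.random.default_rng(seed)
    X = np.clip(np.asarray(X, dtype=float), -B, B)
    n, d = X.shape
    scale = 2.0 * B * d / epsilon
    return (X + rng.laplace(0.0, scale, size=X.shape)).mean(axis=0)
```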

    Preserving Differential Privacy Between Features in Distributed Estimation

    Privacy is crucial in many applications of machine learning. Legal, ethical and societal issues restrict the sharing of sensitive data, making it difficult to learn from datasets that are partitioned between many parties. One important instance of such a distributed setting arises when information about each record in the dataset is held by different data owners (the design matrix is "vertically-partitioned"). In this setting, few approaches exist for private data sharing for the purposes of statistical estimation, and the classical setup of differential privacy with a "trusted curator" preparing the data does not apply. We work with the notion of $(\epsilon,\delta)$-distributed differential privacy, which extends single-party differential privacy to the distributed, vertically-partitioned case. We propose PriDE, a scalable framework for distributed estimation where each party communicates perturbed random projections of their locally held features, ensuring $(\epsilon,\delta)$-distributed differential privacy is preserved. For $\ell_2$-penalized supervised learning problems, PriDE has bounded estimation error compared with the optimal estimates obtained without privacy constraints in the non-distributed setting. We confirm this empirically on real-world and synthetic datasets.
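
    A heavily simplified sketch of the kind of message each party would send in this vertically-partitioned setting; the projection dimension k, the Gaussian noise, and its scale are placeholders, and PriDE's actual calibration and downstream estimator are in the paper.

```python
import numpy as np

def perturbed_feature_projection(X_local, k=50, noise_scale=1.0, seed=0):
    """One party's message in the vertically-partitioned setting: a random
    projection of its locally held feature block (n x d_local), perturbed
    with Gaussian noise so only a noisy, compressed view leaves the party."""
    rng = np.random.default_rng(seed)
    n, d_local = X_local.shape
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d_local, k))  # random projection
    sketch = X_local @ R                                       # n x k compressed features
    return sketch + rng.normal(0.0, noise_scale, size=sketch.shape)
```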

    Pain-Free Random Differential Privacy with Sensitivity Sampling

    Popular approaches to differential privacy, such as the Laplace and exponential mechanisms, calibrate randomised smoothing through global sensitivity of the target non-private function. Bounding such sensitivity is often a prohibitively complex analytic calculation. As an alternative, we propose a straightforward sampler for estimating sensitivity of non-private mechanisms. Since our sensitivity estimates hold with high probability, any mechanism that would be $(\epsilon,\delta)$-differentially private under bounded global sensitivity automatically achieves $(\epsilon,\delta,\gamma)$-random differential privacy (Hall et al., 2012), without any target-specific calculations required. We demonstrate on worked example learners how our usable approach adopts a naturally-relaxed privacy guarantee, while achieving more accurate releases even for non-private functions that are black-box computer programs. Comment: 12 pages, 9 figures, 1 table; full report of paper accepted into ICML'201
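
    A minimal sketch of the sampling idea, under our own simplifications: `sample_db` and `f` are hypothetical placeholders, neighbors are formed by replacing one record, and the order-statistic choice and the resulting (epsilon, delta, gamma) accounting are worked out in the paper.

```python
import numpy as np

def sampled_sensitivity(f, sample_db, m=1000, quantile=1.0, seed=0):
    """Estimate the sensitivity of a black-box function f by sampling.

    Repeatedly draw a database and a neighbor (here: replace one record),
    evaluate f on both, and record |f(D) - f(D')|. A high order statistic of
    these differences serves as a probabilistic sensitivity estimate, which
    can then calibrate a Laplace mechanism as usual.
    """
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(m):
        db = sample_db(rng)
        neighbor = db.copy()
        neighbor[rng.integers(len(neighbor))] = sample_db(rng)[0]  # swap one record
        diffs.append(abs(f(db) - f(neighbor)))
    return float(np.quantile(diffs, quantile))

def laplace_release(f, db, sensitivity, epsilon, seed=0):
    """Standard Laplace mechanism calibrated with the sampled sensitivity."""
    rng = np.random.default_rng(seed)
    return f(db) + rng.laplace(0.0, sensitivity / epsilon)
```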

    Successive Refinement of Privacy

    This work examines a novel question: how much randomness is needed to achieve local differential privacy (LDP)? A motivating scenario is providing {\em multiple levels of privacy} to multiple analysts, either for distribution or for heavy-hitter estimation, using the \emph{same} (randomized) output. We call this setting \emph{successive refinement of privacy}, as it provides hierarchical access to the raw data with different privacy levels. For example, the same randomized output could enable one analyst to reconstruct the input, while another can only estimate the distribution subject to LDP requirements. This extends the classical Shannon (wiretap) security setting to local differential privacy. We provide (order-wise) tight characterizations of privacy-utility-randomness trade-offs in several cases for distribution estimation, including the standard LDP setting under a randomness constraint. We also provide a non-trivial privacy mechanism for multi-level privacy. Furthermore, we show that we cannot reuse random keys over time while preserving the privacy of each user.
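
    A toy illustration of the multi-level idea (ours, not the paper's mechanism): keyed randomized response, where the same released bit behaves as epsilon-LDP randomized response to an analyst without the key but is exactly invertible by one who holds it.

```python
import math
import random

def keyed_randomized_response(bit, epsilon, key):
    """Release one private bit via randomized response whose coin is derived
    from a shared key. To an analyst without the key this behaves like
    ordinary epsilon-LDP randomized response (assuming the key is random and
    hidden); an analyst holding the key can re-derive the coin and undo it."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    coin = random.Random(key).random()   # pseudo-random coin derived from the key
    return bit if coin < p_keep else 1 - bit

def recover_with_key(released_bit, epsilon, key):
    """Privileged analyst: re-derive the same coin and invert any flip."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    coin = random.Random(key).random()
    return released_bit if coin < p_keep else 1 - released_bit
```

    In line with the abstract's last point, such keys cannot simply be reused across releases without harming per-user privacy, so fresh randomness is needed over time.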

    Compressive Mechanism: Utilizing Sparse Representation in Differential Privacy

    Differential privacy provides the first theoretical foundation with provable privacy guarantees against adversaries with arbitrary prior knowledge. The main idea to achieve differential privacy is to inject random noise into statistical query results. Besides correctness, the most important goal in the design of a differentially private mechanism is to reduce the effect of random noise, ensuring that the noisy results can still be useful. This paper proposes the \emph{compressive mechanism}, a novel solution based on the state-of-the-art compression technique called \emph{compressive sensing}. Compressive sensing is a decent theoretical tool for compact synopsis construction, using random projections. In this paper, we show that the amount of noise is significantly reduced from $O(\sqrt{n})$ to $O(\log n)$ when the noise insertion procedure is carried out on the synopsis samples instead of the original database. As an extension, we also apply the proposed compressive mechanism to solve the problem of continual release of statistical results. Extensive experiments using real datasets justify our accuracy claims. Comment: 20 pages, 6 figures
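
    A stripped-down sketch of noising the synopsis rather than the data; the measurement matrix, the caller-supplied noise scale, and the basic ISTA recovery loop are generic compressive-sensing ingredients chosen for illustration, not the paper's construction or its privacy calibration.

```python
import numpy as np

def compressive_release(x, m, noise_scale, seed=0):
    """Measure a (sparse) data vector x with a random projection and add
    Laplace noise to the m-dimensional synopsis instead of to x itself.
    How noise_scale must be calibrated for differential privacy is exactly
    what the compressive mechanism's analysis addresses."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    Phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))  # random measurement matrix
    y = Phi @ x + rng.laplace(0.0, noise_scale, size=m)    # noisy compressed synopsis
    return Phi, y

def ista_reconstruct(Phi, y, lam=0.1, iters=500):
    """Recover an approximately sparse x from the noisy synopsis with
    iterative soft-thresholding (a basic l1 / compressive-sensing solver)."""
    x = np.zeros(Phi.shape[1])
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2               # 1 / Lipschitz constant
    for _ in range(iters):
        z = x - step * (Phi.T @ (Phi @ x - y))             # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x
```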