1,622 research outputs found

    Linear and Range Counting under Metric-based Local Differential Privacy

    Full text link
    Local differential privacy (LDP) enables private data sharing and analytics without the need for a trusted data collector. Error-optimal primitives (for, e.g., estimating means and item frequencies) under LDP have been well studied. For analytical tasks such as range queries, however, the best known error bound is dependent on the domain size of private data, which is potentially prohibitive. This deficiency is inherent as LDP protects the same level of indistinguishability between any pair of private data values for each data downer. In this paper, we utilize an extension of ϵ\epsilon-LDP called Metric-LDP or EE-LDP, where a metric EE defines heterogeneous privacy guarantees for different pairs of private data values and thus provides a more flexible knob than ϵ\epsilon does to relax LDP and tune utility-privacy trade-offs. We show that, under such privacy relaxations, for analytical workloads such as linear counting, multi-dimensional range counting queries, and quantile queries, we can achieve significant gains in utility. In particular, for range queries under EE-LDP where the metric EE is the L1L^1-distance function scaled by ϵ\epsilon, we design mechanisms with errors independent on the domain sizes; instead, their errors depend on the metric EE, which specifies in what granularity the private data is protected. We believe that the primitives we design for EE-LDP will be useful in developing mechanisms for other analytical tasks, and encourage the adoption of LDP in practice

    Differential Privacy, Property Testing, and Perturbations

    Full text link
    Controlling the dissemination of information about ourselves has become a minefield in the modern age. We release data about ourselves every day and don’t always fully understand what information is contained in this data. It is often the case that the combination of seemingly innocuous pieces of data can be combined to reveal more sensitive information about ourselves than we intended. Differential privacy has developed as a technique to prevent this type of privacy leakage. It borrows ideas from information theory to inject enough uncertainty into the data so that sensitive information is provably absent from the privatised data. Current research in differential privacy walks the fine line between removing sensitive information while allowing non-sensitive information to be released. At its heart, this thesis is about the study of information. Many of the results can be formulated as asking a subset of the questions: does the data you have contain enough information to learn what you would like to learn? and how can I affect the data to ensure you can’t discern sensitive information? We will often approach the former question from both directions: information theoretic lower bounds on recovery and algorithmic upper bounds. We begin with an information theoretic lower bound for graphon estimation. This explores the fundamental limits of how much information about the underlying population is contained in a finite sample of data. We then move on to exploring the connection between information theoretic results and privacy in the context of linear inverse problems. We find that there is a discrepancy between how the inverse problems community and the privacy community view good recovery of information. Next, we explore black-box testing for privacy. We argue that the amount of information required to verify the privacy guarantee of an algorithm, without access to the internals of the algorithm, is lower bounded by the amount of information required to break the privacy guarantee. Finally, we explore a setting where imposing privacy is a help rather than a hindrance: online linear optimisation. We argue that private algorithms have the right kind of stability guarantee to ensure low regret for online linear optimisation.PHDMathematicsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/143940/1/amcm_1.pd

    SoK: Differential Privacies

    Get PDF
    Shortly after it was first introduced in 2006, differential privacy became the flagship data privacy definition. Since then, numerous variants and extensions were proposed to adapt it to different scenarios and attacker models. In this work, we propose a systematic taxonomy of these variants and extensions. We list all data privacy definitions based on differential privacy, and partition them into seven categories, depending on which aspect of the original definition is modified. These categories act like dimensions: variants from the same category cannot be combined, but variants from different categories can be combined to form new definitions. We also establish a partial ordering of relative strength between these notions by summarizing existing results. Furthermore, we list which of these definitions satisfy some desirable properties, like composition, post-processing, and convexity by either providing a novel proof or collecting existing ones.Comment: This is the full version of the SoK paper with the same title, accepted at PETS (Privacy Enhancing Technologies Symposium) 202

    Private Graph Data Release: A Survey

    Full text link
    The application of graph analytics to various domains have yielded tremendous societal and economical benefits in recent years. However, the increasingly widespread adoption of graph analytics comes with a commensurate increase in the need to protect private information in graph databases, especially in light of the many privacy breaches in real-world graph data that was supposed to preserve sensitive information. This paper provides a comprehensive survey of private graph data release algorithms that seek to achieve the fine balance between privacy and utility, with a specific focus on provably private mechanisms. Many of these mechanisms fall under natural extensions of the Differential Privacy framework to graph data, but we also investigate more general privacy formulations like Pufferfish Privacy that can deal with the limitations of Differential Privacy. A wide-ranging survey of the applications of private graph data release mechanisms to social networks, finance, supply chain, health and energy is also provided. This survey paper and the taxonomy it provides should benefit practitioners and researchers alike in the increasingly important area of private graph data release and analysis

    Triangle Counting with Local Edge Differential Privacy

    Get PDF

    SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication

    Get PDF
    Histograms and synthetic data are of key importance in data analysis. However, researchers have shown that even aggregated data such as histograms, containing no obvious sensitive attributes, can result in privacy leakage. To enable data analysis, a strong notion of privacy is required to avoid risking unintended privacy violations.Such a strong notion of privacy is differential privacy, a statistical notion of privacy that makes privacy leakage quantifiable. The caveat regarding differential privacy is that while it has strong guarantees for privacy, privacy comes at a cost of accuracy. Despite this trade-off being a central and important issue in the adoption of differential privacy, there exists a gap in the literature regarding providing an understanding of the trade-off and how to address it appropriately. Through a systematic literature review (SLR), we investigate the state-of-the-art within accuracy improving differentially private algorithms for histogram and synthetic data publishing. Our contribution is two-fold: 1) we identify trends and connections in the contributions to the field of differential privacy for histograms and synthetic data and 2) we provide an understanding of the privacy/accuracy trade-off challenge by crystallizing different dimensions to accuracy improvement. Accordingly, we position and visualize the ideas in relation to each other and external work, and deconstruct each algorithm to examine the building blocks separately with the aim of pinpointing which dimension of accuracy improvement each technique/approach is targeting. Hence, this systematization of knowledge (SoK) provides an understanding of in which dimensions and how accuracy improvement can be pursued without sacrificing privacy
    corecore