Linear and Range Counting under Metric-based Local Differential Privacy
Local differential privacy (LDP) enables private data sharing and analytics
without the need for a trusted data collector. Error-optimal primitives (e.g.,
for estimating means and item frequencies) under LDP have been well studied.
For analytical tasks such as range queries, however, the best known error bound
is dependent on the domain size of private data, which is potentially
prohibitive. This deficiency is inherent as LDP protects the same level of
indistinguishability between any pair of private data values for each data
owner.
In this paper, we utilize an extension of ε-LDP called Metric-LDP or
E-LDP, where a metric E defines heterogeneous privacy guarantees for
different pairs of private data values and thus provides a more flexible knob
than ε does to relax LDP and tune utility-privacy trade-offs. We show
that, under such privacy relaxations, for analytical workloads such as linear
counting, multi-dimensional range counting queries, and quantile queries, we
can achieve significant gains in utility. In particular, for range queries
under E-LDP where the metric E is the L1-distance function scaled by
ε, we design mechanisms with errors independent of the domain sizes;
instead, their errors depend on the metric E, which specifies at what
granularity the private data is protected. We believe that the primitives we
design for E-LDP will be useful in developing mechanisms for other analytical
tasks, and encourage the adoption of LDP in practice.
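As background for the frequency-estimation primitives the abstract refers to, here is a minimal sketch of k-ary randomized response under plain ε-LDP — the standard primitive, not the paper's E-LDP mechanisms; all names are illustrative:

```python
import math
import random

def rr_perturb(value, domain, eps, rng):
    """k-ary randomized response: keep the true value with probability
    e^eps / (e^eps + k - 1); otherwise report a uniform other value."""
    k = len(domain)
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_true:
        return value
    return rng.choice([v for v in domain if v != value])

def rr_estimate(reports, domain, eps):
    """Unbiased frequency estimates obtained by inverting the perturbation."""
    k, n = len(domain), len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)  # P[report = true value]
    q = 1.0 / (math.exp(eps) + k - 1)            # P[report = a given other value]
    return {v: (sum(r == v for r in reports) / n - q) / (p - q) for v in domain}
```

Each per-user report is ε-LDP because the probability of any output changes by at most a factor e^ε across any two true values; the estimator divides the perturbation back out, at a variance cost that grows with the domain size k — exactly the dependence the paper's metric-based relaxation seeks to remove.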
Differential Privacy, Property Testing, and Perturbations
Controlling the dissemination of information about ourselves has become a minefield in
the modern age. We release data about ourselves every day and don’t always fully understand
what information is contained in this data. It is often the case that
seemingly innocuous pieces of data can be combined to reveal more sensitive information
about ourselves than we intended. Differential privacy has developed as a technique
to prevent this type of privacy leakage. It borrows ideas from information theory to inject
enough uncertainty into the data so that sensitive information is provably absent from
the privatised data. Current research in differential privacy walks the fine line between
removing sensitive information while allowing non-sensitive information to be released.
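The calibrated uncertainty described above is, in its simplest form, the standard Laplace mechanism; a minimal sketch for a counting query (function names are illustrative):

```python
import random

def laplace_count(true_count, eps, rng):
    """Release true_count + Lap(1/eps). A counting query changes by at
    most 1 when one record is added or removed (sensitivity 1), so this
    release is eps-differentially private."""
    scale = 1.0 / eps
    # A Laplace draw is the difference of two independent exponentials.
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_count + noise
```

Smaller ε means stronger privacy but noisier answers — the fine line between removing sensitive information and releasing useful information that the thesis walks.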
At its heart, this thesis is about the study of information. Many of the results can be
formulated as asking a subset of the questions: does the data you have contain enough
information to learn what you would like to learn? and how can I affect the data to ensure
you can’t discern sensitive information? We will often approach the former question from
both directions: information theoretic lower bounds on recovery and algorithmic upper
bounds.
We begin with an information theoretic lower bound for graphon estimation. This explores
the fundamental limits of how much information about the underlying population is
contained in a finite sample of data. We then move on to exploring the connection between
information theoretic results and privacy in the context of linear inverse problems. We find
that there is a discrepancy between how the inverse problems community and the privacy
community view good recovery of information. Next, we explore black-box testing for
privacy. We argue that the amount of information required to verify the privacy guarantee
of an algorithm, without access to the internals of the algorithm, is lower bounded by the
amount of information required to break the privacy guarantee. Finally, we explore a setting
where imposing privacy is a help rather than a hindrance: online linear optimisation.
We argue that private algorithms have the right kind of stability guarantee to ensure low
regret for online linear optimisation.
PhD thesis, Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143940/1/amcm_1.pd
SoK: Differential Privacies
Shortly after it was first introduced in 2006, differential privacy became
the flagship data privacy definition. Since then, numerous variants and
extensions have been proposed to adapt it to different scenarios and attacker
models. In this work, we propose a systematic taxonomy of these variants and
extensions. We list all data privacy definitions based on differential privacy,
and partition them into seven categories, depending on which aspect of the
original definition is modified.
These categories act like dimensions: variants from the same category cannot
be combined, but variants from different categories can be combined to form new
definitions. We also establish a partial ordering of relative strength between
these notions by summarizing existing results. Furthermore, we list which of
these definitions satisfy some desirable properties, like composition,
post-processing, and convexity, by either providing a novel proof or collecting
existing ones.
Comment: This is the full version of the SoK paper with the same title,
accepted at PETS (Privacy Enhancing Technologies Symposium) 202
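The composition property mentioned above can be checked numerically for the Laplace mechanism: the worst-case privacy loss of two independent ε-DP releases is 2ε. A small illustrative check, not taken from the paper (grid and parameters are arbitrary choices):

```python
import math

def laplace_pdf(x, mu, scale):
    """Density of Laplace(mu, scale) at x."""
    return math.exp(-abs(x - mu) / scale) / (2 * scale)

eps = 0.5
scale = 1.0 / eps                       # Lap(1/eps) noise on a count
xs = [i / 20.0 for i in range(-100, 121)]

# Privacy loss of one release on neighbouring counts 0 and 1:
# sup_x |log(p(x|0) / p(x|1))| = eps.
loss_one = max(abs(math.log(laplace_pdf(x, 0, scale) / laplace_pdf(x, 1, scale)))
               for x in xs)

# Two independent releases compose: the joint worst-case loss is the sum.
loss_two = max(abs(math.log(laplace_pdf(x, 0, scale) * laplace_pdf(y, 0, scale)
                            / (laplace_pdf(x, 1, scale) * laplace_pdf(y, 1, scale))))
               for x in xs for y in xs)
```

The joint log-ratio splits into a sum over the two independent coordinates, which is exactly why sequential composition adds privacy budgets.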
Private Graph Data Release: A Survey
The application of graph analytics to various domains has yielded tremendous
societal and economic benefits in recent years. However, the increasingly
widespread adoption of graph analytics comes with a commensurate increase in
the need to protect private information in graph databases, especially in light
of the many privacy breaches in real-world graph data that was supposed to
protect sensitive information. This paper provides a comprehensive survey of
private graph data release algorithms that seek to achieve the fine balance
between privacy and utility, with a specific focus on provably private
mechanisms. Many of these mechanisms fall under natural extensions of the
Differential Privacy framework to graph data, but we also investigate more
general privacy formulations like Pufferfish Privacy that can deal with the
limitations of Differential Privacy. A wide-ranging survey of the applications
of private graph data release mechanisms to social networks, finance, supply
chain, health and energy is also provided. This survey paper and the taxonomy
it provides should benefit practitioners and researchers alike in the
increasingly important area of private graph data release and analysis.
SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication
Histograms and synthetic data are of key importance in data analysis. However, researchers have shown that even aggregated data such as histograms, containing no obvious sensitive attributes, can result in privacy leakage. To enable data analysis, a strong notion of privacy is required to avoid risking unintended privacy violations. Such a strong notion of privacy is differential privacy, a statistical notion of privacy that makes privacy leakage quantifiable. The caveat regarding differential privacy is that while it has strong guarantees for privacy, privacy comes at a cost of accuracy. Despite this trade-off being a central and important issue in the adoption of differential privacy, there exists a gap in the literature regarding providing an understanding of the trade-off and how to address it appropriately.
Through a systematic literature review (SLR), we investigate the state of the art in accuracy-improving differentially private algorithms for histogram and synthetic data publishing. Our contribution is two-fold: 1) we identify trends and connections in the contributions to the field of differential privacy for histograms and synthetic data, and 2) we provide an understanding of the privacy/accuracy trade-off challenge by crystallizing different dimensions of accuracy improvement. Accordingly, we position and visualize the ideas in relation to each other and external work, and deconstruct each algorithm to examine the building blocks separately with the aim of pinpointing which dimension of accuracy improvement each technique/approach targets. Hence, this systematization of knowledge (SoK) provides an understanding of in which dimensions and how accuracy improvement can be pursued without sacrificing privacy.
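To make the accuracy/privacy trade-off concrete, here is a minimal sketch of ε-differentially private histogram publication with one common accuracy-improving step — non-negativity clamping, which is free under post-processing. This is a generic baseline, not any algorithm surveyed in the paper; names are illustrative:

```python
import random

def dp_histogram(data, bins, eps, rng):
    """Add Lap(1/eps) noise to each bin count. One record affects exactly
    one bin by 1, so the histogram has sensitivity 1 under
    add/remove-one-record neighbouring and the release is eps-DP."""
    counts = {b: 0 for b in bins}
    for x in data:
        counts[x] += 1
    noisy = {}
    for b in bins:
        noise = (1.0 / eps) * (rng.expovariate(1.0) - rng.expovariate(1.0))
        # Clamping is post-processing: it improves accuracy on sparse
        # bins without weakening the privacy guarantee.
        noisy[b] = max(0.0, counts[b] + noise)
    return noisy
```

The per-bin error is independent of the data but not of ε: halving ε doubles the noise scale, which is the cost-of-accuracy caveat the abstract highlights.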