Sensitivity of Counting Queries
In the context of statistical databases, the release of accurate statistical information about the collected data often puts at risk the privacy of the individual contributors. The goal of differential privacy is to maximise the utility of a query while protecting the individual records in the database. A natural way to achieve differential privacy is to add statistical noise to the result of the query.
In this context, a mechanism for releasing statistical information is thus a trade-off between utility and privacy. In order to balance these two "conflicting" requirements, privacy preserving mechanisms calibrate the added noise to the so-called sensitivity of the query, and thus a precise estimate of the sensitivity of the query is necessary to determine the amplitude of the noise to be added.
In this paper, we initiate a systematic study of the sensitivity of counting queries over relational databases. We first observe that the sensitivity of a Relational Algebra query with counting is not computable in general, and that while the sensitivity of Conjunctive Queries with counting is computable, it becomes unbounded as soon as the query includes a join. We then consider restricted classes of databases (databases with constraints) and study the problem of computing the sensitivity of a query given such constraints. We establish bounds on the sensitivity of counting conjunctive queries over constrained databases. The constraints studied here are functional dependencies and cardinality dependencies; the latter are a natural generalisation of functional dependencies that allows us to provide tight bounds on the sensitivity of counting conjunctive queries.
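To make the calibration step concrete: in the standard Laplace mechanism, noise drawn with scale sensitivity/epsilon is added to the true answer, so a tighter sensitivity estimate translates directly into less noise. A minimal sketch in Python follows; the function name, the toy data, and the parameter values are illustrative, not from the paper.

import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    # Adding Laplace noise with scale sensitivity/epsilon yields
    # epsilon-differential privacy for a query with this (global) sensitivity.
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# A plain counting query changes by at most 1 when a single record is added
# or removed, so its sensitivity is 1.
ages = [23, 35, 41, 29, 52, 61]
true_count = sum(1 for a in ages if a >= 40)
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))

The joins studied in the paper are precisely where this constant-1 calibration breaks down: a single tuple can participate in many join results, so the sensitivity, and hence the noise, can grow with the data unless constraints bound it.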
Differentially Private Data Analysis of Social Networks via Restricted Sensitivity
We introduce the notion of restricted sensitivity as an alternative to global
and smooth sensitivity to improve accuracy in differentially private data
analysis. The definition of restricted sensitivity is similar to that of global
sensitivity except that instead of quantifying over all possible datasets, we
take advantage of any beliefs about the dataset that a querier may have, to
quantify over a restricted class of datasets. Specifically, given a query f and
a hypothesis H about the structure of a dataset D, we show generically how to
transform f into a new query f_H whose global sensitivity (over all datasets
including those that do not satisfy H) matches the restricted sensitivity of
the query f. Moreover, if the belief of the querier is correct (i.e., D is in
H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be
inaccurate.
We demonstrate the usefulness of this notion by considering the task of
answering queries regarding social networks, which we model as a combination of
a graph and a labeling of its vertices. In particular, while our generic
procedure is computationally inefficient, for the specific definition of H as
graphs of bounded degree, we exhibit efficient ways of constructing f_H using
different projection-based techniques. We then analyze two important query
classes: subgraph counting queries (e.g., number of triangles) and local
profile queries (e.g., the number of people who know a spy and a computer scientist
who know each other). We demonstrate that the restricted sensitivity of such
queries can be significantly lower than their smooth sensitivity. Thus, using
restricted sensitivity we can maintain privacy whether or not D is in H, while
providing more accurate results in the event that H holds true.
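A minimal sketch of the bounded-degree case, assuming edge-level privacy and a deliberately naive greedy projection (the paper's projection-based constructions are more careful, since a crude projection can itself amplify sensitivity); networkx, the random graph, and all parameter values are illustrative.

import numpy as np
import networkx as nx

def project_to_bounded_degree(G, D):
    # Greedily drop edges incident to over-degree vertices. Caveat: this
    # naive projection is itself sensitive to single-edge changes, which is
    # exactly what the paper's more careful projection techniques address.
    H = G.copy()
    for u, v in list(H.edges()):
        if H.degree(u) > D or H.degree(v) > D:
            H.remove_edge(u, v)
    return H

def noisy_triangle_count(G, D, epsilon):
    H = project_to_bounded_degree(G, D)
    triangles = sum(nx.triangles(H).values()) // 3
    # In a graph of maximum degree D, one edge lies in at most D - 1
    # triangles, so D - 1 bounds the restricted sensitivity of f_H.
    return triangles + np.random.laplace(scale=(D - 1) / epsilon)

G = nx.erdos_renyi_graph(200, 0.05, seed=0)
print(noisy_triangle_count(G, D=15, epsilon=1.0))

The payoff is the noise scale: D - 1 under the degree hypothesis, rather than a bound that grows with the worst case over all graphs.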
Interactive Range Queries under Differential Privacy
Differential privacy approaches employ a curator to control data sharing with analysts without compromising individual privacy. The curator's role is to guard the data and determine what is appropriate for release, using the parameter epsilon to adjust the accuracy of the released data. A low epsilon value provides more privacy, while a higher epsilon value is associated with higher accuracy. Counting queries, which "count" the number of items in a dataset that meet specific conditions, impose additional restrictions on privacy protection. In particular, if the resulting counts are low, the data released is more specific and can lead to privacy loss. This work addresses privacy challenges in single-attribute counting range queries by proposing a Workload Partitioning Mechanism (WPM), which generates estimated answers based on query sensitivity. The mechanism is then extended to handle multiple-attribute range queries by preventing interrelated attributes from revealing private information about individuals. Further, the mechanism is paired with access control to improve system privacy and security, thus illustrating its practicality. The work also extends the WPM to reduce the error to be polylogarithmic in the sensitivity degree of the issued queries. This thesis describes the research questions addressed by the WPM to date, and discusses future plans to expand the current research toward developing a more efficient mechanism for range queries.
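For intuition on how range queries can reach polylogarithmic error, here is the classic dyadic-interval baseline that partitioning mechanisms such as the WPM refine: release a noisy count for every dyadic block, then answer any range as a sum of O(log n) blocks. This is a textbook sketch, assuming a power-of-two domain for brevity, and is not the WPM itself.

import numpy as np

def build_noisy_dyadic(counts, epsilon):
    # Each record falls in exactly one block per level, so splitting the
    # budget evenly across levels keeps the whole release epsilon-DP.
    n = len(counts)                       # assumed to be a power of two
    levels = int(np.log2(n)) + 1
    scale = levels / epsilon              # sensitivity 1 per level
    noisy = {}
    for level in range(levels):
        width = 2 ** level
        for i in range(n // width):
            block = sum(counts[i * width:(i + 1) * width])
            noisy[(level, i)] = block + np.random.laplace(scale=scale)
    return noisy

def range_query(noisy, lo, hi):
    # Greedily cover [lo, hi) with O(log n) dyadic blocks and sum them.
    total = 0.0
    while lo < hi:
        level = 0
        while lo % (2 ** (level + 1)) == 0 and lo + 2 ** (level + 1) <= hi:
            level += 1
        total += noisy[(level, lo // 2 ** level)]
        lo += 2 ** level
    return total

counts = [3, 1, 4, 1, 5, 9, 2, 6]         # toy single-attribute histogram
noisy = build_noisy_dyadic(counts, epsilon=1.0)
print(range_query(noisy, 2, 7))           # noisy count of items in bins [2, 7)

Each answer then aggregates O(log n) noisy terms, each of scale O(log n / epsilon), so the error is polylogarithmic in the domain size rather than linear in the length of the range.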
Sensitivity estimation for differentially private query processing
Differential privacy, which adds noise to query results to avoid leaking
private information, has become a popular privacy-preserving method in data
analysis, query processing, and machine learning. Sensitivity, i.e. the maximum
impact that deleting or inserting one tuple can have on the query result,
determines the amount of noise added. Computing the sensitivity of simple
queries such as counting queries is easy; however, computing the sensitivity of
complex queries containing join operations is challenging. The global
sensitivity of such a query can be unboundedly large, which ruins the accuracy
of the query answer. Elastic sensitivity and residual sensitivity offer upper
bounds on local sensitivity to reduce the noise, but they suffer from either
low accuracy or high computational overhead. We propose two fast query
sensitivity estimation methods, based on sampling and sketching respectively,
offering competitive accuracy and higher efficiency compared to the
state-of-the-art methods.
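To see why join sensitivity is data dependent, consider COUNT(R ⋈ S) on a single key: deleting a tuple with key k removes exactly as many output rows as the other relation has matches for k, so the local sensitivity is the largest join degree in the instance. The toy exact computation below makes this concrete; the paper's contribution is to replace such full scans with sampling- and sketch-based estimates, and all names here are ours.

from collections import Counter

def join_count_local_sensitivity(R, S):
    # R and S are lists of join-key values, one entry per tuple.
    deg_R, deg_S = Counter(R), Counter(S)
    worst = 0
    for k in deg_R:   # deleting an R-tuple with key k removes deg_S[k] rows
        worst = max(worst, deg_S[k])
    for k in deg_S:   # deleting an S-tuple with key k removes deg_R[k] rows
        worst = max(worst, deg_R[k])
    return worst

R = ["a", "a", "b", "c"]
S = ["a", "b", "b", "b"]
print(join_count_local_sensitivity(R, S))  # 3: dropping R's "b" tuple removes 3 rows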
Computing Local Sensitivities of Counting Queries with Joins
Local sensitivity of a query Q given a database instance D, i.e. how much the
output Q(D) changes when a tuple is added to D or deleted from D, has many
applications including query analysis, outlier detection, and differential
privacy. However, it is NP-hard to compute the local sensitivity of a conjunctive
query in terms of the size of the query, even for the class of acyclic queries.
Although the complexity is polynomial when the query size is fixed, the naive
algorithms are not efficient for large databases and queries involving multiple
joins. In this paper, we present a novel approach to compute local sensitivity
of counting queries involving join operations by tracking and summarizing tuple
sensitivities -- the maximum change a tuple can cause in the query result when
it is added or removed. We give algorithms for the sensitivity problem for full
acyclic join queries using join trees. These algorithms run in polynomial time
in both the size of the database and the query for an interesting sub-class,
which we call 'doubly acyclic queries' and which includes path queries, and in
polynomial time in combined complexity when the maximum degree in the join tree
is bounded. Our algorithms can be extended to certain non-acyclic queries using
generalized hypertree decompositions. We evaluate our approach experimentally,
and show applications of our algorithms to obtain better results for
differential privacy by orders of magnitude.
Comment: To be published in Proceedings of the 2020 ACM SIGMOD International
Conference on Management of Data.
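A simplified illustration of the tuple-sensitivity idea for a three-relation path join R(A,B) ⋈ S(B,C) ⋈ T(C,D): the number of join results containing any given tuple factorizes into counts aggregated from the two sides of the join tree, so every tuple's sensitivity follows from two aggregation passes without materializing the join. This is our own sketch, not the paper's algorithm.

from collections import Counter

def path_join_local_sensitivity(R, S, T):
    # R: list of (a, b); S: list of (b, c); T: list of (c, d). Returns the
    # maximum number of join results any single tuple participates in,
    # i.e. the local sensitivity of COUNT(R JOIN S JOIN T) under deletion.
    deg_R = Counter(b for _, b in R)    # R-tuples per B-value
    deg_T = Counter(c for c, _ in T)    # T-tuples per C-value
    left = Counter()    # per C-value: sum over S-tuples (b, c) of deg_R[b]
    right = Counter()   # per B-value: sum over S-tuples (b, c) of deg_T[c]
    sens_S = 0
    for b, c in S:
        # an S-tuple (b, c) appears in deg_R[b] * deg_T[c] join results
        sens_S = max(sens_S, deg_R[b] * deg_T[c])
        left[c] += deg_R[b]
        right[b] += deg_T[c]
    sens_R = max((right[b] for _, b in R), default=0)
    sens_T = max((left[c] for c, _ in T), default=0)
    return max(sens_R, sens_S, sens_T)

R = [(1, "x"), (2, "x")]
S = [("x", "u"), ("x", "v")]
T = [("u", 9)]
print(path_join_local_sensitivity(R, S, T))  # 2: dropping S's ("x","u") removes both results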
An Adaptive Mechanism for Accurate Query Answering under Differential Privacy
We propose a novel mechanism for answering sets of counting queries under
differential privacy. Given a workload of counting queries, the mechanism
automatically selects a different set of "strategy" queries to answer
privately, using those answers to derive answers to the workload. The main
algorithm proposed in this paper approximates the optimal strategy for any
workload of linear counting queries. With no cost to the privacy guarantee, the
mechanism improves significantly on prior approaches and achieves near-optimal
error for many workloads when applied under (ε, δ)-differential
privacy. The result is an adaptive mechanism which can help users achieve good
utility without requiring that they reason carefully about the best formulation
of their task.
Comment: VLDB2012. arXiv admin note: substantial text overlap with
arXiv:1103.136
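A simplified rendering of the strategy-matrix idea behind such mechanisms (the matrix mechanism), assuming Gaussian noise under (ε, δ)-differential privacy: answer a strategy matrix A privately, reconstruct the histogram by least squares, and read the workload W off the estimate. The adaptive strategy selection that the paper contributes is replaced here by a fixed identity strategy, and all names and parameters are illustrative.

import numpy as np

def matrix_mechanism(W, A, x, epsilon, delta, rng=None):
    # The L2 sensitivity of A is its largest column norm, since one record
    # changes the histogram x by a single standard basis vector.
    rng = np.random.default_rng(0) if rng is None else rng
    l2_sens = np.max(np.linalg.norm(A, axis=0))
    sigma = l2_sens * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    noisy = A @ x + rng.normal(scale=sigma, size=A.shape[0])
    x_hat = np.linalg.pinv(A) @ noisy     # least-squares histogram estimate
    return W @ x_hat

x = np.array([5.0, 3.0, 8.0, 2.0])   # toy histogram
W = np.tril(np.ones((4, 4)))         # workload: all prefix-sum queries
A = np.eye(4)                        # identity strategy; a better A lowers error
print(matrix_mechanism(W, A, x, epsilon=1.0, delta=1e-5))

The point of strategy selection is that replacing the identity A with, say, a hierarchical strategy can answer every prefix query with far lower total error at the same (ε, δ).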