Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy
Differential privacy is a promising privacy-preserving paradigm for
statistical query processing over sensitive data. It works by injecting random
noise into each query result, such that it is provably hard for the adversary
to infer the presence or absence of any individual record from the published
noisy results. The main objective in differentially private query processing is
to maximize the accuracy of the query results, while satisfying the privacy
guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an
appropriate strategy, processing a batch of correlated queries as a whole
achieves considerably higher accuracy than answering them individually.
However, to our knowledge there is currently no practical solution to find such
a strategy for an arbitrary query batch; existing methods either return
strategies of poor quality (often worse than naive methods) or require
prohibitively expensive computations for even moderately large domains.
Motivated by this, we propose low-rank mechanism (LRM), the first practical
differentially private technique for answering batch linear queries with high
accuracy. LRM works for both exact (i.e., \epsilon-) and approximate (i.e.,
(\epsilon, \delta)-) differential privacy definitions. We derive the
utility guarantees of LRM, and provide guidance on how to set the privacy
parameters given the user's utility expectation. Extensive experiments using
real data demonstrate that our proposed method consistently outperforms
state-of-the-art query processing solutions under differential privacy, by
large margins.
Comment: ACM Transactions on Database Systems (ACM TODS). arXiv admin note: text overlap with arXiv:1212.230
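To make the setting concrete, here is a minimal sketch (not the LRM algorithm itself) of the naive baseline the paper improves on: each query in a batch of linear queries is answered with Laplace noise calibrated to the L1 sensitivity of the workload matrix. The function name and the toy workload are illustrative assumptions.

import numpy as np

def laplace_batch_answer(W, x, epsilon, rng=None):
    """Answer the linear queries W @ x independently under
    epsilon-differential privacy (naive Laplace mechanism).

    x is the histogram of counts over the data domain. Adding or
    removing one record changes one entry of x by 1, so the batch
    answer changes by at most the maximum column L1 norm of W;
    the noise scale is calibrated to that sensitivity.
    """
    rng = np.random.default_rng() if rng is None else rng
    sensitivity = np.abs(W).sum(axis=0).max()   # max column L1 norm of W
    noise = rng.laplace(scale=sensitivity / epsilon, size=W.shape[0])
    return W @ x + noise

# Toy workload: three overlapping range queries over four counts (illustrative only).
W = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)
x = np.array([10, 20, 5, 7], dtype=float)
print(laplace_batch_answer(W, x, epsilon=1.0))

Because the queries overlap, answering them independently like this wastes the privacy budget on correlated noise; exploiting the batch structure is precisely what LRM is designed to do.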
10141 Abstracts Collection -- Distributed Usage Control
From 06.04. to 09.04.2010, the Dagstuhl Seminar 10141 ``Distributed Usage Control'' was held in Schloss Dagstuhl -- Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams
In a data stream management system (DSMS), users register continuous queries,
and receive result updates as data arrive and expire. We focus on applications
with real-time constraints, in which the user must receive each result update
within a given period after the update occurs. To handle fast data, the DSMS is
commonly placed on top of a cloud infrastructure. Because stream properties
such as arrival rates can fluctuate unpredictably, cloud resources must be
dynamically provisioned and scheduled accordingly to ensure real-time response.
Hence, a DSMS must be able to schedule resources dynamically according to the
current workload, so that it neither wastes resources nor fails to deliver
correct results on time. Motivated by this, we propose DRS, a novel dynamic
resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental
challenges: (a) how to model the relationship between the provisioned resources
and query response time; (b) where best to place resources; and (c) how to
measure system load with minimal overhead. In particular, DRS includes an
accurate performance model based on the theory of \emph{Jackson open queueing
networks} and is capable of handling \emph{arbitrary} operator topologies,
possibly with loops, splits and joins. Extensive experiments with real data
confirm that DRS achieves real-time response with close to optimal resource
consumption.
Comment: This is our latest version with certain modifications.
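For illustration, a minimal sketch of the standard open-Jackson-network calculation that this kind of performance model builds on: solve the traffic equations for per-operator arrival rates, apply the M/M/1 sojourn-time formula at each operator, and combine via Little's law. The function name, the single-server assumption, and the toy topology are illustrative assumptions, not the DRS implementation.

import numpy as np

def jackson_expected_latency(lam_ext, P, mu):
    """Expected time a tuple spends in an open Jackson network.

    lam_ext[i]: external arrival rate into operator i
    P[i, j]   : probability a tuple leaving operator i goes to operator j
                (rows may sum to < 1; the remainder exits the system)
    mu[i]     : service rate of operator i (single server assumed)

    Solves the traffic equations lambda = lam_ext + P^T lambda, applies
    the M/M/1 formula L_i = lambda_i / (mu_i - lambda_i) per operator,
    and returns the end-to-end expectation via Little's law.
    """
    lam_ext, mu = np.asarray(lam_ext, float), np.asarray(mu, float)
    P = np.asarray(P, float)
    n = len(mu)
    lam = np.linalg.solve(np.eye(n) - P.T, lam_ext)   # traffic equations
    if np.any(lam >= mu):
        raise ValueError("unstable network: some operator is overloaded")
    L = lam / (mu - lam)                              # expected tuples per operator
    return L.sum() / lam_ext.sum()                    # Little's law: W = L / lambda

# Toy 3-operator topology with a split (illustrative numbers only).
lam_ext = [5.0, 0.0, 0.0]
P = [[0.0, 0.6, 0.4],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
mu = [10.0, 8.0, 9.0]
print(jackson_expected_latency(lam_ext, P, mu))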
Functional Mechanism: Regression Analysis under Differential Privacy
\epsilon-differential privacy is the state-of-the-art model for releasing
sensitive information while protecting privacy. Numerous methods have been
proposed to enforce epsilon-differential privacy in various analytical tasks,
e.g., regression analysis. Existing solutions for regression analysis, however,
are either limited to non-standard types of regression or unable to produce
accurate regression results. Motivated by this, we propose the Functional
Mechanism, a differentially private method designed for a large class of
optimization-based analyses. The main idea is to enforce epsilon-differential
privacy by perturbing the objective function of the optimization problem,
rather than its results. As case studies, we apply the functional mechanism to
address the two most widely used regression models, namely, linear regression and
logistic regression. Both theoretical analysis and thorough experimental
evaluations show that the functional mechanism is highly effective and
efficient, and it significantly outperforms existing solutions.
Comment: VLDB201
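For illustration, a hedged sketch of objective perturbation in the spirit of the functional mechanism, applied to linear regression: the squared loss is written as a degree-2 polynomial in the weights, Laplace noise is added to its coefficients, and the perturbed objective is minimized. The sensitivity constant below is derived under the assumption that every feature and label is rescaled to [-1, 1] and may differ from the paper's exact bound; all names are illustrative and this is not the published algorithm.

import numpy as np

def functional_mechanism_linreg(X, y, epsilon, rng=None):
    """Objective-perturbation sketch for linear regression.

    The squared loss sum_i (y_i - x_i^T w)^2 is a degree-2 polynomial
    in w with coefficients sum(y_i^2), -2 * sum(y_i * x_i), and
    sum(x_i x_i^T). Laplace noise is added to these coefficients and
    the perturbed objective is then minimized.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # Assumed bound: with features/labels in [-1, 1], one tuple contributes
    # at most 1 + 2d + d^2 in absolute coefficient mass, giving this sensitivity.
    sensitivity = 2.0 * (1 + 2 * d + d ** 2)
    b = sensitivity / epsilon

    quad = X.T @ X + rng.laplace(scale=b, size=(d, d))   # coefficients of w^T Q w
    quad = (quad + quad.T) / 2                            # keep the matrix symmetric
    lin = -2 * X.T @ y + rng.laplace(scale=b, size=d)     # coefficients of r^T w
    # Stationarity of w^T Q w + r^T w: 2 Q w + r = 0. The perturbed Q may be
    # indefinite; the paper handles this with regularization, the sketch does not.
    w = np.linalg.lstsq(2 * quad, -lin, rcond=None)[0]
    return w

# Toy usage with synthetic data scaled into [-1, 1] (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 3))
y = np.clip(X @ np.array([0.5, -0.3, 0.2]) + 0.05 * rng.normal(size=500), -1, 1)
print(functional_mechanism_linreg(X, y, epsilon=1.0))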
Low-Rank Mechanism: Optimizing Batch Queries under Differential Privacy
Differential privacy is a promising privacy-preserving paradigm for
statistical query processing over sensitive data. It works by injecting random
noise into each query result, such that it is provably hard for the adversary
to infer the presence or absence of any individual record from the published
noisy results. The main objective in differentially private query processing is
to maximize the accuracy of the query results, while satisfying the privacy
guarantees. Previous work, notably the matrix mechanism, has suggested that
processing a batch of correlated queries as a whole can potentially achieve
considerable accuracy gains, compared to answering them individually. However,
as we point out in this paper, the matrix mechanism is mainly of theoretical
interest; in particular, several inherent problems in its design limit its
accuracy in practice, which almost never exceeds that of naive methods. In
fact, we are not aware of any existing solution that can effectively optimize a
query batch under differential privacy. Motivated by this, we propose the
Low-Rank Mechanism (LRM), the first practical differentially private technique
for answering batch queries with high accuracy, based on a low rank
approximation of the workload matrix. We prove that the accuracy provided by
LRM is close to the theoretical lower bound for any mechanism to answer a batch
of queries under differential privacy. Extensive experiments using real data
demonstrate that LRM consistently outperforms state-of-the-art query processing
solutions under differential privacy, by large margins.
Comment: VLDB201
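For illustration, a minimal sketch of the mechanics of answering a batch through a low-rank factorization W ~ B L: noise is added to the r intermediate strategy queries L x, and the batch answers are reconstructed as B (L x + noise). LRM obtains B and L by solving an optimization problem; the sketch below substitutes a truncated SVD purely as a stand-in factorization, so its accuracy is not representative of the actual mechanism, and all names are illustrative.

import numpy as np

def low_rank_batch_answer(W, x, epsilon, r, rng=None):
    """Answer the batch W @ x through a rank-r factorization W ~= B @ L.

    Laplace noise is added to the r intermediate queries L @ x instead
    of the m original queries; the noisy intermediate answers are then
    mapped back through B. The factorization here is a truncated SVD,
    used only as a stand-in for the optimized factors of LRM.
    """
    rng = np.random.default_rng() if rng is None else rng
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :r] * s[:r]          # m x r
    L = Vt[:r, :]                 # r x n
    # One record changes one count in x by 1, so L @ x changes by at most
    # the maximum column L1 norm of L.
    sensitivity = np.abs(L).sum(axis=0).max()
    noisy_strategy = L @ x + rng.laplace(scale=sensitivity / epsilon, size=r)
    return B @ noisy_strategy

# Toy workload of four overlapping range queries over six counts (illustrative).
W = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 1, 1],
              [1, 1, 1, 1, 1, 1]], dtype=float)
x = np.array([3, 8, 2, 9, 4, 6], dtype=float)
print(low_rank_batch_answer(W, x, epsilon=1.0, r=2))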