Linear and Range Counting under Metric-based Local Differential Privacy
Local differential privacy (LDP) enables private data sharing and analytics
without the need for a trusted data collector. Error-optimal primitives (for,
e.g., estimating means and item frequencies) under LDP have been well studied.
For analytical tasks such as range queries, however, the best known error bound
is dependent on the domain size of private data, which is potentially
prohibitive. This deficiency is inherent as LDP protects the same level of
indistinguishability between any pair of private data values for each data
owner.
In this paper, we utilize an extension of $\epsilon$-LDP called Metric-LDP or
$E$-LDP, where a metric $E$ defines heterogeneous privacy guarantees for
different pairs of private data values and thus provides a more flexible knob
than $\epsilon$ does to relax LDP and tune utility-privacy trade-offs. We show
that, under such privacy relaxations, for analytical workloads such as linear
counting, multi-dimensional range counting queries, and quantile queries, we
can achieve significant gains in utility. In particular, for range queries
under $E$-LDP where the metric $E$ is the $L^1$-distance function scaled by
$\epsilon$, we design mechanisms with errors independent of the domain sizes;
instead, their errors depend on the metric $E$, which specifies at what
granularity the private data is protected. We believe that the primitives we
design for $E$-LDP will be useful in developing mechanisms for other analytical
tasks, and encourage the adoption of LDP in practice.
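For reference, the metric-based guarantee described in this abstract is commonly stated as below. This is the standard formulation of metric privacy rather than a quotation from the paper, and the symbols $\mathcal{A}$ (the randomized mechanism), $x$, $x'$ (private values), $y$ (an output), $E$ (the metric), and $\epsilon$ (the privacy parameter) are notation assumed here.

```latex
% Assumed notation: \mathcal{A} is a randomized local mechanism, E is a metric on the data domain.
% E-LDP: for all private values x, x' and every output y,
\Pr[\mathcal{A}(x) = y] \;\le\; e^{E(x, x')} \, \Pr[\mathcal{A}(x') = y].

% Standard \epsilon-LDP is the special case of a constant bound:
E(x, x') = \epsilon \quad \text{for all } x \neq x'.

% The scaled L^1 metric mentioned in the abstract:
E(x, x') = \epsilon \, \lVert x - x' \rVert_1 .
```

Under the scaled $L^1$ metric, nearby values are strongly indistinguishable while distant values are allowed to be told apart, which is the sense in which the metric sets the granularity of protection.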
A Theory of Pricing Private Data
Personal data has value to both its owner and to institutions who would like
to analyze it. Privacy mechanisms protect the owner's data while releasing to
analysts noisy versions of aggregate query results. But such strict protections
of individuals' data have not yet found wide use in practice. Instead, Internet
companies, for example, commonly provide free services in return for valuable
sensitive information from users, which they exploit and sometimes sell to
third parties.
As awareness of the value of personal data has increased, so has the
drive to compensate the end user for her private information. The idea of
monetizing private data can improve over the narrower view of hiding private
data, since it empowers individuals to control their data through financial
means.
In this paper we propose a theoretical framework for assigning prices to
noisy query answers, as a function of their accuracy, and for dividing the
price amongst data owners who deserve compensation for their loss of privacy.
Our framework adopts and extends key principles from both differential privacy
and query pricing in data markets. We identify essential properties of the
price function and micro-payments, and characterize valid solutions.
Comment: 25 pages, 2 figures. Best Paper Award, to appear in the 16th
International Conference on Database Theory (ICDT), 201
Budget Feasible Mechanisms for Experimental Design
In the classical experimental design setting, an experimenter E has access to
a population of $n$ potential experiment subjects $i \in \{1, \ldots, n\}$, each
associated with a vector of features $x_i \in \mathbb{R}^d$. Conducting an
experiment with subject $i$ reveals an unknown value $y_i \in \mathbb{R}$ to E.
E typically assumes some hypothetical relationship between $x_i$'s and $y_i$'s,
e.g., $y_i \approx \beta^{\top} x_i$, and estimates $\beta$ from experiments,
e.g., through linear regression. As a proxy for various practical constraints,
E may select only a subset of subjects on which to conduct the experiment.
We initiate the study of budgeted mechanisms for experimental design. In this
setting, E has a budget $B$. Each subject $i$ declares an associated cost
$c_i > 0$ to be part of the experiment, and must be paid at least her cost. In
particular, the Experimental Design Problem (EDP) is to find a set $S$ of
subjects for the experiment that maximizes
$V(S) = \log\det(I_d + \sum_{i \in S} x_i x_i^{\top})$ under the constraint
$\sum_{i \in S} c_i \le B$; our objective function corresponds to the
information gain in parameter $\beta$ that is learned through linear regression
methods, and is related to the so-called $D$-optimality criterion. Further, the
subjects are strategic and may lie about their costs.
We present a deterministic, polynomial-time, budget-feasible mechanism
scheme that is approximately truthful and yields a constant-factor
approximation to EDP. In particular, for any small $\delta > 0$ and
$\epsilon > 0$, we can construct a (12.98, $\epsilon$)-approximate mechanism
that is $\delta$-truthful and runs in polynomial time in both $n$ and
$\log\log\frac{B}{\epsilon\delta}$. We also establish that no truthful,
budget-feasible algorithm is possible within a factor 2 approximation, and
show how to generalize our approach to a wide class of learning problems,
beyond linear regression.
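To make the EDP objective concrete, the following minimal sketch evaluates V(S) = log det(I_d + sum of x_i x_i^T over i in S) and searches for the best subset within a budget by brute force. It is not the mechanism from the paper: it ignores strategic cost reporting entirely, runs in exponential time, and the function names `information_gain` and `best_feasible_subset` are hypothetical.

```python
import math
from itertools import combinations


def information_gain(features, subset):
    """Return V(S) = log det(I_d + sum_{i in S} x_i x_i^T).

    The matrix is symmetric positive definite, so the log-determinant is
    computed from a Cholesky factor L as 2 * sum(log L[r][r]).
    """
    d = len(features[0])
    # Start from the identity matrix I_d.
    m = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    # Add the outer product x_i x_i^T for each selected subject.
    for i in subset:
        x = features[i]
        for r in range(d):
            for c in range(d):
                m[r][c] += x[r] * x[c]
    # Cholesky factorization, accumulating the log-determinant on the way.
    lower = [[0.0] * d for _ in range(d)]
    log_det = 0.0
    for r in range(d):
        for c in range(r + 1):
            s = m[r][c] - sum(lower[r][k] * lower[c][k] for k in range(c))
            if r == c:
                lower[r][c] = math.sqrt(s)
                log_det += 2.0 * math.log(lower[r][c])
            else:
                lower[r][c] = s / lower[c][c]
    return log_det


def best_feasible_subset(features, costs, budget):
    """Exhaustively find the subset maximizing V(S) with total cost <= budget.

    Illustration only: exponential in the number of subjects, and it treats
    the declared costs as truthful.
    """
    n = len(features)
    best, best_value = (), 0.0  # V(empty set) = log det(I_d) = 0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if sum(costs[i] for i in subset) <= budget:
                value = information_gain(features, subset)
                if value > best_value:
                    best, best_value = subset, value
    return best, best_value


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    costs = [1.0, 1.0, 3.0]
    print(best_feasible_subset(features, costs, budget=2.0))
```

In the toy instance, the third subject is too expensive for the budget of 2, so the two cheaper subjects with orthogonal features are selected.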
k-anonymous Microdata Release via Post Randomisation Method
The problem of the release of anonymized microdata is an important topic in
the fields of statistical disclosure control (SDC) and privacy preserving data
publishing (PPDP), yet it remains largely unsolved. In these research
fields, k-anonymity has been widely studied as an anonymity notion for mainly
deterministic anonymization algorithms, and some probabilistic relaxations have
been developed. However, they are not sufficient due to their limitations,
i.e., being weaker than the original k-anonymity or requiring strong parametric
assumptions. First we propose Pk-anonymity, a new probabilistic k-anonymity,
and prove that Pk-anonymity is a mathematical extension of k-anonymity rather
than a relaxation. Furthermore, Pk-anonymity requires no parametric
assumptions. This property is significant because it
enables us to compare privacy levels of probabilistic microdata release
algorithms with deterministic ones. Second, we apply Pk-anonymity to the post
randomization method (PRAM), which is an SDC algorithm based on randomization.
PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can
control PRAM's parameter so that Pk-anonymity is satisfied. On the other hand,
PRAM is also known to satisfy $\epsilon$-differential privacy, a recent
popular and strong privacy notion. This fact means that our results
significantly enhance PRAM since it implies the satisfaction of both important
notions: k-anonymity and $\epsilon$-differential privacy.
Comment: 22 pages, 4 figures
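As a rough illustration of the randomization PRAM relies on, the sketch below perturbs one categorical attribute with a simple transition matrix (keep the value with probability `retain_prob`, otherwise move uniformly to another category) and computes the smallest epsilon for which that matrix satisfies the per-value differential privacy ratio bound. The uniform transition matrix and the names `pram_transition`, `apply_pram`, and `epsilon_of` are assumptions made for this example; the paper's conditions for Pk-anonymity are not reproduced here.

```python
import math
import random


def pram_transition(categories, retain_prob):
    """Build a row-stochastic transition matrix as nested dicts.

    Assumed form: a value is kept with probability retain_prob and otherwise
    replaced by one of the other categories chosen uniformly.
    """
    m = len(categories)
    other = (1.0 - retain_prob) / (m - 1)
    return {
        a: {b: (retain_prob if a == b else other) for b in categories}
        for a in categories
    }


def apply_pram(values, transition, rng):
    """Independently replace each value by a draw from its transition row."""
    perturbed = []
    for v in values:
        row = transition[v]
        choice = rng.choices(list(row), weights=list(row.values()), k=1)[0]
        perturbed.append(choice)
    return perturbed


def epsilon_of(transition):
    """Smallest eps with P[a -> c] <= exp(eps) * P[b -> c] for all a, b, c."""
    worst_ratio = 1.0
    outputs = next(iter(transition.values()))
    for c in outputs:
        column = [row[c] for row in transition.values()]
        if min(column) == 0.0:
            return math.inf  # some output rules out an input entirely
        worst_ratio = max(worst_ratio, max(column) / min(column))
    return math.log(worst_ratio)


if __name__ == "__main__":
    cats = ["A", "B", "C"]
    t = pram_transition(cats, retain_prob=0.5)
    print(apply_pram(["A", "A", "B", "C"], t, random.Random(0)))
    print(epsilon_of(t))
```

With three categories and `retain_prob=0.5`, the off-diagonal probability is 0.25, so the worst-case ratio is 2 and the resulting epsilon is ln 2. Lowering `retain_prob` toward 1/3 drives epsilon toward 0 at the cost of utility.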