
3,532 research outputs found

Linear and Range Counting under Metric-based Local Differential Privacy

Authors: Ding Bolin, He Xi, Xiang Zhuolun, Zhou Jingren
Publication venue
Publication date: 16/05/2020

Local differential privacy (LDP) enables private data sharing and analytics without the need for a trusted data collector. Error-optimal primitives (for, e.g., estimating means and item frequencies) under LDP have been well studied. For analytical tasks such as range queries, however, the best known error bound depends on the domain size of the private data, which is potentially prohibitive. This deficiency is inherent, as LDP protects the same level of indistinguishability between any pair of private data values for each data owner. In this paper, we utilize an extension of $\epsilon$-LDP called Metric-LDP or $E$-LDP, where a metric $E$ defines heterogeneous privacy guarantees for different pairs of private data values and thus provides a more flexible knob than $\epsilon$ does to relax LDP and tune utility-privacy trade-offs. We show that, under such privacy relaxations, for analytical workloads such as linear counting, multi-dimensional range counting queries, and quantile queries, we can achieve significant gains in utility. In particular, for range queries under $E$-LDP where the metric $E$ is the $L^1$-distance function scaled by $\epsilon$, we design mechanisms with errors independent of the domain sizes; instead, their errors depend on the metric $E$, which specifies at what granularity the private data is protected. We believe that the primitives we design for $E$-LDP will be useful in developing mechanisms for other analytical tasks, and encourage the adoption of LDP in practice.
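As a rough illustration of the privacy notion in this abstract (not the mechanisms designed in the paper), the sketch below assumes the usual formulation of metric-based privacy: a local randomizer satisfies $E$-LDP if, for every pair of inputs $x, x'$ and every output $y$, the likelihood of $y$ under $x$ is at most $e^{E(x,x')}$ times its likelihood under $x'$. For the scaled $L^1$ metric $E(x,x') = \epsilon |x - x'|$ on the real line, adding Laplace noise of scale $1/\epsilon$ to the raw value meets this bound. The function names `l1_metric`, `laplace_density`, and `perturb` are hypothetical and chosen only for this example.

```python
import math
import random


def l1_metric(x, x_prime, eps):
    """Assumed metric E(x, x') = eps * |x - x'| (scaled L1 distance)."""
    return eps * abs(x - x_prime)


def laplace_density(y, x, eps):
    """Density at y of the report x + Laplace noise with scale 1/eps."""
    return 0.5 * eps * math.exp(-eps * abs(y - x))


def perturb(x, eps, rng):
    """Local randomizer: x plus Laplace(1/eps) noise.

    The difference of two independent Exponential(rate=eps) draws
    follows a Laplace distribution with scale 1/eps.
    """
    return x + rng.expovariate(eps) - rng.expovariate(eps)


if __name__ == "__main__":
    rng = random.Random(0)
    eps = 0.5
    # Nearby values are hard to tell apart; distant values are less protected.
    for x in (10.0, 11.0, 50.0):
        print(x, perturb(x, eps, rng))
```

Under this sketch the noise scale depends only on $\epsilon$, not on the size of the data domain, which is the kind of trade-off the abstract describes: values close to each other are strongly protected, while values far apart may be distinguishable.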

arXiv.org e-Print Archive

Crossref

A Theory of Pricing Private Data

Authors: Li Chao, Li Daniel Yang, Miklau Gerome, Suciu Dan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/12/2012

Personal data has value to both its owner and to institutions who would like to analyze it. Privacy mechanisms protect the owner's data while releasing to analysts noisy versions of aggregate query results. But such strict protections of individuals' data have not yet found wide use in practice. Instead, Internet companies, for example, commonly provide free services in return for valuable sensitive information from users, which they exploit and sometimes sell to third parties. As awareness of the value of personal data increases, so does the drive to compensate the end user for her private information. The idea of monetizing private data can improve over the narrower view of hiding private data, since it empowers individuals to control their data through financial means. In this paper we propose a theoretical framework for assigning prices to noisy query answers, as a function of their accuracy, and for dividing the price amongst data owners who deserve compensation for their loss of privacy. Our framework adopts and extends key principles from both differential privacy and query pricing in data markets. We identify essential properties of the price function and micro-payments, and characterize valid solutions. Comment: 25 pages, 2 figures. Best Paper Award, to appear in the 16th International Conference on Database Theory (ICDT), 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Budget Feasible Mechanisms for Experimental Design

Authors: A. Archer, A. Atkinson, A.A. Ageev, G. Calinescu, J. Ginebra, L. Vandenberghe, M. Sviridenko, R. Lavi, R. Myerson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/07/2013

In the classical experimental design setting, an experimenter E has access to a population of $n$ potential experiment subjects $i \in \{1,\ldots,n\}$, each associated with a vector of features $x_i \in \mathbb{R}^d$. Conducting an experiment with subject $i$ reveals an unknown value $y_i \in \mathbb{R}$ to E. E typically assumes some hypothetical relationship between the $x_i$'s and $y_i$'s, e.g., $y_i \approx \beta x_i$, and estimates $\beta$ from experiments, e.g., through linear regression. As a proxy for various practical constraints, E may select only a subset of subjects on which to conduct the experiment. We initiate the study of budgeted mechanisms for experimental design. In this setting, E has a budget $B$. Each subject $i$ declares an associated cost $c_i > 0$ to be part of the experiment, and must be paid at least her cost. In particular, the Experimental Design Problem (EDP) is to find a set $S$ of subjects for the experiment that maximizes $V(S) = \log\det(I_d + \sum_{i\in S} x_i x_i^{T})$ under the constraint $\sum_{i\in S} c_i \leq B$; our objective function corresponds to the information gain in parameter $\beta$ that is learned through linear regression methods, and is related to the so-called $D$-optimality criterion. Further, the subjects are strategic and may lie about their costs. We present a deterministic, polynomial time, budget feasible mechanism scheme that is approximately truthful and yields a constant factor approximation to EDP. In particular, for any small $\delta > 0$ and $\epsilon > 0$, we can construct a $(12.98, \epsilon)$-approximate mechanism that is $\delta$-truthful and runs in polynomial time in both $n$ and $\log\log\frac{B}{\epsilon\delta}$. We also establish that no truthful, budget-feasible algorithm is possible within a factor 2 approximation, and show how to generalize our approach to a wide class of learning problems, beyond linear regression.
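To make the objective concrete, the following sketch evaluates $V(S) = \log\det(I_d + \sum_{i\in S} x_i x_i^{T})$ and finds the best budget-feasible set by exhaustive search on a tiny instance. It is only an illustration of the optimization problem as stated in the abstract: it ignores strategic behaviour and payments entirely, runs in exponential time, and is not the mechanism the authors propose. The helper names (`log_det_spd`, `value`, `best_feasible_set`) and the example data are hypothetical.

```python
import math
from itertools import combinations


def log_det_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    # det(A) = prod(diag(L))^2, so log det(A) = 2 * sum(log diag(L)).
    return 2.0 * sum(math.log(low[i][i]) for i in range(d))


def value(subset, xs):
    """V(S) = log det(I_d + sum_{i in S} x_i x_i^T)."""
    d = len(xs[0])
    m = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    for i in subset:
        for r in range(d):
            for c in range(d):
                m[r][c] += xs[i][r] * xs[i][c]
    return log_det_spd(m)


def best_feasible_set(xs, costs, budget):
    """Brute-force maximizer of V(S) subject to sum of costs <= budget."""
    best, best_val = (), 0.0  # V(empty set) = log det(I_d) = 0
    n = len(xs)
    for k in range(1, n + 1):
        for s in combinations(range(n), k):
            if sum(costs[i] for i in s) <= budget:
                v = value(s, xs)
                if v > best_val:
                    best, best_val = s, v
    return best, best_val


if __name__ == "__main__":
    xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # made-up feature vectors
    costs = [1.0, 1.0, 1.5]                    # made-up declared costs
    print(best_feasible_set(xs, costs, budget=2.0))
```

Because $I_d + \sum_{i\in S} x_i x_i^{T}$ is always symmetric positive definite, a Cholesky factorization is enough to compute the log-determinant without any external library.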

arXiv.org e-Print Archive

Crossref

k-anonymous Microdata Release via Post Randomisation Method

Authors: Chida Koji, Ikarashi Dai, Kikuchi Ryo, Takahashi Katsumi
Publication venue
Publication date: 21/04/2015

The problem of the release of anonymized microdata is an important topic in the fields of statistical disclosure control (SDC) and privacy preserving data publishing (PPDP), and yet it remains largely unsolved. In these research fields, k-anonymity has been widely studied as an anonymity notion for mainly deterministic anonymization algorithms, and some probabilistic relaxations have been developed. However, they are not sufficient due to their limitations, i.e., being weaker than the original k-anonymity or requiring strong parametric assumptions. First, we propose Pk-anonymity, a new probabilistic k-anonymity, and prove that Pk-anonymity is a mathematical extension of k-anonymity rather than a relaxation. Furthermore, Pk-anonymity requires no parametric assumptions. This property is significant in that it enables us to compare privacy levels of probabilistic microdata release algorithms with deterministic ones. Second, we apply Pk-anonymity to the post randomization method (PRAM), which is an SDC algorithm based on randomization. PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can control PRAM's parameter so that Pk-anonymity is satisfied. On the other hand, PRAM is also known to satisfy $\varepsilon$-differential privacy, a recent popular and strong privacy notion. This fact means that our results significantly enhance PRAM since it implies the satisfaction of both important notions: k-anonymity and $\varepsilon$-differential privacy. Comment: 22 pages, 4 figures
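For readers unfamiliar with PRAM, the sketch below shows one common way to think about it: each categorical value is independently replaced according to a row-stochastic transition matrix. The particular matrix used here (keep the value with probability `p`, otherwise move uniformly to one of the other categories) is an assumption made for illustration and is not taken from the paper; the helper names are hypothetical. For a matrix with strictly positive entries, the largest log-ratio between two rows in the same column gives an $\varepsilon$ bounding how much one record's true value can change the distribution of its randomized value. The Pk-anonymity analysis from the paper is not reproduced here.

```python
import math
import random


def retain_matrix(k, p):
    """Assumed PRAM transition matrix: keep with prob. p, else uniform over the rest."""
    off = (1.0 - p) / (k - 1)
    return [[p if i == j else off for j in range(k)] for i in range(k)]


def pram(records, matrix, rng):
    """Independently re-draw each category code from its row of the matrix."""
    k = len(matrix)
    return [rng.choices(range(k), weights=matrix[v])[0] for v in records]


def epsilon_of(matrix):
    """Largest log-ratio between any two rows in any column (entries must be > 0)."""
    k = len(matrix)
    return max(
        math.log(matrix[a][j] / matrix[b][j])
        for j in range(k)
        for a in range(k)
        for b in range(k)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    m = retain_matrix(k=4, p=0.7)
    print(pram([0, 1, 2, 3, 3, 0], m, rng))
    print(epsilon_of(m))
```

With this choice of matrix, raising `p` keeps more records intact (better utility) but increases $\varepsilon$, which is the tunable parameter the abstract alludes to.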

arXiv.org e-Print Archive

Crossref