Search CORE

845 research outputs found

Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

Author: Huang Junliang
Jin Chi
Wang Liwei
Wang Ziteng
Zhong Yiqiao
Publication venue
Publication date: 06/01/2014
Field of study

We consider accurately answering smooth queries while preserving differential privacy. A query is said to be

K

-smooth if it is specified by a function defined on

[-1,1]^d

whose partial derivatives up to order

K

are all bounded. We develop an

\epsilon

-differentially private mechanism for the class of

K

-smooth queries. The major advantage of the algorithm is that it outputs a synthetic database. In real applications, a synthetic database output is appealing. Our mechanism achieves an accuracy of

O (n^{-\frac{K}{2d+K}}/\epsilon )

, and runs in polynomial time. We also generalize the mechanism to preserve

(\epsilon, \delta)

-differential privacy with slightly improved accuracy. Extensive experiments on benchmark datasets demonstrate that the mechanisms have good accuracy and are efficient

arXiv.org e-Print Archive

CiteSeerX

Nearly Optimal Private Convolution

Author: A. Bhaskara
A. Blum
C. Dwork
C. Li
C. Li
J. Bolot
J. Thaler
K. Chandrasekaran
L. Lovász
R.M. Gray
T.-H. Hubert Chan
Publication venue
Publication date: 01/01/2013
Field of study

We study computing the convolution of a private input

x

with a public input

h

, while satisfying the guarantees of

(\epsilon, \delta)

-differential privacy. Convolution is a fundamental operation, intimately related to Fourier Transforms. In our setting, the private input may represent a time series of sensitive events or a histogram of a database of confidential personal information. Convolution then captures important primitives including linear filtering, which is an essential tool in time series analysis, and aggregation queries on projections of the data. We give a nearly optimal algorithm for computing convolutions while satisfying

(\epsilon, \delta)

-differential privacy. Surprisingly, we follow the simple strategy of adding independent Laplacian noise to each Fourier coefficient and bounding the privacy loss using the composition theorem of Dwork, Rothblum, and Vadhan. We derive a closed form expression for the optimal noise to add to each Fourier coefficient using convex programming duality. Our algorithm is very efficient -- it is essentially no more computationally expensive than a Fast Fourier Transform. To prove near optimality, we use the recent discrepancy lowerbounds of Muthukrishnan and Nikolov and derive a spectral lower bound using a characterization of discrepancy in terms of determinants

arXiv.org e-Print Archive

Crossref

Learning Coverage Functions and Private Release of Marginals

Author: Feldman Vitaly
Kothari Pravesh
Publication venue
Publication date: 27/05/2014
Field of study

We study the problem of approximating and learning coverage functions. A function

c: 2^{[n]} \rightarrow \mathbf{R}^{+}

is a coverage function, if there exists a universe

U

with non-negative weights

w(u)

for each

u \in U

and subsets

A_1, A_2, \ldots, A_n

U

such that

c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)

. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that for any

\gamma,\delta>0

, given random and uniform examples of an unknown coverage function

c

, finds a function

h

that approximates

c

within factor

1+\gamma

on all but

\delta

-fraction of the points in time

poly(n,1/\gamma,1/\delta)

. This is the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithms are based on several new structural properties of coverage functions. Using the results in (Feldman and Kothari, 2014), we also show that coverage functions are learnable agnostically with excess

\ell_1

-error

\epsilon

over all product and symmetric distributions in time

n^{\log(1/\epsilon)}

. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time

2^{\tilde{O}(n^{1/3})}

(Klivans and Servedio, 2004). As an application of our learning results, we give simple differentially-private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any

k \leq n

, we obtain private release of

k

-way marginals with average error

\bar{\alpha}

in time

n^{O(\log(1/\bar{\alpha}))}

arXiv.org e-Print Archive

CiteSeerX

What Can We Learn Privately?

Author: Adam Smith
K. Lee
Kasiviswanathan Homin
Kobbi Nissim
Shiva Prasad
Sofya Raskhodnikova
Publication venue
Publication date: 01/01/2010
Field of study

Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large real-life data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in contexts where aggregate information is released about a database containing sensitive information about individuals. We demonstrate that, ignoring computational constraints, it is possible to privately agnostically learn any concept class using a sample size approximately logarithmic in the cardinality of the concept class. Therefore, almost anything learnable is learnable privately: specifically, if a concept class is learnable by a (non-private) algorithm with polynomial sample complexity and output size, then it can be learned privately using a polynomial number of samples. We also present a computationally efficient private PAC learner for the class of parity functions. Local (or randomized response) algorithms are a practical class of private algorithms that have received extensive investigation. We provide a precise characterization of local private learning algorithms. We show that a concept class is learnable by a local algorithm if and only if it is learnable in the statistical query (SQ) model. Finally, we present a separation between the power of interactive and noninteractive local learning algorithms.Comment: 35 pages, 2 figure

arXiv.org e-Print Archive

CiteSeerX