
6 research outputs found

Private Incremental Regression

Author: Chaudhuri K.
Fard M. M.
Gordon Y.
Jain P.
Kabán A.
Kasiviswanathan S.
Kifer D.
Ledoux M.
Maillard O.
Mishra N.
Shalev-Shwartz S.
Talwar K.
Thakurta A. G.
Vapnik V.
Williams O.
Publication date: 04/01/2017

Data is continuously generated by modern data sources, and a recent challenge in machine learning has been to develop techniques that perform well in an incremental (streaming) setting. In this paper, we investigate the problem of private machine learning, where, as is common in practice, the data is not given at once, but rather arrives incrementally over time. We introduce the problems of private incremental ERM and private incremental regression, where the general goal is to always maintain a good empirical risk minimizer for the history observed under differential privacy. Our first contribution is a generic transformation of private batch ERM mechanisms into private incremental ERM mechanisms, based on a simple idea of invoking the private batch ERM procedure at some regular time intervals. We take this construction as a baseline for comparison. We then provide two mechanisms for the private incremental regression problem. Our first mechanism is based on privately constructing a noisy incremental gradient function, which is then used in a modified projected gradient procedure at every timestep. This mechanism has an excess empirical risk of $\approx\sqrt{d}$, where $d$ is the dimensionality of the data. While from the results of [Bassily et al. 2014] this bound is tight in the worst case, we show that certain geometric properties of the input and constraint set can be used to derive significantly better results for certain interesting regression problems.

Comment: To appear in PODS 201
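The baseline transformation mentioned in the abstract above (invoking a batch procedure at regular time intervals) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's mechanism: the names `incremental_from_batch`, `batch_solver`, `tau`, and `mean_response_solver` are hypothetical, the toy solver is deliberately non-private, and the wrapper does no noise addition or privacy accounting across invocations.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (covariates x, response y)
Model = List[float]


def incremental_from_batch(
    stream: Sequence[Example],
    batch_solver: Callable[[Sequence[Example]], Model],
    tau: int,
    initial_model: Model,
) -> List[Model]:
    """Return one model per timestep, refreshing it every `tau` arrivals.

    `batch_solver` stands in for a private batch ERM mechanism (assumption);
    this wrapper itself adds no noise and tracks no privacy budget.
    """
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    models: List[Model] = []
    current = list(initial_model)
    for t in range(1, len(stream) + 1):
        if t % tau == 0:
            # Re-run the batch procedure on the whole history seen so far.
            current = batch_solver(stream[:t])
        # Between invocations, the most recent output is simply reused.
        models.append(list(current))
    return models


def mean_response_solver(history: Sequence[Example]) -> Model:
    """Toy NON-private stand-in: a constant model predicting the mean of y."""
    return [sum(y for _, y in history) / len(history)]
```

In this sketch a larger `tau` means fewer invocations of the batch procedure but a staler model between refreshes. How the interval should be chosen, and how privacy composes over invocations, is not specified in the abstract and is left out here.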

arXiv.org e-Print Archive

Crossref

On the Power of Multiple Anonymous Messages

Author: Ghazi Badih
Golowich Noah
Kumar Ravi
Pagh Rasmus
Velingker Ameya
Publication date: 19/05/2020

An exciting new development in differential privacy is the shuffled model, in which an anonymous channel enables non-interactive, differentially private protocols with error much smaller than what is possible in the local model, while relying on weaker trust assumptions than in the central model. In this paper, we study basic counting problems in the shuffled model and establish separations between the error that can be achieved in the single-message shuffled model and in the shuffled model with multiple messages per user.

For the problem of frequency estimation for $n$ users and a domain of size $B$, we obtain:
- A nearly tight lower bound of $\tilde{\Omega}(\min(\sqrt[4]{n}, \sqrt{B}))$ on the error in the single-message shuffled model. This implies that the protocols obtained from the amplification via shuffling work of Erlingsson et al. (SODA 2019) and Balle et al. (Crypto 2019) are essentially optimal for single-message protocols. A key ingredient in the proof is a lower bound on the error of locally-private frequency estimation in the low-privacy (aka high $\epsilon$) regime.
- Protocols in the multi-message shuffled model with $\mathrm{poly}(\log{B}, \log{n})$ bits of communication per user and $\mathrm{poly}\log{B}$ error, which provide an exponential improvement on the error compared to what is possible with single-message algorithms.

For the related selection problem on a domain of size $B$, we prove:
- A nearly tight lower bound of $\Omega(B)$ on the number of users in the single-message shuffled model. This significantly improves on the $\Omega(B^{1/17})$ lower bound obtained by Cheu et al. (Eurocrypt 2019), and when combined with their $\tilde{O}(\sqrt{B})$-error multi-message protocol, implies the first separation between single-message and multi-message protocols for this problem.

Comment: 70 pages, 2 figures, 3 tables
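To make the single-message setting in the abstract above concrete, here is a minimal sketch of frequency estimation in which each user sends one randomized message through a shuffler. It assumes standard k-ary randomized response as the local randomizer and a simple debiasing step. It is not the protocol or the lower-bound construction from the paper, and all function names are hypothetical. The `epsilon` here is the local randomizer's parameter; the sketch does not compute the amplified guarantee after shuffling.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def local_randomizer(value: int, domain_size: int, epsilon: float,
                     rng: random.Random) -> int:
    """k-ary randomized response: keep the true value with probability p."""
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if rng.random() < p:
        return value
    # Otherwise report one of the other domain_size - 1 values uniformly.
    other = rng.randrange(domain_size - 1)
    return other if other < value else other + 1


def shuffle_and_estimate(values: Sequence[int], domain_size: int,
                         epsilon: float, seed: int = 0) -> List[float]:
    """Each user sends one message; the analyzer sees only the shuffled multiset."""
    if domain_size < 2 or epsilon <= 0:
        raise ValueError("need domain_size >= 2 and epsilon > 0")
    rng = random.Random(seed)
    messages = [local_randomizer(v, domain_size, epsilon, rng) for v in values]
    rng.shuffle(messages)  # models the anonymous channel: order is destroyed
    counts = Counter(messages)
    n = len(values)
    denom = math.exp(epsilon) + domain_size - 1
    p = math.exp(epsilon) / denom  # probability of reporting the true value
    q = 1.0 / denom                # probability of reporting any specific other value
    # E[count_v] = n_v * p + (n - n_v) * q, so invert for an unbiased estimate.
    return [(counts.get(v, 0) - n * q) / (p - q) for v in range(domain_size)]
```

For example, `shuffle_and_estimate([0] * 50 + [1] * 30 + [2] * 20, domain_size=4, epsilon=1.0)` returns four noisy frequency estimates. They always sum to the number of users, though individual entries can be negative or non-integer.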

arXiv.org e-Print Archive

Cryptology ePrint Archive