Interaction is necessary for distributed learning with privacy or communication constraints
Local differential privacy (LDP) is a model where users send privatized data
to an untrusted central server whose goal is to solve some data analysis task.
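As a concrete illustration of the local model (a generic textbook primitive, not the construction studied in this work), binary randomized response lets each user perturb a single bit before sending it, while the server debiases the aggregate; the function names here are ours:

```python
import math
import random

def randomize_bit(bit, epsilon):
    """Local randomizer: report the true bit with probability
    e^eps / (e^eps + 1), otherwise flip it (satisfies epsilon-LDP)."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p_truth else 1 - bit

def debiased_mean(reports, epsilon):
    """Server-side estimate of the true fraction of 1-bits,
    correcting for the known flipping probability."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    raw = sum(reports) / len(reports)
    return (raw - (1 - p_truth)) / (2 * p_truth - 1)
```

Note that this protocol needs only one round: the server announces the privacy parameter, each user sends one randomized bit, and the server debiases.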
In the non-interactive version of this model, the protocol consists of a single
round in which the server sends requests to all users and then receives their
responses. This version is deployed in industry due to its practical advantages
and has attracted significant research interest. Our main result is an
exponential lower bound on the number of samples necessary to solve the
standard task of learning a large-margin linear separator in the
non-interactive LDP model. Via a standard reduction this lower bound implies an
exponential lower bound for stochastic convex optimization and, specifically,
for learning linear models with a convex, Lipschitz, and smooth loss. These
results answer the questions posed in \citep{SmithTU17,DanielyF18}. Our lower
bound relies on a new technique for constructing pairs of distributions with
nearly matching moments but whose supports can be nearly separated by a large
margin hyperplane. These lower bounds also hold in the model where
communication from each user is limited and follow from a lower bound on
learning using non-adaptive \emph{statistical queries}.
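In the statistical-query model referenced above, the learner never sees raw samples: it submits expectation queries, and an oracle answers each one only up to a tolerance. Non-adaptivity means all queries are fixed in advance. The following toy simulator is our own illustration of the model, not the construction used in the lower bound:

```python
import random

def sq_answers(samples, queries, tau, seed=0):
    """Non-adaptive SQ oracle: the full list of queries q: X -> [0, 1]
    is fixed up front; each answer is the empirical mean E[q(x)]
    perturbed by at most tau (here, simulated with bounded noise)."""
    rng = random.Random(seed)
    answers = []
    for q in queries:
        mean = sum(q(x) for x in samples) / len(samples)
        answers.append(mean + rng.uniform(-tau, tau))
    return answers
```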
Locally Private Hypothesis Selection
We initiate the study of hypothesis selection under local differential
privacy. Given samples from an unknown probability distribution $p$ and a set
of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the
constraints of $\varepsilon$-local differential privacy, a distribution from
$\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of
the best such distribution. This is a generalization of the classic problem of
$k$-wise simple hypothesis testing, which corresponds to the case $p \in \mathcal{Q}$,
where we wish to identify $p$. Absent privacy constraints, this problem requires
$O(\log k)$ samples from $p$, and it was recently shown that the same sample
complexity is achievable under (central) differential privacy. However, the
naive approach to this problem under local differential privacy would require
$\tilde{O}(k^2)$ samples.
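For reference, the selection objective can be phrased without privacy or sampling: with direct access to the distributions, one simply minimizes total variation distance over the hypothesis class. This idealized baseline is our own sketch of the target, not an algorithm from this work:

```python
def tv_distance(p, q):
    """Total variation distance between two discrete distributions
    given as probability vectors over a common support."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def best_hypothesis(target, hypotheses):
    """Index of the hypothesis closest to `target` in TV distance
    (an idealized selector with direct access to `target`)."""
    return min(range(len(hypotheses)),
               key=lambda i: tv_distance(target, hypotheses[i]))
```

The algorithmic difficulty in the private, sample-based setting is approximating this minimization when each sample from the target is seen only through a local randomizer.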
We first show that the constraint of local differential privacy incurs an
exponential increase in cost: any algorithm for this problem requires at least
$\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis
testing, we provide a non-interactive algorithm which nearly matches this
bound, requiring $\tilde{O}(k)$ samples. Finally, we provide sequentially
interactive algorithms for the general case, requiring $\tilde{O}(k)$ samples
and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved
through a reduction to maximum selection with adversarial comparators, a
problem of independent interest for which we initiate study in the parallel
setting. For this problem, we provide a family of algorithms for each number of
allowed rounds of interaction $t$, as well as lower bounds showing that they
are near-optimal for every $t$. Notably, our algorithms result in exponential
improvements on the round complexity of previous methods.

Comment: To appear in COLT 2020
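A minimal sketch of parallel maximum selection via a single-elimination tournament (with an exact comparator for concreteness; the adversarial-comparator setting studied here allows wrong answers when the two values are close, and the names and structure below are our own illustration):

```python
def knockout_max(n, compare):
    """Knockout tournament over items 0..n-1: each round pairs the
    survivors (all comparisons in a round run in parallel) and keeps
    the comparator's pick, so n items need ceil(log2 n) rounds."""
    items = list(range(n))
    rounds = 0
    while len(items) > 1:
        survivors = []
        for j in range(0, len(items) - 1, 2):
            survivors.append(compare(items[j], items[j + 1]))
        if len(items) % 2:          # an odd item out gets a bye
            survivors.append(items[-1])
        items = survivors
        rounds += 1
    return items[0], rounds
```

With a comparator that may answer arbitrarily whenever the two values differ by less than some slack, the winner of such a tournament can lose up to that slack in each round, which is one reason controlling the number of rounds matters in this setting.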