
    Interaction is necessary for distributed learning with privacy or communication constraints

    Local differential privacy (LDP) is a model in which users send privatized data to an untrusted central server whose goal is to solve some data analysis task. In the non-interactive version of this model the protocol consists of a single round: the server sends requests to all users and then receives their responses. This version is deployed in industry due to its practical advantages and has attracted significant research interest. Our main result is an exponential lower bound on the number of samples necessary to solve the standard task of learning a large-margin linear separator in the non-interactive LDP model. Via a standard reduction, this lower bound implies an exponential lower bound for stochastic convex optimization and, specifically, for learning linear models with a convex, Lipschitz, and smooth loss. These results answer the questions posed in [SmithTU17, DanielyF18]. Our lower bound relies on a new technique for constructing pairs of distributions with nearly matching moments whose supports can nevertheless be nearly separated by a large-margin hyperplane. These lower bounds also hold in the model where communication from each user is limited, and they follow from a lower bound on learning using non-adaptive statistical queries.
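    To make the non-interactive LDP setting concrete, the sketch below shows a single-round protocol in which each user privatizes one bit with standard randomized response and the server debiases the aggregate. All names and parameters here are illustrative assumptions; this is not the paper's construction, which proves lower bounds for this model rather than giving a protocol.

```python
import numpy as np

def randomized_response(bit: int, epsilon: float, rng: np.random.Generator) -> int:
    """epsilon-LDP local randomizer for a single bit (standard randomized response)."""
    p_truth = np.exp(epsilon) / (np.exp(epsilon) + 1.0)  # probability of answering truthfully
    return bit if rng.random() < p_truth else 1 - bit

def non_interactive_ldp_mean(bits, epsilon: float, seed: int = 0) -> float:
    """One-round protocol: every user privatizes locally, the server debiases the average."""
    rng = np.random.default_rng(seed)
    noisy = [randomized_response(b, epsilon, rng) for b in bits]  # users respond once, in parallel
    p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    # E[noisy report] = (2p - 1) * mean + (1 - p), so invert that map to get an unbiased estimate.
    return (np.mean(noisy) - (1.0 - p)) / (2.0 * p - 1.0)

# Toy usage: estimate the mean of a Bernoulli(0.3) population under epsilon = 1 LDP.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.binomial(1, 0.3, size=100_000)
    print(non_interactive_ldp_mean(data, epsilon=1.0))
```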

    Locally Private Hypothesis Selection

    We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-local differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde{O}(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde{O}(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde{O}(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.
    Comment: To appear in COLT 202
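    To illustrate the kind of reduction the abstract refers to, below is a minimal sketch of maximum selection with pairwise comparators run as a parallel knockout tournament: in each round surviving candidates are paired off and compared, so $k$ candidates shrink to one in roughly $\log_2 k$ rounds. The comparator interface and all names are illustrative assumptions, not the paper's definitions; the paper's algorithms use privatized pairwise hypothesis tests as comparators and further tricks to reach $O(\log \log k)$ rounds, which this naive tournament does not achieve.

```python
import math
from typing import Callable, List, Sequence

# A comparator takes two candidate indices and returns the index it judges "larger".
# In the adversarial-comparator model its answer need only be reliable when the two
# candidates' values are far apart; this interface is an assumption for illustration.
Comparator = Callable[[int, int], int]

def knockout_max(candidates: Sequence[int], compare: Comparator) -> int:
    """Parallel knockout tournament: each round halves the field, ~log2(k) rounds total."""
    alive: List[int] = list(candidates)
    rounds = 0
    while len(alive) > 1:
        winners: List[int] = []
        # Comparisons within a round are independent, so they could run in parallel.
        for i in range(0, len(alive) - 1, 2):
            winners.append(compare(alive[i], alive[i + 1]))
        if len(alive) % 2 == 1:          # odd candidate out gets a bye to the next round
            winners.append(alive[-1])
        alive = winners
        rounds += 1
    print(f"finished in {rounds} rounds (~log2(k) = {math.log2(max(len(candidates), 2)):.1f})")
    return alive[0]

# Toy usage: candidates are indices into a score table; the comparator just looks up scores.
if __name__ == "__main__":
    scores = [0.3, 0.9, 0.1, 0.7, 0.5]
    best = knockout_max(range(len(scores)), lambda a, b: a if scores[a] >= scores[b] else b)
    print("selected candidate:", best, "score:", scores[best])
```

    Note that this tournament already uses about $\log_2 k$ rounds; the $O(\log \log k)$-round guarantee in the abstract comes from eliminating far more than half of the candidates per round, which is not reproduced in this sketch.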