
    Heavy Hitters and the Structure of Local Privacy

    We present a new locally differentially private algorithm for the heavy hitters problem which achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates which depend optimally on the number of users, the size of the domain, and the privacy parameter, but depend sub-optimally on the failure probability. We strengthen existing lower bounds on the error to incorporate the failure probability, and show that our new upper bound is tight with respect to this parameter as well. Our lower bound is based on a new understanding of the structure of locally private protocols. We further develop these ideas to obtain the following general results beyond heavy hitters.
    • Advanced Grouposition: In the local model, group privacy for k users degrades proportionally to ≈ √k, instead of linearly in k as in the central model. Stronger group privacy yields improved max-information guarantees, as well as stronger lower bounds (via "packing arguments"), over the central model.
    • Building on a transformation of Bassily and Smith (STOC 2015), we give a generic transformation from any non-interactive approximate-private local protocol into a pure-private local protocol. Again in contrast with the central model, this shows that we cannot obtain more accurate algorithms by moving from pure to approximate local privacy.
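    The contrast between the two group-privacy regimes can be made concrete with a quick calculation. The sketch below only illustrates the asymptotic claim from the abstract; the constants, the function names, and the bare √k·ε form are our simplifications, not the paper's formal statement.

        import math

        def central_group_privacy(eps, k):
            # Standard group-privacy degradation in the central model:
            # a group of k users pays a factor of k in epsilon.
            return k * eps

        def local_group_privacy(eps, k):
            # Order-of-magnitude bound suggested by the abstract's
            # "advanced grouposition": roughly sqrt(k) * eps in the local model.
            return math.sqrt(k) * eps

        for k in (1, 4, 16, 64):
            print(k, central_group_privacy(0.1, k), local_group_privacy(0.1, k))

    For k = 64 and ε = 0.1 the central bound is 6.4 while the local bound is only 0.8; this gap is what powers the stronger packing-style lower bounds mentioned above.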

    Differentially Private Heatmaps

    We consider the task of producing heatmaps from users' aggregated data while protecting their privacy. We give a differentially private (DP) algorithm for this task and demonstrate its advantages over previous algorithms on real-world datasets. Our core algorithmic primitive is a DP procedure that takes in a set of distributions and produces an output that is close in Earth Mover's Distance to the average of the inputs. We prove theoretical bounds on the error of our algorithm under a certain sparsity assumption and show that these are near-optimal.
    Comment: To appear in AAAI 202
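    As a point of reference for the task (this is not the paper's algorithm, which is built around Earth Mover's Distance), here is a minimal naive baseline: average the users' distributions, add Laplace noise calibrated to the L1 sensitivity of the average, and project back onto the probability simplex. The function name and the crude clip-and-renormalize projection are our assumptions.

        import numpy as np

        def dp_average_distribution(dists, eps, seed=None):
            """dists: (n, d) array; each row is one user's probability distribution."""
            rng = np.random.default_rng(seed)
            dists = np.asarray(dists, dtype=float)
            n, d = dists.shape
            avg = dists.mean(axis=0)
            # Replacing one user's distribution changes the average by at most
            # 2/n in L1, so per-coordinate Laplace noise of scale (2/n)/eps
            # gives eps-DP via the standard Laplace mechanism.
            noisy = avg + rng.laplace(scale=(2.0 / n) / eps, size=d)
            # Crude projection back to the simplex: clip negatives, renormalize.
            noisy = np.clip(noisy, 0.0, None)
            total = noisy.sum()
            return noisy / total if total > 0 else np.full(d, 1.0 / d)

    Per-cell Laplace noise like this ignores spatial structure, which is exactly where an EMD-based procedure such as the paper's can do better.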

    Unbounded Differentially Private Quantile and Maximum Estimation

    In this work we consider the problem of differentially private computation of quantiles for the data, especially the highest quantiles such as the maximum, but with an unbounded range for the dataset. We show that this can be done efficiently through a simple invocation of AboveThreshold, a subroutine that is iteratively called in the fundamental Sparse Vector Technique, even when there is no upper bound on the data. In particular, we show that this procedure can give more accurate and robust estimates on the highest quantiles, with applications towards clipping, which is essential for differentially private sum and mean estimation. In addition, we show how two invocations can handle the fully unbounded data setting. Within our study, we show that an improved analysis of AboveThreshold can improve the privacy guarantees for the widely used Sparse Vector Technique, which is of independent interest. We give a more general characterization of the privacy loss for AboveThreshold, which we immediately apply to our method for improved privacy guarantees. Our algorithm requires only one O(n) pass through the data, which can be unsorted, and each subsequent query takes O(1) time. We empirically compare our unbounded algorithm with the state-of-the-art algorithms in the bounded setting. For inner quantiles, we find that our method often performs better on non-synthetic datasets. For the maximal quantiles, which we apply to differentially private sum computation, we find that our method performs significantly better.
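    To make the idea concrete, here is a hedged sketch of how AboveThreshold can find a quantile with no upper bound on the data: sweep candidate values upward and halt the first time a noisy count of points below the candidate crosses a noisy rank threshold. The integer candidate grid, the specific noise scales, and the O(n) count per candidate are our simplifications; the paper's actual procedure needs only one O(n) pass with O(1) time per subsequent query.

        import numpy as np

        def dp_quantile_unbounded(data, q, eps, seed=None):
            """Estimate the q-quantile (q near 1 approximates the maximum)."""
            rng = np.random.default_rng(seed)
            x = np.asarray(data)
            # Standard AboveThreshold noise split: Lap(2/eps) on the threshold,
            # Lap(4/eps) on each query, for a single "above" answer.
            noisy_threshold = q * len(x) + rng.laplace(scale=2.0 / eps)
            b = 0
            while True:  # no a-priori upper bound on the data is needed
                count = int(np.sum(x <= b))  # sensitivity-1 counting query
                if count + rng.laplace(scale=4.0 / eps) >= noisy_threshold:
                    return b
                b += 1

    Setting q close to 1 yields a maximum estimate suitable for clipping; presumably the second invocation mentioned in the abstract sweeps the other direction to cover data that is also unbounded below.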

    Empirical Risk Minimization in the Non-interactive Local Model of Differential Privacy

    In this paper, we study the Empirical Risk Minimization (ERM) problem in the non-interactive Local Differential Privacy (LDP) model. Previous research on this problem (Smith et al., 2017) indicates that the sample complexity needed to achieve error α must depend exponentially on the dimensionality p for general loss functions. In this paper, we make two attempts to resolve this issue by investigating conditions on the loss functions that allow us to remove such a limit. In our first attempt, we show that if the loss function is (∞, T)-smooth, then by using Bernstein polynomial approximation we can avoid the exponential dependency in the α term. We then propose player-efficient algorithms with 1-bit communication complexity and O(1) computation cost for each player; the error bound of these algorithms is asymptotically the same as the original one. With some additional assumptions, we also give an algorithm which is more efficient for the server. In our second attempt, we show that for any 1-Lipschitz generalized linear convex loss function, there is an (ε, δ)-LDP algorithm whose sample complexity for achieving error α is only linear in the dimensionality p. Our results rely on a technique that approximates the loss by a polynomial of the inner product. Finally, motivated by the idea of polynomial approximation and building on different types of polynomial approximations, we propose (efficient) non-interactive locally differentially private algorithms for learning the set of k-way marginal queries and the set of smooth queries.
    Comment: Appeared in the Journal of Machine Learning Research. This is the journal version of arXiv:1802.04085 and fixes a bug in arXiv:1812.0682
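    The Bernstein polynomial approximation behind the first attempt is easy to state in one dimension: a function f on [0, 1] is approximated by B_m(f)(x) = Σ_{k=0}^{m} f(k/m) · C(m, k) · x^k (1 − x)^{m−k}. The snippet below only illustrates this approximation step; the LDP protocol built on top of it (the 1-bit noisy reports) is omitted, and the example loss is our own choice.

        from math import comb, exp, log

        def bernstein_approx(f, m):
            # Degree-m Bernstein polynomial of f on [0, 1]:
            # B_m(f)(x) = sum_k f(k/m) * C(m, k) * x^k * (1 - x)^(m - k)
            def B(x):
                return sum(f(k / m) * comb(m, k) * x**k * (1 - x)**(m - k)
                           for k in range(m + 1))
            return B

        # Example: approximate the logistic loss t -> log(1 + e^{-t}) on [0, 1].
        f = lambda t: log(1 + exp(-t))
        B8 = bernstein_approx(f, 8)
        print(f(0.3), B8(0.3))  # the two values should be close

    Because B_m(f) depends on f only through the m + 1 values f(k/m), each player can contribute noisy information about these coefficients, which is what makes the non-interactive protocol possible for smooth losses.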