25,789 research outputs found

    Efficient Privacy Preserving Distributed Clustering Based on Secret Sharing

    Get PDF
    In this paper, we propose a privacy preserving distributed clustering protocol for horizontally partitioned data based on a very efficient homomorphic additive secret sharing scheme. The model we use for the protocol is novel in the sense that it utilizes two non-colluding third parties. We provide a brief security analysis of our protocol from information theoretic point of view, which is a stronger security model. We show communication and computation complexity analysis of our protocol along with another protocol previously proposed for the same problem. We also include experimental results for computation and communication overhead of these two protocols. Our protocol not only outperforms the others in execution time and communication overhead on data holders, but also uses a more efficient model for many data mining applications

    Tight Lower Bounds for Differentially Private Selection

    Full text link
    A pervasive task in the differential privacy literature is to select the kk items of "highest quality" out of a set of dd items, where the quality of each item depends on a sensitive dataset that must be protected. Variants of this task arise naturally in fundamental problems like feature selection and hypothesis testing, and also as subroutines for many sophisticated differentially private algorithms. The standard approaches to these tasks---repeated use of the exponential mechanism or the sparse vector technique---approximately solve this problem given a dataset of n=O(klogd)n = O(\sqrt{k}\log d) samples. We provide a tight lower bound for some very simple variants of the private selection problem. Our lower bound shows that a sample of size n=Ω(klogd)n = \Omega(\sqrt{k} \log d) is required even to achieve a very minimal accuracy guarantee. Our results are based on an extension of the fingerprinting method to sparse selection problems. Previously, the fingerprinting method has been used to provide tight lower bounds for answering an entire set of dd queries, but often only some much smaller set of kk queries are relevant. Our extension allows us to prove lower bounds that depend on both the number of relevant queries and the total number of queries
    corecore