87 research outputs found

    Efficient Differentially Private F₀ Linear Sketching

    Get PDF

    Differential Privacy in Distributed Settings

    Get PDF

    Differentially Private Fractional Frequency Moments Estimation with Polylogarithmic Space

    Get PDF
    We prove that the F_p sketch, a celebrated streaming algorithm for frequency moments estimation, is differentially private as is when p ∈ (0, 1]. The F_p sketch uses only polylogarithmic space, exponentially better than existing DP baselines and worse than the optimal non-private baseline by only a logarithmic factor. The evaluation shows that the F_p sketch achieves reasonable accuracy with a differential privacy guarantee. The evaluation code is included in the supplementary material.
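
    For context, the object in question is Indyk's F_p linear sketch: project the frequency vector with i.i.d. p-stable entries and recover F_p from the median of the projections. Below is a minimal, non-private Python rendition of that estimator; the function names are illustrative, and a real streaming implementation would generate the projection matrix pseudorandomly rather than storing it. (The paper's claim is that the sketch's output is already differentially private for p ∈ (0, 1], which this toy version does not itself demonstrate.)

```python
import numpy as np

def sample_p_stable(p, size, rng):
    """Draw symmetric p-stable variates via the Chambers-Mallows-Stuck method."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(p * theta) / np.cos(theta) ** (1 / p)
            * (np.cos((1 - p) * theta) / w) ** ((1 - p) / p))

def fp_sketch_estimate(stream, n, p=0.5, k=200, seed=0):
    """Estimate F_p = sum_i |f_i|^p of a turnstile stream of (item, delta) pairs.

    Maintains y = S f for a k x n matrix S of i.i.d. p-stable entries and
    returns (median|y| / median|X|)^p, the median estimator.
    """
    rng = np.random.default_rng(seed)
    S = sample_p_stable(p, (k, n), rng)   # the linear sketch matrix
    y = np.zeros(k)
    for item, delta in stream:            # linearity: updates apply directly to y
        y += delta * S[:, item]
    # Normalizing constant: median of |X| for a standard p-stable X,
    # estimated here by Monte Carlo rather than a closed form.
    med_x = np.median(np.abs(sample_p_stable(p, 100_000, rng)))
    return (np.median(np.abs(y)) / med_x) ** p

# Example: final frequencies f = (3, 1, 2) over a universe of size 4.
stream = [(0, 1), (1, 1), (0, 2), (2, 2), (1, 1), (1, -1)]
print(fp_sketch_estimate(stream, n=4, p=0.5))  # true F_0.5 = 3^0.5 + 1 + 2^0.5 ≈ 4.15
```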

    Improved Differentially Private Euclidean Distance Approximation

    Get PDF

    The Flajolet-Martin sketch itself preserves differential privacy: private counting with minimal space

    Full text link
    https://proceedings.neurips.cc/paper/2020/file/e3019767b1b23f82883c9850356b71d6-Paper.pdf
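
    As background for the title's claim: the Flajolet-Martin sketch tracks, per hash function, the maximum number of trailing zeros seen among hashed items, and estimates the distinct count as a power of two. A minimal, non-private Python rendition follows (textbook FM with median-of-means combining; the names are illustrative, and this is not the paper's private variant):

```python
import hashlib
import statistics

def trailing_zeros(h: int) -> int:
    """Number of trailing zero bits of h (h > 0)."""
    return (h & -h).bit_length() - 1

def fm_estimate(items, k=30):
    """Flajolet-Martin distinct-count estimate: each of k hash functions keeps
    R = max trailing-zero count over hashed items and proposes 2^R; the k
    proposals are combined by median-of-group-means to reduce the variance."""
    R = [0] * k
    for x in items:
        data = str(x).encode()
        for j in range(k):
            h = int.from_bytes(hashlib.blake2b(
                data, digest_size=8, salt=j.to_bytes(8, "big")).digest(), "big")
            if h:
                R[j] = max(R[j], trailing_zeros(h))
    estimates = [2 ** r for r in R]
    groups = [estimates[i:i + 5] for i in range(0, k, 5)]
    return statistics.median(sum(g) / len(g) for g in groups)

print(fm_estimate(range(5000)))  # typically within a small constant factor of 5000
```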

    Counting Distinct Elements in the Turnstile Model with Differential Privacy under Continual Observation

    Full text link
    Privacy is a central challenge for systems that learn from sensitive data sets, especially when a system's outputs must be continuously updated to reflect changing data. We consider the achievable error for differentially private continual release of a basic statistic -- the number of distinct items -- in a stream where items may be both inserted and deleted (the turnstile model). With only insertions, existing algorithms have additive error just polylogarithmic in the length of the stream T. We uncover a much richer landscape in the turnstile model, even without considering memory restrictions. We show that every differentially private mechanism that handles insertions and deletions has worst-case additive error at least T^{1/4}, even under a relatively weak, event-level privacy definition. Then, we identify a parameter of the input stream, its maximum flippancy, that is low for natural data streams and for which we give tight parameterized error guarantees. Specifically, the maximum flippancy is the largest number of times that the contribution of a single item to the distinct elements count changes over the course of the stream. We present an item-level differentially private mechanism that, for all turnstile streams with maximum flippancy w, continually outputs the number of distinct elements with an O(√w · polylog T) additive error, without requiring prior knowledge of w. We prove that this is the best achievable error bound that depends only on w, for a large range of values of w. When w is small, the error of our mechanism is similar to the polylogarithmic-in-T error of the insertion-only setting, bypassing the hardness of the turnstile model.
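
    The maximum flippancy is concrete enough to compute directly from the definition above. A small Python helper (not from the paper; the name max_flippancy is ours) that takes a turnstile stream of (item, ±1) updates and reports how often any single item's presence in the distinct count flips:

```python
from collections import defaultdict

def max_flippancy(stream):
    """Maximum flippancy of a turnstile stream of (item, +1/-1) updates:
    the largest number of times any single item's contribution to the
    distinct-elements count (present or not) changes over the stream."""
    count = defaultdict(int)
    flips = defaultdict(int)
    for item, delta in stream:
        present_before = count[item] > 0
        count[item] += delta
        if (count[item] > 0) != present_before:
            flips[item] += 1
    return max(flips.values(), default=0)

# 'a' appears, disappears, and reappears: 3 flips. 'b' appears once: 1 flip.
stream = [("a", +1), ("b", +1), ("a", -1), ("a", +1)]
print(max_flippancy(stream))  # 3
```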

    On Distributed Differential Privacy and Counting Distinct Elements

    Get PDF
    We study the setup where each of n users holds an element from a discrete set, and the goal is to count the number of distinct elements across all users, under the constraint of (ε, δ)-differential privacy:
    - In the non-interactive local setting, we prove that the additive error of any protocol is Ω(n) for any constant ε and for any δ inverse polynomial in n.
    - In the single-message shuffle setting, we prove a lower bound of Ω̃(n) on the error for any constant ε and for some δ inverse quasi-polynomial in n. We do so by building on the moment-matching method from the literature on distribution estimation.
    - In the multi-message shuffle setting, we give a protocol with at most one message per user in expectation and with an error of Õ(√n) for any constant ε and for any δ inverse polynomial in n. Our protocol is also robustly shuffle private, and our error of √n matches a known lower bound for such protocols.
    Our proof technique relies on a new notion, that we call dominated protocols, and which can also be used to obtain the first non-trivial lower bounds against multi-message shuffle protocols for the well-studied problems of selection and learning parity. Our first lower bound for estimating the number of distinct elements provides the first ω(√n) separation between global sensitivity and error in local differential privacy, thus answering an open question of Vadhan (2017). We also provide a simple construction that gives an Ω̃(n) separation between global sensitivity and error in two-party differential privacy, thereby answering an open question of McGregor et al. (2011).
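
    For contrast with these lower bounds: in the central model the distinct count has global sensitivity 1, since changing one user's element moves the count by at most 1, so a single Laplace draw already gives O(1/ε) expected error, independent of n. A minimal sketch assuming the standard Laplace mechanism (not a protocol from this paper):

```python
import numpy as np

def central_dp_distinct_count(user_elements, epsilon, rng=None):
    """Central-model DP count of distinct elements via the Laplace mechanism.

    Changing one user's element alters the true count by at most 1, so the
    global sensitivity is 1 and Laplace(1/epsilon) noise suffices; expected
    additive error is O(1/epsilon), independent of n -- exactly the gap that
    the local and shuffle lower bounds above quantify.
    """
    rng = rng or np.random.default_rng()
    true_count = len(set(user_elements))
    return true_count + rng.laplace(scale=1.0 / epsilon)

print(central_dp_distinct_count(["a", "b", "a", "c"], epsilon=1.0))  # ~3, ± a few
```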

    Privacy-Preserving Cloud-Assisted Data Analytics

    Get PDF
    Industries today collect massive, rapidly growing amounts of data that can be used to extract insights for improving many aspects of our lives. Data analytics (e.g., via machine learning) is widely applied to make important decisions in real-world applications. However, it is challenging for resource-limited clients to analyze their data efficiently when the data is large-scale, and data resources are increasingly distributed among different owners. Moreover, users' data may contain private information that needs to be protected. Cloud computing has become increasingly popular in both academia and industry: by pooling infrastructure and servers together, cloud platforms offer virtually unlimited resources easily accessible via the Internet, including machine learning and data analytics services.

    The goal of this dissertation is to develop privacy-preserving cloud-assisted data analytics solutions that address these challenges by leveraging the powerful and easy-to-access cloud. In particular, we propose the following systems. To address users' limited computation power and the need for privacy protection in data analytics, we consider geometric programming (GP) and design a secure, efficient, and verifiable outsourcing protocol for GP. Our protocol consists of a scheme that transforms GP to DGP, a transform scheme that achieves computational indistinguishability, and an efficient scheme to solve the transformed DGP at the cloud side with result verification. Evaluation results show that the proposed secure outsourcing protocol achieves significant time savings for users.

    To address the problem of limited data at individual users, we propose two distributed learning systems that let users collaboratively train machine learning models without losing privacy. The first is a differentially private framework for training logistic regression models with distributed data sources. We exploit the relevance between input data features and the model output to significantly improve learning accuracy. Moreover, we adopt an evaluation data set at the cloud side to suppress low-quality data sources, and propose a differentially private mechanism to protect users' data-quality privacy. Experimental results show that the proposed framework achieves high utility with low-quality data and a strong privacy guarantee. The second is an efficient privacy-preserving federated learning system that enables multiple edge users to collaboratively train models without revealing their datasets. To reduce communication overhead, we select well-aligned gradients of sufficiently large magnitude for uploading, which leads to quick convergence. To minimize the added noise and improve model utility, each user adds only a small amount of noise to the selected gradients and encrypts them before uploading; the cloud server obtains only the aggregate gradients, which contain enough noise to achieve differential privacy. Evaluation results show that the proposed system achieves high accuracy, low communication overhead, and a strong privacy guarantee.

    In future work, we plan to design a privacy-preserving data analytics system with fair exchange, which ensures payment fairness. We will also consider designing distributed learning systems with heterogeneous architectures.
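
    As a rough illustration of the gradient-selection idea in the second system, here is a Python sketch, not the dissertation's implementation: each user keeps only its largest-magnitude gradient coordinates whose sign agrees with the previous global update (a stand-in for "well-aligned"), perturbs just those coordinates, and the server averages the sparse updates. Encryption of the noisy gradients and the formal DP accounting are omitted.

```python
import numpy as np

def select_and_perturb(grad, global_dir, k, sigma, rng):
    """Keep the k largest-magnitude coordinates that agree in sign with the
    previous global update ('well-aligned'), add Gaussian noise to only
    those, and zero out the rest to cut communication."""
    aligned = np.sign(grad) == np.sign(global_dir)
    scores = np.where(aligned, np.abs(grad), 0.0)
    keep = np.argsort(scores)[-k:]                 # indices of selected coords
    sparse = np.zeros_like(grad)
    sparse[keep] = grad[keep] + rng.normal(0.0, sigma, size=k)
    return sparse

def server_aggregate(updates):
    """The server sees only the (noisy, sparse) updates and averages them."""
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
global_dir = rng.normal(size=10)                   # previous global update
user_grads = [rng.normal(size=10) for _ in range(5)]
updates = [select_and_perturb(g, global_dir, k=3, sigma=0.1, rng=rng)
           for g in user_grads]
print(server_aggregate(updates))
```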