
    Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions

    We suggest the use of hash functions to cut down the communication cost of counting subgraphs under edge local differential privacy. While various algorithms exist for computing graph statistics, including subgraph counts, under edge local differential privacy, many suffer from high communication costs, making them inefficient for large graphs. Although data compression is a typical approach in differential privacy, applying it under local differential privacy requires a form of compression that every node can reproduce. In our study, we introduce linear congruence hashing. With a sampling rate of s, our method can cut communication costs by a factor of s^2, albeit at the cost of increasing the variance of the published graph statistic by a factor of s. The experimental results indicate that, when matched for communication cost, our method reduces the ℓ2-error of triangle counts by up to 1000 times compared to leading algorithms. (Comment: 13 pages, 3 figures)
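    As an illustration of the reproducible-sampling idea in this abstract, here is a minimal sketch in which a public linear congruence hash decides which adjacency bits a node reports after edge-level randomized response. The modulus, coefficients, and the way the sampling rate s is applied are assumptions for illustration, not the paper's exact construction.

    import math
    import random

    # Hedged sketch: a shared linear congruence hash decides which adjacency
    # bits a node reports, so the sampling is reproducible by every party.
    # The modulus P, coefficients A and B, and how the rate s is applied are
    # illustrative assumptions, not the paper's exact scheme.

    P = 2_147_483_647               # public prime modulus (assumption)
    A, B = 1_103_515_245, 12_345    # public hash coefficients (assumption)

    def lc_hash(v: int) -> int:
        # Linear congruence hash; public, so any node can recompute it.
        return (A * v + B) % P

    def sampled(v: int, s: float) -> bool:
        # Keep vertex v with probability about s, deterministically from the hash.
        return lc_hash(v) < s * P

    def randomized_response(bit: int, eps: float) -> int:
        # Standard edge-LDP randomized response on a single adjacency bit.
        keep = math.exp(eps) / (math.exp(eps) + 1)
        return bit if random.random() < keep else 1 - bit

    def local_report(node_id: int, neighbors: set, n: int, eps: float, s: float) -> dict:
        # Report noisy adjacency bits only for hash-sampled vertices.
        return {v: randomized_response(int(v in neighbors), eps)
                for v in range(n)
                if v != node_id and sampled(v, s)}

    Because the hash is public and deterministic, the aggregator and every node agree on which vertex pairs were sampled, which is what makes the compression reproducible in the local model.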

    Continuous Release of Data Streams under both Centralized and Local Differential Privacy

    In this paper, we study the problem of publishing a stream of real-valued data while satisfying differential privacy (DP). One major challenge is that the maximal possible value can be quite large; it is therefore necessary to estimate a threshold so that values above it are truncated, reducing the amount of noise that must be added to the data. This estimation must be done on the data in a private fashion. We develop such a method using the Exponential Mechanism with a quality function that approximates the utility goal well while maintaining low sensitivity. Given the threshold, we then propose a novel online hierarchical method and several post-processing techniques. Building on these ideas, we formalize the steps into a framework for private publishing of stream data. Our framework consists of three components: a threshold optimizer that privately estimates the threshold, a perturber that adds calibrated noise to the stream, and a smoother that improves the result via post-processing. Within our framework, we design an algorithm satisfying the more stringent setting of DP called local DP (LDP). To our knowledge, this is the first LDP algorithm for publishing streaming data. Using four real-world datasets, we demonstrate that our mechanism outperforms the state of the art by 6-10 orders of magnitude in terms of utility (measured by the mean squared error of answering a random range query).
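    A minimal sketch of the three-component pipeline (threshold optimizer, perturber, smoother) is given below. The power-of-two candidate grid, the clipping-count quality function, and the moving-average smoother are illustrative stand-ins, not the paper's exact choices.

    import math
    import random

    def exp_mechanism(candidates, quality, eps, sensitivity):
        # Exponential Mechanism: sample a candidate with prob. proportional
        # to exp(eps * quality / (2 * sensitivity)).
        scores = [math.exp(eps * quality(c) / (2 * sensitivity)) for c in candidates]
        r = random.random() * sum(scores)
        acc = 0.0
        for c, sc in zip(candidates, scores):
            acc += sc
            if r <= acc:
                return c
        return candidates[-1]

    def choose_threshold(sample, eps):
        # Threshold optimizer (illustrative): quality is the negative number
        # of values clipped at threshold t, which has sensitivity 1.
        cands = [2 ** k for k in range(0, 21)]
        return exp_mechanism(cands, lambda t: -sum(1 for x in sample if x > t), eps, 1)

    def perturb_stream(stream, threshold, eps):
        # Perturber: truncate to the threshold, then add Laplace noise
        # scaled to threshold / eps.
        lap = lambda b: random.expovariate(1 / b) - random.expovariate(1 / b)
        return [min(x, threshold) + lap(threshold / eps) for x in stream]

    def smooth(noisy, window=5):
        # Smoother: a simple moving average as post-processing (illustrative).
        out = []
        for i in range(len(noisy)):
            lo = max(0, i - window + 1)
            out.append(sum(noisy[lo:i + 1]) / (i + 1 - lo))
        return out

    Since the smoother only post-processes the already-noisy stream, it does not consume any additional privacy budget.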

    Intertwining Order Preserving Encryption and Differential Privacy

    Ciphertexts of an order-preserving encryption (OPE) scheme preserve the order of their corresponding plaintexts. However, OPE is vulnerable to inference attacks that exploit this preserved order. On the other hand, differential privacy (DP) has become the de facto standard for achieving data privacy. One of the most attractive properties of DP is that any post-processing (inferential) computation performed on the noisy output of a DP algorithm does not degrade its privacy guarantee. In this paper, we intertwine the two approaches and propose a novel differentially private order-preserving encryption scheme, OPε. Under OPε, the leakage of order from the ciphertexts is differentially private. As a result, OPε ensures, at the very least, a formal guarantee (specifically, a relaxed DP guarantee) even in the face of inference attacks. To the best of our knowledge, this is the first work to intertwine DP with a property-preserving encryption scheme. We demonstrate OPε's practical utility in answering range queries via an extensive empirical evaluation on four real-world datasets. For instance, OPε misses only around 4 in every 10K correct records on average for a dataset of size ~732K with an attribute of domain size ~18K and ε = 1.
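    To make "the leakage of order is differentially private" concrete, here is a toy sketch in which each value's order tag is perturbed with symmetric geometric (discrete Laplace) noise before being attached to the ciphertext, so comparisons between tags reveal only a noisy order. This is a hedged illustration under assumed parameters, not the OPε construction itself.

    import math
    import random

    def discrete_laplace(eps: float) -> int:
        # Symmetric (two-sided) geometric noise: P(k) proportional to exp(-eps * |k|).
        q = math.exp(-eps)
        if random.random() < (1 - q) / (1 + q):
            return 0
        k = 1
        while random.random() < q:      # magnitude >= 1, geometric tail
            k += 1
        return k if random.random() < 0.5 else -k

    def dp_order_tag(value: int, eps: float) -> int:
        # Order tag = value plus discrete Laplace noise (illustrative; the
        # real scheme would pair this with a semantically secure ciphertext).
        return value + discrete_laplace(eps)

    # Example: tags sort the records in an approximately correct order, so a
    # range query over tags over-approximates the true answer and is filtered
    # after decryption.
    tags = sorted((dp_order_tag(x, eps=1.0), x) for x in [100, 101, 500])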

    Estimating numerical distributions under local differential privacy

    When collecting information, local differential privacy (LDP) relieves users' concern about privacy leakage, as each user's private information is randomized before being sent to the aggregator. We study the problem of recovering the distribution over a numerical domain while satisfying LDP. While one can discretize a numerical domain and then apply the protocols developed for categorical domains, we show that taking advantage of the numerical nature of the domain results in a better trade-off between privacy and utility. We introduce a new reporting mechanism, called the square wave (SW) mechanism, which exploits the numerical nature of the domain in reporting. We also develop an Expectation Maximization with Smoothing (EMS) algorithm, which is applied to the aggregated histogram of SW reports to estimate the original distribution. Extensive experiments demonstrate that our proposed approach, SW with EMS, consistently outperforms other methods across a variety of utility metrics.
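    A hedged sketch of the two stages follows: a square-wave-style reporter that outputs values near the true input with higher density, and an EM-with-smoothing decoder over a binned domain. The half-window b, bin count, iteration count, and smoothing kernel are illustrative choices, not the values derived in the paper.

    import math
    import random

    def sw_perturb(v: float, eps: float, b: float) -> float:
        # Input v in [0, 1]; output in [-b, 1 + b]. Values within distance b
        # of v are reported with density p, everything else with density q < p.
        e = math.exp(eps)
        q = 1.0 / (2 * b * e + 1)
        p = e * q
        if random.random() < 2 * b * p:
            return random.uniform(v - b, v + b)
        u = random.uniform(0.0, 1.0)        # the region outside the window has length 1
        return u - b if u < v else u + b    # skip the window around v

    def ems(reports, eps, b, bins=64, iters=200):
        # Transition matrix: approximate prob. that an input bin lands in an
        # output bin (density times output-bin width).
        e = math.exp(eps)
        q = 1.0 / (2 * b * e + 1)
        p = e * q
        width = (1 + 2 * b) / bins
        cin = [(i + 0.5) / bins for i in range(bins)]
        cout = [-b + (j + 0.5) * width for j in range(bins)]
        M = [[(p if abs(co - ci) <= b else q) * width for ci in cin] for co in cout]

        counts = [0] * bins
        for y in reports:
            counts[min(bins - 1, int((y + b) / width))] += 1

        est = [1.0 / bins] * bins
        for _ in range(iters):
            new = [0.0] * bins
            for j in range(bins):           # standard EM update on the histogram
                denom = sum(M[j][i] * est[i] for i in range(bins))
                if denom == 0:
                    continue
                for i in range(bins):
                    new[i] += counts[j] * M[j][i] * est[i] / denom
            sm = []                         # smoothing: small (1, 2, 1) kernel
            for i in range(bins):
                window = [(1, i - 1), (2, i), (1, i + 1)]
                num = sum(w * new[k] for w, k in window if 0 <= k < bins)
                den = sum(w for w, k in window if 0 <= k < bins)
                sm.append(num / den)
            total = sum(sm) or 1.0
            est = [x / total for x in sm]
        return est                          # estimated distribution over the input bins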