
    Quantifying Differential Privacy under Temporal Correlations

    Differential Privacy (DP) has received increased attention as a rigorous privacy framework. Existing studies employ traditional DP mechanisms (e.g., the Laplace mechanism) as primitives, which assume that the data are independent or that adversaries do not have knowledge of the data correlations. However, continuously generated data in the real world tend to be temporally correlated, and such correlations can be acquired by adversaries. In this paper, we investigate the potential privacy loss of a traditional DP mechanism under temporal correlations in the context of continuous data release. First, we model the temporal correlations using a Markov model and analyze the privacy leakage of a DP mechanism when adversaries have knowledge of such temporal correlations. Our analysis reveals that the privacy leakage of a DP mechanism may accumulate and increase over time. We call this temporal privacy leakage. Second, to measure such privacy leakage, we design an efficient algorithm for calculating it in polynomial time. Although the temporal privacy leakage may increase over time, we also show that its supremum may exist in some cases. Third, to bound the privacy loss, we propose mechanisms that convert any existing DP mechanism into one against temporal privacy leakage. Experiments with synthetic data confirm that our approach is efficient and effective.
    Comment: appears at ICDE 2017
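
    The Laplace mechanism named above as the standard primitive can be sketched as follows; the count stream and parameter values are illustrative, not from the paper:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    Each release is epsilon-DP in isolation, i.e. under the independence
    assumption the abstract points out; temporal correlations are ignored.
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return true_value + rng.laplace(0.0, scale)

# One private release per time step of a (hypothetical) count stream.
stream = [12, 15, 14, 20, 18]
noisy = [laplace_mechanism(v, sensitivity=1.0, epsilon=0.5) for v in stream]
```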

    Differentially Private Algorithms for Graphs Under Continual Observation

    Differentially private algorithms protect individuals in data analysis scenarios by ensuring that there is only a weak correlation between the existence of the user in the data and the result of the analysis. Dynamic graph algorithms maintain the solution to a problem (e.g., a matching) on an evolving input, i.e., a graph where nodes or edges are inserted or deleted over time. They output the value of the solution after each update operation, i.e., continuously. We study (event-level and user-level) differentially private algorithms for graph problems under continual observation, i.e., differentially private dynamic graph algorithms. We present event-level private algorithms for partially dynamic counting-based problems such as triangle count that improve the additive error by a polynomial factor (in the length T of the update sequence) over the state of the art, resulting in the first algorithms with additive error polylogarithmic in T. We also give ε-differentially private and partially dynamic algorithms for minimum spanning tree, minimum cut, densest subgraph, and maximum matching. The additive error of our improved MST algorithm is O(W log^{3/2} T / ε), where W is the maximum weight of any edge, which, as we show, is tight up to a (√(log T) / ε)-factor. For the other problems, we present a partially dynamic algorithm with multiplicative error (1 + β) for any constant β > 0 and additive error O(W log(nW) log(T) / (εβ)). Finally, we show that the additive error for a broad class of dynamic graph algorithms with user-level privacy must be linear in the value of the output solution's range.
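
    The graph algorithms above are involved, but the continual-observation setting they work in is commonly illustrated with the classic binary-tree counting mechanism (a standard building block, not this paper's contribution); a minimal sketch with polylogarithmic additive error in T:

```python
import math
import numpy as np

def continual_count(stream, epsilon, rng=None):
    """Binary-tree mechanism for counting under continual observation.

    Each stream element contributes to at most log2(T)+1 dyadic interval
    sums, so adding Laplace((log2(T)+1)/epsilon) noise to each interval
    keeps the whole output sequence epsilon-DP, with additive error
    polylogarithmic in the stream length T.
    """
    rng = rng or np.random.default_rng()
    T = len(stream)
    levels = math.floor(math.log2(T)) + 1 if T else 0
    scale = levels / epsilon
    alpha = [0.0] * (levels + 1)   # running dyadic interval sums
    noisy = [0.0] * (levels + 1)   # their noisy counterparts
    out = []
    for t, x in enumerate(stream, start=1):
        i = (t & -t).bit_length() - 1     # level of the interval closed at t
        alpha[i] = sum(alpha[:i]) + x     # merge lower levels into level i
        for j in range(i):                # and reset them
            alpha[j] = 0.0
            noisy[j] = 0.0
        noisy[i] = alpha[i] + rng.laplace(0.0, scale)
        # the dyadic decomposition of [1, t] is given by the set bits of t
        out.append(sum(noisy[j] for j in range(levels + 1) if t >> j & 1))
    return out

counts = continual_count([1, 0, 1, 1, 0, 1, 1, 1], epsilon=1.0)
```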

    Quantifying Differential Privacy in Continuous Data Release under Temporal Correlations

    Differential Privacy (DP) has received increasing attention as a rigorous privacy framework. Many existing studies employ traditional DP mechanisms (e.g., the Laplace mechanism) as primitives to continuously release private data for protecting privacy at each time point (i.e., event-level privacy), which assume that the data at different time points are independent, or that adversaries do not have knowledge of correlations between data. However, continuously generated data tend to be temporally correlated, and such correlations can be acquired by adversaries. In this paper, we investigate the potential privacy loss of a traditional DP mechanism under temporal correlations. First, we analyze the privacy leakage of a DP mechanism under temporal correlations that can be modeled using a Markov chain. Our analysis reveals that the event-level privacy loss of a DP mechanism may increase over time. We call this unexpected privacy loss temporal privacy leakage (TPL). Although TPL may increase over time, we find that its supremum may exist in some cases. Second, we design efficient algorithms for calculating TPL. Third, we propose data releasing mechanisms that convert any existing DP mechanism into one against TPL. Experiments confirm that our approach is efficient and effective.
    Comment: accepted in TKDE special issue "Best of ICDE 2017". arXiv admin note: substantial text overlap with arXiv:1610.0754
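
    The qualitative behavior described (leakage accumulating over time, with a finite supremum in some cases) can be illustrated with a stylized recurrence; the correlation parameter d and the recurrence below are illustrative stand-ins, not the paper's exact TPL formulas:

```python
import math

def tpl_sequence(eps, d, steps):
    """Stylized temporal-privacy-leakage recurrence (illustrative only).

    d in [0, 1] is a hypothetical correlation-strength parameter:
    d = 0 reduces to independent releases (leakage stays at eps),
    d = 1 means full correlation (leakage grows linearly in t),
    and for d < exp(-eps) the sequence converges to the finite supremum
    log((1 - d) / (exp(-eps) - d)).
    """
    L = 0.0
    out = []
    for _ in range(steps):
        L = eps + math.log(d * math.exp(L) + 1.0 - d)
        out.append(L)
    return out

leak = tpl_sequence(eps=0.1, d=0.5, steps=200)
```

With eps = 0.1 and d = 0.5 the leakage rises above the per-release 0.1 but levels off near its supremum, mirroring the "supremum may exist" finding.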

    Continuous Release of Data Streams under both Centralized and Local Differential Privacy

    In this paper, we study the problem of publishing a stream of real-valued data satisfying differential privacy (DP). One major challenge is that the maximal possible value can be quite large; thus it is necessary to estimate a threshold so that numbers above it are truncated, reducing the amount of noise that must be added to the data. The estimation must be done based on the data in a private fashion. We develop such a method that uses the Exponential Mechanism with a quality function that approximates the utility goal well while maintaining a low sensitivity. Given the threshold, we then propose a novel online hierarchical method and several post-processing techniques. Building on these ideas, we formalize the steps into a framework for private publishing of stream data. Our framework consists of three components: a threshold optimizer that privately estimates the threshold, a perturber that adds calibrated noise to the stream, and a smoother that improves the result using post-processing. Within our framework, we design an algorithm satisfying the more stringent setting of DP called local DP (LDP). To our knowledge, this is the first LDP algorithm for publishing streaming data. Using four real-world datasets, we demonstrate that our mechanism outperforms the state of the art by 6-10 orders of magnitude in terms of utility (measured by the mean squared error of answering a random range query).
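
    The private threshold-selection step can be sketched with a generic exponential mechanism; the quality function, data, and candidate grid below are hypothetical stand-ins for the paper's own design:

```python
import numpy as np

def exp_mech_threshold(data, candidates, epsilon, quality, sensitivity, rng=None):
    """Pick a truncation threshold via the exponential mechanism.

    Sampling a candidate c with probability proportional to
    exp(epsilon * quality(data, c) / (2 * sensitivity)) gives an
    epsilon-DP selection, provided `quality` has the stated sensitivity.
    """
    rng = rng or np.random.default_rng()
    scores = np.array([quality(data, c) for c in candidates], dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()          # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

# Toy quality function (sensitivity 1): reward covering many points,
# lightly penalize an unnecessarily large threshold.
def coverage_quality(data, c):
    return sum(1 for x in data if x <= c) - 0.01 * c

data = [3, 5, 7, 8, 9, 50, 120]
threshold = exp_mech_threshold(data, candidates=list(range(0, 200, 10)),
                               epsilon=1.0, quality=coverage_quality,
                               sensitivity=1.0)
```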

    Swellfish Privacy: Exploiting Time-Dependent Relevance for Continuous Differential Privacy : Technical Report

    Today, continuous publishing of differentially private query results is the de-facto standard. The challenge hereby is adding enough noise to satisfy a given privacy level, while adding as little noise as necessary to keep data utility high. In this context, we observe that the privacy goals of individuals vary significantly over time. For instance, one might aim to hide whether one is on vacation only during school holidays. This observation, named time-dependent relevance, implies two effects which, properly exploited, allow tuning data utility. The effects are time-variant sensitivity (TEAS) and time-variant number of affected query results (TINAR). As today's DP frameworks, by design, cannot exploit these effects, we propose Swellfish privacy. There, with policy collections, individuals can specify combinations of time-dependent privacy goals. Then, query results are Swellfish-private if the streams are indistinguishable with respect to such a collection. We propose two tools for designing Swellfish-private mechanisms, namely temporal sensitivity and a composition theorem, each allowing one of the effects to be exploited. In a realistic case study, we show empirically that exploiting both effects improves data utility by one to three orders of magnitude compared to state-of-the-art w-event DP mechanisms. Finally, we generalize the case study by showing how to estimate the strength of the effects for arbitrary use cases.
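
    The intuition behind time-variant sensitivity can be sketched roughly as follows, assuming a single relevance flag per time step and unit sensitivity inside relevant windows; this is an illustrative simplification, not the paper's mechanism:

```python
import numpy as np

def relevance_aware_release(stream, relevant, epsilon, rng=None):
    """Illustrative sketch of exploiting time-dependent relevance.

    relevant[t] marks whether some individual's policy protects time t.
    Outside relevant windows the (temporal) sensitivity drops to 0, so
    the true value can be released; inside, Laplace noise calibrated to
    sensitivity 1 is added.
    """
    rng = rng or np.random.default_rng()
    out = []
    for x, r in zip(stream, relevant):
        out.append(x + rng.laplace(0.0, 1.0 / epsilon) if r else float(x))
    return out
```

The utility gain comes from spending noise only where a policy actually demands protection, rather than uniformly over the whole stream.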

    Privacy, Space and Time: a Survey on Privacy-Preserving Continuous Data Publishing

    Sensors, portable devices, and location-based services generate massive amounts of geo-tagged and/or location- and user-related data on a daily basis. The manipulation of such data is useful in numerous application domains, e.g., healthcare, intelligent buildings, and traffic monitoring, to name a few. A high percentage of these data carry information about users' activities and other personal details, and thus their manipulation and sharing raise concerns about the privacy of the individuals involved. To enable data sharing that is secure from the users' privacy perspective, researchers have already proposed various seminal techniques for the protection of users' privacy. However, the continuous fashion in which data are generated nowadays, and the high availability of external sources of information, pose more threats and add extra challenges to the problem. In this survey, we review the work done on data privacy for continuous data publishing and report on the proposed solutions, with a special focus on solutions concerning location or geo-referenced data.

    Benchmarking the Utility of w-event Differential Privacy Mechanisms – When Baselines Become Mighty Competitors

    The w-event framework is the current standard for ensuring differential privacy on continuously monitored data streams. Following the proposition of w-event differential privacy, various mechanisms to implement the framework were proposed. Their comparability in empirical studies is vital both for practitioners to choose a suitable mechanism and for researchers to identify current limitations and propose novel mechanisms. By conducting a literature survey, we observe that the results of existing studies are hardly comparable and partially intrinsically inconsistent. To this end, we formalize an empirical study of w-event mechanisms by a four-tuple containing recurring elements found in our survey. We introduce requirements on these elements that ensure the comparability of experimental results. Moreover, we propose a benchmark that meets all requirements and establishes a new way to evaluate existing and newly proposed mechanisms. Conducting a large-scale empirical study, we gain valuable new insights into the strengths and weaknesses of existing mechanisms. An unexpected – yet explainable – result is a baseline supremacy, i.e., using one of the two baseline mechanisms is expected to deliver good or even the best utility. Finally, we provide guidelines for practitioners to select suitable mechanisms and improvement options for researchers to break the baseline supremacy.