2,601 research outputs found
Preserving differential privacy under finite-precision semantics
The approximation introduced by finite-precision representation of continuous data can induce arbitrarily large information leaks even when the computation using exact semantics is secure. Such leakage can thus undermine design efforts aimed at protecting sensitive information. We focus here on differential privacy, an approach to privacy that emerged from the area of statistical databases and is now also widely applied in other domains. In this approach, privacy is protected by adding noise to the values correlated to the private data. The typical mechanisms used to achieve differential privacy have been proved correct in the ideal case in which computations are made using infinite-precision semantics. In this paper, we analyze the situation at the implementation level, where the semantics is necessarily limited by finite precision, i.e., the representation of real numbers and the operations on them are rounded according to some level of precision. We show that in general there are violations of the differential privacy property, and we study the conditions under which we can still guarantee a limited (but, arguably, acceptable) variant of the property, under only a minor degradation of the privacy level. Finally, we illustrate our results on two examples: the standard Laplacian mechanism commonly used in differential privacy, and a bivariate version of it recently introduced in the setting of privacy-aware geolocation.
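As a concrete illustration (not the paper's formal construction), the sketch below contrasts the ideal Laplacian mechanism with a finite-precision variant whose output is snapped to a fixed grid; the function names and the grid granularity are illustrative assumptions:

```python
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Ideal Laplacian mechanism: epsilon-differentially private
    under exact real-number (infinite-precision) semantics."""
    scale = sensitivity / epsilon
    # A Laplace sample is an exponential magnitude with a random sign.
    noise = random.choice((-1.0, 1.0)) * random.expovariate(1.0 / scale)
    return true_value + noise

def rounded_laplace_mechanism(true_value, sensitivity, epsilon, grid=0.01):
    """Finite-precision variant: the released value is rounded to a
    discrete grid, mimicking an implementation's limited precision.
    Naive floating-point rounding can leak information; rounding to a
    fixed grid is one way to retain a (slightly degraded) guarantee."""
    noisy = laplace_mechanism(true_value, sensitivity, epsilon)
    return round(noisy / grid) * grid
```

The grid-rounding step is only a stand-in for the precision-aware analysis the paper carries out; the point is that the set of reachable outputs becomes discrete, which is what the finite-precision semantics changes relative to the ideal mechanism.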
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy.
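The marginal-cost-equals-marginal-benefit rule can be made concrete with a toy calculation. The functional forms below (logarithmic benefit from accuracy and a linear privacy cost, both in the privacy-loss parameter epsilon) are hypothetical assumptions for illustration only, not the paper's calibrated model:

```python
import math

def net_benefit(eps, a=1.0, c=0.1):
    # Hypothetical forms: the benefit of accuracy grows with epsilon with
    # diminishing returns; the social cost of privacy loss grows linearly.
    benefit = a * math.log(1.0 + eps)
    cost = c * eps
    return benefit - cost

# Search a grid of epsilon values for the social optimum, i.e. the point
# where marginal benefit a/(1+eps) equals marginal cost c. With the toy
# parameters above the optimum is eps* = a/c - 1 = 9.
best_eps = max((e / 100.0 for e in range(1, 1001)), key=net_benefit)
```

Tighter privacy (smaller epsilon) is not automatically better in this framing: below the optimum, the marginal accuracy forgone exceeds the marginal privacy gained.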
TraVaS: Differentially Private Trace Variant Selection for Process Mining
In the area of industrial process mining, privacy-preserving event data
publication is becoming increasingly relevant. Consequently, the trade-off
between high data utility and quantifiable privacy poses new challenges.
State-of-the-art research mainly focuses on differentially private trace
variant construction based on prefix expansion methods. However, these
algorithms face several practical limitations such as high computational
complexity, introducing fake variants, removing frequent variants, and a
bounded variant length. In this paper, we introduce a new approach for direct
differentially private trace variant release that uses anonymized partition
selection strategies to overcome the aforementioned restraints. Experimental
results on real-life event data show that our algorithm outperforms
state-of-the-art methods in terms of both plain data utility and result
utility preservation.
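A minimal sketch of the noisy-threshold idea behind differentially private partition selection is shown below. The function name and threshold value are illustrative assumptions; TraVaS's actual mechanism, including how the threshold is calibrated from the delta parameter, is more refined:

```python
import random

def dp_partition_selection(variant_counts, epsilon, threshold):
    """Sketch of noisy-threshold partition selection: add Laplace noise
    (scale 1/epsilon, since each trace changes one count by one) to each
    observed variant's count and release only variants whose noisy count
    clears the threshold. Because only observed variants are tested, no
    fake variants are introduced and variant length is unbounded."""
    released = {}
    for variant, count in variant_counts.items():
        noise = random.choice((-1.0, 1.0)) * random.expovariate(epsilon)
        noisy = count + noise
        if noisy >= threshold:
            released[variant] = noisy
    return released
```

Frequent variants survive thresholding with high probability, while rare variants are suppressed rather than distorted, which is the source of the utility advantage over prefix-expansion approaches described above.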
TraVaG: Differentially Private Trace Variant Generation Using GANs
Process mining is rapidly growing in the industry. Consequently, privacy
concerns regarding sensitive and private information included in event data,
used by process mining algorithms, are becoming increasingly relevant.
State-of-the-art research mainly focuses on providing privacy guarantees, e.g.,
differential privacy, for trace variants that are used by the main process
mining techniques, e.g., process discovery. However, privacy preservation
techniques for releasing trace variants still do not fulfill all the
requirements of industry-scale usage. Moreover, providing privacy guarantees
when there exists a high rate of infrequent trace variants is still a
challenge. In this paper, we introduce TraVaG as a new approach for releasing
differentially private trace variants based on Generative Adversarial
Networks (GANs) that provides industry-scale benefits and enhances the level
of privacy guarantees when there exists a high ratio of infrequent variants.
Moreover, TraVaG overcomes shortcomings of conventional privacy preservation
techniques, such as bounding the length of variants and introducing fake
variants. Experimental results on real-life event data show that our approach
outperforms state-of-the-art techniques in terms of privacy guarantees, plain
data utility preservation, and result utility preservation.