Differential Privacy: on the trade-off between Utility and Information Leakage
Differential privacy is a notion of privacy that has become very popular in
the database community. Roughly, the idea is that a randomized query mechanism
provides sufficient privacy protection if the ratio between the probabilities
that two adjacent datasets produce the same answer is bounded by e^epsilon. In
the field of information flow there is a similar concern for controlling
information leakage, i.e. limiting the possibility of inferring the secret
information from the observables. In recent years, researchers have proposed to
quantify the leakage in terms of Rényi min mutual information, a notion
strictly related to the Bayes risk. In this paper, we show how to model the
query system in terms of an information-theoretic channel, and we compare the
notion of differential privacy with that of mutual information. We show that
differential privacy implies a bound on the mutual information (but not
vice versa). Furthermore, we show that our bound is tight. Then, we consider
the utility of the randomization mechanism, which represents how close the
randomized answers are, on average, to the real ones. We show that the notion
of differential privacy implies a bound on utility, also tight, and we propose
a method that, under certain conditions, builds an optimal randomization
mechanism, i.e. a mechanism which provides the best utility while guaranteeing
differential privacy.
Comment: 30 pages; HAL repository
Emerging privacy challenges and approaches in CAV systems
The growth of Internet-connected devices, Internet-enabled services and Internet of Things systems continues at a rapid pace, and their application to transport systems is heralded as game-changing. Numerous developing CAV (Connected and Autonomous Vehicle) functions, such as traffic planning, optimisation, management, safety-critical and cooperative autonomous driving applications, rely on data from various sources. The efficacy of these functions is highly dependent on the dimensionality, amount and accuracy of the data being shared. It holds, in general, that the greater the amount of data available, the greater the efficacy of the function. However, much of this data is privacy-sensitive, including personal, commercial and research data. Location data and its correlation with identity and temporal data can help infer other personal information, such as home/work locations, age, job, behavioural features, habits and social relationships. This work categorises the emerging privacy challenges and solutions for CAV systems and identifies the knowledge gaps for future research that will minimise and mitigate privacy concerns without hampering the efficacy of these functions.