13,228 research outputs found
Privacy-Aware Processing of Biometric Templates by Means of Secure Two-Party Computation
The use of biometric data for person identification and access control is gaining more and more popularity. Handling biometric data, however, requires particular care, since biometric data is indissolubly tied to the identity of the owner hence raising important security and privacy issues. This chapter focuses on the latter, presenting an innovative approach that, by relying on tools borrowed from Secure Two Party Computation (STPC) theory, permits to process the biometric data in encrypted form, thus eliminating any risk that private biometric information is leaked during an identification process. The basic concepts behind STPC are reviewed together with the basic cryptographic primitives needed to achieve privacy-aware processing of biometric data in a STPC context. The two main approaches proposed so far, namely homomorphic encryption and garbled circuits, are discussed and the way such techniques can be used to develop a full biometric matching protocol described. Some general guidelines to be used in the design of a privacy-aware biometric system are given, so as to allow the reader to choose the most appropriate tools depending on the application at hand
When the signal is in the noise: Exploiting Diffix's Sticky Noise
Anonymized data is highly valuable to both businesses and researchers. A
large body of research has however shown the strong limits of the
de-identification release-and-forget model, where data is anonymized and
shared. This has led to the development of privacy-preserving query-based
systems. Based on the idea of "sticky noise", Diffix has been recently proposed
as a novel query-based mechanism satisfying alone the EU Article~29 Working
Party's definition of anonymization. According to its authors, Diffix adds less
noise to answers than solutions based on differential privacy while allowing
for an unlimited number of queries.
This paper presents a new class of noise-exploitation attacks, exploiting the
noise added by the system to infer private information about individuals in the
dataset. Our first differential attack uses samples extracted from Diffix in a
likelihood ratio test to discriminate between two probability distributions. We
show that using this attack against a synthetic best-case dataset allows us to
infer private information with 89.4% accuracy using only 5 attributes. Our
second cloning attack uses dummy conditions that conditionally strongly affect
the output of the query depending on the value of the private attribute. Using
this attack on four real-world datasets, we show that we can infer private
attributes of at least 93% of the users in the dataset with accuracy between
93.3% and 97.1%, issuing a median of 304 queries per user. We show how to
optimize this attack, targeting 55.4% of the users and achieving 91.7%
accuracy, using a maximum of only 32 queries per user.
Our attacks demonstrate that adding data-dependent noise, as done by Diffix,
is not sufficient to prevent inference of private attributes. We furthermore
argue that Diffix alone fails to satisfy Art. 29 WP's definition of
anonymization. [...
Biometric Backdoors: A Poisoning Attack Against Unsupervised Template Updating
In this work, we investigate the concept of biometric backdoors: a template
poisoning attack on biometric systems that allows adversaries to stealthily and
effortlessly impersonate users in the long-term by exploiting the template
update procedure. We show that such attacks can be carried out even by
attackers with physical limitations (no digital access to the sensor) and zero
knowledge of training data (they know neither decision boundaries nor user
template). Based on the adversaries' own templates, they craft several
intermediate samples that incrementally bridge the distance between their own
template and the legitimate user's. As these adversarial samples are added to
the template, the attacker is eventually accepted alongside the legitimate
user. To avoid detection, we design the attack to minimize the number of
rejected samples.
We design our method to cope with the weak assumptions for the attacker and
we evaluate the effectiveness of this approach on state-of-the-art face
recognition pipelines based on deep neural networks. We find that in scenarios
where the deep network is known, adversaries can successfully carry out the
attack over 70% of cases with less than ten injection attempts. Even in
black-box scenarios, we find that exploiting the transferability of adversarial
samples from surrogate models can lead to successful attacks in around 15% of
cases. Finally, we design a poisoning detection technique that leverages the
consistent directionality of template updates in feature space to discriminate
between legitimate and malicious updates. We evaluate such a countermeasure
with a set of intra-user variability factors which may present the same
directionality characteristics, obtaining equal error rates for the detection
between 7-14% and leading to over 99% of attacks being detected after only two
sample injections.Comment: 12 page
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. {\em Differential privacy} has recently emerged
as the {\em de facto} standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques on differentially private discovery of
frequent {\em itemsets} cannot apply in mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantee of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision
Privacy Preserving Utility Mining: A Survey
In big data era, the collected data usually contains rich information and
hidden knowledge. Utility-oriented pattern mining and analytics have shown a
powerful ability to explore these ubiquitous data, which may be collected from
various fields and applications, such as market basket analysis, retail,
click-stream analysis, medical analysis, and bioinformatics. However, analysis
of these data with sensitive private information raises privacy concerns. To
achieve better trade-off between utility maximizing and privacy preserving,
Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent
years. In this paper, we provide a comprehensive overview of PPUM. We first
present the background of utility mining, privacy-preserving data mining and
PPUM, then introduce the related preliminaries and problem formulation of PPUM,
as well as some key evaluation criteria for PPUM. In particular, we present and
discuss the current state-of-the-art PPUM algorithms, as well as their
advantages and deficiencies in detail. Finally, we highlight and discuss some
technical challenges and open directions for future research on PPUM.Comment: 2018 IEEE International Conference on Big Data, 10 page
Towards trajectory anonymization: a generalization-based approach
Trajectory datasets are becoming popular due to the massive usage of GPS and locationbased services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We first adopt the notion of k-anonymity to trajectories and propose a novel generalization-based approach for anonymization of trajectories. We further show that releasing
anonymized trajectories may still have some privacy leaks. Therefore we propose a randomization based reconstruction algorithm for releasing anonymized trajectory data and also present how the underlying techniques can be adapted to other anonymity standards. The experimental results on real and synthetic trajectory datasets show the effectiveness of the proposed techniques
- …
