24,081 research outputs found
On the Measurement of Privacy as an Attacker's Estimation Error
A wide variety of privacy metrics have been proposed in the literature to
evaluate the level of protection offered by privacy enhancing-technologies.
Most of these metrics are specific to concrete systems and adversarial models,
and are difficult to generalize or translate to other contexts. Furthermore, a
better understanding of the relationships between the different privacy metrics
is needed to enable more grounded and systematic approach to measuring privacy,
as well as to assist systems designers in selecting the most appropriate metric
for a given application.
In this work we propose a theoretical framework for privacy-preserving
systems, endowed with a general definition of privacy in terms of the
estimation error incurred by an attacker who aims to disclose the private
information that the system is designed to conceal. We show that our framework
permits interpreting and comparing a number of well-known metrics under a
common perspective. The arguments behind these interpretations are based on
fundamental results related to the theories of information, probability and
Bayes decision.Comment: This paper has 18 pages and 17 figure
A look ahead approach to secure multi-party protocols
Secure multi-party protocols have been proposed to enable non-colluding parties to cooperate without a trusted server. Even though such protocols prevent information disclosure other than the objective function, they are quite costly
in computation and communication. Therefore, the high overhead makes it necessary for parties to estimate the utility that can be achieved as a result of the protocol beforehand. In this paper, we propose a look ahead approach, specifically for secure multi-party protocols to achieve distributed
k-anonymity, which helps parties to decide if the utility benefit from the protocol is within an acceptable range before initiating the protocol. Look ahead operation is highly localized and its accuracy depends on the amount of information the parties are willing to share. Experimental results show
the effectiveness of the proposed methods
User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy
Recommender systems have become an integral part of many social networks and
extract knowledge from a user's personal and sensitive data both explicitly,
with the user's knowledge, and implicitly. This trend has created major privacy
concerns as users are mostly unaware of what data and how much data is being
used and how securely it is used. In this context, several works have been done
to address privacy concerns for usage in online social network data and by
recommender systems. This paper surveys the main privacy concerns, measurements
and privacy-preserving techniques used in large-scale online social networks
and recommender systems. It is based on historical works on security,
privacy-preserving, statistical modeling, and datasets to provide an overview
of the technical difficulties and problems associated with privacy preserving
in online social networks.Comment: 26 pages, IET book chapter on big data recommender system
Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk
It is well recognised that data mining and statistical analysis pose a
serious treat to privacy. This is true for financial, medical, criminal and
marketing research. Numerous techniques have been proposed to protect privacy,
including restriction and data modification. Recently proposed privacy models
such as differential privacy and k-anonymity received a lot of attention and
for the latter there are now several improvements of the original scheme, each
removing some security shortcomings of the previous one. However, the challenge
lies in evaluating and comparing privacy provided by various techniques. In
this paper we propose a novel entropy based security measure that can be
applied to any generalisation, restriction or data modification technique. We
use our measure to empirically evaluate and compare a few popular methods,
namely query restriction, sampling and noise addition.Comment: 20 pages, 4 figure
Privacy Preservation by Disassociation
In this work, we focus on protection against identity disclosure in the
publication of sparse multidimensional data. Existing multidimensional
anonymization techniquesa) protect the privacy of users either by altering the
set of quasi-identifiers of the original data (e.g., by generalization or
suppression) or by adding noise (e.g., using differential privacy) and/or (b)
assume a clear distinction between sensitive and non-sensitive information and
sever the possible linkage. In many real world applications the above
techniques are not applicable. For instance, consider web search query logs.
Suppressing or generalizing anonymization methods would remove the most
valuable information in the dataset: the original query terms. Additionally,
web search query logs contain millions of query terms which cannot be
categorized as sensitive or non-sensitive since a term may be sensitive for a
user and non-sensitive for another. Motivated by this observation, we propose
an anonymization technique termed disassociation that preserves the original
terms but hides the fact that two or more different terms appear in the same
record. We protect the users' privacy by disassociating record terms that
participate in identifying combinations. This way the adversary cannot
associate with high probability a record with a rare combination of terms. To
the best of our knowledge, our proposal is the first to employ such a technique
to provide protection against identity disclosure. We propose an anonymization
algorithm based on our approach and evaluate its performance on real and
synthetic datasets, comparing it against other state-of-the-art methods based
on generalization and differential privacy.Comment: VLDB201
Publishing Microdata with a Robust Privacy Guarantee
Today, the publication of microdata poses a privacy threat. Vast research has
striven to define the privacy condition that microdata should satisfy before it
is released, and devise algorithms to anonymize the data so as to achieve this
condition. Yet, no method proposed to date explicitly bounds the percentage of
information an adversary gains after seeing the published data for each
sensitive value therein. This paper introduces beta-likeness, an appropriately
robust privacy model for microdata anonymization, along with two anonymization
schemes designed therefor, the one based on generalization, and the other based
on perturbation. Our model postulates that an adversary's confidence on the
likelihood of a certain sensitive-attribute (SA) value should not increase, in
relative difference terms, by more than a predefined threshold. Our techniques
aim to satisfy a given beta threshold with little information loss. We
experimentally demonstrate that (i) our model provides an effective privacy
guarantee in a way that predecessor models cannot, (ii) our generalization
scheme is more effective and efficient in its task than methods adapting
algorithms for the k-anonymity model, and (iii) our perturbation method
outperforms a baseline approach. Moreover, we discuss in detail the resistance
of our model and methods to attacks proposed in previous research.Comment: VLDB201
Anonymization of Sensitive Quasi-Identifiers for l-diversity and t-closeness
A number of studies on privacy-preserving data mining have been proposed. Most of them assume that they can separate quasi-identifiers (QIDs) from sensitive attributes. For instance, they assume that address, job, and age are QIDs but are not sensitive attributes and that a disease name is a sensitive attribute but is not a QID. However, all of these attributes can have features that are both sensitive attributes and QIDs in practice. In this paper, we refer to these attributes as sensitive QIDs and we propose novel privacy models, namely, (l1, ..., lq)-diversity and (t1, ..., tq)-closeness, and a method that can treat sensitive QIDs. Our method is composed of two algorithms: an anonymization algorithm and a reconstruction algorithm. The anonymization algorithm, which is conducted by data holders, is simple but effective, whereas the reconstruction algorithm, which is conducted by data analyzers, can be conducted according to each data analyzer’s objective. Our proposed method was experimentally evaluated using real data sets
- …