On the Complexity of t-Closeness Anonymization and Related Problems
An important issue in releasing individual data is to protect the sensitive
information from being leaked and maliciously utilized. Well-known
privacy-preserving principles that aim to ensure both data privacy and data
integrity, such as k-anonymity and ℓ-diversity, have been extensively studied
both theoretically and empirically. Nonetheless, these widely adopted
principles are still insufficient to prevent attribute disclosure if the
attacker has partial knowledge about the overall sensitive data distribution.
The t-closeness principle has been proposed to address this weakness, and it
has the additional benefit of supporting numerical sensitive attributes.
However, in contrast to k-anonymity and ℓ-diversity, the theoretical aspects
of t-closeness have not been well investigated.
We initiate the first systematic theoretical study of the t-closeness
principle under the commonly used attribute-suppression model. We prove that
for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal
t-closeness generalization of a given table. The proof consists of several
reductions, each of which works for different values of t, which together
cover the full range. To complement this negative result, we also provide
exact and fixed-parameter algorithms. Finally, we answer some open questions
regarding the complexity of k-anonymity and ℓ-diversity left in the
literature.
Comment: An extended abstract to appear in DASFAA 2013.
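To make the checked property concrete, here is a minimal Python sketch (ours, not the paper's algorithm) of the t-closeness condition itself: a generalized table satisfies t-closeness if, in every equivalence class, the distribution of the sensitive attribute lies within distance t of its distribution over the whole table. The sketch uses total variation distance, which is what the Earth Mover's Distance reduces to for categorical attributes under a uniform ground distance; all names are illustrative.

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    total = len(values)
    return {v: c / total for v, c in Counter(values).items()}

def variational_distance(p, q):
    """Total variation distance between two categorical distributions;
    for a uniform ground distance, EMD reduces to this."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_t_closeness(sensitive, classes, t):
    """Check that every equivalence class's sensitive-attribute
    distribution is within distance t of the table-wide distribution.

    sensitive -- list with one sensitive value per record
    classes   -- partition of record indices into equivalence classes
    """
    overall = distribution(sensitive)
    for cls in classes:
        local = distribution([sensitive[i] for i in cls])
        if variational_distance(local, overall) > t:
            return False
    return True

# Toy table: the first class is all 'flu', far from the overall mix.
sensitive = ['flu', 'flu', 'cancer', 'flu', 'cancer', 'hiv']
classes = [[0, 1, 3], [2, 4, 5]]
print(satisfies_t_closeness(sensitive, classes, 0.2))  # False
print(satisfies_t_closeness(sensitive, classes, 0.5))  # True
```

Finding a generalization that passes this check while suppressing as few attribute values as possible is exactly the optimization problem shown NP-hard above.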
Almost Perfect Privacy for Additive Gaussian Privacy Filters
We study the maximal mutual information about a random variable Y
(representing non-private information) displayed through an additive Gaussian
channel when guaranteeing that only ε bits of information are leaked about a
random variable X (representing private information) that is correlated with
Y. Denoting this quantity by g(ε), we show that for perfect privacy, i.e.,
ε = 0, one has g(0) = 0 for any pair of absolutely continuous random
variables, and we then derive a second-order approximation of g(ε) for small
ε. This approximation is shown to be related to the strong data processing
inequality for mutual information under suitable conditions on the joint
distribution P_XY. Next, motivated by an operational interpretation of data
privacy, we formulate the privacy-utility tradeoff in the same setup using
estimation-theoretic quantities and obtain explicit bounds for this tradeoff
when ε is sufficiently small, using the approximation formula derived for
g(ε).
Comment: 20 pages. To appear in Springer-Verlag.
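As a hedged illustration of the setup (not the paper's derivation), the sketch below evaluates the leakage-utility pair for the special case where X and Y are jointly Gaussian and the filter simply adds independent Gaussian noise, Z = Y + N; in this case both mutual informations have closed forms. The variable names and the purely additive filter are our assumptions.

```python
import numpy as np

# Jointly Gaussian toy model: X (private) and Y (non-private) have
# correlation rho; the privacy filter releases Z = Y + N, N ~ N(0, var_n).
rho, var_y = 0.8, 1.0

def mi_gaussian(r2):
    """I(A;B) in nats for jointly Gaussian A, B with squared correlation r2."""
    return -0.5 * np.log(1.0 - r2)

def tradeoff(var_n):
    """Return (leakage I(X;Z), utility I(Y;Z)) for noise variance var_n."""
    r2_yz = var_y / (var_y + var_n)   # squared correlation of Y and Z
    r2_xz = rho ** 2 * r2_yz          # squared correlation of X and Z (X-Y-Z chain)
    return mi_gaussian(r2_xz), mi_gaussian(r2_yz)

# Increasing the noise variance drives the leakage toward 0 (perfect
# privacy) and the utility toward 0 with it, tracing the tradeoff curve
# whose small-eps behavior the paper approximates.
for var_n in (0.1, 1.0, 10.0, 100.0):
    eps, util = tradeoff(var_n)
    print(f"var_n={var_n:7.1f}  leakage={eps:.4f}  utility={util:.4f}")
```

Both quantities vanish together as the noise grows, consistent with g(0) = 0: in this continuous setting, perfect privacy forces zero utility.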
Privacy-preserving enhanced collaborative tagging
Collaborative tagging is one of the most popular services available online; it allows end users to loosely classify either online or offline resources based on their feedback, expressed in the form of free-text labels (i.e., tags). Although tags may not be sensitive information per se, the wide use of collaborative tagging services increases the risk of cross-referencing, thereby seriously compromising user privacy. In this paper, we make a first contribution toward the development of a privacy-preserving collaborative tagging service by showing how a specific privacy-enhancing technology, namely tag suppression, can be used to protect end-user privacy. Moreover, we analyze how our approach can affect the effectiveness of a policy-based collaborative tagging system that supports enhanced web access functionalities, such as content filtering and discovery, based on preferences specified by end users.
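The paper treats tag suppression formally; purely as an illustration of the idea, the following greedy Python sketch suppresses a bounded number of a user's tags, each time from the category most over-represented relative to the population profile, so the published profile moves toward the population distribution (measured here with KL divergence). The greedy rule and all names are ours, not the paper's mechanism.

```python
from collections import Counter
import math

def profile(tags):
    """Empirical category distribution of a list of tags."""
    total = len(tags)
    return {k: c / total for k, c in Counter(tags).items()}

def kl_divergence(p, q):
    """D(p || q) in bits; q must be positive wherever p is."""
    return sum(pv * math.log2(pv / q[k]) for k, pv in p.items() if pv > 0)

def suppress_tags(user_tags, population, budget):
    """Greedy tag suppression: drop up to `budget` tags, always from
    the category whose share most exceeds the population's."""
    counts = Counter(user_tags)
    for _ in range(budget):
        p = profile(list(counts.elements()))
        worst = max(p, key=lambda k: p[k] - population.get(k, 0.0))
        if p[worst] <= population.get(worst, 0.0):
            break  # no category is over-represented any more
        counts[worst] -= 1
        if counts[worst] == 0:
            del counts[worst]
    return list(counts.elements())

# Toy example: a user whose 'health' tags stand out against the crowd.
population = {'news': 0.4, 'music': 0.4, 'health': 0.2}
user = ['health'] * 6 + ['news'] * 2 + ['music'] * 2
kept = suppress_tags(user, population, budget=3)
print(kl_divergence(profile(user), population))  # ~0.55 bits before
print(kl_divergence(profile(kept), population))  # ~0.19 bits after
```

The privacy gain comes at a utility cost: suppressed tags are no longer available to the content filtering and discovery functions discussed above, which is precisely the tradeoff the paper analyzes.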