
    On the Complexity of t-Closeness Anonymization and Related Problems

    An important issue in releasing individual data is how to protect sensitive information from being leaked and maliciously utilized. Famous privacy-preserving principles that aim to ensure both data privacy and data integrity, such as k-anonymity and l-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The t-closeness principle has been proposed to fix this, and it also has the benefit of supporting numerical sensitive attributes. However, in contrast to k-anonymity and l-diversity, the theoretical aspects of t-closeness have not been well investigated. We initiate the first systematic theoretical study of the t-closeness principle under the commonly used attribute suppression model. We prove that for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal t-closeness generalization of a given table. The proof consists of several reductions, each of which works for different values of t, and together they cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of k-anonymity and l-diversity left in the literature.
    Comment: An extended abstract to appear in DASFAA 201
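
    As a rough illustration of the principle studied in this paper (not the paper's own algorithm), the sketch below checks whether a single equivalence class violates t-closeness for a categorical sensitive attribute. It assumes the equal-distance ground metric, under which the Earth Mover's Distance between the class distribution and the overall table distribution reduces to the total variation distance; the table, the group, and the threshold t are hypothetical.

        from collections import Counter

        def distribution(values):
            """Empirical distribution of a sensitive attribute as {value: probability}."""
            counts = Counter(values)
            total = len(values)
            return {v: c / total for v, c in counts.items()}

        def t_closeness_violated(class_values, table_values, t):
            """Check whether one equivalence class violates t-closeness.

            With the equal-distance ground metric for categorical attributes,
            the Earth Mover's Distance reduces to total variation distance.
            """
            p = distribution(class_values)   # distribution within the class
            q = distribution(table_values)   # distribution in the whole table
            support = set(p) | set(q)
            emd = 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)
            return emd > t

        # Hypothetical released table: one quasi-identifier group whose sensitive
        # attribute is skewed relative to the overall table.
        table = ["flu", "flu", "cancer", "flu", "hiv", "flu", "flu", "cancer"]
        group = ["cancer", "hiv", "cancer"]
        print(t_closeness_violated(group, table, t=0.3))  # True: the group leaks information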

    Almost Perfect Privacy for Additive Gaussian Privacy Filters

    We study the maximal mutual information about a random variable Y (representing non-private information) displayed through an additive Gaussian channel when guaranteeing that only ε bits of information are leaked about a random variable X (representing private information) that is correlated with Y. Denoting this quantity by g_ε(X, Y), we show that for perfect privacy, i.e., ε = 0, one has g_0(X, Y) = 0 for any pair of absolutely continuous random variables (X, Y), and we then derive a second-order approximation of g_ε(X, Y) for small ε. This approximation is shown to be related to the strong data processing inequality for mutual information under suitable conditions on the joint distribution P_XY. Next, motivated by an operational interpretation of data privacy, we formulate the privacy-utility tradeoff in the same setup using estimation-theoretic quantities and obtain explicit bounds for this tradeoff when ε is sufficiently small, using the approximation formula derived for g_ε(X, Y).
    Comment: 20 pages. To appear in Springer-Verla
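
    As a purely illustrative companion to this setup (not the paper's derivation), the sketch below evaluates the leakage I(X; Z) and the utility I(Y; Z) for the simplest case of jointly Gaussian X and Y passed through an additive Gaussian channel Z = Y + N, where both mutual informations have standard closed forms; the correlation coefficient and noise variances are hypothetical choices.

        import math

        def gaussian_leakage_and_utility(rho, var_y=1.0, noise_var=1.0):
            """Mutual information through an additive Gaussian channel Z = Y + N.

            X and Y are assumed jointly Gaussian with correlation coefficient rho,
            and N is independent Gaussian noise. Closed forms (in nats):
                I(Y; Z) = 0.5 * ln(1 + var_y / noise_var)                        # utility
                I(X; Z) = -0.5 * ln(1 - rho**2 * var_y / (var_y + noise_var))    # leakage
            """
            utility = 0.5 * math.log(1.0 + var_y / noise_var)
            rho_xz_sq = rho**2 * var_y / (var_y + noise_var)
            leakage = -0.5 * math.log(1.0 - rho_xz_sq)
            return leakage, utility

        # Sweeping the noise variance: more noise means less leakage about X (privacy)
        # but also less information about Y (utility), mirroring the tradeoff behind g_eps(X, Y).
        for noise_var in (0.5, 1.0, 4.0, 16.0):
            leakage, utility = gaussian_leakage_and_utility(rho=0.8, noise_var=noise_var)
            print(f"noise_var={noise_var:5.1f}  I(X;Z)={leakage:.3f}  I(Y;Z)={utility:.3f}")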

    Privacy-preserving enhanced collaborative tagging

    Collaborative tagging is one of the most popular services available online; it allows end users to loosely classify either online or offline resources based on their feedback, expressed in the form of free-text labels (i.e., tags). Although tags may not be per se sensitive information, the wide use of collaborative tagging services increases the risk of cross-referencing, thereby seriously compromising user privacy. In this paper, we make a first contribution toward the development of a privacy-preserving collaborative tagging service by showing how a specific privacy-enhancing technology, namely tag suppression, can be used to protect end-user privacy. Moreover, we analyze how our approach can affect the effectiveness of a policy-based collaborative tagging system that supports enhanced web access functionalities, such as content filtering and discovery, based on preferences specified by end users.
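
    To make the idea of tag suppression concrete, the following is a generic sketch, not the mechanism evaluated in this paper: it drops the tags whose frequency in a user's profile deviates most from a hypothetical population profile, up to a chosen suppression rate, so the published profile is less distinguishable.

        def suppress_tags(user_tags, population_profile, suppression_rate=0.2):
            """Return the user's tag list with the most revealing tags suppressed.

            A tag is scored by how much more often the user applies it than the
            population does; the highest-scoring tags are dropped until roughly
            `suppression_rate` of the user's tags have been removed. This is a
            generic illustration of tag suppression, not the paper's exact policy.
            """
            total = len(user_tags)
            user_profile = {t: user_tags.count(t) / total for t in set(user_tags)}
            # Score tags by their deviation from the population profile.
            deviation = {t: user_profile[t] - population_profile.get(t, 0.0) for t in user_profile}
            to_suppress = set()
            budget = int(suppression_rate * total)
            removed = 0
            for tag in sorted(deviation, key=deviation.get, reverse=True):
                occurrences = user_tags.count(tag)
                if removed + occurrences > budget:
                    break
                to_suppress.add(tag)
                removed += occurrences
            return [t for t in user_tags if t not in to_suppress]

        # Hypothetical example: a user who tags health-related content far more than average.
        population = {"music": 0.4, "news": 0.3, "health": 0.05, "sports": 0.25}
        user = ["health", "health", "health", "music", "news", "health", "music", "news", "news", "sports"]
        print(suppress_tags(user, population, suppression_rate=0.4))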