Privacy Preservation by Disassociation
In this work, we focus on protection against identity disclosure in the
publication of sparse multidimensional data. Existing multidimensional
anonymization techniques (a) protect the privacy of users either by altering
the set of quasi-identifiers of the original data (e.g., by generalization or
suppression) or by adding noise (e.g., using differential privacy), and/or (b)
assume a clear distinction between sensitive and non-sensitive information and
sever the possible linkage. In many real-world applications, the above
techniques are not applicable. For instance, consider web search query logs.
Anonymization methods based on suppression or generalization would remove the
most valuable information in the dataset: the original query terms. Additionally,
web search query logs contain millions of query terms that cannot be
categorized as sensitive or non-sensitive, since a term may be sensitive for
one user and non-sensitive for another. Motivated by this observation, we propose
an anonymization technique termed disassociation that preserves the original
terms but hides the fact that two or more different terms appear in the same
record. We protect the users' privacy by disassociating record terms that
participate in identifying combinations. This way, the adversary cannot
associate a record with a rare combination of terms with high probability. To
the best of our knowledge, our proposal is the first to employ such a technique
to provide protection against identity disclosure. We propose an anonymization
algorithm based on our approach and evaluate its performance on real and
synthetic datasets, comparing it against other state-of-the-art methods based
on generalization and differential privacy.
Comment: VLDB201
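The core idea of disassociation, preserving original terms while hiding that rare term combinations co-occur in one record, can be illustrated with a toy sketch. The function below is a hypothetical simplification, not the paper's algorithm: it splits each record into chunks so that no chunk contains a pair of terms that co-occurs in fewer than `k` records overall.

```python
from itertools import combinations
from collections import Counter

def disassociate(records, k=2):
    """Toy disassociation sketch (hypothetical, not the paper's algorithm):
    split each record into chunks so that no chunk contains a pair of terms
    that co-occurs in fewer than k records in the whole dataset."""
    # Count how many records each unordered term pair co-occurs in.
    pair_counts = Counter()
    for rec in records:
        for pair in combinations(sorted(set(rec)), 2):
            pair_counts[pair] += 1

    def is_safe(chunk, term):
        # A term may join a chunk only if every pair it forms is frequent.
        return all(pair_counts[tuple(sorted((term, t)))] >= k for t in chunk)

    anonymized = []
    for rec in records:
        chunks = []
        for term in rec:
            # Greedily place the term in the first chunk with no rare pair.
            for chunk in chunks:
                if is_safe(chunk, term):
                    chunk.append(term)
                    break
            else:
                chunks.append([term])
        anonymized.append(chunks)
    return anonymized
```

A record containing a rare term combination is published in separate chunks, so an adversary who knows a rare pair of terms can no longer pin it to a single record, while every original term survives intact.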
Quantification of De-anonymization Risks in Social Networks
The risks of publishing privacy-sensitive data have received considerable
attention recently. Several de-anonymization attacks have been proposed to
re-identify individuals even if data anonymization techniques were applied.
However, there is no theoretical quantification relating the data utility
preserved by anonymization techniques to the data's vulnerability against
de-anonymization attacks.
In this paper, we theoretically analyze the de-anonymization attacks and
provide conditions on the utility of the anonymized data (denoted by anonymized
utility) to achieve successful de-anonymization. To the best of our knowledge,
this is the first work on quantifying the relationships between anonymized
utility and de-anonymization capability. Unlike previous work, our
quantification analysis requires no assumptions about the graph model, thus
providing a general theoretical guide for developing practical
de-anonymization/anonymization techniques.
Furthermore, we evaluate state-of-the-art de-anonymization attacks on a
real-world Facebook dataset to show the limitations of previous work. By
comparing these experimental results and the theoretically achievable
de-anonymization capability derived in our analysis, we further demonstrate the
ineffectiveness of previous de-anonymization attacks and the potential of more
powerful de-anonymization attacks in the future.
Comment: Published in International Conference on Information Systems Security
and Privacy, 201
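To make the attack side of this analysis concrete, the following is a generic sketch of seed-based structural de-anonymization, the family of attacks typically evaluated on social-network datasets; it is an illustrative toy, not any specific attack from the paper. Starting from a few known seed mappings, it greedily maps each anonymized node to the auxiliary identity that shares the most already-mapped neighbors.

```python
def deanonymize(anon_adj, aux_adj, seeds):
    """Toy seed-based structural de-anonymization (generic sketch):
    map anonymized nodes to auxiliary identities by counting how many
    of a node's already-mapped neighbors are adjacent to each candidate."""
    mapping = dict(seeds)          # anonymized node -> auxiliary identity
    used = set(mapping.values())
    for v in (n for n in anon_adj if n not in mapping):
        best, best_score = None, 0
        for u in aux_adj:
            if u in used:
                continue
            # Score: mapped neighbors of v whose images neighbor u.
            score = sum(1 for w in anon_adj[v]
                        if w in mapping and mapping[w] in aux_adj[u])
            if score > best_score:
                best, best_score = u, score
        if best is not None:
            mapping[v] = best
            used.add(best)
    return mapping
```

The success rate of such greedy matching degrades as anonymization removes structural utility, which is exactly the utility/vulnerability trade-off the paper quantifies.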
Preserving Both Privacy and Utility in Network Trace Anonymization
As network security monitoring grows more sophisticated, there is an
increasing need for outsourcing such tasks to third-party analysts. However,
organizations are usually reluctant to share their network traces due to
privacy concerns over sensitive information, e.g., network and system
configuration, which may potentially be exploited for attacks. In cases where
data owners are convinced to share their network traces, the data are typically
subjected to certain anonymization techniques, e.g., CryptoPAn, which replaces
real IP addresses with prefix-preserving pseudonyms. However, most such
techniques either are vulnerable to adversaries with prior knowledge about some
network flows in the traces, or require heavy data sanitization or
perturbation, both of which may result in a significant loss of data utility.
In this paper, we aim to preserve both privacy and utility by shifting the
trade-off from privacy versus utility to privacy versus computational cost.
The key idea is for the analysts to generate and analyze multiple
anonymized views of the original network traces; those views are designed to be
sufficiently indistinguishable even to adversaries armed with prior knowledge,
which preserves the privacy, whereas one of the views will yield true analysis
results privately retrieved by the data owner, which preserves the utility. We
present the general approach and instantiate it based on CryptoPAn. We formally
analyze the privacy of our solution and experimentally evaluate it using real
network traces provided by a major ISP. The results show that our approach can
significantly reduce the level of information leakage (e.g., less than 1% of
the information leaked by CryptoPAn) with comparable utility.
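The prefix-preserving property that CryptoPAn provides can be demonstrated with a simplified keyed-hash variant. This is a toy in the spirit of CryptoPAn, not its actual AES-based construction: bit i of the output depends only on bits 0..i-1 of the input, so two addresses sharing an n-bit prefix map to pseudonyms sharing an n-bit prefix.

```python
import hashlib

def pp_anonymize(ip, key):
    """Simplified prefix-preserving IPv4 pseudonymization (keyed-hash toy,
    not the real CryptoPAn cipher): whether bit i is flipped depends only
    on the key and the preceding i bits, preserving shared prefixes."""
    bits = "".join(f"{int(octet):08b}" for octet in ip.split("."))
    out = []
    for i in range(32):
        # Keyed hash of the i-bit prefix decides whether to flip bit i.
        h = hashlib.sha256((key + "|" + bits[:i]).encode()).digest()
        out.append(str(int(bits[i]) ^ (h[0] & 1)))
    b = "".join(out)
    return ".".join(str(int(b[j:j + 8], 2)) for j in range(0, 32, 8))
```

Because subnet structure survives pseudonymization, flow-level analyses remain meaningful, but the same property is what lets an adversary with prior knowledge of some flows infer others, the vulnerability this paper's multi-view approach addresses.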
A look ahead approach to secure multi-party protocols
Secure multi-party protocols have been proposed to enable non-colluding parties to cooperate without a trusted server. Even though such protocols prevent information disclosure other than the objective function, they are quite costly
in computation and communication. Therefore, the high overhead makes it necessary for parties to estimate beforehand the utility that can be achieved as a result of the protocol. In this paper, we propose a look-ahead approach, specifically for secure multi-party protocols to achieve distributed
k-anonymity, which helps parties decide whether the utility benefit from the protocol is within an acceptable range before initiating it. The look-ahead operation is highly localized, and its accuracy depends on the amount of information the parties are willing to share. Experimental results show
the effectiveness of the proposed methods.
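The flavor of such a localized look-ahead can be sketched as follows. This is a hypothetical heuristic, not the paper's method: from its local data alone, a party estimates the fraction of records whose quasi-identifier group already meets the k-anonymity threshold, a cheap proxy for the utility it could expect before committing to the costly protocol.

```python
from collections import Counter

def local_utility_estimate(records, k):
    """Hypothetical look-ahead heuristic (not the paper's method):
    fraction of a party's local records whose quasi-identifier group
    has size >= k, i.e., records needing no suppression/generalization."""
    groups = Counter(tuple(r) for r in records)
    unsafe = sum(count for count in groups.values() if count < k)
    return 1 - unsafe / len(records)
```

A party whose estimate falls below an acceptable threshold can decline the protocol up front, avoiding the communication and computation overhead the paper highlights.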