
213 research outputs found

A GA-Based Approach to Hide Sensitive High Utility Itemsets

Author: Chun-Wei Lin
Guo-Cheng Lan
Jia-Wei Wong
Tzung-Pei Hong
Wen-Yang Lin
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

A GA-based privacy-preserving utility mining method is proposed to find appropriate transactions to insert into the database in order to hide sensitive high utility itemsets. It keeps information loss low while still providing information to data demanders and protecting high-risk information in the database. A flexible evaluation function with three factors is designed to evaluate whether the processed transactions need to be inserted. Three different weights are assigned to the three factors, respectively, according to users' preferences. Moreover, the downward closure property and the prelarge concept are adopted to reduce the cost of rescanning the database, thus speeding up the evaluation of chromosomes.
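A weighted three-factor evaluation function of the kind described above might be sketched as follows. The abstract does not name the three factors or state how they are combined, so the weighted sum, the `Weights` class, the `fitness` function, and the treatment of each factor as a cost to be minimized are all assumptions made for illustration, not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """User-chosen weights for the three factors (hypothetical names)."""
    w1: float
    w2: float
    w3: float


def fitness(factors, weights):
    """Weighted sum of three factor values for one chromosome.

    Assumption: each factor is a cost, so a lower score is better.
    """
    f1, f2, f3 = factors
    return weights.w1 * f1 + weights.w2 * f2 + weights.w3 * f3


def best_chromosome(candidates, weights):
    """Pick the candidate whose factor triple has the lowest weighted score."""
    return min(candidates, key=lambda factors: fitness(factors, weights))
```

Under these assumptions, changing the weights shifts which candidate is preferred, which is one way to read the claim that the function is flexible with respect to users.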

Directory of Open Access Journals

Privacy-by-design in big data analytics and social mining

Author: Giannotti Fosca
Monreale Anna
Pedreschi Dino
Pratesi Francesca
Rinzivillo Salvatore
Publication date: 01/01/2014

Open Access Repository

Privacy Preservation by Disassociation

Author: Liagouris John
Mamoulis Nikos
Skiadopoulos Spiros
Terrovitis Manolis
Publication date: 01/01/2012

In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real-world applications the above techniques are not applicable. For instance, consider web search query logs. Anonymization methods based on suppression or generalization would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms which cannot be categorized as sensitive or non-sensitive, since a term may be sensitive for one user and non-sensitive for another. Motivated by this observation, we propose an anonymization technique termed disassociation that preserves the original terms but hides the fact that two or more different terms appear in the same record. We protect the users' privacy by disassociating record terms that participate in identifying combinations. This way the adversary cannot associate, with high probability, a record with a rare combination of terms. To the best of our knowledge, our proposal is the first to employ such a technique to provide protection against identity disclosure. We propose an anonymization algorithm based on our approach and evaluate its performance on real and synthetic datasets, comparing it against other state-of-the-art methods based on generalization and differential privacy. Comment: VLDB201
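To make the notion of an "identifying combination" concrete, the sketch below finds pairs of terms that co-occur in fewer than `k` records. It is not the disassociation algorithm itself: the function name `identifying_pairs`, the threshold `k`, and the restriction to pairs are assumptions chosen for illustration, and the sample records are invented.

```python
from collections import Counter
from itertools import combinations


def identifying_pairs(records, k):
    """Return term pairs that co-occur in fewer than k records.

    Assumption: a combination is treated as identifying when it is rarer
    than k. Only pairs are checked here to keep the sketch short.
    """
    counts = Counter()
    for record in records:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(record)), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n < k}


# Invented sample data: three query-log records.
records = [
    {"flu", "madonna"},
    {"flu", "madonna"},
    {"flu", "rare disease"},
]
rare = identifying_pairs(records, k=2)
```

A method in the spirit of the abstract would then keep the original terms but publish the members of each rare pair so that they no longer appear together in one record.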

arXiv.org e-Print Archive

CiteSeerX

HKU Scholars Hub