3 research outputs found

Privacy Preserving K-means Clustering with Chaotic Distortion

Authors: Chu Chao, Li Jie, Wang Yunfeng, Xu Yong
Publication venue: AIS Electronic Library (AISeL)
Publication date: 02/12/2007
Randomized data distortion is a popular method for masking data to preserve privacy. However, its appropriateness has been questioned because it can disclose the original data. In this paper, a chaotic system, with its characteristic sensitivity to initial conditions and unpredictability, is advocated for distorting original data containing sensitive information for privacy-preserving k-means clustering. A chaotic distortion procedure is proposed, and three performance metrics specific to k-means clustering are developed. We use a large-scale experiment (with 4 real-world data sets and 40 corresponding reproduced data sets) to evaluate its performance. Our study shows that the proposed approach is effective; it not only protects individual privacy but also maintains the original information of cluster cente
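
The abstract does not say which chaotic system or distortion procedure the paper uses. As a rough illustration of the general idea only, the sketch below uses the logistic map with r = 4 as a stand-in chaotic source and adds its centered output to a column of values. The function names, the `scale` parameter, and the `burn_in` length are assumptions made for this example and are not taken from the paper.

```python
def logistic_sequence(x0, n, r=4.0, burn_in=100):
    """Return n values of the logistic map x -> r*x*(1-x), started at x0.

    r=4.0 puts the map in its chaotic regime. The first burn_in iterates
    are discarded so the output does not sit close to the seed.
    (Illustrative stand-in; the paper's chaotic system is not specified.)
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def chaotic_distort(values, x0, scale=1.0):
    """Add centered chaotic noise to each value.

    Each map output lies in [0, 1]; subtracting 0.5 centers it, so every
    value is shifted by at most scale/2. x0 acts as a secret seed.
    """
    noise = logistic_sequence(x0, len(values))
    return [v + scale * (z - 0.5) for v, z in zip(values, noise)]


if __name__ == "__main__":
    ages = [23.0, 35.0, 41.0, 58.0, 62.0]
    print(chaotic_distort(ages, x0=0.3141, scale=2.0))
    print(chaotic_distort(ages, x0=0.3142, scale=2.0))  # nearby seed
```

The same seed always reproduces the same distortion. Because the map is sensitive to its initial condition, a nearby seed would be expected to give an unrelated noise sequence. Keeping `scale` small relative to the spacing between clusters is what would let k-means centers survive the distortion in a scheme of this kind.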

Privacy Preserving Clustering In Data Mining

Authors: Kumar J, Sinha B K
Publication date: 14/05/2010
Huge volumes of detailed personal data are regularly collected, and sharing these data has proved beneficial for data mining applications. Such data include shopping habits, criminal records, medical history, credit records, etc. On the one hand, such data are an important asset that business organizations and governments analyze for decision making. On the other hand, privacy regulations and other privacy concerns may prevent data owners from sharing information for data analysis. To share data while preserving privacy, the data owner must find a solution that achieves the dual goal of privacy preservation and accurate clustering results. As a solution, we implemented a piecewise vector quantization approach: each row of the dataset is divided into segments, quantization is performed on each segment using K-means, and the quantized segments are then reunited to form a transformed dataset. Experimental results are presented that seek the segment size and quantization parameter giving the best trade-off between clustering utility and data privacy in the input dataset.
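
The piecewise vector quantization steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under assumptions, not the thesis implementation: the function names, the plain Lloyd-style k-means, the fixed iteration count, and the reading of the "quantization parameter" as the number of centroids `k` per segment are all choices made for this example.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Returns (centroids, labels). Initial centroids are k points sampled
    from the data, so k must not exceed len(points).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels


def piecewise_vq(rows, segment_size, k, seed=0):
    """Quantize each column segment separately, then rejoin the segments.

    Every row is cut into consecutive segments of segment_size columns.
    For each segment position, k-means runs over all rows, and each row's
    segment is replaced by the centroid of its cluster.
    """
    n_cols = len(rows[0])
    transformed = [[] for _ in rows]
    for start in range(0, n_cols, segment_size):
        segments = [tuple(r[start:start + segment_size]) for r in rows]
        centroids, labels = kmeans(segments, k, seed=seed)
        for i, lab in enumerate(labels):
            transformed[i].extend(centroids[lab])
    return transformed


if __name__ == "__main__":
    data = [
        [1.0, 1.0, 10.0, 10.0],
        [1.2, 0.8, 10.2, 9.8],
        [5.0, 5.0, 20.0, 20.0],
        [5.2, 4.8, 20.2, 19.8],
    ]
    for row in piecewise_vq(data, segment_size=2, k=2):
        print(row)
```

In this sketch a smaller `k` or a larger `segment_size` replaces more rows with the same centroid values, which hides more individual detail but distorts the data further. That is the utility and privacy trade-off the abstract refers to.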

ethesis@nitr

Privacy-Preserving Clustering on Distributed Databases: A Review and Some Contributions

Authors: Flavius L. Gorgônio, José Alfredo F. Costa
Publication venue: IntechOpen
Publication date: 21/01/2011