Reconstruction Methods for Providing Privacy in Data Mining
Data mining is the process of finding correlations or patterns among the dozens of fields in a large database. A fruitful direction for data mining research is the development of techniques that incorporate privacy concerns. The primary concern of our paper is that the accurate data we retrieve should be altered to some degree before being provided to users. For this reason, much recent research effort has been devoted to the problem of providing security in data mining. We consider the concrete case of building a decision tree classifier from data in which the values of individual records have been reconstructed. The resulting data records look very different from the original records, and the distribution of data values is also very different from the original distribution. By using these reconstructed distributions, we are able to build classifiers whose accuracy is comparable to that of classifiers built with the original data.
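The abstract above does not say how the distributions are reconstructed. As a rough illustration only, the sketch below assumes each value was perturbed by adding noise drawn from a known discrete distribution, and estimates the original value distribution with an iterative Bayesian update. The function name `reconstruct_distribution`, its parameters, and the sample data are hypothetical and are not taken from the paper.

```python
def reconstruct_distribution(perturbed, candidates, noise_pmf, iterations=100):
    """Estimate the distribution of original values from perturbed ones.

    Assumption (not from the abstract): perturbed = original + noise, where
    noise_pmf maps each possible noise value to its known probability.
    """
    # Start from a uniform guess over the candidate original values.
    estimate = {v: 1.0 / len(candidates) for v in candidates}
    for _ in range(iterations):
        updated = {v: 0.0 for v in candidates}
        for w in perturbed:
            # Posterior weight of each candidate original value given w.
            weights = {v: noise_pmf.get(w - v, 0.0) * estimate[v] for v in candidates}
            total = sum(weights.values())
            if total == 0.0:
                continue  # w cannot be explained by any candidate; skip it
            for v in candidates:
                updated[v] += weights[v] / total
        mass = sum(updated.values())
        if mass == 0.0:
            break  # nothing could be explained; keep the previous estimate
        estimate = {v: updated[v] / mass for v in candidates}
    return estimate


# Hypothetical example: original values in {1, 2, 3}, noise of -1 or +1.
noisy_values = [0, 2, 2, 4, 3, 1, 2, 0]
estimated = reconstruct_distribution(noisy_values, [1, 2, 3], {-1: 0.5, 1: 0.5})
```

A classifier would then be trained against the estimated distribution rather than against individual original records, which are never recovered exactly.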
A New Class of Attacks on Time Series Data Mining
Traditional research on preserving privacy in data mining focuses on time-invariant privacy issues. With the emergence of time series data mining, traditional snapshot-based privacy issues need to be extended to multiple dimensions with the addition of the time dimension. We find that current techniques for preserving privacy in data mining are not effective in preserving time-domain privacy. We present the data flow separation attack on privacy in time series data mining, which is based on blind source separation techniques from statistical signal processing. Our experiments with real data show that this attack is effective. By combining the data flow separation method and the frequency matching method, an attacker can identify data sources and compromise time-domain privacy. We propose possible countermeasures to the data flow separation attack in the paper.
Identity Disclosure Protection: A Data Reconstruction Approach for Preserving Privacy in Data Mining
Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk
It is well recognised that data mining and statistical analysis pose a
serious threat to privacy. This is true for financial, medical, criminal and
marketing research. Numerous techniques have been proposed to protect privacy,
including restriction and data modification. Recently proposed privacy models
such as differential privacy and k-anonymity have received a lot of attention,
and for the latter there are now several improvements of the original scheme,
each removing some security shortcomings of the previous one. However, the
challenge lies in evaluating and comparing the privacy provided by various
techniques. In this paper we propose a novel entropy-based security measure
that can be applied to any generalisation, restriction or data modification
technique. We use our measure to empirically evaluate and compare a few popular
methods, namely query restriction, sampling and noise addition.
Comment: 20 pages, 4 figures
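The abstract does not define its measure, so the following is only a generic illustration of the entropy idea: Shannon entropy quantifies how uncertain an attacker remains about a confidential value, and a drop in entropy after a data release indicates disclosure. The function name and the example distributions are assumptions made for this sketch, not the paper's definitions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution.

    The input is normalised first, so unnormalised weights are accepted.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("distribution must have positive total mass")
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)


# Hypothetical attacker beliefs about a confidential value with four options.
prior = [0.25, 0.25, 0.25, 0.25]      # before any data is released
posterior = [0.7, 0.1, 0.1, 0.1]      # after seeing the released data

# Bits of uncertainty the attacker lost because of the release.
uncertainty_lost = shannon_entropy(prior) - shannon_entropy(posterior)
```

Under this reading, a protection technique that leaves the posterior closer to the prior loses fewer bits and discloses less.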
Privacy Preserving Access of Outsourced Data in Heterogeneous Databases
Privacy is a main concern in the present technological era. Information security has become a critical issue because information sharing is now a common need. Privacy issues have increased enormously in recent years as the internet flourishes with forums, social media, blogs, e-commerce, etc. Retaining privacy in data mining has therefore become a research area. The sensitive data of a data owner should not be known to third parties or to other data owners. To improve privacy and efficiency, horizontal partitioning of the heterogeneous databases is introduced. We address the major issues of privacy preservation in data mining. In particular, we consider providing protection and privacy between different data owners by partitioning the databases horizontally, where the data are held in heterogeneous databases. Our proposed work centers on the study of privacy preservation in anonymous databases and on devising private update techniques for database systems that support notions of anonymity other than k-anonymity. We use a symmetric homomorphic encryption scheme, which is significantly more efficient than asymmetric schemes. Our proposed work helps a valid user extract the horizontally partitioned data through key issuance in an automated way.
When and where do you want to hide? Recommendation of location privacy preferences with local differential privacy
In recent years, it has become easy to obtain location information quite
precisely. However, the acquisition of such information has risks such as
individual identification and leakage of sensitive information, so it is
necessary to protect the privacy of location information. For this purpose,
people should know their location privacy preferences, that is, whether or not
they can release location information at each place and time. However, it is
not easy for each user to make such decisions and it is troublesome to set the
privacy preference at each time. Therefore, we propose a method to recommend
location privacy preferences for decision making. Compared to the existing
method, our method can improve the accuracy of recommendation by using matrix
factorization and preserve privacy strictly by local differential privacy,
whereas the existing method does not achieve a formal privacy guarantee. In
addition, we found the best granularity of a location privacy preference, that
is, how to express the information in location privacy protection. To evaluate
and verify the utility of our method, we have integrated two existing datasets
to create a dataset that is rich in terms of the number of users. From the
results of the evaluation using this dataset, we confirmed that our method can
predict location privacy preferences accurately and that it provides a suitable
method to define the location privacy preference.
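The abstract names local differential privacy but not the mechanism used. One standard mechanism for a single binary preference (release or hide at a given place and time) is randomized response, sketched below as an illustration; it is not claimed to be the paper's mechanism, and the function names, epsilon value, and simulated users are hypothetical.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it.

    This satisfies epsilon-local differential privacy for one binary value.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from noisy reports.

    Requires epsilon > 0 so that the truth probability differs from 1/2.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Simulated users: 30% are willing to release their location (bit = 1).
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimated_share = estimate_fraction(reports, 1.0)
```

Each user perturbs their own answer before it leaves their device, so the collector never sees a true preference, yet the aggregate share can still be estimated.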
- …