5 research outputs found

    Enabling Multi-level Trust in Privacy Preserving Data Mining

    Full text link
    Privacy Preserving Data Mining (PPDM) addresses the problem of developing accurate models about aggregated data without access to precise information in individual data record. A widely studied \emph{perturbation-based PPDM} approach introduces random perturbation to individual values to preserve privacy before data is published. Previous solutions of this approach are limited in their tacit assumption of single-level trust on data miners. In this work, we relax this assumption and expand the scope of perturbation-based PPDM to Multi-Level Trust (MLT-PPDM). In our setting, the more trusted a data miner is, the less perturbed copy of the data it can access. Under this setting, a malicious data miner may have access to differently perturbed copies of the same data through various means, and may combine these diverse copies to jointly infer additional information about the original data that the data owner does not intend to release. Preventing such \emph{diversity attacks} is the key challenge of providing MLT-PPDM services. We address this challenge by properly correlating perturbation across copies at different trust levels. We prove that our solution is robust against diversity attacks with respect to our privacy goal. That is, for data miners who have access to an arbitrary collection of the perturbed copies, our solution prevent them from jointly reconstructing the original data more accurately than the best effort using any individual copy in the collection. Our solution allows a data owner to generate perturbed copies of its data for arbitrary trust levels on-demand. This feature offers data owners maximum flexibility.Comment: 20 pages, 5 figures. Accepted for publication in IEEE Transactions on Knowledge and Data Engineerin

    A Data Mining Perspective in Privacy Preserving Data Mining Systems

    Get PDF
    Privacy Preserving Data Mining () presents a novel framework for extracting and deriving information when the data is distributed amongst the multiple parties. The privacy preservation of data and the use of efficient data mining algorithms in systems is a major issue that exists. Most of the existing systems employ the cryptographic key exchange process and the key computation process accomplished by means of certain trusted server or a third party. To eliminate the key exchange and key computation overheads this paper discusses the Key Distribution-Less Privacy Preserving Data Mining () system. The novelty of the system is that no data is published but only the association rules are published to achieve effective data mining results. The embodies the data mining algorithm for classification rule generation and data mining. The results discussed in this paper compare the based system with the based system and the efficiency in rule generation, overhead reduction and classification efficiency of the latter is proved

    INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

    Get PDF
    Cryptographic approaches are traditional and preferred methodologies used to preserve the privacy of data released for analysis. Privacy Preserving Data Mining (PPDM) is a new trend to derive knowledge when the data is available with multiple parties involved. The PPDM deployments that currently exist involve cryptographic key exchange and key computation achieved through a trusted server or a third party. The key computation over heads, key compromise in presence of dishonest parties and shared data integrity are the key challenges that exist. This research work discusses the provisioning of data privacy using commutative RSA algorithms eliminating the overheads of secure key distribution, storage and key update mechanisms generally used to secure the data to be used for analysis. Decision Tree algorithms are used for analysis of the data provided by the various parties involved. We have considered the C5. 0 data mining algorithm for analysis due to its efficiency over the currently prevalent algorithms like C4. 5 and ID3. In this paper the major emphasis is to provide a platform for secure communication, preserving privacy of the vertically partitioned data available with the parties involved in the semi-honest trust model. The proposed Key Distribution-Less Privacy Preserving Data Mining () model is compared with other protocols like Secure Lock and Access Control Polynomial to prove its efficiency in terms of the computational overheads observed in preserving privacy. The experiential evaluations proves the reduces the computational overheads by about 95.96% when compared to the Secure Lock model and is similar to the

    Effects of Information Filters: A Phenomenon on the Web

    Get PDF
    In the Internet era, information processing for personalization and relevance has been one of the key topics of research and development. It ranges from design of applications like search engines, web crawlers, learning engines to reverse image searches, audio processed search, auto complete, etc. Information retrieval plays a vital role in most of the above mentioned applications. A part of information retrieval which deals with personalization and rendering is often referred to as Information Filtering. The emphasis of this paper is to empirically analyze the information filters commonly seen and to analyze their correctness and effects. The measure of correctness is not in terms of percentage of correct results but instead a rational approach of analysis using a non mathematical argument is presented. Filters employed by Google’s search engine are used to analyse the effects of filtering on the web. A plausible

    Privacy-Preserving Frequent Pattern Mining Across Private Databases

    No full text
    Privacy consideration has much significance in the application of data mining. It is very important that the privacy of individual parties will not be exposed when data mining techniques are applied to a large collection of data about the parties. In many scenarios such as data warehousing or data integration, data from the different parties form a many-to-many schema. This paper addresses the problem of privacy-preserving frequent pattern mining in such a schema across two dimension sites. We assume that sites are not trusted and they are semi-honest. Our method is based on the concept of semi-join and does not involve data encryption which is used in most previous work. Experiments are conducted to study the efficiency of the proposed models.
    corecore