50 research outputs found

    Privacy Preserving ID3 over Horizontally, Vertically and Grid Partitioned Data

    We consider privacy preserving decision tree induction via ID3 in the case where the training data is horizontally or vertically distributed. Furthermore, we consider the same problem in the case where the data is both horizontally and vertically distributed, a situation we refer to as grid partitioned data. We give an algorithm for privacy preserving ID3 over horizontally partitioned data involving more than two parties. For grid partitioned data, we discuss two different evaluation methods for privacy preserving ID3, namely first merging horizontally and then developing vertically, or first merging vertically and then developing horizontally. Besides introducing privacy preserving data mining over grid partitioned data, the main contribution of this paper is to show, by means of a complexity analysis, that the former evaluation method is the more efficient. Comment: 25 pages
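
    To illustrate why ID3 lends itself to horizontally partitioned data, the sketch below computes the information gain of an attribute from per-party counts alone. It is not the paper's protocol: the counts are summed in the clear, where a privacy preserving scheme would use a secure sum or similar primitive. The function names and the toy "outlook"/"play" records are assumptions made for this example.

```python
import math
from collections import Counter


def local_counts(rows, attr, label):
    """Count (attribute value, class label) pairs in one party's rows."""
    return Counter((row[attr], row[label]) for row in rows)


def entropy(class_counts):
    """Shannon entropy in bits of a mapping from class to count."""
    total = sum(class_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts.values() if c > 0)


def information_gain(joint_counts):
    """ID3 information gain computed from aggregated (value, class) counts."""
    class_totals = Counter()
    per_value = {}
    for (value, cls), n in joint_counts.items():
        class_totals[cls] += n
        per_value.setdefault(value, Counter())[cls] += n
    total = sum(class_totals.values())
    # Weighted entropy of the class label within each attribute value.
    remainder = sum(sum(counts.values()) / total * entropy(counts)
                    for counts in per_value.values())
    return entropy(class_totals) - remainder


# Hypothetical toy data: both parties hold the same attributes, different rows.
party_a = [{"outlook": "sunny", "play": "no"},
           {"outlook": "sunny", "play": "no"},
           {"outlook": "rain", "play": "yes"}]
party_b = [{"outlook": "rain", "play": "yes"}]

# ASSUMPTION: plain addition stands in for a secure aggregation step.
merged = local_counts(party_a, "outlook", "play") + local_counts(party_b, "outlook", "play")
gain = information_gain(merged)
```

    Because the counts are additive across parties, the gain obtained from `merged` matches what a single party holding all rows would compute.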

    A New Approach of Detecting Network Anomalies using Improved ID3 with Horizontal Partioning Based Decision Tree

    In this paper we propose a new approach to detecting network anomalies using an improved ID3 with a horizontal partitioning based decision tree. We first apply different clustering algorithms, then apply the horizontal partitioning decision tree, and then check for network anomalies using the decision tree. We also present a comparative analysis of different clustering algorithms and the existing ID3 based decision tree.

    Effects of Information Filters: A Phenomenon on the Web

    In the Internet era, information processing for personalization and relevance has been one of the key topics of research and development. It ranges from the design of applications like search engines, web crawlers and learning engines to reverse image search, audio-based search, autocomplete, etc. Information retrieval plays a vital role in most of the above-mentioned applications. The part of information retrieval that deals with personalization and rendering is often referred to as Information Filtering. The emphasis of this paper is to empirically analyze commonly seen information filters and to analyze their correctness and effects. Correctness is not measured as a percentage of correct results; instead, a rational analysis using a non-mathematical argument is presented. Filters employed by Google’s search engine are used to analyse the effects of filtering on the web. A plausible

    INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

    Cryptographic approaches are traditional and preferred methodologies used to preserve the privacy of data released for analysis. Privacy Preserving Data Mining (PPDM) is a new trend to derive knowledge when the data is available with multiple parties involved. The PPDM deployments that currently exist involve cryptographic key exchange and key computation achieved through a trusted server or a third party. The key computation overheads, key compromise in the presence of dishonest parties, and shared data integrity are the key challenges that exist. This research work discusses the provisioning of data privacy using commutative RSA algorithms, eliminating the overheads of the secure key distribution, storage and key update mechanisms generally used to secure the data to be used for analysis. Decision tree algorithms are used for analysis of the data provided by the various parties involved. We have considered the C5.0 data mining algorithm for analysis due to its efficiency over currently prevalent algorithms like C4.5 and ID3. In this paper the major emphasis is to provide a platform for secure communication, preserving privacy of the vertically partitioned data available with the parties involved in the semi-honest trust model. The proposed Key Distribution-Less Privacy Preserving Data Mining model is compared with other protocols like Secure Lock and Access Control Polynomial to prove its efficiency in terms of the computational overheads observed in preserving privacy. The experimental evaluations show that the proposed model reduces the computational overheads by about 95.96% when compared to the Secure Lock model and is similar to the
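
    The commutative property that such schemes rely on can be shown with a small sketch. The abstract does not spell out the paper's construction, so the example below swaps in a Pohlig-Hellman style exponentiation cipher over a shared prime modulus, one well-known way to obtain commutativity. The modulus, exponents, message and function names are all assumptions chosen for illustration, and the parameters are far too small for real use.

```python
from math import gcd

# ASSUMPTION: a public prime modulus shared by both parties (toy size).
P = 2**61 - 1


def make_key(e):
    """Return (encryption exponent, decryption exponent) for the shared modulus P."""
    if gcd(e, P - 1) != 1:
        raise ValueError("exponent must be coprime to P - 1")
    return e, pow(e, -1, P - 1)  # modular inverse, Python 3.8+


def encrypt(m, e):
    """Encrypt by modular exponentiation: m^e mod P."""
    return pow(m, e, P)


def decrypt(c, d):
    """Decrypt by raising to the inverse exponent: c^d mod P."""
    return pow(c, d, P)


# Hypothetical secret exponents for two parties, A and B.
e_a, d_a = make_key(65537)
e_b, d_b = make_key(257)

m = 123456789  # hypothetical message, 0 < m < P

# The order in which the two parties encrypt does not matter.
ab = encrypt(encrypt(m, e_a), e_b)
ba = encrypt(encrypt(m, e_b), e_a)

# The layers can also be removed in either order.
recovered = decrypt(decrypt(ab, d_a), d_b)
```

    Here `ab` and `ba` are equal, which lets parties compare or combine doubly encrypted values without first agreeing on a shared key.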

    Towards federated multivariate statistical process control (FedMSPC)

    The ongoing transition from a linear (produce-use-dispose) to a circular economy poses significant challenges to current state-of-the-art information and communication technologies. In particular, the derivation of integrated, high-level views on material, process, and product streams from (real-time) data produced along value chains is challenging for several reasons. Most importantly, sufficiently rich data is often available yet not shared across company borders because of privacy concerns which make it impossible to build integrated process models that capture the interrelations between input materials, process parameters, and key performance indicators along value chains. In the current contribution, we propose a privacy-preserving, federated multivariate statistical process control (FedMSPC) framework based on Federated Principal Component Analysis (PCA) and Secure Multiparty Computation to foster the incentive for closer collaboration of stakeholders along value chains. We tested our approach on two industrial benchmark data sets - SECOM and ST-AWFD. Our empirical results demonstrate the superior fault detection capability of the proposed approach compared to standard, single-party (multiway) PCA. Furthermore, we showcase the possibility of our framework to provide privacy-preserving fault diagnosis to each data holder in the value chain to underpin the benefits of secure data sharing and federated process modeling

    A Data Mining Perspective in Privacy Preserving Data Mining Systems

    Privacy Preserving Data Mining (PPDM) presents a novel framework for extracting and deriving information when the data is distributed amongst multiple parties. Preserving the privacy of data while using efficient data mining algorithms in such systems is a major open issue. Most existing systems employ a cryptographic key exchange process and a key computation process carried out by a trusted server or a third party. To eliminate the key exchange and key computation overheads, this paper discusses the Key Distribution-Less Privacy Preserving Data Mining system. The novelty of the system is that no data is published; only the association rules are published to achieve effective data mining results. The system embodies the data mining algorithm for classification rule generation and data mining. The results discussed in this paper compare the based system with the based system and the efficiency in rule generation, overhead reduction and classification efficiency of the latter is proved

    Parallel and Distributed Data Mining
