
    Novel Approach to Hide Sensitive Association Rules by Introducing Transaction Affinity

    This paper proposes a novel approach for hiding sensitive association rules based on the affinity between the frequent items of a transaction. The affinity between items is defined as their Jaccard similarity. Five algorithms are proposed to minimize the side effects that result from applying sanitization algorithms to hide sensitive knowledge. The paper introduces transaction affinity, calculated by summing the affinity of each frequent item present in the transaction with the victim item (the item to be modified). Transactions are selected for data distortion in either increasing or decreasing order of affinity. The first two algorithms, MaxaffinityDSR and MinaffinityDSR, hide the sensitive information by selecting the victim item from the right-hand side of the sensitive association rule. The next two algorithms, MaxaffinityDSL and MinaffinityDSL, select the victim item from the left-hand side of the rule, whereas the Hybrid approach picks the victim item from either side. The performance of the proposed algorithms is evaluated against state-of-the-art methods (Algo 1.a and Algo 1.b) and the MinFIA, MaxFIA and Naive algorithms. The experiments used a dataset generated with the IBM synthetic data generator, and the implementation was written in R.
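
The transaction-affinity idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. It assumes that the Jaccard similarity of two items is taken over the sets of transaction IDs containing them, and that only transactions containing the victim item are candidates for modification; the abstract states neither detail. All function names are hypothetical.

```python
from typing import Dict, List, Set


def tid_sets(transactions: List[Set[str]]) -> Dict[str, Set[int]]:
    """Map each item to the set of transaction IDs that contain it."""
    tids: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tids.setdefault(item, set()).add(tid)
    return tids


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def transaction_affinity(transaction: Set[str], victim: str,
                         frequent_items: Set[str],
                         tids: Dict[str, Set[int]]) -> float:
    """Sum the affinity between the victim item and every other frequent
    item present in the transaction."""
    return sum(
        jaccard(tids[item], tids[victim])
        for item in transaction
        if item in frequent_items and item != victim
    )


def rank_transactions(transactions: List[Set[str]], victim: str,
                      frequent_items: Set[str],
                      descending: bool = True) -> List[int]:
    """Return IDs of transactions containing the victim item (assumption),
    ordered by transaction affinity."""
    tids = tid_sets(transactions)
    candidates = [tid for tid, t in enumerate(transactions) if victim in t]
    return sorted(
        candidates,
        key=lambda tid: transaction_affinity(
            transactions[tid], victim, frequent_items, tids),
        reverse=descending,
    )
```

Calling `rank_transactions` with `descending=True` or `descending=False` loosely corresponds to the "Max" and "Min" selection orders named in the abstract, although the exact selection rules of the published algorithms may differ.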

    DISTORTION-BASED HEURISTIC METHOD FOR SENSITIVE ASSOCIATION RULE HIDING

    In the past few years, privacy issues have received considerable attention in the data mining literature. However, the problem of data security cannot be solved simply by restricting data collection or by guarding against unauthorized access. It requires solutions that protect sensitive information without degrading the accuracy of data mining results and without disclosing sensitive knowledge related to individual privacy or competitive advantage in business. Sensitive association rule hiding is an important issue in privacy-preserving data mining. The aim of association rule hiding is to conceal sensitive rules while minimizing the side effects on the sanitized database, that is, reducing the number of missing non-sensitive rules and the number of generated ghost rules. Current methods for hiding sensitive rules cause side effects and data loss. In this paper, we introduce a new distortion-based method to hide sensitive rules. The method identifies critical transactions based on the number of non-sensitive maximal frequent itemsets that contain at least one item from the consequent of the sensitive rule, since these itemsets can be directly affected by the modified transactions. Using this set dramatically reduces the number of non-sensitive itemsets that need to be considered. We compute the smallest number of transactions to modify in advance in order to minimize the damage to the database. Comparative experimental results on real datasets show that the proposed method achieves better results than other methods, with fewer side effects and less data loss.
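
The side effects named in this abstract (missing non-sensitive rules and ghost rules) are commonly measured by comparing the rule set mined before sanitization with the one mined afterwards. The sketch below shows that comparison with plain set operations. The rule representation and the names `Rule`, `SideEffects` and `side_effects` are illustrative assumptions, not part of the paper.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple

# A rule is an (antecedent, consequent) pair of item sets (assumed encoding).
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


class SideEffects(NamedTuple):
    hiding_failures: Set[Rule]  # sensitive rules that can still be mined
    lost_rules: Set[Rule]       # non-sensitive rules no longer mined
    ghost_rules: Set[Rule]      # rules that did not exist before sanitization


def side_effects(original: Set[Rule], sensitive: Set[Rule],
                 sanitized: Set[Rule]) -> SideEffects:
    """Compare the rules mined from the original and sanitized databases."""
    return SideEffects(
        hiding_failures=sensitive & sanitized,
        lost_rules=(original - sensitive) - sanitized,
        ghost_rules=sanitized - original,
    )


def rule(lhs: str, rhs: str) -> Rule:
    """Build a rule from space-separated item names, e.g. rule('a b', 'c')."""
    return frozenset(lhs.split()), frozenset(rhs.split())
```

A sanitization that hides every sensitive rule with no side effects would return three empty sets.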

    Association rule hiding using integer linear programming

    Privacy-preserving data mining has become a focus of attention for government statistical agencies and the database security research community, who are concerned with preventing privacy disclosure during data mining. Repositories of large datasets include sensitive rules that need to be concealed from unauthorized access. Hence, association rule hiding has emerged as a powerful technique for hiding sensitive knowledge that exists in data before it is published. In this paper, we present a constraint-based optimization approach for hiding a set of sensitive association rules, using a well-structured integer linear program formulation. The proposed approach reduces the database sanitization problem to an instance of the integer linear programming problem. The solution of the integer linear program determines the transactions that need to be sanitized in order to conceal the sensitive rules while minimizing the impact of sanitization on the non-sensitive rules. We also present a heuristic sanitization algorithm that performs hiding by reducing the support or the confidence of the sensitive rules. Experimental evaluation on real-life datasets indicates promising performance in terms of side effects on the original database.
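
To make "hiding by reducing the support or the confidence" concrete, here is a naive Python sketch. It is neither the integer linear program nor the heuristic from the paper: it simply removes a chosen victim item from supporting transactions, in database order, until the rule falls below one of the mining thresholds. The function names and the thresholds `min_sup` and `min_conf` are assumptions made for illustration.

```python
from typing import List, Set


def support(transactions: List[Set[str]], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions: List[Set[str]], antecedent: Set[str],
               consequent: Set[str]) -> float:
    """support(antecedent | consequent) / support(antecedent)."""
    sup_a = support(transactions, antecedent)
    if sup_a == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / sup_a


def hide_rule(transactions: List[Set[str]], antecedent: Set[str],
              consequent: Set[str], victim: str,
              min_sup: float, min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy in which the rule is below a threshold.

    Naive strategy (assumption): scan transactions in order and delete the
    victim item from each supporting transaction until the rule is hidden.
    """
    sanitized = [set(t) for t in transactions]
    rule_items = antecedent | consequent
    for t in sanitized:
        hidden = (support(sanitized, rule_items) < min_sup
                  or confidence(sanitized, antecedent, consequent) < min_conf)
        if hidden:
            break
        if rule_items <= t:
            t.discard(victim)
    return sanitized
```

Removing an item from the consequent lowers both the support and the confidence of the rule. Removing an antecedent item also shrinks the denominator of the confidence, so it mainly lowers support. Published methods differ chiefly in which transactions and items they pick so that fewer non-sensitive rules are affected.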

    Investigations in Privacy Preserving Data Mining

    Data mining, data sharing and privacy preservation are fast emerging as active fields of research. A close review of the research on privacy-preserving data mining reveals a twofold problem: first, the protection of private data (data hiding in the database), and second, the protection of sensitive rules (knowledge) ingrained in the data (knowledge hiding in the database). The first problem concerns how to obtain accurate results even when private data is concealed. The second concerns how to protect sensitive association rules contained in the database from being discovered, while non-sensitive association rules can still be mined with traditional data mining techniques. Performance is a major concern with knowledge hiding techniques. This paper describes approaches for knowledge hiding in the database and discusses issues and challenges in developing an integrated solution for data hiding and knowledge hiding in the database. The study also highlights directions for future work, so that pragmatic measures can be incorporated into ongoing research on hiding sensitive association rules.

    Introducing an algorithm for use to hide sensitive association rules through perturb technique

    Due to the rapid growth of data mining technology, obtaining private data on users through this technology has become easier. Association rule mining is a data mining technique that extracts useful patterns in the form of association rules. One of the main problems in applying this technique to databases is the disclosure of sensitive data, which endangers security and privacy. Hiding association rules is one method of preserving privacy and is a major subject in the field of data mining and database security, for which several algorithms with different approaches have been presented so far. This article presents a heuristic algorithm for hiding sensitive association rules. It applies the Perturb technique, which reduces the confidence or support of rules: weights are allocated to items and transactions, and the selected item is removed from the transaction with the highest weight. Efficiency is measured by hiding failures, the number of lost rules and ghost rules, and execution time. The results are assessed and compared with two known algorithms, FHSAR and RRLR, on two real databases (one dense and one sparse). The results indicate that the number of lost rules across all experiments is reduced by 47% compared with RRLR and by 23% compared with FHSAR. Moreover, the other undesirable side effects of the proposed algorithm are, in the worst case, equal to those of the base algorithms.

    Data Hiding and Its Applications

    Data hiding techniques have been widely used to provide copyright protection, data integrity, covert communication, non-repudiation, and authentication, among other applications. With the increased dissemination and distribution of multimedia content over the internet, data hiding methods such as digital watermarking and steganography are becoming increasingly relevant to multimedia security. The goal of this book is to focus on the improvement of data hiding algorithms and their different applications (both traditional and emerging), bringing together researchers and practitioners from different research fields, including data hiding, signal processing, cryptography, and information theory, among others.

    Investigation of Heterogeneous Approach to Fact Invention of Web Users’ Web Access Behaviour

    The World Wide Web consists of a huge volume of different types of data. Web mining is a field of data mining that deals with many different web services and a large number of web users. Web user mining is, in turn, a field of web mining. Information about users' web access is collected in different ways. The most common technique is the web log file. Several other techniques are available for collecting users' web access information: browser agents, user authentication, web reviews, web ratings, web rankings and tracking cookies. Web users find it difficult to retrieve the information they need in time because of the huge volume of unstructured and structured information, which increases the complexity of the web. Web usage mining is important for various purposes, such as organizing websites, business and maintenance services, website personalization and reducing network bandwidth. This paper provides an analysis of web usage mining techniques.