56,610 research outputs found

    Literature Review on Secure Mining of Association Rules in Horizontally Distributed Databases

    Data and knowledge engineering is one of the areas within data mining. It can extract important knowledge from large databases, but sometimes these databases are divided among various parties. This paper addresses fast distributed mining of association rules over horizontally distributed data and presents different methods for secure mining of association rules in horizontally distributed databases. The main aim of this paper is a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton. This protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al., which is an unsecured distributed version of the Apriori algorithm. The main components of this protocol are two novel secure multi-party algorithms: one that computes the union of private subsets that each of the interacting players holds, and another that tests the inclusion of an element held by one player in a subset held by another. This protocol offers improved privacy with respect to the protocol in. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost.
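
    The unsecured FDM idea mentioned in this abstract rests on one observation: an itemset that is frequent over the whole horizontally partitioned database must be locally frequent at at least one site. The sketch below illustrates a single round for k-itemsets under simplifying assumptions. The function names `local_counts` and `fdm_round` are hypothetical, every k-itemset is enumerated directly instead of using Apriori candidate generation, and none of the secure multi-party steps are modeled.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, k):
    """Count every k-itemset that occurs in one site's transactions."""
    counts = Counter()
    for t in transactions:
        for items in combinations(sorted(set(t)), k):
            counts[items] += 1
    return counts


def fdm_round(sites, k, min_support):
    """One unsecured FDM-style round (illustrative only).

    sites: list of local databases, each a list of transactions.
    min_support: required support as a fraction of transactions.
    """
    per_site = [local_counts(db, k) for db in sites]

    # Each site proposes only the itemsets that are frequent locally.
    candidates = set()
    for db, counts in zip(sites, per_site):
        threshold = min_support * len(db)
        candidates |= {s for s, c in counts.items() if c >= threshold}

    # Global check: add up the local supports of the pooled candidates.
    total = sum(len(db) for db in sites)
    result = {}
    for s in candidates:
        support = sum(counts[s] for counts in per_site)
        if support >= min_support * total:
            result[s] = support
    return result


if __name__ == "__main__":
    site_a = [["a", "b"], ["a", "b"], ["a", "c"]]
    site_b = [["a", "b"], ["b", "c"], ["c"]]
    print(fdm_round([site_a, site_b], k=2, min_support=0.5))
```

    In the secure setting described above, the pooling of candidates and the summing of supports are exactly the steps that would need to be replaced by privacy-preserving protocols.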

    Set-oriented data mining in relational databases

    Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are expressed in terms of relational or set-oriented operations. Query optimization technology can then be used for efficient processing. In this paper, we describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and thus may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. Algorithm SETM is simple, fast, and stable over the range of parameter values. It is easily parallelized and we suggest several additional optimizations. The set-oriented nature of Algorithm SETM makes it possible to develop extensions easily and its performance makes it feasible to build interactive data mining tools for large databases.
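
    To make the set-oriented idea concrete, here is a minimal sketch in the spirit of the description above, not the paper's SQL formulation. It treats the data as a relation of (tid, item) rows, counts itemsets by sorting and grouping, and extends surviving rows by joining back to the base relation on tid. The function name `setm_sketch` is hypothetical, and a dictionary lookup stands in for the merge-scan join.

```python
from itertools import groupby


def setm_sketch(rows, min_count):
    """Set-oriented frequent itemset mining over (tid, item) rows.

    Returns a dict mapping each frequent itemset (a sorted tuple)
    to the number of transactions that contain it.
    """
    base = sorted(set(rows))  # base relation, ordered by tid then item
    by_tid = {
        tid: [item for _, item in grp]
        for tid, grp in groupby(base, key=lambda r: r[0])
    }

    current = [(tid, (item,)) for tid, item in base]
    frequent = {}
    while current:
        # Sort on the itemset so equal itemsets are adjacent, then count.
        current.sort(key=lambda r: r[1])
        counts = {
            s: sum(1 for _ in grp)
            for s, grp in groupby(current, key=lambda r: r[1])
        }
        keep = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(keep)

        # Join surviving rows with the base relation on tid, adding only
        # items larger than the last one to keep each itemset ordered.
        current = [
            (tid, s + (item,))
            for tid, s in current
            if s in keep
            for item in by_tid[tid]
            if item > s[-1]
        ]
    return frequent


if __name__ == "__main__":
    rows = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"), (3, "a"), (3, "c")]
    print(setm_sketch(rows, min_count=2))
```

    Because every step is a sort, a group-and-count, or a join on tid, each one maps onto an operation that a relational engine already optimizes, which is the point the abstract makes.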

    Adaptive two-phase spatial association rules mining method

    Since huge amounts of spatial data can be easily collected from various applications, ranging from remote sensing technology to geographical information systems, the extraction and comprehension of spatial knowledge is an increasingly important task. Many excellent studies on Remote Sensed Image (RSI) data have been conducted to find potential relationships with crop yield. However, most of them suffer from performance problems because their techniques for mining association rules are based on the Apriori algorithm. In this paper, two efficient algorithms, two-phase spatial association rules mining and adaptive two-phase spatial association rules mining, are proposed to address the above problem. Both methods work in two phases: they first create Histogram Generators to quickly generate coarse-grained spatial association rules, and then mine the fine-grained spatial association rules with respect to the coarse-grained frequent patterns obtained in the first phase. The adaptive two-phase spatial association rules mining method additionally partitions the image to efficiently quantize out non-frequent patterns, which facilitates the subsequent two-phase process. Such two-phase approaches save much computation, as shown by extensive experimental results in the paper. Facultad de Informátic

    The Apriori Stochastic Dependency Detection (ASDD) algorithm for learning Stochastic logic rules

    Apriori Stochastic Dependency Detection (ASDD) is an algorithm for fast induction of stochastic logic rules from a database of observations made by an agent situated in an environment. ASDD is based on features of the Apriori algorithm for mining association rules in large databases of sales transactions [1] and the MSDD algorithm for discovering stochastic dependencies in multiple streams of data [15]. Once these rules have been acquired, the Precedence algorithm assigns operator precedence when two or more rules matching the input data are applicable to the same output variable. These algorithms currently learn propositional rules, with future extensions aimed towards learning first-order models. We show that stochastic rules produced by this algorithm are capable of reproducing an accurate world model in a simple predator-prey environment.

    Mining Implicit Patterns of Customer Purchasing Behavior Based On The Consideration Of RFM Model

    Association rules have been developed for years and applied successfully to market basket analysis and cross-selling, among other business applications. One of the most widely used approaches to association rule mining is the Apriori algorithm. However, the Apriori algorithm has long been known for its weaknesses: it generates an enormous number of rules, many of which are already-known facts. In this study, we integrate the RFM attributes with the classical association rule mining algorithm, Apriori. Based on the RFM model, two indicators, RF score and Sale ratio, are used as measures of interestingness. We propose two algorithms, DWRF and DWRFE, to mine for implicit patterns. In our experimental evaluation, the performance of Apriori, DWRF and DWRFE is compared. The results show that our algorithms offer an effective measurement of interesting patterns. Moreover, the DWRF algorithm, which uses the RF score as a measure of interestingness, seems to be able to promptly reflect customers’ fast-changing purchase patterns.

    A Hash Based Frequent Item set Mining using Rehashing

    Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. Mining frequent item sets is one of the most important concepts of data mining. Frequent item set mining has been a field of great interest to data mining researchers for over two decades. It plays an essential role in many data mining tasks that try to find interesting itemsets from databases, such as association rules, correlations, sequences, classifiers and clusters. In this paper, we propose a new association rule mining algorithm called Rehashing Based Frequent Item set (RBFI), in which hashing technology is used to store the database in vertical data format. To avoid hash collisions and the secondary clustering problem in hashing, a rehashing technique is utilized here. The advantages of this new hashing technique are a hash function that is easy to compute, fast access to data, and efficiency. This algorithm also avoids unnecessary scans of the database.
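
    As an illustration of the vertical data format mentioned in this abstract, the sketch below maps each item to the set of transaction ids that contain it, so the support of an itemset is the size of an intersection and the original transactions need not be rescanned. This is not the RBFI algorithm itself: Python's built-in dict and set serve as the hash structures, the rehashing scheme is not modeled, and the names `vertical_index`, `support` and `frequent_pairs` are hypothetical.

```python
from itertools import combinations


def vertical_index(transactions):
    """Build the vertical layout: item -> set of transaction ids."""
    index = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            index.setdefault(item, set()).add(tid)
    return index


def support(index, itemset):
    """Support count of an itemset via tid-set intersection."""
    tidsets = [index.get(item, set()) for item in itemset]
    return len(set.intersection(*tidsets)) if tidsets else 0


def frequent_pairs(index, min_count):
    """Frequent 2-itemsets built only from frequent single items."""
    items = sorted(i for i, tids in index.items() if len(tids) >= min_count)
    pairs = {}
    for pair in combinations(items, 2):
        count = support(index, pair)
        if count >= min_count:
            pairs[pair] = count
    return pairs


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    idx = vertical_index(db)
    print(frequent_pairs(idx, min_count=2))
```

    The database is read once to build the index; every later support query touches only the tid sets of the items involved.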