359 research outputs found

    Closed Association Rules

    Get PDF
    In this paper we present a new basis for association rules called Closed Association Rules (CR). This basis contains all valid association rules that can be generated from frequent closed itemsets. CR is a lossless representation of all association rules. Regarding the number of rules, our basis is between all association rules (AR) and minimal non-redundant association rules (MNR), filling a gap between them. The new basis provides a framework for some other bases and we show that MNR is a subset of CR. Our experiments show that CR is a good alternative for all association rules. The number of generated rules can be much less, and beside frequent closed itemsets nothing else is required

    A Novel Algorithm for Discovering Frequent Closures and Generators

    Get PDF
    The Important construction of many association rules needs the calculation of Frequent Closed Item Sets and Frequent Generator Item Sets (FCIS/FGIS). However, these two odd jobs are joined very rarely. Most of the existing methods apply level wise Breadth-First search. Though the Depth-First search depends on different characteristics of data, it is often better than others. Hence, in this paper it is named as FCFG algorithm that combines the Frequent closed item sets and frequent generators. This proposed algorithm (FCFG) extracts frequent itemsets (FIs) in a Depth-First search method. Then this algorithm extracts FCIS and FGIS from FIs by a level wise approach. Then it associates the generators to their closures. In FCFG algorithm, a generic technique is extended from an arbitrary FI-miner algorithm in order to support the generation of minimal non-redundant association rules. Experimental results indicate that FCFG algorithm performs better when compared with other level wise methods in most of the cases

    Closed sets based discovery of small covers for association rules (extended version)

    Get PDF
    International audienceIn this paper, we address the problem of the usefulness of the set of discovered association rules. This problem is important since real-life databases yield most of the time several thousands of rules with high confidence. We propose new algorithms based on Galois closed sets to reduce the extraction to small covers (or bases) for exact and approximate rules, adapted from lattice theory and data analysis domain. Once frequent closed itemsets – which constitute a generating set for both frequent itemsets and association rules – have been discovered, no additional database pass is needed to derive these bases. Experiments conducted on real-life databases show that these algorithms are efficient and valuable in practice

    A genetic algorithm coupled with tree-based pruning for mining closed association rules

    Get PDF
    Due to the voluminous amount of itemsets that are generated, the association rules extracted from these itemsets contain redundancy, and designing an effective approach to address this issue is of paramount importance. Although multiple algorithms were proposed in recent years for mining closed association rules most of them underperform in terms of run time or memory. Another issue that remains challenging is the nature of the dataset. While some of the existing algorithms perform well on dense datasets others perform well on sparse datasets. This paper aims to handle these drawbacks by using a genetic algorithm for mining closed association rules. Recent studies have shown that genetic algorithms perform better than conventional algorithms due to their bitwise operations of crossover and mutation. Bitwise operations are predominantly faster than conventional approaches and bits consume lesser memory thereby improving the overall performance of the algorithm. To address the redundancy in the mined association rules a tree-based pruning algorithm has been designed here. This works on the principle of minimal antecedent and maximal consequent. Experiments have shown that the proposed approach works well on both dense and sparse datasets while surpassing existing techniques with regard to run time and memory
    corecore