3 research outputs found

    A Fast Algorithm For Data Mining

    Get PDF
    In the past few years, there has been a keen interest in mining frequent itemsets in large data repositories. Frequent itemsets correspond to the set of items that occur frequently in transactions in a database. Several novel algorithms have been developed recently to mine closed frequent itemsets - these itemsets are a subset of the frequent itemsets. These algorithms are of practical value: they can be applied to real-world applications to extract patterns of interest in data repositories. However, prior to using an algorithm in practice, it is necessary to know its performance as well implementation issues. In this project, we address such a need for the algorithm “Using Attribute Value Lattice to Find Frequent Itemsets” that was developed by Lin et. al. We clarify some aspects of the algorithm, develop an implementation of the algorithm, and present the results of a performance study. In our experiments we find that the running time of the algorithm for certain input datasets grows exponentially. To address this problem, we develop a novel procedure for binning the data. Our results show that with binned data, the running time of the algorithm grows linearly. This allows one to obtain trends for the dataset

    Using Attribute Value Lattice to Find Closed Frequent Itemsets

    No full text
    Finding all closed frequent itemsets is a key step of association rule mining since the non-redundant association rule can be inferred from all the closed frequent itemsets. In this paper we present a new method for finding closed frequent itemsets based on attribute value lattice. In the new method, we argue that vertical data representation and attribute value lattice can find all closed frequent itemsets efficiently, thus greatly improve the efficiency of association rule mining algorithm. We discuss how these techniques and methods are applied to find closed frequent itemsets. In our method, the data are represented vertically; each frequent attribute value is associated with its granule, which is represented as a hybrid bitmap. Based on the partial order defined between the attribute values among the databases, an attribute value lattice is constructed, which is much smaller compared with the original databases. Instead of searching all the items in the databases, which is adopted by almost all the association rule algorithms to find frequent itemsets, our method only searches the attribute-value lattice. A bottom-up breadth-first approach is employed to search the attribute value lattice to find the closed frequent itemsets

    <title>Using attribute value lattice to find closed frequent itemsets</title>

    No full text
    corecore