2,929 research outputs found

    Mining incremental association rules with generalized FP-tree.

    Mining association rules among items in a large database has been recognized as one of the most important data mining problems. Inserting new transactions and deleting old ones can make previously discovered association rules uninteresting, and new interesting rules may also appear. Generating the association rules of the updated database using mainly the updated part of the database and the previously discovered rules is called incremental association rule maintenance. The most straightforward approach re-mines the entire database from scratch whenever an update occurs. This is very time-consuming because it uses the entire database and repeats many computations already done. Several algorithms that reuse the previously discovered frequent patterns have been presented to improve maintenance efficiency by reducing computation time. However, they still have shortcomings: (1) they scan the updated part of the database several times (once at each level) to confirm that previously large itemsets are still large; (2) they scan the entire database several times when some previously small itemsets become large in the updated part of the database. This thesis proposes two new methods that use the frequent pattern tree (FP-tree) structure to reduce the required number of database scans. The first is the DB-tree algorithm, which stores all the information in a tree structure and requires (1) no scan of the original database and (2) only a single scan of the updated transactions, without candidate set generation. The second is the FPUP algorithm, which predicts possible large itemsets for future mining. FPUP also uses the FP-tree structure and scans the old database fewer times than the existing FP algorithms, improving performance. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2001 .S864. Source: Masters Abstracts International, Volume: 40-06, page: 1555. Adviser: Christe Ezeite. Thesis (M.Sc.)--University of Windsor (Canada), 2001.
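
The idea of keeping every transaction in a tree, so that an update needs only one pass over the new transactions, can be illustrated with a small prefix tree. This is a minimal sketch and not the thesis's DB-tree algorithm: the class names `Node` and `PrefixTree`, the lexicographic item order (FP-trees normally order items by frequency), and the insert-only update are all assumptions made for illustration.

```python
from collections import Counter


class Node:
    """One tree node: an item plus the number of transactions passing through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class PrefixTree:
    """Hypothetical prefix tree holding ALL items of every transaction (no pruning)."""

    def __init__(self):
        self.root = Node()
        self.item_counts = Counter()  # per-item support counts
        self.n_transactions = 0

    def insert(self, transaction):
        """Add one transaction in a single pass; the stored data is never rescanned."""
        node = self.root
        for item in sorted(set(transaction)):  # assumed fixed order: lexicographic
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
            self.item_counts[item] += 1
        self.n_transactions += 1

    def support_count(self, itemset):
        """Count the transactions containing every item of `itemset`, using the tree only."""
        needed = sorted(set(itemset))
        if not needed:
            return self.n_transactions
        total = 0
        stack = [(self.root, 0)]
        while stack:
            node, i = stack.pop()
            for item, child in node.children.items():
                if item == needed[i]:
                    if i + 1 == len(needed):
                        total += child.count  # all needed items matched on this path
                    else:
                        stack.append((child, i + 1))
                elif item < needed[i]:
                    stack.append((child, i))  # needed item may still appear deeper
                # item > needed[i]: sorted order rules out a match below this child
        return total


tree = PrefixTree()
for t in [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "c"]]:
    tree.insert(t)

before = tree.support_count(["a", "b"])
tree.insert(["a", "b", "d"])  # the "updated part": one new transaction
after = tree.support_count(["a", "b"])
```

Because every transaction stays in the tree, the support of an itemset that was infrequent before the update can still be read off the tree afterwards, which is the situation that forces other approaches to rescan the original database.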

    Frequent Subgraph Mining in Outerplanar Graphs

    In recent years there has been an increased interest in frequent pattern discovery in large databases of graph-structured objects. While the frequent connected subgraph mining problem for tree datasets can be solved in incremental polynomial time, it becomes intractable for arbitrary graph databases. Existing approaches have therefore resorted to various heuristic strategies and restrictions of the search space, but have not identified a practically relevant tractable graph class beyond trees. In this paper, we define the class of so-called tenuous outerplanar graphs, a strict generalization of trees, develop a frequent subgraph mining algorithm for tenuous outerplanar graphs that works in incremental polynomial time, and evaluate the algorithm empirically on the NCI molecular graph dataset.

    Improving Efficiency of Incremental Mining by Trie Structure and Pre-Large Itemsets

    Incremental data mining has been discussed widely in recent years, as it has many practical applications, and various incremental mining algorithms have been proposed. Hong et al. proposed an efficient incremental mining algorithm for handling newly inserted transactions by using the concept of pre-large itemsets. The algorithm aimed to reduce the need to rescan the original database and to cut maintenance costs. Recently, Lin et al. proposed the Pre-FUFP algorithm to handle new transactions more efficiently and to make it easier to update the FP-tree. However, the frequent itemsets must still be mined with the FP-growth algorithm. In this paper, we propose the Pre-FUT algorithm (Fast-Update algorithm using the Trie data structure and the concept of pre-large itemsets), which not only builds and updates the trie structure when new transactions are inserted, but also makes it easy to mine all the frequent itemsets from the trie. Experimental results show the good performance of the proposed algorithm.
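
The abstract does not define pre-large itemsets, so the sketch below relies on the definition usually given for Hong et al.'s approach, stated here as an assumption: an itemset is large if its support ratio reaches an upper threshold, pre-large if it falls between a lower and the upper threshold, and small otherwise, and the original database of `d` transactions need not be rescanned until more than ⌊(S_u − S_l)·d / (1 − S_u)⌋ new transactions have arrived. The names `ItemsetTrie`, `classify`, and `safety_bound` are hypothetical, and this is not the Pre-FUT algorithm itself.

```python
import math
from fractions import Fraction


class TrieNode:
    def __init__(self):
        self.count = 0      # support count of the itemset spelled by the path to this node
        self.children = {}


class ItemsetTrie:
    """Hypothetical trie in which each root-to-node path is one itemset (items sorted)."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, itemset, count=1):
        node = self.root
        for item in sorted(itemset):
            node = node.children.setdefault(item, TrieNode())
        node.count += count

    def itemsets(self):
        """Yield (itemset, count) for every node by walking the trie."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                path = prefix + (item,)
                yield path, child.count
                stack.append((path, child))


def classify(count, n, s_upper, s_lower):
    """Label an itemset by its support ratio count / n (thresholds given as Fractions)."""
    ratio = Fraction(count, n)
    if ratio >= s_upper:
        return "large"
    if ratio >= s_lower:
        return "pre-large"
    return "small"


def safety_bound(d, s_upper, s_lower):
    """Assumed bound on new transactions that can be absorbed without a rescan."""
    return math.floor((s_upper - s_lower) * d / (1 - s_upper))


S_UPPER, S_LOWER = Fraction(6, 10), Fraction(5, 10)  # exact fractions avoid float rounding
trie = ItemsetTrie()
trie.add(["a"], 70)
trie.add(["b"], 55)
trie.add(["a", "b"], 40)

labels = {s: classify(c, 100, S_UPPER, S_LOWER) for s, c in trie.itemsets()}
bound = safety_bound(100, S_UPPER, S_LOWER)
```

The thresholds are passed as `Fraction` values because the same computation in floating point can round the bound down by one.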

    Discovering High Utility Itemsets using Hybrid Approach

    Mining high utility itemsets, especially from large transactional databases, is a time-consuming task. Multiple methods exist for mining high utility itemsets from large transactional datasets, but they have significant limitations. Their performance needs to be examined on low-memory systems, and further measures need to be addressed. The proposed algorithm combines high utility pattern mining with incremental frequent pattern mining. It uses two algorithms, Apriori and the existing Parallel UP-Growth, to mine high utility itemsets from transactional databases. Information about high utility itemsets is maintained in a data structure called the UP-tree. These algorithms not only scan the incremental database but also collect the support counts of newly generated frequent itemsets. Execution is fast because new itemsets are added to the utility pattern tree and rare itemsets are removed from it, which reduces cost and time. Based on experimental analysis and results, this hybrid approach combining the existing Apriori and UP-Growth algorithms is proposed with the aim of improving performance.
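
The abstract does not say how utility is measured. The sketch below assumes the common definition, in which the utility of an itemset in a transaction is the sum of purchased quantity times unit profit over its items, and an itemset is "high utility" when its total over all containing transactions reaches a threshold. It is a brute-force illustration only, not the hybrid Apriori/UP-Growth method; the function names and the sample data are hypothetical.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction that contains the whole itemset."""
    total = 0
    for quantities in transactions:  # each transaction maps item -> purchased quantity
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Enumerate all itemsets exhaustively; exponential, so only for tiny examples."""
    items = sorted({item for quantities in transactions for item in quantities})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            utility = itemset_utility(candidate, transactions, unit_profit)
            if utility >= min_util:
                result[candidate] = utility
    return result


# Made-up sample data for illustration only.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
unit_profit = {"a": 5, "b": 2, "c": 1}

found = high_utility_itemsets(transactions, unit_profit, min_util=10)
```

Unlike support, utility is not anti-monotone: here `("a", "c")` qualifies while `("c",)` alone does not. That is why utility miners cannot prune candidates the way a plain Apriori pass does, and why tree structures such as the UP-tree track upper bounds instead.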