
Prime-based method for interactive mining of frequent patterns

Abstract

Over the past decade, an increasing number of efficient algorithms have been proposed to mine frequent patterns that satisfy a user-specified threshold called minimum support (minsup). However, determining an appropriate value of minsup that yields suitable frequent patterns in different applications is extremely difficult. Since rerunning a mining algorithm from scratch can be very time consuming, researchers have introduced interactive mining, which finds suitable patterns by reusing the current mining model with various values of minsup. Thus far, a few efficient interactive mining algorithms have been proposed. However, their runtimes do not meet the need for short response times in real-time applications, especially where the data are sparse and suitable frequent patterns are mined with very low values of minsup. In response to these challenges, this study develops an interactive mining method based on prime numbers and their special characteristic, “uniqueness”, by which the content of the relevant data is transformed into a compact layout. First, a general architecture for interactive mining is proposed, consisting of two isolated components: the mining model and the mining process. Then, the proposed method is developed on this architecture such that the mining model is constructed once and can be mined repeatedly with various values of minsup. During mining model construction, the content of the relevant data is captured in a novel tree structure called the PC-tree with one database scan, and the mining materials are formed as a result. The PC-tree is a well-organized tree structure that is built systematically using descendant making, a technique introduced in this study. Moreover, this study introduces a mining algorithm called PC-miner to mine the mining model repeatedly with various values of minsup. It grows an effective candidate head set, also introduced in this study, starting from the longest candidate patterns and using the Apriori principle. During the growth of the candidate head set in each round, the longest candidate patterns are used to find the maximal frequent patterns, from which the frequent patterns can be derived. The PC-miner also reduces the number of candidate patterns and comparisons by using several pruning techniques. A comprehensive experimental analysis, comprising several experiments and scenarios, is conducted to evaluate the correctness and effectiveness of the proposed method, especially for interactive mining. The experimental results verify that the proposed method constructs the mining model once, independently of minsup, and that this enables the model to be mined repeatedly. The results also show that the proposed method mines frequent patterns correctly and efficiently. Finally, the results verify that the proposed method speeds up interactive mining of frequent patterns over both sparse and dense datasets, with a more scalable total runtime than previous work for very low values of minsup over sparse datasets.
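The abstract does not spell out how prime numbers are used, so the following is only a minimal sketch of the general idea that unique prime factorization makes possible, not the PC-tree or PC-miner themselves. It assumes each item is assigned a distinct prime, a transaction is stored as the product of its items' primes, and a pattern is contained in a transaction exactly when the pattern's product divides the transaction's product. All function names here (first_primes, build_item_codes, encode, support) are hypothetical and chosen for illustration.

```python
from math import prod

def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def build_item_codes(transactions):
    """Assumed encoding: map each distinct item to its own prime."""
    items = sorted({item for t in transactions for item in t})
    return dict(zip(items, first_primes(len(items))))

def encode(itemset, codes):
    """Encode an itemset as the product of its items' primes."""
    return prod(codes[item] for item in itemset)

def support(pattern, encoded_transactions, codes):
    """Count transactions whose code is divisible by the pattern's code."""
    p = encode(pattern, codes)
    return sum(1 for t in encoded_transactions if t % p == 0)

# Toy data, invented for this illustration only.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
codes = build_item_codes(transactions)
encoded = [encode(t, codes) for t in transactions]
```

Because the encoded transactions do not depend on minsup, they can be built once and queried with any threshold afterwards, which mirrors the separation between mining model and mining process described above. For example, support({"a", "c"}, encoded, codes) can be compared against several minsup values without re-encoding the data.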
