In this thesis, an algorithm called FP-AFI (FP-tree Approximate Frequent Itemsets mining) is proposed for mining approximate frequent itemsets. FP-AFI uses the FP-tree structure to compress transaction data. By analyzing the recursive relationship among the sets of transactions that fault-tolerantly contain an itemset, the projection operation on the FP-tree is extended to obtain conditional FP-trees for the transactions that either contain or do not contain a specific item. FP-AFI uses a depth-first search strategy to generate candidate itemsets, checking the threshold value of a core pattern against the item supports counted in the header table. For each candidate itemset, the corresponding fault-tolerant conditional FP-trees are constructed systematically. The approximate support of the candidate and the support of each item in the candidate are then obtained directly from the counter stored in the root node and the coding vectors of the fault-tolerant conditional FP-trees, so that whether a candidate is an approximate frequent itemset can be determined efficiently. The experimental results show that, when the number of transactions is large or the support threshold is small, FP-AFI outperforms the previously proposed FT-Apriori and AFI algorithms.
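To make the two quantities named in the abstract concrete, the following is a minimal brute-force sketch of an approximate (fault-tolerant) support check. It is not the FP-AFI algorithm and builds no FP-tree; it only illustrates what is being counted. It assumes a common fault-tolerant definition that the abstract does not spell out: a transaction fault-tolerantly contains an itemset if it misses at most `max_faults` of its items. The function names and the parameters `max_faults`, `min_sup`, and `min_item_sup` are hypothetical and chosen for illustration only.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple


def ft_supports(
    transactions: List[Set[Hashable]],
    itemset: Iterable[Hashable],
    max_faults: int,
) -> Tuple[int, Dict[Hashable, int]]:
    """Return (approximate support, per-item support) for `itemset`.

    Assumption: a transaction fault-tolerantly contains the itemset
    when it is missing at most `max_faults` of the itemset's items.
    """
    items: FrozenSet[Hashable] = frozenset(itemset)
    # Transactions that fault-tolerantly contain the itemset.
    matching = [t for t in transactions if len(items - set(t)) <= max_faults]
    # Item support: how many of the matching transactions hold each item.
    item_support = {i: sum(1 for t in matching if i in t) for i in items}
    return len(matching), item_support


def is_approx_frequent(
    transactions: List[Set[Hashable]],
    itemset: Iterable[Hashable],
    max_faults: int,
    min_sup: int,
    min_item_sup: int,
) -> bool:
    """Check both thresholds (hypothetical names) for one candidate."""
    approx_sup, item_sup = ft_supports(transactions, itemset, max_faults)
    return approx_sup >= min_sup and all(
        s >= min_item_sup for s in item_sup.values()
    )


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}, {"d"}]
    approx_sup, item_sup = ft_supports(db, {"a", "b", "c"}, max_faults=1)
    print(approx_sup, sorted(item_sup.items()))
```

This scan touches every transaction for every candidate, which is the cost that a compressed structure such as the FP-tree is meant to avoid; the sketch is useful mainly as a reference against which a faster implementation could be checked.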