2 research outputs found

    DiffNodesets: An Efficient Structure for Fast Mining Frequent Itemsets

    Mining frequent itemsets is an essential problem in data mining and plays an important role in many data mining applications. In recent years, several itemset representations based on node sets have been proposed, and they have been shown to be very efficient for mining frequent itemsets. In this paper, we propose DiffNodeset, a novel and more efficient itemset representation for mining frequent itemsets. Based on the DiffNodeset structure, we present an efficient algorithm, named dFIN, for mining frequent itemsets. To achieve high efficiency, dFIN finds frequent itemsets using a set-enumeration tree with a hybrid search strategy and, in some cases, directly enumerates frequent itemsets without candidate generation. To evaluate the performance of dFIN, we conducted extensive experiments comparing it against existing leading algorithms on a variety of real and synthetic datasets. The experimental results show that dFIN is significantly faster than these leading algorithms.
    Comment: 22 pages, 13 figures

    A Parallel Mining Algorithm for Maximum Erasable Itemset Based on Multi-core Processor

    Mining erasable itemsets is an interesting research domain that has been applied to the problem of how to use limited funds efficiently to optimise production during an economic crisis. Since the problem of mining erasable itemsets was posed, researchers have proposed many algorithms to solve it, among which mining the maximum erasable itemset is a significant research direction. Since all subsets of the maximum erasable itemset are erasable itemsets, all erasable itemsets can be obtained by mining the maximum erasable itemset, which reduces the number of both candidate and resultant itemsets generated during the mining process. However, computing many itemset values still takes a lot of CPU time when mining huge amounts of data, and it is difficult to solve the problem quickly with sequential algorithms. Therefore, this study presents a parallel algorithm for mining maximum erasable itemsets, called PAMMEI, based on a multi-core processor platform. The algorithm divides the entire mining task into multiple subtasks and assigns them to multiple processor cores for parallel execution, while using an efficient pruning strategy to reduce the search space and increase the mining speed. To verify the efficiency of the PAMMEI algorithm, the paper compares it with state-of-the-art algorithms. The experimental results show that PAMMEI is superior to the compared algorithms with respect to runtime, memory usage and scalability.
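
The subset property mentioned in the abstract above (every subset of a maximum erasable itemset is itself erasable) can be illustrated with a minimal Python sketch. This is not the PAMMEI algorithm: the function name `expand_maximal_itemsets` and the sample item labels are hypothetical, and the sketch simply assumes the maximum erasable itemsets have already been mined by some other means.

```python
from itertools import combinations


def expand_maximal_itemsets(maximal_itemsets):
    """Return every non-empty subset of the given maximal itemsets.

    Assumption: each input set is already known to be erasable, so by the
    subset property all of its non-empty subsets are erasable as well.
    """
    erasable = set()
    for itemset in maximal_itemsets:
        items = sorted(itemset)  # fixed order so enumeration is deterministic
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                # frozenset makes subsets hashable and removes duplicates
                # shared between different maximal itemsets
                erasable.add(frozenset(subset))
    return erasable


if __name__ == "__main__":
    # Hypothetical maximal erasable itemsets over items "a".."d".
    maximal = [{"a", "b", "c"}, {"c", "d"}]
    for subset in sorted(expand_maximal_itemsets(maximal), key=lambda s: (len(s), sorted(s))):
        print(sorted(subset))
```

Storing only the maximal sets and expanding them on demand is what keeps the result set small; the expansion itself is exponential in the size of each maximal itemset, so it is only practical for illustration or for short itemsets.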