    Frequent itemset mining on multiprocessor systems

    Frequent itemset mining is an important building block in many data mining applications like market basket analysis, recommendation, web-mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow up to hundreds of gigabytes or even terabytes of data. Hence, efficient algorithms are required to process such large amounts of data. In recent years, many frequent-itemset mining algorithms have been proposed; however, they (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures forces the algorithms to go out-of-core, i.e., they have to access secondary memory, which leads to serious performance degradation. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors has almost stopped increasing. Algorithms should therefore exploit the large number of available threads and also other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism. In this work, we tackle the high memory requirements of frequent itemset mining in two ways: we (1) compress the datasets being mined because they must be kept in main memory during several mining invocations and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show good compression performance on a wide variety of realistic datasets, i.e., the size of the datasets is reduced by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network.
Since encoding and decoding are repeatedly required for loading and mining the datasets, we reduce their cost by providing parallel encodings that achieve high throughput for both tasks. For a memory-efficient representation of the mining algorithms’ intermediate data, we propose compact data structures and even employ explicit compression. Both methods together reduce the intermediate data’s size by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined. For coping with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Even single-threaded, our algorithms are often up to an order of magnitude faster than existing highly optimized algorithms, and they scale almost linearly on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms that are used for mining other types of itemsets.
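
Hot spot (4) above, intersecting lists of sorted integers, can be illustrated with a minimal sequential sketch. It is not the parallel or vectorized solution the abstract refers to. The function name `intersect_sorted` and the sample transaction-ID lists are assumptions made for illustration only.

```python
def intersect_sorted(a, b):
    """Merge-style intersection of two ascending lists of transaction IDs."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Both items occur in this transaction
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical lists: IDs of the transactions containing each item
tids_bread = [1, 2, 4, 7, 9]
tids_milk = [2, 3, 4, 9, 10]

common = intersect_sorted(tids_bread, tids_milk)
support = len(common)  # support of the itemset {bread, milk}
print(common, support)
```

The length of the intersection is the support of the combined itemset, which is why this operation dominates the run time of list-based miners.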

    Scalable frequent itemset mining on many-core processors

    Frequent-itemset mining is an essential part of the association rule mining process, which has many application areas. It is a computation- and memory-intensive task with many opportunities for optimization. Many efficient sequential and parallel algorithms have been proposed in recent years. Most of the parallel algorithms, however, cannot cope with the huge number of threads provided by large multiprocessor or many-core systems. In this paper, we provide mcEclat, a highly parallel version of the well-known Eclat algorithm. It runs on both multiprocessor systems and many-core coprocessors and scales well up to a very large number of threads (244 in our experiments). To evaluate mcEclat's performance, we conducted many experiments on realistic datasets. mcEclat achieves high speedups of up to 11.5x and 100x on a 12-core multiprocessor system and a 61-core Xeon Phi many-core coprocessor, respectively. Furthermore, mcEclat is competitive with highly optimized existing frequent-itemset mining implementations taken from the FIMI repository.
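
For readers unfamiliar with Eclat, the following is a minimal sequential sketch of the basic algorithm: a depth-first search that extends itemsets by intersecting transaction-ID sets. It is not mcEclat and contains none of its parallelization. The function name `eclat` and the toy input are assumptions for illustration.

```python
def eclat(vertical, min_support):
    """Basic sequential Eclat.

    vertical: dict mapping each item to the set of transaction IDs containing it.
    Returns a dict mapping each frequent itemset (frozenset) to its support.
    """
    result = {}

    def extend(prefix, candidates):
        # Every (item, tids) pair in candidates is frequent together with prefix
        for idx, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            result[frozenset(itemset)] = len(tids)
            narrowed = []
            for other, other_tids in candidates[idx + 1:]:
                shared = tids & other_tids  # tidset of itemset + other
                if len(shared) >= min_support:
                    narrowed.append((other, shared))
            if narrowed:
                extend(itemset, narrowed)

    frequent_items = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), frequent_items)
    return result


# Hypothetical vertical database
vertical_db = {"a": {1, 2, 3, 4}, "b": {1, 2, 4}, "c": {2, 3, 4}, "d": {5}}
frequent = eclat(vertical_db, min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

Each top-level branch of the recursion works on its own tidsets, which is the independence that parallel Eclat variants typically exploit.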

    Algorithms for Extracting Frequent Episodes in the Process of Temporal Data Mining

    An important aspect of the data mining process is the discovery of patterns that have a great influence on the studied problem. The purpose of this paper is to study frequent-episode mining through the use of parallel pattern discovery algorithms. Parallel pattern discovery algorithms offer better performance and scalability, so they are of great interest to the data mining research community. The paper highlights several parallel and distributed frequent pattern mining algorithms on various platforms and presents a comparative study of their main features. The study takes into account the new possibilities arising from the Compute Unified Device Architecture (CUDA) of the latest generation of graphics processing units. Given their high performance, low cost, and growing number of features, GPUs are viable platforms for efficient implementations of frequent pattern mining algorithms. Keywords: Frequent Pattern Mining, Parallel Computing, Dynamic Load Balancing, Temporal Data Mining, CUDA, GPU, Fermi, Thread

    pcApriori: Scalable apriori for multiprocessor systems

    Frequent-itemset mining is an important part of data mining. It is a computation- and memory-intensive task and has a large number of scientific and statistical application areas. In many of them, the datasets can easily grow up to tens or even several hundred gigabytes of data. Hence, efficient algorithms are required to process such amounts of data. In recent years, many efficient sequential mining algorithms have been proposed, which, however, cannot exploit current and future systems providing large degrees of parallelism. In contrast, the number of parallel frequent-itemset mining algorithms is rather small, and most of them do not scale well when the number of threads is greatly increased. In this paper, we present a highly scalable mining algorithm that is based on the well-known Apriori algorithm; it is optimized for processing very large datasets on multiprocessor systems. The key idea of pcApriori is to employ a modified producer-consumer processing scheme, which partitions the data during processing and distributes it to the available threads. We conduct many experiments on large datasets. pcApriori scales almost linearly on our test system comprising 32 cores.
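
The general producer-consumer idea can be sketched for Apriori's support-counting step: a producer partitions the transactions into chunks and worker threads count candidate occurrences per chunk. This is a plain textbook scheme, not the modified scheme of pcApriori, whose details the abstract does not give. The function name `count_support`, the chunk size, and the sample data are assumptions. In CPython the global interpreter lock limits the speedup of such threads, so the sketch shows the structure only.

```python
import queue
import threading
from collections import Counter


def count_support(transactions, candidates, num_workers=4, chunk_size=2):
    """Count how many transactions contain each candidate itemset."""
    work = queue.Queue(maxsize=8)  # bounded buffer between producer and consumers
    totals = Counter()
    lock = threading.Lock()

    def consumer():
        while True:
            chunk = work.get()
            if chunk is None:  # sentinel: no more work
                break
            local = Counter()
            for transaction in chunk:
                for candidate in candidates:
                    if candidate <= transaction:  # subset test
                        local[candidate] += 1
            with lock:  # merge the per-chunk counts into the shared totals
                totals.update(local)

    workers = [threading.Thread(target=consumer) for _ in range(num_workers)]
    for worker in workers:
        worker.start()

    # Producer: partition the data and hand the chunks to the workers
    for start in range(0, len(transactions), chunk_size):
        work.put(transactions[start:start + chunk_size])
    for _ in workers:
        work.put(None)
    for worker in workers:
        worker.join()
    return totals


# Hypothetical data: five transactions and three candidate 2-itemsets
sample_transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
                       frozenset("bc"), frozenset("abc")]
sample_candidates = [frozenset("ab"), frozenset("ac"), frozenset("bc")]
support_counts = count_support(sample_transactions, sample_candidates)
print({"".join(sorted(c)): n for c, n in support_counts.items()})
```

Counting into a per-chunk `Counter` and merging once per chunk keeps the lock off the inner loop.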

    Efficient parallel mining of association rules on shared-memory multiple-processor machine

    In this paper, we consider the problem of parallel mining of association rules on a shared-memory multiprocessor system. Two efficient algorithms, PSM and HSM, are proposed. PSM adopts two powerful candidate set pruning techniques, distributed pruning and global pruning, to reduce the number of candidates. HSM further utilizes an I/O reduction strategy to enhance its performance. We have implemented PSM and HSM on an SGI Power Challenge parallel machine. The performance studies show that PSM and HSM outperform CD-SM, which is a shared-memory parallel version of the popular Apriori algorithm.

    Parallel FIM Approach on GPU using OpenCL

    In this paper, we describe the GPU-Eclat algorithm, a GPGPU (General-Purpose Graphics Processing Unit) enhanced implementation of Frequent Itemset Mining (FIM). Extracting frequent itemsets from a transactional database is an essential task in data mining because of its broad applications in mining association rules, time series, correlations, etc. Eclat is a typical generate-and-check approach for obtaining frequent itemsets from a database given a minimum support threshold. OpenCL (Open Computing Language) is a platform-independent language for GPU computation. We tested our implementation on a Radeon Dual graphics processor and observed up to a 68x speedup compared with the sequential Eclat algorithm on a CPU. In order to map the Eclat algorithm onto the SIMD (Single Instruction Multiple Data) execution model, an array data structure is used to represent the input database, and the standard dataset is converted to the vertical data layout. In our implementation, we parallelize the candidate generation and support counting phases on the GPU. Experimental results show that GPU-Eclat consistently outperforms CPU-based Eclat implementations. Our results reveal the potential of GPGPUs for speeding up data mining algorithms.
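
The conversion to a vertical data layout mentioned above can be sketched on the CPU as follows. This is an illustrative assumption, not the OpenCL array layout used by GPU-Eclat: each item gets one bitmap, stored here in a Python integer, with bit t set when transaction t contains the item. The names `to_vertical_bitmaps` and `bitmap_support` are hypothetical.

```python
def to_vertical_bitmaps(transactions):
    """Convert a horizontal database (list of item lists) to item -> bitmap."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def bitmap_support(bitmaps, itemset):
    """Support of a non-empty itemset: AND the item bitmaps, count set bits."""
    if not itemset:
        raise ValueError("itemset must not be empty")
    combined = -1  # all bits set, the identity element for AND
    for item in itemset:
        combined &= bitmaps.get(item, 0)
    return bin(combined).count("1")


# Hypothetical horizontal database with four transactions
horizontal = [["a", "b", "c"], ["a", "c"], ["b"], ["a", "b", "c"]]
vertical_bitmaps = to_vertical_bitmaps(horizontal)
print(bitmap_support(vertical_bitmaps, ["a", "c"]))
print(bitmap_support(vertical_bitmaps, ["a", "b"]))
```

With fixed-width bitmaps, the AND and the bit count apply the same operation to every machine word, which is what makes the vertical layout a natural fit for SIMD hardware.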

    Enhancing FP-Growth Performance Using Multi-threading based on Comparative Study

    The time required for generating frequent patterns plays an important role in mining association rules, especially when there are a large number of patterns and/or long patterns. Association rule mining has been a major research challenge in the field of data mining for over a decade. Although tremendous progress has been made, algorithms still need improvement since databases are growing larger and larger. In this research, we present a performance comparison between two frequent pattern extraction algorithms implemented in Java, Recursive Elimination (RElim) and FP-Growth, both of which find frequent itemsets in a transaction database. We found that FP-Growth outperformed RElim in terms of execution time. We therefore use multithreading to improve the time efficiency of the FP-Growth algorithm. The results show that multithreaded FP-Growth is more efficient than single-threaded FP-Growth.

    CONTEXT-AWARE DEBUGGING FOR CONCURRENT PROGRAMS

    Concurrency faults are difficult to reproduce and localize because they usually occur under specific inputs and thread interleavings. Most existing fault localization techniques focus on sequential programs but fail to identify faulty memory access patterns across threads, which are usually the root causes of concurrency faults. Moreover, existing techniques for sequential programs cannot be adapted to identify faulty paths in concurrent programs. While concurrency fault localization techniques have been proposed to analyze passing and failing executions obtained from running a set of test cases to identify faulty access patterns, they primarily focus on using statistical analysis. We present a novel approach to fault localization using feature selection techniques from machine learning. Our insight is that the concurrency access patterns obtained from a large volume of coverage data generally constitute high-dimensional data sets, yet existing statistical analysis techniques for fault localization are usually applied to low-dimensional data sets. Each additional failing or passing run can provide more diverse information, which can help localize faulty concurrency access patterns in code. The patterns with maximum feature diversity information can point to the most suspicious pattern. We then apply a data mining technique to identify the interleaving patterns that occur most frequently and to provide the possible faulty paths. We also evaluate the effectiveness of fault localization using test suites generated from different test adequacy criteria. We have evaluated Cadeco on 10 real-world multi-threaded Java applications. Results indicate that Cadeco outperforms state-of-the-art approaches for localizing concurrency faults.