210 research outputs found

    BIG DATA MINING FOR INTERESTING PATTERNS WITH MAP REDUCE TECHNIQUE

    Many data mining algorithms search for interesting patterns in transactional databases of precise data. Frequent pattern mining is a technique for finding frequently occurring items. Most existing techniques find all interesting patterns in a collection of precise data, where the items occurring in each transaction are known to the system with certainty. In many real-time applications, however, users are interested in only a tiny portion of the large set of frequent patterns. The proposed user-constrained mining approach helps find the frequent patterns in which the user is interested, and does so efficiently by applying user constraints to collections of uncertain data. Users specify their interests in the form of constraints, and the approach uses the MapReduce model to find the uncertain frequent patterns that satisfy the user-specified constraints.
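
    The constrained, MapReduce-style flow described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it handles single items only, runs in one process rather than on a cluster, and assumes that "frequent" means expected support (the sum of an item's existential probabilities) at or above a threshold. The names `map_phase`, `reduce_phase`, `frequent_items`, and `min_exp_support` are hypothetical.

```python
from collections import defaultdict


def map_phase(transaction, constraint):
    """Emit (item, probability) pairs for items that pass the user constraint.

    `transaction` maps each item to its existential probability (assumed model).
    """
    for item, prob in transaction.items():
        if constraint(item):
            yield item, prob


def reduce_phase(pairs):
    """Sum the probabilities per item, giving each item's expected support."""
    totals = defaultdict(float)
    for item, prob in pairs:
        totals[item] += prob
    return dict(totals)


def frequent_items(db, constraint, min_exp_support):
    """Return constrained items whose expected support meets the threshold."""
    pairs = (pair for t in db for pair in map_phase(t, constraint))
    expected = reduce_phase(pairs)
    return {i: s for i, s in expected.items() if s >= min_exp_support}


# Toy uncertain database: each transaction maps item -> existential probability.
db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"b": 0.25, "c": 0.5},
]
result = frequent_items(db, constraint=lambda item: item != "c", min_exp_support=1.0)
```

    Applying the constraint in the map step means that items the user does not care about are never shuffled to the reducer, which is the main saving such an approach aims for.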

    FP-Growth Tree Based Algorithms Analysis: CP-Tree and K Map

    We propose a novel frequent-pattern tree (FP-tree) structure. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns; it is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported frequent-pattern mining methods. The FP-tree method is an efficient algorithm for mining frequent patterns in association mining, whether the patterns are long or short. By using a compact tree structure and a partitioning-based, divide-and-conquer search method, it substantially reduces search costs. One way to address the problem is to use multiple CPUs or to reduce memory usage. This approach can apparently decrease the costs of exchanging and combining control information, and the algorithm complexity is also greatly decreased, so the problem is solved efficiently. Even when a multi-CPU technique is adopted, however, it mainly raises the hardware requirements, and the performance improvement is still limited. Is there any other way to reduce these costs in FP-tree construction?
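
    For readers unfamiliar with the structure, the following is a minimal sketch of textbook FP-tree construction, not the CP-tree or K-map variants the title refers to. It omits the header table and node links that FP-growth needs for mining, and the names `Node` and `build_fp_tree` are hypothetical.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a prefix tree over frequent items sorted by descending frequency."""
    # Pass 1: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Pass 2: insert each transaction's frequent items in a fixed global order,
    # so that transactions sharing a prefix share tree nodes.
    root = Node(None, None)
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),  # ties broken alphabetically
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


transactions = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
root, frequent = build_fp_tree(transactions, min_support=2)
```

    Because every transaction here contains `b`, the tree has a single branch under the root, which illustrates the compactness the abstract attributes to the structure.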

    Item-centric mining of frequent patterns from big uncertain data


    Model-based probabilistic frequent itemset mining

    Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. To handle a large amount of imprecise information, uncertain databases have recently been developed. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose novel methods that capture the itemset mining process as a probability distribution function, taking two models into account: the Poisson distribution and the normal distribution. These model-based approaches extract frequent itemsets with a high degree of accuracy and support large databases. We apply our techniques to improve the performance of the algorithms for (1) finding itemsets whose frequentness probabilities are larger than some threshold and (2) mining itemsets with the {Mathematical expression} highest frequentness probabilities. Our approaches support both tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate and four orders of magnitude faster than previous approaches. In further theoretical and experimental studies, we give an intuition as to which model-based approach best fits different types of data sets. © 2012 The Author(s).
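
    The idea of replacing an exact computation with a Poisson or normal model can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: transactions are assumed independent, `probs[i]` is the assumed probability that transaction i contains the itemset, and the function names are hypothetical. Under these assumptions the support count is a sum of independent Bernoulli variables, so the frequentness probability Pr[support >= minsup] can be computed exactly by dynamic programming or approximated from the sum's mean and variance.

```python
import math


def exact_frequentness(probs, minsup):
    """Pr[support >= minsup], exact, via dynamic programming in O(n^2) time."""
    pmf = [1.0]  # pmf[k] = Pr[support == k] over the transactions seen so far
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # itemset absent from this transaction
            nxt[k + 1] += mass * p       # itemset present in this transaction
        pmf = nxt
    return sum(pmf[minsup:])


def poisson_frequentness(probs, minsup):
    """Approximate the support by a Poisson variable with the same mean."""
    lam = sum(probs)
    term = math.exp(-lam)  # Pr[X == 0]
    cdf = 0.0
    for k in range(minsup):  # accumulate Pr[X <= minsup - 1]
        cdf += term
        term *= lam / (k + 1)
    return 1.0 - cdf


def normal_frequentness(probs, minsup):
    """Approximate the support by a normal variable (continuity-corrected)."""
    mu = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        return 1.0 if mu >= minsup else 0.0
    z = (minsup - 0.5 - mu) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the normal


exact = exact_frequentness([0.5] * 100, 50)
approx = normal_frequentness([0.5] * 100, 50)
```

    Both approximations need only one pass over the probabilities, whereas the exact recurrence is quadratic in the number of transactions. As a general rule, the Poisson model tends to suit many small probabilities and the normal model suits large sums with non-negligible variance.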

    Mining Frequent Itemsets for Evolving Database Involving Insertion

    Mining frequent itemsets is one of the popular tasks in data mining. There are many applications, such as location-based services, sensor monitoring systems, and data integration, in which the content of transactions is uncertain in nature. This creates the need for uncertain data mining. Frequent itemset mining in uncertain transaction databases differs semantically and computationally from the techniques applied to standard certain databases. The goal of the proposed model is to address the problem of extracting frequent itemsets from evolving databases using Possible World Semantics (PWS). Because an evolving database contains an exponential number of possible worlds, the mining process can be modeled as a Poisson Binomial Distribution (PBD). In this work, an Apriori-based PFI mining algorithm and an approximate incremental mining algorithm are developed. The approximate incremental mining algorithm can efficiently and accurately discover frequent itemsets. The work also addresses the issue of maintaining mining results for uncertain databases. DOI: 10.17762/ijritcc2321-8169.150615

    Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases

    Uncertain sequence databases are widely used to model data with inaccurate or imprecise timestamps in many real-world applications. In this paper, we use uniform distributions to model uncertain timestamps and adopt possible world semantics to interpret a temporal uncertain database. We design an incremental approach to manage temporal uncertainty efficiently, which is integrated into the classic pattern-growth SPM algorithm to mine uncertain sequential patterns. Extensive experiments show that our algorithm performs well in both efficiency and scalability.