    Evolving temporal association rules with genetic algorithms

    A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have been applied to association rule mining; we show the efficacy of extending this to another variant, temporal association rule mining. Our framework is an enhancement to existing temporal association rule mining methods as it employs a genetic algorithm to simultaneously search the rule space and temporal space. A methodology for validating the proposed framework isolates target temporal itemsets in synthetic datasets. The Iterative Rule Learning method successfully discovers these targets in datasets with varying levels of difficulty.

    PFPM: Discovering Periodic Frequent Patterns with Novel Periodicity Measures

    Periodic pattern mining is the task of discovering patterns that periodically appear in transactions. Typically, periodic pattern mining algorithms discard a pattern as nonperiodic if it has a single period greater than a user-defined maximal periodicity threshold. A major drawback of this approach is that it is not flexible, as a pattern can be discarded based on only one of its periods. In this chapter, we present a solution to this issue by proposing to discover periodic patterns using three measures: the minimum periodicity, the maximum periodicity, and the average periodicity. The combination of these measures has the advantage of being more flexible. Properties of these measures are studied. Moreover, an efficient algorithm named PFPM (Periodic Frequent Pattern Miner) is proposed to discover all frequent periodic patterns using these measures. An experimental evaluation on real data sets shows that the proposed PFPM algorithm is efficient and can filter a huge number of nonperiodic patterns to reveal only the desired periodic patterns.
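
To make the three measures concrete, here is a minimal Python sketch. It assumes one common definition of a pattern's periods: the gaps between consecutive transactions containing the pattern, plus the gap from the start of the database to the first occurrence and from the last occurrence to the end. That definition, the function names, and the toy data are assumptions for illustration; the abstract does not spell them out, and this is not the PFPM algorithm itself.

```python
from typing import List, Set, Tuple


def periods(transactions: List[Set[str]], itemset: Set[str]) -> List[int]:
    """Gaps between consecutive occurrences of `itemset` (assumed definition).

    Transaction ids are 1-based; 0 and len(transactions) act as boundaries.
    """
    n = len(transactions)
    tids = [tid for tid, tx in enumerate(transactions, start=1) if itemset <= tx]
    bounds = [0] + tids + [n]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def periodicity_measures(
    transactions: List[Set[str]], itemset: Set[str]
) -> Tuple[int, int, float]:
    """Return (minimum, maximum, average) periodicity of `itemset`."""
    ps = periods(transactions, itemset)
    return min(ps), max(ps), sum(ps) / len(ps)


# Hypothetical toy database of seven transactions.
db = [
    {"a", "b"},
    {"c"},
    {"a", "b", "c"},
    {"a"},
    {"b"},
    {"a", "b"},
    {"c"},
]

# {"a", "b"} occurs in transactions 1, 3 and 6, so its periods are [1, 2, 3, 1].
min_per, max_per, avg_per = periodicity_measures(db, {"a", "b"})
```

With a single maximum-periodicity threshold of 2, this pattern would be discarded because of its one period of 3, even though its average periodicity is only 1.75. Thresholding all three values is what gives the more flexible filter the abstract describes.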

    Mining High Utility Itemsets with Regular Occurrence

    High utility itemset mining (HUIM) plays an important role in the data mining community and in a wide range of applications. For example, in retail business it is used for finding sets of sold products that give high profit, low cost, etc. These itemsets can help improve marketing strategies, design promotions and advertisements, etc. However, since HUIM only considers the utility values of items and itemsets, it may not be sufficient to observe the product-buying behavior of customers, such as information related to "regular purchases of sets of products having a high profit margin". To address this issue, the occurrence behavior of itemsets (in terms of regularity) was investigated together with their utility values. Then, the problem of mining high utility itemsets with regular occurrence (MHUIR), which finds sets of co-occurring items with high utility values and regular occurrence in a database, was considered. An efficient single-pass algorithm, called MHUIRA, was introduced. A new modified utility-list structure, called NUL, was designed to efficiently maintain utility values and occurrence information and to increase the efficiency of computing the utility of itemsets. Experimental studies on real and synthetic datasets and complexity analyses are provided to show the efficiency of MHUIRA combined with NUL in terms of time and space usage for mining interesting itemsets based on regularity and utility constraints.
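
The two constraints can be illustrated with a brute-force Python sketch. It assumes that the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, and that regularity is the largest gap between consecutive occurrences (database boundaries included). Both definitions, the function name, and the data are assumptions for illustration; this is neither MHUIRA nor the NUL structure.

```python
from typing import Dict, List, Set, Tuple


def utility_and_regularity(
    transactions: List[Dict[str, int]],
    unit_profit: Dict[str, int],
    itemset: Set[str],
) -> Tuple[int, int]:
    """Return (total utility, regularity) of `itemset`.

    Each transaction maps item -> purchased quantity. A smaller regularity
    value means the itemset occurs more regularly.
    """
    total_utility = 0
    tids = []
    for tid, tx in enumerate(transactions, start=1):
        if all(item in tx for item in itemset):
            total_utility += sum(tx[item] * unit_profit[item] for item in itemset)
            tids.append(tid)
    bounds = [0] + tids + [len(transactions)]
    regularity = max(b - a for a, b in zip(bounds, bounds[1:]))
    return total_utility, regularity


# Hypothetical quantities per transaction and profit per unit.
sales = [
    {"a": 2, "b": 1},
    {"b": 3},
    {"a": 1, "b": 2, "c": 1},
    {"c": 2},
    {"a": 1, "b": 1},
]
profit = {"a": 5, "b": 2, "c": 1}

# {"a", "b"} appears in transactions 1, 3 and 5: utility 12 + 9 + 7 = 28,
# and the largest gap between occurrences is 2.
utility, regularity = utility_and_regularity(sales, profit, {"a", "b"})
```

An itemset would then be reported only if its utility meets a minimum threshold and its regularity stays under a maximum gap, both chosen by the user.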

    Mining top-k regular episodes from sensor streams

    The monitoring of human activities plays an important role in health-care applications and for the data mining community. Existing approaches work on activity recognition in sensor data streams. However, regular behaviors have not been studied. Thus, we introduce a new approach, TKRES, to discover the top-k most regular episodes from sensor streams. The top-k approach allows us to control the size of the output, thus preventing an overwhelming result analysis for the supervisor. TKRES is based on the use of a simple top-k list and a k-tree structure for maintaining the top-k episodes and their occurrence information. We also investigate and report the performance of TKRES on two real-life smart home datasets.

    Market basket analysis: trend analysis of association rules in different time periods

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM. Market basket analysis (a data mining technique in the field of marketing) is a method for finding associations between items or itemsets; based on those associations, consumer behavior can be analyzed. This research takes the variability of time into account, because customers' habits and behavior change over time. For example, people wear warm clothes in winter and light clothes in summer. Similarly, customers' purchase behavior changes over time. We study the problem of discovering association rules that display regular cyclic variation over time. Solving this problem allows us to assess the changing trends in the purchase behavior of customers in a retail market and to analyze how the association rules themselves change. In this research we study the interaction between association rules and time. We worked on transactional data of a Belgian retail company and analyzed the results, which will help the company build time-period-specific marketing strategies, promotional strategies, etc. to increase its profit.
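
As a rough illustration of how a rule's strength can vary between time periods, the following Python sketch computes the support and confidence of a single rule separately for each period. The period labels, the function name, and the toy baskets are hypothetical; the dissertation's actual procedure and data are not reproduced here.

```python
from typing import Dict, List, Set, Tuple


def rule_stats_by_period(
    baskets: List[Tuple[str, Set[str]]],
    antecedent: Set[str],
    consequent: Set[str],
) -> Dict[str, Tuple[float, float]]:
    """Map each period label to (support, confidence) of antecedent -> consequent."""
    counts: Dict[str, List[int]] = {}  # period -> [total, with antecedent, with both]
    for period, items in baskets:
        c = counts.setdefault(period, [0, 0, 0])
        c[0] += 1
        if antecedent <= items:
            c[1] += 1
            if consequent <= items:
                c[2] += 1
    return {
        period: (both / total, both / ante if ante else 0.0)
        for period, (total, ante, both) in counts.items()
    }


# Hypothetical baskets tagged with the season in which they were bought.
baskets = [
    ("winter", {"coat", "scarf"}),
    ("winter", {"coat", "scarf"}),
    ("winter", {"coat"}),
    ("winter", {"tea"}),
    ("summer", {"coat"}),
    ("summer", {"shorts"}),
    ("summer", {"shorts", "hat"}),
    ("summer", {"coat", "scarf"}),
]

stats = rule_stats_by_period(baskets, {"coat"}, {"scarf"})
```

In this toy data the rule coat -> scarf has support 0.5 in winter but 0.25 in summer. Tracking such per-period values over several cycles is one way to spot rules whose strength rises and falls regularly.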

    The Minimum Description Length Principle for Pattern Mining: A Survey

    This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.

    Discovering E-commerce Sequential Data Sets and Sequential Patterns for Recommendation

    In E-commerce, the accuracy of a recommendation system improves if more complex sequential patterns of user purchase behavior are learned and included in its user-item matrix input, making the matrix more informative before collaborative filtering. Existing recommendation systems that use mining techniques with some sequences include LiuRec09, ChoiRec12, SuChenRec15, and HPCRec18. The LiuRec09 system clusters users with similar clickstream sequence data, then uses association rule mining and segmentation-based collaborative filtering to select the Top-N neighbors from the cluster to which a target user belongs. ChoiRec12 derives a user's rating for an item as the percentage of the user's total purchases that the item accounts for. The SuChenRec15 system is based on clickstream sequence similarity, using the frequency of item purchases, the duration of time spent, and the clickstream path. HPCRec18 uses historical item purchase frequency and the consequential bond between clicks and purchases of items to enrich the user-item matrix qualitatively and quantitatively. None of these systems integrates sequential patterns of customer clicks or purchases to capture more complex sequential purchase behavior. This thesis proposes an algorithm called HSPRec (Historical Sequential Pattern Recommendation System), which first generates an E-commerce sequential database from historical purchase data using another new algorithm, SHOD (Sequential Historical Periodic Database Generation). Second, the thesis mines frequent sequential purchase patterns and uses these mined patterns, together with consequential bonds between clicks and purchases, to (i) improve the user-item matrix quantitatively and (ii) use historical purchase frequencies to further enrich ratings qualitatively. Third, the improved matrix is used as input to a collaborative filtering algorithm for better recommendations.
    Experimental results with mean absolute error, precision and recall show that the proposed sequential pattern mining-based recommendation system, HSPRec, provides more accurate recommendations than the tested existing systems.
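
The purchase-share rating attributed to ChoiRec12 above can be sketched in a few lines of Python. This follows only the one-sentence description in the abstract; the function name, the data, and the choice to return fractions in [0, 1] rather than percentages are assumptions.

```python
from typing import Dict


def purchase_share_ratings(
    purchases: Dict[str, Dict[str, int]]
) -> Dict[str, Dict[str, float]]:
    """Rate each item by its share of the user's total purchase count."""
    ratings: Dict[str, Dict[str, float]] = {}
    for user, counts in purchases.items():
        total = sum(counts.values())
        # A user with no purchases gets no implicit ratings.
        ratings[user] = (
            {item: n / total for item, n in counts.items()} if total else {}
        )
    return ratings


# Hypothetical purchase counts per user and item.
history = {
    "u1": {"milk": 3, "bread": 1},
    "u2": {"milk": 2},
}

matrix = purchase_share_ratings(history)
# u1 bought 4 items in total, so milk gets 0.75 and bread gets 0.25.
```

The resulting values can fill otherwise empty cells of a user-item matrix before collaborative filtering, which is the general idea the surveyed systems share.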

    Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules

    In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. In a comparison on real-world Web log data, the GA-based approach discovered rules that were lost with a traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. © 2013 Elsevier B.V. All rights reserved.