2,457 research outputs found

    Market basket analysis: trend analysis of association rules in different time periods

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research e CRM. Market basket analysis, a data mining technique used in marketing, finds associations between items or itemsets; those associations can then be used to analyze consumer behavior. This research adds the dimension of time, because customers' habits and behavior change over time. For example, people wear warm clothes in winter and light clothes in summer, and purchase behavior shifts in the same way. We study the problem of discovering association rules that display regular cyclic variation over time. Solving this problem lets us assess changing trends in the purchase behavior of customers in a retail market through the changing trends of the association rules, that is, through the interaction between association rules and time. We worked on transactional data from a Belgian retail company; the results can help the company build time-period-specific marketing and promotional strategies to increase its profit.
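
As a rough illustration of tracking one association rule across time periods, the sketch below computes the support and confidence of a rule separately for each period. The function name `rule_metrics_by_period`, the period labels, and the toy transactions are assumptions made for this example; they are not taken from the dissertation or its Belgian retail data.

```python
from collections import defaultdict

def rule_metrics_by_period(transactions, antecedent, consequent):
    """Return {period: (support, confidence)} for the rule antecedent -> consequent.

    transactions: iterable of (period_label, set_of_items) pairs.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    totals = defaultdict(int)       # transactions seen in each period
    ante_counts = defaultdict(int)  # transactions containing the antecedent
    both_counts = defaultdict(int)  # transactions containing antecedent and consequent
    for period, items in transactions:
        totals[period] += 1
        if antecedent.issubset(items):
            ante_counts[period] += 1
            if consequent.issubset(items):
                both_counts[period] += 1
    result = {}
    for period, n in totals.items():
        support = both_counts[period] / n
        # Confidence is undefined when the antecedent never occurs; report 0.0 here.
        confidence = (both_counts[period] / ante_counts[period]
                      if ante_counts[period] else 0.0)
        result[period] = (support, confidence)
    return result

# Hypothetical toy data: (period, basket) pairs.
toy = [
    ("winter", {"coat", "scarf"}),
    ("winter", {"coat", "scarf", "gloves"}),
    ("winter", {"coat"}),
    ("winter", {"tea"}),
    ("summer", {"coat"}),
    ("summer", {"t-shirt", "sunscreen"}),
]
metrics = rule_metrics_by_period(toy, {"coat"}, {"scarf"})
```

Comparing the per-period values of the same rule is one simple way to see whether its strength varies with the season.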

    Discovering E-commerce Sequential Data Sets and Sequential Patterns for Recommendation

    E-commerce recommendation accuracy improves when more complex sequential patterns of user purchase behavior are learned and included in the user-item matrix input, making it more informative before collaborative filtering. Existing recommendation systems that use mining techniques with some sequences include LiuRec09, ChoiRec12, SuChenRec15, and HPCRec18. LiuRec09 clusters users with similar clickstream sequence data, then uses association rule mining and segmentation-based collaborative filtering to select the Top-N neighbors from the cluster to which a target user belongs. ChoiRec12 derives a user’s rating for an item as the percentage of the user’s total purchases that the item accounts for. SuChenRec15 is based on clickstream sequence similarity, using frequency of item purchases, duration of time spent, and clickstream path. HPCRec18 uses historical item purchase frequency and the consequential bond between clicks and purchases of items to enrich the user-item matrix qualitatively and quantitatively. None of these systems integrates sequential patterns of customer clicks or purchases to capture more complex sequential purchase behavior. This thesis proposes an algorithm called HSPRec (Historical Sequential Pattern Recommendation System), which first generates an E-commerce sequential database from historical purchase data using another new algorithm, SHOD (Sequential Historical Periodic Database Generation). Second, it mines frequent sequential purchase patterns and uses them, together with the consequential bonds between clicks and purchases, to (i) improve the user-item matrix quantitatively and (ii) further enrich the ratings qualitatively using historical purchase frequencies. Third, the improved matrix is used as input to a collaborative filtering algorithm for better recommendations. Experimental results with mean absolute error, precision, and recall show that the proposed sequential pattern mining-based recommendation system, HSPRec, provides more accurate recommendations than the tested existing systems.
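
The purchase-share rating attributed to ChoiRec12 in the abstract above can be sketched as follows. This is only one reading of that single sentence: the function name `implicit_ratings` and the toy purchase log are hypothetical, and the values are returned as fractions rather than percentages.

```python
from collections import Counter

def implicit_ratings(purchases):
    """Map each user to {item: share of that user's purchases}.

    purchases: iterable of (user, item) pairs, one pair per purchase event.
    """
    per_user = {}
    for user, item in purchases:
        per_user.setdefault(user, Counter())[item] += 1
    return {
        user: {item: count / sum(counts.values()) for item, count in counts.items()}
        for user, counts in per_user.items()
    }

# Hypothetical purchase log.
log = [("u1", "milk"), ("u1", "milk"), ("u1", "bread"), ("u1", "eggs"), ("u2", "bread")]
ratings = implicit_ratings(log)
```

Items a user never bought simply have no entry, which is the sparsity that the sequential-pattern enrichment described in the abstract is meant to reduce.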

    Periodic Pattern Mining: Algorithms and Applications

    Owing to a large number of applications, periodic pattern mining has been extensively studied for over a decade. A periodic pattern is a pattern that repeats itself with a specific period in a given sequence. Periodic patterns can be mined from datasets such as biological sequences, continuous and discrete time series data, spatiotemporal data, and social networks. Periodic patterns are classified based on different criteria. Based on frequency of occurrence, they are categorized as frequent periodic patterns and statistically significant patterns. Frequent periodic patterns are in turn classified as perfect and imperfect periodic patterns, full and partial periodic patterns, synchronous and asynchronous periodic patterns, dense periodic patterns, and approximate periodic patterns. This paper presents a survey of state-of-the-art research on periodic pattern mining algorithms and their application areas. The merits and demerits of these algorithms are discussed. The paper also presents a brief overview of algorithms that can be applied to specific types of datasets, such as spatiotemporal data and social networks.

    Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules

    In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. A comparison of the GA-based approach with a traditional approach on real-world Web log data discovered rules that were lost with the traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. (C) 2013 Elsevier B.V. All rights reserved

    Literature Review on Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

    This paper presents a survey on finding itemsets with high utility. Many algorithms exist for this task, but they tend to produce a large number of candidate itemsets, which degrades mining performance in terms of execution time. Here we mainly focus on two algorithms, utility pattern growth (UP-Growth) and UP-Growth+, which mine high utility itemsets using effective methods for pruning candidate itemsets. Information about high utility itemsets is kept in a special compact tree structure called the UP-Tree, which improves mining performance and avoids scanning the original database repeatedly: candidate itemsets are generated with only two scans of the database. UP-Growth+ reduces the number of candidates effectively and has better runtime performance than other algorithms, especially when databases contain many long transactions. Utility-based data mining is a new research area concerned with all types of utility factors in data mining processes; it aims to integrate utility considerations into both predictive and descriptive data mining tasks. High utility itemset mining belongs to utility-based descriptive data mining and is used for finding the itemsets that contribute most to the total utility of a database.
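
To make the notion of itemset utility concrete, the sketch below uses the common definition (purchase quantity times unit profit, summed over the transactions that contain the whole itemset) and finds high utility itemsets by brute-force enumeration. It is not UP-Growth or UP-Growth+ and builds no UP-Tree; the function names, the profit table, and the toy database are assumptions made for illustration.

```python
from itertools import combinations

def itemset_utility(itemset, database, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    itemset = frozenset(itemset)
    total = 0
    for transaction in database:  # each transaction maps item -> purchased quantity
        if itemset.issubset(transaction):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total

def high_utility_itemsets(database, unit_profit, min_utility):
    """Brute force: test every non-empty itemset (exponential, toy data only)."""
    items = sorted({item for transaction in database for item in transaction})
    found = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            utility = itemset_utility(candidate, database, unit_profit)
            if utility >= min_utility:
                found[candidate] = utility
    return found

# Hypothetical profit table and transactions.
unit_profit = {"a": 5, "b": 2, "c": 1}
database = [{"a": 1, "b": 2}, {"a": 2, "c": 3}, {"b": 1, "c": 1}]
huis = high_utility_itemsets(database, unit_profit, min_utility=10)
```

In this toy data the itemset `("a", "c")` reaches the threshold although `("c",)` alone does not. Utility is therefore not anti-monotone, so the usual frequent-itemset pruning cannot be applied directly, which is one reason candidate pruning is a central concern in this line of work.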

    Utilizing Index‑Based Periodic High Utility Mining to Study Frequent Itemsets

    Periodic High-Utility Itemset Mining (PHUIM) has gained significance because of its potential use in different applications. Conventional utility mining algorithms focus on an itemset’s utility value rather than on its periodicity in the transactions. In this work, a MEAN periodicity measure is added to the minimum (MIN) and maximum (MAX) periodicity measures to incorporate the periodicity feature into PHUIM. The MEAN periodicity measure brings a new dimension to the periodicity factor and is obtained by dividing the itemset’s period value by the total number of transactions in the dataset. The paper also proposes an algorithm, Index-Based Periodic High Utility Itemset Mining (IBPHUIM), that mines periodic high utility itemsets from the database using an indexing approach. The IBPHUIM algorithm employs a projection-based technique and an indexing procedure to improve memory efficiency and execution speed. The proposed model avoids redundant database scans by generating sub-databases using an indexing data structure. The IBPHUIM model was evaluated on test datasets, and the results show that it performs considerably better.

    Efficiently Mining Temporal Patterns in Time Series Using Information Theory


    Mining Profitable and Concise Patterns in Large-Scale Internet of Things Environments

    In recent years, high-utility itemset mining (HUIM) has been investigated extensively and studied in many applications, especially market basket analysis and related applications. Since current market basket scenarios also involve IoT equipment, i.e., sensors or smart devices, to collect information, it is necessary to consider the mining of high-utility itemsets (HUIs) in large-scale databases, especially in IoT situations. In this work, a GA-based MapReduce model known as GMR-Miner is presented for mining closed patterns with high utilization in large-scale databases. The k-means model is first adopted to group transactions by their correlation based on the frequency factor. A genetic algorithm (GA) is then used in the developed MapReduce framework to explore potential candidates in a limited time. The developed 3-tier MapReduce model can also be easily deployed in Spark to handle any large-scale database for knowledge discovery of closed patterns with high utilization. We created extensive experimental environments to evaluate the developed GMR-Miner against the well-known, state-of-the-art CLS-Miner. Our in-depth results show that GMR-Miner outperforms CLS-Miner on many criteria, i.e., memory usage, scalability, and runtime.

    Improving E-Commerce Recommendations using High Utility Sequential Patterns of Historical Purchase and Click Stream Data

    Recommendation systems aim not only to recommend products that suit consumers' tastes but also to generate higher revenue and increase customer loyalty for e-commerce companies (such as Amazon and Netflix). Recommendation systems can be improved if user purchase behaviour is used to improve the user-item matrix input to Collaborative Filtering (CF). This matrix is mostly sparse because, in real life, a customer will have bought only a few of the hundreds of thousands of products on the e-commerce shelf. Thus, existing systems such as Kim11Rec, HPCRec18, and HSPRec19 use customer behavior information to improve the accuracy of recommendations. Kim11Rec used behavior and navigation patterns that had not been used earlier. HPCRec18 used purchase frequency and the consequential bond between click and purchase data to improve the user-item frequency matrix. HSPRec19 converts historic click and purchase data to sequential data and enhances the user-item frequency matrix with the sequential pattern rules mined from the sequential data for input to the CF. HSPRec19 generates recommendations based on frequent sequential purchase patterns and does not capture whether the recommended items are also of high utility to the seller (e.g., whether they are more profitable). The thesis proposes the High Utility Sequential Pattern Recommendation System (HUSRec), an extension of HSPRec19 that replaces frequent sequential patterns with high utility sequential patterns. HUSRec generates a high utility sequential database from the ACM RecSys Challenge dataset using HUSDBG (High Utility Sequential Database Generator), and HUSPM (High Utility Sequential Pattern Miner) mines the high utility sequential pattern rules, which have at least the minimum sequence utility and can therefore yield high sales profits for the seller based on the quantity and price of items on a daily basis. This improves the accuracy of the recommendations. HUSRec also mines click sequential data using the PrefixSpan algorithm to obtain frequent sequential rules that suggest items where no purchase has happened, decreasing the sparsity of the user-item matrix and improving it as input to collaborative filtering. Experimental results with mean absolute error, precision, and graphs show that the proposed HUSRec system provides more accurate recommendations and higher revenue than the tested existing systems. Keywords: Data mining, Sequential pattern mining, Collaborative filtering, High utility pattern mining, E-commerce recommendation systems