313 research outputs found

    An Efficient Algorithm for Mining Frequent Sequence with Constraint Programming

    The main advantage of Constraint Programming (CP) approaches for sequential pattern mining (SPM) is their modularity, which includes the ability to add new constraints (regular expressions, length restrictions, etc.). The current best CP approach for SPM uses a global constraint (module) that computes the projected database and enforces the minimum frequency; it does this with a filtering algorithm similar to the PrefixSpan method. However, the resulting system is not as scalable as some of the most advanced mining systems, such as Zaki's cSPADE. We show how, using techniques from both data mining and CP, one can use a generic constraint solver and yet outperform existing specialized systems. This is mainly due to two improvements in the module that computes the projected frequencies. First, computing the projected database can be sped up by pre-computing the positions at which a symbol can become unsupported by a sequence, which avoids scanning the full sequence each time. Second, a backtracking-aware data structure, inspired by the trailing used in CP solvers, allows fast incremental storing and restoring of the projected database. Detailed experiments show that this approach outperforms existing CP as well as specialized systems for SPM, and that the gain in efficiency translates directly into increased efficiency for other settings such as mining with regular expressions. Comment: frequent sequence mining, constraint programmin
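
    The first improvement can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `last_positions` and `projected_frequencies`, the toy database, and the representation of a projection as a mapping from sequence id to the index of the last matched symbol are all assumptions made for illustration. The idea shown is that, once the last position of every symbol in a sequence is known, the symbols still present after a given position can be listed without rescanning the sequence.

```python
from collections import Counter


def last_positions(sequence):
    """Return (symbol, last_index) pairs sorted by decreasing last_index."""
    last = {}
    for i, sym in enumerate(sequence):
        last[sym] = i  # later occurrences overwrite earlier ones
    return sorted(last.items(), key=lambda kv: -kv[1])


def projected_frequencies(last_pos_lists, projection):
    """Count how many projected sequences still contain each symbol.

    projection maps a sequence id to the index of the last matched
    prefix symbol in that sequence (hypothetical representation).
    """
    freq = Counter()
    for sid, pos in projection.items():
        for sym, last in last_pos_lists[sid]:
            if last <= pos:
                # The list is sorted by decreasing last index, so no
                # remaining symbol can occur after pos: stop early.
                break
            freq[sym] += 1
    return freq


# Toy database (assumed for illustration only).
database = [list("abcab"), list("bca"), list("acb")]
last_pos_lists = [last_positions(s) for s in database]

# Projection on the prefix <a>: index of the first 'a' in each sequence.
projection = {0: 0, 1: 2, 2: 0}
freq = projected_frequencies(last_pos_lists, projection)
print(dict(freq))
```

    The last-position lists are computed once per sequence, so each frequency count touches only the symbols that can still extend the prefix.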

    Problem-Solving Knowledge Mining from Users’ Actions in an Intelligent Tutoring System

    In an intelligent tutoring system (ITS), the domain expert should provide relevant domain knowledge to the tutor so that it will be able to guide the learner during problem solving. However, in several domains, this knowledge is not predetermined and should be captured or learned from expert users as well as intermediate and novice users. Our hypothesis is that knowledge discovery (KD) techniques can help to build this domain intelligence in ITS. This paper proposes a framework to capture problem-solving knowledge using a promising approach of data and knowledge discovery based on a combination of sequential pattern mining and association rules discovery techniques. The framework has been implemented and is used to discover new meta knowledge and rules in a given domain, which then extend domain knowledge and serve as a problem space allowing the intelligent tutoring system to guide learners in problem-solving situations. Preliminary experiments have been conducted using the framework as an alternative to a path-planning problem solver in CanadarmTutor

    Constraint-based sequence mining using constraint programming

    The goal of constraint-based sequence mining is to find sequences of symbols that are included in a large number of input sequences and that satisfy some constraints specified by the user. Many constraints have been proposed in the literature, but a general framework is still missing. We investigate the use of constraint programming as a general framework for this task. We first identify four categories of constraints that are applicable to sequence mining. We then propose two constraint programming formulations. The first formulation introduces a new global constraint called exists-embedding. This formulation is the most efficient but does not support one type of constraint. To support such constraints, we develop a second formulation that is more general but incurs more overhead. Both formulations can use the projected database technique used in specialised algorithms. Experiments demonstrate the flexibility towards constraint-based settings and compare the approach to existing methods. Comment: In Integration of AI and OR Techniques in Constraint Programming (CPAIOR), 201
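
    The notion of a pattern being "included in" an input sequence can be sketched in a few lines of Python. This is a plain procedural check, not the exists-embedding global constraint from the paper; the names `find_embedding` and `frequency` and the toy database are assumptions for illustration. A pattern is included in a sequence when its symbols occur in the same order, possibly with gaps.

```python
def find_embedding(pattern, sequence):
    """Return the leftmost positions matching pattern in order, or None."""
    positions = []
    start = 0
    for sym in pattern:
        try:
            # Leftmost occurrence of sym at or after index start.
            idx = sequence.index(sym, start)
        except ValueError:
            return None  # sym does not occur in the remaining suffix
        positions.append(idx)
        start = idx + 1
    return positions


def frequency(pattern, database):
    """Number of database sequences that include the pattern."""
    return sum(find_embedding(pattern, seq) is not None for seq in database)


# Toy database (assumed for illustration only).
database = [list("abcab"), list("bca"), list("acb")]
print(find_embedding(list("ac"), database[0]))
print(frequency(list("ab"), database))
```

    A pattern is then frequent when `frequency` reaches a user-chosen threshold; user constraints such as a maximum pattern length would be additional checks on the pattern itself.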

    Improved PrefixSpan Algorithm for Efficient Processing of Large Data

    The PrefixSpan (Prefix-projected Sequential pattern mining) algorithm is a very well-known algorithm for sequential data mining. It extracts sequential patterns through the pattern-growth method. The algorithm performs very well on small datasets, but as the size of the dataset increases, the overall time for finding the sequential patterns also increases, so the efficiency of PrefixSpan is reduced when processing large data. The cost of constructing the projected dataset is also high, which ultimately affects memory utilization. DOI: 10.17762/ijritcc2321-8169.15072
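
    For readers unfamiliar with pattern growth, the following is a minimal sketch of baseline PrefixSpan, not the improved variant this paper proposes. It assumes sequences of single symbols (no itemsets) and represents each projected sequence as a (sequence id, start index) pair rather than a copied suffix; the function name and toy data are illustrative.

```python
from collections import defaultdict


def prefixspan(database, min_support):
    """Return (pattern, support) pairs for all frequent sequential patterns."""
    results = []

    def mine(prefix, projected):
        # projected: list of (sequence_id, start) pairs; the suffix of
        # database[sequence_id] from index start is the projected sequence.
        occurrences = defaultdict(list)
        for sid, start in projected:
            seen = set()
            seq = database[sid]
            for i in range(start, len(seq)):
                sym = seq[i]
                if sym not in seen:
                    # Keep only the first occurrence per sequence so that
                    # support counts sequences, not occurrences.
                    seen.add(sym)
                    occurrences[sym].append((sid, i + 1))
        for sym in sorted(occurrences):
            entries = occurrences[sym]
            if len(entries) >= min_support:
                pattern = prefix + [sym]
                results.append((pattern, len(entries)))
                mine(pattern, entries)  # grow the pattern recursively

    mine([], [(sid, 0) for sid in range(len(database))])
    return results


# Toy database (assumed for illustration only).
database = [list("abc"), list("ac"), list("bc")]
for pattern, support in prefixspan(database, min_support=2):
    print("".join(pattern), support)
```

    Each recursive call builds a new projected database, which is the cost the abstract refers to when it says that constructing the projected dataset becomes expensive on large data.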

    Web Page Recommendation Approach Using Weighted Sequential Patterns And Markov Model

    Web page recommendation aims to predict the user's navigation with the help of web usage mining techniques. Currently, researchers focus their attention on developing web page recommendation algorithms using well-known pattern mining techniques. Here, we present a web page recommendation algorithm using weighted sequential patterns and a Markov model. To mine the weighted sequential patterns, we have modified the PrefixSpan algorithm to incorporate weighting constraints such as time spent and recency of visit. The weighted sequential patterns are then used to construct the recommendation model using a Patricia trie-based tree structure. Finally, recommendations for current users are made with the help of the Markov model, a probabilistic model that enables reasoning and computation that would otherwise be intractable. For experimentation, a synthetic dataset is used to analyze the performance of the W-PrefixSpan algorithm as well as the web page recommendation algorithm. According to the results, the memory required by the W-PrefixSpan algorithm is less than 50% of the memory needed by the PrefixSpan algorithm
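
    The Markov-model step can be illustrated with a small first-order sketch in Python. It is an assumption-laden simplification: it estimates transition probabilities directly from raw sessions and omits the weighting and the Patricia trie described in the abstract; the function names and the example sessions are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-next-page transitions over all sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def recommend(counts, page, k=1):
    """Return up to k (next_page, probability) pairs for the given page."""
    followers = counts.get(page, Counter())
    total = sum(followers.values())
    if total == 0:
        return []  # page never seen with a successor
    return [(nxt, n / total) for nxt, n in followers.most_common(k)]


# Hypothetical browsing sessions (assumed for illustration only).
sessions = [
    ["home", "news", "sport"],
    ["home", "news", "weather"],
    ["home", "shop"],
]
counts = build_transitions(sessions)
print(recommend(counts, "home", k=2))
```

    In this simplification the recommended page is the most probable successor of the current page alone; a weighted variant would scale each transition count by factors such as time spent or recency.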

    Bidirectional Growth based Mining and Cyclic Behaviour Analysis of Web Sequential Patterns

    Web sequential patterns are important for analyzing and understanding users' behaviour to improve the quality of service offered by the World Wide Web. Web prefetching is one such technique; it uses prefetching rules derived through Cyclic Model Analysis of the mined Web sequential patterns. The prediction is more accurate and the results of prefetching more satisfying if we use a highly efficient and scalable mining technique such as the Bidirectional Growth based Directed Acyclic Graph. In this paper, we propose a novel algorithm called Bidirectional Growth based mining Cyclic behavior Analysis of web sequential Patterns (BGCAP) that effectively combines these strategies to generate prefetching rules in the form of 2-sequence patterns with Periodicity and threshold of Cyclic Behaviour. These rules can be used to prefetch Web pages effectively, thus reducing the users' perceived latency. As BGCAP is based on Bidirectional pattern growth, it performs only (log n+1) levels of recursion for mining n Web sequential patterns. Our experimental results show that generating prefetching rules with BGCAP is 5-10 percent faster for different data sizes and 10-15 percent faster for a fixed data size than with TD-Mine. In addition, BGCAP generates about 5-15 percent more prefetching rules than TD-Mine. Comment: 19 page