    Sequential Patterns Post-processing for Structural Relation Patterns Mining

    Sequential pattern mining is an important data-mining technique used to identify frequently observed sequential occurrences of items across ordered transactions over time. It has been extensively studied in the literature, and a wide variety of algorithms exists. However, more complex structural patterns are often hidden behind sequences. This article begins by introducing a model for the representation of sequential patterns, the Sequential Patterns Graph, which motivates the search for new structural relation patterns. An integrative framework for the discovery of these patterns, Postsequential Patterns Mining, is then described, which underpins the postprocessing of sequential patterns. A corresponding data-mining method based on sequential patterns postprocessing is proposed and shown to be effective in the search for concurrent patterns. Experiments conducted on three component algorithms demonstrate that sequential patterns-based concurrent patterns mining provides an efficient method for structural knowledge discovery.
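
To make the notion of a frequent sequential pattern concrete, the following is a minimal brute-force sketch in Python. It is not the article's method and does not build a Sequential Patterns Graph; it only counts how many input sequences contain each ordered subsequence. The function name `frequent_sequences`, the `max_len` cap, and the simplification of each transaction to a single item (rather than an itemset) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_sequences(sequences, min_support, max_len=3):
    """Return ordered subsequences contained in at least `min_support` sequences.

    Assumption: every element of a sequence is a single item, not an itemset.
    Brute force, so only suitable for small illustrative inputs.
    """
    counts = Counter()
    for seq in sequences:
        # combinations() keeps the original order, so each tuple is a
        # subsequence of `seq` (gaps between items are allowed).
        seen = set()
        for length in range(1, max_len + 1):
            seen.update(combinations(seq, length))
        # Each input sequence contributes at most 1 to a pattern's support.
        counts.update(seen)
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
    for pattern, support in sorted(frequent_sequences(data, min_support=2).items()):
        print(pattern, support)
```

Practical algorithms avoid enumerating every subsequence by pruning candidates whose prefixes are already infrequent; the sketch above skips that step for brevity.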

    Problem-Solving Knowledge Mining from Users’ Actions in an Intelligent Tutoring System

    In an intelligent tutoring system (ITS), the domain expert should provide relevant domain knowledge to the tutor so that it will be able to guide the learner during problem solving. However, in several domains, this knowledge is not predetermined and should be captured or learned from expert users as well as intermediate and novice users. Our hypothesis is that knowledge discovery (KD) techniques can help to build this domain intelligence in ITS. This paper proposes a framework to capture problem-solving knowledge using a promising approach of data and knowledge discovery based on a combination of sequential pattern mining and association rules discovery techniques. The framework has been implemented and is used to discover new meta knowledge and rules in a given domain which then extend domain knowledge and serve as problem space allowing the intelligent tutoring system to guide learners in problem-solving situations. Preliminary experiments have been conducted using the framework as an alternative to a path-planning problem solver in CanadarmTutor.

    Market Basket Analysis in the Financial Sector – A Customer Centric Approach

    Organizations often struggle to implement data mining projects successfully. This is often because they are influenced by success stories of others that glamorize the outcome of successful initiatives while understating the persistent rigour and diligence required. Although process models exist for the knowledge discovery process, their focus is often on outlining the activities that must be done and not on describing how they should be done. While there is some research addressing how to carry out the various tasks in the phases, the data preparation phase is thought to be the most challenging and is often described as an art rather than a science. In this study we apply a multi-phased integrated knowledge discovery and data mining process model (IKDDM) to a data set from the financial sector and present a new approach to data preparation for Sequential Patterns (SP) that facilitated the identification of customer-focused patterns rather than product-focused patterns in the modelling phase.

    Efficient K-Mean Algorithm for Large Dataset

    The term data mining refers to the discovery of knowledge from large amounts of data. Many software packages, known as data mining tools, have been developed for knowledge discovery; they draw on statistics, machine learning, and neural networks. K-means and K-medoids are among the simplest and most widely used partition-based unsupervised learning algorithms for solving the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Stored data is used to locate data in predetermined groups called classes. Data items grouped according to logical relationships or consumer preferences are called clusters. Data can be mined to identify associations, and to anticipate behavior patterns and trends, called sequential patterns.
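
As an illustration of the partitioning procedure mentioned above, here is a minimal sketch of standard K-means (Lloyd's iteration) in plain Python. It is the textbook baseline, not the efficient variant the paper proposes, whose details are not given in the abstract. The function names, the choice of the first `k` points as initial centroids, and the iteration cap are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100):
    """Cluster `points` into `k` groups with Lloyd's algorithm.

    Assumption: the first k points serve as initial centroids, which keeps
    the sketch deterministic; real implementations usually randomize this.
    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    centers, assignment = kmeans(data, k=2)
    print(centers, assignment)
```

Each iteration compares every point with every centroid, so the cost per pass grows with the number of points times `k`; that per-pass cost is what makes large data sets expensive for the plain algorithm.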

    Data Mining Approach for Amino Acid Sequence Classification

    Computerized applications are employed all around the world, and an enormous amount of data is collected. The essential information contained in large amounts of data is attracting scholars from a variety of disciplines to examine how to extract the hidden knowledge inside them. The technique of obtaining or mining usable and valuable knowledge from enormous amounts of data is known as data mining. Text mining, picture mining, sequential pattern mining, web mining, and so on are all examples of data mining fields. Sequence mining is one of the most important technologies in this field, as it aids in the discovery of sequential connections in data. Sequence mining is used in a variety of applications, including analysis of customers' buying trends, analysis of web access trends, atmospheric observation, amino acid sequences, gene sequencing, and so on. Sequence mining techniques are utilized in protein and DNA analysis for sequence alignment, pattern searching, and pattern categorization. Within amino acid sequence analysis, researchers are showing interest in amino acid sequence categorization, which can find recurrent patterns in homologous proteins. This study describes the methods used in numerous studies to categorize proteins and gives an overview of the most important sequence classification techniques.

    Causal discovery with Point of Sales data

    [ES] GfK owns the world’s largest retail panel within the tech and durable goods industries. The panel consists of weekly Point of Sales (PoS) data, such as price and sales units data at store level. From PoS data and other data, GfK derives insights and indicators to generate recommendations with regard to, e.g., pricing, distribution or assortment optimization of tech and durable goods products. By combining PoS data and business domain knowledge, we show how causal discovery can be done by applying the method of invariant causal prediction (ICP). Causal discovery, in essence, means learning the actual cause and effect relations between the involved variables from data. After finding such a causal structure, one can try to further specify the function classes between those identified cause-effect pairs. Such a model could then be used to predict under intervention (predict when the underlying data generating mechanism changes) and to optimize and calculate counterfactual effects, given current and past data. In our development, we combine recent achievements in causal discovery research with PoS data structure and business domain knowledge (in the form of business rules). The key delivery of this presentation is to show fundamental differences between a causal model and a machine learning model. We further explain the advantages of combining a causal model with a machine learning model and why causal information is key to providing explainable prescriptive analytics. Furthermore, we demonstrate how to apply ICP (for sequential data) to context-specific PoS data to achieve improved models for sales unit predictions. As a result, we obtain a model for sales units that is on the one hand derived from observed data and on the other hand driven by business knowledge. Such a refined prediction model could then be used to stabilize and support other machine learning models that can be used for generating prescriptive analytics.

    Gmeiner, P. (2020). Causal discovery with Point of Sales data. Editorial Universitat Politècnica de València. http://hdl.handle.net/10251/149590OC