352 research outputs found

    Ensemble-based prediction of business processes bottlenecks with recurrent concept drifts

    Bottleneck prediction is an important sub-task of process mining that aims at optimizing the discovered process models by avoiding such congestion. This paper discusses ongoing work on incorporating recurrent concept drift in bottleneck prediction when applied to a real-world scenario. In the field of process mining, we develop a method of predicting whether bottlenecks are likely to appear, and which ones, based on data known before a case starts. We next introduce GRAEC, a carefully designed weighting mechanism to deal with concept drifts. The weighting decays over time and is extendable to adapt to seasonality in data. The methods are then applied to a simulation and to a real-world invoicing process in the field of installation services. The results show an improvement in prediction accuracy compared to retraining a model on the most recent data.
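
The abstract above mentions a weighting that decays over time and can be extended for seasonality. The sketch below illustrates that general idea only; it is not GRAEC. The function names, the exponential half-life form, and the cosine seasonal term are all assumptions made for illustration.

```python
import math


def decay_weight(age_days, half_life_days=30.0):
    """Exponential time decay: the weight halves every half_life_days (assumed form)."""
    return 0.5 ** (age_days / half_life_days)


def seasonal_weight(age_days, half_life_days=30.0, period_days=365.0, seasonal_share=0.5):
    """Blend plain decay with a term that peaks when the age is a multiple of the period.

    All parameters are hypothetical; seasonal_share sets how much of the
    weight comes from the recurring (seasonal) component.
    """
    base = decay_weight(age_days, half_life_days)
    # The cosine term is 1 at ages 0, period, 2*period, ... and 0 half a period away.
    seasonal = 0.5 * (1.0 + math.cos(2.0 * math.pi * age_days / period_days))
    return (1.0 - seasonal_share) * base + seasonal_share * seasonal
```

Under these assumptions, a training case observed one full period ago receives a larger weight than one observed half a period ago, even though it is older. This is one way a model could favour data from a recurring concept.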

    Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance

    International Mention in the doctoral degree. In recent years, the problem of concept drift has gained importance in the financial domain. The succession of manias, panics and crashes has stressed the nonstationary nature of financial markets and the likelihood of drastic structural changes in them. The most recent literature suggests the use of conventional machine learning and statistical approaches for this. However, these techniques are unable or slow to adapt to non-stationarities and may require re-training over time, which is computationally expensive and brings financial risks. This thesis proposes a set of adaptive algorithms to deal with high-frequency data streams and applies these to the financial domain. We present approaches to handle different types of concept drifts and perform predictions using up-to-date models. These mechanisms are designed to provide fast reaction times and are thus applicable to high-frequency data. The core experiments of this thesis are based on the prediction of the price movement direction at different intraday resolutions in the SPDR S&P 500 exchange-traded fund. The proposed algorithms are benchmarked against other popular methods from the data stream mining literature and achieve competitive results. We believe that this thesis opens good research prospects for financial forecasting during market instability and structural breaks. Results have shown that our proposed methods can improve prediction accuracy in many of these scenarios. Indeed, the results obtained are compatible with ideas against the efficient market hypothesis. However, we cannot claim that we can consistently beat buy and hold; therefore, we cannot reject the hypothesis. Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Gustavo Recio Isasi. Secretary: Pedro Isasi Viñuela. Member: Sandra García Rodrígue

    CONDA-PM -- A Systematic Review and Framework for Concept Drift Analysis in Process Mining

    Business processes evolve over time to adapt to changing business environments. This requires continuous monitoring of business processes to gain insights into whether they conform to the intended design or deviate from it. The situation when a business process changes while being analysed is denoted as Concept Drift. Its analysis is concerned with studying how a business process changes, in terms of detecting and localising changes and studying the effects of the latter. Concept drift analysis is crucial to enable early detection and management of changes, that is, whether to promote a change to become part of an improved process, or to reject the change and make decisions to mitigate its effects. Despite its importance, there exists no comprehensive framework for analysing concept drift types, affected process perspectives, and granularity levels of a business process. This article proposes the CONcept Drift Analysis in Process Mining (CONDA-PM) framework describing phases and requirements of a concept drift analysis approach. CONDA-PM was derived from a Systematic Literature Review (SLR) of current approaches analysing concept drift. We apply the CONDA-PM framework to current approaches to concept drift analysis and evaluate their maturity. Applying the CONDA-PM framework highlights areas where research is needed to complement existing efforts. Comment: 45 pages, 11 tables, 13 figures

    A Deep Learning Approach to Business Process Mining

    Competing and evolving markets force organisations to continuously monitor, evaluate, and optimise their business processes. To do the task at scale, organisations often turn to automatic mining of process execution logs constantly generated by various information systems. Many open-source and commercial tools have been developed in recent years to help organisations perform various process mining tasks using process execution logs (often called event logs), such as process discovery, conformance checking, and detecting drifts in processes. Compared to traditional process mining techniques such as Petri nets and Business Process Model and Notation (BPMN), deep learning methods such as Recurrent Neural Networks and Long Short-Term Memory (LSTM) in particular have proven to achieve better performance in terms of accuracy and generalising ability when predicting sequences of activities performed as part of business processes based on event logs. However, unlike traditional network-based process mining techniques that can be used to visually present all activity sequences of the discovered business process, existing deep learning-based methods for process mining lack a mechanism explaining how the activity sequence predictions are made. To address this limitation, this thesis proposes an extensible process mining solution that combines the benefits of interpretable graph-based methods and more accurate but implicit deep learning methods. 
    The main contributions of this research are: (i) building an LSTM model for predicting business process activity sequences from event logs that outperforms existing state-of-the-art deep learning solutions; (ii) proposing a graph-based approach to explaining the decision-making process of the LSTM model when predicting business process activity sequences; and (iii) developing methods for detecting and localising sudden concept drift in event logs (i.e., offline) and event streams (i.e., online) using deep learning and graph-based approaches. The proposed methods have been extensively evaluated by conducting experiments using real-life and artificial event logs and have been demonstrated to outperform existing state-of-the-art solutions in many cases.

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Best of Both Worlds: Combining Predictive Power with Interpretable and Explainable Results for Patient Pathway Prediction

    Proactively analyzing patient pathways can help healthcare providers to anticipate treatment-related risks, detect undesired outcomes, and allocate resources quickly. For this purpose, modern methods from the field of predictive business process monitoring can be applied to create data-driven models that capture patterns from past behavior to provide predictions about running process instances. Recent methods increasingly focus on deep neural networks (DNN) due to their superior prediction performance and their independence from process knowledge. However, DNNs generally have the disadvantage of showing black-box characteristics, which hampers their dissemination in critical environments such as healthcare. To this end, we propose the design of HIXPred, a novel artifact combining predictive power with explainable results for patient pathway predictions. We instantiate HIXPred and apply it to a real-life healthcare use case for evaluation and demonstration purposes and conduct interviews with medical experts. Our results confirm high predictive performance while ensuring sufficient interpretability and explainability to provide comprehensible decision support.

    Process cubes: slicing, dicing, rolling up and drilling down event data for process mining

    Recent breakthroughs in process mining research make it possible to discover, analyze, and improve business processes based on event data. The growth of event data provides many opportunities but also imposes new challenges. Process mining is typically done for an isolated well-defined process in steady-state. However, the boundaries of a process may be fluid and there is a need to continuously view event data from different angles. This paper proposes the notion of process cubes where events and process models are organized using different dimensions. Each cell in the process cube corresponds to a set of events and can be used to discover a process model, to check conformance with respect to some process model, or to discover bottlenecks. The idea is related to the well-known OLAP (Online Analytical Processing) data cubes and associated operations such as slice, dice, roll-up, and drill-down. However, there are also significant differences because of the process-related nature of event data. For example, process discovery based on events is incomparable to computing the average or sum over a set of numerical values. Moreover, dimensions related to process instances (e.g., cases are split into gold and silver customers), subprocesses (e.g., acquisition versus delivery), organizational entities (e.g., backoffice versus frontoffice), and time (e.g., 2010, 2011, 2012, and 2013) are semantically different and it is challenging to slice, dice, roll-up, and drill-down process mining results efficiently. Keywords: OLAP, Process Mining, Big Data, Process Discovery, Conformance Checking
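
The slice and dice operations and the "cell corresponds to a set of events" idea in the abstract above can be sketched on a toy event list. This is a minimal illustration, not the paper's formalisation. The event fields, sample data, and function names are all hypothetical.

```python
from collections import defaultdict

# Toy event data; every field name and value here is made up for illustration.
events = [
    {"case": "c1", "activity": "order", "customer": "gold", "year": 2012},
    {"case": "c1", "activity": "deliver", "customer": "gold", "year": 2012},
    {"case": "c2", "activity": "order", "customer": "silver", "year": 2012},
    {"case": "c3", "activity": "order", "customer": "gold", "year": 2013},
]


def slice_events(events, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [e for e in events if e[dimension] == value]


def dice_events(events, allowed):
    """Dice: keep events whose value lies in the allowed set for every listed dimension."""
    return [e for e in events if all(e[d] in vals for d, vals in allowed.items())]


def cells(events, dimensions):
    """Group events into cube cells keyed by their values on the chosen dimensions."""
    cube = defaultdict(list)
    for e in events:
        cube[tuple(e[d] for d in dimensions)].append(e)
    return dict(cube)
```

In this sketch each cell is just a list of events. A process discovery or conformance checking routine would then be run per cell, and that per-cell step is where process cubes differ from summing or averaging numbers in an OLAP cube.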