32,971 research outputs found

    Handling Concept Drift for Predictions in Business Process Mining

    Get PDF
    Predictive services nowadays play an important role across all business sectors. However, deployed machine learning models are challenged by changing data streams over time which is described as concept drift. Prediction quality of models can be largely influenced by this phenomenon. Therefore, concept drift is usually handled by retraining of the model. However, current research lacks a recommendation which data should be selected for the retraining of the machine learning model. Therefore, we systematically analyze different data selection strategies in this work. Subsequently, we instantiate our findings on a use case in process mining which is strongly affected by concept drift. We can show that we can improve accuracy from 0.5400 to 0.7010 with concept drift handling. Furthermore, we depict the effects of the different data selection strategies

    Handling Concept Drift for Predictions in Business Process Mining

    Get PDF
    Predictive services nowadays play an important role across all business sectors. However, deployed machine learning models are challenged by changing data streams over time which is described as concept drift. Prediction quality of models can be largely influenced by this phenomenon. Therefore, concept drift is usually handled by retraining of the model. However, current research lacks a recommendation which data should be selected for the retraining of the machine learning model. Therefore, we systematically analyze different data selection strategies in this work. Subsequently, we instantiate our findings on a use case in process mining which is strongly affected by concept drift. We can show that we can improve accuracy from 0.5400 to 0.7010 with concept drift handling. Furthermore, we depict the effects of the different data selection strategies

    Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network

    Full text link
    Language in social media is extremely dynamic: new words emerge, trend and disappear, while the meaning of existing words can fluctuate over time. Such dynamics are especially notable during a period of crisis. This work addresses several important tasks of measuring, visualizing and predicting short term text representation shift, i.e. the change in a word's contextual semantics, and contrasting such shift with surface level word dynamics, or concept drift, observed in social media streams. Unlike previous approaches on learning word representations from text, we study the relationship between short-term concept drift and representation shift on a large social media corpus - VKontakte posts in Russian collected during the Russia-Ukraine crisis in 2014-2015. Our novel contributions include quantitative and qualitative approaches to (1) measure short-term representation shift and contrast it with surface level concept drift; (2) build predictive models to forecast short-term shifts in meaning from previous meaning as well as from concept drift; and (3) visualize short-term representation shift for example keywords to demonstrate the practical use of our approach to discover and track meaning of newly emerging terms in social media. We show that short-term representation shift can be accurately predicted up to several weeks in advance. Our unique approach to modeling and visualizing word representation shifts in social media can be used to explore and characterize specific aspects of the streaming corpus during crisis events and potentially improve other downstream classification tasks including real-time event detection

    Genetic programming-based regression for temporal data

    Get PDF
    Various machine learning techniques exist to perform regression on temporal data with concept drift occurring. However, there are numerous nonstationary environments where these techniques may fail to either track or detect the changes. This study develops a genetic programming-based predictive model for temporal data with a numerical target that tracks changes in a dataset due to concept drift. When an environmental change is evident, the proposed algorithm reacts to the change by clustering the data and then inducing nonlinear models that describe generated clusters. Nonlinear models become terminal nodes of genetic programming model trees. Experiments were carried out using seven nonstationary datasets and the obtained results suggest that the proposed model yields high adaptation rates and accuracy to several types of concept drifts. Future work will consider strengthening the adaptation to concept drift and the fast implementation of genetic programming on GPUs to provide fast learning for high-speed temporal data.http://link.springer.com/journal/107102022-05-09hj2021Computer Scienc

    A heterogeneous online learning ensemble for non-stationary environments

    Get PDF
    Learning in non-stationary environments is a challenging task which requires the updating of predictive models to deal with changes in the underlying probability distribution of the problem, i.e., dealing with concept drift. Most work in this area is concerned with updating the learning system so that it can quickly recover from concept drift, while little work has been dedicated to investigating what type of predictive model is most suitable at any given time. This paper aims to investigate the benefits of online model selection for predictive modelling in non-stationary environments. A novel heterogeneous ensemble approach is proposed to intelligently switch between different types of base models in an ensemble to increase the predictive performance of online learning in non-stationary environments. This approach is Heterogeneous Dynamic Weighted Majority (HDWM). It makes use of “seed” learners of different types to maintain ensemble diversity, overcoming problems of existing dynamic ensembles that may undergo loss of diversity due to the exclusion of base learners. The algorithm has been evaluated on artificial and real-world data streams against existing well-known approaches such as a heterogeneous Weighted Majority Algorithm (WMA) and a homogeneous Dynamic Weighted Majority (DWM). The results show that HDWM performed significantly better than WMA in non-stationary environments. Also, when recurring concept drifts were present, the predictive performance of HDWM showed an improvement over DWM

    Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments

    Full text link
    A characteristic of existing predictive process monitoring techniques is to first construct a predictive model based on past process executions, and then use it to predict the future of new ongoing cases, without the possibility of updating it with new cases when they complete their execution. This can make predictive process monitoring too rigid to deal with the variability of processes working in real environments that continuously evolve and/or exhibit new variant behaviors over time. As a solution to this problem, we propose the use of algorithms that allow the incremental construction of the predictive model. These incremental learning algorithms update the model whenever new cases become available so that the predictive model evolves over time to fit the current circumstances. The algorithms have been implemented using different case encoding strategies and evaluated on a number of real and synthetic datasets. The results provide a first evidence of the potential of incremental learning strategies for predicting process monitoring in real environments, and of the impact of different case encoding strategies in this setting

    Predictive Maintenance on the Machining Process and Machine Tool

    Get PDF
    This paper presents the process required to implement a data driven Predictive Maintenance (PdM) not only in the machine decision making, but also in data acquisition and processing. A short review of the different approaches and techniques in maintenance is given. The main contribution of this paper is a solution for the predictive maintenance problem in a real machining process. Several steps are needed to reach the solution, which are carefully explained. The obtained results show that the Preventive Maintenance (PM), which was carried out in a real machining process, could be changed into a PdM approach. A decision making application was developed to provide a visual analysis of the Remaining Useful Life (RUL) of the machining tool. This work is a proof of concept of the methodology presented in one process, but replicable for most of the process for serial productions of pieces
    • …
    corecore