Handling Concept Drift for Predictions in Business Process Mining
Predictive services nowadays play an important role across all business
sectors. However, deployed machine learning models are challenged by data
streams that change over time, a phenomenon known as concept drift, which can
strongly affect prediction quality. Concept drift is therefore usually handled
by retraining the model. However, current research lacks a recommendation on
which data should be selected for retraining. We therefore systematically
analyze different data selection strategies in this work. Subsequently, we
instantiate our findings on a use case in process mining that is strongly
affected by concept drift. We show that concept drift handling improves
accuracy from 0.5400 to 0.7010. Furthermore, we depict the effects of the
different data selection strategies.
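As a rough illustration of what a data selection strategy for retraining can look like, the sketch below picks a training set once drift has been detected. The three strategy names ("all", "since_drift", "window"), the function name and its parameters are assumptions made for this example; the abstract does not state which strategies the paper compares.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_retraining_data(
    history: Sequence[T],
    drift_index: int,
    strategy: str = "since_drift",
    window: int = 100,
) -> List[T]:
    """Return the samples to retrain on after drift is detected.

    history     -- all labelled samples seen so far, oldest first
    drift_index -- position in history where drift was detected (assumed given)
    strategy    -- hypothetical strategy name, see below
    window      -- number of most recent samples used by the "window" strategy
    """
    if strategy == "all":
        # Retrain on everything, including data from before the drift.
        return list(history)
    if strategy == "since_drift":
        # Retrain only on samples observed from the drift point onwards.
        return list(history[drift_index:])
    if strategy == "window":
        # Retrain on a fixed-size window of the most recent samples.
        return list(history[-window:]) if window > 0 else []
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, with ten samples and drift detected at position 6, "since_drift" keeps the last four samples, while "window" with window=3 keeps the last three.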
Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network
Language in social media is extremely dynamic: new words emerge, trend and
disappear, while the meaning of existing words can fluctuate over time. Such
dynamics are especially notable during a period of crisis. This work addresses
several important tasks of measuring, visualizing and predicting short-term
text representation shift, i.e. the change in a word's contextual semantics,
and contrasting such shift with surface-level word dynamics, or concept drift,
observed in social media streams. Unlike previous approaches to learning word
representations from text, we study the relationship between short-term concept
drift and representation shift on a large social media corpus: VKontakte posts
in Russian collected during the Russia-Ukraine crisis in 2014-2015. Our novel
contributions include quantitative and qualitative approaches to (1) measure
short-term representation shift and contrast it with surface-level concept
drift; (2) build predictive models to forecast short-term shifts in meaning
from previous meaning as well as from concept drift; and (3) visualize
short-term representation shift for example keywords to demonstrate the
practical use of our approach to discover and track meaning of newly emerging
terms in social media. We show that short-term representation shift can be
accurately predicted up to several weeks in advance. Our unique approach to
modeling and visualizing word representation shifts in social media can be used
to explore and characterize specific aspects of the streaming corpus during
crisis events and potentially improve other downstream classification tasks,
including real-time event detection.
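One common way to quantify a shift in a word's representation is the cosine distance between its embedding vectors at two points in time. The sketch below assumes this measure and assumes the embeddings are already available as plain lists of floats; the abstract does not say which measure the authors actually use.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance (1 - cosine similarity) between two vectors.

    Used here as an assumed proxy for representation shift: 0 means the
    direction of the word vector is unchanged, larger values mean more shift.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)
```

Applied to the vectors of the same word learned from two consecutive time slices, the function returns a value between 0 and 2.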
Genetic programming-based regression for temporal data
Various machine learning techniques exist to perform regression on temporal data subject to concept drift. However, there are numerous nonstationary environments where these techniques may fail to either track or detect the changes. This study develops a genetic programming-based predictive model for temporal data with a numerical target that tracks changes in a dataset due to concept drift. When an environmental change is evident, the proposed algorithm reacts to the change by clustering the data and then inducing nonlinear models that describe the generated clusters. The nonlinear models become terminal nodes of genetic programming model trees. Experiments were carried out using seven nonstationary datasets, and the obtained results suggest that the proposed model yields high adaptation rates and accuracy across several types of concept drift. Future work will consider strengthening the adaptation to concept drift and the fast implementation of genetic programming on GPUs to provide fast learning for high-speed temporal data.
http://link.springer.com/journal/10710
A heterogeneous online learning ensemble for non-stationary environments
Learning in non-stationary environments is a challenging task which requires the updating of predictive models to deal with changes in the underlying probability distribution of the problem, i.e., dealing with concept drift. Most work in this area is concerned with updating the learning system so that it can quickly recover from concept drift, while little work has been dedicated to investigating what type of predictive model is most suitable at any given time. This paper aims to investigate the benefits of online model selection for predictive modelling in non-stationary environments. A novel heterogeneous ensemble approach is proposed to intelligently switch between different types of base models in an ensemble to increase the predictive performance of online learning in non-stationary environments. The approach is called Heterogeneous Dynamic Weighted Majority (HDWM). It makes use of “seed” learners of different types to maintain ensemble diversity, overcoming problems of existing dynamic ensembles that may undergo loss of diversity due to the exclusion of base learners. The algorithm has been evaluated on artificial and real-world data streams against existing well-known approaches such as a heterogeneous Weighted Majority Algorithm (WMA) and a homogeneous Dynamic Weighted Majority (DWM). The results show that HDWM performed significantly better than WMA in non-stationary environments. Also, when recurring concept drifts were present, the predictive performance of HDWM showed an improvement over DWM.
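To make the weighting idea behind these ensembles concrete, the following is a minimal sketch of the classic Weighted Majority scheme that the abstract uses as a baseline: each expert votes with its weight, and experts that err have their weight multiplied by a factor beta. It is not HDWM or DWM; it omits adding and removing learners and the "seed" learners, and the class name, the use of fixed prediction functions as experts and the default beta=0.5 are assumptions for illustration.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, List


class WeightedMajority:
    """Weighted majority vote over fixed experts (illustrative sketch)."""

    def __init__(self, experts: Iterable[Callable[[object], Hashable]],
                 beta: float = 0.5) -> None:
        if not 0.0 < beta < 1.0:
            raise ValueError("beta must lie strictly between 0 and 1")
        self.experts: List[Callable[[object], Hashable]] = list(experts)
        self.beta = beta
        # Every expert starts with the same weight.
        self.weights: List[float] = [1.0] * len(self.experts)

    def predict(self, x: object) -> Hashable:
        # Sum the weights of the experts voting for each label.
        votes = defaultdict(float)
        for expert, weight in zip(self.experts, self.weights):
            votes[expert(x)] += weight
        return max(votes, key=votes.get)

    def update(self, x: object, y: Hashable) -> None:
        # Penalise every expert that predicted the wrong label for x.
        for i, expert in enumerate(self.experts):
            if expert(x) != y:
                self.weights[i] *= self.beta
```

In an online setting, predict is called when an example arrives and update is called once its true label becomes known, so that experts that keep making mistakes after a drift gradually lose influence.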
Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments
A characteristic of existing predictive process monitoring techniques is to
first construct a predictive model based on past process executions, and then
use it to predict the future of new ongoing cases, without the possibility of
updating it with new cases when they complete their execution. This can make
predictive process monitoring too rigid to deal with the variability of
processes working in real environments that continuously evolve and/or exhibit
new variant behaviors over time. As a solution to this problem, we propose the
use of algorithms that allow the incremental construction of the predictive
model. These incremental learning algorithms update the model whenever new
cases become available so that the predictive model evolves over time to fit
the current circumstances. The algorithms have been implemented using different
case encoding strategies and evaluated on a number of real and synthetic
datasets. The results provide initial evidence of the potential of incremental
learning strategies for predictive process monitoring in real environments, and
of the impact of different case encoding strategies in this setting.
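The idea of updating a predictive model whenever a case completes can be sketched with a very simple next-activity predictor. The model below counts which activity follows which and is updated one completed trace at a time; encoding a running case by its last activity only, and the class and method names, are assumptions for this example and not the algorithms or encodings evaluated in the paper.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Hashable, Optional, Sequence


class IncrementalNextActivityModel:
    """Frequency-based next-activity predictor, updated per completed case."""

    def __init__(self) -> None:
        # counts[a][b] = how often activity b directly followed activity a.
        self.counts: DefaultDict[Hashable, Counter] = defaultdict(Counter)

    def update(self, trace: Sequence[Hashable]) -> None:
        """Fold one completed case into the model without retraining."""
        for current, following in zip(trace, trace[1:]):
            self.counts[current][following] += 1

    def predict(self, prefix: Sequence[Hashable]) -> Optional[Hashable]:
        """Predict the next activity of a running case from its last activity."""
        if not prefix or prefix[-1] not in self.counts:
            return None  # nothing learned yet for this situation
        return self.counts[prefix[-1]].most_common(1)[0][0]
```

Because each update only increments counters, the prediction for a given prefix can change as soon as newer cases show a different behavior.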
Predictive Maintenance on the Machining Process and Machine Tool
This paper presents the process required to implement data-driven Predictive Maintenance (PdM), not only in the machine decision making but also in data acquisition and processing. A short review of the different approaches and techniques in maintenance is given. The main contribution of this paper is a solution for the predictive maintenance problem in a real machining process. Several steps are needed to reach the solution, and these are carefully explained. The obtained results show that the Preventive Maintenance (PM), which was carried out in a real machining process, could be changed into a PdM approach. A decision-making application was developed to provide a visual analysis of the Remaining Useful Life (RUL) of the machining tool. This work is a proof of concept of the presented methodology in one process, but it is replicable for most processes for the serial production of pieces.