Predictive services now play an important role across all business
sectors. However, deployed machine learning models are challenged by data
streams that change over time, a phenomenon known as concept drift, which
can strongly affect the prediction quality of these models. Concept drift is
therefore usually handled by retraining the model. However, current research
lacks recommendations on which data should be selected for retraining the
machine learning model. We therefore systematically analyze different data
selection strategies in this work. Subsequently, we instantiate our findings on
a use case in process mining that is strongly affected by concept drift. We
show that concept drift handling improves accuracy from 0.5400 to 0.7010.
Furthermore, we illustrate the effects of the different data selection
strategies.