
    Towards a real-time unsupervised estimation of predictive model degradation

    Automating predictive machine learning entails the capability of properly triggering the update of trained models. To this aim, the degradation of predictive models has to be continuously evaluated over time to detect data distribution drifts between the original training set and the new data. Traditionally, prediction performance is used as a degradation metric. However, prediction-quality indices require ground-truth class labels to be known for the newly classified data, making them unsuitable for real-time applications, as ground-truth labels might be totally absent or become available only later. In this paper, we propose a novel unsupervised methodology to automatically detect prediction-quality degradation of machine learning models. Thanks to the unsupervised approach and a novel scalable estimation technique, we provide an effective and efficient solution to the above-mentioned problem under soft real-time constraints. Specifically, our approach is able to detect class-based concept drift, i.e., when new data contain samples that do not fit the set of class labels known by the currently trained predictive model. Experiments on synthetic and real-world public datasets show the effectiveness of the proposed methodology in automatically detecting and describing concept drift caused by changes in the class-label data distributions. Thanks to its scalability, the proposed approach is suitable for soft real-time applications such as predictive maintenance, Industry 4.0, and text mining.
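
    The abstract does not disclose the paper's actual estimation technique, but the idea of class-based concept drift detection without ground-truth labels can be illustrated with a minimal sketch: fit one centroid per known class on the training data, then flag new samples that lie far from every known centroid. The `radius` threshold and the centroid-distance criterion below are illustrative assumptions, not the authors' method.

    ```python
    import numpy as np

    def fit_class_centroids(X, y):
        """Compute one centroid per known class from the labelled training set."""
        return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

    def drift_score(X_new, centroids, radius):
        """Fraction of new (unlabelled) samples farther than `radius`
        from every known class centroid; a high score suggests the batch
        contains samples outside the known set of class labels."""
        C = np.stack(list(centroids.values()))
        dists = np.linalg.norm(X_new[:, None, :] - C[None, :, :], axis=2)
        return float((dists.min(axis=1) > radius).mean())

    # Toy check: two known classes around (0,0) and (5,5),
    # plus a drifted batch clustered around (20,20).
    rng = np.random.default_rng(0)
    X_train = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
    y_train = np.array([0] * 50 + [1] * 50)
    cents = fit_class_centroids(X_train, y_train)

    in_dist = rng.normal(0, 1, (30, 2))     # matches a known class
    drifted = rng.normal(20, 1, (30, 2))    # unknown class region
    print(drift_score(in_dist, cents, radius=4.0))  # near 0.0
    print(drift_score(drifted, cents, radius=4.0))  # near 1.0
    ```

    A streaming deployment would evaluate `drift_score` on each incoming batch and trigger model retraining once the score exceeds an alarm level; because the score needs no labels, it can run in soft real time, matching the setting described above.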