Abstract — Concept drift is believed to be prevalent in most data gathered from naturally occurring processes and thus warrants research by the machine learning community. Myriad approaches to handling concept drift have been proposed, with varying degrees of success. However, most of them assume that labelled data will be available at no labelling cost shortly after classification, an assumption that is often violated. The high labelling cost in many domains provides strong motivation to reduce the number of labelled instances required to handle concept drift. Explicit detection approaches that do not require labelled instances to detect concept drift show great promise for achieving this. Our approach, Confidence Distribution Batch Detection (CDBD), provides a signal correlated with changes in concept without using labelled data. We also show how this signal, combined with a trigger and a rebuild policy, can maintain classifier accuracy while using a limited amount of labelled data.

Keywords-concept drift; explicit drift detection; labelling cost; classifier confidence