Adding diversity to rank examples in anytime nearest neighbor classification

Abstract

The last decade has seen a sharp rise in interest in data stream learning algorithms. A stream is an ordered sequence of data records, characterized by properties such as a potentially infinite and rapid flow of instances. However, one property that is common to various application domains and is frequently disregarded is a highly fluctuating data rate. In domains with fluctuating data rates, events do not occur with a fixed frequency. This imposes an additional challenge for classifiers, since the next event can occur at any time after the previous one. Anytime classification is a convenient approach for fluctuating data rates: an anytime classifier can be interrupted at any time before its completion and still provide an intermediate solution. The popular k-nearest neighbor (k-NN) classifier can easily be made anytime by introducing a ranking of the training examples. A classification is obtained by scanning the training examples according to this ranking. In this paper, we show how the current state-of-the-art anytime k-NN classifier can be made more accurate by introducing diversity in the training set ranking. Our results show that, with this simple modification, the performance of the anytime version of the k-NN algorithm is consistently improved for a large number of datasets.
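The ranking-based scan described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `anytime_knn`, the Euclidean distance, and the use of a `budget` (number of training examples examined before the interruption) as a stand-in for elapsed time are all assumptions made for illustration. The `ranking` is taken as given; how it is built, including the diversity criterion, is not specified in the abstract and is not reproduced here.

```python
import heapq
import math
from collections import Counter


def anytime_knn(query, examples, labels, ranking, budget, k=1):
    """Classify `query` after examining at most `budget` training examples.

    `ranking` is a list of indices into `examples`, most useful first
    (assumed to be precomputed). `budget` simulates the interruption point.
    Returns the majority label among the k nearest examples seen so far,
    or None if no example was examined before the interruption.
    """
    # Max-heap (via negated distances) of the k nearest examples seen so far.
    nearest = []
    for idx in ranking[:budget]:
        d = math.dist(query, examples[idx])
        if len(nearest) < k:
            heapq.heappush(nearest, (-d, idx))
        elif d < -nearest[0][0]:
            # Closer than the current k-th neighbor: replace it.
            heapq.heapreplace(nearest, (-d, idx))
    if not nearest:
        return None
    votes = Counter(labels[i] for _, i in nearest)
    return votes.most_common(1)[0][0]
```

Calling the function with a larger `budget` on the same query mimics a later interruption: more of the ranked training set is scanned, so the intermediate answer can only be based on neighbors at least as close as before.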
