k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks
k-Nearest Neighbors is one of the most fundamental yet effective
classification models. In this paper, we propose two families of models, one
built on a sequence to sequence model and one on a memory network model, that
mimic the k-Nearest Neighbors model. The models generate a sequence of labels,
a sequence of out-of-sample feature vectors, and a final label for
classification, and thus can also function as oversamplers. We also propose 'out-of-core'
versions of our models which assume that only a small portion of data can be
loaded into memory. Computational experiments show that, on structured
datasets, our models outperform k-Nearest Neighbors, a feed-forward neural
network, XGBoost, LightGBM, random forest, and a memory network, because our
models must produce additional output and not just the label. On image and
text datasets, the performance of our models is close to that of many
state-of-the-art deep models. As an oversampler on imbalanced datasets, the
sequence to sequence kNN model often outperforms the Synthetic Minority
Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN).
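
To make the three kinds of output mentioned in the abstract concrete, the following is a minimal sketch of a plain k-Nearest Neighbors query that returns a sequence of neighbor labels, a sequence of neighbor feature vectors, and a final majority-vote label. It is not the authors' sequence to sequence or memory network model. The function name `knn_targets`, the Euclidean distance, and the toy data are assumptions made for illustration only. Note also that the feature vectors returned here are in-sample training points, whereas the abstract describes models that generate out-of-sample feature vectors.

```python
from collections import Counter

import numpy as np


def knn_targets(X_train, y_train, x_query, k=3):
    """Return (neighbor_labels, neighbor_features, final_label) for one query.

    Hypothetical helper: a classical kNN lookup, used here only to
    illustrate the three outputs named in the abstract.
    """
    # Euclidean distance from the query to every training point (assumed metric).
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training points, nearest first.
    order = np.argsort(dists, kind="stable")[:k]
    neighbor_labels = y_train[order]
    neighbor_features = X_train[order]
    # Final label by majority vote over the k neighbor labels.
    final_label = Counter(neighbor_labels.tolist()).most_common(1)[0][0]
    return neighbor_labels, neighbor_features, final_label


# Toy data, invented for this example.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1, 1])

labels, feats, label = knn_targets(X_train, y_train, np.array([0.2, 0.1]), k=3)
print(labels.tolist())  # neighbor labels, nearest first
print(feats.tolist())   # neighbor feature vectors, nearest first
print(label)            # majority-vote label
```

In this sketch the neighbor sequences are ordered from nearest to farthest, which is one natural ordering for a sequence model to be trained against; the abstract itself does not state the ordering used.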