Learning data representations for robust neighbour-based inference

Abstract

The recently proposed Boundary Trees algorithm by Mathy et al. (2015) enables fast neighbour-based classification, regression and retrieval in large datasets. While boundary trees use a Euclidean measure of similarity, the Differentiable Boundary Tree (DBT) algorithm by Zoran et al. (2017) was introduced to learn low-dimensional representations of complex input data, in which semantic similarity can be measured and on which boundary trees can be trained. The DBT approach has a few limitations that prevent it from scaling to large datasets. In this thesis, we introduce Differentiable Boundary Sets, an algorithm that overcomes the computational issues of the DBT scheme and also improves its classification accuracy and data representability. Our algorithm can be implemented efficiently with existing tools and offers a significant reduction in training time. We test and compare the proposed algorithm on the well-known MNIST handwritten digits dataset and the newer Fashion-MNIST dataset by Xiao et al. (2017).

M.A.S.
