25,653 research outputs found

    Nearest Neighbour with Bandit Feedback

    In this paper we adapt the nearest neighbour rule to the contextual bandit problem. Our algorithm handles the fully adversarial setting, in which no assumptions at all are made about the data-generation process. When combined with a sufficiently fast data structure for (perhaps approximate) adaptive nearest neighbour search, such as a navigating net, our algorithm is extremely efficient: it has a per-trial running time polylogarithmic in both the number of trials and actions, and takes only quasi-linear space.
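
    A minimal, illustrative sketch of the general idea (not the paper's algorithm): keep the observed (context, action, reward) triples and, for a new context, replay the action taken at the nearest previously seen context, with a little uniform exploration. The brute-force search and the epsilon parameter below are assumptions for illustration; the paper relies on a fast adaptive nearest neighbour structure such as a navigating net to reach polylogarithmic per-trial time.

    import numpy as np

    def nn_bandit_choose(history, context, n_actions, epsilon=0.1, rng=None):
        # One round of an illustrative nearest-neighbour contextual bandit:
        # explore uniformly with small probability, otherwise replay the action
        # taken at the most similar previously seen context. Brute-force search
        # stands in for the fast adaptive structure (e.g. a navigating net)
        # assumed by the paper.
        rng = rng or np.random.default_rng()
        if not history or rng.random() < epsilon:
            return int(rng.integers(n_actions))
        contexts = np.stack([c for c, _, _ in history])
        nearest = int(np.argmin(np.linalg.norm(contexts - np.asarray(context), axis=1)))
        _, action, _ = history[nearest]
        return action

    # Usage: play the returned action, observe the reward, then append
    # (context, action, reward) to history before the next trial.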

    Learning to hash for large scale image retrieval

    This thesis is concerned with improving the effectiveness of nearest neighbour search. Nearest neighbour search is the problem of finding the most similar data-points to a query in a database, and is a fundamental operation with wide applicability in many fields. The focus is placed on hashing-based approximate nearest neighbour search methods that generate similar binary hashcodes for similar data-points. These hashcodes can be used as indices into the buckets of hashtables for fast search. This work explores how the quality of search can be improved by learning task-specific binary hashcodes.

    The generation of a binary hashcode comprises two main steps carried out sequentially: projection of the image feature vector onto the normal vectors of a set of hyperplanes partitioning the input feature space, followed by a quantisation operation that uses a single threshold to binarise the resulting projections and obtain the hashcodes. The degree to which these operations preserve the relative distances between the data-points in the input feature space has a direct influence on the effectiveness of using the resulting hashcodes for nearest neighbour search. In this thesis I argue that the retrieval effectiveness of existing hashing-based nearest neighbour search methods can be increased by learning the thresholds and hyperplanes based on the distribution of the input data.

    The first contribution is a model for learning multiple quantisation thresholds. I demonstrate that the best threshold positioning is projection-specific and introduce a novel clustering algorithm for threshold optimisation. The second contribution extends this algorithm by learning the optimal allocation of quantisation thresholds per hyperplane. In doing so I argue that some hyperplanes are naturally more effective than others at capturing the distribution of the data and should therefore attract a greater allocation of quantisation thresholds. The third contribution focuses on the complementary problem of learning the hashing hyperplanes. I introduce a multi-step iterative model that, in the first step, regularises the hashcodes over a data-point adjacency graph, which encourages similar data-points to be assigned similar hashcodes. In the second step, binary classifiers are learnt to separate opposing bits with maximum margin. This algorithm is extended to learn hyperplanes that can generate similar hashcodes for similar data-points in two different feature spaces (e.g. text and images).

    Individually, the performance of these algorithms is often superior to competitive baselines. I unify my contributions by demonstrating that learning hyperplanes and thresholds as part of the same model can yield an additive increase in retrieval effectiveness.
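
    The two-step hashcode generation described above (projection onto hyperplane normals, then single-threshold quantisation) can be sketched as follows. This minimal example uses random hyperplanes and zero thresholds purely for illustration; the thesis is precisely about learning those hyperplanes and thresholds from the data instead.

    import numpy as np
    from collections import defaultdict

    def make_hasher(dim, n_bits, rng=None):
        # Random-hyperplane hashing: one normal vector and one threshold per bit.
        rng = rng or np.random.default_rng(0)
        normals = rng.standard_normal((n_bits, dim))
        thresholds = np.zeros(n_bits)
        def hashcode(x):
            projections = normals @ x                             # step 1: projection
            return tuple((projections > thresholds).astype(int))  # step 2: quantisation
        return hashcode

    # Index a toy database and answer a query via hashtable buckets.
    rng = np.random.default_rng(1)
    database = rng.standard_normal((1000, 64))
    hashcode = make_hasher(dim=64, n_bits=16)
    table = defaultdict(list)
    for i, x in enumerate(database):
        table[hashcode(x)].append(i)
    query = database[0] + 0.01 * rng.standard_normal(64)
    candidates = table[hashcode(query)]   # only these need an exact distance check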

    Inexact Gradient Projection and Fast Data Driven Compressed Sensing

    We study convergence of the iterative projected gradient (IPG) algorithm for arbitrary (possibly nonconvex) sets and when both the gradient and projection oracles are computed approximately. We consider different notions of approximation, of which we show that the Progressive Fixed Precision (PFP) and the (1+ε)-optimal oracles can achieve the same accuracy as the exact IPG algorithm. We show that the former scheme is also able to maintain the (linear) rate of convergence of the exact algorithm, under the same embedding assumption. In contrast, the (1+ε)-approximate oracle requires a stronger embedding condition, moderate compression ratios, and typically slows down the convergence. We apply our results to accelerate solving a class of data-driven compressed sensing problems, where we replace iterative exhaustive searches over large datasets by fast approximate nearest neighbour search strategies based on the cover tree data structure. For datasets with low intrinsic dimension our proposed algorithm achieves a complexity logarithmic in the dataset population, as opposed to the linear complexity of a brute-force search. Several numerical experiments confirm the behaviour predicted by our theoretical analysis.
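
    A hedged sketch of iterative projected gradient with a pluggable projection oracle, in the data-driven setting where the projection amounts to a (possibly approximate) nearest neighbour search over a dataset of exemplar signals. The function names and the brute-force oracle below are assumptions for illustration; the paper's analysis covers PFP and (1+ε)-optimal oracles and uses cover trees for the search.

    import numpy as np

    def brute_force_project(z, dataset):
        # Exact oracle for reference: exhaustive nearest neighbour search,
        # linear in the dataset size; a cover tree would make this roughly
        # logarithmic for data of low intrinsic dimension.
        d = np.linalg.norm(dataset - z, axis=1)
        return dataset[int(np.argmin(d))]

    def inexact_ipg(y, A, dataset, step, n_iters, project=brute_force_project):
        # Minimise ||y - A x||^2 subject to x lying in the exemplar set,
        # alternating a gradient step with a (possibly approximate) projection.
        x = np.zeros(A.shape[1])
        for _ in range(n_iters):
            grad = A.T @ (A @ x - y)       # gradient of the data-fidelity term
            x = project(x - step * grad, dataset)
        return x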

    Deep Metric Learning and Image Classification with Nearest Neighbour Gaussian Kernels

    We present a Gaussian kernel loss function and training algorithm for convolutional neural networks that can be directly applied to both distance metric learning and image classification problems. Our method treats all training features from a deep neural network as Gaussian kernel centres and computes loss by summing the influence of a feature's nearby centres in the feature embedding space. Our approach is made scalable by treating it as an approximate nearest neighbour search problem. We show how to make end-to-end learning feasible, resulting in a well-formed embedding space in which semantically related instances are likely to be located near one another, regardless of whether or not the network was trained on those classes. Our approach outperforms state-of-the-art deep metric learning approaches on embedding learning challenges, as well as conventional softmax classification on several datasets.
    Comment: Accepted at the International Conference on Image Processing (ICIP) 2018. Formerly titled Nearest Neighbour Radial Basis Function Solvers for Deep Neural Network.
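
    A rough sketch of the kind of Gaussian kernel scoring the abstract describes, under the assumption that class scores are obtained by summing kernel weights of stored centres per class (here over all centres; the paper restricts the sum to approximate nearest neighbours for scalability). The function name and bandwidth parameter are illustrative, not the authors' implementation.

    import numpy as np

    def gaussian_kernel_probs(query, centres, centre_labels, n_classes, sigma=1.0):
        # Sum the Gaussian influence of each stored centre on the query feature,
        # accumulate the weights per class, and normalise to class probabilities.
        sq_dists = np.sum((centres - query) ** 2, axis=1)
        weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
        class_scores = np.zeros(n_classes)
        np.add.at(class_scores, centre_labels, weights)
        return class_scores / class_scores.sum()

    # Training would minimise -log(probs[true_label]) for each embedded feature.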