

    Robust classification with feature selection using alternating minimization and Douglas-Rachford splitting method

    This paper deals with supervised classification and feature selection. A classical approach is to project the data onto a low-dimensional space with a strict control on sparsity. This results in an optimization problem minimizing the within-cluster sum of squares (Frobenius norm) with an ℓ1 penalty in order to promote sparsity. It is well known, though, that the Frobenius norm is not robust to outliers. In this paper, we propose an alternative approach with ℓ1 norm minimization for both the constraint and the loss function. Since the ℓ1 criterion is only convex and not gradient Lipschitz, we advocate the use of a Douglas-Rachford approach. We take advantage of the particular form of the cost and, using a change of variable, provide a new efficient tailored primal Douglas-Rachford splitting algorithm. We also provide an efficient classifier in the projected space based on medoid modeling. The resulting algorithm, based on alternating minimization and primal Douglas-Rachford splitting, is coined ADRS. Experiments on biological and computer vision datasets show that our method significantly improves on the results obtained with a quadratic loss function.
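    As a minimal sketch of the Douglas-Rachford splitting idea used here (not the paper's ADRS algorithm), consider the toy problem min_x ||x − b||_1 + λ||x||_1, where both proximal maps reduce to soft-thresholding; the problem instance, step size γ, and iteration count below are illustrative assumptions.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dr_l1(b, lam, gamma=1.0, iters=100):
    """Douglas-Rachford splitting for min_x ||x - b||_1 + lam * ||x||_1.

    Both terms are convex but not gradient Lipschitz, so gradient methods
    do not apply directly; DR only needs their proximal maps, which here
    are both soft-thresholding operators.
    """
    z = np.zeros_like(b)
    for _ in range(iters):
        x = b + soft(z - b, gamma)        # prox of ||. - b||_1 at z
        y = soft(2 * x - z, gamma * lam)  # prox of lam*||.||_1 at reflection
        z = z + y - x                     # Douglas-Rachford update
    return x
```

    For λ < 1 the per-coordinate subgradient condition gives x = b, and for λ > 1 it gives x = 0, which the iteration recovers exactly on this toy instance.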

    Learning Nearest Neighbor Graphs from Noisy Distance Samples

    We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate the efficiency of our method both empirically and theoretically, needing only O(n log(n) Δ^{-2}) queries in favorable settings, where Δ^{-2} accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT Zappos50K dataset, we apply our algorithm to learn which shoes people believe are most similar and show that it beats both an active baseline and ordinal embedding. Comment: 21 total pages (8 main pages + appendices), 7 figures, submitted to NeurIPS 201
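    The active-querying idea can be sketched as a successive-elimination loop for finding one item's nearest neighbor: repeatedly query noisy distances to each surviving candidate, maintain confidence intervals around the empirical means, and eliminate any candidate whose lower bound exceeds the best upper bound. The oracle model, noise level, and confidence-radius formula below are illustrative assumptions, not the paper's exact algorithm.

```python
import math
import random

def noisy_distance(true_dist, sigma):
    """Oracle: the true distance corrupted by Gaussian noise."""
    return true_dist + random.gauss(0.0, sigma)

def nearest_neighbor_active(dists, sigma=0.1, delta=0.05, max_rounds=2000):
    """Find the index of the smallest entry of `dists` (true distances from
    a fixed item to each candidate) using only noisy oracle queries.

    Successive elimination: candidates whose confidence interval lies
    entirely above the best upper confidence bound cannot be the nearest
    neighbor and stop being queried, which is where the query savings come
    from when the distance gaps are large relative to the noise.
    """
    n = len(dists)
    candidates = list(range(n))
    sums = [0.0] * n
    counts = [0] * n
    for t in range(1, max_rounds + 1):
        for j in candidates:
            sums[j] += noisy_distance(dists[j], sigma)
            counts[j] += 1
        # Hoeffding-style radius for sub-Gaussian noise, union-bounded
        # over candidates and rounds (an illustrative choice).
        rad = sigma * math.sqrt(2.0 * math.log(4.0 * n * t * t / delta) / t)
        means = {j: sums[j] / counts[j] for j in candidates}
        best_ucb = min(means[j] + rad for j in candidates)
        candidates = [j for j in candidates if means[j] - rad <= best_ucb]
        if len(candidates) == 1:
            break
    return min(candidates, key=lambda j: sums[j] / counts[j])
```

    With noise level σ and gap Δ between the nearest and second-nearest distances, the radius shrinks like σ√(log t / t), so a candidate is eliminated after roughly Δ^{-2} queries, matching the Δ^{-2} factor in the abstract's query bound.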