
    Algorithms for Stable Matching and Clustering in a Grid

    We study a discrete version of a geometric stable marriage problem originally proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which points in the plane are stably matched to cluster centers, as prioritized by their distances, so that each cluster center is apportioned a set of points of equal area. We show that, for a discretization of the problem to an $n\times n$ grid of pixels with $k$ centers, the problem can be solved in time $O(n^2 \log^5 n)$, and we experiment with two slower but more practical algorithms and a hybrid method that switches from one of these algorithms to the other to gain greater efficiency than either algorithm alone. We also show how to combine geometric stable matchings with a $k$-means clustering algorithm, so as to provide a geometric political-districting algorithm that views distance in economic terms, and we experiment with weighted versions of stable $k$-means in order to improve the connectivity of the resulting clusters. Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th International Workshop on Combinatorial Image Analysis, June 19-21, 2017, Plovdiv, Bulgaria
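As a rough illustration of the discrete problem described above, here is a minimal sketch of a naive greedy matcher. It is not the $O(n^2 \log^5 n)$ algorithm from the abstract, nor necessarily either of the practical algorithms the authors test. It assumes squared Euclidean distance, quotas that split the $n^2$ pixels as evenly as possible, and ties broken by sort order. The function name `stable_grid_matching` is hypothetical.

```python
def stable_grid_matching(n, centers):
    """Assign every pixel of an n x n grid to one of the given centers.

    Naive sketch: scan all (pixel, center) pairs in order of increasing
    distance and accept a pair whenever the pixel is still unmatched and
    the center still has quota left. With distance-based preferences on
    both sides, the result has no blocking pair.
    """
    k = len(centers)
    # Assumption: quotas split the n*n pixels as evenly as possible.
    base, extra = divmod(n * n, k)
    quota = [base + (1 if i < extra else 0) for i in range(k)]

    # Squared distances keep the ordering and avoid floating point.
    pairs = []
    for x in range(n):
        for y in range(n):
            for c, (cx, cy) in enumerate(centers):
                pairs.append(((x - cx) ** 2 + (y - cy) ** 2, x, y, c))
    pairs.sort()

    owner = {}
    for _, x, y, c in pairs:
        if (x, y) not in owner and quota[c] > 0:
            owner[(x, y)] = c
            quota[c] -= 1
    return owner
```

This sketch sorts all $n^2 k$ candidate pairs, so it is only practical for small grids; it is meant to make the stability condition concrete rather than to be efficient.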

    Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition

    We provide efficient constant factor approximation algorithms for the problems of finding a hierarchical clustering of a point set in any metric space, minimizing the sum of minimum spanning tree lengths within each cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can also be used to provide a pants decomposition, that is, a set of disjoint simple closed curves partitioning the plane minus the input points into subsets with exactly three boundary components, with approximately minimum total length. In the Euclidean case, these curves are squares; in the hyperbolic case, they combine our Euclidean square pants decomposition with our tree clustering method for general metric spaces. Comment: 22 pages, 14 figures. This version replaces the proof of what is now Lemma 5.2, as the previous proof was erroneous

    Maximum Inner-Product Search using Tree Data-structures

    The problem of efficiently finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied in the literature. However, to the best of our knowledge, the closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting. In this paper we consider this general problem and contrast it with the existing best-match algorithms. First, we propose a general branch-and-bound algorithm using a tree data structure. Subsequently, we present a dual-tree algorithm for the case where there are multiple queries. Finally, we present a new data structure for increasing the efficiency of the dual-tree algorithm. These branch-and-bound algorithms involve novel bounds suited for the purpose of best-matching with inner products. We evaluate our proposed algorithms on a variety of data sets from various applications, and exhibit up to five orders of magnitude improvement in query time over the naive search technique. Comment: Under submission in KDD 201
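To make the branch-and-bound idea concrete, the following is a minimal single-query sketch over a simple ball tree. It relies on the Cauchy-Schwarz bound $\langle q, p\rangle \le \langle q, c\rangle + R\,\lVert q\rVert$ for every point $p$ in a ball with center $c$ and radius $R$. The class `Ball`, the function `mips`, the median split rule, and the `leaf_size` parameter are illustrative assumptions, not the paper's data structures, and the dual-tree variant is not shown.

```python
import numpy as np


class Ball:
    """Node of a simple ball tree (hypothetical helper for this sketch)."""

    def __init__(self, points, idx, leaf_size=8):
        self.idx = idx
        pts = points[idx]
        self.center = pts.mean(axis=0)
        self.radius = float(np.max(np.linalg.norm(pts - self.center, axis=1)))
        self.children = []
        if len(idx) > leaf_size:
            # Assumption: split at the median of the widest coordinate.
            dim = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))
            order = idx[np.argsort(pts[:, dim])]
            mid = len(order) // 2
            self.children = [Ball(points, order[:mid], leaf_size),
                             Ball(points, order[mid:], leaf_size)]


def mips(points, root, q):
    """Return (index, value) of the point maximizing the inner product with q."""
    best_val, best_i = -np.inf, -1
    q_norm = float(np.linalg.norm(q))
    stack = [root]
    while stack:
        node = stack.pop()
        # Upper bound on <q, p> over all points p inside this ball.
        bound = float(q @ node.center) + node.radius * q_norm
        if bound <= best_val:
            continue  # nothing in this subtree can beat the current best
        if not node.children:
            vals = points[node.idx] @ q
            j = int(np.argmax(vals))
            if vals[j] > best_val:
                best_val, best_i = float(vals[j]), int(node.idx[j])
        else:
            # Push the less promising child first so the better one is popped next.
            stack.extend(sorted(node.children,
                                key=lambda c: float(q @ c.center)))
    return best_i, best_val
```

A typical call builds the tree once with `root = Ball(points, np.arange(len(points)))` and then runs `mips(points, root, q)` per query. How much pruning this achieves depends on the data and is not claimed here.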