    An algorithm and a core set result for the weighted euclidean one-center problem

    Given a set $A$ of $m$ points in $n$-dimensional space with corresponding positive weights, the weighted Euclidean one-center problem, which is a generalization of the minimum enclosing ball problem, involves the computation of a point $c_A \in \mathbb{R}^n$ that minimizes the maximum weighted Euclidean distance from $c_A$ to each point in $A$. In this paper, given $\epsilon > 0$, we propose and analyze an algorithm that computes a $(1 + \epsilon)$-approximate solution to the weighted Euclidean one-center problem. Our algorithm explicitly constructs a small subset $X \subseteq A$, called an $\epsilon$-core set of $A$, for which the optimal solution of the corresponding weighted Euclidean one-center problem is a close approximation to that of $A$. In addition, we establish that $|X|$ depends only on $\epsilon$ and on the ratio of the smallest and largest weights, but is independent of the number of points $m$ and the dimension $n$. This result subsumes and generalizes the previously known core set results for the minimum enclosing ball problem. Our algorithm computes a $(1 + \epsilon)$-approximate solution to the weighted Euclidean one-center problem for $A$ in $O(mn|X|)$ arithmetic operations. Our computational results indicate that the size of the $\epsilon$-core set computed by the algorithm is, in general, significantly smaller than the theoretical worst-case estimate, which contributes to the efficiency of the algorithm, especially for large-scale instances. We shed some light on the possible reasons for this discrepancy between the theoretical estimate and the practical performance. © 2009 INFORMS
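
    Written out, the objective described in this abstract is the min-max problem below. The symbols $a^i$ (the points of $A$), $\omega_i$ (their positive weights), and $\rho^*$ (the optimal value) are notation assumed here for illustration; the abstract does not name them. The second line is one common reading of "$(1 + \epsilon)$-approximate", not necessarily the paper's exact definition.

```latex
\[
  \rho^* \;=\; \min_{c \in \mathbb{R}^n} \; \max_{i = 1, \dots, m} \; \omega_i \,\lVert a^i - c \rVert_2
\]
\[
  c \text{ is } (1+\epsilon)\text{-approximate if } \max_{i = 1, \dots, m} \omega_i \,\lVert a^i - c \rVert_2 \;\le\; (1+\epsilon)\,\rho^* .
\]
```

    With all weights equal to 1, the first line reduces to the minimum enclosing ball problem that the abstract mentions as a special case.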

    An interior point algorithm for minimum sum-of-squares clustering

    Copyright © 2000 SIAM Publications. An exact algorithm is proposed for minimum sum-of-squares nonhierarchical clustering, i.e., for partitioning a given set of points from a Euclidean m-space into a given number of clusters in order to minimize the sum of squared distances from all points to the centroid of the cluster to which they belong. This problem is expressed as a constrained hyperbolic program in 0-1 variables. The resolution method combines an interior point algorithm, i.e., a weighted analytic center column generation method, with branch-and-bound. The auxiliary problem of determining the entering column (i.e., the oracle) is an unconstrained hyperbolic program in 0-1 variables with a quadratic numerator and linear denominator. It is solved through a sequence of unconstrained quadratic programs in 0-1 variables. To accelerate resolution, variable neighborhood search heuristics are used both to get a good initial solution and to quickly solve the auxiliary problem as long as global optimality is not reached. Estimated bounds for the dual variables are deduced from the heuristic solution and used in the resolution process as a trust region. Proved minimum sum-of-squares partitions are determined for the first time for several fairly large data sets from the literature, including Fisher's 150 iris. This research was supported by the Fonds National de la Recherche Scientifique Suisse, NSERC-Canada, and FCAR-Quebec.
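
    As a small illustration of the objective this abstract describes (not of the interior point method itself), the sketch below evaluates the sum of squared distances to cluster centroids for a given partition. The function name `sum_of_squares` and the label-list representation of a partition are assumptions made for this example.

```python
from collections import defaultdict


def sum_of_squares(points, labels):
    """Sum of squared Euclidean distances from each point to the
    centroid of its cluster (hypothetical helper, for illustration only)."""
    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # The centroid is the coordinate-wise mean of the cluster members.
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        # Add the squared distance of every member to that centroid.
        total += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return total


points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]
print(sum_of_squares(points, labels))
```

    An exact method such as the one in the abstract searches over all partitions into the given number of clusters for the one minimizing this value; the sketch only scores a single candidate partition.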

    Algorithms for Stable Matching and Clustering in a Grid

    We study a discrete version of a geometric stable marriage problem originally proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which points in the plane are stably matched to cluster centers, as prioritized by their distances, so that each cluster center is apportioned a set of points of equal area. We show that, for a discretization of the problem to an $n \times n$ grid of pixels with $k$ centers, the problem can be solved in time $O(n^2 \log^5 n)$, and we experiment with two slower but more practical algorithms and a hybrid method that switches from one of these algorithms to the other to gain greater efficiency than either algorithm alone. We also show how to combine geometric stable matchings with a $k$-means clustering algorithm, so as to provide a geometric political-districting algorithm that views distance in economic terms, and we experiment with weighted versions of stable $k$-means in order to improve the connectivity of the resulting clusters. Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th International Workshop on Combinatorial Image Analysis, June 19-21, 2017, Plovdiv, Bulgaria

    Faster Clustering via Preprocessing

    We examine the efficiency of clustering a set of points, when the encompassing metric space may be preprocessed in advance. In computational problems of this genre, there is a first stage of preprocessing, whose input is a collection of points $M$; the next stage receives as input a query set $Q \subset M$, and should report a clustering of $Q$ according to some objective, such as 1-median, in which case the answer is a point $a \in M$ minimizing $\sum_{q \in Q} d_M(a,q)$. We design fast algorithms that approximately solve such problems under standard clustering objectives like $p$-center and $p$-median, when the metric $M$ has low doubling dimension. By leveraging the preprocessing stage, our algorithms achieve query time that is near-linear in the query size $n = |Q|$, and is (almost) independent of the total number of points $m = |M|$. Comment: 24 pages
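
    For reference, the 1-median objective quoted in this abstract can be evaluated by brute force as in the sketch below. This is only a naive baseline with cost proportional to $|M| \cdot |Q|$, not the preprocessing-based algorithm the abstract describes; the function name `one_median` and the toy metric on integers are assumptions for illustration.

```python
def one_median(candidates, query, dist):
    """Return a point a in `candidates` minimizing sum(dist(a, q) for q in query).

    Naive baseline: it scans every candidate, so its cost grows with the
    total number of points, unlike the query-time bound in the abstract.
    """
    return min(candidates, key=lambda a: sum(dist(a, q) for q in query))


# Toy metric space: points on the integer line with absolute difference.
M = [0, 1, 2, 5, 9]
Q = [1, 2, 9]
print(one_median(M, Q, lambda a, b: abs(a - b)))
```

    On this toy instance the candidate sums are 12, 9, 8, 11, and 15, so the point 2 is returned.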

    New Frameworks for Offline and Streaming Coreset Constructions

    A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f: P \times Q \to \mathbb{R}$ is a cost function, then a set $S \subseteq P$ with weights $w: P \to [0,\infty)$ is an $\epsilon$-coreset for some parameter $\epsilon > 0$ if $\sum_{s \in S} w(s) f(s,q)$ is a $(1+\epsilon)$ multiplicative approximation to $\sum_{p \in P} f(p,q)$ for all $q \in Q$. Coresets are used to solve fundamental problems in machine learning under various big data models of computation. Many of the coresets suggested in the recent decade used, or could have used, a general framework for constructing coresets whose size depends quadratically on what is known as total sensitivity $t$. In this paper we improve this bound from $O(t^2)$ to $O(t \log t)$. Thus our results imply more space efficient solutions to a number of problems, including projective clustering, $k$-line clustering, and subspace approximation. Moreover, we generalize the notion of sensitivity sampling for sup-sampling that supports non-multiplicative approximations, negative cost functions and more. The main technical result is a generic reduction to the sample complexity of learning a class of functions with bounded VC dimension. We show that obtaining an $(\nu,\alpha)$-sample for this class of functions with appropriate parameters $\nu$ and $\alpha$ suffices to achieve space efficient $\epsilon$-coresets. Our result implies more efficient coreset constructions for a number of interesting problems in machine learning; we show applications to $k$-median/$k$-means, $k$-line clustering, $j$-subspace approximation, and the integer $(j,k)$-projective clustering problem.

    An ETH-Tight Exact Algorithm for Euclidean TSP

    We study exact algorithms for Euclidean TSP in $\mathbb{R}^d$. In the early 1990s algorithms with $n^{O(\sqrt{n})}$ running time were presented for the planar case, and some years later an algorithm with $n^{O(n^{1-1/d})}$ running time was presented for any $d \geq 2$. Despite significant interest in subexponential exact algorithms over the past decade, there has been no progress on Euclidean TSP, except for a lower bound stating that the problem admits no $2^{O(n^{1-1/d-\epsilon})}$ algorithm unless ETH fails. Up to constant factors in the exponent, we settle the complexity of Euclidean TSP by giving a $2^{O(n^{1-1/d})}$ algorithm and by showing that a $2^{o(n^{1-1/d})}$ algorithm does not exist unless ETH fails. Comment: To appear in FOCS 201