8,647 research outputs found

    A bi-criteria approximation algorithm for k-Means

    Get PDF
    We consider the classical k-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output βk > k clusters and must produce a clustering with cost at most α times the cost of the optimal set of k clusters. We argue that this approach is natural in many settings in which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee α(β) depending on the number βk of clusters that may be opened. Our guarantee α(β) is always at most 9 + ε and improves rapidly with β (for example: α(2) < 2.59 and α(3) < 1.4). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.
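
    To make the bi-criteria notion concrete: a (α, β)-guarantee compares the cost of a solution that opens βk centers against the optimal cost with only k centers. The sketch below illustrates this comparison on a toy instance; the greedy farthest-point seeding is only a hypothetical stand-in, not the paper's LP or local-search algorithms.

```python
# Illustrative sketch of the bi-criteria guarantee for k-means (toy instance only).
from itertools import combinations

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest chosen center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def opt_cost(points, k):
    """Exact optimum with centers restricted to the input points (tiny instances only)."""
    return min(kmeans_cost(points, C) for C in combinations(points, k))

def farthest_point_centers(points, m):
    """Stand-in heuristic (NOT the paper's algorithm): greedily pick m spread-out centers."""
    centers = [points[0]]
    while len(centers) < m:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.1), (4.0, 1.0)]
k, beta = 2, 2

opt_k = opt_cost(points, k)                         # best cost achievable with k centers
solution = farthest_point_centers(points, beta * k) # a beta*k-center solution
alpha = kmeans_cost(points, solution) / opt_k       # realized bi-criteria factor

print(f"OPT_k = {opt_k:.3f}, cost with {beta * k} centers = "
      f"{kmeans_cost(points, solution):.3f}, alpha = {alpha:.3f}")
```

    Opening more than k centers can only lower the cost, which is why the attainable factor α(β) shrinks as β grows; the paper quantifies this trade-off for polynomial-time algorithms.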

    Maximum gradient embeddings and monotone clustering

    Full text link
    Let (X,d_X) be an n-point metric space. We show that there exists a distribution D over non-contractive embeddings into trees f:X-->T such that for every x in X, the expectation with respect to D of the maximum over y in X of the ratio d_T(f(x),f(y)) / d_X(x,y) is at most C (log n)^2, where C is a universal constant. Conversely, we show that the above quadratic dependence on log n cannot be improved in general. Such embeddings, which we call maximum gradient embeddings, yield a framework for the design of approximation algorithms for a wide range of clustering problems with monotone costs, including fault-tolerant versions of k-median and facility location. Comment: 25 pages, 2 figures. Final version, minor revision of the previous one. To appear in Combinatorica.
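
    As an illustration of the quantity being bounded, the snippet below (a hypothetical helper, not code from the paper) computes the per-point maximum gradient max over y != x of d_T(f(x),f(y)) / d_X(x,y) for a fixed embedding of a toy metric space into a star tree; the theorem controls the expectation of this quantity under a suitable random choice of tree embedding.

```python
# Hypothetical helper: per-point "maximum gradient" of an embedding f of a finite
# metric space (X, d_X) into a tree metric (T, d_T).
def max_gradient(points, d_X, d_T, f):
    """Return {x: max over y != x of d_T(f(x), f(y)) / d_X(x, y)}."""
    return {
        x: max(d_T(f[x], f[y]) / d_X(x, y) for y in points if y != x)
        for x in points
    }

# Toy example: four points on a line segment of length 1.5, embedded as leaves of a
# star whose hub is at distance 1 from every leaf (so d_T = 2 between distinct leaves).
# The embedding is non-contractive because every pairwise d_X is at most 1.5 <= 2.
points = ["a", "b", "c", "d"]
coord = {"a": 0.0, "b": 0.5, "c": 1.0, "d": 1.5}
d_X = lambda x, y: abs(coord[x] - coord[y])
f = {p: p for p in points}                      # leaf labels double as tree vertices
d_T = lambda u, v: 0.0 if u == v else 2.0

print(max_gradient(points, d_X, d_T, f))
# {'a': 4.0, 'b': 4.0, 'c': 4.0, 'd': 4.0}; e.g. at 'a': max(2/0.5, 2/1.0, 2/1.5) = 4.0
```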

    Fast Clustering with Lower Bounds: No Customer too Far, No Shop too Small

    Full text link
    We study the Lower-Bounded Center (LBC) problem, a clustering problem that can be viewed as a variant of the k-Center problem. In the LBC problem, we are given a set of points P in a metric space and a lower bound λ, and the goal is to select a set C ⊆ P of centers and an assignment that maps each point in P to a center of C such that each center of C is assigned at least λ points. The price of an assignment is the maximum distance between a point and the center it is assigned to, and the goal is to find a set of centers and an assignment of minimum price. We give a constant-factor approximation algorithm for the LBC problem that runs in O(n log n) time when the input points lie in the d-dimensional Euclidean space R^d, where d is a constant. We also prove that this problem cannot be approximated within a factor of 1.8 - ε unless P = NP, even if the input points are points in the Euclidean plane R^2. Comment: 14 pages
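
    To make the objective concrete, here is a small checker (a hypothetical sketch, not the paper's O(n log n) algorithm) that takes candidate centers and an assignment, verifies the lower bound λ, and reports the price, i.e. the maximum point-to-assigned-center distance.

```python
# Hypothetical evaluator for a Lower-Bounded Center solution (not the paper's algorithm).
import math
from collections import Counter

def lbc_price(points, centers, assign, lam):
    """Price of an assignment, or None if some center is assigned fewer than lam points."""
    load = Counter(assign[p] for p in points)
    if any(load[c] < lam for c in centers):
        return None                               # lower bound violated
    return max(math.dist(p, assign[p]) for p in points)

# Two tight groups of three points each; assign every point to its nearest center.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers = [(0, 0), (10, 10)]
assign = {p: min(centers, key=lambda c: math.dist(p, c)) for p in points}

print(lbc_price(points, centers, assign, lam=3))  # 1.0: each center serves exactly 3 points
print(lbc_price(points, centers, assign, lam=4))  # None: 6 points cannot give both centers 4
```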

    Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms

    Get PDF
    DOI: https://doi.org/10.1109/FOCS.2017.15