    Center-based Clustering under Perturbation Stability

    Clustering under most popular objective functions is NP-hard, even to approximate well, and so unlikely to be efficiently solvable in the worst case. Recently, Bilu and Linial \cite{Bilu09} suggested an approach aimed at bypassing this computational barrier by using properties of instances one might hope to hold in practice. In particular, they argue that instances in practice should be stable to small perturbations in the metric space and give an efficient algorithm for clustering instances of the Max-Cut problem that are stable to perturbations of size $O(n^{1/2})$. In addition, they conjecture that instances stable to as little as $O(1)$ perturbations should be solvable in polynomial time. In this paper we prove that this conjecture is true for any center-based clustering objective (such as $k$-median, $k$-means, and $k$-center). Specifically, we show we can efficiently find the optimal clustering assuming only stability to factor-3 perturbations of the underlying metric in spaces without Steiner points, and stability to factor $2+\sqrt{3}$ perturbations for general metrics. In particular, we show for such instances that the popular Single-Linkage algorithm combined with dynamic programming will find the optimal clustering. We also present NP-hardness results under a weaker but related condition.

    Clustering Under Perturbation Stability in Near-Linear Time

    We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is α-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most α. Our main contribution is in presenting efficient exact algorithms for α-stable clustering instances whose running times depend near-linearly on the size of the data set when α ≥ 2 + √3. For k-center and k-means problems, our algorithms also achieve polynomial dependence on the number of clusters, k, when α ≥ 2 + √3 + ε for any constant ε > 0 in any fixed dimension. For k-median, our algorithms have polynomial dependence on k for α > 5 in any fixed dimension; and for α ≥ 2 + √3 in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.
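The stability definition in the abstract above can be illustrated on a toy instance. The following is a minimal sketch, not the paper's algorithm: it assumes points on a line, solves k-median by brute force over centers chosen from the input points, and stretches each pairwise distance by an independent random factor in [1, α]. All function names are hypothetical. Passing this test is only a necessary condition for stability, since true stability quantifies over every admissible perturbation, not a random sample. The perturbed distances also need not satisfy the triangle inequality.

```python
import itertools
import random


def optimal_k_median(dist, k):
    """Brute-force k-median: try every set of k centers among the points."""
    n = len(dist)
    best_cost, best_clusters = float("inf"), None
    for centers in itertools.combinations(range(n), k):
        clusters = {c: [] for c in centers}
        cost = 0.0
        for p in range(n):
            # Assign each point to its nearest chosen center.
            nearest = min(centers, key=lambda c: dist[p][c])
            clusters[nearest].append(p)
            cost += dist[p][nearest]
        if cost < best_cost:
            best_cost = cost
            best_clusters = frozenset(frozenset(m) for m in clusters.values())
    return best_clusters


def perturb(dist, alpha, rng):
    """Stretch every pairwise distance by an independent factor in [1, alpha]."""
    n = len(dist)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            out[i][j] = out[j][i] = dist[i][j] * rng.uniform(1.0, alpha)
    return out


def survives_random_perturbations(points, k, alpha, trials=200, seed=0):
    """Return True if the optimal clustering is unchanged in every sampled trial.

    Assumption: points are real numbers and distance is absolute difference.
    """
    rng = random.Random(seed)
    dist = [[abs(a - b) for b in points] for a in points]
    base = optimal_k_median(dist, k)
    return all(
        optimal_k_median(perturb(dist, alpha, rng), k) == base
        for _ in range(trials)
    )


if __name__ == "__main__":
    # Two well-separated groups: the optimal 2-clustering should not change.
    pts = [0.0, 1.0, 2.0, 100.0, 101.0, 102.0]
    print(survives_random_perturbations(pts, k=2, alpha=3.0))
```

The brute-force solver is exponential in k and is only meant for instances of a handful of points; the near-linear running times discussed in the abstract rely on entirely different techniques.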

    Certified Algorithms: Worst-Case Analysis and Beyond

    In this paper, we introduce the notion of a certified algorithm. Certified algorithms provide worst-case and beyond-worst-case performance guarantees. First, a γ-certified algorithm is also a γ-approximation algorithm - it finds a γ-approximation no matter what the input is. Second, it exactly solves γ-perturbation-resilient instances (γ-perturbation-resilient instances model real-life instances). Additionally, certified algorithms have a number of other desirable properties: they solve both maximization and minimization versions of a problem (e.g. Max Cut and Min Uncut), solve weakly perturbation-resilient instances, and solve optimization problems with hard constraints. In the paper, we define certified algorithms, describe their properties, present a framework for designing certified algorithms, provide examples of certified algorithms for Max Cut/Min Uncut, Minimum Multiway Cut, k-medians and k-means. We also present some negative results.