53 research outputs found

    Quickshift++: Provably Good Initializations for Sample-Based Mean Shift

    We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification. We then show strong clustering performance on real datasets as well as promising applications to image segmentation. Comment: ICML 2018. Code release: https://github.com/google/quickshif

    Noisy k-Means++ Revisited


    Generalized Markov Chain Monte Carlo Initialization for Clustering Gaussian Mixtures Using K-means

    Gaussian mixtures are considered a good approximation of real-life data, so any clustering algorithm that can efficiently cluster such mixtures is expected to work well in practical applications. K-means is popular for such applications given its ease of implementation and scalability, yet it suffers from poor seeding. Moreover, if the Gaussian mixture has overlapping clusters, k-means cannot separate them unless the initial conditions are good. K-means++ is a good seeding method, but it has high time complexity. It can be made fast by using Markov chain Monte Carlo sampling. This paper proposes a method that improves seed quality while retaining the speed of the sampling technique. The desired effects are demonstrated on several Gaussian mixtures.
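
For readers unfamiliar with the seeding step this abstract refers to, the following is a minimal sketch of standard k-means++ (D²) seeding in plain Python. It is not the method proposed in the paper, and it does not include the Markov chain Monte Carlo speed-up; the function names `squared_distance` and `kmeanspp_seeds` are illustrative choices, not taken from any of the listed works. Each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen, which requires a full pass over the data per seed and is the cost the abstract describes as high.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k seeds from `points` by standard k-means++ (D^2) sampling.

    Illustrative sketch only: `points` is a list of numeric tuples and
    `rng` is an optional random.Random instance for reproducibility.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = rng or random.Random(0)

    # The first seed is chosen uniformly at random.
    seeds = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]

    while len(seeds) < k:
        if sum(d2) == 0:
            # Every point coincides with a seed; fall back to uniform choice.
            seeds.append(rng.choice(points))
            continue
        # Sample an index with probability proportional to d2 (one full pass).
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        seeds.append(points[idx])
        # Update nearest-seed distances against the newly added seed.
        d2 = [min(d, squared_distance(p, points[idx])) for d, p in zip(d2, points)]
    return seeds


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeanspp_seeds(data, 2, random.Random(42)))
```

Because a point that is already a seed has weight zero, it cannot be drawn again while any unchosen point remains at nonzero distance, so the seeds are spread across the data rather than clustered together.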