15,341 research outputs found

    Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs

    Full text link
    We develop data structures for dynamic closest pair problems with arbitrary distance functions, that do not necessarily come from any geometric structure on the objects. Based on a technique previously used by the author for Euclidean closest pairs, we show how to insert and delete objects from an n-object set, maintaining the closest pair, in O(n log^2 n) time per update and O(n) space. With quadratic space, we can instead use a quadtree-like structure to achieve an optimal time bound, O(n) per update. We apply these data structures to hierarchical clustering, greedy matching, and TSP heuristics, and discuss other potential applications in machine learning, Groebner bases, and local improvement algorithms for partition and placement problems. Experiments show our new methods to be faster in practice than previously used heuristics.Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp. 619-628. For source code and experimental results, see http://www.ics.uci.edu/~eppstein/projects/pairs

    Nearness to Local Subspace Algorithm for Subspace and Motion Segmentation

    Get PDF
    There is a growing interest in computer science, engineering, and mathematics for modeling signals in terms of union of subspaces and manifolds. Subspace segmentation and clustering of high dimensional data drawn from a union of subspaces are especially important with many practical applications in computer vision, image and signal processing, communications, and information theory. This paper presents a clustering algorithm for high dimensional data that comes from a union of lower dimensional subspaces of equal and known dimensions. Such cases occur in many data clustering problems, such as motion segmentation and face recognition. The algorithm is reliable in the presence of noise, and applied to the Hopkins 155 Dataset, it generates the best results to date for motion segmentation. The two motion, three motion, and overall segmentation rates for the video sequences are 99.43%, 98.69%, and 99.24%, respectively
    • …
    corecore