6,517 research outputs found

    Fast multi-image matching via density-based clustering

    We consider the problem of finding consistent matches across multiple images. Previous state-of-the-art solutions use constraints on cycles of matches together with convex optimization, leading to computationally intensive iterative algorithms. In this paper, we propose a clustering-based formulation. We first rigorously show its equivalence with the previous one, and then propose QuickMatch, a novel algorithm that identifies multi-image matches from a density function in feature space. We use the density to order the points in a tree, and then extract the matches by breaking this tree using feature distances and measures of distinctiveness. Our algorithm outperforms previous state-of-the-art methods (such as MatchALS) in accuracy, is significantly faster (up to 62 times faster on some benchmarks), and scales to large datasets (with more than twenty thousand features).
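The density-tree idea in this abstract can be illustrated with a rough sketch. This is not the authors' QuickMatch implementation: the function name `quickmatch_like`, the Gaussian kernel, the fixed `max_edge` threshold (standing in for the paper's distinctiveness measure), and the rule that a cluster holds at most one feature per image are all assumptions made for illustration.

```python
import math

def quickmatch_like(features, image_ids, bandwidth=1.0, max_edge=1.0):
    """Cluster features with a density tree (illustrative sketch only).

    features  : list of equal-length tuples of floats
    image_ids : image index of each feature
    max_edge  : assumed fixed cut-off, used here in place of a
                per-feature distinctiveness measure
    """
    n = len(features)
    dist = [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]
    # Gaussian kernel density estimate at every feature.
    density = [sum(math.exp(-(d * d) / (2 * bandwidth ** 2)) for d in dist[i])
               for i in range(n)]

    # Tree edges: each point links to its nearest point of higher density
    # (ties broken by index so the links cannot form a cycle).
    edges = []
    for i in range(n):
        higher = [j for j in range(n) if (density[j], j) > (density[i], i)]
        if higher:
            parent = min(higher, key=lambda j: dist[i][j])
            edges.append((dist[i][parent], i, parent))

    # Union-find over the tree; an edge is "broken" (skipped) when it is
    # too long or when merging would put two features of one image together.
    root = list(range(n))
    images = [{image_ids[i]} for i in range(n)]

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for length, i, parent in sorted(edges):
        a, b = find(i), find(parent)
        if a != b and length <= max_edge and not (images[a] & images[b]):
            root[a] = b
            images[b] |= images[a]

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

For example, five features from three images that form two tight groups are split into two matches, with the long edge between the groups broken:

```python
features = [(0.0, 0.0), (0.1, 0.0), (0.05, 0.1), (5.0, 5.0), (5.1, 5.0)]
image_ids = [0, 1, 2, 0, 1]
print(quickmatch_like(features, image_ids))  # [[0, 1, 2], [3, 4]]
```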

    Efficient Computation of Multiple Density-Based Clustering Hierarchies

    HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts in the sense that a small change in mpts typically leads to only a small or no change in the clustering structure, choosing a "good" mpts value can be challenging: depending on the data distribution, a high or low value for mpts may be more appropriate, and certain data clusters may reveal themselves at different values of mpts. To explore results for a range of mpts values, however, one has to run HDBSCAN* for each value in the range independently, which is computationally inefficient. In this paper, we propose an efficient approach to compute all HDBSCAN* hierarchies for a range of mpts values by replacing the graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain the required information. An extensive experimental evaluation shows that with our approach one can obtain over one hundred hierarchies for the computational cost equivalent to running HDBSCAN* about 2 times. Comment: A short version of this paper appears at IEEE ICDM 2017. Corrected typos. Revised abstrac
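To make the cost concrete, the naive baseline that this abstract calls inefficient can be sketched as follows: for every mpts value, rebuild the complete mutual reachability graph and compute its minimum spanning tree from scratch. This is not the paper's method (which replaces the complete graph with a much smaller one). The function names are hypothetical, and the sketch assumes the usual HDBSCAN* convention that a point counts as its own first neighbor.

```python
import math

def core_distances(points, mpts):
    """Distance from each point to its mpts-th nearest neighbor,
    counting the point itself (assumed convention)."""
    return [sorted(math.dist(p, q) for q in points)[mpts - 1] for p in points]

def mutual_reachability(points, mpts):
    """Complete graph weights: max(core(a), core(b), d(a, b))."""
    core = core_distances(points, mpts)
    n = len(points)
    return [[0.0 if i == j
             else max(core[i], core[j], math.dist(points[i], points[j]))
             for j in range(n)] for i in range(n)]

def mst_edges(weights):
    """Prim's algorithm on a dense weight matrix; returns (u, v, w) edges."""
    n = len(weights)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges

def naive_hierarchy_msts(points, mpts_values):
    """One full O(n^2) graph and MST per mpts value: the repeated work
    that the abstract's approach is designed to avoid."""
    return {m: mst_edges(mutual_reachability(points, m)) for m in mpts_values}
```

Each MST is the structure from which a single HDBSCAN* hierarchy is derived, so exploring a range of mpts values this way repeats the full quadratic graph construction once per value.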

    Fully adaptive density-based clustering

    The clusters of a distribution are often defined by the connected components of a density level set. However, this definition depends on the user-specified level. We address this issue by proposing a simple, generic algorithm, which uses an almost arbitrary level set estimator to estimate the smallest level at which there is more than one connected component. In the case where this algorithm is fed with histogram-based level set estimates, we provide a finite sample analysis, which is then used to show that the algorithm consistently estimates both the smallest level and the corresponding connected components. We further establish rates of convergence for the two estimation problems, and last but not least, we present a simple, yet adaptive strategy for determining the width parameter of the involved density estimator in a data-dependent way. Comment: Published at http://dx.doi.org/10.1214/15-AOS1331 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
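The core idea of this abstract, scanning levels upward until the level set splits, can be sketched in one dimension with a plain histogram. This is a simplified illustration under stated assumptions, not the paper's algorithm: it omits the error tolerances and adaptive width selection the abstract mentions, treats a connected component as a run of adjacent histogram bins, and uses hypothetical function names.

```python
def histogram_density(samples, lo, hi, bins):
    """Histogram density estimate on [lo, hi] with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        if lo <= x <= hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    return [c / (len(samples) * width) for c in counts]

def components_at_level(density, level):
    """Runs of adjacent bins with density >= level, as inclusive (start, end)."""
    runs, start = [], None
    for i, d in enumerate(density):
        if d >= level:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(density) - 1))
    return runs

def smallest_splitting_level(density):
    """Smallest level whose level set has more than one component.

    The level set only changes at observed density values, so it is
    enough to test those. Returns (level, components), or None when
    the estimate never splits.
    """
    for level in sorted(set(density)):
        runs = components_at_level(density, level)
        if len(runs) > 1:
            return level, runs
    return None
```

A small usage example with two separated groups of samples:

```python
samples = [0.1, 0.15, 0.2, 0.3, 2.1, 2.2, 2.3]
density = histogram_density(samples, lo=0.0, hi=3.0, bins=3)
level, runs = smallest_splitting_level(density)
print(runs)  # [(0, 0), (2, 2)]
```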

    Stable and consistent density-based clustering

    We present a multiscale, consistent approach to density-based clustering that satisfies stability theorems -- in both the input data and in the parameters -- which hold without distributional assumptions. The stability in the input data is with respect to the Gromov--Hausdorff--Prokhorov distance on metric probability spaces and interleaving distances between (multi-parameter) hierarchical clusterings we introduce. We prove stability results for standard simplification procedures for hierarchical clusterings, which can be combined with our approach to yield a stable flat clustering algorithm. We illustrate the stability of the approach with computational examples. Our framework is based on the concepts of persistence and interleaving distance from Topological Data Analysis. Comment: 32 pages, 7 figures. v2: improves exposition, adds computational example