327 research outputs found

    The larger the better: Analysis of a scalable spectral clustering algorithm with cosine similarity

    Chen (2018) proposed a scalable spectral clustering algorithm for cosine similarity to handle the task of clustering large data sets. It runs extremely fast, with complexity linear in the size of the data, and achieves state-of-the-art accuracy. This paper conducts a perturbation analysis of the algorithm to understand the effect of discarding a perturbation term in an eigendecomposition step. Our results show that the accuracy of the approximation by the scalable algorithm depends on the connectivity of the clusters, their separation, and their sizes, and that the approximation is especially accurate for large data sets.
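A minimal sketch of how linear scaling can arise for cosine similarity, not taken from the paper itself: with unit-norm rows, the similarity matrix is W = X Xᵀ − I, so its degrees and the leading eigenvectors of the normalized matrix can be obtained from X alone once the diagonal term D⁻¹ is discarded. The function name, the use of a dense SVD, and the assumption that all degrees are positive (for example, nonnegative data) are illustrative choices.

```python
import numpy as np

def cosine_degrees_and_embedding(X, k):
    """X: (n, d) array with nonzero rows. Returns degrees, top-k vectors, singular values."""
    # Row-normalize so that inner products equal cosine similarities.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Degrees of W = Xn Xn^T - I, computed without forming the n x n matrix.
    degrees = Xn @ (Xn.T @ np.ones(X.shape[0])) - 1.0
    # Scale rows by D^{-1/2}; this assumes every degree is positive.
    X_tilde = Xn / np.sqrt(degrees)[:, None]
    # Left singular vectors of X_tilde are eigenvectors of X_tilde X_tilde^T,
    # which equals D^{-1/2} W D^{-1/2} with the diagonal term D^{-1} dropped.
    U, s, _ = np.linalg.svd(X_tilde, full_matrices=False)
    return degrees, U[:, :k], s[:k]
```

The rows of the returned embedding would then typically be clustered, for instance with k-means; that step is omitted here.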

    Guaranteed Sparse Signal Recovery with Highly Coherent Sensing Matrices

    Compressive sensing is a methodology for the reconstruction of sparse or compressible signals using far fewer samples than required by the Nyquist criterion. However, many of the results in compressive sensing concern random sampling matrices such as Gaussian and Bernoulli matrices. In common physically feasible signal acquisition and reconstruction scenarios, such as super-resolution of images, the sensing matrix has a non-random structure with highly correlated columns. Here we present a compressive-sensing-type recovery algorithm, called Partial Inversion (PartInv), that overcomes the correlations among the columns. We provide theoretical justification as well as empirical comparisons.
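One standard way to quantify "highly correlated columns" is the mutual coherence of the sensing matrix: the largest absolute normalized inner product between two distinct columns. The abstract does not say which measure the authors use, so the sketch below is an assumption for illustration only, and it does not implement PartInv.

```python
import numpy as np

def mutual_coherence(A):
    """Largest |<a_i, a_j>| / (||a_i|| ||a_j||) over distinct columns i != j of A."""
    # Normalize each column to unit length (assumes no zero columns).
    cols = A / np.linalg.norm(A, axis=0, keepdims=True)
    # Absolute Gram matrix of the normalized columns.
    G = np.abs(cols.T @ cols)
    # Ignore the diagonal, which compares each column with itself.
    np.fill_diagonal(G, 0.0)
    return G.max()
```

A value near 0 means nearly orthogonal columns; a value near 1 means at least two columns are almost parallel, which is the coherent regime the abstract describes.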

    Multiscale Geometric Methods for Data Sets II: Geometric Multi-Resolution Analysis

    Data sets are often modeled as point clouds in $\mathbb{R}^D$, for $D$ large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a $d$-dimensional manifold $M$, with $d$ much smaller than $D$. When $M$ is simply a linear subspace, one may exploit this assumption for encoding efficiently the data by projecting onto a dictionary of $d$ vectors in $\mathbb{R}^D$ (for example found by SVD), at a cost $(n+D)d$ for $n$ data points. When $M$ is nonlinear, there are no "explicit" constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries, or dictionaries obtained by black-box optimization. In this paper we construct data-dependent multi-scale dictionaries that aim at efficient encoding and manipulating of the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa. In addition, data points are guaranteed to have a sparse representation in terms of the dictionary. We think of dictionaries as the analogue of wavelets, but for approximating point clouds rather than functions. Comment: Re-formatted using AMS styl
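The linear-subspace case mentioned in the abstract can be sketched as follows: an SVD supplies a dictionary of d vectors in R^D, and each of the n points is stored as d coefficients, so the total storage is nd + Dd = (n+D)d numbers. The function names are hypothetical, and the sketch assumes a subspace through the origin (no mean-centering); it does not implement the paper's multi-scale construction.

```python
import numpy as np

def svd_encode(X, d):
    """X: (n, D) data. Returns coefficients (n, d) and a dictionary (d, D)."""
    # Rows of Vt are orthonormal directions in R^D, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    dictionary = Vt[:d]
    # Coefficients are the projections of each point onto the dictionary vectors.
    coeffs = X @ dictionary.T
    return coeffs, dictionary

def svd_decode(coeffs, dictionary):
    """Map dictionary coefficients back to points in R^D."""
    return coeffs @ dictionary
```

When the data lie exactly on a d-dimensional linear subspace, decoding recovers them up to floating-point error; otherwise the result is the best rank-d approximation in the least-squares sense.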

    Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

    The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem, and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001). Comment: 40 pages. Minor changes to the previous version (mainly revised Sections 2.2 & 2.3, and added references). Accepted to the Journal of Foundations of Computational Mathematic

    TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer

    Estimating the 6D object pose is an essential task in many applications. Due to the lack of depth information, existing RGB-based methods are sensitive to occlusion and illumination changes. Extracting and utilizing the geometric features in depth information is crucial for accurate predictions. To this end, we propose TransPose, a novel 6D pose framework that exploits a Transformer encoder with a geometry-aware module to learn better point cloud feature representations. Specifically, we first uniformly sample the point cloud and extract local geometry features with a local feature extractor based on a graph convolution network. To improve robustness to occlusion, we adopt a Transformer to exchange global information, so that each local feature contains global information. Finally, we introduce a geometry-aware module in the Transformer encoder, which forms an effective constraint for point cloud feature learning and couples the global information exchange more tightly with point cloud tasks. Extensive experiments indicate the effectiveness of TransPose; our pose estimation pipeline achieves competitive results on three benchmark datasets. Comment: 10 pages, 5 figures, IEEE Journa