4,406 research outputs found

    Noisy Subspace Clustering via Thresholding

    Full text link
    We consider the problem of clustering noisy high-dimensional data points into a union of low-dimensional subspaces and a set of outliers. The number of subspaces, their dimensions, and their orientations are unknown. A probabilistic performance analysis of the thresholding-based subspace clustering (TSC) algorithm introduced recently in [1] shows that TSC succeeds in the noisy case, even when the subspaces intersect. Our results reveal an explicit tradeoff between the allowed noise level and the affinity of the subspaces. We furthermore find that the simple outlier detection scheme introduced in [1] provably succeeds in the noisy case.Comment: Presented at the IEEE Int. Symp. Inf. Theory (ISIT) 2013, Istanbul, Turkey. The version posted here corrects a minor error in the published version. Specifically, the exponent -c n_l in the success probability of Theorem 1 and in the corresponding proof outline has been corrected to -c(n_l-1

    Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit

    Full text link
    Subspace clustering methods based on â„“1\ell_1, â„“2\ell_2 or nuclear norm regularization have become very popular due to their simplicity, theoretical guarantees and empirical success. However, the choice of the regularizer can greatly impact both theory and practice. For instance, â„“1\ell_1 regularization is guaranteed to give a subspace-preserving affinity (i.e., there are no connections between points from different subspaces) under broad conditions (e.g., arbitrary subspaces and corrupted data). However, it requires solving a large scale convex optimization problem. On the other hand, â„“2\ell_2 and nuclear norm regularization provide efficient closed form solutions, but require very strong assumptions to guarantee a subspace-preserving affinity, e.g., independent subspaces and uncorrupted data. In this paper we study a subspace clustering method based on orthogonal matching pursuit. We show that the method is both computationally efficient and guaranteed to give a subspace-preserving affinity under broad conditions. Experiments on synthetic data verify our theoretical analysis, and applications in handwritten digit and face clustering show that our approach achieves the best trade off between accuracy and efficiency.Comment: 13 pages, 1 figure, 2 tables. Accepted to CVPR 2016 as an oral presentatio
    • …
    corecore