Randomized hybrid linear modeling by local best-fit flats
The hybrid linear modeling problem is to identify a set of d-dimensional
affine sets in a D-dimensional Euclidean space. It arises, for example, in
object tracking and structure from motion. The hybrid linear model can be
regarded as the second-simplest manifold model of data, after the purely
linear one. In this paper we present a simple geometric method for hybrid
linear modeling based on selecting a set of local best-fit flats that
minimize a global l1 error measure. The size of the local neighborhoods is
determined automatically by Jones' l2 beta numbers; it is proven under
certain geometric conditions that good local neighborhoods exist and are
found by our method. We also demonstrate how to use this algorithm for fast
determination of the number of affine subspaces. We give extensive
experimental evidence demonstrating the state-of-the-art accuracy and speed
of the algorithm on
synthetic and real hybrid linear data.
Comment: To appear in the proceedings of CVPR 2010.
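As a rough, hypothetical sketch (not the authors' implementation, and with the beta-number neighborhood selection omitted), the core step of fitting local flats by PCA and scoring a candidate set of flats by a global l1 error could look as follows; all function names are illustrative:

```python
import numpy as np

def best_fit_flat(points, d):
    """Least-squares d-dimensional affine flat through `points` via PCA."""
    center = points.mean(axis=0)
    # Rows of Vt are orthonormal; the top d span the flat's directions.
    _, _, Vt = np.linalg.svd(points - center, full_matrices=False)
    return center, Vt[:d]

def dist_to_flat(X, center, basis):
    """Euclidean distance from each row of X to the affine flat."""
    Y = X - center
    return np.linalg.norm(Y - (Y @ basis.T) @ basis, axis=1)

def global_l1_error(X, flats):
    """Sum over all points of the distance to the nearest candidate flat."""
    dists = np.stack([dist_to_flat(X, c, B) for c, B in flats])
    return dists.min(axis=0).sum()
```

In the paper's setting the candidate flats come from local neighborhoods of the data; the contribution is choosing neighborhood sizes automatically via the beta numbers and then selecting a subset of local flats with small global error.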
CUR Decompositions, Similarity Matrices, and Subspace Clustering
A general framework for solving the subspace clustering problem using the CUR
decomposition is presented. The CUR decomposition provides a natural way to
construct similarity matrices for data that come from a union of unknown
subspaces. The similarity
matrices thus constructed give the exact clustering in the noise-free case.
Additionally, this decomposition gives rise to many distinct similarity
matrices from a given set of data, which allow enough flexibility to perform
accurate clustering of noisy data. We also show that two known methods for
subspace clustering can be derived from the CUR decomposition. An algorithm
based on the theoretical construction of similarity matrices is presented, and
experiments on synthetic and real data are reported to test the method.
Additionally, an adaptation of our CUR-based similarity matrices is utilized
to provide a heuristic algorithm for subspace clustering; this algorithm yields
the best overall performance to date for clustering the Hopkins155 motion
segmentation dataset.
Comment: Approximately 30 pages. The current version contains an improved algorithm and numerical experiments relative to the previous version.
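As background, the shape interaction matrix of Costeira and Kanade is a standard similarity construction of this flavor, and plausibly among the "known methods" the abstract says can be derived from the CUR decomposition. A minimal sketch of that construction from the skinny SVD, assuming noise-free data of known rank r drawn from independent subspaces (the function name is illustrative):

```python
import numpy as np

def shape_interaction_similarity(W, r):
    """Similarity matrix |V_r V_r^T| from the rank-r skinny SVD of the
    D x N data matrix W. For noise-free data from independent subspaces
    this is block-diagonal up to a permutation of the points."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    Vr = Vt[:r].T                # N x r matrix of right singular vectors
    return np.abs(Vr @ Vr.T)

# Spectral clustering on the result then recovers the clusters, e.g. with
# sklearn.cluster.SpectralClustering(affinity="precomputed").
```

The paper's CUR-based constructions yield a whole family of such similarity matrices from one dataset; the exact correspondence is developed in the paper and not reproduced here.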
Reduced row echelon form and non-linear approximation for subspace segmentation and high-dimensional data clustering
Given a set of data W = {w_1, …, w_N} ⊂ R^D drawn from a union of subspaces, we focus on determining a nonlinear model of the form U = ⋃_{i∈I} S_i, where {S_i ⊂ R^D}_{i∈I} is a set of subspaces, that is nearest to W. The model is then used to classify W into clusters. Our approach is based on the binary reduced row echelon form of the data matrix, combined with an iterative scheme based on a non-linear approximation method. We prove that, in the absence of noise, our approach can find the number of subspaces, their dimensions, and an orthonormal basis for each subspace S_i. We provide a comprehensive analysis of our theory and determine its limitations and strengths in the presence of outliers and noise.
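For intuition, a minimal sketch of the noise-free core idea as stated in the abstract, namely row-reduce the data matrix, binarize, and group columns by their 0/1 pattern, might look as follows. Function names are illustrative; the iterative non-linear approximation scheme that handles noise and outliers is not shown, and the grouping rule assumes independent subspaces and points in general position:

```python
import numpy as np

def binary_rref(W, tol=1e-10):
    """Reduced row echelon form of W with entries binarized (nonzero -> 1)."""
    A = np.array(W, dtype=float)
    m, n = A.shape
    row = 0
    for col in range(n):
        if row == m:
            break
        p = row + np.argmax(np.abs(A[row:, col]))
        if abs(A[p, col]) < tol:
            continue                      # no pivot in this column
        A[[row, p]] = A[[p, row]]         # partial pivoting
        A[row] /= A[row, col]
        others = [i for i in range(m) if i != row]
        A[others] -= np.outer(A[others, col], A[row])
        row += 1
    return (np.abs(A) > tol).astype(int)

def group_columns(B):
    """Assign columns sharing the same binary pattern to one cluster."""
    groups = {}
    for j in range(B.shape[1]):
        groups.setdefault(tuple(B[:, j]), []).append(j)
    return list(groups.values())
```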
Non-Asymptotic Analysis of Tangent Space Perturbation
Constructing an efficient parameterization of a large, noisy data set of
points lying close to a smooth manifold in high dimension remains a fundamental
problem. One approach consists in recovering a local parameterization using the
local tangent plane. Principal component analysis (PCA) is often the tool of
choice, as it returns an optimal basis in the case of noise-free samples from a
linear subspace. To process noisy data samples from a nonlinear manifold, PCA
must be applied locally, at a scale small enough such that the manifold is
approximately linear, but at a scale large enough such that structure may be
discerned from noise. Using eigenspace perturbation theory and non-asymptotic
random matrix theory, we study the stability of the subspace estimated by PCA
as a function of scale, and bound (with high probability) the angle it forms
with the true tangent space. By adaptively selecting the scale that minimizes
this bound, our analysis reveals an appropriate scale for local tangent plane
recovery. We also introduce a geometric uncertainty principle quantifying the
limits of noise-curvature perturbation for stable recovery. With the purpose of
providing perturbation bounds that can be used in practice, we propose plug-in
estimates that make it possible to directly apply the theoretical results to
real data sets.
Comment: 53 pages. Revised manuscript with new content addressing application of results to real data sets.
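As a loose illustration of the scale-selection idea (not the paper's actual perturbation bound, which is derived from eigenspace perturbation and non-asymptotic random matrix theory), one could run local PCA over a range of neighborhood sizes and keep the scale that maximizes a simple spectral-gap proxy. The function names and the proxy criterion below are assumptions:

```python
import numpy as np

def local_pca(X, i, k, d):
    """PCA of the k nearest neighbors of X[i]; returns a d-dimensional
    tangent basis estimate and the local covariance eigenvalues."""
    idx = np.argsort(np.linalg.norm(X - X[i], axis=1))[:k]
    nbrs = X[idx] - X[idx].mean(axis=0)
    _, s, Vt = np.linalg.svd(nbrs, full_matrices=False)
    return Vt[:d], s ** 2 / k

def choose_scale(X, i, d, ks):
    """Pick the neighborhood size in `ks` maximizing the relative eigengap
    between the d-th and (d+1)-th local eigenvalues -- a crude proxy for
    the paper's angle bound. Assumes each k in ks and the ambient
    dimension both exceed d + 1."""
    best_k, best_gap, best_T = None, -np.inf, None
    for k in ks:
        T, lam = local_pca(X, i, k, d)
        gap = (lam[d - 1] - lam[d]) / max(lam[0], 1e-12)
        if gap > best_gap:
            best_k, best_gap, best_T = k, gap, T
    return best_k, best_T          # chosen scale and tangent estimate
```

Too small a scale lets noise dominate the local covariance; too large a scale lets curvature bend the neighborhood away from the tangent plane, which is exactly the trade-off the paper's bound quantifies.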
Riemannian Multi-Manifold Modeling
This paper advocates a novel framework for segmenting a dataset in a
Riemannian manifold M into clusters lying around low-dimensional submanifolds
of M. Important examples of M, for which the proposed clustering algorithm
is computationally efficient, are the sphere, the set of positive definite
matrices, and the Grassmannian. The clustering problem with these examples of
M is already useful for numerous application domains such as action
identification in video sequences, dynamic texture clustering, brain fiber
segmentation in medical imaging, and clustering of deformed images. The
proposed clustering algorithm constructs a data-affinity matrix by thoroughly
exploiting the intrinsic geometry and then applies spectral clustering. The
intrinsic local geometry is encoded by local sparse coding and more importantly
by directional information of local tangent spaces and geodesics. Theoretical
guarantees are established for a simplified variant of the algorithm even when
the clusters intersect. To avoid complication, these guarantees assume that the
underlying submanifolds are geodesic. Extensive validation on synthetic and
real data demonstrates the resiliency of the proposed method against deviations
from the theoretical model as well as its superior performance over
state-of-the-art techniques.
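A crude stand-in for the affinity construction, specialized to the sphere example, is sketched below; the paper's actual affinity also uses local sparse coding, and the functions here are illustrative, assuming unit-norm rows and k >= d:

```python
import numpy as np

def sphere_log(x, y):
    """Riemannian log map on the unit sphere: tangent vector at x toward y."""
    c = np.clip(x @ y, -1.0, 1.0)
    v = y - c * x
    n = np.linalg.norm(v)
    return (np.arccos(c) / n) * v if n > 1e-12 else np.zeros_like(x)

def tangent_affinity(X, d, k, sigma):
    """Affinity combining geodesic proximity with local tangent alignment
    for points X (rows, unit vectors) on the sphere."""
    N = len(X)
    G = np.arccos(np.clip(X @ X.T, -1.0, 1.0))    # geodesic distances
    T = []
    for i in range(N):
        nbrs = np.argsort(G[i])[1:k + 1]           # skip the point itself
        V = np.stack([sphere_log(X[i], X[j]) for j in nbrs])
        _, _, Vt = np.linalg.svd(V, full_matrices=False)
        T.append(Vt[:d])                           # d x D tangent basis
    A = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            align = np.linalg.norm(T[i] @ T[j].T)  # tangent-space alignment
            A[i, j] = np.exp(-(G[i, j] / sigma) ** 2) * align
    return (A + A.T) / 2
```

Spectral clustering is then applied to the resulting affinity matrix, as in the paper's pipeline.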
Endogenous Sparse Recovery
Sparsity has proven to be an essential ingredient in the development of efficient solutions to a number of problems in signal processing and machine learning. In all of these settings, sparse recovery methods are employed to recover signals that admit sparse representations in a pre-specified basis. Recently, sparse recovery methods have been employed in an entirely new way; instead of finding a sparse representation of a signal in a fixed basis, a sparse representation is formed "from within" the data. In this thesis, we study the utility of this endogenous sparse recovery procedure for learning unions of subspaces from collections of high-dimensional data. We provide new insights into the behavior of endogenous sparse recovery, develop sufficient conditions that describe when greedy methods will reveal local estimates of the subspaces in the ensemble, and introduce new methods to learn unions of overlapping subspaces from local subspace estimates.
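Since the thesis studies greedy methods for endogenous sparse recovery, a minimal sketch of the self-expressive idea via orthogonal matching pursuit, with the data itself serving as the dictionary, may help. This is an illustrative OMP variant under assumed names, not the thesis' exact algorithm:

```python
import numpy as np

def endogenous_omp(X, s):
    """Greedy (OMP) self-expression: each column of the D x N matrix X is
    approximated by an s-sparse combination of the *other* columns."""
    _, N = X.shape
    C = np.zeros((N, N))
    for j in range(N):
        y = X[:, j]
        residual = y.copy()
        support = []
        for _ in range(s):
            corr = np.abs(X.T @ residual)
            corr[[j] + support] = -np.inf    # never select the point itself
            support.append(int(np.argmax(corr)))
            coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
            residual = y - X[:, support] @ coef
        C[support, j] = coef
    return C

# A symmetric affinity for spectral clustering: np.abs(C) + np.abs(C).T
```

When each point's greedy representation draws only on points from its own subspace, the induced affinity is subspace-preserving; the thesis' sufficient conditions characterize when this happens.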