Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs
Sparse subspace clustering (SSC) is one of the current state-of-the-art
methods for partitioning data points into the union of subspaces, with strong
theoretical guarantees. However, it is not practical for large data sets as it
requires solving a LASSO problem for each data point, where the number of
variables in each LASSO problem is the number of data points. To improve the
scalability of SSC, we propose to select a few sets of anchor points using a
randomized hierarchical clustering method, and, for each set of anchor points,
solve the LASSO problems for each data point allowing only anchor points to
have a non-zero weight (this drastically reduces the number of variables). This
generates a multilayer graph where each layer corresponds to a different set of
anchor points. Using the Grassmann manifold of orthogonal matrices, the shared
connectivity among the layers is summarized within a single subspace. Finally,
we use $k$-means clustering within that subspace to cluster the data points,
similarly to what is done by spectral clustering in SSC. We show on both synthetic and
real-world data sets that the proposed method not only allows SSC to scale to
large-scale data sets, but that it is also much more robust as it performs
significantly better on noisy data and on data with close subspaces and
outliers, while it is not prone to oversegmentation.
Comment: 25 pages, v2: typos corrected
Efficient Solvers for Sparse Subspace Clustering
Sparse subspace clustering (SSC) clusters $n$ points that lie near a union of
low-dimensional subspaces. The SSC model expresses each point as a linear or
affine combination of the other points, using either $\ell_1$ or $\ell_0$
regularization. Using $\ell_1$ regularization results in a convex problem but
requires $\mathcal{O}(n^2)$ storage, and is typically solved by the alternating direction
method of multipliers which takes $\mathcal{O}(n^3)$ flops. The $\ell_0$ model is
non-convex but only needs memory linear in $n$, is solved via orthogonal
matching pursuit, and cannot handle the case of affine subspaces. This paper
shows that a proximal gradient framework can solve SSC, covering both the $\ell_1$
and $\ell_0$ models, and both linear and affine constraints. For both $\ell_1$
and $\ell_0$, algorithms to compute the proximity operator in the presence of
affine constraints have not been presented in the SSC literature, so we derive
an exact and efficient algorithm that solves the $\ell_1$ case with just $\mathcal{O}(n \log n)$
flops. In the $\ell_0$ case, our algorithm retains the low-memory
overhead, and is the first algorithm to solve the SSC-$\ell_0$ model with
affine constraints. Experiments show our algorithms do not rely on sensitive
regularization parameters, and they are less sensitive to sparsity
misspecification and high noise.
Comment: This paper is accepted for publication in Signal Processing
Exactly Robust Kernel Principal Component Analysis
Robust principal component analysis (RPCA) can recover low-rank matrices that
are corrupted by sparse noise. In practice, however, many matrices are of high
rank and hence cannot be recovered by RPCA. We propose a novel method
called robust kernel principal component analysis (RKPCA) to decompose a
partially corrupted matrix as a sparse matrix plus a high- or full-rank matrix
with low latent dimensionality. RKPCA can be applied to many problems, such as
noise removal and subspace clustering, and is, to date, the only unsupervised
nonlinear method robust to sparse noise. Our theoretical analysis shows that,
with high probability, RKPCA can provide high recovery accuracy. The
optimization of RKPCA involves nonconvex, non-differentiable problems. We
propose two nonconvex optimization algorithms for RKPCA: the alternating
direction method of multipliers with backtracking line search, and proximal
linearized minimization with adaptive step size. Comparative studies in noise
removal and robust subspace clustering corroborate the effectiveness and
superiority of RKPCA.
Comment: The paper was accepted by IEEE Transactions on Neural Networks and
Learning Systems
On a minimum enclosing ball of a collection of linear subspaces
This paper concerns the minimax center of a collection of linear subspaces.
When the subspaces are $k$-dimensional subspaces of $\mathbb{R}^n$, this can be
cast as finding the center of a minimum enclosing ball on a Grassmann manifold,
$\mathrm{Gr}(k,n)$. For subspaces of different dimension, the setting becomes a disjoint
union of Grassmannians rather than a single manifold, and the problem is no
longer well-defined. However, natural geometric maps exist between these
manifolds with a well-defined notion of distance for the images of the
subspaces under the mappings. Solving the initial problem in this context leads
to a candidate minimax center on each of the constituent manifolds, but does
not inherently provide intuition about which candidate is the best
representation of the data. Additionally, the solutions of different rank are
generally not nested, so a deflationary approach will not suffice, and the
problem must be solved independently on each manifold. We propose and solve an
optimization problem parametrized by the rank of the minimax center. The
solution is computed using a subgradient algorithm on the dual. By scaling the
objective and penalizing the information lost by the rank-$k$ minimax center,
we jointly recover an optimal dimension, $k^*$, and a central subspace,
$U^* \in \mathrm{Gr}(k^*, n)$, at the center of the minimum enclosing ball that best
represents the data.
Comment: 26 pages
Beyond Linear Subspace Clustering: A Comparative Study of Nonlinear Manifold Clustering Algorithms
Subspace clustering is an important unsupervised clustering approach. It is
based on the assumption that the high-dimensional data points are approximately
distributed around several low-dimensional linear subspaces. The majority of
the prominent subspace clustering algorithms rely on the representation of the
data points as linear combinations of other data points, which is known as a
self-expressive representation. To overcome the restrictive linearity
assumption, numerous nonlinear approaches have been proposed to extend successful
subspace clustering methods to data on a union of nonlinear manifolds. In
this comparative study, we provide a comprehensive overview of nonlinear
subspace clustering approaches proposed in the last decade. We introduce a new
taxonomy to classify the state-of-the-art approaches into three categories,
namely locality preserving, kernel based, and neural network based. The major
representative algorithms within each category are extensively compared on
carefully designed synthetic and real-world data sets. The detailed analysis of
these approaches reveals potential research directions and unsolved challenges
in this field.
Comment: 55 pages