Subspace Clustering via Optimal Direction Search
This letter presents a new spectral-clustering-based approach to the subspace
clustering problem. Underpinning the proposed method is a convex program for
optimal direction search, which for each data point d finds an optimal
direction in the span of the data that has minimum projection on the other data
points and non-vanishing projection on d. The obtained directions are
subsequently leveraged to identify a neighborhood set for each data point. An
alternating direction method of multipliers framework is provided to
efficiently solve for the optimal directions. The proposed method is shown to
notably outperform existing subspace clustering methods, particularly in
difficult scenarios involving high levels of noise and close subspaces, and
yields state-of-the-art results for the problem of face clustering using
subspace segmentation.
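One way to write down the direction search described in this abstract is sketched below. The abstract only states that the program is convex, that the direction lies in the span of the data, and that it has minimum projection on the other points and non-vanishing projection on d. The use of the l1 norm to measure projection, the normalization of the projection on d to 1, and the symbols D (data matrix with points as columns) and c (the direction) are assumptions made here for illustration.

```latex
\min_{c \,\in\, \operatorname{span}(D)} \; \bigl\| c^{\top} D \bigr\|_{1}
\quad \text{subject to} \quad c^{\top} d = 1
```

Under this reading, points with a large value of |c^T x| would be candidates for the neighborhood set of d.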
Innovation Pursuit: A New Approach to Subspace Clustering
In subspace clustering, a group of data points belonging to a union of
subspaces are assigned membership to their respective subspaces. This paper
presents a new approach dubbed Innovation Pursuit (iPursuit) to the problem of
subspace clustering using a new geometrical idea whereby subspaces are
identified based on their relative novelties. We present two frameworks in
which the idea of innovation pursuit is used to distinguish the subspaces.
Underlying the first framework is an iterative method that finds the subspaces
consecutively by solving a series of simple linear optimization problems, each
searching for a direction of innovation in the span of the data potentially
orthogonal to all subspaces except for the one to be identified in one step of
the algorithm. A detailed mathematical analysis is provided establishing
sufficient conditions for iPursuit to correctly cluster the data. The proposed
approach can provably yield exact clustering even when the subspaces have
significant intersections. It is shown that the complexity of the iterative
approach scales only linearly in the number of data points and subspaces, and
quadratically in the dimension of the subspaces. The second framework
integrates iPursuit with spectral clustering to yield a new variant of
spectral-clustering-based algorithms. Numerical simulations with both real
and synthetic data demonstrate that iPursuit can often outperform
state-of-the-art subspace clustering algorithms, more so for subspaces with
significant intersections, and that it significantly improves the
state-of-the-art result for subspace-segmentation-based face clustering.
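The geometric idea of a direction of innovation can be illustrated with a toy example. This is a minimal sketch, not the iPursuit algorithm itself: the direction `c` is chosen by hand instead of being found by the linear optimization problem, and the data, the helper `split_by_direction`, and the tolerance are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy data in R^3.
# S1 = span{e1, e2} and S2 = span{e2, e3}; they intersect along e2.
points = [
    (1.0, 2.0, 0.0),    # in S1
    (-3.0, 1.0, 0.0),   # in S1
    (2.0, -1.0, 0.0),   # in S1
    (0.0, 1.0, 2.0),    # in S2
    (0.0, -2.0, 1.0),   # in S2
    (0.0, 3.0, -1.0),   # in S2
]

# A direction orthogonal to S2 (picked by hand here, an assumption).
c = (1.0, 0.0, 0.0)

def split_by_direction(points, c, tol=1e-9):
    """Indices with non-zero projection on c, and the remaining indices."""
    inside = [i for i, x in enumerate(points) if abs(dot(c, x)) > tol]
    rest = [i for i in range(len(points)) if i not in inside]
    return inside, rest

s1_indices, other_indices = split_by_direction(points, c)
```

Because `c` is orthogonal to S2, every point of S2 projects to zero, while generic points of S1 do not, so thresholding the projections isolates S1 even though the two subspaces intersect. A point of S1 lying exactly in the intersection would also project to zero, which is why the sketch uses generic points.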
LogDet Rank Minimization with Application to Subspace Clustering
A low-rank matrix is desired in many machine learning and computer vision
problems. Most of the recent studies use the nuclear norm as a convex surrogate
of the rank operator. However, all singular values are simply added together by
the nuclear norm, and thus the rank may not be well approximated in practical
problems. In this paper, we propose to use a log-determinant (LogDet) function
as a smooth and closer, though non-convex, approximation to rank for obtaining
a low-rank representation in subspace clustering. An augmented Lagrange
multipliers strategy is applied to iteratively optimize the LogDet-based
non-convex objective function on potentially large-scale data. By making use of
the angular information of principal directions of the resultant low-rank
representation, an affinity graph matrix is constructed for spectral
clustering. Experimental results on motion segmentation and face clustering
data demonstrate that the proposed method often outperforms state-of-the-art
subspace clustering algorithms. Comment: 10 pages, 4 figures
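The contrast between the nuclear norm and a log-determinant surrogate can be seen directly on singular values. The sketch below assumes the surrogate takes the form log det(I + X^T X), which equals the sum of log(1 + s^2) over the singular values s of X; the abstract does not spell out the exact form, and the function names and the two example spectra are hypothetical.

```python
import math

def nuclear_norm(singular_values):
    """Sum of singular values (the convex surrogate of rank)."""
    return sum(singular_values)

def logdet_surrogate(singular_values):
    """log det(I + X^T X) expressed through singular values of X."""
    return sum(math.log(1.0 + s * s) for s in singular_values)

def rank(singular_values, tol=1e-12):
    """Number of singular values above a small tolerance."""
    return sum(1 for s in singular_values if s > tol)

# Two hypothetical spectra with the same rank but very different scales.
small = [1.0, 1.0, 0.0]
large = [100.0, 100.0, 0.0]

nuclear_ratio = nuclear_norm(large) / nuclear_norm(small)
logdet_ratio = logdet_surrogate(large) / logdet_surrogate(small)
```

Both spectra have rank 2, yet the nuclear norm grows linearly with the scale of the singular values, while the log-determinant value grows only logarithmically, so it is less sensitive to large singular values.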
Constrained spectral embedding for K-way data clustering
Spectral clustering methods are increasingly successful in the machine learning community thanks to their ability to cluster data points of arbitrarily complex shapes. Clustering is cast as finding an embedding space in which the projected data are linearly separable by a classical clustering algorithm such as K-means. The performance of spectral algorithms is often significantly improved by incorporating prior knowledge in their design, and several techniques have been developed for this purpose. In this paper, we describe and compare some recent linear and non-linear projection algorithms that integrate instance-level constraints (“must-link” and “cannot-link”) for data clustering. We outline a K-way spectral clustering algorithm able to integrate pairwise relationships between the data samples. We formulate the objective function as a combination of the original spectral clustering criterion and a penalization term based on the instance constraints. The optimization problem is solved as a standard eigensystem of a signed Laplacian matrix. The relevance of the proposed algorithm is highlighted using six UCI benchmarks and two public face databases.
Sparse Subspace Clustering: Algorithm, Theory, and Applications
In many real-world problems, we are dealing with collections of
high-dimensional data, such as images, videos, text and web documents, DNA
microarray data, and more. Often, high-dimensional data lie close to
low-dimensional structures corresponding to several classes or categories the
data belong to. In this paper, we propose and study an algorithm, called
Sparse Subspace Clustering (SSC), to cluster data points that lie in a union of
low-dimensional subspaces. The key idea is that, among infinitely many possible
representations of a data point in terms of other points, a sparse
representation corresponds to selecting a few points from the same subspace.
This motivates solving a sparse optimization program whose solution is used in
a spectral clustering framework to infer the clustering of data into subspaces.
Since solving the sparse optimization program is in general NP-hard, we
consider a convex relaxation and show that, under appropriate conditions on the
arrangement of subspaces and the distribution of data, the proposed
minimization program succeeds in recovering the desired sparse representations.
The proposed algorithm can be solved efficiently and can handle data points
near the intersections of subspaces. Another key advantage of the proposed
algorithm with respect to the state of the art is that it can deal with data
nuisances, such as noise, sparse outlying entries, and missing entries,
directly by incorporating the model of the data into the sparse optimization
program. We demonstrate the effectiveness of the proposed algorithm through
experiments on synthetic data as well as the two real-world problems of motion
segmentation and face clustering.
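The convex relaxation mentioned in this abstract is commonly written as the following l1 program for noise-free data. The symbols are assumptions introduced here: Y is the data matrix with one point per column, and C is the coefficient matrix whose i-th column expresses point i in terms of the other points.

```latex
\min_{C} \; \| C \|_{1}
\quad \text{subject to} \quad Y = Y C, \qquad \operatorname{diag}(C) = 0
```

The constraint diag(C) = 0 rules out the trivial solution in which each point represents itself. A symmetric affinity such as W = |C| + |C|^T can then be passed to spectral clustering.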