1,319 research outputs found
Sparse Subspace Clustering: Algorithm, Theory, and Applications
In many real-world problems, we are dealing with collections of
high-dimensional data, such as images, videos, text and web documents, DNA
microarray data, and more. Often, high-dimensional data lie close to
low-dimensional structures corresponding to several classes or categories the
data belongs to. In this paper, we propose and study an algorithm, called
Sparse Subspace Clustering (SSC), to cluster data points that lie in a union of
low-dimensional subspaces. The key idea is that, among infinitely many possible
representations of a data point in terms of other points, a sparse
representation corresponds to selecting a few points from the same subspace.
This motivates solving a sparse optimization program whose solution is used in
a spectral clustering framework to infer the clustering of data into subspaces.
Since solving the sparse optimization program is in general NP-hard, we
consider a convex relaxation and show that, under appropriate conditions on the
arrangement of subspaces and the distribution of data, the proposed
minimization program succeeds in recovering the desired sparse representations.
The proposed algorithm can be solved efficiently and can handle data points
near the intersections of subspaces. Another key advantage of the proposed
algorithm with respect to the state of the art is that it can deal with data
nuisances, such as noise, sparse outlying entries, and missing entries,
directly by incorporating the model of the data into the sparse optimization
program. We demonstrate the effectiveness of the proposed algorithm through
experiments on synthetic data as well as the two real-world problems of motion
segmentation and face clustering
Robust Recovery of Subspace Structures by Low-Rank Representation
In this work we address the subspace recovery problem. Given a set of data
samples (vectors) approximately drawn from a union of multiple subspaces, our
goal is to segment the samples into their respective subspaces and correct the
possible errors as well. To this end, we propose a novel method termed Low-Rank
Representation (LRR), which seeks the lowest-rank representation among all the
candidates that can represent the data samples as linear combinations of the
bases in a given dictionary. It is shown that LRR well solves the subspace
recovery problem: when the data is clean, we prove that LRR exactly captures
the true subspace structures; for the data contaminated by outliers, we prove
that under certain conditions LRR can exactly recover the row space of the
original data and detect the outlier as well; for the data corrupted by
arbitrary errors, LRR can also approximately recover the row space with
theoretical guarantees. Since the subspace membership is provably determined by
the row space, these further imply that LRR can perform robust subspace
segmentation and error correction, in an efficient way.Comment: IEEE Trans. Pattern Analysis and Machine Intelligenc
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Unsupervised deep learning for optical flow computation has achieved
promising results. Most existing deep-net based methods rely on image
brightness consistency and local smoothness constraint to train the networks.
Their performance degrades at regions where repetitive textures or occlusions
occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical
flow method which incorporates global geometric constraints into network
learning. In particular, we investigate multiple ways of enforcing the epipolar
constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem
encountered in dynamic scenes where multiple motions may be present, we propose
a low-rank constraint as well as a union-of-subspaces constraint for training.
Experimental results on various benchmarking datasets show that our method
achieves competitive performance compared with supervised methods and
outperforms state-of-the-art unsupervised deep-learning methods.Comment: CVPR 201
Block-Sparse Recovery via Convex Optimization
Given a dictionary that consists of multiple blocks and a signal that lives
in the range space of only a few blocks, we study the problem of finding a
block-sparse representation of the signal, i.e., a representation that uses the
minimum number of blocks. Motivated by signal/image processing and computer
vision applications, such as face recognition, we consider the block-sparse
recovery problem in the case where the number of atoms in each block is
arbitrary, possibly much larger than the dimension of the underlying subspace.
To find a block-sparse representation of a signal, we propose two classes of
non-convex optimization programs, which aim to minimize the number of nonzero
coefficient blocks and the number of nonzero reconstructed vectors from the
blocks, respectively. Since both classes of problems are NP-hard, we propose
convex relaxations and derive conditions under which each class of the convex
programs is equivalent to the original non-convex formulation. Our conditions
depend on the notions of mutual and cumulative subspace coherence of a
dictionary, which are natural generalizations of existing notions of mutual and
cumulative coherence. We evaluate the performance of the proposed convex
programs through simulations as well as real experiments on face recognition.
We show that treating the face recognition problem as a block-sparse recovery
problem improves the state-of-the-art results by 10% with only 25% of the
training data.Comment: IEEE Transactions on Signal Processin
CUR Decompositions, Similarity Matrices, and Subspace Clustering
A general framework for solving the subspace clustering problem using the CUR
decomposition is presented. The CUR decomposition provides a natural way to
construct similarity matrices for data that come from a union of unknown
subspaces . The similarity
matrices thus constructed give the exact clustering in the noise-free case.
Additionally, this decomposition gives rise to many distinct similarity
matrices from a given set of data, which allow enough flexibility to perform
accurate clustering of noisy data. We also show that two known methods for
subspace clustering can be derived from the CUR decomposition. An algorithm
based on the theoretical construction of similarity matrices is presented, and
experiments on synthetic and real data are presented to test the method.
Additionally, an adaptation of our CUR based similarity matrices is utilized
to provide a heuristic algorithm for subspace clustering; this algorithm yields
the best overall performance to date for clustering the Hopkins155 motion
segmentation dataset.Comment: Approximately 30 pages. Current version contains improved algorithm
and numerical experiments from the previous versio
Criminal data analysis based on low rank sparse representation
FINDING effective clustering methods for a high dimensional dataset is challenging due to the curse of dimensionality. These challenges can usually make the most of basic common algorithms fail in highdimensional spaces from tackling problems such as large number of groups, and overlapping. Most domains uses some parameters to describe the appearance, geometry and dynamics of a scene. This has motivated the implementation of several techniques of a high-dimensional data for finding a low-dimensional space. Many proposed methods fail to overcome the challenges, especially when the data input is high-dimensional, and the clusters have a complex.
REGULARLY in high dimensional data, lots of the data dimensions are not related and might hide the existing clusters in noisy data. High-dimensional data often reside on some low dimensional subspaces. The problem of subspace clustering algorithms is to uncover the type of relationship of an objects from one dimension that are related in different subsets of another dimensions. The state-of-the-art methods for subspace segmentation which included the Low Rank Representation (LRR) and Sparse Representation (SR). The former
seeks the global lowest-rank representation but restrictively assumes the independence among subspaces, whereas the latter seeks the clustering of disjoint or overlapped subspaces through locality measure, which, however, causes failure in the case of large noise.
THIS thesis aims are to identify the key problems and obstacles that have challenged the researchers in recent years in clustering high dimensional data, then to implement an effective subspace clustering methods for solving high dimensional crimes domains for both real events and synthetic data which has complex data structure with 168 different offence crimes. As well as to overcome the disadvantages of existed subspace algorithms techniques. To this end, a Low-Rank Sparse Representation (LRSR) theory, the future will refer to as Criminal Data Analysis Based on LRSR will be examined, then to be used to recover and segment embedding subspaces. The results of these methods will be discussed and compared with what already have been examined on previous approaches such as K-mean and PCA segmented based on K-means. The previous approaches have helped us to chose the right subspace clustering methods. The Proposed method based on subspace segmentation method named Low Rank subspace Sparse Representation (LRSR) which not only recovers the low-rank subspaces but also gets a relatively sparse segmentation with respect to disjoint subspaces or even overlapping subspaces.
BOTH UCI Machine Learning Repository, and crime database are the best to find and compare the best subspace clustering algorithm that fit for high dimensional space data. We used many Open-Source Machine Learning Frameworks and Tools for both employ our machine learning tasks and methods including preparing, transforming, clustering and visualizing the high-dimensional crime dataset, we precisely have used the most modern and powerful Machine Learning Frameworks data science that known as SciKit-Learn for library for the Python programming language, as well as we have used R, and Matlab in previous experiment
- …