846 research outputs found
Similarity Learning via Kernel Preserving Embedding
Data similarity is a key concept in many data-driven applications. Many
algorithms are sensitive to similarity measures. To tackle this fundamental
problem, automatically learning of similarity information from data via
self-expression has been developed and successfully applied in various models,
such as low-rank representation, sparse subspace learning, semi-supervised
learning. However, it just tries to reconstruct the original data and some
valuable information, e.g., the manifold structure, is largely ignored. In this
paper, we argue that it is beneficial to preserve the overall relations when we
extract similarity information. Specifically, we propose a novel similarity
learning framework by minimizing the reconstruction error of kernel matrices,
rather than the reconstruction error of original data adopted by existing work.
Taking the clustering task as an example to evaluate our method, we observe
considerable improvements compared to other state-of-the-art methods. More
importantly, our proposed framework is very general and provides a novel and
fundamental building block for many other similarity-based tasks. Besides, our
proposed kernel preserving opens up a large number of possibilities to embed
high-dimensional data into low-dimensional space.Comment: Published in AAAI 201
Robust Spectral Clustering via Sparse Representation
Clustering high-dimensional data has been a challenging problem in data mining and machining learning. Spectral clustering via sparse representation has been proposed for clustering high-dimensional data. A critical step in spectral clustering is to effectively construct a weight matrix by assessing the proximity between each pair of objects. While sparse representation proves its effectiveness for compressing high-dimensional signals, existing spectral clustering algorithms based on sparse representation use those sparse coefficients directly. We believe that the similarity measure exploiting more global information from the coefficient vectors will provide more truthful similarity among data objects. The intuition is that the sparse coefficient vectors corresponding to two similar objects are similar and those of two dissimilar objects are also dissimilar. In particular, we propose two approaches of weight matrix construction according to the similarity of the sparse coefficient vectors. Experimental results on several real-world high-dimensional data sets demonstrate that spectral clustering based on the proposed similarity matrices outperforms existing spectral clustering algorithms via sparse representation
- …