Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
Multiple kernel learning (MKL) methods are generally believed to perform better
than single kernel methods. However, some empirical studies show that this is
not always true: combining multiple kernels may yield even worse performance
than using a single kernel. There are two possible reasons
for the failure: (i) most existing MKL methods assume that the optimal kernel
is a linear combination of base kernels, which may not hold true; and (ii) some
kernel weights are inappropriately assigned due to noise and carelessly
designed algorithms. In this paper, we propose a novel MKL framework by
following two intuitive assumptions: (i) each kernel is a perturbation of the
consensus kernel; and (ii) the kernel that is close to the consensus kernel
should be assigned a large weight. Unlike existing methods, the proposed method
automatically assigns an appropriate weight to each kernel without introducing
additional parameters. The proposed framework is
integrated into a unified framework for graph-based clustering and
semi-supervised classification. We have conducted experiments on multiple
benchmark datasets and our empirical results verify the superiority of the
proposed framework.
Comment: Accepted by IJCAI 2018, Code is availabl
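The second assumption in this abstract (a kernel close to the consensus kernel receives a large weight) can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the inverse-distance rule below, the normalization, and the names `frobenius_distance` and `self_weights` are assumptions made for illustration only. The consensus kernel is also treated as given here, whereas a real method would presumably learn it together with the weights.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def self_weights(base_kernels, consensus, eps=1e-12):
    """Hypothetical rule: weight inversely proportional to the distance
    from the consensus kernel, normalized so the weights sum to 1."""
    raw = [1.0 / (frobenius_distance(k, consensus) + eps) for k in base_kernels]
    total = sum(raw)
    return [r / total for r in raw]


# Toy 2x2 kernels: one small and one large perturbation of the consensus.
consensus = [[1.0, 0.5], [0.5, 1.0]]
near = [[1.0, 0.6], [0.6, 1.0]]
far = [[1.0, 0.0], [0.0, 1.0]]

weights = self_weights([near, far], consensus)
print(weights)  # the kernel closer to the consensus gets the larger weight
```

Because the weights are computed from the distances themselves, this rule needs no extra trade-off parameter, which is the property the abstract emphasizes.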
Similarity Learning via Kernel Preserving Embedding
Data similarity is a key concept in many data-driven applications. Many
algorithms are sensitive to similarity measures. To tackle this fundamental
problem, automatic learning of similarity information from data via
self-expression has been developed and successfully applied in various models,
such as low-rank representation, sparse subspace learning, and semi-supervised
learning. However, self-expression only tries to reconstruct the original data,
and some valuable information, e.g., the manifold structure, is largely ignored. In this
paper, we argue that it is beneficial to preserve the overall relations when we
extract similarity information. Specifically, we propose a novel similarity
learning framework by minimizing the reconstruction error of kernel matrices,
rather than the reconstruction error of original data adopted by existing work.
Taking the clustering task as an example to evaluate our method, we observe
considerable improvements compared to other state-of-the-art methods. More
importantly, our proposed framework is very general and provides a novel and
fundamental building block for many other similarity-based tasks. In addition, the
proposed kernel-preserving approach opens up many possibilities for embedding
high-dimensional data into a low-dimensional space.
Comment: Published in AAAI 201
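The contrast between reconstructing the data and reconstructing the kernel matrix can be sketched as follows. The abstract does not state the objective, so the forms used here are assumptions: the data error is taken as ||X - XZ||_F^2 with samples as columns of X, and the kernel error as ||K - Z^T K Z||_F^2. Any regularizer or constraint on Z is omitted, and all function names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def sq_frobenius(a, b):
    """Squared Frobenius norm of a - b."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def data_error(X, Z):
    """Assumed self-expression error: ||X - XZ||_F^2."""
    return sq_frobenius(X, matmul(X, Z))


def kernel_error(K, Z):
    """Assumed kernel-preserving error: ||K - Z^T K Z||_F^2."""
    return sq_frobenius(K, matmul(matmul(transpose(Z), K), Z))


# Two features, three samples stored as columns.
X = [[1.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
K = matmul(transpose(X), X)  # linear kernel, K = X^T X

# Z_swap expresses sample 1 through sample 2 and vice versa; sample 3 is alone.
Z_swap = [[0.0, 2.0, 0.0],
          [0.5, 0.0, 0.0],
          [0.0, 0.0, 1.0]]
Z_zero = [[0.0] * 3 for _ in range(3)]

print(data_error(X, Z_swap), kernel_error(K, Z_swap))
print(data_error(X, Z_zero), kernel_error(K, Z_zero))
```

With a linear kernel, Z^T K Z equals (XZ)^T (XZ), so a Z that reconstructs the data exactly also reconstructs the kernel exactly. The two errors differ once a nonlinear kernel is used, which is where preserving pairwise relations would matter.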
Unified Spectral Clustering with Optimal Graph
Spectral clustering has found extensive use in many areas. Most traditional
spectral clustering algorithms work in three separate steps: similarity graph
construction, continuous label learning, and discretization of the learned labels
by k-means clustering. This common practice has two potential flaws, which may
lead to severe information loss and performance degradation. First, a predefined
similarity graph might not be optimal for subsequent clustering. It is
well accepted that the similarity graph strongly affects the clustering results. To
this end, we propose to automatically learn similarity information from data
and simultaneously consider the constraint that the similarity matrix has exactly
c connected components if there are c clusters. Second, the discrete solution
may deviate from the spectral solution, since k-means is well known to be
sensitive to the initialization of cluster centers. In this work, we transform
the candidate solution into a new one that better approximates the discrete
one. Finally, these three subtasks are integrated into a unified framework,
with each subtask iteratively boosted by using the results of the others
towards an overall optimal solution. It is known that the performance of a
kernel method is largely determined by the choice of kernels. To tackle this
practical problem of how to select the most suitable kernel for a particular
data set, we further extend our model to incorporate multiple kernel learning
ability. Extensive experiments demonstrate the superiority of our proposed
method as compared to existing clustering approaches.
Comment: Accepted by AAAI 201
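The structural constraint mentioned in this abstract, a similarity matrix with exactly c connected components for c clusters, can be checked on a given matrix with a simple graph traversal. The sketch below only verifies the property after the fact and does not show how the constraint is enforced during learning, which the abstract does not describe. The function name `count_components` and the `tol` threshold are hypothetical.

```python
def count_components(S, tol=0.0):
    """Count connected components of the graph whose edges are the
    entries of the similarity matrix S that exceed tol."""
    n = len(S)
    seen = [False] * n
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        seen[start] = True
        stack = [start]
        while stack:  # depth-first traversal from an unvisited node
            i = stack.pop()
            for j in range(n):
                # Treat the graph as undirected: an edge in either direction counts.
                if not seen[j] and (S[i][j] > tol or S[j][i] > tol):
                    seen[j] = True
                    stack.append(j)
    return components


# Block-diagonal similarity: samples {0, 1} and {2, 3} form two groups.
two_blocks = [[0.0, 0.9, 0.0, 0.0],
              [0.9, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.8],
              [0.0, 0.0, 0.8, 0.0]]

# A single weak link between the groups merges them into one component.
bridged = [row[:] for row in two_blocks]
bridged[1][2] = bridged[2][1] = 0.1

print(count_components(two_blocks), count_components(bridged))
```

When the learned graph has exactly c components, the cluster assignment can be read directly from the components, which would avoid the sensitivity of a separate k-means step to initialization.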