Efficient Clustering on Riemannian Manifolds: A Kernelised Random Projection Approach

Authors: Alavi Azadeh; Lovell Brian C.; Wiliem Arnold; Zhao Kun
Publication date: 18/09/2015

Reformulating computer vision problems over Riemannian manifolds has demonstrated superior performance in various computer vision applications. This is because visual data often forms a special structure lying on a lower dimensional space embedded in a higher dimensional space. However, since these manifolds belong to non-Euclidean topological spaces, exploiting their structures is computationally expensive, especially when one considers the clustering analysis of massive amounts of data. To this end, we propose an efficient framework to address the clustering problem on Riemannian manifolds. This framework implements random projections for manifold points via kernel space, which can preserve the geometric structure of the original space, but is computationally efficient. Here, we introduce three methods that follow our framework. We then validate our framework on several computer vision applications by comparing against popular clustering methods on Riemannian manifolds. Experimental results demonstrate that our framework maintains the performance of the clustering whilst massively reducing computational complexity by over two orders of magnitude in some cases.
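
The abstract does not spell out the three methods, so the following is only a minimal sketch of the general idea of a random projection applied through kernel space. It assumes the manifold points are symmetric positive definite (SPD) matrices, uses a log-Euclidean Gaussian kernel as one commonly used manifold kernel, and obtains kernel-space coordinates with kernel PCA before applying a Gaussian random projection. The function names `spd_log`, `log_euclidean_kernel` and `kernel_random_projection`, and the parameters `gamma`, `out_dim` and `seed`, are hypothetical and not taken from the paper.

```python
import numpy as np

def spd_log(m):
    # Matrix logarithm of an SPD matrix via its eigendecomposition.
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.log(vals)) @ vecs.T

def log_euclidean_kernel(mats, gamma=0.5):
    # Assumed kernel: exp(-gamma * ||log(X) - log(Y)||_F^2).
    logs = np.stack([spd_log(m).ravel() for m in mats])
    sq = ((logs[:, None, :] - logs[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_random_projection(K, out_dim, seed=0):
    # Centre the kernel matrix, as in kernel PCA.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # Kernel PCA coordinates: one row per data point.
    vals, vecs = np.linalg.eigh(Kc)
    keep = vals > 1e-10
    Y = vecs[:, keep] * np.sqrt(vals[keep])
    # Gaussian random projection down to out_dim dimensions.
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((Y.shape[1], out_dim)) / np.sqrt(out_dim)
    return Y @ R
```

The projected rows are ordinary Euclidean vectors, so any standard clustering routine such as k-means could be run on them afterwards.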

arXiv.org e-Print Archive

University of Queensland eSpace

Kernel Truncated Regression Representation for Robust Subspace Clustering

Authors: Peng Dezhong; Wang Wei; Yao Xin; Zhen Liangli
Publication date: 27/03/2020

Subspace clustering aims to group data points into multiple clusters of which each corresponds to one subspace. Most existing subspace clustering approaches assume that input data lie on linear subspaces. In practice, however, this assumption usually does not hold. To achieve nonlinear subspace clustering, we propose a novel method, called kernel truncated regression representation. Our method consists of the following four steps: 1) projecting the input data into a hidden space, where each data point can be linearly represented by other data points; 2) calculating the linear representation coefficients of the data representations in the hidden space; 3) truncating the trivial coefficients to achieve robustness and block-diagonality; and 4) executing the graph cutting operation on the coefficient matrix by solving a graph Laplacian problem. Our method has the advantages of a closed-form solution and the capacity of clustering data points that lie on nonlinear subspaces. The first advantage makes our method efficient in handling large-scale datasets, and the second one enables the proposed method to conquer the nonlinear subspace clustering challenge. Extensive experiments on six benchmarks demonstrate the effectiveness and the efficiency of the proposed method in comparison with current state-of-the-art approaches. Comment: 14 pages
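
The four steps listed in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' exact formulation: it uses an RBF kernel, plain kernel ridge regression `C = (K + lam*I)^{-1} K` with the diagonal zeroed afterwards as the closed-form coefficient step, keeps the `keep` largest coefficients per column as the truncation, and clusters the rows of the smallest Laplacian eigenvectors with a small k-means. All function and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kmeans_rows(U, k, n_iter=20):
    # Farthest-first initialisation, then Lloyd iterations on the rows of U.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(0)
    return labels

def truncated_kernel_regression_clustering(X, n_clusters, gamma=0.5, lam=0.1, keep=3):
    n = X.shape[0]
    # Step 1: the kernel matrix stands in for the mapping to the hidden space.
    K = rbf_kernel(X, gamma)
    # Step 2: closed-form ridge coefficients (an assumed simplification).
    C = np.linalg.solve(K + lam * np.eye(n), K)
    np.fill_diagonal(C, 0.0)  # a point should not represent itself
    # Step 3: keep only the `keep` largest-magnitude coefficients per column.
    A = np.abs(C)
    for j in range(n):
        small = np.argsort(A[:, j])[:-keep]
        A[small, j] = 0.0
    W = A + A.T  # symmetric affinity matrix
    # Step 4: unnormalised graph Laplacian and its smallest eigenvectors.
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)
    return kmeans_rows(vecs[:, :n_clusters], n_clusters)
```

Because step 2 is a single linear solve, no iterative optimisation is needed, which is the kind of efficiency the abstract attributes to a closed-form solution.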

arXiv.org e-Print Archive

University of Birmingham Research Portal

Embed and Conquer: Scalable Embeddings for Kernel k-Means on MapReduce

Authors: Elgohary Ahmed; Farahat Ahmed K.; Kamel Mohamed S.; Karray Fakhri
Publication date: 29/01/2014

The kernel k-means is an effective method for data clustering which extends the commonly-used k-means algorithm to work on a similarity matrix over complex data structures. The kernel k-means algorithm is however computationally very complex as it requires the complete data matrix to be calculated and stored. Further, the kernelized nature of the kernel k-means algorithm hinders the parallelization of its computations on modern infrastructures for distributed computing. In this paper, we are defining a family of kernel-based low-dimensional embeddings that allows for scaling kernel k-means on MapReduce via an efficient and unified parallelization strategy. Afterwards, we propose two methods for low-dimensional embedding that adhere to our definition of the embedding family. Exploiting the proposed parallelization strategy, we present two scalable MapReduce algorithms for kernel k-means. We demonstrate the effectiveness and efficiency of the proposed algorithms through an empirical evaluation on benchmark data sets. Comment: Appears in Proceedings of the SIAM International Conference on Data Mining (SDM), 201
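
To illustrate why an explicit low-dimensional embedding makes kernel k-means easier to parallelise, here is a minimal single-machine sketch. The abstract does not specify the two proposed embeddings, so this example substitutes random Fourier features, a well-known explicit approximation of the RBF kernel, as a stand-in. Each point is embedded independently of the others (the part that would correspond to a map step), and ordinary k-means is then run on the embedded vectors. All names and parameters (`make_rff`, `embedded_kernel_kmeans`, `gamma`, `dim_out`) are hypothetical.

```python
import math
import random

def make_rff(dim_in, dim_out, gamma, seed=0):
    # Random Fourier features for k(x, y) = exp(-gamma * ||x - y||^2):
    # frequencies are drawn from N(0, 2 * gamma), phases from U[0, 2*pi].
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)
    W = [[rng.gauss(0.0, scale) for _ in range(dim_in)] for _ in range(dim_out)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim_out)]
    norm = math.sqrt(2.0 / dim_out)

    def embed(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_j)
                for w, b_j in zip(W, b)]
    return embed

def sq_dist(u, v):
    return sum((a - c) ** 2 for a, c in zip(u, v))

def kmeans(points, k, n_iter=20):
    # Farthest-first initialisation followed by Lloyd iterations.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def embedded_kernel_kmeans(X, k, gamma=0.5, dim_out=64, seed=0):
    embed = make_rff(len(X[0]), dim_out, gamma, seed)
    Z = [embed(x) for x in X]  # each point is embedded independently
    return kmeans(Z, k)
```

The full n-by-n similarity matrix is never formed here: only the shared random parameters and one short vector per point are needed, which is what makes this style of computation easy to distribute.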

arXiv.org e-Print Archive

CiteSeerX