
141 research outputs found

A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

Authors: Pang Qiyuan, Yang Haizhao
Publication date: 05/01/2024

We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on the prior knowledge of the eigenvalue spectrum, which could be expensive to estimate. This issue can be lessened by the analytic spectrum estimation of the Laplacian or normalized Laplacian matrices in spectral clustering, making the proposed algorithm very efficient for spectral clustering. Second, to make the proposed algorithm capable of analyzing big data, a distributed and parallel version has been developed with attractive scalability. The speedup by parallel computing is approximately equivalent to $\sqrt{p}$, where $p$ denotes the number of processes. Numerical results will be provided to demonstrate its efficiency in spectral clustering and scalability advantage over existing eigensolvers used for spectral clustering in parallel computing environments.
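To illustrate why prior knowledge of the spectrum matters, the sketch below evaluates the scalar gain of a Chebyshev polynomial filter. It is not taken from the paper. It assumes the symmetric normalized Laplacian, whose eigenvalues are known to lie in [0, 2], and a hypothetical cutoff of 0.5 separating wanted eigenvalues from unwanted ones. The function names `chebyshev` and `filter_gain`, the cutoff, and the degree are all illustrative choices.

```python
def chebyshev(m, t):
    """Evaluate the degree-m Chebyshev polynomial T_m(t) by its three-term recurrence."""
    if m == 0:
        return 1.0
    prev, cur = 1.0, t  # T_0(t) and T_1(t)
    for _ in range(m - 1):
        # T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
        prev, cur = cur, 2.0 * t * cur - prev
    return cur


def filter_gain(lam, a, b, m):
    """Gain a degree-m Chebyshev filter applies to eigenvalue lam.

    [a, b] is the interval to damp. It is mapped affinely onto [-1, 1],
    where |T_m| <= 1; eigenvalues outside [a, b] are amplified instead.
    """
    center = (a + b) / 2.0
    half_width = (b - a) / 2.0
    return chebyshev(m, (lam - center) / half_width)


if __name__ == "__main__":
    # Assumed setup: damp [0.5, 2.0], keep eigenvalues below 0.5.
    for lam in (0.0, 0.1, 0.4, 0.5, 1.0, 1.9, 2.0):
        print(f"lambda = {lam:.1f}  gain = {filter_gain(lam, 0.5, 2.0, 10):.3e}")
```

Under these assumptions, eigenvalues inside the damped interval keep a gain of magnitude at most 1, while eigenvalues below the cutoff are amplified by several orders of magnitude. Because the upper bound 2 is known analytically for the normalized Laplacian, no separate bound-estimation step is needed to set up the interval.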

arXiv.org e-Print Archive

K-tree: Large Scale Document Clustering

Authors: De Vries Christopher M., Geva Shlomo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk-based implementations where space requirements exceed that of main memory. Comment: 2 pages, SIGIR 200
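The abstract does not describe the K-tree's internals, so the sketch below shows only plain k-means (Lloyd's algorithm), the flat baseline that K-tree is said to approximate. It is not the K-tree itself and builds no hierarchy. The function names, the fixed iteration count, and the toy data are illustrative assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations from the given starting centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [10, 10], [10, 11]]
    print(kmeans(data, centroids=[[0, 0], [10, 10]]))
```

Each iteration compares every point with every centroid, which is the cost a tree-structured approximation aims to reduce for large document collections.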

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive