Many data types arising from data mining applications can be modeled as
bipartite graphs, examples include terms and documents in a text corpus,
customers and purchasing items in market basket analysis and reviewers and
movies in a movie recommender system. In this paper, we propose a new data
clustering method based on partitioning the underlying bipartite graph. The
partition is constructed by minimizing a normalized sum of edge weights between
unmatched pairs of vertices of the bipartite graph. We show that an approximate
solution to the minimization problem can be obtained by computing a partial
singular value decomposition (SVD) of the associated edge weight matrix of the
bipartite graph. We point out the connection of our clustering algorithm to
correspondence analysis used in multivariate analysis. We also briefly discuss
the issue of assigning data objects to multiple clusters. In the experimental
results, we apply our clustering algorithm to the problem of document
clustering to illustrate its effectiveness and efficiency.Comment: Proceedings of ACM CIKM 2001, the Tenth International Conference on
Information and Knowledge Management, 200