Bipartite graph partitioning and data clustering
Many data types arising from data mining applications can be modeled as
bipartite graphs; examples include terms and documents in a text corpus,
customers and purchased items in market basket analysis, and reviewers and
movies in a movie recommender system. In this paper, we propose a new data
clustering method based on partitioning the underlying bipartite graph. The
partition is constructed by minimizing a normalized sum of edge weights between
unmatched pairs of vertices of the bipartite graph. We show that an approximate
solution to the minimization problem can be obtained by computing a partial
singular value decomposition (SVD) of the associated edge weight matrix of the
bipartite graph. We point out the connection of our clustering algorithm to
correspondence analysis used in multivariate analysis. We also briefly discuss
the issue of assigning data objects to multiple clusters. In the experimental
results, we apply our clustering algorithm to the problem of document
clustering to illustrate its effectiveness and efficiency.
Comment: Proceedings of ACM CIKM 2001, the Tenth International Conference on
Information and Knowledge Management, 200
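The SVD-based partitioning described in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the normalization, so the degree scaling below, the sign-based split, and the function name `bipartition` are assumptions chosen for illustration, not the authors' exact algorithm. The sketch also assumes every row and column has nonzero total weight.

```python
import numpy as np

def bipartition(A):
    """Split the rows and columns of a nonnegative weight matrix into two groups.

    A[i, j] is the edge weight between row vertex i and column vertex j.
    Assumes every row and every column has a nonzero sum.
    """
    A = np.asarray(A, dtype=float)
    d_rows = A.sum(axis=1)  # weighted degree of each row vertex
    d_cols = A.sum(axis=0)  # weighted degree of each column vertex

    # Assumed scaling: An = D_rows^{-1/2} A D_cols^{-1/2}
    An = A / np.sqrt(np.outer(d_rows, d_cols))
    U, s, Vt = np.linalg.svd(An, full_matrices=False)

    # The leading singular pair is trivial (entries proportional to the
    # square roots of the degrees), so the second pair is used instead.
    row_score = U[:, 1] / np.sqrt(d_rows)
    col_score = Vt[1, :] / np.sqrt(d_cols)

    # Paired singular vectors share a sign convention, so rows and
    # columns with the same boolean label belong to the same cluster.
    return row_score > 0, col_score > 0

# Hypothetical document-term weights: two blocks joined by one weak link.
A = [[3, 2, 0, 0],
     [2, 3, 1, 0],
     [0, 0, 3, 2],
     [0, 0, 2, 3]]
rows, cols = bipartition(A)
```

Only the second singular pair is needed here, which is why a partial SVD suffices for large sparse matrices; the full `np.linalg.svd` call is used only to keep the sketch short.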
Local Measurement and Reconstruction for Noisy Graph Signals
The emerging field of signal processing on graphs plays an increasingly
important role in processing signals and information related to networks.
Existing works have shown that under certain conditions a smooth graph signal
can be uniquely reconstructed from its decimation, i.e., data associated with a
subset of vertices. However, in some potential applications (e.g., sensor
networks with clustering structure), the obtained data may be a combination of
signals associated with several vertices, rather than the decimation. In this
paper, we propose a new concept of local measurement, which is a generalization
of decimation. Using the local measurements, a local-set-based method named
iterative local measurement reconstruction (ILMR) is proposed to reconstruct
bandlimited graph signals. It is proved that ILMR can reconstruct the original
signal perfectly under certain conditions. The performance of ILMR against
noise is theoretically analyzed. The optimal choice of local weights and a
greedy algorithm of local set partition are given in the sense of minimizing
the expected reconstruction error. Compared with decimation, the proposed local
measurement sampling and reconstruction scheme is more robust in noisy
scenarios.
Comment: 24 pages, 6 figures, 2 tables, journal manuscript
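The idea of a local measurement, as opposed to decimation, can be illustrated with a small sketch. This is not ILMR: the abstract gives no details of the iteration, so the example below recovers a bandlimited signal by direct least squares instead. The path graph, the uniform local weights, and all function names are assumptions made for illustration.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian of an unweighted path graph on n vertices."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def local_measurement_matrix(local_sets, n):
    """Each row averages the signal over one local set (uniform weights assumed)."""
    M = np.zeros((len(local_sets), n))
    for row, members in enumerate(local_sets):
        M[row, members] = 1.0 / len(members)
    return M

def reconstruct(y, M, Vk):
    """Least-squares fit of the bandlimited coefficients to the measurements y."""
    coeffs, *_ = np.linalg.lstsq(M @ Vk, y, rcond=None)
    return Vk @ coeffs

n, k = 8, 3
_, V = np.linalg.eigh(path_laplacian(n))   # eigenvalues in ascending order
Vk = V[:, :k]                              # k lowest-frequency eigenvectors
x = Vk @ np.array([1.0, -0.5, 0.25])       # a bandlimited test signal

local_sets = [[0, 1], [2, 3], [4, 5], [6, 7]]  # a partition of the vertices
M = local_measurement_matrix(local_sets, n)
x_hat = reconstruct(M @ x, M, Vk)          # noise-free measurements
```

With four local measurements and a bandwidth of three, the system is overdetermined, so the noise-free signal is recovered up to numerical precision. Decimation corresponds to the special case in which each local weight vector has a single nonzero entry.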
Notes on Elementary Spectral Graph Theory. Applications to Graph Clustering Using Normalized Cuts
These are notes on the method of normalized graph cuts and its applications to graph clustering. I provide a fairly thorough treatment of this deeply original method due to Shi and Malik, including complete proofs. I include the necessary background on graphs and graph Laplacians. I then explain in detail how the eigenvectors of the graph Laplacian can be used to draw a graph. This is an attractive application of graph Laplacians. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. Three points that do not appear to have been clearly articulated before are elaborated: 1. The solutions of the main optimization problem should be viewed as tuples in the K-fold Cartesian product of projective space RP^{N-1}. 2. When K > 2, the solutions of the relaxed problem should be viewed as elements of the Grassmannian G(K,N)
- …
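For the K = 2 case mentioned in the abstract above, the standard relaxation of the normalized cut can be sketched as follows. This is a minimal illustration that assumes a symmetric weight matrix with no isolated vertices and rounds the relaxed solution by sign; the function names and the example graph are hypothetical, and the notes themselves may discretize differently.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # weighted vertex degrees
    L = np.diag(d) - W                     # combinatorial Laplacian
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    # Map back to a solution of the generalized problem L z = lambda D z.
    z = d_inv_sqrt * vecs[:, 1]
    return z > 0                           # assumed rounding: split by sign

def ncut_value(W, labels):
    """Normalized cut cost: cut/vol(A) + cut/vol(B)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    cut = W[labels][:, ~labels].sum()
    return cut / d[labels].sum() + cut / d[~labels].sum()

# Hypothetical graph: two triangles joined by one weak edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 0.1)]
W = np.zeros((6, 6))
for i, j, w in edges:
    W[i, j] = W[j, i] = w

labels = normalized_cut_bipartition(W)
cost = ncut_value(W, labels)
```

The sign of an eigenvector is arbitrary, so only the grouping matters, not which side is labeled `True`.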