    Bipartite graph partitioning and data clustering

    Many data types arising from data mining applications can be modeled as bipartite graphs; examples include terms and documents in a text corpus, customers and purchased items in market basket analysis, and reviewers and movies in a movie recommender system. In this paper, we propose a new data clustering method based on partitioning the underlying bipartite graph. The partition is constructed by minimizing a normalized sum of edge weights between unmatched pairs of vertices of the bipartite graph. We show that an approximate solution to the minimization problem can be obtained by computing a partial singular value decomposition (SVD) of the associated edge weight matrix of the bipartite graph. We point out the connection of our clustering algorithm to correspondence analysis used in multivariate analysis. We also briefly discuss the issue of assigning data objects to multiple clusters. In the experimental results, we apply our clustering algorithm to the problem of document clustering to illustrate its effectiveness and efficiency. Comment: Proceedings of ACM CIKM 2001, the Tenth International Conference on Information and Knowledge Management, 200
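
The SVD-based partitioning described in this abstract can be illustrated with a minimal sketch. It assumes a dense, nonnegative edge-weight matrix in which every vertex has at least one edge. The function name `bipartite_bipartition` is hypothetical, a full SVD stands in for the partial SVD mentioned above, and splitting by the sign of the scaled second singular vectors is a simplification rather than the paper's own assignment rule.

```python
import numpy as np

def bipartite_bipartition(W):
    """Split the rows and columns of an edge-weight matrix into two co-clusters.

    W[i, j] is the weight of the edge between row vertex i and column vertex j.
    Assumes every row and every column has a positive weight sum.
    """
    W = np.asarray(W, dtype=float)
    d_rows = W.sum(axis=1)  # weighted degrees of the row vertices
    d_cols = W.sum(axis=0)  # weighted degrees of the column vertices

    # Degree-normalized weight matrix D_rows^{-1/2} W D_cols^{-1/2}.
    Wn = W / np.sqrt(d_rows)[:, None] / np.sqrt(d_cols)[None, :]

    # Full SVD for brevity; only the two leading singular triplets are needed,
    # so a partial SVD would suffice on large sparse data.
    U, _, Vt = np.linalg.svd(Wn, full_matrices=False)

    # The leading pair is trivial (proportional to the square-rooted degrees),
    # so the second pair carries the partition information.
    x = U[:, 1] / np.sqrt(d_rows)
    y = Vt[1, :] / np.sqrt(d_cols)

    # Simplifying assumption: threshold at zero instead of searching cut points.
    return (x > 0).astype(int), (y > 0).astype(int)
```

Calling `bipartite_bipartition` on a small document-term matrix returns one label vector for the rows and one for the columns; matching labels indicate a document group and its associated term group. Which group receives label 0 or 1 is arbitrary, because singular vectors are determined only up to sign.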

    Local Measurement and Reconstruction for Noisy Graph Signals

    The emerging field of signal processing on graphs plays an increasingly important role in processing signals and information related to networks. Existing works have shown that under certain conditions a smooth graph signal can be uniquely reconstructed from its decimation, i.e., data associated with a subset of vertices. However, in some potential applications (e.g., sensor networks with clustering structure), the obtained data may be a combination of signals associated with several vertices, rather than the decimation. In this paper, we propose a new concept of local measurement, which is a generalization of decimation. Using the local measurements, a local-set-based method named iterative local measurement reconstruction (ILMR) is proposed to reconstruct bandlimited graph signals. It is proved that ILMR can reconstruct the original signal perfectly under certain conditions. The performance of ILMR against noise is theoretically analyzed. The optimal choice of local weights and a greedy algorithm for local set partition are given in the sense of minimizing the expected reconstruction error. Compared with decimation, the proposed local measurement sampling and reconstruction scheme is more robust in noisy scenarios. Comment: 24 pages, 6 figures, 2 tables, journal manuscrip
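
The baseline this abstract starts from, reconstructing a bandlimited graph signal from its decimation, can be sketched as follows. This is not ILMR, whose details the abstract does not give; it is a plain least-squares reconstruction under the assumption that the signal lies in the span of the first `bandwidth` Laplacian eigenvectors. The function names and the path-graph example are hypothetical choices made for illustration.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph on n vertices."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def reconstruct_bandlimited(L, sampled_vertices, samples, bandwidth):
    """Recover a bandlimited signal from its values on a subset of vertices.

    Assumes the signal is a combination of the `bandwidth` Laplacian
    eigenvectors with the smallest eigenvalues, and that the rows of that
    eigenvector basis at the sampled vertices have full column rank.
    """
    _, U = np.linalg.eigh(L)   # eigenvectors, eigenvalues in ascending order
    U_k = U[:, :bandwidth]     # low-frequency basis
    # Fit the spectral coefficients to the observed samples by least squares.
    coeffs, *_ = np.linalg.lstsq(U_k[sampled_vertices, :], samples, rcond=None)
    return U_k @ coeffs
```

With noiseless samples and a full-rank sampled basis, the fit is exact, which matches the unique-reconstruction condition mentioned in the abstract. With noisy samples the same call returns a least-squares estimate instead.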

    Notes on Elementary Spectral Graph Theory Applications to Graph Clustering Using Normalized Cuts

    These are notes on the method of normalized graph cuts and its applications to graph clustering. I provide a fairly thorough treatment of this deeply original method due to Shi and Malik, including complete proofs. I include the necessary background on graphs and graph Laplacians. I then explain in detail how the eigenvectors of the graph Laplacian can be used to draw a graph. This is an attractive application of graph Laplacians. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. Three points that do not appear to have been clearly articulated before are elaborated: 1. The solutions of the main optimization problem should be viewed as tuples in the K-fold Cartesian product of projective space RP^{N-1}. 2. When K > 2, the solutions of the relaxed problem should be viewed as elements of the Grassmannian G(K,N)
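
For the K = 2 case, the relaxed normalized-cut problem can be sketched in a few lines. This assumes a symmetric, nonnegative weight matrix for a connected graph with positive degrees. The function name `normalized_cut_bipartition` is hypothetical, and thresholding the eigenvector at zero is a simplification; it is not taken from the notes, which are only summarized here.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way spectral partition from the relaxed normalized-cut problem.

    W is a symmetric nonnegative weight matrix; every vertex is assumed
    to have positive weighted degree.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)

    # Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 corresponds to
    # the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(L_sym)

    # Map back to a solution of the generalized problem (D - W) y = lambda D y.
    y = d_inv_sqrt * eigvecs[:, 1]

    # Simplifying assumption: split at zero rather than searching thresholds.
    return (y > 0).astype(int)
```

On a graph made of two tightly connected groups joined by a weak edge, the returned labels separate the two groups. Which group is labeled 0 or 1 depends on the arbitrary sign of the eigenvector.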