Search CORE

4,330 research outputs found

Euclidean Distances, soft and spectral Clustering on Weighted Graphs

Author: A. Beurling
A. Ng
C.K.I. Williams
F. Bavaud
F. Bavaud
F. Bavaud
F. Bavaud
F. Critchley
F. Fouss
G. Dunn
G. Young
I.J. Schoenberg
I.J. Schoenberg
J. Berger
J. Shi
J.B. Tenenbaum
J.G. Kemeny
K. Rose
K. Rose
K.V. Mardia
M. Belkin
M. Deza
M. Filippone
M. Hein
M. Kijima
M.J. Greenacre
R. Nock
S. Lafon
T.M. Cover
U. Luxburg von
W.S. Torgerson
Z. Huang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

We define a class of Euclidean distances on weighted graphs, enabling to perform thermodynamic soft graph clustering. The class can be constructed form the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.Comment: accepted for presentation (and further publication) at the ECML PKDD 2010 conferenc

arXiv.org e-Print Archive

Crossref

Serveur académique lausannois

Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

Author: Korenblum Daniel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018
Field of study

Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed.Comment: 13 figures, 35 reference

arXiv.org e-Print Archive

Directory of Open Access Journals

A survey of kernel and spectral methods for clustering

Author: Aizerman
Aronszajn
Belkin
Bengio
Bezdek
Bishop
Burges
Camastra
Chan
Chen
Chiang
Cortes
Cristianini
Cristianini
Dhillon
Dhillon
Donath
Duda
Fiedler
Fisher
Francesco Camastra
Francesco Masulli
Gersho
Girolami
Golub
Have
Horn
Huber
Hur
Jain
Kernighan
Kluger
Kohonen
Kohonen
Krishnapuram
Krishnapuram
Kulis
Lee
Leski
Linde
Lloyd
Martinetz
Maurizio Filippone
Mercer
Müller
Ng
Ritter
Rose
Roth
Roweis
Saitoh
Schölkopf
Schölkopf
Shi
Sigillito
Sneath
Stefano Rovetta
Tax
Vapnik
von Luxburg
Ward
Weston
Wolberg
Xu
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel version of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arise from concepts in spectral graph theory and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of kernel K-means clustering algorithm. (C) 2007 Pattem Recognition Society. Published by Elsevier Ltd. All rights reserved

CiteSeerX

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Crossref

Enlighten

Archivio istituzionale della ricerca - Università di Genova

White Rose Research Online

Developments in the theory of randomized shortest paths with a comparison of graph node distances

Author: Kivimäki Ilkka
Saerens Marco
Shimbo Masashi
Publication venue: 'Elsevier BV'
Publication date: 03/10/2013
Field of study

There have lately been several suggestions for parametrized distances on a graph that generalize the shortest path distance and the commute time or resistance distance. The need for developing such distances has risen from the observation that the above-mentioned common distances in many situations fail to take into account the global structure of the graph. In this article, we develop the theory of one family of graph node distances, known as the randomized shortest path dissimilarity, which has its foundation in statistical physics. We show that the randomized shortest path dissimilarity can be easily computed in closed form for all pairs of nodes of a graph. Moreover, we come up with a new definition of a distance measure that we call the free energy distance. The free energy distance can be seen as an upgrade of the randomized shortest path dissimilarity as it defines a metric, in addition to which it satisfies the graph-geodetic property. The derivation and computation of the free energy distance are also straightforward. We then make a comparison between a set of generalized distances that interpolate between the shortest path distance and the commute time, or resistance distance. This comparison focuses on the applicability of the distances in graph node clustering and classification. The comparison, in general, shows that the parametrized distances perform well in the tasks. In particular, we see that the results obtained with the free energy distance are among the best in all the experiments.Comment: 30 pages, 4 figures, 3 table

arXiv.org e-Print Archive

Crossref

DIAL UCLouvain

Computing communities in large networks using random walks

Author: Latapy Matthieu
Pons Pascal
Publication venue
Publication date: 01/01/2004
Field of study

Dense subgraphs of sparse graphs (communities), which appear in most real-world complex networks, play an important role in many contexts. Computing them however is generally expensive. We propose here a measure of similarities between vertices based on random walks which has several important advantages: it captures well the community structure in a network, it can be computed efficiently, it works at various scales, and it can be used in an agglomerative algorithm to compute efficiently the community structure of a network. We propose such an algorithm which runs in time O(mn^2) and space O(n^2) in the worst case, and in time O(n^2log n) and space O(n^2) in most real-world cases (n and m are respectively the number of vertices and edges in the input graph). Experimental evaluation shows that our algorithm surpasses previously proposed ones concerning the quality of the obtained community structures and that it stands among the best ones concerning the running time. This is very promising because our algorithm can be improved in several ways, which we sketch at the end of the paper.Comment: 15 pages, 4 figure

arXiv.org e-Print Archive

CiteSeerX