Euclidean Distances, soft and spectral Clustering on Weighted Graphs

Author: A. Beurling
A. Ng
C.K.I. Williams
F. Bavaud
F. Critchley
F. Fouss
G. Dunn
G. Young
I.J. Schoenberg
J. Berger
J. Shi
J.B. Tenenbaum
J.G. Kemeny
K. Rose
K.V. Mardia
M. Belkin
M. Deza
M. Filippone
M. Hein
M. Kijima
M.J. Greenacre
R. Nock
S. Lafon
T.M. Cover
U. von Luxburg
W.S. Torgerson
Z. Huang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

We define a class of Euclidean distances on weighted graphs that enables thermodynamic soft graph clustering. The class can be constructed from the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects. Comment: accepted for presentation (and further publication) at the ECML PKDD 2010 conference.
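A minimal sketch of the Schoenberg-transformation idea mentioned in the abstract, not the paper's own procedure. It assumes the transformation phi(D) = 1 - exp(-a * D), one commonly cited member of that family, and uses uniform node weights rather than the weighted setting of the paper. The check is the classical multidimensional scaling criterion: a matrix of squared dissimilarities is squared Euclidean when its doubly centered form is positive semidefinite. All function names, the parameter `a`, and the random test points are illustrative assumptions.

```python
import numpy as np

def squared_euclidean(points):
    """Pairwise squared Euclidean distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def schoenberg_exp(sq_dist, a=1.0):
    """Assumed example transformation: phi(D) = 1 - exp(-a * D)."""
    return 1.0 - np.exp(-a * sq_dist)

def is_squared_euclidean(sq_dist, tol=1e-9):
    """Classical MDS test: B = -1/2 * H D H must be positive semidefinite."""
    n = sq_dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n  # uniform weights (assumption)
    gram = -0.5 * centering @ sq_dist @ centering
    return bool(np.linalg.eigvalsh(gram).min() >= -tol)

rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))          # hypothetical node coordinates
D = squared_euclidean(points)
D_transformed = schoenberg_exp(D, a=0.5)
print(is_squared_euclidean(D), is_squared_euclidean(D_transformed))
```

The transformed distances generally need more dimensions than the original ones, which is one way to read the "higher-dimensional embeddings" mentioned in the abstract.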

arXiv.org e-Print Archive

Crossref

Serveur académique lausannois

Latent Random Steps as Relaxations of Max-Cut, Min-Cut, and More

Author: Chanpuriya Sudhanshu
Musco Cameron
Publication date: 11/08/2023

Algorithms for node clustering typically focus on finding homophilous structure in graphs. That is, they find sets of similar nodes with many edges within, rather than across, the clusters. However, graphs often also exhibit heterophilous structure, as exemplified by (nearly) bipartite and tripartite graphs, where most edges occur across the clusters. Grappling with such structure is typically left to the task of graph simplification. We present a probabilistic model based on non-negative matrix factorization which unifies clustering and simplification, and provides a framework for modeling arbitrary graph structure. Our model is based on factorizing the process of taking a random walk on the graph. It permits an unconstrained parametrization, allowing for optimization via simple gradient descent. By relaxing the hard clustering to a soft clustering, our algorithm relaxes potentially hard clustering problems to tractable ones. We illustrate our algorithm's capabilities on a synthetic graph, as well as on simple unsupervised learning tasks involving bipartite and tripartite clustering of orthographic and phonological data.
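To make the homophilous/heterophilous distinction in the abstract concrete, the sketch below computes the fraction of edges that stay inside a cluster. It is only an illustration of the terminology, not the paper's factorization model; the function name and the two toy graphs are assumptions.

```python
def within_cluster_fraction(edges, labels):
    """Fraction of edges whose two endpoints share a cluster label."""
    if not edges:
        raise ValueError("graph has no edges")
    within = sum(1 for u, v in edges if labels[u] == labels[v])
    return within / len(edges)

# Two triangles joined by one bridge edge: mostly within-cluster edges.
homophilous_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
homophilous_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# Complete bipartite graph K_{2,2}: every edge crosses the two clusters.
bipartite_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
bipartite_labels = {0: "a", 1: "a", 2: "b", 3: "b"}

print(within_cluster_fraction(homophilous_edges, homophilous_labels))
print(within_cluster_fraction(bipartite_edges, bipartite_labels))
```

A value near 1 indicates homophilous structure under the given labeling, and a value near 0 indicates heterophilous structure.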

arXiv.org e-Print Archive

Graph ambiguity

Author: LIVI LORENZO
RIZZI Antonello
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

In this paper, we propose a rigorous way to define the concept of ambiguity in the domain of graphs. In past studies, the classical definition of ambiguity has been derived from fuzzy set theory and fuzzy information theory. Our aim is to show that in the domain of graphs, too, it is possible to derive a formulation able to capture the same semantic and mathematical concept. To strengthen the theoretical results, we discuss the application of the graph ambiguity concept to the graph classification setting, conceiving a new kind of inexact graph matching procedure. The results indicate that the graph ambiguity concept is a characterizing and discriminative property of graphs.

Archivio della ricerca- Università di Roma La Sapienza

The most persistent soft-clique in a set of sampled graphs

Author: Quadrianto Novi
Chen Chao
Lampert Christoph H
Publication venue: Omnipress
Publication date: 26/03/2005

When searching for characteristic subpatterns in potentially noisy graph data, it appears self-evident that having multiple observations would be better than having just one. However, it turns out that the inconsistencies introduced when different graph instances have different edge sets pose a serious challenge. In this work we address this challenge for the problem of finding maximum weighted cliques. We introduce the concept of the most persistent soft-clique. This is a subset of vertices that 1) is almost fully or at least densely connected, 2) occurs in all or almost all graph instances, and 3) has the maximum weight. We present a measure of clique-ness that essentially counts the number of edges missing to make a subset of vertices into a clique. With this measure, we show that the problem of finding the most persistent soft-clique can be cast either as a) a max-min two-person game optimization problem, or b) a min-min soft margin optimization problem. Both formulations lead to the same solution when using a partial Lagrangian method to solve the optimization problems. By experiments on synthetic data and on real social network data, we show that the proposed method is able to reliably find soft cliques in graph data, even if the data is distorted by random noise or unreliable observations.
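The clique-ness measure described in the abstract can be sketched as a plain count of missing edges, evaluated separately on each sampled graph. This is an illustrative reading of that one sentence, not the paper's optimization formulation; the function name and the two sample graphs are assumptions.

```python
from itertools import combinations

def missing_edges(vertices, edges):
    """Count vertex pairs in `vertices` that are not joined by an edge."""
    edge_set = {frozenset(edge) for edge in edges}
    return sum(
        1
        for u, v in combinations(sorted(vertices), 2)
        if frozenset((u, v)) not in edge_set
    )

# Two hypothetical observations of the same graph with different edge sets.
sample_1 = [(0, 1), (1, 2), (0, 2), (2, 3)]
sample_2 = [(0, 1), (1, 2), (2, 3)]        # edge (0, 2) was not observed

candidate = {0, 1, 2}
counts = [missing_edges(candidate, sample) for sample in (sample_1, sample_2)]
print(counts)
```

A count of zero means the candidate subset is a full clique in that sample; small counts across all samples correspond to the "almost fully connected, in almost all instances" idea in the abstract.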

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server

Sussex Research Online

University of St. Andrews - Pure