10 research outputs found

From random walks to distances on unweighted graphs

Author: Hashimoto Tatsunori B.
Jaakkola Tommi S.
Sun Yi
Publication date: 02/11/2015

Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of the Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives.
Comment: To appear in NIPS 201

arXiv.org e-Print Archive

DSpace@MIT
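
As an illustration of the Laplace transformed hitting time named in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes a simple random walk in discrete unit time steps on an adjacency-list graph, where first-step analysis gives u(target) = 1 and u(i) = exp(-beta) * mean of u over the out-neighbours of i. The function names, the fixed-point solver, and the use of -log(u) as a dissimilarity are illustrative assumptions; the abstract does not state the exact normalisation the paper uses.

```python
import math


def laplace_hitting_time(adj, target, beta, tol=1e-12, max_iter=100000):
    """Return u where u[i] = E[exp(-beta * T_i)] and T_i is the number of
    steps a simple random walk started at i takes to first reach `target`.

    adj[i] is the list of out-neighbours of vertex i (hypothetical input
    format). Requires beta > 0 so that the iteration is a contraction.
    """
    n = len(adj)
    decay = math.exp(-beta)
    u = [0.0] * n
    u[target] = 1.0
    for _ in range(max_iter):
        new_u = [0.0] * n
        new_u[target] = 1.0
        for i in range(n):
            if i == target or not adj[i]:
                continue  # dead ends never reach the target, so u stays 0
            new_u[i] = decay * sum(u[k] for k in adj[i]) / len(adj[i])
        delta = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if delta < tol:
            break
    return u


def ltht_dissimilarity(adj, source, target, beta):
    """Assumed dissimilarity: -log of the transformed hitting time.

    Returns infinity when the target cannot be reached from the source.
    """
    value = laplace_hitting_time(adj, target, beta)[source]
    return math.inf if value <= 0.0 else -math.log(value)


if __name__ == "__main__":
    path = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2
    print(ltht_dissimilarity(path, 0, 2, beta=1.0))
    print(ltht_dissimilarity(path, 1, 2, beta=1.0))
```

Because exp(-beta) is below 1 for any positive beta, the iteration converges geometrically; for large graphs a sparse linear solver would be the more usual choice.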

Measuring Global Similarity between Texts

Author: C Labbé
D Labbé
F Damerau
HW Kuhn
J Savoy
L Kaufman
MA Cortelazzo
SB Needleman
T Smith
U Fahrenberg
Publication date: 14/05/2014

We propose a new similarity measure between texts which, contrary to the current state-of-the-art approaches, takes a global view of the texts to be compared. We have implemented a tool to compute our textual distance and conducted experiments on several corpuses of texts. The experiments show that our methods can reliably identify different global types of texts.
Comment: Submitted to SLSP 201

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Streaming Graph Challenge: Stochastic Block Partition

Author: Gadepally Vijay
Hurley Michael
Jones Michael
Kao Edward
Kepner Jeremy
Mohindra Sanjeev
Monticciolo Paul
Reuther Albert
Samsi Siddharth
Smith Steven
Song William
Staheli Diane
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/08/2017

An important objective for analyzing real-world graphs is to achieve scalable performance on large, streaming graphs. A challenging and relevant example is the graph partition problem. As a combinatorial problem, graph partition is NP-hard, but existing relaxation methods provide reasonable approximate solutions that can be scaled for large graphs. Competitive benchmarks and challenges have proven to be an effective means to advance state-of-the-art performance and foster community collaboration. This paper describes a graph partition challenge with a baseline partition algorithm of sub-quadratic complexity. The algorithm employs rigorous Bayesian inferential methods based on a statistical model that captures characteristics of the real-world graphs. This strong foundation enables the algorithm to address limitations of well-known graph partition approaches such as modularity maximization. This paper describes various aspects of the challenge including: (1) the data sets and streaming graph generator, (2) the baseline partition algorithm with pseudocode, (3) an argument for the correctness of parallelizing the Bayesian inference, (4) different parallel computation strategies such as node-based parallelism and matrix-based parallelism, (5) evaluation metrics for partition correctness and computational requirements, (6) preliminary timing of a Python-based demonstration code and the open source C++ code, and (7) considerations for partitioning the graph in streaming fashion. Data sets and source code for the algorithm, as well as metrics with detailed documentation, are available at GraphChallenge.org.
Comment: To be published in 2017 IEEE High Performance Extreme Computing Conference (HPEC

arXiv.org e-Print Archive

Crossref

The Power of Side-Information in Subgraph Detection

Author: Avrachenkov Konstantin
Cottatellucci Laura
Kadavankandy Arun
Sundaresan Rajesh
Publication venue: HAL CCSD
Publication date: 28/02/2017

In this work, we tackle the problem of hidden community detection. We consider Belief Propagation (BP) applied to the problem of detecting a hidden Erdős-Rényi (ER) graph embedded in a larger and sparser ER graph, in the presence of side-information. We derive two related algorithms based on BP to perform subgraph detection in the presence of two kinds of side-information. The first variant of side-information consists of a set of nodes, called cues, known to be from the subgraph. The second variant of side-information consists of a set of nodes that are cues with a given probability. It was shown in past works that BP without side-information fails to detect the subgraph correctly when an effective signal-to-noise ratio (SNR) parameter falls below a threshold. In contrast, in the presence of non-trivial side-information, we show that the BP algorithm achieves asymptotically zero error for any value of the SNR parameter. We validate our results through simulations on synthetic datasets as well as on a few real-world networks.

HAL-UNICE

INRIA a CCSD electronic archive server

Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms

Author: Tan Chee Wei
Yu Pei-Duo
Publication venue: Now Publishers
Publication date: 08/07/2023

This monograph provides an overview of the mathematical theories and computational algorithm design for contagion source detection in large networks. By leveraging network centrality as a tool for statistical inference, we can accurately identify the source of contagions, trace their spread, and predict future trajectories. This approach provides fundamental insights into surveillance capability and asymptotic behavior of contagion spreading in networks. Mathematical theory and computational algorithms are vital to understanding contagion dynamics, improving surveillance capabilities, and developing effective strategies to prevent the spread of infectious diseases and misinformation.
Comment: Suggested Citation: Chee Wei Tan and Pei-Duo Yu (2023), "Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms", Foundations and Trends in Networking: Vol. 13: No. 2-3, pp 107-251. http://dx.doi.org/10.1561/130000006

arXiv.org e-Print Archive

Bayesian Discovery of Threat Networks

Author: Edward K. Kao
Garrett Bernstein
Kenneth D. Senne
Scott Philips
Steven T. Smith
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref