
13,303 research outputs found

A State-Space Model for the Dynamic Random Subgraph Model

Author: Bouveyron Charles
Latouche Pierre
Zreik Rawya
Publication venue: HAL CCSD
Publication date: 22/04/2015

Proceedings of the 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2015). International audience. In recent years, many random graph models have been proposed to extract information from networks. The principle is to look for groups of vertices with homogeneous connection profiles. Most of these models are suitable for static networks and can handle different types of edges. This work is motivated by the need to analyze an evolving network describing email communications between employees of the Enron company, where social positions play an important role. Therefore, in this paper, we consider the random subgraph model (RSM), which was recently proposed to model networks through latent clusters built within known partitions. Using a state-space model to characterize the cluster proportions, RSM is then extended to deal with dynamic networks. We call the latter the dynamic random subgraph model (dRSM).

CiteSeerX

HAL Descartes

HAL-Paris1

CLASSIFICATION AUTOMATIQUE DE RÉSEAUX DYNAMIQUES AVEC SOUS-GRAPHES : ÉTUDE DU SCANDALE ENRON

Author: Bouveyron Charles
Latouche Pierre
Zreik Rawya
Publication venue: Société Française de Statistique et Société Mathématique de France
Publication date: 01/01/2015

International audience. Abstract: In recent years, many random graph models have been proposed to extract information from networks in a variety of fields. The principle is to look for communities or groups of vertices with homogeneous connection profiles. Most of these models are suitable for static networks, that is, they do not take the temporal dimension into account, but they can handle different types of edges, whether binary or discrete. This work is motivated by the need to analyse an evolving network describing email communications between employees of the Enron company, where social positions play an important role. Therefore, in this paper, we consider the random subgraph model (RSM), which was recently proposed to model static networks through latent clusters built within known partitions. Using a state-space model to characterize how the cluster proportions evolve over time, RSM is then extended to deal with dynamic networks. We call the latter the dynamic random subgraph model (dRSM). A variational expectation maximisation (VEM) algorithm is proposed to perform inference. We show that the variational approximations lead to a new state-space model from which the parameters, along with the hidden states, can be estimated using the standard Kalman filter and the Rauch-Tung-Striebel (RTS) smoother. The methodology is finally applied to the Enron email dataset and reveals an early reaction of the partners and directors, compared with the other employees, to the coming scandal.
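The abstract relies on the standard Kalman filter and RTS smoother to estimate hidden states. As a rough illustration only, the sketch below runs both on a scalar linear-Gaussian model, x_t = a·x_{t-1} + w_t with w_t ~ N(0, q) and y_t = x_t + v_t with v_t ~ N(0, r). The model, the function names, and all parameters (`a`, `q`, `r`, `m0`, `p0`) are assumptions chosen for illustration; this is not the dRSM state-space model or the authors' VEM procedure.

```python
def kalman_filter(ys, a, q, r, m0, p0):
    """Forward pass for the scalar model (an illustrative assumption).

    Returns predicted and filtered means/variances for every time step.
    m0, p0 describe the prior on the state before the first observation.
    """
    m_pred, p_pred, m_filt, p_filt = [], [], [], []
    m, p = m0, p0
    for y in ys:
        # Prediction step: propagate the previous estimate through the dynamics.
        mp = a * m
        pp = a * a * p + q
        # Update step: correct the prediction with the new observation.
        k = pp / (pp + r)          # Kalman gain
        m = mp + k * (y - mp)
        p = (1.0 - k) * pp
        m_pred.append(mp)
        p_pred.append(pp)
        m_filt.append(m)
        p_filt.append(p)
    return m_pred, p_pred, m_filt, p_filt


def rts_smoother(a, m_pred, p_pred, m_filt, p_filt):
    """Backward pass: refine each filtered estimate using later observations."""
    n = len(m_filt)
    m_s, p_s = list(m_filt), list(p_filt)
    for t in range(n - 2, -1, -1):
        g = p_filt[t] * a / p_pred[t + 1]   # smoother gain
        m_s[t] = m_filt[t] + g * (m_s[t + 1] - m_pred[t + 1])
        p_s[t] = p_filt[t] + g * g * (p_s[t + 1] - p_pred[t + 1])
    return m_s, p_s


if __name__ == "__main__":
    ys = [1.0, 1.2, 0.9, 1.1]
    mp, pp, mf, pf = kalman_filter(ys, a=1.0, q=0.1, r=0.5, m0=0.0, p0=1.0)
    ms, ps = rts_smoother(1.0, mp, pp, mf, pf)
    print(ms, ps)
```

At the last time step the smoothed and filtered estimates coincide; at earlier steps the smoothed variance is never larger than the filtered one, because the smoother also uses later observations.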

HAL Descartes

Numérisation de Documents Anciens Mathématiques

HAL-Paris1

Hal-Diderot

Space- and Time-Efficient Algorithm for Maintaining Dense Subgraphs on One-Pass Dynamic Streams

Author: Epasto A.
Gibson D.
Goldberg A. V.
Lawler E.
Matula D.
Publication date: 01/01/2015

While in many graph mining applications it is crucial to handle a stream of updates efficiently in terms of both time and space, not much was known about achieving such type of algorithm. In this paper we study this issue for a problem which lies at the core of many graph mining applications called the densest subgraph problem. We develop an algorithm that achieves time- and space-efficiency for this problem simultaneously. It is one of the first of its kind for graph problems to the best of our knowledge. In a graph $G = (V, E)$, the "density" of a subgraph induced by a subset of nodes $S \subseteq V$ is defined as $|E(S)|/|S|$, where $E(S)$ is the set of edges in $E$ with both endpoints in $S$. In the densest subgraph problem, the goal is to find a subset of nodes that maximizes the density of the corresponding induced subgraph. For any $\epsilon > 0$, we present a dynamic algorithm that, with high probability, maintains a $(4+\epsilon)$-approximation to the densest subgraph problem under a sequence of edge insertions and deletions in a graph with $n$ nodes. It uses $\tilde O(n)$ space, and has an amortized update time of $\tilde O(1)$ and a query time of $\tilde O(1)$. Here, $\tilde O$ hides an $O(\mathrm{poly}\log_{1+\epsilon} n)$ term. The approximation ratio can be improved to $(2+\epsilon)$ at the cost of increasing the query time to $\tilde O(n)$. It can be extended to a $(2+\epsilon)$-approximation sublinear-time algorithm and a distributed-streaming algorithm. Our algorithm is the first streaming algorithm that can maintain the densest subgraph in one pass. The previously best algorithm in this setting required $O(\log n)$ passes [Bahmani, Kumar and Vassilvitskii, VLDB'12]. The space required by our algorithm is tight up to a polylogarithmic factor. Comment: A preliminary version of this paper appeared in STOC 201

arXiv.org e-Print Archive

Publikationer från KTH

CiteSeerX

Crossref

Warwick Research Archives Portal Repository

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

Author: Assadi Sepehr
Bateni MohammadHossein
Bernstein Aaron
Mirrokni Vahab
Stein Cliff
Publication date: 27/12/2018

As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include:
* The first one pass, significantly-better-than-2-approximation for matching in random arrival streams that uses subquadratic space, namely a $(1.5+\epsilon)$-approximation streaming algorithm that uses $O(n^{1.5})$ space for constant $\epsilon > 0$.
* The first 2-round, better-than-2-approximation for matching in the MPC model that uses subquadratic space per machine, namely a $(1.5+\epsilon)$-approximation algorithm with $O(\sqrt{mn} + n)$ memory per machine for constant $\epsilon > 0$.
By building on our unified approach, we further develop parallel algorithms in the MPC model that give a $(1 + \epsilon)$-approximation to matching and an $O(1)$-approximation to vertex cover in only $O(\log\log{n})$ MPC rounds and $O(n/\mathrm{poly}\log{(n)})$ memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj et al. [STOC 2018]
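For context on the "better-than-2" claims, the usual baseline is the greedy maximal matching: scan the edges once and keep an edge whenever both endpoints are still free. Any maximal matching is a 2-approximation to maximum matching, and its matched endpoints form a vertex cover at most twice the minimum size. The sketch below shows only this baseline, not the paper's EDCS-based method; the function name is hypothetical.

```python
def greedy_matching(edge_stream):
    """One-pass greedy maximal matching over a stream of (u, v) edges.

    Returns the matching and the set of matched endpoints, which is also
    a vertex cover of every edge seen in the stream.
    """
    matched = set()
    matching = []
    for u, v in edge_stream:
        # Keep the edge only if neither endpoint is already matched.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching, matched


if __name__ == "__main__":
    # Path 1-2-3-4 with the middle edge arriving first: greedy keeps one
    # edge, while the maximum matching {(1, 2), (3, 4)} has two.
    stream = [(2, 3), (1, 2), (3, 4)]
    print(greedy_matching(stream))
```

The path example shows the factor 2 can be attained under an adversarial arrival order.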

arXiv.org e-Print Archive

Crossref