
    Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning

    The number of triangles is a computationally expensive graph statistic which is frequently used in complex network analysis (e.g., the transitivity ratio), in various random graph models (e.g., the exponential random graph model), and in important real-world applications such as spam detection, uncovering the hidden thematic structure of the Web, and link recommendation. Counting triangles in graphs with millions and billions of edges requires algorithms which run fast, use a small amount of space, provide accurate estimates of the number of triangles, and preferably are parallelizable. In this paper we present an efficient triangle counting algorithm which can be adapted to the semi-streaming model. The key idea of our algorithm is to combine the sampling algorithm of Tsourakakis et al. with the partitioning of the vertex set into a high-degree and a low-degree subset, as in the work of Alon, Yuster and Zwick, treating each subset appropriately. We obtain a running time of $O\left(m + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$ and an $\epsilon$-approximation (multiplicative error), where $n$ is the number of vertices, $m$ the number of edges, $t$ the number of triangles, and $\Delta$ the maximum number of triangles an edge is contained in. Furthermore, we show how this algorithm can be adapted to the semi-streaming model with space usage $O\left(m^{1/2}\log n + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$ and a constant number of passes (three) over the graph stream. We apply our methods to various networks with several millions of edges and obtain excellent results. Finally, we propose a random-projection-based method for triangle counting and provide a sufficient condition to obtain an estimate with low variance. Comment: 12 pages. To appear in the 7th Workshop on Algorithms and Models for the Web Graph (WAW 2010).
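    The sampling ingredient referenced above is easy to sketch: the Tsourakakis et al. estimator keeps each edge independently with probability $p$, counts triangles exactly in the sparsified graph, and rescales by $1/p^3$. Below is a minimal sketch of that ingredient alone; the degree-based partitioning and the choice of $p$ are omitted, all names are illustrative, and vertices are assumed to be comparable (e.g., integers).

```python
import random
from collections import defaultdict

def count_triangles(adj):
    """Exact triangle count in a dict-of-sets adjacency structure.
    Each triangle u < v < w is charged once to its edge (u, v)."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            # common neighbors of u and v that come after v
            total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total

def sampled_triangle_estimate(edges, p, seed=0):
    """Edge-sampling estimator: keep each edge with probability p,
    count triangles exactly in the sample, rescale by 1/p^3."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in edges:
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return count_triangles(adj) / p**3
```

    The estimator is unbiased, since each triangle survives sampling with probability $p^3$; the paper's high/low-degree partitioning is what controls its variance in terms of $\Delta$ and $t$.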

    Simple parallel and distributed algorithms for spectral graph sparsification

    We describe a simple algorithm for spectral graph sparsification, based on iterative computations of weighted spanners and uniform sampling. Leveraging the spanner algorithms of Baswana and Sen, we obtain the first distributed spectral sparsification algorithm. We also obtain a parallel algorithm with improved work and time guarantees. Combining this algorithm with the parallel framework of Peng and Spielman for solving symmetric diagonally dominant linear systems, we get a parallel solver which is much closer to being practical and significantly more efficient in terms of the total work. Comment: replaces "A simple parallel and distributed algorithm for spectral sparsification". Minor changes.
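    A hedged skeleton of the iterative scheme as the abstract describes it: in each round, the edges of a spanner are kept, and every remaining edge survives uniformly with probability 1/4 with its weight scaled by 4. The greedy spanner below stands in for the Baswana-Sen construction, and the round count, stretch, and use of a single spanner per round (rather than a bundle) are simplifying assumptions, not the paper's parameters.

```python
import math
import random
import networkx as nx

def greedy_spanner(G, stretch):
    """Greedy stretch-spanner (Althofer et al.), standing in for the
    Baswana-Sen construction: scan edges by increasing weight and add
    an edge only if the current spanner distance is too long."""
    S = nx.Graph()
    S.add_nodes_from(G.nodes())
    for u, v, w in sorted(G.edges(data='weight', default=1.0),
                          key=lambda e: e[2]):
        try:
            d = nx.dijkstra_path_length(S, u, v)
        except nx.NetworkXNoPath:
            d = float('inf')
        if d > stretch * w:
            S.add_edge(u, v, weight=w)
    return S

def sparsify(G, rounds=3, stretch=None, seed=0):
    """Iterative skeleton: spanner edges are protected; every other
    edge survives with probability 1/4 and weight multiplied by 4."""
    rng = random.Random(seed)
    n = G.number_of_nodes()
    if stretch is None:
        stretch = max(3, 2 * int(math.log2(max(n, 2))) + 1)  # illustrative
    H = G.copy()
    for _ in range(rounds):
        S = greedy_spanner(H, stretch)
        nxt = S.copy()
        for u, v, w in H.edges(data='weight', default=1.0):
            if not S.has_edge(u, v) and rng.random() < 0.25:
                nxt.add_edge(u, v, weight=4 * w)
        H = nxt
    return H
```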

    Probabilistic Spectral Sparsification In Sublinear Time

    In this paper, we introduce a variant of spectral sparsification called probabilistic $(\varepsilon,\delta)$-spectral sparsification. Roughly speaking, it preserves the cut value of any cut $(S,S^{c})$ up to a $1\pm\varepsilon$ multiplicative error and a $\delta\left|S\right|$ additive error. We show how to produce a probabilistic $(\varepsilon,\delta)$-spectral sparsifier with $O(n\log n/\varepsilon^{2})$ edges in $\tilde{O}(n/\varepsilon^{2}\delta)$ time for unweighted undirected graphs. This yields the fastest known sublinear-time algorithms for several cut problems on unweighted undirected graphs, namely:
    - An $\tilde{O}(n/OPT+n^{3/2+t})$ time $O(\sqrt{\log n/t})$-approximation algorithm for the sparsest cut problem and the balanced separator problem.
    - An $n^{1+o(1)}/\varepsilon^{4}$ time approximate minimum s-t cut algorithm with an $\varepsilon n$ additive error.
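    While the sublinear-time construction itself is not reproduced here, the guarantee being claimed is simple to state in code: every cut $(S,S^c)$ of the sparsifier must match the original cut value up to a $1\pm\varepsilon$ multiplicative and $\delta|S|$ additive error, with failures allowed only on a small probability mass. A small spot-checker over random cuts, with all names illustrative:

```python
import random

def cut_value(edges, S):
    """Total weight of edges crossing the cut (S, V \\ S)."""
    S = set(S)
    return sum(w for u, v, w in edges if (u in S) != (v in S))

def respects_guarantee(g_edges, h_edges, S, eps, delta):
    """Check the (eps, delta) guarantee on one cut:
    (1-eps)*cut - delta*|S| <= sparsified cut <= (1+eps)*cut + delta*|S|."""
    c, c_h = cut_value(g_edges, S), cut_value(h_edges, S)
    slack = delta * len(S)
    return (1 - eps) * c - slack <= c_h <= (1 + eps) * c + slack

def spot_check(g_edges, h_edges, vertices, eps, delta, trials=1000, seed=0):
    """Sample random vertex subsets and report the fraction of nontrivial
    cuts meeting the guarantee; a *probabilistic* sparsifier is allowed
    to fail on a small fraction of them."""
    rng = random.Random(seed)
    checked = ok = 0
    for _ in range(trials):
        S = [v for v in vertices if rng.random() < 0.5]
        if 0 < len(S) < len(vertices):
            checked += 1
            ok += respects_guarantee(g_edges, h_edges, S, eps, delta)
    return ok / max(checked, 1)
```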

    Sketching Cuts in Graphs and Hypergraphs

    Sketching and streaming algorithms are at the forefront of current research directions for cut problems in graphs. In the streaming model, we show that a $(1-\epsilon)$-approximation for Max-Cut must use $n^{1-O(\epsilon)}$ space; moreover, beating a $4/5$-approximation requires polynomial space. For the sketching model, we show that $r$-uniform hypergraphs admit a $(1+\epsilon)$-cut-sparsifier (i.e., a weighted subhypergraph that approximately preserves all the cuts) with $O(\epsilon^{-2} n (r+\log n))$ edges. We also make first steps towards sketching general CSPs (Constraint Satisfaction Problems).
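    For concreteness, the quantity such a hypergraph cut-sparsifier must preserve: a hyperedge is cut by $(S,S^c)$ exactly when it has vertices on both sides. A minimal sketch of that cut function plus a brute spot-check of the $(1\pm\epsilon)$ guarantee; function names are illustrative, and a true certificate would need all $2^n - 2$ cuts:

```python
def hypergraph_cut(hyperedges, weights, S):
    """Total weight of hyperedges with at least one vertex in S
    and at least one vertex outside S."""
    S = set(S)
    total = 0.0
    for e, w in zip(hyperedges, weights):
        inside = sum(1 for v in e if v in S)
        if 0 < inside < len(e):  # e straddles the cut
            total += w
    return total

def is_cut_sparsifier(H, H_sparse, cuts, eps):
    """Check (1 +/- eps) preservation on a list of candidate cuts.
    H and H_sparse are (hyperedges, weights) pairs."""
    for S in cuts:
        c = hypergraph_cut(*H, S)
        c_s = hypergraph_cut(*H_sparse, S)
        if not ((1 - eps) * c <= c_s <= (1 + eps) * c):
            return False
    return True
```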

    Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time

    We present the first almost-linear time algorithm for constructing linear-sized spectral sparsification for graphs. This improves on all previous constructions of linear-sized spectral sparsification, which require $\Omega(n^2)$ time. A key ingredient in our algorithm is a novel combination of two techniques used in the literature for constructing spectral sparsification: random sampling by effective resistance, and adaptive constructions based on barrier functions. Comment: 22 pages. A preliminary version of this paper is to appear in the proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
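    Of the two named ingredients, sampling by effective resistance (in the style of Spielman and Srivastava) admits a short dense-linear-algebra sketch: compute $R_e = b_e^{T} L^{+} b_e$ and sample edges with probability proportional to $w_e R_e$. Note this is the quadratic-time baseline the paper improves on, not the almost-linear-time construction itself; the sample count $q$ is left to the caller:

```python
import numpy as np

def effective_resistance_sample(n, edges, weights, q, seed=0):
    """Sample q edges with probability proportional to w_e * R_e,
    where R_e is the effective resistance, and reweight each drawn
    edge by w_e / (q * p_e) so the sparsifier is unbiased.
    Uses a dense O(n^3) pseudoinverse: the baseline, not the paper's
    almost-linear-time method."""
    weights = np.asarray(weights, dtype=float)
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    Lp = np.linalg.pinv(L)  # Laplacian pseudoinverse L^+
    R = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])
    p = weights * R
    p = p / p.sum()
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(edges), size=q, p=p)
    sparsifier = {}
    for i in idx:
        e = edges[i]
        sparsifier[e] = sparsifier.get(e, 0.0) + weights[i] / (q * p[i])
    return sparsifier
```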

    Online Row Sampling

    Finding a small spectral approximation for a tall $n \times d$ matrix $A$ is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of $A$. Row sampling improves interpretability, saves space when $A$ is sparse, and preserves row structure, which is especially important, for example, when $A$ represents a graph. However, correctly sampling rows from $A$ can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the outputted approximation [KL13, KLM+14]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of $A$ one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates $A$ up to multiplicative error $\epsilon$ and additive error $\delta$ using $O(d \log d \log(\epsilon\|A\|_2/\delta)/\epsilon^2)$ online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses $O(d^2)$ memory but only requires $O(d\log(\epsilon\|A\|_2/\delta)/\epsilon^2)$ samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation.
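    A hedged sketch of the online rule the abstract describes: stream the rows, estimate each row's leverage against the approximation kept so far (regularized by $\delta$), keep the row with probability proportional to that score, rescale kept rows so the estimate stays unbiased, and never revisit a decision. The constant $c$ and the Sherman-Morrison inverse maintenance are implementation assumptions, not taken from the paper:

```python
import numpy as np

def online_row_sample(rows, d, eps, delta, c=8.0, seed=0):
    """Online spectral approximation via regularized leverage scores.
    Maintains M^{-1} for M = B^T B + delta*I over the rows kept so far;
    an incoming row a is kept with probability
    min(1, c * log(d) * l / eps^2), where l = a^T M^{-1} a, and is
    rescaled by 1/sqrt(p). Decisions are never retracted."""
    rng = np.random.default_rng(seed)
    Minv = np.eye(d) / delta          # (delta * I)^{-1} to start
    kept = []
    for a in rows:                    # stream the rows one by one
        l = float(a @ Minv @ a)       # regularized online leverage score
        p = min(1.0, c * np.log(d) * l / eps**2)
        if rng.random() < p:
            s = a / np.sqrt(p)        # rescale the kept row
            kept.append(s)
            # Sherman-Morrison update of (M + s s^T)^{-1}
            Ms = Minv @ s
            Minv -= np.outer(Ms, Ms) / (1.0 + s @ Ms)
        # discarded rows are gone for good
    return np.array(kept)
```

    The rescaling by $1/\sqrt{p}$ makes the sampled matrix an unbiased estimator of $A^T A$; the memory used beyond the output is the $d \times d$ inverse, matching the spirit of the second algorithm in the abstract.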