Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning
The number of triangles is a computationally expensive graph statistic which
is frequently used in complex network analysis (e.g., the transitivity ratio),
in various random graph models (e.g., the exponential random graph model), and
in important real-world applications such as spam detection, uncovering the
hidden thematic structure of the Web, and link recommendation. Counting
triangles in graphs with millions or billions of edges requires algorithms
which run fast, use a small amount of space, provide accurate estimates of the
number of triangles and, preferably, are parallelizable.
In this paper we present an efficient triangle counting algorithm which can
be adapted to the semi-streaming model. The key idea of our algorithm is to
combine the sampling algorithm of Tsourakakis et al. with the partitioning of
the vertex set into a high-degree and a low-degree subset, as in the work of
Alon, Yuster and Zwick, treating each set appropriately. We obtain a
running-time bound and a multiplicative-error approximation guarantee
expressed in terms of the number of vertices, the number of edges, and the
maximum number of triangles an edge is contained in.
Furthermore, we show how this algorithm can be adapted to the semi-streaming
model with small space usage and a constant number of passes (three) over the
graph stream. We apply our methods to various networks with several millions
of edges and obtain excellent results. Finally, we propose a random-projection
based method for triangle counting and provide a sufficient condition for
obtaining an estimate with low variance.

Comment: 12 pages. To appear in the 7th Workshop on Algorithms and Models
for the Web Graph (WAW 2010).
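The sampling component referenced above (the algorithm of Tsourakakis et al.) can be sketched as follows: keep each edge independently with probability p, count triangles in the sparsified graph, and rescale by 1/p^3. This is a minimal illustrative sketch, not the paper's full degree-partitioned algorithm; the function name is ours.

```python
import itertools
import random

def sampled_triangle_estimate(edges, p, seed=0):
    """Estimate the triangle count: keep each edge independently with
    probability p, count triangles in the sparsified graph exactly,
    and rescale by 1/p^3 (unbiased in expectation)."""
    rng = random.Random(seed)
    kept = {frozenset(e) for e in edges if rng.random() < p}
    adj = {}
    for e in kept:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = 0
    for e in kept:
        u, v = tuple(e)
        closed += len(adj[u] & adj[v])  # common neighbours close a triangle
    # Each triangle is counted once per each of its three edges.
    return (closed / 3) / p ** 3

# Sanity check on K4 (exactly 4 triangles), with p = 1 (no sampling):
k4 = list(itertools.combinations(range(4), 2))
print(sampled_triangle_estimate(k4, p=1.0))  # -> 4.0
```

With p < 1 the estimate is unbiased, and its variance shrinks as the triangle count grows, which is the regime the abstract targets (graphs with millions of edges).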
Simple parallel and distributed algorithms for spectral graph sparsification
We describe a simple algorithm for spectral graph sparsification, based on
iterative computations of weighted spanners and uniform sampling. Leveraging
the algorithms of Baswana and Sen for computing spanners, we obtain the first
distributed spectral sparsification algorithm. We also obtain a parallel
algorithm with improved work and time guarantees. Combining this algorithm with
the parallel framework of Peng and Spielman for solving symmetric diagonally
dominant linear systems, we get a parallel solver which is much closer to being
practical and significantly more efficient in terms of total work.

Comment: Replaces "A simple parallel and distributed algorithm for spectral
sparsification". Minor changes.
Probabilistic Spectral Sparsification In Sublinear Time
In this paper, we introduce a variant of spectral sparsification, called
probabilistic spectral sparsification. Roughly speaking, it preserves the cut
value of any cut up to a multiplicative error plus an additive error. We show
how to produce such a probabilistic spectral sparsifier in sublinear time for
unweighted undirected graphs. This gives the fastest known sublinear-time
algorithms for several cut problems on unweighted undirected graphs, such as
- an approximation algorithm for the sparsest cut problem and the balanced
separator problem;
- an approximate minimum s-t cut algorithm with an additive error.
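The error model above (multiplicative plus additive) is exactly what uniform edge sampling yields for cut values. A minimal sketch, with illustrative names and a toy example of our choosing:

```python
import itertools
import random

def estimate_cut(edges, side, num_samples, seed=0):
    """Estimate the number of edges crossing the cut (side, V \\ side)
    by uniform edge sampling: an unbiased estimator whose deviation has
    both a multiplicative and an additive component."""
    rng = random.Random(seed)
    m = len(edges)
    crossing = sum(
        1
        for u, v in (rng.choice(edges) for _ in range(num_samples))
        if (u in side) != (v in side)
    )
    return m * crossing / num_samples

# K10 split 5/5: the exact cut value is 5 * 5 = 25 edges.
edges = list(itertools.combinations(range(10), 2))
side = {0, 1, 2, 3, 4}
exact = sum((u in side) != (v in side) for u, v in edges)
print(exact, estimate_cut(edges, side, num_samples=2000))
```

The sublinearity comes from never touching most edges; the price is the additive error term for small cuts, which is the trade-off the abstract describes.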
Sketching Cuts in Graphs and Hypergraphs
Sketching and streaming algorithms are at the forefront of current research
directions for cut problems in graphs. In the streaming model, we show that
approximating Max-Cut requires nontrivial space; moreover, beating a certain
approximation ratio requires polynomial space. For the sketching model, we
show that uniform hypergraphs admit a cut sparsifier (i.e., a weighted
subhypergraph that approximately preserves all the cuts) with a bounded number
of edges. We also make first steps towards sketching general CSPs (Constraint
Satisfaction Problems).
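For concreteness, the quantity a hypergraph cut sparsifier must preserve is the total weight of hyperedges split by a vertex partition. A small sketch (function name and example hypergraph are ours):

```python
def hypergraph_cut(hyperedges, weights, side):
    """Weight of hyperedges cut by `side`: a hyperedge is cut iff it has
    at least one vertex on each side of the partition."""
    side = set(side)
    return sum(
        w
        for e, w in zip(hyperedges, weights)
        if 0 < len(side & set(e)) < len(set(e))
    )

# A cut sparsifier must approximate this value for EVERY choice of side.
H = [(0, 1, 2), (1, 2, 3), (0, 3), (2, 3, 4)]
w = [1.0, 1.0, 1.0, 1.0]
print(hypergraph_cut(H, w, {0, 1}))  # -> 3.0
```

Ordinary graph cut sparsification is the special case where every hyperedge has exactly two vertices.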
Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time
We present the first almost-linear time algorithm for constructing
linear-sized spectral sparsifiers of graphs. This improves over all previous
constructions of linear-sized spectral sparsifiers, which require
substantially more time.
A key ingredient in our algorithm is a novel combination of two techniques
used in the literature for constructing spectral sparsifiers: random sampling
by effective resistance, and adaptive constructions based on barrier
functions.

Comment: 22 pages. A preliminary version of this paper is to appear in the
proceedings of the 56th Annual IEEE Symposium on Foundations of Computer
Science (FOCS 2015).
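The first of the two ingredients, sampling by effective resistance, can be sketched for toy graphs as follows. This is the classical Spielman–Srivastava-style sampler, not the paper's barrier-function construction; computing a dense pseudo-inverse is only acceptable at this scale, and the names are ours.

```python
import numpy as np

def effective_resistance_sample(n, edges, num_samples, seed=0):
    """Sample edges with probability proportional to their effective
    resistance, reweighting so the Laplacian is unbiased in expectation."""
    rng = np.random.default_rng(seed)
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lp = np.linalg.pinv(L)  # toy-sized only; real algorithms avoid this
    reff = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])
    probs = reff / reff.sum()
    counts = rng.multinomial(num_samples, probs)
    # An edge sampled c times gets weight c / (num_samples * p_e).
    return [
        (u, v, c / (num_samples * p))
        for (u, v), c, p in zip(edges, counts, probs)
        if c > 0
    ]
```

For a connected graph the effective resistances of the edges sum to n - 1, which is why resistance-proportional sampling concentrates the budget on the structurally important edges.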
Online Row Sampling
Finding a small spectral approximation for a tall matrix is a fundamental
numerical primitive. For a number of reasons, one often seeks an
approximation whose rows are sampled from those of the original matrix. Row
sampling improves interpretability, saves space when the matrix is sparse,
and preserves row structure, which is especially important, for example, when
the matrix represents a graph. However, correctly sampling rows can be costly
when the matrix is large and cannot be stored and processed in memory. Hence,
a number of recent publications focus on row sampling in the streaming
setting, using little more space than is required to store the output
approximation [KL13, KLM+14].
Inspired by a growing body of work on online algorithms for machine learning
and data analysis, we extend this work to a more restrictive online setting:
we read the rows one by one and immediately decide whether each row should be
kept in the spectral approximation or discarded, without ever retracting
these decisions. We present an extremely simple algorithm that approximates
the matrix up to multiplicative and additive error using few online samples,
with memory overhead proportional to the cost of storing the spectral
approximation. We also present an algorithm with a larger memory footprint
that requires fewer samples, a number we show is optimal.
Our methods are clean and intuitive, allow for lower memory usage than prior
work, and expose new theoretical properties of leverage-score-based matrix
approximation.
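The online decision rule described above can be sketched with ridge leverage scores: score each incoming row against the sketch built so far, keep it with probability proportional to that score, and rescale kept rows to stay unbiased. A minimal sketch under our own parameter names (`lam`, `c` are illustrative regularization and oversampling knobs, not the paper's exact constants):

```python
import numpy as np

def online_row_sample(rows, lam, c, seed=0):
    """Online row sampling: for each incoming row a, compute the online
    ridge leverage score l = a^T (B^T B + lam*I)^{-1} a against the
    rows kept so far, keep a with probability min(1, c*l), and rescale
    kept rows so B^T B remains unbiased. Decisions are never retracted."""
    rng = np.random.default_rng(seed)
    d = len(rows[0])
    M = lam * np.eye(d)  # running (kept rows)^T (kept rows) + lam*I
    kept = []
    for a in rows:
        a = np.asarray(a, dtype=float)
        l = a @ np.linalg.solve(M, a)  # online leverage score estimate
        p = min(1.0, c * l)
        if rng.random() < p:
            kept.append(a / np.sqrt(p))  # rescaling keeps E[B^T B] right
            M += np.outer(a, a) / p
    return np.array(kept)
```

Repeated or nearly dependent rows quickly acquire small scores and are dropped, so redundant streams are compressed aggressively while novel directions are always retained.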
- …