The Sketching Complexity of Graph and Hypergraph Counting
Subgraph counting is a fundamental primitive in graph processing, with
applications in social network analysis (e.g., estimating the clustering
coefficient of a graph), database processing and other areas. The space
complexity of subgraph counting has been studied extensively in the literature,
but many natural settings are still not well understood. In this paper we
revisit the subgraph (and hypergraph) counting problem in the sketching model,
where the algorithm's state as it processes a stream of updates to the graph is
a linear function of the stream. This model has recently received a lot of
attention in the literature, and has become a standard model for solving
dynamic graph streaming problems.
In this paper we give a tight bound on the sketching complexity of counting
the number of occurrences of a small subgraph in a bounded degree graph
presented as a stream of edge updates. Specifically, we show that the space
complexity of the problem is governed by the fractional vertex cover number of
the subgraph being counted. Our subgraph counting algorithm implements a natural vertex
sampling approach, with sampling probabilities governed by the vertex cover of
that subgraph. Our main technical contribution lies in a new set of Fourier analytic
tools that we develop to analyze multiplayer communication protocols in the
simultaneous communication model, allowing us to prove a tight lower bound. We
believe that our techniques are likely to find applications in other settings.
Besides giving tight bounds for every choice of the counted subgraph, both our
algorithm and lower bounds extend to the hypergraph setting, albeit with some
loss in space complexity.
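The vertex sampling idea mentioned in this abstract can be illustrated for the simplest pattern, a triangle. The following is a minimal sketch under stated assumptions, not the paper's algorithm: it uses a single uniform sampling probability `p` (an assumption; the abstract says the real probabilities depend on the vertex cover of the pattern), handles insert-only streams rather than general linear sketches, and stores the sampling decision per vertex where a practical version would likely use a hash function. The names `count_triangles` and `estimate_triangles` are hypothetical.

```python
import random


def count_triangles(edges):
    """Exact triangle count of a simple undirected graph given as edge pairs."""
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}
    adj = {}
    for e in edge_set:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once per edge, i.e. three times in total.
    total = sum(len(adj[u] & adj[v]) for u, v in map(tuple, edge_set))
    return total // 3


def estimate_triangles(edge_stream, p, seed=0):
    """Keep an edge only if both endpoints are sampled, then rescale.

    A triangle survives with probability p**3, so dividing the count in the
    sampled graph by p**3 gives an unbiased estimate (assumes 0 < p <= 1).
    """
    rng = random.Random(seed)
    keep = {}          # per-vertex sampling decision (illustrative only)
    kept_edges = []
    for u, v in edge_stream:
        for x in (u, v):
            if x not in keep:
                keep[x] = rng.random() < p
        if keep[u] and keep[v]:
            kept_edges.append((u, v))
    return count_triangles(kept_edges) / p ** 3
```

With `p = 1.0` every vertex is kept and the estimate equals the exact count; smaller values of `p` trade accuracy for space.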
Sketching Cuts in Graphs and Hypergraphs
Sketching and streaming algorithms are in the forefront of current research
directions for cut problems in graphs. In the streaming model, we show that
-approximation for Max-Cut must use space;
moreover, beating -approximation requires polynomial space. For the
sketching model, we show that -uniform hypergraphs admit a
-cut-sparsifier (i.e., a weighted subhypergraph that
approximately preserves all the cuts) with
edges. We also make first steps towards sketching general CSPs (Constraint
Satisfaction Problems).
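To make the notion of a cut-sparsifier concrete, here is a small brute-force checker. It is a sketch under assumptions not stated in the abstract: a hyperedge counts as cut when it has vertices on both sides of the partition (the usual convention), and "approximately preserves" is read as a multiplicative tolerance `eps`. The function names are hypothetical, and the exhaustive loop over vertex subsets is only feasible for tiny instances.

```python
from itertools import combinations


def cut_weight(hyperedges, side):
    """Total weight of hyperedges with vertices both inside and outside `side`.

    `hyperedges` is a list of (vertex_tuple, weight); `side` is a set.
    """
    total = 0.0
    for vertices, weight in hyperedges:
        members = set(vertices)
        inside = len(members & side)
        if 0 < inside < len(members):
            total += weight
    return total


def preserves_all_cuts(original, sparsifier, eps):
    """Check every nontrivial cut of `original` against `sparsifier`."""
    vertices = sorted({v for vs, _ in original for v in vs})
    for size in range(1, len(vertices)):
        for subset in combinations(vertices, size):
            side = set(subset)
            w = cut_weight(original, side)
            w_s = cut_weight(sparsifier, side)
            if not (1 - eps) * w <= w_s <= (1 + eps) * w:
                return False
    return True
```

A sparsifier in the abstract's sense would be a reweighted subset of the original hyperedges for which `preserves_all_cuts` holds.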
Counting Simplices in Hypergraph Streams
We consider the problem of space-efficiently estimating the number of
simplices in a hypergraph stream. This is the most natural hypergraph
generalization of the highly-studied problem of estimating the number of
triangles in a graph stream. Our input is a k-uniform hypergraph H with n
vertices and m hyperedges. A k-simplex in H is a subhypergraph on k+1
vertices such that all k+1 possible hyperedges among them exist in H.
The goal is to process a stream of hyperedges of H and compute a good
estimate of the number of k-simplices in H.
We design a suite of algorithms for this problem. Under a promise that
, our algorithms use at most four passes and together imply a
space bound of for each fixed , in order to
guarantee an estimate within with probability at least
. We also give a simpler -pass algorithm that achieves
space, where (respectively, ) denotes
the maximum number of -simplices that share a hyperedge (respectively, a
vertex). We complement these algorithmic results with space lower bounds of the
form , , and
for multi-pass algorithms and
for -pass algorithms, which show that some of the dependencies on parameters
in our upper bounds are nearly tight. Our techniques extend and generalize
several different ideas previously developed for triangle counting in graphs,
using appropriate innovations to handle the more complicated combinatorics of
hypergraphs.
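The quantity being estimated can be pinned down with an exact, non-streaming counter. This is only a reference sketch, assuming k is at least 2 and that a simplex is a set of k+1 vertices all of whose k-element subsets are hyperedges (for k = 2 this is a triangle). It stores the whole hypergraph, so it illustrates the definition rather than any of the space-efficient algorithms described above; `count_simplices` is a hypothetical name.

```python
from itertools import combinations


def count_simplices(hyperedges, k):
    """Exact number of k-simplices in a k-uniform hypergraph (k >= 2)."""
    edges = {frozenset(e) for e in hyperedges}
    edges = {e for e in edges if len(e) == k}  # ignore malformed hyperedges

    # Two distinct hyperedges of one simplex share k-1 vertices, so every
    # simplex appears as the union of some pair with k+1 vertices in total.
    candidates = set()
    for a, b in combinations(edges, 2):
        union = a | b
        if len(union) == k + 1:
            candidates.add(union)

    # A candidate is a simplex if all of its k-element subsets are hyperedges.
    return sum(
        1
        for x in candidates
        if all(frozenset(f) in edges for f in combinations(x, k))
    )
```

For example, the complete graph on four vertices with `k = 2` contains four triangles, and the four triples on four vertices with `k = 3` form a single simplex.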
Nearly Tight Spectral Sparsification of Directed Hypergraphs by a Simple Iterative Sampling Algorithm
Spectral hypergraph sparsification, which is an attempt to extend well-known
spectral graph sparsification to hypergraphs, has been extensively studied over
the past few years. For undirected hypergraphs, Kapralov, Krauthgamer, Tardos,
and Yoshida (2022) have recently obtained an algorithm for constructing an
-spectral sparsifier of optimal size, where
suppresses the and factors, while the optimal
sparsifier size has not been known for directed hypergraphs. In this paper, we
present the first algorithm for constructing an -spectral
sparsifier for a directed hypergraph with hyperarcs. This improves
the previous bound by Kapralov, Krauthgamer, Tardos, and Yoshida (2021), and it
is optimal up to the and factors since there is a
lower bound of even for directed graphs. For general directed
hypergraphs, we show the first non-trivial lower bound of
.
Our algorithm can be regarded as an extension of the spanner-based graph
sparsification by Koutis and Xu (2016). To exhibit the power of the
spanner-based approach, we also examine a natural extension of Koutis and Xu's
algorithm to undirected hypergraphs. We show that it outputs an
-spectral sparsifier of an undirected hypergraph with
hyperedges, where is the maximum size of a hyperedge. Our analysis of the
undirected case is based on that of Bansal, Svensson, and Trevisan (2019), and
the bound matches that of the hypergraph sparsification algorithm by Bansal et
al. We further show that our algorithm inherits advantages of the spanner-based
sparsification in that it is fast, can be implemented in parallel, and can be
converted to be fault-tolerant.
Counting and Sampling Small Structures in Graph and Hypergraph Data Streams
In this thesis, we explore the problem of approximating the number of elementary substructures called simplices in large k-uniform hypergraphs. The hypergraphs are assumed to be too large to be stored in memory, so we adopt a data stream model, where the hypergraph is defined by a sequence of hyperedges.
First, we propose an algorithm that (ε, δ)-estimates the number of simplices using O(m^(1+1/k) / T) bits of space. In addition, we prove that no constant-pass streaming algorithm can (ε, δ)-approximate the number of simplices using less than O(m^(1+1/k) / T) bits of space. Thus we resolve the space complexity of the simplex counting problem by providing an algorithm that matches the lower bound.
Second, we examine the triangle counting question, which is the case of a hypergraph where k = 2. We develop and analyze an almost optimal O(n + m^(3/2) / T) triangle-counting algorithm based on ideas introduced in [KMPT12]. The proposed algorithm is subsequently used to establish a method for uniformly sampling triangles in a graph stream using O(m^(3/2) / T) bits of space, which beats the state-of-the-art O(mn / T) algorithm given by [PTTW13].
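An (ε, δ)-estimate is one that lies within a (1 ± ε) factor of the true value with probability at least 1 − δ. A standard way to obtain such a guarantee from a basic unbiased estimator is the median-of-means technique, sketched below. This is a generic illustration, not necessarily the method used in the thesis, and the function name and grouping scheme are assumptions.

```python
from statistics import mean, median


def median_of_means(estimates, groups):
    """Combine independent estimates: average within groups, take the median.

    Averaging reduces variance (controls ε); the median over groups makes a
    few bad groups harmless (controls δ). Leftover estimates that do not
    fill a whole group are ignored.
    """
    if groups <= 0 or len(estimates) < groups:
        raise ValueError("need at least one estimate per group")
    size = len(estimates) // groups
    means = [
        mean(estimates[i * size:(i + 1) * size]) for i in range(groups)
    ]
    return median(means)
```

In a streaming setting, each entry of `estimates` would come from an independent copy of the basic estimator run in parallel on the same stream.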