343 research outputs found

Counting Hypergraphs in Data Streams

Author: Sun He
Publication date: 28/04/2013

We present the first streaming algorithm for counting an arbitrary hypergraph $H$ of constant size in a massive hypergraph $G$. Our algorithm can handle both edge-insertions and edge-deletions, and is applicable for the distributed setting. Moreover, our approach provides the first family of graph polynomials for the hypergraph counting problem. Because of the close relationship between hypergraphs and set systems, our approach may have applications in studying similar problems.

arXiv.org e-Print Archive

MPG.PuRe

Explore Bristol Research

The Sketching Complexity of Graph and Hypergraph Counting

Author: Kallaugher John
Kapralov Michael
Price Eric
Publication date: 15/08/2018

Subgraph counting is a fundamental primitive in graph processing, with applications in social network analysis (e.g., estimating the clustering coefficient of a graph), database processing and other areas. The space complexity of subgraph counting has been studied extensively in the literature, but many natural settings are still not well understood. In this paper we revisit the subgraph (and hypergraph) counting problem in the sketching model, where the algorithm's state as it processes a stream of updates to the graph is a linear function of the stream. This model has recently received a lot of attention in the literature, and has become a standard model for solving dynamic graph streaming problems. In this paper we give a tight bound on the sketching complexity of counting the number of occurrences of a small subgraph $H$ in a bounded degree graph $G$ presented as a stream of edge updates. Specifically, we show that the space complexity of the problem is governed by the fractional vertex cover number of the graph $H$. Our subgraph counting algorithm implements a natural vertex sampling approach, with sampling probabilities governed by the vertex cover of $H$. Our main technical contribution lies in a new set of Fourier analytic tools that we develop to analyze multiplayer communication protocols in the simultaneous communication model, allowing us to prove a tight lower bound. We believe that our techniques are likely to find applications in other settings. Besides giving tight bounds for all graphs $H$, both our algorithm and lower bounds extend to the hypergraph setting, albeit with some loss in space complexity.
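The vertex sampling idea mentioned in this abstract can be illustrated with a much simpler estimator than the one in the paper. The sketch below assumes the pattern $H$ is a triangle, samples every vertex with the same probability `p` (the paper instead ties the probabilities to a vertex cover of $H$), and handles insertions only. The function name `estimate_triangles` and its parameters are hypothetical, chosen for this illustration.

```python
import random
from itertools import combinations


def estimate_triangles(edge_stream, p, seed=0):
    """Estimate the triangle count of a graph from an insertion-only edge stream.

    Simplified illustration (an assumption, not the paper's algorithm):
    every vertex is kept independently with probability p, only edges whose
    two endpoints are both kept are stored, and the number of triangles in
    the stored subgraph is scaled by 1 / p**3.
    """
    rng = random.Random(seed)
    sampled = {}  # vertex -> bool, decided once when the vertex is first queried

    def keep(v):
        if v not in sampled:
            sampled[v] = rng.random() < p
        return sampled[v]

    adj = {}  # adjacency sets of the sampled subgraph
    for u, v in edge_stream:
        if u != v and keep(u) and keep(v):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    # Each triangle is found once from each of its three vertices.
    found = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                found += 1
    return (found // 3) / p ** 3
```

A triangle survives sampling exactly when its three vertices are all kept, which happens with probability `p**3`, so the scaled count is an unbiased estimate. With `p = 1.0` the function stores the whole graph and returns the exact count.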

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Sketching Cuts in Graphs and Hypergraphs

Author: Kogan Dmitry
Krauthgamer Robert
Publication date: 08/09/2014

Sketching and streaming algorithms are in the forefront of current research directions for cut problems in graphs. In the streaming model, we show that $(1-\epsilon)$-approximation for Max-Cut must use $n^{1-O(\epsilon)}$ space; moreover, beating $4/5$-approximation requires polynomial space. For the sketching model, we show that $r$-uniform hypergraphs admit a $(1+\epsilon)$-cut-sparsifier (i.e., a weighted subhypergraph that approximately preserves all the cuts) with $O(\epsilon^{-2} n (r+\log n))$ edges. We also make first steps towards sketching general CSPs (Constraint Satisfaction Problems).
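To make the notion of a cut in a hypergraph concrete, here is a minimal sketch. It assumes the usual convention that a hyperedge is cut by a vertex set `S` when it has at least one vertex inside `S` and at least one outside. The helper names `cut_weight` and `preserves_cut`, and the two-sided $(1\pm\epsilon)$ tolerance check, are assumptions made for illustration and are not taken from the paper.

```python
def cut_weight(weighted_edges, S):
    """Total weight of the hyperedges that cross the cut (S, V \\ S).

    weighted_edges is an iterable of (hyperedge, weight) pairs, where each
    hyperedge is an iterable of vertices.
    """
    S = set(S)
    total = 0.0
    for edge, weight in weighted_edges:
        edge = set(edge)
        inside = len(edge & S)
        if 0 < inside < len(edge):  # vertices on both sides of the cut
            total += weight
    return total


def preserves_cut(original, sparsifier, S, eps):
    """Check one cut: is the sparsifier's value within (1 +/- eps) of the original?"""
    a = cut_weight(original, S)
    b = cut_weight(sparsifier, S)
    return (1 - eps) * a <= b <= (1 + eps) * a
```

A cut-sparsifier has to pass this check for every subset `S` at once. Checking all subsets takes exponential time, so the helper is useful only for spot checks on small examples.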

arXiv.org e-Print Archive

CiteSeerX

Counting and Sampling Small Structures in Graph and Hypergraph Data Streams

Author: Haris Themistoklis
Publication venue: Dartmouth Digital Commons
Publication date: 06/06/2021

In this thesis, we explore the problem of approximating the number of elementary substructures called simplices in large $k$-uniform hypergraphs. The hypergraphs are assumed to be too large to be stored in memory, so we adopt a data stream model, where the hypergraph is defined by a sequence of hyperedges. First we propose an algorithm that (ε, δ)-estimates the number of simplices using $O(m^{1+1/k}/T)$ bits of space. In addition, we prove that no constant-pass streaming algorithm can (ε, δ)-approximate the number of simplices using less than $O(m^{1+1/k}/T)$ bits of space. Thus we resolve the space complexity of the simplex counting problem by providing an algorithm that matches the lower bound. Second, we examine the triangle counting question, i.e., the case of a hypergraph where $k = 2$. We develop and analyze an almost optimal $O(n + m^{3/2}/T)$ triangle-counting algorithm based on ideas introduced in [KMPT12]. The proposed algorithm is subsequently used to establish a method for uniformly sampling triangles in a graph stream using $O(m^{3/2}/T)$ bits of space, which beats the state-of-the-art $O(mn/T)$ algorithm given by [PTTW13].

Dartmouth Digital Commons (Dartmouth College)

Counting Simplices in Hypergraph Streams

Author: Chakrabarti Amit
Haris Themistoklis
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 21/12/2021

We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a $k$-uniform hypergraph $H$ with $n$ vertices and $m$ hyperedges. A $k$-simplex in $H$ is a subhypergraph on $k+1$ vertices $X$ such that all $k+1$ possible hyperedges among $X$ exist in $H$. The goal is to process a stream of hyperedges of $H$ and compute a good estimate of $T_k(H)$, the number of $k$-simplices in $H$. We design a suite of algorithms for this problem. Under a promise that $T_k(H) \ge T$, our algorithms use at most four passes and together imply a space bound of $O( \epsilon^{-2} \log\delta^{-1} \text{polylog} n \cdot \min\{ m^{1+1/k}/T, m/T^{2/(k+1)} \} )$ for each fixed $k \ge 3$, in order to guarantee an estimate within $(1\pm\epsilon)T_k(H)$ with probability at least $1-\delta$. We also give a simpler $1$-pass algorithm that achieves $O(\epsilon^{-2} \log\delta^{-1} \log n\cdot (m/T) ( \Delta_E + \Delta_V^{1-1/k} ))$ space, where $\Delta_E$ (respectively, $\Delta_V$) denotes the maximum number of $k$-simplices that share a hyperedge (respectively, a vertex). We complement these algorithmic results with space lower bounds of the form $\Omega(\epsilon^{-2})$, $\Omega(m^{1+1/k}/T)$, $\Omega(m/T^{1-1/k})$, and $\Omega(m\Delta_V^{1/k}/T)$ for multi-pass algorithms and $\Omega(m\Delta_E/T)$ for $1$-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs.
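The definition of a $k$-simplex given in this abstract can be checked directly on small inputs. The following is a brute-force reference counter for $T_k(H)$. It stores the whole hypergraph in memory, so it is not one of the streaming algorithms described above; the function name `count_simplices` is hypothetical.

```python
from itertools import combinations


def count_simplices(hyperedges, k):
    """Count k-simplices in a k-uniform hypergraph by exhaustive search.

    A k-simplex is a set X of k+1 vertices such that every k-element
    subset of X is a hyperedge. The running time is exponential in k,
    so this is meant only for checking small examples.
    """
    edges = {frozenset(e) for e in hyperedges}
    if any(len(e) != k for e in edges):
        raise ValueError("every hyperedge must contain exactly k vertices")
    vertices = sorted(set().union(*edges)) if edges else []
    count = 0
    for X in combinations(vertices, k + 1):
        if all(frozenset(face) in edges for face in combinations(X, k)):
            count += 1
    return count
```

For `k = 2` this reduces to counting triangles in an ordinary graph. For `k = 3` it counts vertex quadruples whose four triples are all present.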

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Hypergraph Motifs and Their Extensions Beyond Binary

Author: Kim Hyunju
Ko Jihoon
Lee Geon
Shin Kijung
Yoon Seokbum
Publication date: 24/10/2023

Hypergraphs naturally represent group interactions, which are omnipresent in many domains: collaborations of researchers, co-purchases of items, and joint interactions of proteins, to name a few. In this work, we propose tools for answering the following questions: (Q1) what are the structural design principles of real-world hypergraphs? (Q2) how can we compare local structures of hypergraphs of different sizes? (Q3) how can we identify the domains that hypergraphs come from? We first define hypergraph motifs (h-motifs), which describe the overlapping patterns of three connected hyperedges. Then, we define the significance of each h-motif in a hypergraph as its occurrences relative to those in properly randomized hypergraphs. Lastly, we define the characteristic profile (CP) as the vector of the normalized significance of every h-motif. Regarding Q1, we find that h-motifs' occurrences in 11 real-world hypergraphs from 5 domains are clearly distinguished from those of randomized hypergraphs. Then, we demonstrate that CPs capture local structural patterns unique to each domain, and thus comparing CPs of hypergraphs addresses Q2 and Q3. The concept of CP is extended to represent the connectivity pattern of each node or hyperedge as a vector, which proves useful in node classification and hyperedge prediction. Our algorithmic contribution is to propose MoCHy, a family of parallel algorithms for counting h-motifs' occurrences in a hypergraph. We theoretically analyze their speed and accuracy and show empirically that the advanced approximate version MoCHy-A+ is more accurate and faster than the basic approximate and exact versions, respectively. Furthermore, we explore ternary hypergraph motifs that extend h-motifs by taking into account not only the presence but also the cardinality of intersections among hyperedges. This extension proves beneficial for all previously mentioned applications. Comment: Extended version of VLDB 2020 paper arXiv:2003.0185
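One way to make "overlapping patterns of three connected hyperedges" concrete is to record which of the seven regions of the three-set Venn diagram are non-empty. The sketch below does that. The region ordering, the function names, and the connectivity test are assumptions made for illustration, since the abstract does not spell out the encoding, and this is not the MoCHy implementation.

```python
def overlap_pattern(a, b, c):
    """Return a 7-tuple of 0/1 flags, one per Venn region of hyperedges a, b, c.

    Region order (an arbitrary choice for this sketch):
    a only, b only, c only, a&b only, b&c only, a&c only, a&b&c.
    """
    a, b, c = set(a), set(b), set(c)
    regions = (
        a - b - c,
        b - a - c,
        c - a - b,
        (a & b) - c,
        (b & c) - a,
        (a & c) - b,
        a & b & c,
    )
    return tuple(int(bool(region)) for region in regions)


def is_connected_triple(a, b, c):
    """Three hyperedges are connected if at least two of the three pairs share a vertex."""
    a, b, c = set(a), set(b), set(c)
    overlaps = [bool(a & b), bool(b & c), bool(a & c)]
    return sum(overlaps) >= 2
```

Triples that are identical up to relabeling the three hyperedges would normally be grouped into a single motif. That grouping step is omitted here.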

arXiv.org e-Print Archive