
    The Sketching Complexity of Graph and Hypergraph Counting

    Subgraph counting is a fundamental primitive in graph processing, with applications in social network analysis (e.g., estimating the clustering coefficient of a graph), database processing and other areas. The space complexity of subgraph counting has been studied extensively in the literature, but many natural settings are still not well understood. In this paper we revisit the subgraph (and hypergraph) counting problem in the sketching model, where the algorithm's state as it processes a stream of updates to the graph is a linear function of the stream. This model has recently received a lot of attention in the literature, and has become a standard model for solving dynamic graph streaming problems. In this paper we give a tight bound on the sketching complexity of counting the number of occurrences of a small subgraph $H$ in a bounded degree graph $G$ presented as a stream of edge updates. Specifically, we show that the space complexity of the problem is governed by the fractional vertex cover number of the graph $H$. Our subgraph counting algorithm implements a natural vertex sampling approach, with sampling probabilities governed by the vertex cover of $H$. Our main technical contribution lies in a new set of Fourier analytic tools that we develop to analyze multiplayer communication protocols in the simultaneous communication model, allowing us to prove a tight lower bound. We believe that our techniques are likely to find applications in other settings. Besides giving tight bounds for all graphs $H$, both our algorithm and lower bounds extend to the hypergraph setting, albeit with some loss in space complexity.
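
    For readers unfamiliar with the fractional vertex cover number named in the abstract above, the following is a minimal sketch of how that quantity can be computed for a small graph. It is not taken from the paper and does not implement its sketching algorithm. The function name `fractional_vertex_cover_number` is hypothetical, and the brute force relies on the standard fact that the vertex cover linear program of a graph (not a hypergraph) always has an optimal solution with all weights in {0, 1/2, 1}.

```python
from fractions import Fraction
from itertools import product


def fractional_vertex_cover_number(edges):
    """Minimum total weight x_v >= 0 with x_u + x_v >= 1 for every edge (u, v).

    Hypothetical illustrative helper, not from the paper. It searches only
    half-integral weights, which is enough for ordinary graphs.
    """
    vertices = sorted({v for e in edges for v in e})
    choices = (Fraction(0), Fraction(1, 2), Fraction(1))
    best = None
    for weights in product(choices, repeat=len(vertices)):
        x = dict(zip(vertices, weights))
        # Keep the assignment only if every edge is fractionally covered.
        if all(x[u] + x[v] >= 1 for u, v in edges):
            total = sum(weights, Fraction(0))
            if best is None or total < best:
                best = total
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
print(fractional_vertex_cover_number(triangle))  # prints 3/2
```

    The search is exponential in the number of vertices of $H$, which is acceptable only because $H$ is assumed to be a small pattern graph.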

    Sketching Cuts in Graphs and Hypergraphs

    Sketching and streaming algorithms are in the forefront of current research directions for cut problems in graphs. In the streaming model, we show that $(1-\epsilon)$-approximation for Max-Cut must use $n^{1-O(\epsilon)}$ space; moreover, beating $4/5$-approximation requires polynomial space. For the sketching model, we show that $r$-uniform hypergraphs admit a $(1+\epsilon)$-cut-sparsifier (i.e., a weighted subhypergraph that approximately preserves all the cuts) with $O(\epsilon^{-2} n (r+\log n))$ edges. We also make first steps towards sketching general CSPs (Constraint Satisfaction Problems).

    Counting Simplices in Hypergraph Streams

    We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a $k$-uniform hypergraph $H$ with $n$ vertices and $m$ hyperedges. A $k$-simplex in $H$ is a subhypergraph on $k+1$ vertices $X$ such that all $k+1$ possible hyperedges among $X$ exist in $H$. The goal is to process a stream of hyperedges of $H$ and compute a good estimate of $T_k(H)$, the number of $k$-simplices in $H$. We design a suite of algorithms for this problem. Under a promise that $T_k(H) \ge T$, our algorithms use at most four passes and together imply a space bound of $O( \epsilon^{-2} \log\delta^{-1} \text{polylog} n \cdot \min\{ m^{1+1/k}/T, m/T^{2/(k+1)} \} )$ for each fixed $k \ge 3$, in order to guarantee an estimate within $(1\pm\epsilon)T_k(H)$ with probability at least $1-\delta$. We also give a simpler $1$-pass algorithm that achieves $O(\epsilon^{-2} \log\delta^{-1} \log n\cdot (m/T) ( \Delta_E + \Delta_V^{1-1/k} ))$ space, where $\Delta_E$ (respectively, $\Delta_V$) denotes the maximum number of $k$-simplices that share a hyperedge (respectively, a vertex). We complement these algorithmic results with space lower bounds of the form $\Omega(\epsilon^{-2})$, $\Omega(m^{1+1/k}/T)$, $\Omega(m/T^{1-1/k})$ and $\Omega(m\Delta_V^{1/k}/T)$ for multi-pass algorithms and $\Omega(m\Delta_E/T)$ for $1$-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs.
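
    The definition of a $k$-simplex given in the abstract above can be made concrete with a short exact counter. This is only an offline brute-force reference under stated assumptions, not any of the paper's streaming algorithms: it stores every hyperedge in memory, and the function name `count_simplices` is hypothetical.

```python
from itertools import combinations


def count_simplices(hyperedges, k):
    """Exactly count k-simplices in a k-uniform hypergraph.

    Hypothetical offline reference, not a streaming algorithm: it keeps
    all hyperedges in memory and tries every set of k+1 vertices.
    """
    edge_set = {frozenset(e) for e in hyperedges}
    vertices = sorted({v for e in edge_set for v in e})
    count = 0
    for candidate in combinations(vertices, k + 1):
        # A k-simplex needs all k+1 size-k subsets of the candidate
        # vertex set to be present as hyperedges.
        if all(frozenset(face) in edge_set for face in combinations(candidate, k)):
            count += 1
    return count


# Complete 3-uniform hypergraph on 5 vertices: every 4-subset is a 3-simplex.
edges = list(combinations(range(5), 3))
print(count_simplices(edges, 3))  # prints 5
```

    With $k = 2$ the same function counts triangles in an ordinary graph, which matches the abstract's description of simplex counting as a generalization of triangle counting.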

    Nearly Tight Spectral Sparsification of Directed Hypergraphs by a Simple Iterative Sampling Algorithm

    Spectral hypergraph sparsification, which is an attempt to extend well-known spectral graph sparsification to hypergraphs, has been extensively studied over the past few years. For undirected hypergraphs, Kapralov, Krauthgamer, Tardos, and Yoshida (2022) have recently obtained an algorithm for constructing an $\varepsilon$-spectral sparsifier of optimal $O^*(n)$ size, where $O^*$ suppresses the $\varepsilon^{-1}$ and $\log n$ factors, while the optimal sparsifier size has not been known for directed hypergraphs. In this paper, we present the first algorithm for constructing an $\varepsilon$-spectral sparsifier for a directed hypergraph with $O^*(n^2)$ hyperarcs. This improves the previous bound by Kapralov, Krauthgamer, Tardos, and Yoshida (2021), and it is optimal up to the $\varepsilon^{-1}$ and $\log n$ factors since there is a lower bound of $\Omega(n^2)$ even for directed graphs. For general directed hypergraphs, we show the first non-trivial lower bound of $\Omega(n^2/\varepsilon)$. Our algorithm can be regarded as an extension of the spanner-based graph sparsification by Koutis and Xu (2016). To exhibit the power of the spanner-based approach, we also examine a natural extension of Koutis and Xu's algorithm to undirected hypergraphs. We show that it outputs an $\varepsilon$-spectral sparsifier of an undirected hypergraph with $O^*(nr^3)$ hyperedges, where $r$ is the maximum size of a hyperedge. Our analysis of the undirected case is based on that of Bansal, Svensson, and Trevisan (2019), and the bound matches that of the hypergraph sparsification algorithm by Bansal et al. We further show that our algorithm inherits advantages of the spanner-based sparsification in that it is fast, can be implemented in parallel, and can be converted to be fault-tolerant.

    Counting and Sampling Small Structures in Graph and Hypergraph Data Streams

    In this thesis, we explore the problem of approximating the number of elementary substructures called simplices in large $k$-uniform hypergraphs. The hypergraphs are assumed to be too large to be stored in memory, so we adopt a data stream model, where the hypergraph is defined by a sequence of hyperedges. First, we propose an algorithm that $(\varepsilon, \delta)$-estimates the number of simplices using $O(m^{1+1/k}/T)$ bits of space. In addition, we prove that no constant-pass streaming algorithm can $(\varepsilon, \delta)$-approximate the number of simplices using less than $O(m^{1+1/k}/T)$ bits of space. Thus we resolve the space complexity of the simplex counting problem by providing an algorithm that matches the lower bound. Second, we examine the triangle counting question, which corresponds to a hypergraph with $k = 2$. We develop and analyze an almost optimal $O(n + m^{3/2}/T)$ triangle-counting algorithm based on ideas introduced in [KMPT12]. The proposed algorithm is subsequently used to establish a method for uniformly sampling triangles in a graph stream using $O(m^{3/2}/T)$ bits of space, which beats the state-of-the-art $O(mn/T)$ algorithm given by [PTTW13].

    Nearly Tight Spectral Sparsification of Directed Hypergraphs
