20 research outputs found

    A second look at counting triangles in graph streams (corrected)

    In this paper we present improved results on the problem of counting triangles in edge streamed graphs. For graphs with $m$ edges and at least $T$ triangles, we show that an extra look over the stream yields a two-pass streaming algorithm that uses $O(m/(\epsilon^{4.5}\sqrt{T}))$ space and outputs a $(1+\epsilon)$-approximation of the number of triangles in the graph. This improves upon the two-pass streaming tester of Braverman, Ostrovsky and Vilenchik, ICALP 2013, which distinguishes between triangle-free graphs and graphs with at least $T$ triangles using $O(m/T^{1/3})$ space. Also, in terms of dependence on $T$, we show that more passes would not lead to a better space bound. In other words, we prove there is no constant-pass streaming algorithm that distinguishes triangle-free graphs from graphs with at least $T$ triangles using $O(m/T^{1/2+\rho})$ space for any constant $\rho \ge 0$

    FLEET: Butterfly Estimation from a Bipartite Graph Stream

    We consider space-efficient single-pass estimation of the number of butterflies, a fundamental bipartite graph motif, from a massive bipartite graph stream where each edge represents a connection between entities in two different partitions. We present a space lower bound for any streaming algorithm that can estimate the number of butterflies accurately, as well as FLEET, a suite of algorithms for accurately estimating the number of butterflies in the graph stream. Estimates returned by the algorithms come with provable guarantees on the approximation error, and experiments show good tradeoffs between the space used and the accuracy of approximation. We also present space-efficient algorithms for estimating the number of butterflies within a sliding window of the most recent elements in the stream. While there is a significant body of work on counting subgraphs such as triangles in a unipartite graph stream, our work seems to be one of the few to tackle the case of bipartite graph streams. Comment: This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Seyed-Vahid Sanei-Mehri, Yu Zhang, Ahmet Erdem Sariyuce and Srikanta Tirthapura. "FLEET: Butterfly Estimation from a Bipartite Graph Stream". The 28th ACM International Conference on Information and Knowledge Managemen
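To make the quantity being estimated concrete, here is a minimal exact baseline, not the FLEET algorithms themselves. It assumes the standard definition of a butterfly as a 2×2 biclique (two left vertices and two right vertices with all four edges present), which the abstract does not spell out. The function name `count_butterflies` and the `(left, right)` edge format are illustrative choices only.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Exactly count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (left, right) pairs; this input format is
    an assumption made for the sketch. The whole graph is held in
    memory, so this is a baseline, not a streaming estimator.
    """
    # For every right vertex, collect its left neighbours.
    left_of = defaultdict(set)
    for u, v in edges:
        left_of[v].add(u)

    # common[(u, w)] = number of right vertices adjacent to both u and w.
    common = defaultdict(int)
    for lefts in left_of.values():
        for pair in combinations(sorted(lefts), 2):
            common[pair] += 1

    # Any two shared right neighbours of a left pair form one butterfly.
    return sum(c * (c - 1) // 2 for c in common.values())


# Complete bipartite graph K_{2,3}: left {a, b}, right {x, y, z}.
k23 = [(l, r) for l in "ab" for r in "xyz"]
print(count_butterflies(k23))
```

A streaming estimator would replace the exact `common` table with a sample of the stream; the exact count above is only useful as ground truth on small inputs.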

    Triangle Estimation Using Tripartite Independent Set Queries

    Estimating the number of triangles in a graph is one of the most fundamental problems in sublinear algorithms. In this work, we provide an approximate triangle counting algorithm using only polylogarithmic queries when the number of triangles on any edge in the graph is polylogarithmically bounded. Our query oracle Tripartite Independent Set (TIS) takes three disjoint sets of vertices A, B and C as input, and answers whether there exists a triangle having one endpoint in each of these three sets. Our query model generally belongs to the class of group queries (Ron and Tsur, ACM ToCT, 2016; Dell and Lapinskas, STOC 2018) and in particular is inspired by the Bipartite Independent Set (BIS) query oracle of Beame et al. (ITCS 2018). We extend the algorithmic framework of Beame et al., with TIS replacing BIS, for triangle counting using ideas from color coding due to Alon et al. (J. ACM, 1995) and a concentration inequality for sums of random variables with bounded dependency (Janson, Rand. Struct. Alg., 2004)
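The TIS oracle described in this abstract can be illustrated with a brute-force sketch. This only shows what a single query answers; it says nothing about how the paper's algorithm chooses its queries. The adjacency-dictionary representation and the name `tis_query` are assumptions made for the example.

```python
from itertools import product


def tis_query(adj, A, B, C):
    """Answer one Tripartite Independent Set (TIS) query by brute force.

    Returns True if some triangle has one endpoint in each of the
    pairwise disjoint vertex sets A, B and C. `adj` maps each vertex to
    the set of its neighbours (an assumed representation).
    """
    A, B, C = set(A), set(B), set(C)
    if A & B or A & C or B & C:
        raise ValueError("A, B and C must be pairwise disjoint")

    for a, b in product(A, B):
        if b not in adj.get(a, ()):
            continue  # (a, b) is not an edge, so it cannot be a triangle side
        # Any c in C adjacent to both a and b closes the triangle a-b-c.
        if any(c in adj.get(a, ()) and c in adj.get(b, ()) for c in C):
            return True
    return False


# Example graph: triangle 1-2-3 plus a pendant edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(tis_query(adj, {1}, {2}, {3}))
print(tis_query(adj, {1}, {2}, {4}))
```

In the query model the oracle is a black box, and the cost measure is the number of calls rather than the work done inside each call.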

    Counting Simplices in Hypergraph Streams

    We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a $k$-uniform hypergraph $H$ with $n$ vertices and $m$ hyperedges. A $k$-simplex in $H$ is a subhypergraph on $k+1$ vertices $X$ such that all $k+1$ possible hyperedges among $X$ exist in $H$. The goal is to process a stream of hyperedges of $H$ and compute a good estimate of $T_k(H)$, the number of $k$-simplices in $H$. We design a suite of algorithms for this problem. Under a promise that $T_k(H) \ge T$, our algorithms use at most four passes and together imply a space bound of $O( \epsilon^{-2} \log\delta^{-1} \,\text{polylog}\, n \cdot \min\{ m^{1+1/k}/T, m/T^{2/(k+1)} \} )$ for each fixed $k \ge 3$, in order to guarantee an estimate within $(1\pm\epsilon)T_k(H)$ with probability at least $1-\delta$. We also give a simpler $1$-pass algorithm that achieves $O(\epsilon^{-2} \log\delta^{-1} \log n\cdot (m/T) ( \Delta_E + \Delta_V^{1-1/k} ))$ space, where $\Delta_E$ (respectively, $\Delta_V$) denotes the maximum number of $k$-simplices that share a hyperedge (respectively, a vertex). We complement these algorithmic results with space lower bounds of the form $\Omega(\epsilon^{-2})$, $\Omega(m^{1+1/k}/T)$, $\Omega(m/T^{1-1/k})$ and $\Omega(m\Delta_V^{1/k}/T)$ for multi-pass algorithms and $\Omega(m\Delta_E/T)$ for $1$-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs
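The definition of a k-simplex given in this abstract can be checked directly on small inputs. The following is a brute-force sketch of the exact count T_k(H), not one of the paper's streaming algorithms; the function name `count_simplices` and the list-of-tuples input format are assumptions for illustration.

```python
from itertools import combinations


def count_simplices(hyperedges, k):
    """Exactly count k-simplices in a k-uniform hypergraph.

    A k-simplex is a set X of k+1 vertices such that all k+1 size-k
    subsets of X are hyperedges. This enumerates every (k+1)-subset of
    the vertex set, so it is only practical for tiny hypergraphs.
    """
    edges = {frozenset(e) for e in hyperedges}
    if any(len(e) != k for e in edges):
        raise ValueError("every hyperedge must contain exactly k vertices")

    vertices = sorted(set().union(*edges))
    count = 0
    for X in combinations(vertices, k + 1):
        # X is a simplex iff each of its k+1 size-k subsets is present.
        if all(frozenset(S) in edges for S in combinations(X, k)):
            count += 1
    return count


# 3-uniform example: all four triples on {1, 2, 3, 4} form one 3-simplex.
H = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
print(count_simplices(H, 3))
```

With k = 2 the same function counts triangles in an ordinary graph, which matches the abstract's description of the problem as a generalization of triangle counting.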

    The Sketching Complexity of Graph and Hypergraph Counting

    Subgraph counting is a fundamental primitive in graph processing, with applications in social network analysis (e.g., estimating the clustering coefficient of a graph), database processing and other areas. The space complexity of subgraph counting has been studied extensively in the literature, but many natural settings are still not well understood. In this paper we revisit the subgraph (and hypergraph) counting problem in the sketching model, where the algorithm's state as it processes a stream of updates to the graph is a linear function of the stream. This model has recently received a lot of attention in the literature, and has become a standard model for solving dynamic graph streaming problems. In this paper we give a tight bound on the sketching complexity of counting the number of occurrences of a small subgraph $H$ in a bounded degree graph $G$ presented as a stream of edge updates. Specifically, we show that the space complexity of the problem is governed by the fractional vertex cover number of the graph $H$. Our subgraph counting algorithm implements a natural vertex sampling approach, with sampling probabilities governed by the vertex cover of $H$. Our main technical contribution lies in a new set of Fourier analytic tools that we develop to analyze multiplayer communication protocols in the simultaneous communication model, allowing us to prove a tight lower bound. We believe that our techniques are likely to find applications in other settings. Besides giving tight bounds for all graphs $H$, both our algorithm and lower bounds extend to the hypergraph setting, albeit with some loss in space complexity

    Towards Optimal Dynamic Indexes for Approximate (and Exact) Triangle Counting


    A second look at counting triangles in graph streams

    In this paper we present improved results on the problem of counting triangles in edge streamed graphs. For graphs with $m$ edges and at least $T$ triangles, we show that an extra look over the stream yields a two-pass streaming algorithm that uses $O(m/(\epsilon^{4.5}\sqrt{T}))$ space and outputs a $(1+\epsilon)$-approximation of the number of triangles in the graph. This improves upon the two-pass streaming tester of Braverman, Ostrovsky and Vilenchik, ICALP 2013, which distinguishes between triangle-free graphs and graphs with at least $T$ triangles using $O(m/T^{1/3})$ space. Also, in terms of dependence on $T$, we show that more passes would not lead to a better space bound. In other words, we prove there is no constant-pass streaming algorithm that distinguishes triangle-free graphs from graphs with at least $T$ triangles using $O(m/T^{1/2+\rho})$ space for any constant $\rho \ge 0$