287 research outputs found

    FLEET: Butterfly Estimation from a Bipartite Graph Stream

    Full text link
    We consider space-efficient single-pass estimation of the number of butterflies, a fundamental bipartite graph motif, from a massive bipartite graph stream where each edge represents a connection between entities in two different partitions. We present a space lower bound for any streaming algorithm that can estimate the number of butterflies accurately, as well as FLEET, a suite of algorithms for accurately estimating the number of butterflies in the graph stream. Estimates returned by the algorithms come with provable guarantees on the approximation error, and experiments show good tradeoffs between the space used and the accuracy of approximation. We also present space-efficient algorithms for estimating the number of butterflies within a sliding window of the most recent elements in the stream. While there is a significant body of work on counting subgraphs such as triangles in a unipartite graph stream, our work seems to be one of the few to tackle the case of bipartite graph streams.Comment: This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Seyed-Vahid Sanei-Mehri, Yu Zhang, Ahmet Erdem Sariyuce and Srikanta Tirthapura. "FLEET: Butterfly Estimation from a Bipartite Graph Stream". The 28th ACM International Conference on Information and Knowledge Managemen

    Efficient Sampling Algorithms for Approximate Motif Counting in Temporal Graph Streams

    Full text link
    A great variety of complex systems, from user interactions in communication networks to transactions in financial markets, can be modeled as temporal graphs consisting of a set of vertices and a series of timestamped and directed edges. Temporal motifs are generalized from subgraph patterns in static graphs which consider edge orderings and durations in addition to topologies. Counting the number of occurrences of temporal motifs is a fundamental problem for temporal network analysis. However, existing methods either cannot support temporal motifs or suffer from performance issues. Moreover, they cannot work in the streaming model where edges are observed incrementally over time. In this paper, we focus on approximate temporal motif counting via random sampling. We first propose two sampling algorithms for temporal motif counting in the offline setting. The first is an edge sampling (ES) algorithm for estimating the number of instances of any temporal motif. The second is an improved edge-wedge sampling (EWS) algorithm that hybridizes edge sampling with wedge sampling for counting temporal motifs with 33 vertices and 33 edges. Furthermore, we propose two algorithms to count temporal motifs incrementally in temporal graph streams by extending the ES and EWS algorithms referred to as SES and SEWS. We provide comprehensive analyses of the theoretical bounds and complexities of our proposed algorithms. Finally, we perform extensive experimental evaluations of our proposed algorithms on several real-world temporal graphs. The results show that ES and EWS have higher efficiency, better accuracy, and greater scalability than state-of-the-art sampling methods for temporal motif counting in the offline setting. Moreover, SES and SEWS achieve up to three orders of magnitude speedups over ES and EWS while having comparable estimation errors for temporal motif counting in the streaming setting.Comment: 27 pages, 11 figures; overlapped with arXiv:2007.1402

    Butterfly Counting in Bipartite Networks

    Full text link
    We consider the problem of counting motifs in bipartite affiliation networks, such as author-paper, user-product, and actor-movie relations. We focus on counting the number of occurrences of a "butterfly", a complete 2×22 \times 2 biclique, the simplest cohesive higher-order structure in a bipartite graph. Our main contribution is a suite of randomized algorithms that can quickly approximate the number of butterflies in a graph with a provable guarantee on accuracy. An experimental evaluation on large real-world networks shows that our algorithms return accurate estimates within a few seconds, even for networks with trillions of butterflies and hundreds of millions of edges.Comment: 28 pages, 5 tables, 6 figure
    • …
    corecore