    Linear Time Subgraph Counting, Graph Degeneracy, and the Chasm at Size Six

    We consider the problem of counting all k-vertex subgraphs in an input graph, for any constant k. This problem (denoted SUB-CNT_k) has been studied extensively in both theory and practice. In a classic result, Chiba and Nishizeki (SICOMP 85) gave linear time algorithms for clique and 4-cycle counting for bounded degeneracy graphs. This is a rich class of sparse graphs that contains, for example, all minor-free families and preferential attachment graphs. The techniques from this result have inspired a number of recent practical algorithms for SUB-CNT_k. Towards a better understanding of the limits of these techniques, we ask: for what values of k can SUB-CNT_k be solved in linear time? We discover a chasm at k=6. Specifically, we prove that for k < 6, SUB-CNT_k can be solved in linear time. Assuming a standard conjecture in fine-grained complexity, we prove that for all k ≥ 6, SUB-CNT_k cannot be solved even in near-linear time.
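
    As a rough illustration of the degeneracy-based technique the abstract refers to, the sketch below computes a degeneracy ordering by repeatedly peeling a minimum-degree vertex, orients each edge from earlier to later in that ordering, and counts triangles by intersecting out-neighborhoods. It is not the paper's algorithm for general k < 6. The function names and the dict-of-sets graph representation are assumptions made for this example.

```python
from collections import defaultdict


def degeneracy_ordering(adj):
    """Peel minimum-degree vertices; return (removal order, degeneracy).

    `adj` maps each vertex to the set of its neighbors (undirected, simple).
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    buckets = defaultdict(set)  # current degree -> vertices with that degree
    for v, d in deg.items():
        buckets[d].add(v)

    order, removed, degeneracy = [], set(), 0
    d = 0
    for _ in range(len(adj)):
        # After a removal the minimum degree drops by at most one,
        # so the scan can resume from d - 1 instead of from zero.
        while not buckets[d]:
            d += 1
        v = buckets[d].pop()
        degeneracy = max(degeneracy, d)
        order.append(v)
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                buckets[deg[u]].discard(u)
                deg[u] -= 1
                buckets[deg[u]].add(u)
        d = max(d - 1, 0)
    return order, degeneracy


def count_triangles(adj):
    """Count triangles using the orientation induced by the degeneracy ordering."""
    order, _ = degeneracy_ordering(adj)
    rank = {v: i for i, v in enumerate(order)}
    # Each vertex keeps only neighbors that come later in the ordering,
    # so every out-neighborhood has size at most the degeneracy.
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    total = 0
    for v in adj:
        for u in out[v]:
            # A common out-neighbor of v and u closes a triangle, found exactly once.
            total += len(out[v] & out[u])
    return total


# Example graphs: the complete graph on 4 vertices and a 5-cycle.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

    Because every out-neighborhood is bounded by the degeneracy, the work per edge is bounded by the degeneracy as well, which is what makes the running time linear when the degeneracy is a constant.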

    A Fast Counting Method for 6-motifs with Low Connectivity

    A k-motif (or graphlet) is a subgraph on k nodes in a graph or network. Counting of motifs in complex networks has been a well-studied problem in network analysis of various real-world graphs arising from the study of social networks and bioinformatics. In particular, the triangle counting problem has received much attention due to its significance in understanding the behavior of social networks. Similarly, subgraphs with more than 3 nodes have received much attention recently. While there have been successful methods developed on this problem, most of the existing algorithms are not scalable to large networks with millions of nodes and edges. The main contribution of this paper is a preliminary study that generalizes the exact counting algorithm provided by Pinar, Seshadhri and Vishal to a collection of 6-motifs. This method uses the counts of motifs with smaller size to obtain the counts of 6-motifs with low connectivity, that is, containing a cut-vertex or a cut-edge. Therefore, it circumvents the combinatorial explosion that naturally arises when counting subgraphs in large networks.
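
    The idea of deriving counts for a motif with a cut-vertex from counts of smaller motifs can be illustrated on a much smaller case than the 6-motifs studied in the paper. The sketch below counts non-induced tailed triangles (a triangle with one pendant edge) using only per-vertex triangle counts and degrees: a vertex v that lies in t(v) triangles and has degree d(v) is the cut-vertex of t(v) * (d(v) - 2) tailed triangles. The function names and the dict-of-sets representation are assumptions for this example, and the per-vertex triangle counter is a simple placeholder rather than a scalable routine.

```python
from itertools import combinations


def triangles_per_vertex(adj):
    """Return a dict mapping each vertex to the number of triangles containing it."""
    t = {v: 0 for v in adj}
    for v in adj:
        # A pair of neighbors of v that are adjacent to each other closes a triangle at v.
        for u, w in combinations(adj[v], 2):
            if w in adj[u]:
                t[v] += 1
    return t


def count_tailed_triangles(adj):
    """Count non-induced tailed triangles from triangle counts and degrees.

    For each triangle at v, any of the other deg(v) - 2 neighbors of v
    can serve as the endpoint of the tail.
    """
    t = triangles_per_vertex(adj)
    return sum(t[v] * (len(adj[v]) - 2) for v in adj)


# Example graphs: a triangle 0-1-2 with a pendant vertex 3 attached to 0,
# and the complete graph on 4 vertices.
paw = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
```

    No 4-vertex subgraph is enumerated here; the count is assembled entirely from 3-vertex information, which is the property that lets this style of method avoid the combinatorial explosion.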

    Discovering Motifs in Real-World Social Networks

    We built a framework for analyzing the contents of large social networks, based on the approximate counting technique developed by Gonen and Shavitt. Our toolbox was used on data from a large forum, boards.ie, the most prominent community website in Ireland. For the purpose of this experiment, we were granted access to 10 years of forum data. This is the first time the approximate counting technique has been tested on real-world social network data.

    Efficient and Scalable Listing of Four-Vertex Subgraph

    Identifying four-vertex subgraphs has long been recognized as a fundamental technique in bioinformatics and social networks. However, listing these structures is a challenging task, especially for graphs that do not fit in RAM. To address this problem, we build a set of algorithms, models, and implementations that can handle massive graphs on commodity hardware. Our technique achieves 4 to 5 orders of magnitude speedup compared to the best prior methods on graphs with billions of edges, with external-memory operation equally efficient.

    Efficiently Counting Complex Multilayer Temporal Motifs in Large-Scale Networks

    This paper proposes novel algorithms for efficiently counting complex network motifs in dynamic networks that are changing over time. Network motifs are small characteristic configurations of a few nodes and edges, and have repeatedly been shown to provide insightful information for understanding the meso-level structure of a network. Here, we deal with counting more complex temporal motifs in large-scale networks that may consist of millions of nodes and edges. The first contribution is an efficient approach to count temporal motifs in multilayer networks and networks with partial timing, two prevalent aspects of many real-world complex networks. We analyze the complexity of these algorithms and empirically validate their performance on a number of real-world user communication networks extracted from online knowledge exchange platforms. Among other things, we find that the multilayer aspects provide significant insights in how complex user interaction patterns differ substantially between online platforms. The second contribution is an analysis of the viability of motif counting algorithms for motifs that are larger than the triad motifs studied in previous work. We provide a novel categorization of motifs of size four, and determine how and at what computational cost these motifs can still be counted efficiently. In doing so, we delineate the “computational frontier” of temporal motif counting algorithms.

    Scaling Up Network Analysis and Mining: Statistical Sampling, Estimation, and Pattern Discovery

    Network analysis and graph mining play a prominent role in providing insights and studying phenomena across various domains, including social, behavioral, biological, transportation, communication, and financial domains. Across all these domains, networks arise as a natural and rich representation for data. Studying these real-world networks is crucial for solving numerous problems that lead to high-impact applications. Examples include identifying the behavior and interests of users in online social networks (e.g., viral marketing), monitoring and detecting virus outbreaks in human contact networks, predicting protein functions in biological networks, and detecting anomalous behavior in computer networks. A key characteristic of these networks is that their complex structure is massive and continuously evolving over time, which makes it challenging and computationally intensive to analyze, query, and model these networks in their entirety. In this dissertation, we propose sampling as well as fast, efficient, and scalable methods for network analysis and mining in both static and streaming graphs.