7,885 research outputs found

    Counting Triangles in Large Graphs on GPU

    Full text link
    The clustering coefficient and the transitivity ratio are concepts often used in network analysis, which creates a need for fast practical algorithms for counting triangles in large graphs. Previous research in this area focused on sequential algorithms, MapReduce parallelization, and fast approximations. In this paper we propose a parallel triangle counting algorithm for CUDA GPU. We describe the implementation details necessary to achieve high performance and present the experimental evaluation of our approach. Our algorithm achieves 8 to 15 times speedup over the CPU implementation and is capable of finding 3.8 billion triangles in an 89 million edges graph in less than 10 seconds on the Nvidia Tesla C2050 GPU.Comment: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW

    Triadic Measures on Graphs: The Power of Wedge Sampling

    Full text link
    Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of a graph. Some of the most useful graph metrics, especially those measuring social cohesion, are based on triangles. Despite the importance of these triadic measures, associated algorithms can be extremely expensive. We propose a new method based on wedge sampling. This versatile technique allows for the fast and accurate approximation of all current variants of clustering coefficients and enables rapid uniform sampling of the triangles of a graph. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state-of-the-art, while providing nearly the accuracy of full enumeration. Our results will enable more wide-scale adoption of triadic measures for analysis of extremely large graphs, as demonstrated on several real-world examples

    Shared-memory Graph Truss Decomposition

    Full text link
    We present PKT, a new shared-memory parallel algorithm and OpenMP implementation for the truss decomposition of large sparse graphs. A k-truss is a dense subgraph definition that can be considered a relaxation of a clique. Truss decomposition refers to a partitioning of all the edges in the graph based on their k-truss membership. The truss decomposition of a graph has many applications. We show that our new approach PKT consistently outperforms other truss decomposition approaches for a collection of large sparse graphs and on a 24-core shared-memory server. PKT is based on a recently proposed algorithm for k-core decomposition.Comment: 10 pages, conference submissio

    Wedge Sampling for Computing Clustering Coefficients and Triangle Counts on Large Graphs

    Full text link
    Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Algorithms to compute them can be extremely expensive, even for moderately-sized graphs with only millions of edges. Previous work has considered node and edge sampling; in contrast, we consider wedge sampling, which provides faster and more accurate approximations than competing techniques. Additionally, wedge sampling enables estimation local clustering coefficients, degree-wise clustering coefficients, uniform triangle sampling, and directed triangle counts. Our methods come with provable and practical probabilistic error estimates for all computations. We provide extensive results that show our methods are both more accurate and faster than state-of-the-art alternatives.Comment: Full version of SDM 2013 paper "Triadic Measures on Graphs: The Power of Wedge Sampling" (arxiv:1202.5230

    Algorithmic Complexity of Power Law Networks

    Full text link
    It was experimentally observed that the majority of real-world networks follow power law degree distribution. The aim of this paper is to study the algorithmic complexity of such "typical" networks. The contribution of this work is twofold. First, we define a deterministic condition for checking whether a graph has a power law degree distribution and experimentally validate it on real-world networks. This definition allows us to derive interesting properties of power law networks. We observe that for exponents of the degree distribution in the range [1,2][1,2] such networks exhibit double power law phenomenon that was observed for several real-world networks. Our observation indicates that this phenomenon could be explained by just pure graph theoretical properties. The second aim of our work is to give a novel theoretical explanation why many algorithms run faster on real-world data than what is predicted by algorithmic worst-case analysis. We show how to exploit the power law degree distribution to design faster algorithms for a number of classical P-time problems including transitive closure, maximum matching, determinant, PageRank and matrix inverse. Moreover, we deal with the problems of counting triangles and finding maximum clique. Previously, it has been only shown that these problems can be solved very efficiently on power law graphs when these graphs are random, e.g., drawn at random from some distribution. However, it is unclear how to relate such a theoretical analysis to real-world graphs, which are fixed. Instead of that, we show that the randomness assumption can be replaced with a simple condition on the degrees of adjacent vertices, which can be used to obtain similar results. As a result, in some range of power law exponents, we are able to solve the maximum clique problem in polynomial time, although in general power law networks the problem is NP-complete
    corecore