1,304 research outputs found

    Efficient Exact and Approximate Algorithms for Computing Betweenness Centrality in Directed Graphs

    Graphs are an important tool to model data in different domains, including social networks, bioinformatics and the world wide web. Most of the networks formed in these domains are directed graphs, where all the edges have a direction and they are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, first, given a directed network $G$ and a vertex $r \in V(G)$, we propose a new exact algorithm to compute the betweenness score of $r$. Our algorithm pre-computes a set $\mathcal{RV}(r)$, which is used to prune a huge amount of computations that do not contribute to the betweenness score of $r$. The time complexity of our exact algorithm depends on $|\mathcal{RV}(r)|$ and it is respectively $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|)$ and $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|+|\mathcal{RV}(r)|\cdot|V(G)|\log |V(G)|)$ for unweighted graphs and weighted graphs with positive weights. $|\mathcal{RV}(r)|$ is bounded from above by $|V(G)|-1$ and in most cases, it is a small constant. Then, for the cases where $\mathcal{RV}(r)$ is large, we present a simple randomized algorithm that samples from $\mathcal{RV}(r)$ and performs computations for only the sampled elements. We show that this algorithm provides an $(\epsilon,\delta)$-approximation of the betweenness score of $r$. Finally, we perform extensive experiments over several real-world datasets from different domains for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that in most cases, our algorithm significantly outperforms the most efficient existing randomized algorithms, in terms of both running time and accuracy. Our experiments also show that our proposed algorithm computes betweenness scores of all vertices in the sets of sizes 5, 10 and 15 much faster and more accurately than the most efficient existing algorithms.
    Comment: arXiv admin note: text overlap with arXiv:1704.0735
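
As a rough illustration of what "the betweenness score of a single vertex $r$" means computationally, the sketch below uses the standard Brandes dependency accumulation for unweighted directed graphs, plus a naive estimator that samples source vertices uniformly. It does not implement the abstract's $\mathcal{RV}(r)$ pruning, since that set is not defined here; the function names and the adjacency-dict graph representation are assumptions made for the example.

```python
import random
from collections import deque


def dependency_on(graph, s, r):
    """Brandes dependency of source s on vertex r (unweighted, directed).

    graph is an adjacency dict {vertex: [out-neighbours]} (assumed format).
    """
    if s == r:
        return 0.0
    dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
    order = []                      # vertices in non-decreasing BFS distance
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph.get(v, ()):
            if w not in dist:       # first time w is reached
                dist[w] = dist[v] + 1
                sigma[w] = 0
                preds[w] = []
                queue.append(w)
            if dist[w] == dist[v] + 1:   # edge (v, w) lies on a shortest path
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in order}
    for w in reversed(order):       # accumulate from the farthest vertices back
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
    return delta.get(r, 0.0)        # 0 if r is unreachable from s


def exact_betweenness(graph, vertices, r):
    """Sum the dependencies of every source on r (one BFS per source)."""
    return sum(dependency_on(graph, s, r) for s in vertices if s != r)


def sampled_betweenness(graph, vertices, r, k, seed=0):
    """Unbiased estimate from k uniformly sampled sources (not the paper's sampler)."""
    rng = random.Random(seed)
    sources = [s for s in vertices if s != r]
    total = sum(dependency_on(graph, rng.choice(sources), r) for _ in range(k))
    return len(sources) * total / k
```

On the path `a -> b -> c -> d`, for example, `exact_betweenness` gives 2 for `b`, because `b` lies on the only shortest paths from `a` to `c` and from `a` to `d`.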

    Discriminative Distance-Based Network Indices with Application to Link Prediction

    In large networks, using the length of shortest paths as the distance measure has shortcomings. A well-studied shortcoming is that extending it to disconnected graphs and directed graphs is controversial. The second shortcoming is that a huge number of vertices may have exactly the same score. The third shortcoming is that in many applications, the distance between two vertices depends not only on the length of shortest paths, but also on the number of shortest paths. In this paper, first we develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. Second, we develop randomized algorithms that precisely estimate average discriminative path length and average discriminative eccentricity and show that they give $(\epsilon,\delta)$-approximations of these indices. Third, we perform extensive experiments over several real-world networks from different domains. In our experiments, we first show that compared to the traditional indices, discriminative indices usually have much more discriminability. Then, we show that our randomized algorithms can very precisely estimate average discriminative path length and average discriminative eccentricity, using only a few samples. Then, we show that real-world networks usually have a tiny average discriminative path length, bounded by a constant (e.g., 2). Fourth, in order to better motivate the usefulness of our proposed distance measure, we present a novel link prediction method that uses discriminative distance to decide which vertices are more likely to form a link in the future, and show its superior performance compared to the well-known existing measures.
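
The abstract only states that the measure is proportional to the shortest-path length and inversely proportional to the number of shortest paths. A minimal sketch, assuming the simplest such form (length divided by path count, which may differ from the paper's exact definition), can be computed with one BFS per source; the function name and adjacency-dict format are hypothetical.

```python
from collections import deque


def discriminative_distances(graph, s):
    """Return {v: d(s, v) / sigma(s, v)} for every vertex v reachable from s.

    d is the shortest-path length and sigma the number of shortest paths.
    The ratio is an assumed reading of the abstract, not a confirmed formula.
    graph is an adjacency dict {vertex: [out-neighbours]}.
    """
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, ()):
            if w not in dist:            # first visit fixes the distance
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:   # another shortest path into w
                sigma[w] += sigma[v]
    return {v: dist[v] / sigma[v] for v in dist if v != s}
```

Under this reading, two vertices at the same hop distance from `s` get different scores whenever they are reached by different numbers of shortest paths, which is the kind of tie-breaking the abstract describes.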

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the results for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).
    Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    A Bag-of-Paths Node Criticality Measure

    This work compares several node (and network) criticality measures quantifying to what extent each node is critical with respect to the communication flow between nodes of the network, and introduces a new measure based on the Bag-of-Paths (BoP) framework. Network disconnection simulation experiments show that the new BoP measure outperforms all the other measures on a sample of Erdős-Rényi and Albert-Barabási graphs. Furthermore, a faster (still O(n^3)), approximate BoP criticality relying on the Sherman-Morrison rank-one update of a matrix is introduced for tackling larger networks. This approximate measure performs similarly to the original, exact one.
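
For readers unfamiliar with the Sherman-Morrison update mentioned above: it gives the inverse of $A + uv^{\top}$ from a known $A^{-1}$ in $O(n^2)$ time instead of re-inverting in $O(n^3)$, via $(A + uv^{\top})^{-1} = A^{-1} - \frac{A^{-1}uv^{\top}A^{-1}}{1 + v^{\top}A^{-1}u}$. The sketch below shows only this generic identity in pure Python; how the BoP criticality applies it is not described in the abstract, and the function name is hypothetical.

```python
def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v^T), given a_inv = A^{-1} as a list of rows.

    Generic rank-one update; not the BoP-specific computation.
    """
    n = len(a_inv)
    # Column vector A^{-1} u and row vector v^T A^{-1}, each O(n^2).
    a_inv_u = [sum(a_inv[i][j] * u[j] for j in range(n)) for i in range(n)]
    v_a_inv = [sum(v[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[i] * a_inv_u[i] for i in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("A + u v^T is singular (or numerically close to it)")
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]
```

For instance, starting from the 2x2 identity with `u = [1, 0]` and `v = [0, 1]`, the updated matrix is `[[1, 1], [0, 1]]`, and the function returns its inverse with a `-1` in the top-right entry.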

    KADABRA is an ADaptive Algorithm for Betweenness via Random Approximation

    We present KADABRA, a new algorithm to approximate betweenness centrality in directed and undirected graphs, which significantly outperforms all previous approaches on real-world complex networks. The efficiency of the new algorithm relies on two new theoretical contributions, of independent interest. The first contribution focuses on sampling shortest paths, a subroutine used by most algorithms that approximate betweenness centrality. We show that, on realistic random graph models, we can perform this task in time $|E|^{\frac{1}{2}+o(1)}$ with high probability, obtaining a significant speedup with respect to the $\Theta(|E|)$ worst-case performance. We experimentally show that this new technique achieves similar speedups on real-world complex networks, as well. The second contribution is a new rigorous application of the adaptive sampling technique. This approach decreases the total number of shortest paths that need to be sampled to compute all betweenness centralities with a given absolute error, and it also handles more general problems, such as computing the $k$ most central nodes. Furthermore, our analysis is general, and it might be extended to other settings.
    Comment: Some typos correcte
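
The shortest-path sampling subroutine mentioned above can be sketched as follows: pick a random ordered pair of vertices, draw one of their shortest paths uniformly at random, and credit every internal vertex. This version uses a plain BFS costing $\Theta(|E|)$ per sample and a fixed sample count `k`, so it shows only the generic estimator, not KADABRA's faster sampling or its adaptive stopping rule. All names are assumptions made for the example.

```python
import random
from collections import deque


def sample_shortest_path(graph, s, t, rng):
    """Return a uniformly random shortest s-t path, or None if t is unreachable."""
    dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:                       # all predecessors of t are final by now
            break
        for w in graph.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                preds[w] = []
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    if t not in dist:
        return None
    path, v = [t], t
    while v != s:
        # Choose a predecessor with probability proportional to its path count,
        # which makes every shortest path equally likely.
        ps = preds[v]
        v = rng.choices(ps, weights=[sigma[p] for p in ps])[0]
        path.append(v)
    path.reverse()
    return path


def approx_betweenness(graph, vertices, k, seed=0):
    """Estimate betweenness normalised by n(n-1) from k sampled paths."""
    rng = random.Random(seed)
    vertices = list(vertices)
    counts = {v: 0 for v in vertices}
    for _ in range(k):
        s, t = rng.sample(vertices, 2)   # random ordered pair of distinct vertices
        path = sample_shortest_path(graph, s, t, rng)
        if path:
            for v in path[1:-1]:         # endpoints are not counted
                counts[v] += 1
    return {v: c / k for v, c in counts.items()}
```

Each estimate is the fraction of sampled paths passing through a vertex, so its absolute error shrinks as `k` grows; choosing `k` adaptively rather than in advance is the part the abstract attributes to KADABRA.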