1,304 research outputs found
Efficient Exact and Approximate Algorithms for Computing Betweenness Centrality in Directed Graphs
Graphs are an important tool to model data in different domains, including
social networks, bioinformatics and the world wide web. Most of the networks
formed in these domains are directed graphs, where all the edges have a
direction and they are not symmetric. Betweenness centrality is an important
index widely used to analyze networks. In this paper, first given a directed
network and a vertex , we propose a new exact algorithm to
compute betweenness score of . Our algorithm pre-computes a set
, which is used to prune a huge amount of computations that do
not contribute in the betweenness score of . Time complexity of our exact
algorithm depends on and it is respectively
and
for unweighted graphs and weighted graphs with positive weights.
is bounded from above by and in most cases, it
is a small constant. Then, for the cases where is large, we
present a simple randomized algorithm that samples from and
performs computations for only the sampled elements. We show that this
algorithm provides an -approximation of the betweenness
score of . Finally, we perform extensive experiments over several real-world
datasets from different domains for several randomly chosen vertices as well as
for the vertices with the highest betweenness scores. Our experiments reveal
that in most cases, our algorithm significantly outperforms the most efficient
existing randomized algorithms, in terms of both running time and accuracy. Our
experiments also show that our proposed algorithm computes betweenness scores
of all vertices in the sets of sizes 5, 10 and 15, much faster and more
accurate than the most efficient existing algorithms.Comment: arXiv admin note: text overlap with arXiv:1704.0735
Discriminative Distance-Based Network Indices with Application to Link Prediction
In large networks, using the length of shortest paths as the distance measure
has shortcomings. A well-studied shortcoming is that extending it to
disconnected graphs and directed graphs is controversial. The second
shortcoming is that a huge number of vertices may have exactly the same score.
The third shortcoming is that in many applications, the distance between two
vertices not only depends on the length of shortest paths, but also on the
number of shortest paths. In this paper, first we develop a new distance
measure between vertices of a graph that yields discriminative distance-based
centrality indices. This measure is proportional to the length of shortest
paths and inversely proportional to the number of shortest paths. We present
algorithms for exact computation of the proposed discriminative indices.
Second, we develop randomized algorithms that precisely estimate average
discriminative path length and average discriminative eccentricity and show
that they give -approximations of these indices. Third, we
perform extensive experiments over several real-world networks from different
domains. In our experiments, we first show that compared to the traditional
indices, discriminative indices have usually much more discriminability. Then,
we show that our randomized algorithms can very precisely estimate average
discriminative path length and average discriminative eccentricity, using only
few samples. Then, we show that real-world networks have usually a tiny average
discriminative path length, bounded by a constant (e.g., 2). Fourth, in order
to better motivate the usefulness of our proposed distance measure, we present
a novel link prediction method, that uses discriminative distance to decide
which vertices are more likely to form a link in future, and show its superior
performance compared to the well-known existing measures
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature report results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly-available as the Graph-Based Benchmark Suite
(GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
A Bag-of-Paths Node Criticality Measure
This work compares several node (and network) criticality measures
quantifying to which extend each node is critical with respect to the
communication flow between nodes of the network, and introduces a new measure
based on the Bag-of-Paths (BoP) framework. Network disconnection simulation
experiments show that the new BoP measure outperforms all the other measures on
a sample of Erdos-Renyi and Albert-Barabasi graphs. Furthermore, a faster
(still O(n^3)), approximate, BoP criticality relying on the Sherman-Morrison
rank-one update of a matrix is introduced for tackling larger networks. This
approximate measure shows similar performances as the original, exact, one
KADABRA is an ADaptive Algorithm for Betweenness via Random Approximation
We present KADABRA, a new algorithm to approximate betweenness centrality in
directed and undirected graphs, which significantly outperforms all previous
approaches on real-world complex networks. The efficiency of the new algorithm
relies on two new theoretical contributions, of independent interest. The first
contribution focuses on sampling shortest paths, a subroutine used by most
algorithms that approximate betweenness centrality. We show that, on realistic
random graph models, we can perform this task in time
with high probability, obtaining a significant speedup with respect to the
worst-case performance. We experimentally show that this new
technique achieves similar speedups on real-world complex networks, as well.
The second contribution is a new rigorous application of the adaptive sampling
technique. This approach decreases the total number of shortest paths that need
to be sampled to compute all betweenness centralities with a given absolute
error, and it also handles more general problems, such as computing the
most central nodes. Furthermore, our analysis is general, and it might be
extended to other settings.Comment: Some typos correcte
- …