Computing Top-k Closeness Centrality Faster in Unweighted Graphs
Given a connected graph G = (V,E) with n = |V| vertices, the closeness centrality of a vertex v is defined as (n-1) / \sum_{w \in V} d(v,w). This measure is widely used in the analysis of real-world complex networks, and the problem of selecting the k most central vertices has been studied extensively over the last decade. However, this problem is computationally demanding, especially for large networks: in the first part of the paper, we prove that it is not solvable in time O(|E|^{2-\epsilon}) on directed graphs, for any constant \epsilon > 0, under reasonable complexity assumptions. Furthermore, we propose a new algorithm for selecting the k most central nodes in a graph: we show experimentally that this algorithm significantly improves on both the textbook algorithm, which is based on computing the distance between all pairs of vertices, and the state of the art. For example, we are able to compute the top k nodes in a few dozen seconds in real-world networks with millions of nodes and edges. Finally, as a case study, we compute the 10 most central actors in the IMDB collaboration network, where two actors are linked if they played together in a movie, and the 10 most central pages in the Wikipedia citation network, which contains a directed edge from a page p to a page q if p contains a link to q.
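As a rough illustration of the definition above, the following sketch computes the closeness of a single vertex with one breadth-first search. The adjacency-dictionary representation and the names `bfs_distances`, `closeness`, and `path` are assumptions made for this example; it is the textbook per-vertex computation, not the algorithm proposed in the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """Closeness (n - 1) / sum_w d(v, w); assumes a connected graph with n >= 2."""
    dist = bfs_distances(adj, v)
    return (len(adj) - 1) / sum(dist.values())


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(closeness(path, 0), closeness(path, 1))
```

Running this for every vertex costs one BFS per vertex, which is the quadratic baseline that top-k methods try to avoid.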
Computing Top-k Closeness Centrality Faster in Unweighted Graphs. (Technical Report)
Centrality indices are widely used analytic measures for the importance of nodes in a network. Closeness centrality is very popular among these measures. For a single node v, it takes the sum of the distances of v to all other nodes into account. The currently best algorithms in practical applications for computing the closeness for all nodes exactly in unweighted graphs are based on breadth-first search (BFS) from every node. Thus, even for sparse graphs, these algorithms require quadratic running time in the worst case, which is prohibitive for large networks.
In many relevant applications, however, it is unnecessary to compute closeness values for all nodes. Instead, one requires only the k nodes with the highest closeness values in descending order. Thus, we present a new algorithm for computing this top-k ranking in unweighted graphs. Following the rationale of previous work, our algorithm significantly reduces the number of traversed edges. It does so by computing upper bounds on the closeness and stopping the current BFS search when k nodes already have higher closeness than the bounds computed for the other nodes.
In our experiments with real-world and synthetic instances of various types, one of these new bounds is good for small-world graphs with low diameter (such as social networks), while the other one excels for graphs with high diameter (such as road networks). Combining them yields an algorithm that is faster than the state of the art for top-k computations for all test instances, by a wide margin for high-diameter graphs.
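The idea of stopping a BFS early once a bound rules a vertex out can be sketched as follows. This is a minimal illustration under stated assumptions, not the bounds from the report: it uses a simple level-based lower bound on the sum of distances (every unvisited vertex is at least one level further away), assumes a connected undirected graph with at least two vertices and mutually comparable vertex labels, and the function name `top_k_closeness` is hypothetical.

```python
import heapq


def top_k_closeness(adj, k):
    """Return up to k (closeness, vertex) pairs, highest closeness first.

    Each BFS is abandoned as soon as a lower bound on the vertex's farness
    (sum of distances) shows it cannot beat the current k-th best vertex.
    """
    n = len(adj)
    best = []  # heap of (-farness, vertex); best[0] holds the largest farness kept
    for v in adj:
        threshold = -best[0][0] if len(best) == k else float("inf")
        dist = {v: 0}
        frontier = [v]
        farness, level, pruned = 0, 0, False
        while frontier:
            level += 1
            nxt = []
            for u in frontier:
                for w in adj[u]:
                    if w not in dist:
                        dist[w] = level
                        nxt.append(w)
            farness += level * len(nxt)
            # Every vertex not yet visited lies at distance >= level + 1.
            lower_bound = farness + (level + 1) * (n - len(dist))
            if lower_bound >= threshold:
                pruned = True
                break
            frontier = nxt
        if not pruned:
            if len(best) == k:
                heapq.heapreplace(best, (-farness, v))
            else:
                heapq.heappush(best, (-farness, v))
    return sorted([((n - 1) / -f, v) for f, v in best], reverse=True)


# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(top_k_closeness(path5, 1))
```

On the path example, the searches from vertices 3 and 4 stop after the first level once vertex 2 has been found, which is the kind of saving in traversed edges that the abstract describes.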
Discriminative Distance-Based Network Indices with Application to Link Prediction
In large networks, using the length of shortest paths as the distance measure
has shortcomings. A well-studied shortcoming is that extending it to
disconnected graphs and directed graphs is controversial. The second
shortcoming is that a huge number of vertices may have exactly the same score.
The third shortcoming is that in many applications, the distance between two
vertices not only depends on the length of shortest paths, but also on the
number of shortest paths. In this paper, first we develop a new distance
measure between vertices of a graph that yields discriminative distance-based
centrality indices. This measure is proportional to the length of shortest
paths and inversely proportional to the number of shortest paths. We present
algorithms for exact computation of the proposed discriminative indices.
Second, we develop randomized algorithms that precisely estimate average
discriminative path length and average discriminative eccentricity and show
that they give (\epsilon, \delta)-approximations of these indices. Third, we
perform extensive experiments over several real-world networks from different
domains. In our experiments, we first show that, compared to the traditional
indices, discriminative indices usually have much greater discriminability. Then,
we show that our randomized algorithms can very precisely estimate average
discriminative path length and average discriminative eccentricity, using only
a few samples. Then, we show that real-world networks usually have a tiny average
discriminative path length, bounded by a constant (e.g., 2). Fourth, in order
to better motivate the usefulness of our proposed distance measure, we present
a novel link prediction method that uses discriminative distance to decide
which vertices are more likely to form a link in the future, and show its superior
performance compared to the well-known existing measures.
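To make the idea of a distance that grows with shortest-path length and shrinks with the number of shortest paths concrete, here is a minimal sketch. It assumes the simplest such combination, d(u,v) divided by the number of shortest paths between u and v; the exact definition used in the paper may differ, and the name `discriminative_distances` is hypothetical.

```python
from collections import deque


def discriminative_distances(adj, source):
    """Map each reachable vertex w to d(source, w) / (number of shortest paths).

    Assumed form of the measure, for illustration only.
    """
    dist = {source: 0}
    sigma = {source: 1}  # sigma[w] = number of shortest paths from source to w
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return {w: dist[w] / sigma[w] for w in dist if w != source}


# Hypothetical example: the 4-cycle 0 - 1 - 2 - 3 - 0.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(discriminative_distances(cycle4, 0))
```

In the 4-cycle, vertex 2 is two hops from vertex 0 but is reached by two shortest paths, so under this assumed form its score is 2 / 2 = 1.0 rather than 2.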
Efficient computation of the Shapley value for game-theoretic network centrality
The Shapley value, probably the most important normative payoff division scheme in coalitional games, has recently been advocated as a useful measure of centrality in networks. However, although this approach has a variety of real-world applications (including social and organisational networks, biological networks and communication networks), its computational properties have not been widely studied. To date, the only practicable approach to compute Shapley value-based centrality has been via Monte Carlo simulations which are computationally expensive and not guaranteed to give an exact answer. Against this background, this paper presents the first study of the computational aspects of the Shapley value for network centralities. Specifically, we develop exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks and develop efficient (polynomial time) and exact algorithms based on them. We empirically evaluate these algorithms on two real-life examples (an infrastructure network representing the topology of the Western States Power Grid and a collaboration network from the field of astrophysics) and demonstrate that they deliver significant speedups over the Monte Carlo approach. Fo
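For orientation, the Monte Carlo baseline mentioned in this abstract can be sketched by averaging marginal contributions over random orderings of the vertices. The characteristic function used here (the number of vertices that are in a coalition or adjacent to it) is an assumption chosen for illustration, and the names `coverage` and `shapley_monte_carlo` are hypothetical; this is not the paper's exact algorithm.

```python
import random


def coverage(adj, coalition):
    """Assumed game value: vertices in the coalition or adjacent to it."""
    covered = set(coalition)
    for u in coalition:
        covered.update(adj[u])
    return len(covered)


def shapley_monte_carlo(adj, samples, seed=0):
    """Estimate each vertex's Shapley value from random permutations."""
    rng = random.Random(seed)
    nodes = list(adj)
    total = {v: 0.0 for v in nodes}
    for _ in range(samples):
        rng.shuffle(nodes)
        coalition, value = [], 0
        for v in nodes:
            coalition.append(v)
            new_value = coverage(adj, coalition)
            total[v] += new_value - value  # marginal contribution of v
            value = new_value
    return {v: t / samples for v, t in total.items()}


# Hypothetical example: a star with centre 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(shapley_monte_carlo(star, samples=2000))
```

Each sample re-evaluates the game once per vertex, which hints at why sampling becomes expensive on large networks and why it yields only an estimate.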
A Faster Method to Estimate Closeness Centrality Ranking
Closeness centrality is one way of measuring how central a node is in the
given network. The closeness centrality measure assigns a centrality value to
each node based on its accessibility to the whole network. In real life
applications, we are mainly interested in ranking nodes based on their
centrality values. The classical method to compute the rank of a node first
computes the closeness centrality of all nodes and then compares them to get
its rank. Its time complexity is O(n \cdot m + n), where n represents the total
number of nodes, and m represents the total number of edges in the network. In
the present work, we propose a heuristic method to quickly estimate the closeness
rank of a node in O(\alpha \cdot m) time complexity, where \alpha = 3. We
also propose an extended improved method using a uniform sampling technique. This
method better estimates the rank and it has the time complexity O(\alpha \cdot m),
where \alpha \approx 10-100. This is an excellent improvement over the
classical centrality ranking method. The efficiency of the proposed methods is
verified on real-world scale-free social networks using absolute and weighted
error functions.
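A generic way to estimate a node's closeness rank from a uniform sample, rather than from all nodes, is sketched below. It is only an illustration of the sampling idea under stated assumptions (connected, unweighted graph; one BFS per sampled node) and is not the heuristic proposed in the paper; the names `farness` and `estimate_rank` are hypothetical.

```python
import random
from collections import deque


def farness(adj, v):
    """Sum of hop distances from v to all other vertices (connected graph)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())


def estimate_rank(adj, target, samples, seed=0):
    """Estimate the closeness rank of target (1 = most central)."""
    rng = random.Random(seed)
    others = [v for v in adj if v != target]
    sample = rng.sample(others, min(samples, len(others)))
    target_far = farness(adj, target)
    # Fraction of sampled nodes that are strictly more central than target.
    closer = sum(1 for v in sample if farness(adj, v) < target_far)
    return 1 + round(closer / len(sample) * len(others))


# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(estimate_rank(path5, 0, samples=10))
```

With s sampled nodes the cost is about s + 1 traversals of the graph, so the estimate trades accuracy for running time as s shrinks.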