Scaling up Group Closeness Maximization

Author: Bergamini Elisabetta
Gonser Tanya
Meyerhenke Henning
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2017

Closeness is a widely used centrality measure in social network analysis. For a node, it indicates the inverse average shortest-path distance to the other nodes of the network. While the identification of the k nodes with highest closeness has received significant attention, many applications are actually interested in finding a group of nodes that is central as a whole. For this problem, a greedy algorithm with approximation ratio (1−1/e) has only recently been proposed [Chen et al., ADC 2016]. Since this algorithm is still too slow for large networks, a heuristic without approximation guarantee has also been proposed in the same paper. In the present paper we develop new techniques to speed up the greedy algorithm without losing its theoretical guarantee. Compared to a straightforward implementation, our approach is orders of magnitude faster and, compared to the heuristic proposed by Chen et al., we always find a solution of better quality in a comparable running time in our experiments. Our method Greedy++ allows us to approximate the group with maximum closeness on networks with up to hundreds of millions of edges in minutes or at most a few hours. To obtain the same theoretical guarantee, the greedy approach by [Chen et al., ADC 2016] would take several days even on networks with hundreds of thousands of edges. In a comparison with the optimum, our experiments show that the solution found by Greedy++ is actually much better than the theoretical guarantee. Over all tested networks, the empirical approximation ratio is never lower than 0.97. Finally, we study for the first time the correlation between the top-k nodes with highest closeness and an approximation of the most central group in large complex networks and show that the overlap between the two is relatively small.
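
To make the greedy strategy mentioned in this abstract concrete, here is a minimal sketch of a straightforward greedy baseline for group closeness. It is not Greedy++ and none of the paper's speed-up techniques are reproduced. The function names, the adjacency-dict input, and the objective (minimizing the sum of distances from every node to its nearest group member) are assumptions made for illustration, and the graph is assumed to be undirected, unweighted, and connected with k no larger than the number of nodes.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_group(adj, k):
    """Greedily pick k nodes that minimize the total distance to the group.

    Hypothetical helper: assumes an undirected, connected graph given as
    {node: [neighbors]} and k <= number of nodes.
    """
    # Naive baseline: one BFS per node up front (quadratic memory).
    dist = {u: bfs_distances(adj, u) for u in adj}
    # to_group[v] is the distance from v to its nearest group member so far.
    to_group = {v: float("inf") for v in adj}
    group = []
    for _ in range(k):
        best, best_farness = None, float("inf")
        for u in adj:
            if u in group:
                continue
            # Total distance to the group if u were added next.
            farness = sum(min(to_group[v], dist[u][v]) for v in adj)
            if farness < best_farness:  # ties keep the earlier candidate
                best, best_farness = u, farness
        group.append(best)
        for v in adj:
            to_group[v] = min(to_group[v], dist[best][v])
    return group, sum(to_group.values())


# Path graph 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
group, farness = greedy_group(path, 2)
```

On the five-node path, the first node picked is the middle node 2. Every round re-evaluates all remaining candidates against all nodes, which is the kind of cost that makes a straightforward implementation slow on large networks.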

Crossref

KITopen

Towards Scalable Network Delay Minimization

Author: Bogdanov Petko
Medya Sourav
Singh Ambuj
Publication date: 26/09/2016

Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks and increased rate of packets in the case of communication networks. Delay reduction can be achieved both by improving the propagation capabilities of individual nodes and by adding edges to the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that a probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling and targeted at different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.

arXiv.org e-Print Archive

Crossref

Fast Shortest Path Distance Estimation in Large Networks

Author: Castillo Carlos
Francesco Bonchi
Gionis Aristides
Potamias Michalis
Publication venue: Boston University Computer Science Department
Publication date: 09/03/2009

We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection. Yahoo! Research (internship
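
The landmark idea in this abstract can be sketched as follows: precompute distances from a few landmarks offline, then bound any pairwise distance with the triangle inequality. This is a minimal illustration for undirected, unweighted, connected graphs. The function names and the adjacency-dict input are assumptions, and the landmarks are simply passed in rather than chosen by any of the selection strategies the paper compares.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_index(adj, landmarks):
    """Offline step: one BFS per landmark, stored as {landmark: {node: dist}}."""
    return {l: bfs_distances(adj, l) for l in landmarks}


def estimate_distance(index, s, t):
    """Online step: (lower, upper) bounds on d(s, t) from the landmark index.

    For every landmark l, the triangle inequality gives
    |d(s, l) - d(l, t)| <= d(s, t) <= d(s, l) + d(l, t).
    """
    upper = min(d[s] + d[t] for d in index.values())
    lower = max(abs(d[s] - d[t]) for d in index.values())
    return lower, upper


# Path graph 0 - 1 - 2 - 3 - 4 with the middle node as the only landmark.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
index = build_index(path, [2])
bounds_far = estimate_distance(index, 0, 4)   # landmark lies on the path
bounds_near = estimate_distance(index, 0, 1)  # landmark lies beyond both
```

The upper bound is exact when some landmark lies on a shortest path between the two nodes, as for the pair (0, 4) here, and loose otherwise, as for the pair (0, 1). This is one way to see why the choice of landmarks affects accuracy.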

Boston University Institutional Repository (OpenBU)

Efficient Exact and Approximate Algorithms for Computing Betweenness Centrality in Directed Graphs

Author: Abdessalem Talel
Bifet Albert
Chehreghani Mostafa Haghir
Publication date: 28/08/2017

Graphs are an important tool to model data in different domains, including social networks, bioinformatics and the world wide web. Most of the networks formed in these domains are directed graphs, where all the edges have a direction and are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, first, given a directed network $G$ and a vertex $r \in V(G)$, we propose a new exact algorithm to compute the betweenness score of $r$. Our algorithm pre-computes a set $\mathcal{RV}(r)$, which is used to prune a huge amount of computation that does not contribute to the betweenness score of $r$. The time complexity of our exact algorithm depends on $|\mathcal{RV}(r)|$: it is $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|)$ for unweighted graphs and $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|+|\mathcal{RV}(r)|\cdot|V(G)|\log |V(G)|)$ for weighted graphs with positive weights. $|\mathcal{RV}(r)|$ is bounded from above by $|V(G)|-1$ and in most cases, it is a small constant. Then, for the cases where $\mathcal{RV}(r)$ is large, we present a simple randomized algorithm that samples from $\mathcal{RV}(r)$ and performs computations for only the sampled elements. We show that this algorithm provides an $(\epsilon,\delta)$-approximation of the betweenness score of $r$. Finally, we perform extensive experiments over several real-world datasets from different domains for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that in most cases, our algorithm significantly outperforms the most efficient existing randomized algorithms, in terms of both running time and accuracy. Our experiments also show that our proposed algorithm computes betweenness scores of all vertices in sets of sizes 5, 10 and 15 much faster and more accurately than the most efficient existing algorithms.

Comment: arXiv admin note: text overlap with arXiv:1704.0735
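
For orientation, the quantity this abstract targets, the betweenness score of a single vertex r in a directed graph, can be computed with the standard Brandes-style dependency accumulation sketched below. This is the unpruned baseline that runs a BFS from every source; it is not the paper's algorithm and does not use the set RV(r). The function name and adjacency-dict input are assumptions, and the graph is assumed unweighted.

```python
from collections import deque


def betweenness_of(adj, r):
    """Betweenness of vertex r in an unweighted directed graph.

    adj maps each node to a list of its out-neighbors. The score sums, over
    ordered pairs (s, t) with s != r != t, the fraction of shortest s-t
    paths that pass through r.
    """
    score = 0.0
    for s in adj:
        if s == r:
            continue
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
        score += delta.get(r, 0.0)  # r may be unreachable from s
    return score


# Diamond: two equally short routes from 0 to 3, via 1 or via 2.
diamond = {0: [1, 2], 1: [3], 2: [3], 3: []}
half = betweenness_of(diamond, 1)
```

In the diamond, vertex 1 lies on one of the two shortest paths from 0 to 3. Sources that cannot reach r contribute nothing yet still cost a full traversal in this baseline.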

arXiv.org e-Print Archive

Episciences.org

Fully-dynamic Approximation of Betweenness Centrality

Author: A. D’Andrea
D. Frigioni
D.A. Bader
G. Ramalingam
K. Goel
L. Roditty
M. Kas
M. Nasre
R. Bauer
U. Brandes
Publication date: 03/07/2015

Betweenness is a well-known centrality measure that ranks the nodes of a network according to their participation in shortest paths. Since an exact computation is prohibitive in large networks, several approximation algorithms have been proposed. In addition, recent years have seen the publication of dynamic algorithms for efficient recomputation of betweenness in evolving networks. In previous work we proposed the first semi-dynamic algorithms that recompute an approximation of betweenness in connected graphs after batches of edge insertions. In this paper we propose the first fully-dynamic approximation algorithms (for weighted and unweighted undirected graphs that need not be connected) with a provable guarantee on the maximum approximation error. The transfer to fully-dynamic and disconnected graphs implies additional algorithmic problems that could be of independent interest. In particular, we propose a new upper bound on the vertex diameter for weighted undirected graphs. For both weighted and unweighted graphs, we also propose the first fully-dynamic algorithms that keep track of such an upper bound. In addition, we extend our former algorithm for semi-dynamic BFS to batches of both edge insertions and deletions. Using approximation, our algorithms are the first to make in-memory computation of betweenness in fully-dynamic networks with millions of edges feasible. Our experiments show that they can achieve substantial speedups compared to recomputation, up to several orders of magnitude.

arXiv.org e-Print Archive

Crossref

The Minimum Wiener Connector

Author: Burt R.
Hwang D. S. R.
Jacobs K. M.
Stefanovic D.
Vogelstein B.
Zhang X.-D.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/10/2016

The Wiener index of a graph is the sum of all pairwise shortest-path distances between its vertices. In this paper we study the novel problem of finding a minimum Wiener connector: given a connected graph $G=(V,E)$ and a set $Q\subseteq V$ of query vertices, find a subgraph of $G$ that connects all query vertices and has minimum Wiener index. We show that the Minimum Wiener Connector admits a polynomial-time (albeit impractical) exact algorithm for the special case where the number of query vertices is bounded. We show that in general the problem is NP-hard, and has no PTAS unless $\mathbf{P} = \mathbf{NP}$. Our main contribution is a constant-factor approximation algorithm running in time $\widetilde{O}(|Q||E|)$. A thorough experimentation on a large variety of real-world graphs confirms that our method returns smaller and denser solutions than other methods, and does so by adding to the query set $Q$ a small number of important vertices (i.e., vertices with high centrality).

Comment: Published in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Dat
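
The objective in this abstract, the Wiener index of a candidate connector, can be evaluated with one BFS per vertex, as in the minimal sketch below. It only scores a given vertex set and does not search for the minimum connector or implement the paper's approximation algorithm. The function name, the adjacency-dict input, and the choice to score the subgraph induced by the vertex set are assumptions, and the graph is assumed undirected and unweighted.

```python
from collections import deque


def wiener_index(adj, nodes=None):
    """Sum of pairwise hop distances in the subgraph induced by `nodes`.

    adj is an undirected graph as {node: [neighbors]}; nodes defaults to
    all vertices. Raises ValueError if the induced subgraph is disconnected,
    since it then cannot serve as a connector.
    """
    keep = set(adj) if nodes is None else set(nodes)
    total = 0
    for s in keep:
        # BFS restricted to the kept vertices.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in keep and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(keep):
            raise ValueError("induced subgraph is not connected")
        total += sum(dist.values())
    return total // 2  # each unordered pair was counted from both ends


# 4-cycle 0 - 1 - 2 - 3 - 0 with query vertices Q = {0, 2}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
whole = wiener_index(cycle)
connector = wiener_index(cycle, {0, 1, 2})
```

In the 4-cycle, the query vertices 0 and 2 are not adjacent, so a connector must add at least one more vertex. Adding vertex 1 gives a connector that scores lower than the whole graph.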

arXiv.org e-Print Archive

Crossref