Betweenness centrality is a classic measure that quantifies the importance of
a graph element (vertex or edge) according to the fraction of shortest paths
passing through it. This measure is notoriously expensive to compute, and the
best known algorithm runs in O(nm) time. The problems of efficiency and
scalability are exacerbated in a dynamic setting, where the input is an
evolving graph seen edge by edge, and the goal is to keep the betweenness
centrality up to date. In this paper we propose the first truly scalable
algorithm for online computation of betweenness centrality of both vertices and
edges in an evolving graph where new edges are added and existing edges are
removed. Our algorithm is carefully engineered with out-of-core techniques and
tailored for modern parallel stream processing engines that run on clusters of
shared-nothing commodity hardware. Hence, it is amenable to real-world
deployment. We experiment on graphs that are two orders of magnitude larger
than previous studies. Our method is able to keep the betweenness centrality
measures up to date online, i.e., the time to update the measures is smaller
than the inter-arrival time between two consecutive updates.Comment: 15 pages, 9 Figures, accepted for publication in IEEE Transactions on
Knowledge and Data Engineerin