Scalable Online Betweenness Centrality in Evolving Graphs
Betweenness centrality is a classic measure that quantifies the importance of
a graph element (vertex or edge) according to the fraction of shortest paths
passing through it. This measure is notoriously expensive to compute, and the
best known algorithm runs in O(nm) time. The problems of efficiency and
scalability are exacerbated in a dynamic setting, where the input is an
evolving graph seen edge by edge, and the goal is to keep the betweenness
centrality up to date. In this paper we propose the first truly scalable
algorithm for online computation of betweenness centrality of both vertices and
edges in an evolving graph where new edges are added and existing edges are
removed. Our algorithm is carefully engineered with out-of-core techniques and
tailored for modern parallel stream processing engines that run on clusters of
shared-nothing commodity hardware. Hence, it is amenable to real-world
deployment. We experiment on graphs that are two orders of magnitude larger
than previous studies. Our method is able to keep the betweenness centrality
measures up to date online, i.e., the time to update the measures is smaller
than the inter-arrival time between two consecutive updates.
Comment: 15 pages, 9 figures, accepted for publication in IEEE Transactions on Knowledge and Data Engineering.
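For context, the O(nm) baseline this abstract refers to is Brandes' algorithm for unweighted graphs. A minimal Python sketch (the adjacency-list representation and function name are our own, not the paper's) might look like:

```python
# Illustrative sketch of the classic O(nm) betweenness computation
# (Brandes-style) for unweighted, undirected graphs.
from collections import deque, defaultdict

def betweenness(adj):
    """adj: dict mapping each vertex to a list of its neighbours."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        dist = {s: 0}
        sigma = defaultdict(float); sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in reverse BFS order.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc  # halve the scores if each undirected pair should count once
```

The dynamic algorithm in the paper avoids rerunning this full computation on every edge update; the sketch only shows the static baseline it competes against.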
Fully-dynamic Approximation of Betweenness Centrality
Betweenness is a well-known centrality measure that ranks the nodes of a
network according to their participation in shortest paths. Since an exact
computation is prohibitive in large networks, several approximation algorithms
have been proposed. Besides that, recent years have seen the publication of
dynamic algorithms for efficient recomputation of betweenness in evolving
networks. In previous work we proposed the first semi-dynamic algorithms that
recompute an approximation of betweenness in connected graphs after batches of
edge insertions.
In this paper we propose the first fully-dynamic approximation algorithms
(for weighted and unweighted undirected graphs that need not be connected)
with a provable guarantee on the maximum approximation error. The transfer to
fully-dynamic and disconnected graphs implies additional algorithmic problems
that could be of independent interest. In particular, we propose a new upper
bound on the vertex diameter for weighted undirected graphs. For both weighted
and unweighted graphs, we also propose the first fully-dynamic algorithms that
keep track of such an upper bound. In addition, we extend our former algorithm
for semi-dynamic BFS to batches of both edge insertions and deletions.
Using approximation, our algorithms are the first to make in-memory
computation of betweenness in fully-dynamic networks with millions of edges
feasible. Our experiments show that they can achieve substantial speedups
compared to recomputation, up to several orders of magnitude.
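The core idea behind sampling-based betweenness approximation (which this line of work builds on) is to estimate each node's score from a modest number of randomly sampled shortest paths rather than all of them. A minimal, static sketch of that idea, with hypothetical names and no error guarantee machinery, could be:

```python
# Sketch of path-sampling betweenness approximation: sample (s, t) pairs,
# pick one shortest path uniformly at random, and credit its inner nodes.
import random
from collections import deque, defaultdict

def approx_betweenness(adj, samples=1000, seed=0):
    rng = random.Random(seed)
    nodes = list(adj)
    est = dict.fromkeys(nodes, 0.0)
    for _ in range(samples):
        s, t = rng.sample(nodes, 2)
        # Build the shortest-path DAG from s with path counts (sigma).
        dist = {s: 0}
        sigma = defaultdict(float); sigma[s] = 1.0
        preds = defaultdict(list)
        q = deque([s])
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        if t not in dist:
            continue  # disconnected pair contributes nothing
        # Walk back from t, choosing predecessors with probability
        # proportional to sigma: this picks a shortest path uniformly.
        v = t
        while v != s:
            p = rng.choices(preds[v], weights=[sigma[u] for u in preds[v]])[0]
            if p != s:
                est[p] += 1.0 / samples
            v = p
    return est  # estimates normalised betweenness over ordered pairs
```

The papers above additionally bound the approximation error (via quantities such as the vertex diameter) and repair the sampled paths incrementally after edge updates; this static sketch shows only the sampling step.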
Extraction and Analysis of Facebook Friendship Relations
Online Social Networks (OSNs) are a unique Web and social phenomenon, affecting the tastes and behaviors of their users and helping them to maintain/create friendships. It is interesting to analyze the growth and evolution of Online Social Networks both from a business viewpoint (marketing, new services) and from a scientific viewpoint, since their structure and evolution may share similarities with real-life social networks. In social sciences, several techniques for analyzing (online) social networks have been developed, to evaluate quantitative properties (e.g., defining metrics and measures of structural characteristics of the networks) or qualitative aspects (e.g., studying the attachment model for the network evolution, the binary trust relationships, and the link prediction problem).
However, OSN analysis poses novel challenges both to Computer and Social scientists. We present our long-term research effort in analyzing Facebook, the largest and arguably most successful OSN today: it gathers more than 500 million users. Access to data about Facebook users and their friendship relations is restricted; thus, we acquired the necessary information directly from the front-end of the Web site, in order to reconstruct a sub-graph representing anonymous interconnections among a significant subset of users. We describe our ad-hoc, privacy-compliant crawler for Facebook data extraction. To minimize bias, we adopt two different graph mining techniques: breadth-first search (BFS) and rejection sampling. To analyze the structural properties of samples consisting of millions of nodes, we developed a specific tool for analyzing quantitative and qualitative properties of social networks, adopting and improving existing Social Network Analysis (SNA) techniques and algorithms.
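The two sampling strategies mentioned in the abstract contrast as follows: BFS crawling discovers nodes by following edges (cheap, but biased toward high-degree nodes), while rejection sampling draws candidate user IDs uniformly from the ID space and discards invalid ones (unbiased, but wasteful). A toy sketch of both, with entirely hypothetical graph and ID-space inputs, might be:

```python
# Toy contrast of the two sampling strategies named in the abstract.
import random
from collections import deque

def bfs_sample(adj, seed_node, budget):
    """Crawl up to `budget` nodes breadth-first by following edges.
    Simple and cheap, but biased toward well-connected regions."""
    seen = {seed_node}
    q = deque([seed_node])
    while q and len(seen) < budget:
        v = q.popleft()
        for w in adj[v]:
            if w not in seen and len(seen) < budget:
                seen.add(w)
                q.append(w)
    return seen

def rejection_sample(id_space, is_valid, budget, rng=random):
    """Draw user IDs uniformly from [0, id_space) and keep the ones
    that exist -- unbiased, at the cost of many rejected draws."""
    kept = set()
    while len(kept) < budget:
        uid = rng.randrange(id_space)
        if is_valid(uid):
            kept.add(uid)
    return kept
```

In a real crawler, `adj[v]` would be a network request for v's friend list and `is_valid` a probe of whether a profile page exists; here both are stand-ins.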
Uncovering nodes that spread information between communities in social networks
From many datasets gathered in online social networks, well-defined community
structures have been observed. A large number of users participate in these
networks and the size of the resulting graphs poses computational challenges.
There is a particular demand in identifying the nodes responsible for
information flow between communities; for example, in temporal Twitter networks
edges between communities play a key role in propagating spikes of activity
when the connectivity between communities is sparse and few edges exist between
different clusters of nodes. The new algorithm proposed here is aimed at
revealing these key connections by measuring a node's vicinity to nodes of
another community. We look at the nodes that have edges in more than one
community, and at the locality of nodes around them, which influences the
information they receive and broadcast. The method relies on independent random walks
of a chosen fixed number of steps, originating from nodes with edges in more
than one community. For the large networks that we have in mind, existing
measures such as betweenness centrality are difficult to compute, even with
recent methods that approximate the large number of operations required. We
therefore design an algorithm that scales up to the demand of current big data
requirements and has the ability to harness parallel processing capabilities.
The new algorithm is illustrated on synthetic data, where results can be judged
carefully, and also on real, large-scale Twitter activity data, where new
insights can be gained.
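The mechanism this abstract describes — independent fixed-length random walks started from nodes that have edges in more than one community — can be sketched in a few lines. The function names, the `community` labelling, and the visit-count scoring below are our own illustrative choices, not the paper's exact method:

```python
# Sketch: fixed-length random walks from nodes straddling communities.
import random
from collections import Counter

def boundary_nodes(adj, community):
    """Nodes with at least one edge leaving their own community.
    `community` maps node -> community label (assumed to be given)."""
    return [v for v in adj
            if any(community[w] != community[v] for w in adj[v])]

def walk_visit_counts(adj, starts, steps, walks_per_start, seed=0):
    """Run `walks_per_start` independent random walks of a fixed number
    of `steps` from each start node and count how often each node is
    visited -- a cheap proxy for a node's vicinity to other communities."""
    rng = random.Random(seed)
    visits = Counter()
    for s in starts:
        for _ in range(walks_per_start):
            v = s
            for _ in range(steps):
                v = rng.choice(adj[v])
                visits[v] += 1
    return visits
```

Because each walk is independent and of fixed length, the walks can be distributed across workers with no coordination, which is the scalability property the abstract emphasises over betweenness-style all-pairs computations.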
Faster Betweenness Centrality Updates in Evolving Networks
Finding central nodes is a fundamental problem in network analysis.
Betweenness centrality is a well-known measure which quantifies the importance
of a node based on the fraction of shortest paths going through it. Due to the
dynamic nature of many of today's networks, algorithms that quickly update
centrality scores have become a necessity. For betweenness, several dynamic
algorithms have been proposed over the years, targeting different update types
(incremental- and decremental-only, fully-dynamic). In this paper we introduce
a new dynamic algorithm for updating betweenness centrality after an edge
insertion or an edge weight decrease. Our method is a combination of two
independent contributions: a faster algorithm for updating pairwise distances
as well as number of shortest paths, and a faster algorithm for updating
dependencies. Whereas the worst-case running time of our algorithm is the same
as recomputation, our techniques considerably reduce the number of operations
performed by existing dynamic betweenness algorithms.
Comment: Accepted at the 16th International Symposium on Experimental Algorithms (SEA 2017).
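To give a feel for the first of the two contributions — repairing distances after an edge insertion without recomputation — here is a simplified single-source sketch. It only touches vertices whose distance actually changes; the paper's method additionally repairs shortest-path counts and dependencies, which this hypothetical fragment omits:

```python
# Sketch: repair a single-source distance map after inserting one edge,
# relaxing only the affected region instead of rerunning a full BFS.
from collections import deque

def update_sssp_after_insertion(adj, dist, u, v):
    """Insert undirected edge (u, v) and repair `dist`, a distance map
    from a fixed source over the unweighted graph `adj`."""
    adj[u].append(v)
    adj[v].append(u)
    # Orient so that u is the endpoint closer to the source.
    if dist[v] < dist[u]:
        u, v = v, u
    if dist[u] + 1 >= dist[v]:
        return  # the new edge creates no shorter path
    dist[v] = dist[u] + 1
    q = deque([v])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if dist[x] + 1 < dist[y]:
                dist[y] = dist[x] + 1
                q.append(y)
```

When the new edge shortens few distances, the repair visits only a small neighbourhood, which is the source of the practical speedups these dynamic algorithms report even though their worst case matches recomputation.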