80,369 research outputs found
Fast Distributed PageRank Computation
Over the last decade, PageRank has gained importance in a wide range of
applications and domains, ever since it first proved to be effective in
determining node importance in large graphs (and was a pioneering idea behind
Google's search engine). In distributed computing alone, PageRank vector, or
more generally random walk based quantities have been used for several
different applications ranging from determining important nodes, load
balancing, search, and identifying connectivity structures. Surprisingly,
however, there has been little work towards designing provably efficient
fully-distributed algorithms for computing PageRank. The difficulty is that
traditional matrix-vector multiplication style iterative methods may not always
adapt well to the distributed setting owing to communication bandwidth
restrictions and convergence rates.
In this paper, we present fast random walk-based distributed algorithms for
computing PageRanks in general graphs and prove strong bounds on the round
complexity. We first present a distributed algorithm that takes O\big(\log
n/\eps \big) rounds with high probability on any graph (directed or
undirected), where is the network size and \eps is the reset probability
used in the PageRank computation (typically \eps is a fixed constant). We
then present a faster algorithm that takes O\big(\sqrt{\log n}/\eps \big)
rounds in undirected graphs. Both of the above algorithms are scalable, as each
node sends only small (\polylog n) number of bits over each edge per round.
To the best of our knowledge, these are the first fully distributed algorithms
for computing PageRank vector with provably efficient running time.Comment: 14 page
On Compact Routing for the Internet
While there exist compact routing schemes designed for grids, trees, and
Internet-like topologies that offer routing tables of sizes that scale
logarithmically with the network size, we demonstrate in this paper that in
view of recent results in compact routing research, such logarithmic scaling on
Internet-like topologies is fundamentally impossible in the presence of
topology dynamics or topology-independent (flat) addressing. We use analytic
arguments to show that the number of routing control messages per topology
change cannot scale better than linearly on Internet-like topologies. We also
employ simulations to confirm that logarithmic routing table size scaling gets
broken by topology-independent addressing, a cornerstone of popular
locator-identifier split proposals aiming at improving routing scaling in the
presence of network topology dynamics or host mobility. These pessimistic
findings lead us to the conclusion that a fundamental re-examination of
assumptions behind routing models and abstractions is needed in order to find a
routing architecture that would be able to scale ``indefinitely.''Comment: This is a significantly revised, journal version of cs/050802
Trip-Based Public Transit Routing Using Condensed Search Trees
We study the problem of planning Pareto-optimal journeys in public transit
networks. Most existing algorithms and speed-up techniques work by computing
subjourneys to intermediary stops until the destination is reached. In
contrast, the trip-based model focuses on trips and transfers between them,
constructing journeys as a sequence of trips. In this paper, we develop a
speed-up technique for this model inspired by principles behind existing
state-of-the-art speed-up techniques, Transfer Pattern and Hub Labelling. The
resulting algorithm allows us to compute Pareto-optimal (with respect to
arrival time and number of transfers) 24-hour profiles on very large real-world
networks in less than half a millisecond. Compared to the current state of the
art for bicriteria queries on public transit networks, this is up to two orders
of magnitude faster, while increasing preprocessing overhead by at most one
order of magnitude
- …