Fast Shortest Path Distance Estimation in Large Networks
We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well studied problem, but exact algorithms do not scale to huge graphs encountered on the web, social networks, and other applications.
In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks.
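The landmark idea described above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph stored as an adjacency dictionary and uses the triangle-inequality upper bound d(s, t) <= d(s, l) + d(l, t), taking the minimum over landmarks l. The names `bfs_distances`, `build_index`, and `estimate_distance` are hypothetical and are not taken from the paper, which may combine the precomputed distances differently.

```python
from collections import deque

def bfs_distances(graph, source):
    """Exact hop distances from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def build_index(graph, landmarks):
    """Offline step: store the distances from each landmark to all nodes."""
    return {l: bfs_distances(graph, l) for l in landmarks}

def estimate_distance(index, s, t):
    """Online step: smallest d(s, l) + d(l, t) over all landmarks l.

    This is an upper bound on the true distance, not the exact value.
    """
    best = float("inf")
    for dist in index.values():
        if s in dist and t in dist:
            best = min(best, dist[s] + dist[t])
    return best

# Toy graph (an assumption for illustration): path 0-1-2-3-4 with a leaf 5 on node 1.
graph = {0: [1], 1: [0, 2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}
index = build_index(graph, landmarks=[2])
print(estimate_distance(index, 0, 4))  # landmark 2 lies on the shortest path
print(estimate_distance(index, 0, 5))  # landmark 2 is off the shortest path
```

When a landmark lies on a shortest path between the two query nodes, the estimate is exact; otherwise it overestimates, which is why the choice of landmarks matters for accuracy.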
We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature which considers selecting landmarks at random.
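One possible reading of "central nodes that are also far away from one another" is sketched below. It assumes node degree as a cheap stand-in for centrality and a hop-based exclusion radius. Both choices, along with the function name `select_landmarks` and the parameter `min_hops`, are illustrative assumptions rather than the strategies evaluated in the paper.

```python
def select_landmarks(graph, k, min_hops=1):
    """Pick up to k high-degree nodes, skipping any node that lies
    within min_hops of an already chosen landmark."""
    # Highest degree first; ties keep the dictionary's insertion order.
    ranked = sorted(graph, key=lambda u: -len(graph[u]))
    chosen, blocked = [], set()
    for u in ranked:
        if len(chosen) == k:
            break
        if u in blocked:
            continue
        chosen.append(u)
        # Block u and every node within min_hops of it.
        blocked.add(u)
        frontier = {u}
        for _ in range(min_hops):
            frontier = {v for w in frontier for v in graph[w]} - blocked
            blocked |= frontier
    return chosen

# Toy graph (an assumption for illustration): path 0-1-2-3-4 with a leaf 5 on node 1.
graph = {0: [1], 1: [0, 2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}
print(select_landmarks(graph, k=2))              # spread-out selection
print(select_landmarks(graph, k=2, min_hops=0))  # plain top-degree selection
```

With `min_hops=0` the sketch reduces to picking the top-degree nodes, which corresponds to the simpler family of methods; a larger radius trades some centrality for better coverage of the graph.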
Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.
Yahoo! Research (internship
TopCom: Index for Shortest Distance Query in Directed Graph
Finding the shortest distance between two vertices in a graph is an important
problem due to its numerous applications in diverse domains, including
geo-spatial databases, social network analysis, and information retrieval.
Classical algorithms (such as Dijkstra's) solve this problem in polynomial time,
but these algorithms cannot provide real-time response for a large number of
bursty queries on a large graph. So, indexing-based solutions that pre-process
the graph to efficiently answer (exactly or approximately) a large number
of distance queries in real time are becoming increasingly popular. Existing
solutions have varying performance in terms of index size, index building time,
query time, and accuracy. In this work, we propose TopCom, a novel
indexing-based solution for exactly answering distance queries. Our experiments
with two of the existing state-of-the-art methods (IS-Label and TreeMap) show
the superiority of TopCom over these two methods in terms of scalability and
query time. In addition, the indexing of TopCom exploits the DAG (directed acyclic
graph) structure in the graph, which makes it significantly faster than the
existing methods if the SCCs (strongly connected components) of the input graph
are relatively small.
Distance Oracles for Time-Dependent Networks
We present the first approximate distance oracle for sparse directed networks
with time-dependent arc-travel-times determined by continuous, piecewise
linear, positive functions possessing the FIFO property.
Our approach precomputes $(1+\epsilon)$-approximate distance summaries from
selected landmark vertices to all other vertices in the network. Our oracle
uses subquadratic space and time preprocessing, and provides two sublinear-time
query algorithms that deliver constant and $(1+\sigma)$-approximate
shortest-travel-times, respectively, for arbitrary origin-destination pairs in
the network, for any constant $\sigma > \epsilon$. Our oracle is based only on
the sparsity of the network, along with two quite natural assumptions about
travel-time functions which allow the smooth transition towards asymmetric and
time-dependent distance metrics.
Comment: A preliminary version appeared as Technical Report ECOMPASS-TR-025 of
EU funded research project eCOMPASS (http://www.ecompass-project.eu/). An
extended abstract also appeared in the 41st International Colloquium on
Automata, Languages, and Programming (ICALP 2014, track-A
Compact Routing on Internet-Like Graphs
The Thorup-Zwick (TZ) routing scheme is the first generic stretch-3 routing
scheme delivering a nearly optimal local memory upper bound. Using both direct
analysis and simulation, we calculate the stretch distribution of this routing
scheme on random graphs with power-law node degree distributions,
$P_k \sim k^{-\gamma}$. We find that the average stretch is very low and virtually
independent of $\gamma$. In particular, for the Internet interdomain graph,
$\gamma \sim 2.1$, the average stretch is around 1.1, with up to 70% of paths
being shortest. As the network grows, the average stretch slowly decreases. The
routing table is very small, too. It is well below its upper bounds, and its
size is around 50 records for $10^4$-node networks. Furthermore, we find that
both the average shortest path length (i.e. distance) $\bar{d}$ and width of
the distance distribution $\sigma$ observed in the real Internet inter-AS graph
have values that are very close to the minimums of the average stretch in the
$\bar{d}$- and $\sigma$-directions. This leads us to the discovery of a unique
critical quasi-stationary point of the average TZ stretch as a function of
$\bar{d}$ and $\sigma$. The Internet distance distribution is located in a
close neighborhood of this point. This observation suggests the analytical
structure of the average stretch function may be an indirect indicator of some
hidden optimization criteria influencing the Internet's interdomain topology
evolution.
Comment: 29 pages, 16 figure