39,774 research outputs found
Fast Shortest Path Distance Estimation in Large Networks
We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well studied problem, but exact algorithms do not scale to huge graphs encountered on the web, social networks, and other applications.
In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks.
We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature which considers selecting landmarks at random.
Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.Yahoo! Research (internship
Using Incremental Many-to-One Queries to Build a Fast and Tight Heuristic for A* in Road Networks
We study exact, efficient, and practical algorithms for route planning applications in large road networks. On the one hand, such algorithms should be able to answer shortest path queries within milliseconds. On the other hand, routing applications often require integrating the current traffic situation, planning ahead with predictions for future traffic, respecting forbidden turns, and many other features depending on the specific application. Therefore, such algorithms must be flexible and able to support a variety of problem variants. In this work, we revisit the A* algorithm to build a simple, extensible, and unified algorithmic framework applicable to many route planning problems. A* has been previously used for routing in road networks. However, its performance was not competitive because no sufficiently fast and tight distance estimation function was available. We present a novel, efficient, and accurate A* heuristic using Contraction Hierarchies, another popular speedup technique. The core of our heuristic is a new Contraction Hierarchies query algorithm called Lazy RPHAST, which can efficiently compute shortest distances from many incrementally provided sources toward a common target. Additionally, we describe A* optimizations to accelerate the processing of low-degree vertices, which are typical in road networks, and present a new pruning criterion for symmetrical bidirectional A*. An extensive experimental study confirms the practicality of our approach for many applications
Fast approximation of centrality and distances in hyperbolic graphs
We show that the eccentricities (and thus the centrality indices) of all
vertices of a -hyperbolic graph can be computed in linear
time with an additive one-sided error of at most , i.e., after a
linear time preprocessing, for every vertex of one can compute in
time an estimate of its eccentricity such that
for a small constant . We
prove that every -hyperbolic graph has a shortest path tree,
constructible in linear time, such that for every vertex of ,
. These results are based on an
interesting monotonicity property of the eccentricity function of hyperbolic
graphs: the closer a vertex is to the center of , the smaller its
eccentricity is. We also show that the distance matrix of with an additive
one-sided error of at most can be computed in
time, where is a small constant. Recent empirical studies show that
many real-world graphs (including Internet application networks, web networks,
collaboration networks, social networks, biological networks, and others) have
small hyperbolicity. So, we analyze the performance of our algorithms for
approximating centrality and distance matrix on a number of real-world
networks. Our experimental results show that the obtained estimates are even
better than the theoretical bounds.Comment: arXiv admin note: text overlap with arXiv:1506.01799 by other author
TopCom: Index for Shortest Distance Query in Directed Graph
Finding shortest distance between two vertices in a graph is an important
problem due to its numerous applications in diverse domains, including
geo-spatial databases, social network analysis, and information retrieval.
Classical algorithms (such as, Dijkstra) solve this problem in polynomial time,
but these algorithms cannot provide real-time response for a large number of
bursty queries on a large graph. So, indexing based solutions that pre-process
the graph for efficiently answering (exactly or approximately) a large number
of distance queries in real-time is becoming increasingly popular. Existing
solutions have varying performance in terms of index size, index building time,
query time, and accuracy. In this work, we propose T OP C OM , a novel
indexing-based solution for exactly answering distance queries. Our experiments
with two of the existing state-of-the-art methods (IS-Label and TreeMap) show
the superiority of T OP C OM over these two methods considering scalability and
query time. Besides, indexing of T OP C OM exploits the DAG (directed acyclic
graph) structure in the graph, which makes it significantly faster than the
existing methods if the SCCs (strongly connected component) of the input graph
are relatively small
Discriminative Distance-Based Network Indices with Application to Link Prediction
In large networks, using the length of shortest paths as the distance measure
has shortcomings. A well-studied shortcoming is that extending it to
disconnected graphs and directed graphs is controversial. The second
shortcoming is that a huge number of vertices may have exactly the same score.
The third shortcoming is that in many applications, the distance between two
vertices not only depends on the length of shortest paths, but also on the
number of shortest paths. In this paper, first we develop a new distance
measure between vertices of a graph that yields discriminative distance-based
centrality indices. This measure is proportional to the length of shortest
paths and inversely proportional to the number of shortest paths. We present
algorithms for exact computation of the proposed discriminative indices.
Second, we develop randomized algorithms that precisely estimate average
discriminative path length and average discriminative eccentricity and show
that they give -approximations of these indices. Third, we
perform extensive experiments over several real-world networks from different
domains. In our experiments, we first show that compared to the traditional
indices, discriminative indices have usually much more discriminability. Then,
we show that our randomized algorithms can very precisely estimate average
discriminative path length and average discriminative eccentricity, using only
few samples. Then, we show that real-world networks have usually a tiny average
discriminative path length, bounded by a constant (e.g., 2). Fourth, in order
to better motivate the usefulness of our proposed distance measure, we present
a novel link prediction method, that uses discriminative distance to decide
which vertices are more likely to form a link in future, and show its superior
performance compared to the well-known existing measures
- …