39,774 research outputs found

    Fast Shortest Path Distance Estimation in Large Networks

    Full text link
    We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well studied problem, but exact algorithms do not scale to huge graphs encountered on the web, social networks, and other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature which considers selecting landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.Yahoo! Research (internship

    Using Incremental Many-to-One Queries to Build a Fast and Tight Heuristic for A* in Road Networks

    Get PDF
    We study exact, efficient, and practical algorithms for route planning applications in large road networks. On the one hand, such algorithms should be able to answer shortest path queries within milliseconds. On the other hand, routing applications often require integrating the current traffic situation, planning ahead with predictions for future traffic, respecting forbidden turns, and many other features depending on the specific application. Therefore, such algorithms must be flexible and able to support a variety of problem variants. In this work, we revisit the A* algorithm to build a simple, extensible, and unified algorithmic framework applicable to many route planning problems. A* has been previously used for routing in road networks. However, its performance was not competitive because no sufficiently fast and tight distance estimation function was available. We present a novel, efficient, and accurate A* heuristic using Contraction Hierarchies, another popular speedup technique. The core of our heuristic is a new Contraction Hierarchies query algorithm called Lazy RPHAST, which can efficiently compute shortest distances from many incrementally provided sources toward a common target. Additionally, we describe A* optimizations to accelerate the processing of low-degree vertices, which are typical in road networks, and present a new pruning criterion for symmetrical bidirectional A*. An extensive experimental study confirms the practicality of our approach for many applications

    Fast approximation of centrality and distances in hyperbolic graphs

    Full text link
    We show that the eccentricities (and thus the centrality indices) of all vertices of a δ\delta-hyperbolic graph G=(V,E)G=(V,E) can be computed in linear time with an additive one-sided error of at most cδc\delta, i.e., after a linear time preprocessing, for every vertex vv of GG one can compute in O(1)O(1) time an estimate e^(v)\hat{e}(v) of its eccentricity eccG(v)ecc_G(v) such that eccG(v)e^(v)eccG(v)+cδecc_G(v)\leq \hat{e}(v)\leq ecc_G(v)+ c\delta for a small constant cc. We prove that every δ\delta-hyperbolic graph GG has a shortest path tree, constructible in linear time, such that for every vertex vv of GG, eccG(v)eccT(v)eccG(v)+cδecc_G(v)\leq ecc_T(v)\leq ecc_G(v)+ c\delta. These results are based on an interesting monotonicity property of the eccentricity function of hyperbolic graphs: the closer a vertex is to the center of GG, the smaller its eccentricity is. We also show that the distance matrix of GG with an additive one-sided error of at most cδc'\delta can be computed in O(V2log2V)O(|V|^2\log^2|V|) time, where c<cc'< c is a small constant. Recent empirical studies show that many real-world graphs (including Internet application networks, web networks, collaboration networks, social networks, biological networks, and others) have small hyperbolicity. So, we analyze the performance of our algorithms for approximating centrality and distance matrix on a number of real-world networks. Our experimental results show that the obtained estimates are even better than the theoretical bounds.Comment: arXiv admin note: text overlap with arXiv:1506.01799 by other author

    TopCom: Index for Shortest Distance Query in Directed Graph

    Get PDF
    Finding shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as, Dijkstra) solve this problem in polynomial time, but these algorithms cannot provide real-time response for a large number of bursty queries on a large graph. So, indexing based solutions that pre-process the graph for efficiently answering (exactly or approximately) a large number of distance queries in real-time is becoming increasingly popular. Existing solutions have varying performance in terms of index size, index building time, query time, and accuracy. In this work, we propose T OP C OM , a novel indexing-based solution for exactly answering distance queries. Our experiments with two of the existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of T OP C OM over these two methods considering scalability and query time. Besides, indexing of T OP C OM exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected component) of the input graph are relatively small

    Discriminative Distance-Based Network Indices with Application to Link Prediction

    Full text link
    In large networks, using the length of shortest paths as the distance measure has shortcomings. A well-studied shortcoming is that extending it to disconnected graphs and directed graphs is controversial. The second shortcoming is that a huge number of vertices may have exactly the same score. The third shortcoming is that in many applications, the distance between two vertices not only depends on the length of shortest paths, but also on the number of shortest paths. In this paper, first we develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. Second, we develop randomized algorithms that precisely estimate average discriminative path length and average discriminative eccentricity and show that they give (ϵ,δ)(\epsilon,\delta)-approximations of these indices. Third, we perform extensive experiments over several real-world networks from different domains. In our experiments, we first show that compared to the traditional indices, discriminative indices have usually much more discriminability. Then, we show that our randomized algorithms can very precisely estimate average discriminative path length and average discriminative eccentricity, using only few samples. Then, we show that real-world networks have usually a tiny average discriminative path length, bounded by a constant (e.g., 2). Fourth, in order to better motivate the usefulness of our proposed distance measure, we present a novel link prediction method, that uses discriminative distance to decide which vertices are more likely to form a link in future, and show its superior performance compared to the well-known existing measures
    corecore