
    Fast Shortest Path Distance Estimation in Large Networks

    We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection. Yahoo! Research (internship
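
The landmark idea in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph stored as an adjacency dict, uses node degree as a stand-in for "central" nodes, and combines the precomputed distances through the triangle inequality. The function names (`top_degree_landmarks`, `build_index`, `estimate_bounds`) are hypothetical and are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def top_degree_landmarks(adj, k):
    """Assumed heuristic: pick the k highest-degree nodes as landmarks."""
    return sorted(adj, key=lambda u: -len(adj[u]))[:k]


def build_index(adj, landmarks):
    """Offline step: one BFS per landmark."""
    return {l: bfs_distances(adj, l) for l in landmarks}


def estimate_bounds(index, s, t):
    """Query step: triangle-inequality bounds on d(s, t).

    For each landmark l: |d(s,l) - d(t,l)| <= d(s,t) <= d(s,l) + d(l,t).
    """
    lower, upper = 0, float("inf")
    for dist in index.values():
        if s in dist and t in dist:
            lower = max(lower, abs(dist[s] - dist[t]))
            upper = min(upper, dist[s] + dist[t])
    return lower, upper


# Toy graph: path 0-1-2-3-4 plus a shortcut edge 1-3.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
landmarks = top_degree_landmarks(adj, 1)
index = build_index(adj, landmarks)
bounds = estimate_bounds(index, 0, 4)
```

In this toy graph the single landmark lies on a shortest path between nodes 0 and 4, so the upper bound equals the true distance. For pairs where no landmark lies on a shortest path, the bounds are looser, which is why the choice of landmarks matters.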

    TopCom: Index for Shortest Distance Query in Directed Graph

    Finding the shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as Dijkstra's) solve this problem in polynomial time, but these algorithms cannot provide real-time responses for a large number of bursty queries on a large graph. Indexing-based solutions, which pre-process the graph to efficiently answer (exactly or approximately) a large number of distance queries in real time, are therefore becoming increasingly popular. Existing solutions have varying performance in terms of index size, index building time, query time, and accuracy. In this work, we propose TopCom, a novel indexing-based solution for exactly answering distance queries. Our experiments with two of the existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of TopCom over these two methods in scalability and query time. In addition, the indexing of TopCom exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected components) of the input graph are relatively small.
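
The DAG structure mentioned in the abstract above comes from collapsing each strongly connected component into a single vertex. The sketch below shows only that condensation step, using Kosaraju's algorithm on an adjacency dict in which every vertex appears as a key. It is not TopCom's index, and the function names are hypothetical.

```python
def strongly_connected_components(adj):
    """Kosaraju's algorithm. Returns a dict mapping node -> component id."""
    # Pass 1: iterative DFS, recording nodes in order of finishing time.
    order, seen = [], set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(adj[root]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:
                # All neighbours explored: node is finished.
                order.append(node)
                stack.pop()

    # Build the reversed graph.
    radj = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            radj[v].append(u)

    # Pass 2: sweep the reversed graph in decreasing finishing time.
    comp, cid = {}, 0
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = cid
        stack = [root]
        while stack:
            u = stack.pop()
            for v in radj[u]:
                if v not in comp:
                    comp[v] = cid
                    stack.append(v)
        cid += 1
    return comp


def condense(adj):
    """Collapse each SCC to one vertex; the result is a DAG."""
    comp = strongly_connected_components(adj)
    dag = {c: set() for c in set(comp.values())}
    for u in adj:
        for v in adj[u]:
            if comp[u] != comp[v]:
                dag[comp[u]].add(comp[v])
    return comp, dag


# Toy directed graph: cycle a->b->c->a, cycle d<->e, and one arc c->d.
adj = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
comp, dag = condense(adj)
```

When the components are small, the condensed graph keeps most of the original structure while guaranteeing acyclicity, which is the situation the abstract describes as favourable.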

    Distance Oracles for Time-Dependent Networks

    We present the first approximate distance oracle for sparse directed networks with time-dependent arc-travel-times determined by continuous, piecewise linear, positive functions possessing the FIFO property. Our approach precomputes $(1+\epsilon)$-approximate distance summaries from selected landmark vertices to all other vertices in the network. Our oracle uses subquadratic space and time preprocessing, and provides two sublinear-time query algorithms that deliver constant and $(1+\sigma)$-approximate shortest-travel-times, respectively, for arbitrary origin-destination pairs in the network, for any constant $\sigma > \epsilon$. Our oracle is based only on the sparsity of the network, along with two quite natural assumptions about travel-time functions which allow the smooth transition towards asymmetric and time-dependent distance metrics. Comment: A preliminary version appeared as Technical Report ECOMPASS-TR-025 of EU funded research project eCOMPASS (http://www.ecompass-project.eu/). An extended abstract also appeared in the 41st International Colloquium on Automata, Languages, and Programming (ICALP 2014, track-A
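
To make the setting of the abstract above concrete, the sketch below evaluates continuous piecewise-linear travel-time functions and runs a time-dependent Dijkstra search, which is exact when every arc function is FIFO. This is the kind of baseline query an oracle aims to speed up, not the oracle from the paper. The breakpoint representation, the constant extension outside the breakpoint range, and all function names are assumptions made for illustration.

```python
import heapq
from bisect import bisect_right


def travel_time(breakpoints, t):
    """Evaluate a piecewise-linear function given as sorted (time, value) pairs.

    Assumption: the function is constant before the first and after the
    last breakpoint.
    """
    times = [p[0] for p in breakpoints]
    i = bisect_right(times, t)
    if i == 0:
        return breakpoints[0][1]
    if i == len(breakpoints):
        return breakpoints[-1][1]
    (t0, v0), (t1, v1) = breakpoints[i - 1], breakpoints[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def is_fifo(breakpoints):
    """FIFO holds when t + f(t) is non-decreasing, i.e. every slope >= -1."""
    return all(
        (v1 - v0) >= -(t1 - t0)
        for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:])
    )


def earliest_arrival(arcs, source, target, depart):
    """Time-dependent Dijkstra on arrival times (correct under FIFO)."""
    inf = float("inf")
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, inf):
            continue  # stale heap entry
        for v, bp in arcs.get(u, []):
            arrive = t + travel_time(bp, t)
            if arrive < best.get(v, inf):
                best[v] = arrive
                heapq.heappush(heap, (arrive, v))
    return inf


# Toy network: the direct arc A->B gets slower over time; A->C->B is a detour.
arcs = {
    "A": [("B", [(0, 10), (10, 20)]), ("C", [(0, 4)])],
    "C": [("B", [(0, 3), (10, 13)])],
}
arrival_early = earliest_arrival(arcs, "A", "B", 0)
arrival_late = earliest_arrival(arcs, "A", "B", 10)
```

In the toy network the best route changes with departure time: leaving at time 0 the direct arc wins, while leaving at time 10 the detour through C arrives first. This departure-time dependence is what prevents a static distance table from answering such queries.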

    Compact Routing on Internet-Like Graphs

    The Thorup-Zwick (TZ) routing scheme is the first generic stretch-3 routing scheme delivering a nearly optimal local memory upper bound. Using both direct analysis and simulation, we calculate the stretch distribution of this routing scheme on random graphs with power-law node degree distributions, $P_k \sim k^{-\gamma}$. We find that the average stretch is very low and virtually independent of $\gamma$. In particular, for the Internet interdomain graph, $\gamma \sim 2.1$, the average stretch is around 1.1, with up to 70% of paths being shortest. As the network grows, the average stretch slowly decreases. The routing table is very small, too. It is well below its upper bounds, and its size is around 50 records for $10^4$-node networks. Furthermore, we find that both the average shortest path length (i.e. distance) $\bar{d}$ and width of the distance distribution $\sigma$ observed in the real Internet inter-AS graph have values that are very close to the minimums of the average stretch in the $\bar{d}$- and $\sigma$-directions. This leads us to the discovery of a unique critical quasi-stationary point of the average TZ stretch as a function of $\bar{d}$ and $\sigma$. The Internet distance distribution is located in a close neighborhood of this point. This observation suggests the analytical structure of the average stretch function may be an indirect indicator of some hidden optimization criteria influencing the Internet's interdomain topology evolution. Comment: 29 pages, 16 figures