28 research outputs found

    Metric embeddings with relaxed guarantees

    We consider the problem of embedding finite metrics with slack: We seek to produce embeddings with small dimension and distortion while allowing a (small) constant fraction of all distances to be arbitrarily distorted. This definition is motivated by recent research in the networking community, which achieved striking empirical success at embedding Internet latencies with low distortion into low-dimensional Euclidean space, provided that some small slack is allowed. Answering an open question of Kleinberg, Slivkins, and Wexler [in Proceedings of the 45th IEEE Symposium on Foundations of Computer Science, 2004], we show that provable guarantees of this type can in fact be achieved in general: Any finite metric space can be embedded, with constant slack and constant distortion, into constant-dimensional Euclidean space. We then show that there exist stronger embeddings into ℓ_1 which exhibit gracefully degrading distortion: There is a single embedding into ℓ_1 that achieves distortion at most O(log 1/ε) on all but at most an ε fraction of distances simultaneously for all ε > 0. We extend this with distortion O(log 1/ε)^{1/p} to maps into general ℓ_p, p ≥ 1, for several classes of metrics, including those with bounded doubling dimension and those arising from the shortest-path metric of a graph with an excluded minor. Finally, we show that many of our constructions are tight and give a general technique to obtain lower bounds for ε-slack embeddings from lower bounds for low-distortion embeddings. © 2009 Society for Industrial and Applied Mathematics.
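    To make the notion of slack concrete, the following is a minimal sketch (not the paper's construction) that measures the distortion of a given embedding once an ε fraction of the pairs may be ignored. The function name `slack_distortion`, the toy metric, and the sliding-window search over sorted stretch ratios are all assumptions made for illustration; the sketch also assumes the embedding maps distinct points to distinct images.

```python
import math
from itertools import combinations


def slack_distortion(points, dist, embedded, eps):
    """Smallest distortion achievable after discarding at most an eps
    fraction of the pairs (illustrative helper, not from the paper).

    points   -- list of point identifiers
    dist     -- function (a, b) -> original distance, positive for a != b
    embedded -- dict mapping each point to a coordinate tuple
    eps      -- fraction of pairs that may be arbitrarily distorted
    """
    # Stretch ratio of every pair: embedded distance / original distance.
    ratios = sorted(
        math.dist(embedded[a], embedded[b]) / dist(a, b)
        for a, b in combinations(points, 2)
    )
    # Number of pairs that must still be handled well.
    keep = len(ratios) - math.floor(eps * len(ratios))
    # Rescaling the embedding is free, so the distortion of a set of pairs
    # is (largest ratio) / (smallest ratio). The best set of a given size
    # is a contiguous window of the sorted ratios.
    return min(
        ratios[i + keep - 1] / ratios[i]
        for i in range(len(ratios) - keep + 1)
    )


# Toy example: four points on a line, with the last one embedded far away.
points = [0, 1, 2, 3]
line_metric = lambda a, b: abs(a - b)
embedded = {0: (0.0,), 1: (1.0,), 2: (2.0,), 3: (10.0,)}

no_slack = slack_distortion(points, line_metric, embedded, eps=0.0)
half_slack = slack_distortion(points, line_metric, embedded, eps=0.5)
print(no_slack, half_slack)
```

    In the toy example the three pairs involving the displaced point carry all of the distortion, so allowing half of the pairs to be ignored brings the measured distortion down to that of an isometry.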

    Non-Metric Coordinates for Predicting Network Proximity

    On Compact Routing for the Internet

    While there exist compact routing schemes designed for grids, trees, and Internet-like topologies that offer routing tables of sizes that scale logarithmically with the network size, we demonstrate in this paper that in view of recent results in compact routing research, such logarithmic scaling on Internet-like topologies is fundamentally impossible in the presence of topology dynamics or topology-independent (flat) addressing. We use analytic arguments to show that the number of routing control messages per topology change cannot scale better than linearly on Internet-like topologies. We also employ simulations to confirm that logarithmic routing table size scaling gets broken by topology-independent addressing, a cornerstone of popular locator-identifier split proposals aiming at improving routing scaling in the presence of network topology dynamics or host mobility. These pessimistic findings lead us to the conclusion that a fundamental re-examination of assumptions behind routing models and abstractions is needed in order to find a routing architecture that would be able to scale "indefinitely." Comment: This is a significantly revised, journal version of cs/050802

    Fast Shortest Path Distance Estimation in Large Networks

    We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection. Yahoo! Research (internship
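    The landmark idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is unweighted and undirected, landmarks are picked by highest degree as a simple stand-in for "central nodes", and the precomputed distances are combined through the triangle inequality. All function names (`bfs_distances`, `pick_landmarks_by_degree`, `build_index`, `estimate_bounds`) are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def pick_landmarks_by_degree(graph, k):
    """Assumed heuristic: take the k highest-degree nodes as landmarks."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]


def build_index(graph, landmarks):
    """Offline step: one BFS per landmark, stored as landmark -> distances."""
    return {landmark: bfs_distances(graph, landmark) for landmark in landmarks}


def estimate_bounds(index, u, v):
    """Online step: bound d(u, v) using only landmark distances.

    Triangle inequality gives, for every landmark l,
        |d(u, l) - d(v, l)| <= d(u, v) <= d(u, l) + d(l, v).
    """
    lower, upper = 0, float("inf")
    for dist in index.values():
        if u in dist and v in dist:
            upper = min(upper, dist[u] + dist[v])
            lower = max(lower, abs(dist[u] - dist[v]))
    return lower, upper


# Toy graph: path 0-1-2-3-4 with an extra leaf 5 attached to node 2.
graph = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
landmarks = pick_landmarks_by_degree(graph, 1)
index = build_index(graph, landmarks)
print(landmarks, estimate_bounds(index, 0, 4), estimate_bounds(index, 0, 1))
```

    The upper bound is exact whenever some landmark lies on a shortest path between the two query nodes, which is one intuition for why central landmarks tend to help; for nearby pairs on the same side of a landmark the upper bound can overestimate, as the pair (0, 1) in the toy graph shows.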

    Embedding Metrics into Ultrametrics and Graphs into Spanning Trees with Constant Average Distortion

    This paper addresses the basic question of how well a tree can approximate distances of a metric space or a graph. Given a graph, the problem of constructing a spanning tree in a graph which strongly preserves distances in the graph is a fundamental problem in network design. We present scaling distortion embeddings where the distortion scales as a function of $\epsilon$, with the guarantee that for each $\epsilon$ the distortion of a fraction $1-\epsilon$ of all pairs is bounded accordingly. Such a bound implies, in particular, that the average distortion and $\ell_q$-distortions are small. Specifically, our embeddings have constant average distortion and $O(\sqrt{\log n})$ $\ell_2$-distortion. This follows from the following results: we prove that any metric space embeds into an ultrametric with scaling distortion $O(\sqrt{1/\epsilon})$. For the graph setting we prove that any weighted graph contains a spanning tree with scaling distortion $O(\sqrt{1/\epsilon})$. These bounds are tight even for embedding in arbitrary trees. For probabilistic embedding into spanning trees we prove a scaling distortion of $\tilde{O}(\log^2 (1/\epsilon))$, which implies constant $\ell_q$-distortion for every fixed $q<\infty$. Comment: Extended abstract appears in SODA 200

    Approximating Approximate Distance Oracles

    Given a finite metric space (V,d), an approximate distance oracle is a data structure which, when queried on two points u,v in V, returns an approximation to the actual distance between u and v which is within some bounded stretch factor of the true distance. There has been significant work on the tradeoff between the important parameters of approximate distance oracles (and in particular between the size, stretch, and query time), but in this paper we take a different point of view, that of per-instance optimization. If we are given a particular input metric space and stretch bound, can we find the smallest possible approximate distance oracle for that particular input? Since this question is not even well-defined, we restrict our attention to well-known classes of approximate distance oracles, and study whether we can optimize over those classes. In particular, we give an O(log n)-approximation to the problem of finding the smallest stretch 3 Thorup-Zwick distance oracle, as well as the problem of finding the smallest Pătraşcu-Roditty distance oracle. We also prove a matching Ω(log n) lower bound for both problems, and an Ω(n^{1/k - 1/2^{k-1}}) integrality gap for the more general stretch (2k-1) Thorup-Zwick distance oracle. We also consider the problem of approximating the best TZ or PR approximate distance oracle with outliers, and show that more advanced techniques (SDP relaxations in particular) allow us to optimize even in the presence of outliers.