    Probabilistic Analysis of Optimization Problems on Generalized Random Shortest Path Metrics

    Simple heuristics often show a remarkable performance in practice for optimization problems. Worst-case analysis often falls short of explaining this performance. Because of this, "beyond worst-case analysis" of algorithms has recently gained a lot of attention, including probabilistic analysis of algorithms. The instances of many optimization problems are essentially a discrete metric space. Probabilistic analysis for such metric optimization problems has nevertheless mostly been conducted on instances drawn from Euclidean space, which provides a structure that is usually heavily exploited in the analysis. However, most instances from practice are not Euclidean. Little work has been done on metric instances drawn from other, more realistic, distributions. Some initial results have been obtained by Bringmann et al. (Algorithmica, 2013), who have used random shortest path metrics on complete graphs to analyze heuristics. The goal of this paper is to generalize these findings to non-complete graphs, especially Erdős-Rényi random graphs. A random shortest path metric is constructed by drawing independent random edge weights for each edge in the graph and setting the distance between every pair of vertices to the length of a shortest path between them with respect to the drawn weights. For such instances, we prove that the greedy heuristic for the minimum distance maximum matching problem, the nearest neighbor and insertion heuristics for the traveling salesman problem, and a trivial heuristic for the $k$-median problem all achieve a constant expected approximation ratio. Additionally, we show a polynomial upper bound for the expected number of iterations of the 2-opt heuristic for the traveling salesman problem. Comment: An extended abstract appeared in the proceedings of WALCOM 201
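The construction described in this abstract can be sketched in a few lines. The sketch below is illustrative only: the function name `random_shortest_path_metric`, the use of exponential edge weights with rate 1, and the choice of Floyd-Warshall for the all-pairs shortest paths are assumptions made for the example, not details taken from the abstract.

```python
import random


def random_shortest_path_metric(n, p, seed=None):
    """Return an n x n matrix of shortest-path distances.

    Assumptions (not from the abstract): each possible edge is present
    independently with probability p (Erdős-Rényi model), and each present
    edge gets an independent exponential weight with rate 1.
    Pairs of vertices with no connecting path keep distance infinity.
    """
    rng = random.Random(seed)
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]

    # Draw the random graph and its independent edge weights.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                w = rng.expovariate(1.0)
                dist[i][j] = dist[j][i] = w

    # Floyd-Warshall: replace each weight by the shortest-path length.
    for k in range(n):
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == inf:
                continue
            row_i = dist[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist
```

Calling `random_shortest_path_metric(8, 1.0, seed=1)` uses the complete graph on 8 vertices, which corresponds to the earlier complete-graph setting; smaller values of `p` give the non-complete case, where some distances may be infinite if the graph is disconnected.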

    Probabilistic Analysis of Facility Location on Random Shortest Path Metrics

    The facility location problem is an NP-hard optimization problem. Therefore, approximation algorithms are often used to solve large instances. Such algorithms often perform much better than worst-case analysis suggests. Therefore, probabilistic analysis is a widely used tool to analyze such algorithms. Most research on probabilistic analysis of NP-hard optimization problems involving metric spaces, such as the facility location problem, has focused on Euclidean instances; instances with independent (random) edge lengths, which are non-metric, have also been studied. We would like to extend this knowledge to other, more general, metrics. We investigate the facility location problem using random shortest path metrics. We analyze some probabilistic properties for a simple greedy heuristic which gives a solution to the facility location problem: opening the $\kappa$ cheapest facilities (with $\kappa$ only depending on the facility opening costs). If the facility opening costs are such that $\kappa$ is not too large, then we show that this heuristic is asymptotically optimal. On the other hand, for large values of $\kappa$, the analysis becomes more difficult, and we provide a closed-form expression as an upper bound for the expected approximation ratio. In the special case where all facility opening costs are equal, this closed-form expression reduces to $O(\sqrt[4]{\ln(n)})$ or $O(1)$ or even $1+o(1)$ if the opening costs are sufficiently small. Comment: A preliminary version accepted to CiE 201
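The greedy heuristic mentioned in this abstract, opening the κ cheapest facilities, can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `cheapest_facilities_cost` is hypothetical, every vertex is assumed to be both a client and a potential facility, and κ is passed in as a parameter because the abstract does not say how it is derived from the opening costs.

```python
def cheapest_facilities_cost(dist, opening_costs, kappa):
    """Open the kappa cheapest facilities and return (opened, total_cost).

    dist[i][j]       -- metric distance between vertices i and j
    opening_costs[i] -- cost of opening a facility at vertex i
    kappa            -- number of facilities to open (an input here; the
                        abstract only says it depends on the opening costs)

    Assumption: every vertex is a client and connects to its nearest
    opened facility, so the total cost is opening cost plus the sum of
    these connection distances.
    """
    n = len(opening_costs)
    if not 1 <= kappa <= n:
        raise ValueError("kappa must be between 1 and n")

    # Sort vertices by opening cost and keep the kappa cheapest ones.
    order = sorted(range(n), key=lambda i: opening_costs[i])
    opened = order[:kappa]

    opening = sum(opening_costs[f] for f in opened)
    connection = sum(min(dist[v][f] for f in opened) for v in range(n))
    return opened, opening + connection
```

Evaluating this cost for several values of `kappa` on the same instance gives a quick way to see the trade-off between opening costs and connection costs that the analysis is about.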

    Probabilistic Analyses of Combinatorial Optimization Problems on Random Shortest Path Metrics

    Simple heuristics for combinatorial optimization problems often show a remarkable performance in practice. Worst-case analysis often falls short of explaining this performance. Because of this, ‘beyond worst-case analysis’ of algorithms has recently gained a lot of attention, including probabilistic analysis of algorithms. The instances of many combinatorial optimization problems are essentially a discrete metric space. Probabilistic analysis for such metric optimization problems has nevertheless mostly been conducted on instances drawn from Euclidean space, which provides a structure that is usually heavily exploited in the analysis. However, most instances from practice are not Euclidean. Little work has been done on metric instances drawn from other, more realistic, distributions. Some initial results have been obtained by Bringmann et al. (Algorithmica, 2015), who have used random shortest path metrics generated from complete graphs to analyse heuristics. In this thesis we look at several variations of the random shortest path metrics, and perform probabilistic analyses for some simple heuristics for several combinatorial optimization problems on these random metric spaces. A random shortest path metric is constructed by drawing independent random edge weights for each edge in a graph and setting the distance between every pair of vertices to the length of a shortest path between them, with respect to the drawn weights. We provide some basic properties of the distances between vertices in random shortest path metrics. Using these properties, we perform several probabilistic analyses. For random shortest path metrics generated from (dense) Erdős-Rényi random graphs we show that the greedy heuristic for the minimum-distance perfect matching problem, the nearest neighbor and insertion heuristics for the traveling salesman problem, and a trivial heuristic for the k-median problem all achieve a constant expected approximation ratio. Additionally, we show a polynomial upper bound for the expected number of iterations of the 2-opt heuristic for the traveling salesman problem in this model. For random shortest path metrics generated from sparse graphs we show that the greedy heuristic for the minimum-distance perfect matching problem, and the nearest neighbor and insertion heuristics for the traveling salesman problem all achieve a constant expected approximation ratio. Additionally, we show that the 2-opt heuristic for the traveling salesman problem also achieves a constant expected approximation ratio in this model. For random shortest path metrics generated from complete graphs we analyse a simple greedy heuristic for the facility location problem: opening the κ cheapest facilities (with κ only depending on the facility opening costs). If the facility opening costs are such that κ is not too large, then we show that this heuristic is asymptotically optimal. For large values of κ we provide a closed-form expression as an upper bound for the expected approximation ratio and we evaluate this expression for the special case where all facility opening costs are equal. Moreover, we show in this model that a simple 2-approximation algorithm for the Steiner tree problem is asymptotically optimal as long as the number of terminals is not too large. We also present some numerical results that imply that the 2-opt heuristic for the traveling salesman problem seems to perform rather poorly in this model.

    The path inference filter: model-based low-latency map matching of probe vehicle data

    We consider the problem of reconstructing vehicle trajectories from sparse sequences of GPS points, for which the sampling interval is between 10 seconds and 2 minutes. We introduce a new class of algorithms, collectively called the path inference filter (PIF), that maps GPS data in real time, for a variety of trade-offs and scenarios, and with a high throughput. Numerous prior approaches in map-matching can be shown to be special cases of the path inference filter presented in this article. We present an efficient procedure for automatically training the filter on new data, with or without ground truth observations. The framework is evaluated on a large San Francisco taxi dataset and is shown to improve upon the current state of the art. This filter also provides insights about driving patterns of drivers. The path inference filter has been deployed at an industrial scale inside the Mobile Millennium traffic information system, and is used to map fleets of data in San Francisco, Sacramento, Stockholm and Porto. Comment: Preprint, 23 pages and 23 figures

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ONE

    Topology Discovery of Sparse Random Graphs With Few Participants

    We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain end-to-end measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sub-linear edit-distance guarantee using a sub-linear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sub-linear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of end-to-end information along all the paths between the participants. Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is scheduled to appear in J. on Random Structures and Algorithms

    Metric Embedding via Shortest Path Decompositions

    We study the problem of embedding shortest-path metrics of weighted graphs into $\ell_p$ spaces. We introduce a new embedding technique based on low-depth decompositions of a graph via shortest paths. The notion of shortest path decomposition (SPD) depth is defined inductively: a (weighted) path graph has SPD depth $1$, and a general graph has SPD depth $k$ if it contains a shortest path whose deletion leads to a graph, each of whose components has SPD depth at most $k-1$. In this paper we give an $O(k^{\min\{\frac{1}{p},\frac{1}{2}\}})$-distortion embedding for graphs of SPD depth at most $k$. This result is asymptotically tight for any fixed $p>1$, while for $p=1$ it is tight up to second order terms. As a corollary of this result, we show that graphs having pathwidth $k$ embed into $\ell_p$ with distortion $O(k^{\min\{\frac{1}{p},\frac{1}{2}\}})$. For $p=1$, this improves over the best previous bound of Lee and Sidiropoulos that was exponential in $k$; moreover, for other values of $p$ it gives the first embeddings whose distortion is independent of the graph size $n$. Furthermore, we use the fact that planar graphs have SPD depth $O(\log n)$ to give a new proof that any planar graph embeds into $\ell_1$ with distortion $O(\sqrt{\log n})$. Our approach also gives new results for graphs with bounded treewidth, and for graphs excluding a fixed minor.