Probabilistic Analysis of Optimization Problems on Generalized Random Shortest Path Metrics
Simple heuristics often show a remarkable performance in practice for
optimization problems. Worst-case analysis often falls short of explaining this
performance. Because of this, "beyond worst-case analysis" of algorithms has
recently gained a lot of attention, including probabilistic analysis of
algorithms.
The instances of many optimization problems are essentially a discrete metric
space. Probabilistic analysis for such metric optimization problems has
nevertheless mostly been conducted on instances drawn from Euclidean space,
which provides a structure that is usually heavily exploited in the analysis.
However, most instances from practice are not Euclidean. Little work has been
done on metric instances drawn from other, more realistic, distributions. Some
initial results have been obtained by Bringmann et al. (Algorithmica, 2013),
who have used random shortest path metrics on complete graphs to analyze
heuristics.
The goal of this paper is to generalize these findings to non-complete
graphs, especially Erdős-Rényi random graphs. A random shortest path
metric is constructed by drawing independent random edge weights for each edge
in the graph and setting the distance between every pair of vertices to the
length of a shortest path between them with respect to the drawn weights. For
such instances, we prove that the greedy heuristic for the minimum distance
maximum matching problem, the nearest neighbor and insertion heuristics for the
traveling salesman problem, and a trivial heuristic for the $k$-median problem
all achieve a constant expected approximation ratio. Additionally, we show a
polynomial upper bound for the expected number of iterations of the 2-opt
heuristic for the traveling salesman problem.
Comment: An extended abstract appeared in the proceedings of WALCOM 201
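The construction of a random shortest path metric described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `random_shortest_path_metric`, the edge probability `p`, and the choice of exponential edge weights with rate 1 are assumptions made here, since the abstract only says that the weights are independent and random. Pairs of vertices in different components keep distance infinity.

```python
import random


def random_shortest_path_metric(n, p, seed=0):
    """Distance matrix of a random shortest path metric on an Erdős-Rényi graph.

    Assumption: edge weights are i.i.d. exponential with rate 1.
    """
    rng = random.Random(seed)
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]

    # Each possible edge is present with probability p and gets a random weight.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                w = rng.expovariate(1.0)
                d[i][j] = d[j][i] = w

    # Floyd-Warshall: the distance is the length of a shortest path
    # with respect to the drawn weights.
    for k in range(n):
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]
    return d
```

With `p = 1.0` every edge is present, which corresponds to the complete-graph setting; smaller values of `p` give the non-complete case.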
Probabilistic Analysis of Facility Location on Random Shortest Path Metrics
The facility location problem is an NP-hard optimization problem. Therefore,
approximation algorithms are often used to solve large instances. Such
algorithms often perform much better than worst-case analysis suggests.
Therefore, probabilistic analysis is a widely used tool to analyze such
algorithms. Most research on probabilistic analysis of NP-hard optimization
problems involving metric spaces, such as the facility location problem, has
been focused on Euclidean instances, and also instances with independent
(random) edge lengths, which are non-metric, have been researched. We would
like to extend this knowledge to other, more general, metrics.
We investigate the facility location problem using random shortest path
metrics. We analyze some probabilistic properties for a simple greedy heuristic
which gives a solution to the facility location problem: opening the $\kappa$
cheapest facilities (with $\kappa$ only depending on the facility opening
costs). If the facility opening costs are such that $\kappa$ is not too large,
then we show that this heuristic is asymptotically optimal. On the other hand,
for large values of $\kappa$, the analysis becomes more difficult, and we
provide a closed-form expression as upper bound for the expected approximation
ratio. In the special case where all facility opening costs are equal this
closed-form expression reduces to $O(\sqrt[4]{\ln(n)})$ or $O(1)$ or even $1+o(1)$
if the opening costs are sufficiently small.
Comment: A preliminary version accepted to CiE 201
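The greedy heuristic from the abstract above (open the cheapest facilities, then connect every vertex to its nearest open facility) can be sketched as below. This is a hedged illustration only: the function name `open_cheapest_facilities` is hypothetical, the number of facilities to open is passed in as `kappa` rather than derived from the opening costs (the abstract does not state the rule), and every vertex is assumed to be both a client and a potential facility.

```python
def open_cheapest_facilities(dist, opening_costs, kappa):
    """Open the kappa cheapest facilities and return (opened, total_cost).

    dist is a symmetric distance matrix; opening_costs[i] is the cost of
    opening a facility at vertex i. The choice of kappa is an input here.
    """
    # Sort vertices by opening cost and open the kappa cheapest ones.
    order = sorted(range(len(opening_costs)), key=lambda i: opening_costs[i])
    opened = order[:kappa]

    opening = sum(opening_costs[i] for i in opened)
    # Each vertex connects to its nearest open facility.
    connection = sum(min(dist[v][f] for f in opened) for v in range(len(dist)))
    return opened, opening + connection
```

The total cost is the sum of the opening costs and the connection costs, which is the objective the approximation ratio refers to.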
Probabilistic Analyses of Combinatorial Optimization Problems on Random Shortest Path Metrics
Simple heuristics for combinatorial optimization problems often show a remarkable performance in practice. Worst-case analysis often falls short of explaining this performance. Because of this, ‘beyond worst-case analysis’ of algorithms has recently gained a lot of attention, including probabilistic analysis of algorithms.
The instances of many combinatorial optimization problems are essentially a discrete metric space. Probabilistic analysis for such metric optimization problems has nevertheless mostly been conducted on instances drawn from Euclidean space, which provides a structure that is usually heavily exploited in the analysis. However, most instances from practice are not Euclidean. Little work has been done on metric instances drawn from other, more realistic, distributions. Some initial results have been obtained by Bringmann et al. (Algorithmica, 2015), who have used random shortest path metrics generated from complete graphs to analyse heuristics.
In this thesis we look at several variations of the random shortest path metrics, and perform probabilistic analyses for some simple heuristics for several combinatorial optimization problems on these random metric spaces. A random shortest path metric is constructed by drawing independent random edge weights for each edge in a graph and setting the distance between every pair of vertices to the length of a shortest path between them, with respect to the drawn weights.
We provide some basic properties of the distances between vertices in random shortest path metrics. Using these properties, we perform several probabilistic analyses. For random shortest path metrics generated from (dense) Erdős-Rényi random graphs we show that the greedy heuristic for the minimum-distance perfect matching problem, the nearest neighbor and insertion heuristics for the traveling salesman problem, and a trivial heuristic for the k-median problem all achieve a constant expected approximation ratio. Additionally, we show a polynomial upper bound for the expected number of iterations of the 2-opt heuristic for the traveling salesman problem in this model.
For random shortest path metrics generated from sparse graphs we show that the greedy heuristic for the minimum-distance perfect matching problem, and the nearest neighbor and insertion heuristics for the traveling salesman problem all achieve a constant expected approximation ratio. Additionally, we show that the 2-opt heuristic for the traveling salesman problem also achieves a constant expected approximation ratio in this model. For random shortest path metrics generated from complete graphs we analyse a simple greedy heuristic for the facility location problem: opening the κ cheapest facilities (with κ only depending on the facility opening costs). If the facility opening costs are such that κ is not too large, then we show that this heuristic is asymptotically optimal. For large values of κ we provide a closed-form expression as upper bound for the expected approximation ratio and we evaluate this expression for the special case where all facility opening costs are equal.
Moreover, we show in this model that a simple 2-approximation algorithm for the Steiner tree problem is asymptotically optimal as long as the number of terminals is not too large. We also present some numerical results that imply that the 2-opt heuristic for the traveling salesman problem seems to perform rather poorly in this model.
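For readers unfamiliar with the nearest neighbor heuristic for the traveling salesman problem mentioned in the abstract above, a minimal sketch follows. The function name `nearest_neighbor_tour` and the tie-breaking rule (smallest vertex index) are assumptions made for this illustration; the thesis may define the heuristic with different details.

```python
def nearest_neighbor_tour(dist, start=0):
    """Build a tour by always moving to the closest unvisited vertex.

    dist is a symmetric distance matrix. Returns (tour, length), where
    length includes the closing edge back to the start vertex.
    """
    n = len(dist)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        cur = tour[-1]
        # Ties are broken by vertex index (an assumption of this sketch).
        nxt = min(unvisited, key=lambda v: (dist[cur][v], v))
        tour.append(nxt)
        unvisited.remove(nxt)
    length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return tour, length
```

The expected approximation ratio discussed in the abstract compares the length returned by such a heuristic with the length of an optimal tour.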
The path inference filter: model-based low-latency map matching of probe vehicle data
We consider the problem of reconstructing vehicle trajectories from sparse
sequences of GPS points, for which the sampling interval is between 10 seconds
and 2 minutes. We introduce a new class of algorithms, called altogether path
inference filter (PIF), that maps GPS data in real time, for a variety of
trade-offs and scenarios, and with a high throughput. Numerous prior approaches
in map-matching can be shown to be special cases of the path inference filter
presented in this article. We present an efficient procedure for automatically
training the filter on new data, with or without ground truth observations. The
framework is evaluated on a large San Francisco taxi dataset and is shown to
improve upon the current state of the art. This filter also provides insights
about driving patterns of drivers. The path inference filter has been deployed
at an industrial scale inside the Mobile Millennium traffic information system,
and is used to map fleets of data in San Francisco, Sacramento, Stockholm and
Porto.
Comment: Preprint, 23 pages and 23 figure
Evolutionary distances in the twilight zone -- a rational kernel approach
Phylogenetic tree reconstruction is traditionally based on multiple sequence
alignments (MSAs) and heavily depends on the validity of this information
bottleneck. With increasing sequence divergence, the quality of MSAs decays
quickly. Alignment-free methods, on the other hand, are based on abstract
string comparisons and avoid potential alignment problems. However, in general
they are not biologically motivated and ignore our knowledge about the
evolution of sequences. Thus, it is still a major open question how to define
an evolutionary distance metric between divergent sequences that makes use of
indel information and known substitution models without the need for a multiple
alignment. Here we propose a new evolutionary distance metric to close this
gap. It uses finite-state transducers to create a biologically motivated
similarity score which models substitutions and indels, and does not depend on
a multiple sequence alignment. The sequence similarity score is defined in
analogy to pairwise alignments and additionally has the positive semi-definite
property. We describe its derivation and show in simulation studies and
real-world examples that it is more accurate in reconstructing phylogenies than
competing methods. The result is a new and accurate way of determining
evolutionary distances in and beyond the twilight zone of sequence alignments
that is suitable for large datasets.
Comment: to appear in PLoS ON
Topology Discovery of Sparse Random Graphs With Few Participants
We consider the task of topology discovery of sparse random graphs using
end-to-end random measurements (e.g., delay) between a subset of nodes,
referred to as the participants. The rest of the nodes are hidden, and do not
provide any information for topology discovery. We consider topology discovery
under two routing models: (a) the participants exchange messages along the
shortest paths and obtain end-to-end measurements, and (b) additionally, the
participants exchange messages along the second shortest path. For scenario
(a), our proposed algorithm results in a sub-linear edit-distance guarantee
using a sub-linear number of uniformly selected participants. For scenario (b),
we obtain a much stronger result, and show that we can achieve consistent
reconstruction when a sub-linear number of uniformly selected nodes
participate. This implies that accurate discovery of sparse random graphs is
tractable using an extremely small number of participants. We finally obtain a
lower bound on the number of participants required by any algorithm to
reconstruct the original random graph up to a given edit distance. We also
demonstrate that while consistent discovery is tractable for sparse random
graphs using a small number of participants, in general, there are graphs which
cannot be discovered by any algorithm even with a significant number of
participants, and with the availability of end-to-end information along all the
paths between the participants.
Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is
scheduled to appear in J. on Random Structures and Algorithm
Metric Embedding via Shortest Path Decompositions
We study the problem of embedding shortest-path metrics of weighted graphs
into $\ell_p$ spaces. We introduce a new embedding technique based on low-depth
decompositions of a graph via shortest paths. The notion of Shortest Path
Decomposition depth is inductively defined: A (weighted) path graph has shortest
path decomposition (SPD) depth $1$. A general graph has an SPD of depth $k$ if it
contains a shortest path whose deletion leads to a graph, each of whose
components has SPD depth at most $k-1$. In this paper we give an
$O(k^{\min\{1/p,1/2\}})$-distortion embedding for graphs of SPD
depth at most $k$. This result is asymptotically tight for any fixed $p>1$,
while for $p=1$ it is tight up to second order terms.
As a corollary of this result, we show that graphs having pathwidth $k$ embed
into $\ell_p$ with distortion $O(k^{\min\{1/p,1/2\}})$. For
$p=1$, this improves over the best previous bound of Lee and Sidiropoulos that
was exponential in $k$; moreover, for other values of $p$ it gives the first
embeddings whose distortion is independent of the graph size $n$. Furthermore,
we use the fact that planar graphs have SPD depth $O(\log n)$ to give a new
proof that any planar graph embeds into $\ell_1$ with distortion
$O(\sqrt{\log n})$. Our approach also gives new results for graphs with bounded treewidth,
and for graphs excluding a fixed minor.