Maximum Scatter TSP in Doubling Metrics
We study the problem of finding a tour of points in which every edge is
long. More precisely, we wish to find a tour that visits every point exactly
once, maximizing the length of the shortest edge in the tour. The problem is
known as Maximum Scatter TSP, and was introduced by Arkin et al. (SODA 1997),
motivated by applications in manufacturing and medical imaging. Arkin et al.
gave a 0.5-approximation for the metric version of the problem and showed
that this is the best possible ratio achievable in polynomial time (assuming
P ≠ NP). Arkin et al. raised the question of whether a better approximation
ratio can be obtained in the Euclidean plane.
We answer this question in the affirmative in a more general setting, by
giving a (1-ε)-approximation algorithm for d-dimensional doubling
metrics, with running time , where . As a corollary we obtain (i) an
efficient polynomial-time approximation scheme (EPTAS) for all constant
dimensions , (ii) a polynomial-time approximation scheme (PTAS) for
dimension , for a sufficiently large constant , and (iii)
a PTAS for constant and . Furthermore, we
show the dependence on in our approximation scheme to be essentially
optimal, unless Satisfiability can be solved in subexponential time.
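To make the objective concrete, here is a minimal brute-force sketch of Maximum Scatter TSP in the Euclidean plane. It is not the approximation scheme from the abstract: it simply enumerates all tours, so it is only usable for a handful of points. The function name `max_scatter_tour` and the sample points are illustrative assumptions.

```python
from itertools import permutations
from math import dist


def max_scatter_tour(points):
    """Return (bottleneck, tour) maximizing the shortest edge of a closed tour.

    Exhaustive search over all tours; exponential time, for illustration only.
    """
    n = len(points)
    best_value, best_tour = -1.0, None
    # Fix point 0 as the start so rotations of the same tour are not repeated.
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        shortest = min(
            dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
        )
        if shortest > best_value:
            best_value, best_tour = shortest, tour
    return best_value, best_tour


# Five collinear points: the middle point is within distance 2 of every other
# point, so no tour can have a shortest edge longer than 2.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
value, tour = max_scatter_tour(points)
print(value, tour)
```

On this input the optimum is 2, achieved for example by visiting the points in the order 0, 2, 4, 1, 3.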
Reordering Rows for Better Compression: Beyond the Lexicographic Order
Sorting database tables before compressing them improves the compression
rate. Can we do better than the lexicographical order? For minimizing the
number of runs in a run-length encoding compression scheme, the best approaches
to row-ordering are derived from traveling salesman heuristics, although there
is a significant trade-off between running time and compression. A new
heuristic, Multiple Lists, which is a variant on Nearest Neighbor that trades
off compression for a major running-time speedup, is a good option for very
large tables. However, for some compression schemes, it is more important to
generate long runs rather than few runs. For this case, another novel
heuristic, Vortex, is promising. We find that we can improve run-length
encoding up to a factor of 3 whereas we can improve prefix coding by up to 80%:
these gains are on top of the gains due to lexicographically sorting the table.
We prove that the new row reordering is optimal (within 10%) at minimizing the
runs of identical values within columns, in a few cases.
Comment: to appear in ACM TOD
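The run-counting objective can be illustrated with a small sketch. It compares the lexicographic order against a plain Nearest Neighbor ordering under Hamming distance; this is the textbook heuristic, not the Multiple Lists or Vortex heuristics named in the abstract, and the helper names and the toy table are assumptions made for illustration.

```python
def count_runs(rows):
    """Total number of runs of identical values, summed over all columns."""
    if not rows:
        return 0
    runs = len(rows[0])  # each column starts with one run
    for prev, cur in zip(rows, rows[1:]):
        # A new run starts in every column whose value changes.
        runs += sum(a != b for a, b in zip(prev, cur))
    return runs


def hamming(a, b):
    """Number of columns in which two rows differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_order(rows):
    """Greedy ordering: always append the remaining row closest to the last one."""
    remaining = list(rows)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        idx = min(range(len(remaining)), key=lambda i: hamming(last, remaining[i]))
        ordered.append(remaining.pop(idx))
    return ordered


table = [(0, "a"), (1, "b"), (0, "b"), (1, "a")]
print(count_runs(table))                          # 7
print(count_runs(sorted(table)))                  # 6
print(count_runs(nearest_neighbor_order(table)))  # 5
```

On this toy table the greedy ordering needs one run fewer than the lexicographic order, because it lets the second column change only once per step instead of resetting it whenever the first column changes.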
Optimal Random Matchings, Tours, and Spanning Trees in Hierarchically Separated Trees
We derive tight bounds on the expected weights of several combinatorial
optimization problems for random point sets of size n distributed among the
leaves of a balanced hierarchically separated tree. We consider
monochromatic and bichromatic versions of the minimum matching, minimum
spanning tree, and traveling salesman problems. We also present tight
concentration results for the monochromatic problems.
Comment: 24 pages, to appear in TC
A Mathematical Unification of Geometric Crossovers Defined on Phenotype Space
Geometric crossover is a representation-independent definition of crossover based on the distance of the search space interpreted as a metric space. It generalizes the traditional crossover for binary strings and other important recombination operators for the most frequently used representations. Using a distance tailored to the problem at hand, the abstract definition of crossover can be used to design new problem-specific crossovers that embed problem knowledge in the search. This paper is motivated by the fact that genotype-phenotype mapping can be theoretically interpreted using the concept of quotient space in mathematics. In this paper, we study a metric transformation, the quotient metric space, that gives rise to the notion of quotient geometric crossover. This turns out to be a very versatile notion. We give many example applications of the quotient geometric crossover.
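As a small illustration of the basic definition (not of the quotient construction studied in the paper), the sketch below checks that uniform crossover on binary strings is geometric under Hamming distance: every offspring c of parents p1 and p2 satisfies d(p1, c) + d(c, p2) = d(p1, p2), so it lies in the metric segment between them. The function names are assumptions made for illustration.

```python
import random


def hamming(a, b):
    """Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def uniform_crossover(p1, p2, rng):
    """Each offspring position is copied from one of the two parents."""
    return [rng.choice((x, y)) for x, y in zip(p1, p2)]


def in_segment(p1, p2, child):
    """True if child lies in the Hamming segment between p1 and p2."""
    return hamming(p1, child) + hamming(child, p2) == hamming(p1, p2)


rng = random.Random(0)
p1 = [rng.randint(0, 1) for _ in range(16)]
p2 = [rng.randint(0, 1) for _ in range(16)]
children = [uniform_crossover(p1, p2, rng) for _ in range(100)]
print(all(in_segment(p1, p2, c) for c in children))

# A string that differs from both parents where they agree is outside the segment.
print(in_segment([0, 0], [0, 1], [1, 0]))
```

The first check holds for every offspring because each position either matches both parents or matches exactly one of them; the second prints False because the candidate changes a position on which the parents agree.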
Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs
We develop data structures for dynamic closest pair problems with arbitrary
distance functions, that do not necessarily come from any geometric structure
on the objects. Based on a technique previously used by the author for
Euclidean closest pairs, we show how to insert and delete objects from an
n-object set, maintaining the closest pair, in O(n log^2 n) time per update and
O(n) space. With quadratic space, we can instead use a quadtree-like structure
to achieve an optimal time bound, O(n) per update. We apply these data
structures to hierarchical clustering, greedy matching, and TSP heuristics, and
discuss other potential applications in machine learning, Groebner bases, and
local improvement algorithms for partition and placement problems. Experiments
show our new methods to be faster in practice than previously used heuristics.
Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at
the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp.
619-628. For source code and experimental results, see
http://www.ics.uci.edu/~eppstein/projects/pairs
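The interface described above (insert, delete, query the closest pair under an arbitrary distance function) can be sketched with a naive baseline that rescans all pairs on every query. This is deliberately not one of the paper's data structures and has none of their time bounds; the class name `NaiveClosestPair` and the single-linkage clustering driver are assumptions made for illustration.

```python
from itertools import combinations


class NaiveClosestPair:
    """Brute-force baseline: O(n^2) distance evaluations per query."""

    def __init__(self, distance, items=()):
        self.distance = distance  # any symmetric function of two items
        self.items = set(items)   # items must be hashable

    def insert(self, item):
        self.items.add(item)

    def delete(self, item):
        self.items.remove(item)

    def closest_pair(self):
        return min(combinations(self.items, 2), key=lambda p: self.distance(*p))


def single_linkage(a, b):
    """Distance between two clusters of numbers: their closest members."""
    return min(abs(x - y) for x in a for y in b)


def cluster(values, k):
    """Agglomerative clustering: repeatedly merge the closest pair of clusters."""
    pairs = NaiveClosestPair(single_linkage, (frozenset([v]) for v in values))
    while len(pairs.items) > k:
        a, b = pairs.closest_pair()
        pairs.delete(a)
        pairs.delete(b)
        pairs.insert(a | b)
    return sorted(sorted(c) for c in pairs.items)


print(cluster([1, 2, 3, 10, 11, 20], 3))  # [[1, 2, 3], [10, 11], [20]]
```

Swapping `NaiveClosestPair` for a faster structure with the same three methods would leave the clustering loop unchanged, which is the sense in which the closest-pair structure is the reusable component.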
An analysis of combinatorial search spaces for a class of NP-hard problems
2011 Spring. Includes bibliographical references.
Given a finite but very large set of states X and a real-valued objective function ƒ defined on X, combinatorial optimization refers to the problem of finding elements of X that maximize (or minimize) ƒ. Many combinatorial search algorithms employ some perturbation operator to hill-climb in the search space. Such perturbative local search algorithms are state of the art for many classes of NP-hard combinatorial optimization problems such as maximum k-satisfiability, scheduling, and problems of graph theory. In this thesis we analyze combinatorial search spaces by expanding the objective function into a (sparse) series of basis functions. While most analyses of the distribution of function values in the search space must rely on empirical sampling, the basis function expansion allows us to directly study the distribution of function values across regions of states for combinatorial problems without the need for sampling. We concentrate on objective functions that can be expressed as bounded pseudo-Boolean functions which are NP-hard to solve in general. We use the basis expansion to construct a polynomial-time algorithm for exactly computing constant-degree moments of the objective function ƒ over arbitrarily large regions of the search space. On functions with restricted codomains, these moments are related to the true distribution by a system of linear equations. Given low moments supplied by our algorithm, we construct bounds of the true distribution of ƒ over regions of the space using a linear programming approach. A straightforward relaxation allows us to efficiently approximate the distribution and hence quickly estimate the count of states in a given region that have certain values under the objective function. The analysis is also useful for characterizing properties of specific combinatorial problems.
For instance, by connecting search space analysis to the theory of inapproximability, we prove that the bound specified by Grover's maximum principle for the Max-Ek-Lin-2 problem is sharp. Moreover, we use the framework to prove certain configurations are forbidden in regions of the Max-3-Sat search space, supplying the first theoretical confirmation of empirical results by others. Finally, we show that theoretical results can be used to drive the design of algorithms in a principled manner by using the search space analysis developed in this thesis in algorithmic applications. First, information obtained from our moment retrieving algorithm can be used to direct a hill-climbing search across plateaus in the Max-k-Sat search space. Second, the analysis can be used to control the mutation rate on a (1+1) evolutionary algorithm on bounded pseudo-Boolean functions so that the offspring of each search point is maximized in expectation. For these applications, knowledge of the search space structure supplied by the analysis translates to significant gains in the performance of search.
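The idea of obtaining a moment without sampling can be illustrated in its simplest case: the first moment of a Max-k-Sat objective over the entire search space. A clause over k distinct variables is falsified by exactly one of the 2^k settings of those variables, so the mean number of satisfied clauses is the sum of 1 - 2^(-k) over the clauses. The sketch below checks this closed form against exhaustive enumeration. It is not the thesis's basis-expansion algorithm, which handles higher moments and restricted regions; the function names and the sample formula are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def satisfied(clauses, assignment):
    """Number of clauses satisfied; literal v means x_v, -v means not x_v."""
    return sum(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_mean(clauses, n):
    """Exact mean of the objective over all 2^n assignments, by enumeration."""
    total = sum(satisfied(clauses, a) for a in product((False, True), repeat=n))
    return Fraction(total, 2 ** n)


def closed_form_mean(clauses):
    """Exact mean without enumeration; assumes distinct variables per clause."""
    return sum(1 - Fraction(1, 2 ** len(clause)) for clause in clauses)


clauses = [(1, -2, 3), (-1, 2, 4), (2, 3, -4)]
print(brute_force_mean(clauses, 4), closed_form_mean(clauses))  # 21/8 21/8
```

The enumeration takes time exponential in the number of variables, while the closed form is linear in the size of the formula, which is the kind of gap that motivates computing moments analytically.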