1,623 research outputs found
The Geometric Maximum Traveling Salesman Problem
We consider the traveling salesman problem when the cities are points in R^d
for some fixed d and distances are computed according to geometric distances,
determined by some norm. We show that for any polyhedral norm, the problem of
finding a tour of maximum length can be solved in polynomial time. If
arithmetic operations are assumed to take unit time, our algorithms run in time
O(n^{f-2} log n), where f is the number of facets of the polyhedron determining
the polyhedral norm. Thus for example we have O(n^2 log n) algorithms for the
cases of points in the plane under the Rectilinear and Sup norms. This is in
contrast to the fact that finding a minimum length tour in each case is
NP-hard. Our approach can be extended to the more general case of quasi-norms
with not necessarily symmetric unit ball, where we get a complexity of
O(n^{2f-2} log n).
For the special case of two-dimensional metrics with f=4 (which includes the
Rectilinear and Sup norms), we present a simple algorithm with O(n) running
time. The algorithm does not use any indirect addressing, so its running time
remains valid even in comparison-based models in which sorting requires Omega(n
log n) time. The basic mechanism of the algorithm provides some intuition on
why polyhedral norms allow fast algorithms.
Complementing the results on simplicity for polyhedral norms, we prove that
for the case of Euclidean distances in R^d for d>2, the Maximum TSP is NP-hard.
This sheds new light on the well-studied difficulties of Euclidean distances. Comment: 24 pages, 6 figures; revised to appear in Journal of the ACM.
(clarified some minor points, fixed typos
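To make the objective concrete, the following is a minimal brute-force sketch of the Maximum TSP on a tiny point set under the Rectilinear (L1) and Sup norms. It is not the algorithm of the paper: it enumerates all tours and takes exponential time. The function names `l1`, `sup` and `max_tour_length` are hypothetical and chosen only for this illustration.

```python
from itertools import permutations

def l1(p, q):
    # Rectilinear (L1) distance between two points.
    return sum(abs(a - b) for a, b in zip(p, q))

def sup(p, q):
    # Sup (L-infinity) distance between two points.
    return max(abs(a - b) for a, b in zip(p, q))

def max_tour_length(points, dist):
    # Fix the first city and try every ordering of the rest (brute force).
    first, rest = tuple(points[0]), [tuple(p) for p in points[1:]]
    best = 0
    for perm in permutations(rest):
        tour = (first,) + perm
        length = sum(dist(tour[i], tour[(i + 1) % len(tour)])
                     for i in range(len(tour)))
        best = max(best, length)
    return best

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(max_tour_length(square, l1))   # the tour using both diagonals has length 6
print(max_tour_length(square, sup))  # every tour has length 4
```

On the unit square the longest L1 tour uses both diagonals, while under the Sup norm all pairwise distances equal 1, so every tour has the same length.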
Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations
Low rank matrix approximation is an important tool in machine learning. Given
a data matrix, low rank approximation helps to find factors, patterns and
provides concise representations for the data. Research on low rank
approximation usually focuses on real matrices. However, in many applications
data are binary (categorical) rather than continuous. This leads to the problem
of low rank approximation of a binary matrix. Here we are given a d x n
binary matrix A and a small integer k. The goal is to find two binary
matrices U and V of sizes d x k and k x n respectively, so
that the Frobenius norm of A - UV is minimized. There are two models of this
problem, depending on the definition of the dot product of binary vectors: the
GF(2) model and the Boolean semiring model. Unlike low rank
approximation of a real matrix, which can be efficiently solved by Singular Value
Decomposition, approximation of a binary matrix is NP-hard even for k = 1.
In this paper, we consider the problem of Column Subset Selection (CSS), in
which one low rank matrix must be formed by columns of the data matrix. We
characterize the approximation ratio of CSS for binary matrices. For
the GF(2) model, we show the approximation ratio of CSS is bounded by
and this bound is asymptotically tight. For
the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound.
We then develop a Generalized CSS (GCSS) procedure in which the columns of one
low rank matrix are generated from Boolean formulas operating bitwise on
columns of the data matrix. We show the approximation ratio of GCSS is bounded
by , and the exponential dependency on k is inherent. Comment: 38 pages
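The difference between the two product models, and what Column Subset Selection restricts, can be sketched by brute force on a tiny matrix. This is an illustration only, assuming the data matrix is a list of 0/1 rows; the names `A`, `k`, `binary_product`, `mismatches` and `css_error` are hypothetical, and the search is exponential rather than any procedure from the paper. For 0/1 matrices the squared Frobenius norm of the difference equals the number of mismatched entries, which is what the sketch counts.

```python
from itertools import combinations, product

def binary_product(U, V, model):
    # Product of 0/1 matrices: sums mod 2 for "gf2", logical OR for "boolean".
    d, k, n = len(U), len(V), len(V[0])
    out = [[0] * n for _ in range(d)]
    for i in range(d):
        for j in range(n):
            s = sum(U[i][t] & V[t][j] for t in range(k))
            out[i][j] = s % 2 if model == "gf2" else int(s > 0)
    return out

def mismatches(X, Y):
    # Number of differing entries (squared Frobenius distance for 0/1 matrices).
    return sum(x != y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def css_error(A, k, model):
    # Best error when the left factor must consist of k columns of A.
    d, n = len(A), len(A[0])
    best = None
    for cols in combinations(range(n), k):
        U = [[A[i][c] for c in cols] for i in range(d)]
        total = 0
        for j in range(n):
            target = [[A[i][j]] for i in range(d)]
            # Each column of A is fitted independently by one 0/1 coefficient vector.
            total += min(
                mismatches(binary_product(U, [[b] for b in coeffs], model), target)
                for coeffs in product((0, 1), repeat=k)
            )
        best = total if best is None else min(best, total)
    return best

A = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
print(css_error(A, 2, "gf2"))      # 0: the third column is the GF(2) sum of the first two
print(css_error(A, 2, "boolean"))  # 1: the OR of two columns always misses one entry
```

The same matrix has an exact rank-2 factorization over GF(2) but not in the Boolean model, which shows that the two models can disagree on the achievable error.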
Non-Abelian Analogs of Lattice Rounding
Lattice rounding in Euclidean space can be viewed as finding the nearest
point in the orbit of an action by a discrete group, relative to the norm
inherited from the ambient space. Using this point of view, we initiate the
study of non-abelian analogs of lattice rounding involving matrix groups. In
one direction, we give an algorithm for solving a normed word problem when the
inputs are random products over a basis set, and give theoretical justification
for its success. In another direction, we prove a general inapproximability
result which essentially rules out strong approximation algorithms (i.e., whose
approximation factors depend only on dimension) analogous to LLL in the general
case. Comment: 30 pages
Pricing on Paths: A PTAS for the Highway Problem
In the highway problem, we are given an n-edge line graph (the highway), and
a set of paths (the drivers), each one with its own budget. For a given
assignment of edge weights (the tolls), the highway owner collects from each
driver the weight of the associated path, when it does not exceed the budget of
the driver, and zero otherwise. The goal is choosing weights so as to maximize
the profit.
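The profit function described above can be sketched as follows. This is a minimal illustration, assuming edges are indexed 0..n-1 and each driver is a triple (start, end, budget) whose path covers edges start..end-1; the name `highway_profit` and this encoding are hypothetical, not taken from the paper.

```python
def highway_profit(tolls, drivers):
    # Prefix sums of the tolls, so the price of any path is a difference of two entries.
    prefix = [0]
    for t in tolls:
        prefix.append(prefix[-1] + t)
    profit = 0
    for start, end, budget in drivers:
        price = prefix[end] - prefix[start]
        # A driver pays the full path price if it fits the budget, and nothing otherwise.
        if price <= budget:
            profit += price
    return profit

tolls = [2, 1, 3]
drivers = [(0, 2, 3), (1, 3, 3), (0, 3, 10)]
print(highway_profit(tolls, drivers))  # 9: the second driver is priced out
```

The all-or-nothing payment makes the profit discontinuous in the tolls: raising one toll slightly can drop a driver's contribution to zero.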
A lot of research has been devoted to this apparently simple problem. The
highway problem was shown to be strongly NP-hard only recently
[Elbassioni,Raman,Ray-'09]. The best-known approximation is O(log n / log log
n) [Gamzu,Segev-'10], which improves on the previous-best O(log n)
approximation [Balcan,Blum-'06].
In this paper we present a PTAS for the highway problem, hence closing the
complexity status of the problem. Our result is based on a novel randomized
dissection approach, which has some points in common with Arora's quadtree
dissection for Euclidean network design [Arora-'98]. The basic idea is
enclosing the highway in a bounding path, such that both the size of the
bounding path and the position of the highway in it are random variables. Then
we consider a recursive O(1)-ary dissection of the bounding path into subpaths
of uniform optimal weight. Since the optimal weights are unknown, we construct
the dissection in a bottom-up fashion via dynamic programming, while computing
the approximate solution at the same time. Our algorithm can be easily
derandomized. We demonstrate the versatility of our technique by presenting
PTASs for two variants of the highway problem: the tollbooth problem with a
constant number of leaves and the maximum-feasibility subsystem problem on
interval matrices. In both cases the previous best approximation factors are
polylogarithmic [Gamzu,Segev-'10,Elbassioni,Raman,Ray,Sitters-'09].
Optimal Recombination in Genetic Algorithms
This paper surveys results on complexity of the optimal recombination problem
(ORP), which consists in finding the best possible offspring as a result of a
recombination operator in a genetic algorithm, given two parent solutions. We
consider efficient reductions of the ORPs, which make it possible to establish polynomial
solvability or NP-hardness of the ORPs, as well as direct proofs of hardness
results.
Input Sparsity and Hardness for Robust Subspace Approximation
In the subspace approximation problem, we seek a k-dimensional subspace F of
R^d that minimizes the sum of p-th powers of Euclidean distances to a given set
of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing
sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss
function M(), for example, M-Estimators, which include the Huber and Tukey loss
functions. Such subspaces provide alternatives to the singular value
decomposition (SVD), which is the p=2 case, finding such an F that minimizes
the sum of squares of distances. For p in [1,2), and for typical M-Estimators,
the minimizing F gives a solution that is more robust to outliers than that
provided by the SVD. We give several algorithmic and hardness results for these
robust subspace approximation problems.
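The objective sum_i dist(a_i,F)^p can be evaluated directly for a fixed candidate subspace. The sketch below assumes F is given by an orthonormal basis and uses plain Python lists; the name `subspace_cost` and the sample data are hypothetical. It only evaluates the cost and does not search for the minimizing F.

```python
import math

def subspace_cost(points, basis, p):
    # basis: orthonormal vectors spanning the candidate subspace F.
    total = 0.0
    for a in points:
        norm_sq = sum(x * x for x in a)
        proj_sq = sum(sum(x * u for x, u in zip(a, b)) ** 2 for b in basis)
        # dist(a, F)^2 = |a|^2 - |projection of a onto F|^2, clamped against rounding.
        dist = math.sqrt(max(norm_sq - proj_sq, 0.0))
        total += dist ** p
    return total

# Three points near the x-axis and one outlier far from it.
points = [(1.0, 0.1), (2.0, -0.1), (3.0, 0.0), (0.0, 5.0)]
x_axis = [(1.0, 0.0)]
print(round(subspace_cost(points, x_axis, 1), 6))
print(round(subspace_cost(points, x_axis, 2), 6))
```

For the x-axis as F, the outlier contributes 5 of a total of about 5.2 when p=1, but 25 of about 25.02 when p=2, which gives some intuition for why smaller p is less dominated by outliers.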
We think of the n points as forming an n x d matrix A, and let nnz(A)
denote the number of non-zero entries of A. Our results hold for p in [1,2). We
use poly(n) to denote n^{O(1)} as n -> infty. We obtain: (1) For minimizing
sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) +
(n+d)poly(k/eps) + exp(poly(k/eps))), (2) we show that the problem of
minimizing sum_i dist(a_i, F)^p is NP-hard, even to output a
(1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and
complementing prior results which held for p > 2, (3) For loss functions for a
wide class of M-Estimators, we give a problem-size reduction: for a parameter
K=(log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps))
time to reduce the problem to a constrained version involving matrices whose
dimensions are poly(K eps^{-1} log n). We also give bicriteria solutions. (4)
Our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for
(1+eps)-approximate regression for a wide class of convex M-Estimators. Comment: paper appeared in FOCS, 201
Bilu-Linial Stable Instances of Max Cut and Minimum Multiway Cut
We investigate the notion of stability proposed by Bilu and Linial. We obtain
an exact polynomial-time algorithm for gamma-stable Max Cut instances with
gamma >= c sqrt(log n) log log n for some absolute constant c > 0. Our
algorithm is robust: it never returns an incorrect answer; if the instance is
gamma-stable, it finds the maximum cut, otherwise, it either finds the
maximum cut or certifies that the instance is not gamma-stable. We prove
that there is no robust polynomial-time algorithm for gamma-stable instances
of Max Cut when gamma < alpha_SC(n/2), where alpha_SC is the best
approximation factor for Sparsest Cut with non-uniform demands.
Our algorithm is based on semidefinite programming. We show that the standard
SDP relaxation for Max Cut (with triangle inequalities) is integral
if gamma >= D_{l_2^2 -> l_1}(n), where D_{l_2^2 -> l_1}(n)
is the least distortion with which every n-point metric space of negative
type embeds into l_1. On the negative side, we show that the SDP
relaxation is not integral when gamma < D_{l_2^2 -> l_1}(n/2).
Moreover, there is no tractable convex relaxation for gamma-stable instances
of Max Cut when gamma < alpha_SC(n/2). That suggests that solving
gamma-stable instances with gamma = o(sqrt(log n)) might be difficult or
impossible.
Our results significantly improve previously known results. The best
previously known algorithm for gamma-stable instances of Max Cut required
that gamma >= c sqrt(n) (for some c > 0) [Bilu, Daniely, Linial, and
Saks]. No hardness results were known for the problem. Additionally, we present
an algorithm for 4-stable instances of Minimum Multiway Cut. We also study a
relaxed notion of weak stability. Comment: 24 pages