
    The Geometric Maximum Traveling Salesman Problem

    We consider the traveling salesman problem when the cities are points in R^d for some fixed d and distances are computed according to a geometric distance determined by some norm. We show that for any polyhedral norm, the problem of finding a tour of maximum length can be solved in polynomial time. If arithmetic operations are assumed to take unit time, our algorithms run in time O(n^{f-2} log n), where f is the number of facets of the polyhedron determining the polyhedral norm. Thus, for example, we have O(n^2 log n) algorithms for the cases of points in the plane under the Rectilinear and Sup norms. This is in contrast to the fact that finding a minimum length tour in each case is NP-hard. Our approach can be extended to the more general case of quasi-norms with a not necessarily symmetric unit ball, where we get a complexity of O(n^{2f-2} log n). For the special case of two-dimensional metrics with f=4 (which includes the Rectilinear and Sup norms), we present a simple algorithm with O(n) running time. The algorithm does not use any indirect addressing, so its running time remains valid even in comparison-based models in which sorting requires Omega(n log n) time. The basic mechanism of the algorithm provides some intuition on why polyhedral norms allow fast algorithms. Complementing the results on simplicity for polyhedral norms, we prove that for the case of Euclidean distances in R^d for d > 2, the Maximum TSP is NP-hard. This sheds new light on the well-studied difficulties of Euclidean distances.
    Comment: 24 pages, 6 figures; revised to appear in Journal of the ACM (clarified some minor points, fixed typos).
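
    As a concrete illustration of the objective (not of the paper's algorithms), the sketch below evaluates tour lengths under the two planar polyhedral norms with f = 4 facets named in the abstract, the Rectilinear (L1) and Sup norms, and finds a maximum tour by exhaustive search; the sample points are arbitrary.

```python
# Sketch of the Maximum TSP objective under two polyhedral norms with
# f = 4 facets in the plane: the Rectilinear (L1) and Sup norms. The
# exhaustive search below is a reference for tiny inputs, not the paper's
# O(n) or O(n^{f-2} log n) algorithms; the sample points are arbitrary.
from itertools import permutations

def l1(p, q):
    """Rectilinear-norm distance."""
    return sum(abs(a - b) for a, b in zip(p, q))

def sup(p, q):
    """Sup-norm distance."""
    return max(abs(a - b) for a, b in zip(p, q))

def tour_length(tour, dist):
    """Length of the closed tour visiting the points in order."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)])
               for i in range(len(tour)))

def max_tsp_bruteforce(points, dist):
    """Maximum-length tour by fixing the first point and trying all orders."""
    first, rest = points[0], points[1:]
    return max(tour_length((first,) + tuple(perm), dist)
               for perm in permutations(rest))

pts = [(0, 0), (3, 1), (1, 4), (5, 2)]
print(max_tsp_bruteforce(pts, l1), max_tsp_bruteforce(pts, sup))
```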

    Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

    Low rank matrix approximation is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors and patterns, and provides concise representations for the data. Research on low rank approximation usually focuses on real matrices. However, in many applications data are binary (categorical) rather than continuous. This leads to the problem of low rank approximation of binary matrices. Here we are given a d×n binary matrix A and a small integer k. The goal is to find two binary matrices U and V of sizes d×k and k×n respectively, so that the Frobenius norm of A - UV is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: the GF(2) model and the Boolean semiring model. Unlike low rank approximation of real matrices, which can be efficiently solved by the Singular Value Decomposition, approximation of binary matrices is NP-hard even for k=1. In this paper, we consider the problem of Column Subset Selection (CSS), in which one low rank matrix must be formed by k columns of the data matrix. We characterize the approximation ratio of CSS for binary matrices. For the GF(2) model, we show that the approximation ratio of CSS is bounded by k/2 + 1 + k/(2(2^k - 1)) and this bound is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of one low rank matrix are generated from Boolean formulas operating bitwise on columns of the data matrix. We show that the approximation ratio of GCSS is bounded by 2^{k-1} + 1, and the exponential dependency on k is inherent.
    Comment: 38 pages.
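
    To make the CSS setup concrete, here is a minimal sketch for the k = 1 case, where the GF(2) and Boolean models coincide: the single column u is drawn from A itself, and each entry of v is chosen to minimize the Hamming error on its column. This illustrates the problem definition only, not the paper's general-k analysis.

```python
# Minimal sketch of Column Subset Selection for binary matrices with k = 1
# (where the GF(2) and Boolean models coincide). The candidate column u is
# taken from A itself; each v_j in {0,1} is chosen to minimize the Hamming
# error on column j. Illustrates the setup only, not the general-k analysis.
import numpy as np

def css_rank1(A):
    """Best rank-1 binary approximation u v^T with u a column of A."""
    d, n = A.shape
    best_u, best_v, best_cost = None, None, d * n + 1
    err0 = A.sum(axis=0)                        # cost of setting v_j = 0
    for c in range(n):
        u = A[:, c]
        err1 = (A != u[:, None]).sum(axis=0)    # cost of setting v_j = 1
        cost = int(np.minimum(err0, err1).sum())
        if cost < best_cost:
            best_u = u.copy()
            best_v = (err1 < err0).astype(int)
            best_cost = cost
    return best_u, best_v, best_cost

A = np.array([[1, 1, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
u, v, cost = css_rank1(A)
print(u, v, cost)  # on this toy matrix: u = column 0, total error 1
```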

    Non-Abelian Analogs of Lattice Rounding

    Lattice rounding in Euclidean space can be viewed as finding the nearest point in the orbit of an action by a discrete group, relative to the norm inherited from the ambient space. Using this point of view, we initiate the study of non-abelian analogs of lattice rounding involving matrix groups. In one direction, we give an algorithm for solving a normed word problem when the inputs are random products over a basis set, and give theoretical justification for its success. In another direction, we prove a general inapproximability result which essentially rules out strong approximation algorithms (i.e., those whose approximation factors depend only on the dimension), analogous to LLL, in the general case.
    Comment: 30 pages.
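
    For intuition on the "nearest point in a group orbit" viewpoint, the sketch below shows the abelian baseline the paper generalizes: classic lattice rounding via Babai's round-off heuristic. It is not the paper's matrix-group algorithm.

```python
# The abelian baseline the paper generalizes: classic lattice rounding.
# The orbit of the origin under integer translations by a basis B is
# {Bz : z integral}; "rounding" a target t picks a nearby orbit point via
# Babai's round-off heuristic. Intuition only, not the paper's
# matrix-group algorithm.
import numpy as np

def babai_round(B, t):
    """Round t to the lattice point B @ round(B^{-1} t)."""
    z = np.rint(np.linalg.solve(B, t))
    return B @ z

B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
t = np.array([3.4, 2.2])
print(babai_round(B, t))  # [3. 3.], a nearby (not always nearest) point
```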

    Prizing on Paths: A PTAS for the Highway Problem

    In the highway problem, we are given an n-edge line graph (the highway) and a set of paths (the drivers), each one with its own budget. For a given assignment of edge weights (the tolls), the highway owner collects from each driver the weight of the associated path when it does not exceed the budget of the driver, and zero otherwise. The goal is to choose weights so as to maximize the profit. A lot of research has been devoted to this apparently simple problem. The highway problem was shown to be strongly NP-hard only recently [Elbassioni, Raman, Ray '09]. The best-known approximation is O(\log n/\log\log n) [Gamzu, Segev '10], which improves on the previous best O(\log n) approximation [Balcan, Blum '06]. In this paper we present a PTAS for the highway problem, hence settling the complexity status of the problem. Our result is based on a novel randomized dissection approach, which has some points in common with Arora's quadtree dissection for Euclidean network design [Arora '98]. The basic idea is enclosing the highway in a bounding path, such that both the size of the bounding path and the position of the highway in it are random variables. Then we consider a recursive O(1)-ary dissection of the bounding path into subpaths of uniform optimal weight. Since the optimal weights are unknown, we construct the dissection in a bottom-up fashion via dynamic programming, while computing the approximate solution at the same time. Our algorithm can be easily derandomized. We demonstrate the versatility of our technique by presenting PTASs for two variants of the highway problem: the tollbooth problem with a constant number of leaves and the maximum-feasibility subsystem problem on interval matrices. In both cases the previous best approximation factors are polylogarithmic [Gamzu, Segev '10; Elbassioni, Raman, Ray, Sitters '09].
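
    The profit function the abstract defines is easy to state in code; the sketch below evaluates it for given tolls and is not the PTAS itself. The (i, j, budget) encoding of a driver's subpath is an assumption made here for illustration.

```python
# Sketch of the highway pricing objective: given tolls on the edges of a
# line graph, each driver pays the toll of her subpath if it fits her
# budget and nothing otherwise. Illustrates the profit function only, not
# the PTAS; the (i, j, budget) driver encoding is an assumption made here.
def profit(tolls, drivers):
    """tolls: edge weights; drivers: (i, j, budget) triples meaning the
    driver uses edges i, i+1, ..., j-1 of the line graph."""
    total = 0.0
    for i, j, budget in drivers:
        pay = sum(tolls[i:j])
        if pay <= budget:
            total += pay
    return total

tolls = [2.0, 1.0, 3.0]
drivers = [(0, 2, 4.0),    # pays 3.0
           (1, 3, 3.5),    # 4.0 exceeds the budget: pays nothing
           (0, 3, 10.0)]   # pays 6.0
print(profit(tolls, drivers))  # 9.0
```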

    Optimal Recombination in Genetic Algorithms

    This paper surveys results on the complexity of the optimal recombination problem (ORP), which consists of finding the best possible offspring resulting from a recombination operator in a genetic algorithm, given two parent solutions. We consider efficient reductions between ORPs, which allow us to establish polynomial solvability or NP-hardness of the ORPs, as well as direct proofs of hardness results.
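
    As a concrete instance of the ORP for binary strings: the offspring must inherit each gene from one of the two parents, so only positions where the parents differ are free, and exhaustive search over them is exponential in their number. The fitness function in the sketch below is a toy placeholder; it illustrates the problem, not any of the surveyed polynomial-time algorithms.

```python
# Sketch of the Optimal Recombination Problem for binary strings: the child
# must inherit each gene from one of the two parents, so only positions
# where the parents differ are free. Exhaustive search over those positions
# is exponential in their number; the surveyed results ask when the ORP is
# solvable in polynomial time. The fitness function here is a toy placeholder.
from itertools import product

def optimal_recombination(p1, p2, fitness):
    """Best offspring taking each gene from p1 or p2."""
    diff = [i for i in range(len(p1)) if p1[i] != p2[i]]
    best_child, best_fit = None, float("-inf")
    for choice in product([0, 1], repeat=len(diff)):
        child = list(p1)
        for pos, take_p2 in zip(diff, choice):
            if take_p2:
                child[pos] = p2[pos]
        f = fitness(child)
        if f > best_fit:
            best_child, best_fit = child, f
    return best_child, best_fit

def fitness(x):                      # toy non-separable objective
    return sum(x) - 2 * x[0] * x[3]

print(optimal_recombination([1, 0, 1, 1], [0, 1, 1, 0], fitness))
```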

    Input Sparsity and Hardness for Robust Subspace Approximation

    In the subspace approximation problem, we seek a k-dimensional subspace F of R^d that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss function M(), for example M-Estimators, which include the Huber and Tukey loss functions. Such subspaces provide alternatives to the singular value decomposition (SVD), which is the p=2 case, finding such an F that minimizes the sum of squares of distances. For p in [1,2), and for typical M-Estimators, the minimizing F gives a solution that is more robust to outliers than that provided by the SVD. We give several algorithmic and hardness results for these robust subspace approximation problems. We think of the n points as forming an n x d matrix A, and let nnz(A) denote the number of non-zero entries of A. Our results hold for p in [1,2). We use poly(n) to denote n^{O(1)} as n -> infty. We obtain: (1) For minimizing sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) + (n+d)poly(k/eps) + exp(poly(k/eps))) time; (2) we show that the problem of minimizing sum_i dist(a_i, F)^p is NP-hard, even to output a (1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and complementing prior results which held for p > 2; (3) for loss functions of a wide class of M-Estimators, we give a problem-size reduction: for a parameter K=(log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps)) time to reduce the problem to a constrained version involving matrices whose dimensions are poly(K eps^{-1} log n); we also give bicriteria solutions; (4) our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for (1+eps)-approximate regression for a wide class of convex M-Estimators.
    Comment: paper appeared in FOCS 2015.
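
    The objective sum_i dist(a_i, F)^p is straightforward to evaluate once F is given by an orthonormal basis. The sketch below does only that (no solver), using the top-k SVD subspace, which is optimal for p = 2 but not necessarily for p in [1,2); that gap is the paper's point. The data here are synthetic placeholders.

```python
# Sketch of the robust subspace objective sum_i dist(a_i, F)^p, with the
# subspace F given by an orthonormal basis Q (d x k). Evaluation only, no
# solver; data and dimensions below are synthetic placeholders.
import numpy as np

def subspace_cost(A, Q, p):
    """Sum of p-th powers of Euclidean distances from rows of A to span(Q)."""
    residual = A - (A @ Q) @ Q.T      # component of each row orthogonal to F
    return float((np.linalg.norm(residual, axis=1) ** p).sum())

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
A[:3] *= 50                           # a few heavy outliers
_, _, Vt = np.linalg.svd(A, full_matrices=False)
Q = Vt[:2].T                          # top-2 SVD subspace: optimal for p = 2,
                                      # not necessarily for p in [1, 2)
print(subspace_cost(A, Q, 2), subspace_cost(A, Q, 1))
```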

    Bilu-Linial Stable Instances of Max Cut and Minimum Multiway Cut

    We investigate the notion of stability proposed by Bilu and Linial. We obtain an exact polynomial-time algorithm for γ-stable Max Cut instances with γ ≥ c√(log n)·log log n for some absolute constant c > 0. Our algorithm is robust: it never returns an incorrect answer; if the instance is γ-stable, it finds the maximum cut; otherwise, it either finds the maximum cut or certifies that the instance is not γ-stable. We prove that there is no robust polynomial-time algorithm for γ-stable instances of Max Cut when γ < α_SC(n/2), where α_SC is the best approximation factor for Sparsest Cut with non-uniform demands. Our algorithm is based on semidefinite programming. We show that the standard SDP relaxation for Max Cut (with ℓ_2^2 triangle inequalities) is integral if γ ≥ D_{ℓ_2^2→ℓ_1}(n), where D_{ℓ_2^2→ℓ_1}(n) is the least distortion with which every n-point metric space of negative type embeds into ℓ_1. On the negative side, we show that the SDP relaxation is not integral when γ < D_{ℓ_2^2→ℓ_1}(n/2). Moreover, there is no tractable convex relaxation for γ-stable instances of Max Cut when γ < α_SC(n/2). This suggests that solving γ-stable instances with γ = o(√(log n)) might be difficult or impossible. Our results significantly improve previously known results. The best previously known algorithm for γ-stable instances of Max Cut required that γ ≥ c√n (for some c > 0) [Bilu, Daniely, Linial, and Saks]. No hardness results were known for the problem. Additionally, we present an algorithm for 4-stable instances of Minimum Multiway Cut. We also study a relaxed notion of weak stability.
    Comment: 24 pages.
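
    To fix the definition, the brute-force sketch below tests γ-stability of tiny Max Cut instances via a standard equivalent characterization: the maximum cut C* is γ-stable iff, for every other cut C, the weight cut by C* but not by C strictly exceeds γ times the weight cut by C but not by C*. This is exponential in n and illustrates the notion only, not the paper's SDP-based algorithm.

```python
# Brute-force check of Bilu-Linial gamma-stability for tiny Max Cut
# instances, via a standard equivalent characterization: the maximum cut C*
# is gamma-stable iff for every other cut C,
#   w(edges cut by C* but not C) > gamma * w(edges cut by C but not C*).
# Exponential in n; illustrates the notion, not the paper's SDP algorithm.
from itertools import combinations

def cut_edges(S, edges):
    """Edges with exactly one endpoint in S."""
    return {(u, v) for (u, v, _) in edges if (u in S) != (v in S)}

def is_gamma_stable(n, edges, gamma):
    w = {(u, v): wt for (u, v, wt) in edges}
    cuts = [frozenset(S) for r in range(n // 2 + 1)
            for S in combinations(range(n), r)]
    value = lambda S: sum(w[e] for e in cut_edges(S, edges))
    best = max(cuts, key=value)
    for S in cuts:
        if cut_edges(S, edges) == cut_edges(best, edges):
            continue
        only_best = sum(w[e] for e in cut_edges(best, edges) - cut_edges(S, edges))
        only_s = sum(w[e] for e in cut_edges(S, edges) - cut_edges(best, edges))
        if only_best <= gamma * only_s:
            return False
    return True

# Triangle with two heavy edges: the cut isolating vertex 1 is the optimum.
edges = [(0, 1, 10.0), (1, 2, 10.0), (0, 2, 1.0)]
print(is_gamma_stable(3, edges, 2.0))  # True on this instance
```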