48 research outputs found

    The Hardness of Approximation of Euclidean k-means

    The Euclidean $k$-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning and the computational geometry communities. In this problem, we are given a set of $n$ points in Euclidean space $R^d$, and the goal is to choose $k$ centers in $R^d$ so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general $k$ and a $(1+\epsilon)$-approximation which runs in time $poly(n) 2^{O(k/\epsilon)}$. At the other extreme, the only known computational complexity result for this problem is NP-hardness [ADHP'09]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in $R^d$ can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all $k, d$. In this paper we provide the first hardness of approximation for the Euclidean $k$-means problem. Concretely, we show that there exists a constant $\epsilon > 0$ such that it is NP-hard to approximate the $k$-means objective to within a factor of $(1+\epsilon)$. We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest number of vertices which are incident on all the edges. Additionally, we give a proof that the current best hardness results for vertex cover can be carried over to triangle-free graphs. To show this we transform $G$, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph $H$, and showing that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph using a spectral analysis, which might be of independent interest.
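
The objective described in the abstract above can be made concrete with a small sketch. It only evaluates the $k$-means cost of a given set of centers; it is not an approximation algorithm and is not taken from the paper. The function name `kmeans_cost` and the sample points are illustrative assumptions.

```python
def kmeans_cost(points, centers):
    """Sum, over all points, of the squared Euclidean distance to the nearest center.

    `points` and `centers` are sequences of equal-length coordinate tuples.
    This name and signature are hypothetical, chosen for illustration only.
    """
    total = 0.0
    for p in points:
        # Squared distance from p to each candidate center; keep the smallest.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centers
        )
    return total


# Illustrative data: four points on a line, evaluated against two centers.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
two_centers = [(1.0, 0.0), (11.0, 0.0)]
one_center = [(6.0, 0.0)]

cost_two = kmeans_cost(points, two_centers)
cost_one = kmeans_cost(points, one_center)
```

Because any point of $R^d$ may serve as a center, the centers here are deliberately not restricted to the input points.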

    Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median

    Budgeted Red-Blue Median is a generalization of classic $k$-Median in that there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can be used to serve clients located in some metric space. The goal is to open $k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for some given bounds $k_r, k_b$ and connect each client to their nearest open facility in a way that minimizes the total connection cost. We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a multiple-swap local search heuristic can be used to obtain a $(5+\epsilon)$-approximation for Budgeted Red-Blue Median for any constant $\epsilon > 0$. This is an improvement over their single swap analysis and beats the previous best approximation guarantee of 8 by Swamy [2014]. We also present a matching lower bound showing that for every $p \geq 1$, there are instances of Budgeted Red-Blue Median with local optimum solutions for the $p$-swap heuristic whose cost is $5 + \Omega\left(\frac{1}{p}\right)$ times the optimum solution cost. Thus, our analysis is tight up to the lower order terms. In particular, for any $\epsilon > 0$ we show the single-swap heuristic admits local optima whose cost can be as bad as $7 - \epsilon$ times the optimum solution cost.

    An Improved Approximation Algorithm for the Hard Uniform Capacitated k-median Problem

    In the $k$-median problem, given a set of locations, the goal is to select a subset of at most $k$ centers so as to minimize the total cost of connecting each location to its nearest center. We study the uniform hard capacitated version of the $k$-median problem, in which each selected center can only serve a limited number of locations. Inspired by the algorithm of Charikar, Guha, Tardos and Shmoys, we give a $(6+10\alpha)$-approximation algorithm for this problem while increasing the capacities by a factor of $2+\frac{2}{\alpha}$, $\alpha \geq 4$, which improves the previous best $(32 l^2+28 l+7)$-approximation algorithm proposed by Byrka, Fleszar, Rybicki and Spoerhase, which violates the capacities by a factor of $2+\frac{3}{l-1}$, $l \in \{2,3,4,\dots\}$. Comment: 19 pages, 1 figur

    Probabilistic Analysis of Optimization Problems on Generalized Random Shortest Path Metrics

    Simple heuristics often show a remarkable performance in practice for optimization problems. Worst-case analysis often falls short of explaining this performance. Because of this, "beyond worst-case analysis" of algorithms has recently gained a lot of attention, including probabilistic analysis of algorithms. The instances of many optimization problems are essentially a discrete metric space. Probabilistic analysis for such metric optimization problems has nevertheless mostly been conducted on instances drawn from Euclidean space, which provides a structure that is usually heavily exploited in the analysis. However, most instances from practice are not Euclidean. Little work has been done on metric instances drawn from other, more realistic, distributions. Some initial results have been obtained by Bringmann et al. (Algorithmica, 2013), who have used random shortest path metrics on complete graphs to analyze heuristics. The goal of this paper is to generalize these findings to non-complete graphs, especially Erdős-Rényi random graphs. A random shortest path metric is constructed by drawing independent random edge weights for each edge in the graph and setting the distance between every pair of vertices to the length of a shortest path between them with respect to the drawn weights. For such instances, we prove that the greedy heuristic for the minimum distance maximum matching problem, the nearest neighbor and insertion heuristics for the traveling salesman problem, and a trivial heuristic for the $k$-median problem all achieve a constant expected approximation ratio. Additionally, we show a polynomial upper bound for the expected number of iterations of the 2-opt heuristic for the traveling salesman problem. Comment: An extended abstract appeared in the proceedings of WALCOM 201
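
The construction of a random shortest path metric described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's code. The abstract does not name a weight distribution here, so the exponential weights with rate 1 are an assumption, as are the function name `random_shortest_path_metric` and the parameters `n` and `p` of the Erdős-Rényi graph.

```python
import itertools
import math
import random


def random_shortest_path_metric(n, p, rng):
    """Return an n x n distance matrix for a random shortest path metric.

    Each of the possible edges is present independently with probability p
    (an Erdős-Rényi graph). Each present edge gets an independent random
    weight; exponential with rate 1 is an assumption made for this sketch.
    Vertices in different components end up at distance math.inf.
    """
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < p:
            w = rng.expovariate(1.0)
            dist[i][j] = dist[j][i] = w
    # Floyd-Warshall: replace every entry by the shortest path length.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# A fixed seed keeps the illustrative run reproducible.
d = random_shortest_path_metric(8, 0.6, random.Random(0))
```

Floyd-Warshall is used only because it is short; for larger sparse graphs, running Dijkstra's algorithm from every vertex would be the more usual choice.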

    An Approximation Algorithm for Multi Allocation Hub Location Problems

    The multi allocation p-hub median problem (MApHM), the multi allocation uncapacitated hub location problem (MAuHLP) and the multi allocation p-hub location problem (MApHLP) are common hub location problems (HLPs) with several practical applications. HLPs aim to construct a network for routing tasks between different locations. Specifically, a set of hubs must be chosen and each routing must be performed using one or two hubs as stopovers. The costs between two hubs are discounted. The objective is to minimize the total transportation cost in the MApHM and additionally to minimize the set-up costs for the hubs in the MAuHLP and MApHLP. In this paper, an approximation algorithm to solve these problems is developed, which improves the approximation bound for MApHM to 3.451, for MAuHLP to 2.173 and for MApHLP to 4.552 when combined with the algorithm of Benedito & Pedrosa. The proposed algorithm is capable of solving much bigger instances than any exact algorithm in the literature. New benchmark instances have been created and published for evaluation, such that HLP algorithms can be tested and compared on huge instances. The proposed algorithm performs better on most instances than the algorithm of Benedito & Pedrosa, which was the only known approximation algorithm for these problems until now.

    The Non-Uniform k-Center Problem

    In this paper, we introduce and study the Non-Uniform k-Center problem (NUkC). Given a finite metric space $(X,d)$ and a collection of balls of radii $\{r_1 \geq \cdots \geq r_k\}$, the NUkC problem is to find a placement of their centers on the metric space and find the minimum dilation $\alpha$, such that the union of balls of radius $\alpha \cdot r_i$ around the $i$th center covers all the points in $X$. This problem naturally arises as a min-max vehicle routing problem with fleets of different speeds. The NUkC problem generalizes the classic $k$-center problem when all the $k$ radii are the same (which can be assumed to be $1$ after scaling). It also generalizes the $k$-center with outliers (kCwO) problem when there are $k$ balls of radius $1$ and $\ell$ balls of radius $0$. There are $2$-approximation and $3$-approximation algorithms known for these problems respectively; the former is best possible unless P=NP and the latter remains unimproved for 15 years. We first observe that no $O(1)$-approximation to the optimal dilation is possible unless P=NP, implying that the NUkC problem is more non-trivial than the above two problems. Our main algorithmic result is an $(O(1),O(1))$-bi-criteria approximation result: we give an $O(1)$-approximation to the optimal dilation; however, we may open $\Theta(1)$ centers of each radius. Our techniques also allow us to prove a simple (uni-criteria), optimal $2$-approximation to the kCwO problem, improving upon the long-standing $3$-factor. Our main technical contribution is a connection between the NUkC problem and the so-called firefighter problems on trees which have been studied recently in the TCS community. Comment: Adjusted the figur
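
To illustrate the dilation in the abstract above, the following sketch computes the smallest $\alpha$ for one fixed placement of centers. It does not search over placements, which is the hard part of NUkC, and it is not the paper's algorithm. The function name `min_dilation`, the example points, and the handling of zero-radius balls are assumptions made for illustration.

```python
import math


def min_dilation(points, centers, radii, dist):
    """Smallest alpha such that balls of radius alpha * radii[i] around
    centers[i] cover every point, for this fixed placement only.

    A zero-radius ball covers a point only if the point sits on its center;
    this convention is an assumption of the sketch.
    """
    alpha = 0.0
    for x in points:
        best = math.inf
        for c, r in zip(centers, radii):
            d = dist(x, c)
            if d == 0:
                ratio = 0.0       # the point lies on the center
            elif r == 0:
                ratio = math.inf  # a zero-radius ball cannot be dilated to reach it
            else:
                ratio = d / r     # dilation needed for ball i to reach x
            best = min(best, ratio)
        # Every point must be covered, so the worst point sets the dilation.
        alpha = max(alpha, best)
    return alpha


# Illustrative instance: points on a line, one ball of radius 2 and one of radius 1.
points = [0, 1, 4, 10]
centers = [1, 10]
radii = [2, 1]
alpha = min_dilation(points, centers, radii, lambda a, b: abs(a - b))
```

With all radii equal to 1 the same function returns the usual $k$-center radius of the placement.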