
Approximating $k$-Median via Pseudo-Approximation

Authors: Li Shi; Svensson Ola
Publication date: 01/11/2012

We present a novel approximation algorithm for $k$-median that achieves an approximation guarantee of $1+\sqrt{3}+\epsilon$, improving upon the decade-old ratio of $3+\epsilon$. Our approach is based on two components, each of which, we believe, is of independent interest. First, we show that in order to give an $\alpha$-approximation algorithm for $k$-median, it is sufficient to give a pseudo-approximation algorithm that finds an $\alpha$-approximate solution by opening $k+O(1)$ facilities. This is a rather surprising result, as there exist instances for which opening $k+1$ facilities may lead to a significantly smaller cost than if only $k$ facilities were opened. Second, we give such a pseudo-approximation algorithm with $\alpha = 1+\sqrt{3}+\epsilon$. Prior to our work, it was not even known whether opening $k + o(k)$ facilities would help improve the approximation ratio.

Comment: 18 pages
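The gap between opening $k$ and $k+1$ facilities can be seen on a toy instance. The following is a minimal Python sketch, not taken from the paper: it assumes $k+1$ points at pairwise distance 1 and brute-forces the discrete $k$-median optimum. The helper names `kmedian_cost`, `best_cost`, and `uniform` are hypothetical.

```python
from itertools import combinations

def kmedian_cost(points, centers, dist):
    """Sum over points of the distance to the nearest open center."""
    return sum(min(dist(p, c) for c in centers) for p in points)

def best_cost(points, k, dist):
    """Optimal discrete k-median cost by exhaustive search (toy sizes only)."""
    return min(kmedian_cost(points, c, dist) for c in combinations(points, k))

def uniform(p, q):
    """Assumed toy metric: every pair of distinct points is at distance 1."""
    return 0 if p == q else 1

k = 3
points = list(range(k + 1))            # k + 1 points, pairwise distance 1
cost_k = best_cost(points, k, uniform)             # one point is left unserved locally
cost_k_plus_1 = best_cost(points, k + 1, uniform)  # every point is its own center
```

On this instance the optimum with $k$ facilities is positive while the optimum with $k+1$ facilities is zero, so no multiplicative bound relates the two costs.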

arXiv.org e-Print Archive

CiteSeerX

Fault Tolerant Clustering Revisited

Authors: Kumar Nirman; Raichel Benjamin
Publication date: 01/01/2013

In discrete $k$-center and $k$-median clustering, we are given a set of points $P$ in a metric space $M$, and the task is to output a set $C \subseteq P$, $|C| = k$, such that the cost of clustering $P$ using $C$ is as small as possible. For $k$-center, the cost is the furthest a point has to travel to its nearest center, whereas for $k$-median, the cost is the sum of all point-to-nearest-center distances. In the fault-tolerant versions of these problems, we are given an additional parameter $1 \leq \ell \leq k$, such that when computing the cost of clustering, points are assigned to their $\ell$-th nearest neighbor in $C$, instead of their nearest neighbor. We provide constant factor approximation algorithms for these problems that are both conceptually simple and highly practical from an implementation standpoint.
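The two cost functions and their fault-tolerant variants can be written down directly. Below is a minimal Python sketch that only evaluates the objectives for a given center set; it is not the approximation algorithm from the paper. It assumes points in the Euclidean plane, and the function name `ft_costs` is hypothetical.

```python
import math

def ft_costs(points, centers, ell, dist=math.dist):
    """Return (k-center cost, k-median cost) when each point is charged
    its distance to the ell-th nearest center (ell=1 is the classic case)."""
    if not 1 <= ell <= len(centers):
        raise ValueError("ell must satisfy 1 <= ell <= k")
    # For every point, sort its distances to all centers and take the ell-th.
    charged = [sorted(dist(p, c) for c in centers)[ell - 1] for p in points]
    return max(charged), sum(charged)

# Assumed toy instance: four points on a line, centers at the two ends.
points = [(0, 0), (1, 0), (4, 0), (5, 0)]
centers = [(0, 0), (5, 0)]
classic = ft_costs(points, centers, ell=1)         # nearest center
fault_tolerant = ft_costs(points, centers, ell=2)  # second-nearest center
```

With `ell=2` every point pays for its farther center, so both costs grow even though the center set is unchanged.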

arXiv.org e-Print Archive

University of Memphis Digital Commons

Certified Algorithms: Worst-Case Analysis and Beyond

Authors: Makarychev Konstantin; Makarychev Yury
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 01/01/2020

In this paper, we introduce the notion of a certified algorithm. Certified algorithms provide worst-case and beyond-worst-case performance guarantees. First, a $\gamma$-certified algorithm is also a $\gamma$-approximation algorithm: it finds a $\gamma$-approximation no matter what the input is. Second, it exactly solves $\gamma$-perturbation-resilient instances ($\gamma$-perturbation-resilient instances model real-life instances). Additionally, certified algorithms have a number of other desirable properties: they solve both maximization and minimization versions of a problem (e.g. Max Cut and Min Uncut), solve weakly perturbation-resilient instances, and solve optimization problems with hard constraints. In the paper, we define certified algorithms, describe their properties, present a framework for designing certified algorithms, and provide examples of certified algorithms for Max Cut/Min Uncut, Minimum Multiway Cut, $k$-medians and $k$-means. We also present some negative results.

Dagstuhl Research Online Publication Server

Fair Clustering Through Fairlets

Authors: Chierichetti Flavio; Kumar Ravi; Lattanzi Silvio; Vassilvitskii Sergei
Publication date: 01/01/2017

We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the $k$-center and the $k$-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions: for instance, a point may no longer be assigned to its nearest cluster center! En route we introduce the concept of fairlets, which are minimal sets that satisfy fair representation while approximately preserving the clustering objective. We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms. While finding good fairlets can be NP-hard, we proceed to obtain efficient approximation algorithms based on minimum cost flow. We empirically quantify the value of fair clustering on real-world datasets with sensitive attributes.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Constant-Factor FPT Approximation for Capacitated k-Median

Authors: Adamczyk Marek; Byrka Jaroslaw; Marcinkowski Jan; Meesum Syed M.; Wlodarczyk Michal
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 15/09/2018

Capacitated $k$-median is one of the few outstanding optimization problems for which the existence of a polynomial time constant factor approximation algorithm remains an open problem. In a series of recent papers, algorithms producing solutions that violate either the number of facilities or the capacity by a multiplicative factor were obtained. However, producing solutions without violations appears to be hard and potentially requires different algorithmic techniques. Notably, if parameterized by the number of facilities $k$, the problem is also W[2]-hard, making the existence of an exact FPT algorithm unlikely. In this work we provide an FPT-time constant factor approximation algorithm preserving both cardinality and capacity of the facilities. The algorithm runs in time $2^{O(k \log k)} n^{O(1)}$ and achieves an approximation ratio of $7+\epsilon$.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

The Hardness of Approximation of Euclidean k-means

Authors: Awasthi Pranjal; Charikar Moses; Krishnaswamy Ravishankar; Sinop Ali Kemal
Publication date: 01/01/2015

The Euclidean $k$-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning and the computational geometry communities. In this problem, we are given a set of $n$ points in Euclidean space $R^d$, and the goal is to choose $k$ centers in $R^d$ so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general $k$ and a $(1+\epsilon)$-approximation which runs in time $poly(n) 2^{O(k/\epsilon)}$. At the other extreme, the only known computational complexity result for this problem is NP-hardness [ADHP'09]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in $R^d$ can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all $k,d$. In this paper we provide the first hardness of approximation for the Euclidean $k$-means problem. Concretely, we show that there exists a constant $\epsilon > 0$ such that it is NP-hard to approximate the $k$-means objective to within a factor of $(1+\epsilon)$. We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest number of vertices which are incident on all the edges. Additionally, we give a proof that the current best hardness results for vertex cover can be carried over to triangle-free graphs. To show this we transform $G$, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph $H$, and showing that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph using a spectral analysis, which might be of independent interest.
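The remark that any point of $R^d$ can serve as a center can be checked on a small example. Below is a minimal Python sketch, assumed for illustration and not part of the paper's reduction: it evaluates the $k$-means objective for $k = 1$ and compares the best center chosen among the input points with the centroid, which is not an input point here. The names `kmeans_cost` and `centroid` are hypothetical.

```python
def kmeans_cost(points, centers):
    """Sum over points of the squared Euclidean distance to the nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )

def centroid(points):
    """Coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

# Assumed toy instance: the four corners of a square, k = 1.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
best_input_center = min(kmeans_cost(points, [p]) for p in points)
centroid_cost = kmeans_cost(points, [centroid(points)])
```

Here the centroid halves the cost of the best input point, which illustrates why the continuous choice of centers matters for this objective.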

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Dagstuhl Research Online Publication Server

Constant Factor Approximation for Capacitated k-Center with Outliers

Authors: Cygan Marek; Kociumaka Tomasz
Publication date: 01/01/2014

The $k$-center problem is a classic facility location problem, where given an edge-weighted graph $G = (V,E)$ one is to find a subset of $k$ vertices $S$, such that each vertex in $V$ is "close" to some vertex in $S$. The approximation status of this basic problem is well understood, as a simple 2-approximation algorithm is known to be tight. Consequently different extensions were studied. In the capacitated version of the problem each vertex is assigned a capacity, which is a strict upper bound on the number of clients a facility can serve, when located at this vertex. A constant factor approximation for the capacitated $k$-center was obtained last year by Cygan, Hajiaghayi and Khuller [FOCS'12], which was recently improved to a 9-approximation by An, Bhaskara and Svensson [arXiv'13]. In a different generalization of the problem some clients (denoted as outliers) may be disregarded. Here we are additionally given an integer $p$ and the goal is to serve exactly $p$ clients, which the algorithm is free to choose. In 2001 Charikar et al. [SODA'01] presented a 3-approximation for the $k$-center problem with outliers. In this paper we consider a common generalization of the two extensions previously studied separately, i.e. we work with the capacitated $k$-center with outliers. We present the first constant factor approximation algorithm with approximation ratio of 25 even for the case of non-uniform hard capacities.

Comment: 15 pages, 3 figures, accepted to STACS 201
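One well-known 2-approximation for the basic (uncapacitated, no outliers) $k$-center problem is the greedy farthest-first traversal commonly attributed to Gonzalez. The abstract does not say which algorithm it has in mind, so the choice here is an assumption. The minimal Python sketch below also assumes points in the Euclidean plane rather than an edge-weighted graph, and the function name `farthest_first` is hypothetical.

```python
import math

def farthest_first(points, k, dist=math.dist):
    """Greedy k-center: start at an arbitrary point, then repeatedly open
    the point farthest from the centers chosen so far.
    Assumes 1 <= k <= len(points)."""
    centers = [points[0]]
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[i])
        nearest = [min(d, dist(p, points[i])) for d, p in zip(nearest, points)]
    # The k-center cost is the largest remaining point-to-center distance.
    return centers, max(nearest)

# Assumed toy instance: three well-separated groups on a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (20, 0)]
centers, radius = farthest_first(points, k=3)
```

The sketch handles neither capacities nor outliers; the extensions discussed in the abstract need additional machinery.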

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median

Authors: Friggstad Zachary; Zhang Yifeng
Publication date: 01/01/2016

Budgeted Red-Blue Median is a generalization of classic $k$-Median in that there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can be used to serve clients located in some metric space. The goal is to open $k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for some given bounds $k_r, k_b$ and connect each client to their nearest open facility in a way that minimizes the total connection cost. We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a multiple-swap local search heuristic can be used to obtain a $(5+\epsilon)$-approximation for Budgeted Red-Blue Median for any constant $\epsilon > 0$. This is an improvement over their single swap analysis and beats the previous best approximation guarantee of 8 by Swamy [2014]. We also present a matching lower bound showing that for every