Search CORE

792 research outputs found

Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median

Authors: Friggstad, Zachary; Zhang, Yifeng
Publication date: 01/01/2016

Budgeted Red-Blue Median is a generalization of classic $k$-Median in that there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can be used to serve clients located in some metric space. The goal is to open $k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for some given bounds $k_r, k_b$ and connect each client to their nearest open facility in a way that minimizes the total connection cost. We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a multiple-swap local search heuristic can be used to obtain a $(5+\epsilon)$-approximation for Budgeted Red-Blue Median for any constant $\epsilon > 0$. This is an improvement over their single swap analysis and beats the previous best approximation guarantee of 8 by Swamy [2014]. We also present a matching lower bound showing that for every $p \geq 1$, there are instances of Budgeted Red-Blue Median with local optimum solutions for the $p$-swap heuristic whose cost is $5 + \Omega\left(\frac{1}{p}\right)$ times the optimum solution cost. Thus, our analysis is tight up to the lower order terms. In particular, for any $\epsilon > 0$ we show the single-swap heuristic admits local optima whose cost can be as bad as $7-\epsilon$ times the optimum solution cost.
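As an illustration of the kind of heuristic this abstract analyses, the following is a minimal Python sketch of a $p$-swap local search. It assumes one plausible reading of a "swap" (close up to $p$ open facilities of each colour and open the same number of closed ones of that colour), and the names `connection_cost`, `swap_neighbours`, and `p_swap_local_search` are hypothetical. It accepts any strictly improving move, whereas a polynomial-time analysis would typically require a minimum relative improvement per step.

```python
from itertools import combinations, product

def connection_cost(clients, open_facilities, dist):
    """Total cost of connecting each client to its nearest open facility."""
    return sum(min(dist(c, f) for f in open_facilities) for c in clients)

def swap_neighbours(open_set, candidates, p):
    """Yield every set reachable by closing j <= p open facilities and
    opening j closed ones of the same colour (the budget is preserved)."""
    closed = [f for f in candidates if f not in open_set]
    for j in range(p + 1):
        for out in combinations(sorted(open_set), j):
            for inn in combinations(closed, j):
                yield (open_set - set(out)) | set(inn)

def p_swap_local_search(clients, red, blue, k_r, k_b, dist, p=1):
    """Assumed p-swap local search: start from an arbitrary feasible
    solution and apply improving swaps until none exists."""
    open_r, open_b = frozenset(red[:k_r]), frozenset(blue[:k_b])
    best = connection_cost(clients, open_r | open_b, dist)
    improved = True
    while improved:
        improved = False
        red_options = list(swap_neighbours(open_r, red, p))
        blue_options = list(swap_neighbours(open_b, blue, p))
        for new_r, new_b in product(red_options, blue_options):
            c = connection_cost(clients, new_r | new_b, dist)
            if c < best:  # strictly improving move: take it and restart
                open_r, open_b, best = new_r, new_b, c
                improved = True
                break
    return open_r, open_b, best
```

The search terminates because the cost strictly decreases and there are finitely many feasible solutions; the returned solution is a local optimum for this neighbourhood, not necessarily a global one.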

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

The Non-Uniform k-Center Problem

Authors: Chakrabarty, Deeparnab; Goyal, Prachi; Krishnaswamy, Ravishankar
Publication date: 01/01/2016

In this paper, we introduce and study the Non-Uniform k-Center problem (NUkC). Given a finite metric space $(X,d)$ and a collection of balls of radii $\{r_1\geq \cdots \geq r_k\}$, the NUkC problem is to find a placement of their centers on the metric space and find the minimum dilation $\alpha$, such that the union of balls of radius $\alpha\cdot r_i$ around the $i$th center covers all the points in $X$. This problem naturally arises as a min-max vehicle routing problem with fleets of different speeds. The NUkC problem generalizes the classic $k$-center problem when all the $k$ radii are the same (which can be assumed to be $1$ after scaling). It also generalizes the $k$-center with outliers (kCwO) problem when there are $k$ balls of radius $1$ and $\ell$ balls of radius $0$. There are $2$-approximation and $3$-approximation algorithms known for these problems respectively; the former is best possible unless P=NP and the latter has remained unimproved for 15 years. We first observe that no $O(1)$-approximation to the optimal dilation is possible unless P=NP, implying that the NUkC problem is more non-trivial than the above two problems. Our main algorithmic result is an $(O(1),O(1))$-bi-criteria approximation result: we give an $O(1)$-approximation to the optimal dilation; however, we may open $\Theta(1)$ centers of each radius. Our techniques also allow us to prove a simple (uni-criteria), optimal $2$-approximation to the kCwO problem, improving upon the long-standing $3$-factor. Our main technical contribution is a connection between the NUkC problem and the so-called firefighter problems on trees, which have been studied recently in the TCS community.
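To make the NUkC objective concrete, here is a small Python sketch that evaluates the minimum dilation $\alpha$ for a fixed placement of centers. It only evaluates a given solution and is not the approximation algorithm from the paper; the function name `min_dilation` and the handling of radius-$0$ balls (they cover only a point at distance $0$) are assumptions made for illustration.

```python
import math

def min_dilation(points, centers, radii, dist):
    """Smallest alpha such that the balls of radius alpha * radii[i]
    around centers[i] together cover every point."""
    alpha = 0.0
    for x in points:
        needed = math.inf  # dilation needed to cover x by its best ball
        for c, r in zip(centers, radii):
            d = dist(x, c)
            if d == 0:
                needed = 0.0              # any ball covers its own center
            elif r > 0:
                needed = min(needed, d / r)
            # a radius-0 ball never covers a point at positive distance
        alpha = max(alpha, needed)
    return alpha
```

For example, on the line with points `[0, 1, 2, 10]`, centers `[1, 10]` and radii `[2, 1]`, the points `0` and `2` are at distance `1` from the radius-`2` ball, so the required dilation is `0.5`.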

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Dependent randomized rounding for clustering and partition systems with knapsack constraints

Authors: Harris, David G.; Pensyl, Thomas; Srinivasan, Aravind; Trinh, Khoa
Publication date: 01/04/2020

Clustering problems are fundamental to unsupervised learning. There is an increased emphasis on fairness in machine learning and AI; one representative notion of fairness is that no single demographic group should be over-represented among the cluster-centers. This, and much more general clustering problems, can be formulated with "knapsack" and "partition" constraints. We develop new randomized algorithms targeting such problems, and study two in particular: multi-knapsack median and multi-knapsack center. Our rounding algorithms give new approximation and pseudo-approximation algorithms for these problems. One key technical tool, which may be of independent interest, is a new tail bound analogous to Feige (2006) for sums of random variables with unbounded variances. Such bounds are very useful in inferring properties of large networks using few samples

arXiv.org e-Print Archive

FPT Approximation for Fair Minimum-Load Clustering

Authors: Bandyapadhyay, Sayan; Fomin, Fedor V.; Golovach, Petr A.; Purohit, Nidhi; Simonov, Kirill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th International Symposium on Parameterized and Exact Computation (IPEC 2022)
Publication date: 01/01/2022

In this paper, we consider the Minimum-Load k-Clustering/Facility Location (MLkC) problem where we are given a set P of n points in a metric space that we have to cluster and an integer k > 0 that denotes the number of clusters. Additionally, we are given a set F of cluster centers in the same metric space. The goal is to select a set $C \subseteq F$ of k centers and assign each point in P to a center in C, such that the maximum load over all centers is minimized. Here the load of a center is the sum of the distances between it and the points assigned to it. Although clustering/facility location problems have rich literature, the minimum-load objective has not been studied substantially, and hence MLkC has remained a poorly understood problem. More interestingly, the problem is notoriously hard even in some special cases including the one in line metrics as shown by Ahmadian et al. [APPROX 2014, ACM Trans. Algorithms 2018]. They also show APX-hardness of the problem in the plane. On the other hand, the best-known approximation factor for MLkC is O(k), even in the plane. In this work, we study a fair version of MLkC inspired by the work of Chierichetti et al. [NeurIPS, 2017]. Here the input points are partitioned into $\ell$ protected groups, and only clusters that proportionally represent each group are allowed. MLkC is the special case with $\ell = 1$. For the fair version, we are able to obtain a randomized 3-approximation algorithm in $f(k,\ell)\cdot n^{O(1)}$ time. Also, our scheme leads to an improved $(1 + \epsilon)$-approximation in the case of Euclidean norm with the same running time (depending also linearly on the dimension d). Our results imply the same approximations for MLkC with running time $f(k)\cdot n^{O(1)}$, achieving the first constant-factor FPT approximations for this problem in general and Euclidean metric spaces.
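The load objective defined in this abstract can be illustrated with a short Python sketch. The helper names `max_load` and `nearest_assignment` are hypothetical, and the code only evaluates given assignments (it ignores the fairness constraint and is not the FPT algorithm). It shows that, under the maximum-load objective, sending every point to its nearest center is not necessarily the best assignment.

```python
from collections import defaultdict

def max_load(points, assignment, dist):
    """Maximum, over centers, of the summed distances of assigned points.
    assignment[i] is the center serving points[i]."""
    loads = defaultdict(float)
    for p, c in zip(points, assignment):
        loads[c] += dist(p, c)
    return max(loads.values())

def nearest_assignment(points, centers, dist):
    """Assign each point to its closest center (a natural baseline)."""
    return [min(centers, key=lambda c: dist(p, c)) for p in points]

# Six co-located points at 1 on the line, candidate centers at 0 and 4.
points = [1] * 6
centers = [0, 4]
d = lambda a, b: abs(a - b)

nearest = nearest_assignment(points, centers, d)   # every point goes to 0
rebalanced = [0] * 5 + [4]                         # one point moved to 4
```

Here `max_load(points, nearest, d)` is `6.0`, while the rebalanced assignment has loads `5` and `3` and therefore maximum load `5.0`.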

Dagstuhl Research Online Publication Server

FPT Approximation for Fair Minimum-Load Clustering

Authors: Bandyapadhyay, Sayan; Fomin, Fedor V.; Golovach, Petr A.; Purohit, Nidhi; Simonov, Kirill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th International Symposium on Parameterized and Exact Computation (IPEC 2022)
Publication date: 20/07/2021

arXiv.org e-Print Archive

University of Bergen

Dagstuhl Research Online Publication Server

Constant-Factor FPT Approximation for Capacitated k-Median

Authors: Adamczyk, Marek; Byrka, Jaroslaw; Marcinkowski, Jan; Meesum, Syed M.; Wlodarczyk, Michal
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 15/09/2018

Capacitated k-median is one of the few outstanding optimization problems for which the existence of a polynomial time constant factor approximation algorithm remains an open problem. In a series of recent papers, algorithms producing solutions violating either the number of facilities or the capacity by a multiplicative factor were obtained. However, to produce solutions without violations appears to be hard and potentially requires different algorithmic techniques. Notably, if parameterized by the number of facilities k, the problem is also W[2]-hard, making the existence of an exact FPT algorithm unlikely. In this work we provide an FPT-time constant factor approximation algorithm preserving both cardinality and capacity of the facilities. The algorithm runs in time $2^{O(k \log k)} n^{O(1)}$ and achieves an approximation ratio of $7+\epsilon$.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

The Hardness of Approximation of Euclidean k-means

Authors: Awasthi, Pranjal; Charikar, Moses; Krishnaswamy, Ravishankar; Sinop, Ali Kemal
Publication date: 01/01/2015

The Euclidean $k$-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning and the computational geometry communities. In this problem, we are given a set of $n$ points in Euclidean space $R^d$, and the goal is to choose $k$ centers in $R^d$ so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general $k$ and a $(1+\epsilon)$-approximation which runs in time $\mathrm{poly}(n)\, 2^{O(k/\epsilon)}$. At the other extreme, the only known computational complexity result for this problem is NP-hardness [ADHP'09]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in $R^d$ can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all $k,d$. In this paper we provide the first hardness of approximation for the Euclidean $k$-means problem. Concretely, we show that there exists a constant $\epsilon > 0$ such that it is NP-hard to approximate the $k$-means objective to within a factor of $(1+\epsilon)$. We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest number of vertices which are incident on all the edges. Additionally, we give a proof that the current best hardness results for vertex cover can be carried over to triangle-free graphs. To show this we transform $G$, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph $H$, and showing that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph using a spectral analysis, which might be of independent interest.
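The $k$-means objective described in this abstract (sum of squared distances of each point to its nearest center) can be written down in a few lines of Python. This is only a sketch of the objective, with hypothetical helper names `kmeans_cost` and `centroid`; it is unrelated to the hardness reduction itself. Because any point of the space may serve as a center, the best single center for a fixed cluster is its centroid.

```python
def kmeans_cost(points, centers):
    """Sum over points of the squared Euclidean distance to the nearest center."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points (tuples)."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

# Three points in the plane, clustered as {(0,0), (2,0)} and {(10,0)}.
points = [(0, 0), (2, 0), (10, 0)]
centers = [centroid([(0, 0), (2, 0)]), (10, 0)]
```

With these centers, `(0, 0)` and `(2, 0)` each contribute a squared distance of `1` to the center `(1.0, 0.0)`, so `kmeans_cost(points, centers)` evaluates to `2.0`.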

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Dagstuhl Research Online Publication Server

Approximation Algorithms for Clustering with Dynamic Points

Authors: Deng, Shichuan; Li, Jian; Rabani, Yuval
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 28th Annual European Symposium on Algorithms (ESA 2020)
Publication date: 01/01/2020

In many classic clustering problems, we seek to sketch a massive data set of $n$ points in a metric space, by segmenting them into $k$ categories or clusters, each cluster represented concisely by a single point in the metric space. Two notable examples are the $k$-center/$k$-supplier problem and the $k$-median problem. In practical applications of clustering, the data set may evolve over time, reflecting an evolution of the underlying clustering model. In this paper, we initiate the study of a dynamic version of clustering problems that aims to capture these considerations. In this version there are $T$ time steps, and in each time step $t\in\{1,2,\dots,T\}$, the set of clients needed to be clustered may change, and we can move the $k$ facilities between time steps. More specifically, we study two concrete problems in this framework: the Dynamic Ordered $k$-Median and the Dynamic $k$-Supplier problem. We first consider the Dynamic Ordered $k$-Median problem, where the objective is to minimize the weighted sum of ordered distances over all time steps, plus the total cost of moving the facilities between time steps. We present one constant-factor approximation algorithm for $T=2$ and another approximation algorithm for fixed $T \geq 3$. Then we consider the Dynamic $k$-Supplier problem, where the objective is to minimize the maximum distance from any client to its facility, subject to the constraint that between time steps the maximum distance moved by any facility is no more than a given threshold. When the number of time steps $T$ is 2, we present a simple constant factor approximation algorithm and a bi-criteria constant factor approximation algorithm for the outlier version, where some of the clients can be discarded. We also show that it is NP-hard to approximate the problem with any factor for $T \geq 3$. Comment: To be published in the Proceedings of the 28th Annual European Symposium on Algorithms (ESA 2020).
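The Dynamic $k$-Supplier objective and its movement constraint can be sketched in Python as a checker for a given solution. The function name `evaluate_dynamic_supplier` is hypothetical, facility $i$ at one step is assumed to be the same facility as index $i$ at the next step, and each client is assumed to be served by its nearest facility at that step. The sketch evaluates a solution only; it does not search for one and does not enforce that facilities sit on allowed supplier locations.

```python
def evaluate_dynamic_supplier(clients_by_step, facilities_by_step, dist, move_limit):
    """Return (feasible, radius) for a given solution.

    feasible: no facility moves farther than move_limit between
              consecutive time steps.
    radius:   maximum, over all steps and clients, of the distance from
              a client to its nearest facility at that step.
    """
    feasible = all(
        dist(a, b) <= move_limit
        for prev, nxt in zip(facilities_by_step, facilities_by_step[1:])
        for a, b in zip(prev, nxt)   # same index = same facility
    )
    radius = max(
        min(dist(c, f) for f in facilities)
        for clients, facilities in zip(clients_by_step, facilities_by_step)
        for c in clients
    )
    return feasible, radius
```

For instance, with clients `[[0, 2], [5, 7]]` over two steps and a single facility moving from `1` to `6` on the line, the service radius is `1`; the move of length `5` is feasible for a threshold of `5` but not for a threshold of `4`.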

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server