
61 research outputs found

On the Fixed-Parameter Tractability of Capacitated Clustering

Author: Cohen-Addad Vincent
Li Jason
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)
Publication date: 01/01/2019

We study the complexity of the classic capacitated k-median and k-means problems parameterized by the number of centers, k. These problems are notoriously difficult since the best known approximation bound for high dimensional Euclidean space and general metric space is Theta(log k) and it remains a major open problem whether a constant factor exists. We show that there exists a (3+epsilon)-approximation algorithm for the capacitated k-median and a (9+epsilon)-approximation algorithm for the capacitated k-means problem in general metric spaces whose running times are f(epsilon,k) n^{O(1)}. For Euclidean inputs of arbitrary dimension, we give a (1+epsilon)-approximation algorithm for both problems with a similar running time. This is a significant improvement over the (7+epsilon)-approximation of Adamczyk et al. for k-median in general metric spaces and the (69+epsilon)-approximation of Xu et al. for Euclidean k-means

Dagstuhl Research Online Publication Server

Improved Bounds for Metric Capacitated Covering Problems

Author: Bandyapadhyay Sayan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 28th Annual European Symposium on Algorithms (ESA 2020)
Publication date: 01/01/2020

In the Metric Capacitated Covering (MCC) problem, given a set of balls B in a metric space P with metric d and a capacity parameter U, the goal is to find a minimum sized subset B' ⊆ B and an assignment of the points in P to the balls in B' such that each point is assigned to a ball that contains it and each ball is assigned at most U points. MCC achieves an O(log |P|)-approximation using a greedy algorithm. On the other hand, it is hard to approximate within a factor of o(log |P|) even with β < 3 factor expansion of the balls. Bandyapadhyay et al. [SoCG 2018, DCG 2019] showed that one can obtain an O(1)-approximation for the problem with 6.47 factor expansion of the balls. An open question left by their work is to reduce the gap between the lower bound 3 and the upper bound 6.47. In this work, we show that it is possible to obtain an O(1)-approximation with only 4.24 factor expansion of the balls. We also show a similar upper bound of 5 for a more generalized version of MCC for which the best previously known bound was 9.
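The feasibility conditions in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only checks whether a given choice of balls and a given assignment satisfy the containment and capacity constraints, optionally after expanding every ball's radius by a factor. The function name `is_feasible_mcc`, the parameter names, and the use of points on the real line with absolute difference as the metric are assumptions made for illustration.

```python
from collections import Counter


def is_feasible_mcc(points, balls, chosen, assignment, capacity, expansion=1.0):
    """Check an MCC solution (hypothetical helper, illustration only).

    points:     dict point name -> coordinate on the real line (assumed metric)
    balls:      dict ball name -> (center, radius)
    chosen:     set of ball names selected for the solution
    assignment: dict point name -> ball name
    capacity:   maximum number of points per ball (the parameter U)
    expansion:  factor by which every radius is enlarged
    """
    # Every point must be assigned, and only to chosen balls.
    if set(assignment) != set(points):
        return False
    if any(ball not in chosen for ball in assignment.values()):
        return False
    # No ball may receive more than `capacity` points.
    load = Counter(assignment.values())
    if any(count > capacity for count in load.values()):
        return False
    # Each point must lie inside its (expanded) ball.
    for point, ball in assignment.items():
        center, radius = balls[ball]
        if abs(points[point] - center) > expansion * radius:
            return False
    return True


points = {"p1": 0.0, "p2": 1.0, "p3": 2.5, "p4": 4.0}
balls = {"A": (1.0, 1.0), "B": (4.0, 1.0)}
assignment = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
print(is_feasible_mcc(points, balls, {"A", "B"}, assignment, capacity=2))
print(is_feasible_mcc(points, balls, {"A", "B"}, assignment, capacity=2, expansion=1.5))
```

In this toy instance the point at 2.5 lies outside ball B at its original radius but inside it after a 1.5 factor expansion, which is the kind of relaxation the expansion bounds in the abstract refer to.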

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

FPT Approximations for Capacitated/Fair Clustering with Outliers

Author: Dabas Rajni
Gupta Neelima
Inamdar Tanmay
Publication date: 02/05/2023

Clustering problems such as $k$-Median and $k$-Means are motivated by applications such as location planning and unsupervised learning, among others. In such applications, it is important to find a clustering of points that is not "skewed" in terms of the number of points, i.e., no cluster should contain too many points. This is modeled by capacity constraints on the sizes of clusters. In an orthogonal direction, another important consideration in clustering is how to handle the presence of outliers in the data. Indeed, these clustering problems have been generalized in the literature to separately handle capacity constraints and outliers. To the best of our knowledge, there has been very little work on studying the approximability of clustering problems that can simultaneously handle both capacities and outliers. We initiate the study of the Capacitated $k$-Median with Outliers (C$k$MO) problem. Here, we want to cluster all except $m$ outlier points into at most $k$ clusters, such that (i) the clusters respect the capacity constraints, and (ii) the cost of clustering, defined as the sum of distances of each non-outlier point to its assigned cluster-center, is minimized. We design the first constant-factor approximation algorithms for C$k$MO. In particular, our algorithm returns a $(3+\epsilon)$-approximation for C$k$MO in general metric spaces, and a $(1+\epsilon)$-approximation in Euclidean spaces of constant dimension, that runs in time $f(k, m, \epsilon) \cdot |I_m|^{O(1)}$, where $|I_m|$ denotes the input size. We can also extend these results to a broader class of problems, including Capacitated $k$-Means/$k$-Facility Location with Outliers, and Size-Balanced Fair Clustering problems with Outliers. For each of these problems, we obtain an approximation ratio that matches the best known guarantee of the corresponding outlier-free problem.

Comment: Abstract shortened to meet arXiv requirement
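The objective and constraints described in this abstract can be made concrete with a small sketch. It only evaluates a given solution and is not the approximation algorithm from the paper. The function name `ckmo_cost`, the per-center capacity dictionary, and the choice of points on the real line with absolute difference as the distance are assumptions for illustration.

```python
def ckmo_cost(points, centers, assignment, capacity, k, m, dist):
    """Cost of a capacitated k-median solution with outliers (illustration only).

    points:     list of all input points
    centers:    list of opened centers
    assignment: dict point -> center; unassigned points are the outliers
    capacity:   dict center -> maximum number of points it may serve
    k, m:       bounds on the number of centers and outliers
    dist:       distance function dist(point, center)
    """
    if len(centers) > k:
        raise ValueError("more than k centers opened")
    outliers = [p for p in points if p not in assignment]
    if len(outliers) > m:
        raise ValueError("more than m points left unassigned")
    load = {c: 0 for c in centers}
    for p, c in assignment.items():
        if c not in load:
            raise ValueError(f"point {p} assigned to unopened center {c}")
        load[c] += 1
    for c, n in load.items():
        if n > capacity[c]:
            raise ValueError(f"capacity of center {c} exceeded")
    # Sum of distances from each non-outlier point to its assigned center.
    return sum(dist(p, c) for p, c in assignment.items())


points = [0, 1, 2, 10, 11, 100]
assignment = {0: 1, 1: 1, 2: 1, 10: 10, 11: 10}  # the point 100 is the outlier
cost = ckmo_cost(points, [1, 10], assignment, {1: 3, 10: 2}, k=2, m=1,
                 dist=lambda p, c: abs(p - c))
print(cost)  # 3
```

Discarding the far point 100 as the single allowed outlier keeps the cost at 3; forcing it into a cluster would both raise the cost sharply and violate a capacity in this instance.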

arXiv.org e-Print Archive

The Capacitated Matroid Median Problem

Author: Kalhan Sanchit
Publication venue: 'University of Waterloo'
Publication date: 16/05/2018

In this thesis, we study the capacitated generalization of the Matroid Median Problem, which is itself a generalization of the classical clustering problem called the k-Median problem. In the capacitated matroid median problem, we are given a set F of facilities, a set D of clients and a common metric defined on F ∪ D, where the cost of connecting client j to facility i is denoted as c_{ij}. Each client j ∈ D has a demand of d_j, and each facility i ∈ F has an opening cost of f_i and a capacity u_i which limits the amount of demand that can be assigned to facility i. Moreover, there is a matroid M = (F,I) defined on the set of facilities. A solution to the capacitated matroid median problem involves opening a set of facilities F' ⊆ F such that F' ∈ I, and finding an assignment i(j) ∈ F' for every j ∈ D such that each facility i ∈ F' is assigned at most u_i demand. The cost associated with such a solution is Σ_{i∈F'} f_i + Σ_{j∈D} d_j c_{i(j)j}. Our goal is to find a solution of minimum cost. As the Matroid Median Problem generalizes the classical NP-Hard problem called k-median, it is also NP-Hard. We provide a bi-criteria approximation algorithm for the capacitated Matroid Median Problem with uniform capacities based on rounding the natural LP for the problem. Our algorithm achieves an approximation guarantee of 76 and violates the capacities by a factor of at most 6. We complement this result by providing two integrality gap results for the natural LP for capacitated matroid median.
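As a hedged illustration of the problem definition above, the following sketch evaluates the cost of a given solution (opening costs of the opened facilities plus demand-weighted connection costs) after checking matroid independence and capacities. It is not the LP-rounding algorithm of the thesis. The names `cmm_cost` and `uniform_matroid` are hypothetical, and the uniform matroid is just one convenient independence oracle chosen for the example.

```python
def uniform_matroid(rank):
    """Independence oracle for a uniform matroid: any set of size <= rank."""
    return lambda facility_set: len(facility_set) <= rank


def cmm_cost(opened, assignment, opening_cost, demand, capacity, conn_cost,
             is_independent):
    """Cost of a capacitated matroid median solution (illustration only).

    opened:         set of opened facilities F'
    assignment:     dict client j -> facility i(j)
    opening_cost:   dict facility i -> f_i
    demand:         dict client j -> d_j
    capacity:       dict facility i -> u_i
    conn_cost:      dict (facility i, client j) -> c_ij
    is_independent: oracle telling whether a facility set is in I
    """
    if not is_independent(frozenset(opened)):
        raise ValueError("opened set is not independent in the matroid")
    if set(assignment) != set(demand):
        raise ValueError("every client must be assigned")
    load = {i: 0 for i in opened}
    for j, i in assignment.items():
        if i not in load:
            raise ValueError(f"client {j} assigned to unopened facility {i}")
        load[i] += demand[j]
    for i, total in load.items():
        if total > capacity[i]:
            raise ValueError(f"capacity of facility {i} exceeded")
    opening = sum(opening_cost[i] for i in opened)
    connection = sum(demand[j] * conn_cost[(i, j)] for j, i in assignment.items())
    return opening + connection


conn = {("a", 1): 1, ("a", 2): 2, ("a", 3): 3,
        ("b", 1): 3, ("b", 2): 1, ("b", 3): 1}
cost = cmm_cost({"a", "b"}, {1: "a", 2: "b", 3: "a"},
                opening_cost={"a": 5, "b": 4},
                demand={1: 1, 2: 2, 3: 1},
                capacity={"a": 3, "b": 2},
                conn_cost=conn,
                is_independent=uniform_matroid(2))
print(cost)  # 15
```

With a uniform matroid of rank k, the independence constraint reduces to opening at most k facilities, which is how the matroid median problem specializes to k-median.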

University of Waterloo's Institutional Repository

FPT Constant-Approximations for Capacitated Clustering to Minimize the Sum of Cluster Radii

Author: Bandyapadhyay Sayan
Lochet William
Saurabh Saket
Publication date: 01/01/2023

Clustering with capacity constraints is a fundamental problem that has attracted significant attention over the years. In this paper, we give the first FPT constant-factor approximation algorithm for the problem of clustering points in a general metric into $k$ clusters to minimize the sum of cluster radii, subject to non-uniform hard capacity constraints. In particular, we give a $(15+\epsilon)$-approximation algorithm that runs in $2^{O(k^2\log k)}\cdot n^3$ time. When capacities are uniform, we obtain the following improved approximation bounds: a $(4+\epsilon)$-approximation with running time $2^{O(k\log(k/\epsilon))}n^3$, which significantly improves over the FPT 28-approximation of Inamdar and Varadarajan [ESA 2020]; a $(2+\epsilon)$-approximation with running time $2^{O(k/\epsilon^2 \cdot\log(k/\epsilon))}dn^3$ and a $(1+\epsilon)$-approximation with running time $2^{O(kd\log(k/\epsilon))}n^{3}$ in the Euclidean space; and a $(1+\epsilon)$-approximation in the Euclidean space with running time $2^{O(k/\epsilon^2 \cdot\log(k/\epsilon))}dn^3$ if we are allowed to violate the capacities by a $(1+\epsilon)$-factor. We complement this result by showing that there is no $(1+\epsilon)$-approximation algorithm running in time $f(k)\cdot n^{O(1)}$ if no capacity violation is allowed.

Comment: Full version of a paper accepted to SoCG 202

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Capacitated Sum-Of-Radii Clustering: An FPT Approximation

Author: Inamdar Tanmay
Varadarajan Kasturi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 28th Annual European Symposium on Algorithms (ESA 2020)
Publication date: 01/01/2020

Dagstuhl Research Online Publication Server

Coresets for Clustering with General Assignment Constraints

Author: Huang Lingxiao
Jiang Shaofeng H. -C.
Li Jian
Wu Xuan
Publication date: 23/01/2023

Designing small-sized coresets, which approximately preserve the costs of the solutions for large datasets, has been an important research direction for the past decade. We consider coreset construction for a variety of general constrained clustering problems. We significantly extend and generalize the results of a very recent paper (Braverman et al., FOCS'22), by demonstrating that the idea of hierarchical uniform sampling (Chen, SICOMP'09; Braverman et al., FOCS'22) can be applied to efficiently construct coresets for a very general class of constrained clustering problems with general assignment constraints, including capacity constraints on cluster centers, and assignment structure constraints for data points (modeled by a convex body $\mathcal{B}$). Our main theorem shows that a small-sized $\epsilon$-coreset exists as long as a complexity measure $\mathsf{Lip}(\mathcal{B})$ of the structure constraint, and the covering exponent $\Lambda_\epsilon(\mathcal{X})$ for metric space $(\mathcal{X},d)$ are bounded. The complexity measure $\mathsf{Lip}(\mathcal{B})$ for convex body $\mathcal{B}$ is the Lipschitz constant of a certain transportation problem constrained in $\mathcal{B}$, called the optimal assignment transportation problem. We prove nontrivial upper bounds of $\mathsf{Lip}(\mathcal{B})$ for various polytopes, including the general matroid basis polytopes, and laminar matroid polytopes (with better bound). As an application of our general theorem, we construct the first coreset for the fault-tolerant clustering problem (with or without capacity upper/lower bound) for the above metric spaces, in which the fault-tolerance requirement is captured by a uniform matroid basis polytope.

arXiv.org e-Print Archive