
    Constant-Factor FPT Approximation for Capacitated k-Median

    Capacitated k-median is one of the few outstanding optimization problems for which the existence of a polynomial-time constant-factor approximation algorithm remains an open problem. In a series of recent papers, algorithms producing solutions that violate either the number of facilities or the capacities by a multiplicative factor were obtained. However, producing solutions without violations appears to be hard and potentially requires different algorithmic techniques. Notably, when parameterized by the number of facilities k, the problem is also W[2]-hard, making the existence of an exact FPT algorithm unlikely. In this work we provide an FPT-time constant-factor approximation algorithm preserving both the cardinality and the capacities of the facilities. The algorithm runs in time 2^O(k log k) n^O(1) and achieves an approximation ratio of 7+epsilon.
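
    The two constraints this abstract insists on preserving, cardinality and capacity, can be checked directly on any candidate assignment. A minimal sketch with hypothetical names, assuming unit demands and an explicit client-to-facility assignment:

```python
from collections import Counter

def respects_cardinality_and_capacity(assignment, capacities, k):
    """Check that an assignment opens at most k facilities and sends no
    facility more clients than its capacity.

    assignment : dict {client -> facility}
    capacities : dict {facility -> capacity}
    """
    load = Counter(assignment.values())
    if len(load) > k:                                     # cardinality constraint
        return False
    return all(load[f] <= capacities[f] for f in load)    # hard capacities
```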

    FPT Approximations for Capacitated/Fair Clustering with Outliers

    Clustering problems such as $k$-Median and $k$-Means are motivated by applications such as location planning and unsupervised learning, among others. In such applications, it is important to find a clustering of points that is not "skewed" in terms of the number of points, i.e., no cluster should contain too many points. This is modeled by capacity constraints on the sizes of clusters. In an orthogonal direction, another important consideration in clustering is how to handle the presence of outliers in the data. Indeed, these clustering problems have been generalized in the literature to separately handle capacity constraints and outliers. To the best of our knowledge, there has been very little work on studying the approximability of clustering problems that can simultaneously handle both capacities and outliers. We initiate the study of the Capacitated $k$-Median with Outliers (C$k$MO) problem. Here, we want to cluster all except $m$ outlier points into at most $k$ clusters, such that (i) the clusters respect the capacity constraints, and (ii) the cost of clustering, defined as the sum of distances of each non-outlier point to its assigned cluster-center, is minimized. We design the first constant-factor approximation algorithms for C$k$MO. In particular, our algorithm returns a $(3+\epsilon)$-approximation for C$k$MO in general metric spaces, and a $(1+\epsilon)$-approximation in Euclidean spaces of constant dimension, running in time $f(k, m, \epsilon) \cdot |I_m|^{O(1)}$, where $|I_m|$ denotes the input size. We can also extend these results to a broader class of problems, including Capacitated $k$-Means/$k$-Facility Location with Outliers, and Size-Balanced Fair Clustering problems with Outliers. For each of these problems, we obtain an approximation ratio that matches the best known guarantee of the corresponding outlier-free problem.
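
    The objective in points (i) and (ii) can be written down directly. A minimal evaluation sketch for a candidate solution, with hypothetical names; unassigned points play the role of outliers:

```python
from collections import Counter

def ckmo_cost(points, centers, assignment, capacities, m, dist):
    """Cost of a candidate Capacitated k-Median with Outliers solution.

    points     : list of data points
    centers    : list of chosen cluster centers (at most k of them)
    assignment : dict {point index -> center index}; unassigned points are outliers
    capacities : list of per-center capacities
    m          : allowed number of outliers
    dist       : the metric, dist(p, q) -> float
    """
    if len(points) - len(assignment) > m:
        raise ValueError("more than m outliers")
    load = Counter(assignment.values())
    if any(load[c] > capacities[c] for c in load):
        raise ValueError("capacity constraint violated")
    # (ii): sum of distances of each non-outlier point to its assigned center
    return sum(dist(points[i], centers[c]) for i, c in assignment.items())
```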

    A Unified Framework of FPT Approximation Algorithms for Clustering Problems

    In this paper, we present a framework for designing FPT approximation algorithms for many k-clustering problems. Our results are based on a new technique for reducing search spaces. A reduced search space is a small subset of the input data that is guaranteed to contain k clients close to the facilities opened in an optimal solution for any clustering problem we consider. We show, somewhat surprisingly, that greedily sampling O(k) clients yields the desired reduced search space; based on this, we obtain FPT(k)-time algorithms with improved approximation guarantees for problems such as capacitated clustering, lower-bounded clustering, clustering with service installation costs, fault-tolerant clustering, and priority clustering.
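
    One concrete way to read the "greedily sampling O(k) clients" step is a farthest-first greedy pass that keeps a small set of clients such that every remaining client is close to some sampled one. This is only an illustrative sketch; the paper's exact sampling rule and the constant hidden in O(k) may differ:

```python
def greedy_sample(clients, size, dist):
    """Farthest-first greedy sampling: repeatedly add the client farthest
    from the current sample. Illustrative only; the sample size passed in
    (some constant times k) is a hypothetical parameter here.
    """
    sample = [clients[0]]                                    # arbitrary start
    gap = [dist(p, clients[0]) for p in clients]             # distance to sample
    while len(sample) < min(size, len(clients)):
        i = max(range(len(clients)), key=gap.__getitem__)    # farthest client
        sample.append(clients[i])
        gap = [min(g, dist(p, clients[i])) for g, p in zip(gap, clients)]
    return sample
```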

    Capacitated Sum-Of-Radii Clustering: An FPT Approximation


    On the Fixed-Parameter Tractability of Capacitated Clustering

    We study the complexity of the classic capacitated k-median and k-means problems parameterized by the number of centers, k. These problems are notoriously difficult since the best known approximation bound for high-dimensional Euclidean space and general metric space is Theta(log k), and it remains a major open problem whether a constant factor exists. We show that there exists a (3+epsilon)-approximation algorithm for capacitated k-median and a (9+epsilon)-approximation algorithm for capacitated k-means in general metric spaces whose running times are f(epsilon,k) n^{O(1)}. For Euclidean inputs of arbitrary dimension, we give a (1+epsilon)-approximation algorithm for both problems with a similar running time. This is a significant improvement over the (7+epsilon)-approximation of Adamczyk et al. for k-median in general metric spaces and the (69+epsilon)-approximation of Xu et al. for Euclidean k-means.

    FPT Constant-Approximations for Capacitated Clustering to Minimize the Sum of Cluster Radii

    Clustering with capacity constraints is a fundamental problem that has attracted significant attention throughout the years. In this paper, we give the first FPT constant-factor approximation algorithm for the problem of clustering points in a general metric into $k$ clusters to minimize the sum of cluster radii, subject to non-uniform hard capacity constraints. In particular, we give a $(15+\epsilon)$-approximation algorithm that runs in $2^{O(k^2\log k)}\cdot n^3$ time. When capacities are uniform, we obtain the following improved approximation bounds: a $(4+\epsilon)$-approximation with running time $2^{O(k\log(k/\epsilon))} n^3$, which significantly improves over the FPT 28-approximation of Inamdar and Varadarajan [ESA 2020]; a $(2+\epsilon)$-approximation with running time $2^{O(k/\epsilon^2 \cdot \log(k/\epsilon))} d n^3$ and a $(1+\epsilon)$-approximation with running time $2^{O(kd\log(k/\epsilon))} n^3$ in the Euclidean space; and a $(1+\epsilon)$-approximation in the Euclidean space with running time $2^{O(k/\epsilon^2 \cdot \log(k/\epsilon))} d n^3$ if we are allowed to violate the capacities by a $(1+\epsilon)$-factor. We complement this result by showing that there is no $(1+\epsilon)$-approximation algorithm running in time $f(k)\cdot n^{O(1)}$ if any capacity violation is not allowed.
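
    The objective here is the sum of cluster radii rather than the sum of distances. A minimal evaluation sketch for a candidate capacitated solution, with hypothetical names and an explicit assignment assumed:

```python
from collections import Counter

def capacitated_sum_of_radii(points, centers, assignment, capacities, dist):
    """Sum-of-radii cost of a capacitated clustering.

    assignment : dict {point index -> center index}; a cluster's radius is
    the largest distance from its center to any point assigned to it.
    """
    load = Counter(assignment.values())
    if any(load[c] > capacities[c] for c in load):
        raise ValueError("capacity constraint violated")
    radius = {}
    for i, c in assignment.items():
        radius[c] = max(radius.get(c, 0.0), dist(points[i], centers[c]))
    return sum(radius.values())
```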

    On Coresets for Fair Clustering in Metric and Euclidean Spaces and Their Applications

    Fair clustering is a constrained variant of clustering where the goal is to partition a set of colored points, such that the fraction of points of any color in every cluster is more or less equal to the fraction of points of this color in the dataset. This variant was recently introduced by Chierichetti et al. [NeurIPS, 2017] in a seminal work and became widely popular in the clustering literature. In this paper, we propose a new construction of coresets for fair clustering based on random sampling. The new construction allows us to obtain the first coreset for fair clustering in general metric spaces. For Euclidean spaces, we obtain the first coreset whose size does not depend exponentially on the dimension. Our coreset results solve open questions proposed by Schmidt et al. [WAOA, 2019] and Huang et al. [NeurIPS, 2019]. The new coreset construction helps to design several new approximation and streaming algorithms. In particular, we obtain the first true constant-approximation algorithm for metric fair clustering, whose running time is fixed-parameter tractable (FPT). In the Euclidean case, we derive the first $(1+\epsilon)$-approximation algorithm for fair clustering whose time complexity is near-linear and does not depend exponentially on the dimension of the space. Besides, our coreset construction scheme is fairly general and gives rise to coresets for a wide range of constrained clustering problems. This leads to improved constant-approximations for these problems in general metrics and near-linear time $(1+\epsilon)$-approximations in the Euclidean metric.
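
    The fairness constraint described in the first sentence, that each cluster's color fractions roughly match the dataset's, can be checked directly. A sketch assuming a hypothetical multiplicative tolerance `slack` (not a parameter from the paper):

```python
from collections import Counter

def is_fair(colors, assignment, slack):
    """Check that in every cluster the fraction of each color is within a
    (1 +/- slack) factor of that color's fraction in the whole dataset.

    colors     : list, colors[i] is the color of point i
    assignment : dict {point index -> cluster id} covering all points
    """
    n = len(colors)
    global_frac = {col: cnt / n for col, cnt in Counter(colors).items()}
    clusters = {}
    for i, c in assignment.items():
        clusters.setdefault(c, []).append(colors[i])
    for members in clusters.values():
        local = Counter(members)
        for col, target in global_frac.items():
            frac = local[col] / len(members)
            if not (1 - slack) * target <= frac <= (1 + slack) * target:
                return False
    return True
```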

    A Survey on Approximation in Parameterized Complexity: Hardness and Algorithms

    Parameterization and approximation are two popular ways of coping with NP-hard problems. More recently, the two have also been combined to derive many interesting results. We survey developments in the area from both the algorithmic and hardness perspectives, with emphasis on new techniques and potential future research directions.

    FPT Approximation for Constrained Metric k-Median/Means

    The Metric $k$-median problem over a metric space $(\mathcal{X}, d)$ is defined as follows: given a set $L \subseteq \mathcal{X}$ of facility locations and a set $C \subseteq \mathcal{X}$ of clients, open a set $F \subseteq L$ of $k$ facilities such that the total service cost, defined as $\Phi(F, C) \equiv \sum_{x \in C} \min_{f \in F} d(x, f)$, is minimised. The metric $k$-means problem is defined similarly using squared distances. In many applications there are additional constraints that any solution needs to satisfy. This gives rise to different constrained versions of the problem, such as the $r$-gather, fault-tolerant, and outlier $k$-means/$k$-median problems. Surprisingly, for many of these constrained problems, no constant-factor approximation algorithm is known. We give FPT algorithms with constant approximation guarantees for a range of constrained $k$-median/means problems. For some of the constrained problems, ours is the first constant-factor approximation algorithm, whereas for others we improve or match the approximation guarantee of previous works. We work within the unified framework of Ding and Xu that allows us to simultaneously obtain algorithms for a range of constrained problems. In particular, we obtain a $(3+\varepsilon)$-approximation and a $(9+\varepsilon)$-approximation for the constrained versions of the $k$-median and $k$-means problems, respectively, in FPT time. In many practical settings of the $k$-median/means problem, one is allowed to open a facility at any client location, i.e., $C \subseteq L$. For this special case, our algorithm gives a $(2+\varepsilon)$-approximation and a $(4+\varepsilon)$-approximation for the constrained versions of the $k$-median and $k$-means problems, respectively, in FPT time. Since our algorithm is based on a simple sampling technique, it can also be converted to a constant-pass log-space streaming algorithm.
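
    The service cost $\Phi(F, C)$ from the definition above, and its squared-distance analogue for $k$-means, translate directly into code. A minimal sketch with hypothetical names:

```python
def kmedian_cost(clients, facilities, dist):
    """Phi(F, C): each client is served by its nearest open facility."""
    return sum(min(dist(x, f) for f in facilities) for x in clients)

def kmeans_cost(clients, facilities, dist):
    """Same objective with squared distances, as in the k-means definition."""
    return sum(min(dist(x, f) for f in facilities) ** 2 for x in clients)
```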