10 research outputs found

    Constant-Factor FPT Approximation for Capacitated k-Median

    Get PDF
    Capacitated k-median is one of the few outstanding optimization problems for which the existence of a polynomial-time constant-factor approximation algorithm remains an open problem. In a series of recent papers, algorithms were obtained that produce solutions violating either the number of facilities or the capacities by a multiplicative factor. However, producing solutions without violations appears to be hard and potentially requires different algorithmic techniques. Notably, when parameterized by the number of facilities k, the problem is also W[2]-hard, making the existence of an exact FPT algorithm unlikely. In this work we provide an FPT-time constant-factor approximation algorithm preserving both the cardinality and the capacities of the facilities. The algorithm runs in time 2^O(k log k) n^O(1) and achieves an approximation ratio of 7 + ε.

    Approximation Schemes for Min-Sum k-Clustering

    Get PDF
    We consider the Min-Sum k-Clustering (k-MSC) problem. Given a set of points in a metric which is represented by an edge-weighted graph G = (V, E) and a parameter k, the goal is to partition the points V into k clusters such that the sum of distances between all pairs of points within the same cluster is minimized. The k-MSC problem is known to be APX-hard on general metrics. The best known approximation algorithms for the problem, obtained by Behsaz, Friggstad, Salavatipour and Sivakumar [Algorithmica 2019], achieve an approximation ratio of O(log |V|) in polynomial time for general metrics and an approximation ratio of 2 + ε in quasi-polynomial time for metrics with bounded doubling dimension. No approximation schemes for k-MSC (when k is part of the input) were known for any non-trivial metrics prior to our work. In fact, most of the previous works rely on the simple fact that there is a 2-approximate reduction from k-MSC to the balanced k-median problem, and design approximation algorithms for the latter to obtain an approximation for k-MSC. In this paper, we obtain the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem on metrics induced by graphs of bounded treewidth, graphs of bounded highway dimension, graphs of bounded doubling dimension (including fixed-dimensional Euclidean metrics), and planar and minor-free graphs. We bypass the barrier of 2 for k-MSC by introducing a new clustering problem, which we call min-hub clustering, which is a generalization of balanced k-median and is a trade-off between center-based clustering problems (such as balanced k-median) and pairwise clustering (such as Min-Sum k-Clustering). We then show how one can find approximation schemes for min-hub clustering on certain classes of metrics.
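As a concrete illustration of the k-MSC objective defined above, here is a minimal Python sketch; the metric d, the points, and the partitions are toy assumptions, and each unordered pair is counted once:

```python
from itertools import combinations

def msc_cost(clusters, d):
    """Min-sum k-clustering cost: sum of distances over all
    unordered pairs of points that share a cluster."""
    return sum(d(p, q) for C in clusters for p, q in combinations(C, 2))

# Toy instance: points on a line with the absolute-difference metric.
d = lambda p, q: abs(p - q)
print(msc_cost([[0, 1], [10, 11]], d))  # 1 + 1 = 2 (tight clusters)
print(msc_cost([[0, 10], [1, 11]], d))  # 10 + 10 = 20 (spread clusters)
```

The objective rewards clusters whose points are mutually close, which is why the spread-out partition pays ten times the cost here.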

    Small space representations for metric min-sum k-clustering and their applications

    No full text
    The min-sum k-clustering problem is to partition a metric space (P, d) into k clusters C_1, …, C_k ⊆ P such that Σ_{i=1}^{k} Σ_{p,q ∈ C_i} d(p, q) is minimized. We show the first efficient construction of a coreset for this problem. Our coreset construction is based on a new adaptive sampling algorithm. Using our coresets we obtain three main algorithmic results. The first result is a sublinear-time (4 + ε)-approximation algorithm for the min-sum k-clustering problem in metric spaces. The running time of this algorithm is Õ(n) for any constant k and ε, and it is o(n^2) for all k = o(log n / log log n). Since the description size of the input is Θ(n^2), this is sublinear in the input size. Our second result is the first pass-efficient data streaming algorithm for min-sum k-clustering in the distance oracle model, i.e., an algorithm that uses poly(log n, k) space and makes 2 passes over the input point set arriving as a data stream. Our third result is a sublinear-time polylogarithmic-factor approximation algorithm for the min-sum k-clustering problem for arbitrary values of k. To develop the coresets, we introduce the concept of α-preserving metric embeddings. Such an embedding satisfies the properties that (a) the distance between any pair of points does not decrease, and (b) the cost of an optimal solution for the considered problem on input (P, d') is within a constant factor of the optimal solution on input (P, d). In other words, the idea is to find a metric embedding into a (structurally simpler) metric space that approximates the original metric up to a factor of α with respect to a certain problem. We believe that this concept is an interesting generalization of coresets.

    Small space representations for metric min-sum k-clustering and their applications

    No full text
    The min-sum k-clustering problem is to partition a metric space (P, d) into k clusters C_1, …, C_k ⊆ P such that Σ_{i=1}^{k} Σ_{p,q ∈ C_i} d(p, q) is minimized. We show the first efficient construction of a coreset for this problem. Our coreset construction is based on a new adaptive sampling algorithm. With our construction of coresets we obtain two main algorithmic results. The first result is a sublinear-time (4 + ε)-approximation algorithm for the min-sum k-clustering problem in metric spaces. The running time of this algorithm is Õ(n) for any constant k and ε, and it is o(n^2) for all k = o(log n / log log n). Since the full description size of the input is Θ(n^2), this is sublinear in the input size. The fastest previously known o(log n)-factor approximation algorithm for k > 2 achieved a running time of Ω(n^k), and no non-trivial o(n^2)-time algorithm was known before. Our second result is the first pass-efficient data streaming algorithm for min-sum k-clustering in the distance oracle model, i.e., an algorithm that uses poly(log n, k) space and makes 2 passes over the input point set, which arrives in the form of a data stream in arbitrary order. It computes an implicit representation of a clustering of (P, d) with cost at most a constant factor larger than that of an optimal partition. Using one further pass, we can assign each point to its corresponding cluster. To develop the coresets, we introduce the concept of α-preserving metric embeddings. Such an embedding satisfies the properties that the distance between any pair of points does not decrease and that the cost of an optimal solution for the considered problem on input (P, d') is within a constant factor of the optimal solution on input (P, d). In other words, the goal is to find a metric embedding into a (structurally simpler) metric space that approximates the original metric up to a factor of α with respect to a given problem. We believe that this concept is an interesting generalization of coresets.

    Approximation Algorithms for Min-Sum k-Clustering and Balanced k-Median

    Full text link
    We consider two closely related fundamental clustering problems in this paper. In min-sum k-clustering one is given a metric space and has to partition the points into k clusters while minimizing the sum of pairwise distances between the points within the clusters. In the Balanced k-Median problem the instance is the same and one has to obtain a clustering into k clusters C_1, …, C_k, where each cluster C_i has a center c_i, while minimizing the total assignment costs for the points in the metric; here the cost of assigning a point j to a cluster C_i is equal to |C_i| times the distance between j and c_i in the metric. In this paper, we present an O(log n)-approximation for both these problems, where n is the number of points in the metric that are to be served. This is an improvement over the O(ε^{-1} log^{1+ε} n)-approximation (for any constant ε > 0) obtained by Bartal, Charikar, and Raz [STOC '01]. We also obtain a quasi-PTAS for Balanced k-Median in metrics with constant doubling dimension. As in the work of Bartal et al., our approximation for general metrics uses embeddings into tree metrics. The main technical contribution of this paper is an O(1)-approximation for Balanced k-Median in hierarchically separated trees (HSTs). Our improvement comes from a more direct dynamic programming approach that heavily exploits properties of standard HSTs. In this way, we avoid the reduction to special types of HSTs that were considered by Bartal et al., thereby avoiding an additional O(ε^{-1} log n) loss.
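The Balanced k-Median objective described above (each point pays its cluster's size times its distance to the cluster's center) can be sketched directly; the metric, the clusters, and the centers below are toy assumptions:

```python
def balanced_kmedian_cost(clusters, centers, d):
    """Balanced k-median cost: each point j in cluster C_i pays
    |C_i| * d(j, c_i), i.e. cluster size times distance to center."""
    return sum(len(C) * d(j, c) for C, c in zip(clusters, centers) for j in C)

# Toy instance: points on a line with the absolute-difference metric.
d = lambda p, q: abs(p - q)
clusters = [[0, 1, 2], [10, 11]]
centers = [1, 10]
# Cluster 1: 3 * (1 + 0 + 1) = 6; cluster 2: 2 * (0 + 1) = 2.
print(balanced_kmedian_cost(clusters, centers, d))  # 8
```

The size factor |C_i| is what makes this a 2-approximate surrogate for min-sum k-clustering: a cluster's pairwise-distance cost is within a factor of 2 of its size-weighted distance to a best center.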

    Approximation Techniques for Facility Location and Their Applications in Metric Embeddings

    Get PDF
    This thesis addresses the development of geometric approximation algorithms for huge datasets and is subdivided into two parts. The first part deals with algorithms for facility location problems, and the second part is concerned with the problem of computing compact representations of finite metric spaces. Facility location problems belong to the most studied problems in combinatorial optimization and operations research. In the facility location variants considered in this thesis, the input consists of a set of points where each point is a client as well as a potential location for a facility. Each client has to be served by a facility. However, connecting a client incurs connection costs, and opening or maintaining a facility causes so-called opening costs. The goal is to open a subset of the input points as facilities such that the total cost of the system is minimized
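The facility location objective summarized in this abstract (every input point is both a client and a potential facility; total cost = connection costs plus opening costs) admits a simple cost-function sketch; all concrete values below are toy assumptions:

```python
def facility_location_cost(open_facilities, clients, d, opening_cost):
    """Total cost of a facility location solution: each client connects
    to its nearest open facility, plus the cost of opening each facility."""
    connection = sum(min(d(j, f) for f in open_facilities) for j in clients)
    opening = sum(opening_cost[f] for f in open_facilities)
    return connection + opening

# Toy instance on a line: every point is a client and a potential facility.
d = lambda p, q: abs(p - q)
points = [0, 5, 10]
opening_cost = {p: 4 for p in points}
print(facility_location_cost([0, 10], points, d, opening_cost))  # 5 + 8 = 13
print(facility_location_cost([5], points, d, opening_cost))      # 10 + 4 = 14
```

The trade-off the abstract describes is visible even here: opening more facilities lowers connection costs but raises opening costs, and the optimization is over which subset of points to open.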

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

    Get PDF

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volume