5 research outputs found

    Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median

    Budgeted Red-Blue Median is a generalization of classic $k$-Median in that there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can be used to serve clients located in some metric space. The goal is to open $k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for some given bounds $k_r, k_b$ and connect each client to its nearest open facility in a way that minimizes the total connection cost. We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a multiple-swap local search heuristic can be used to obtain a $(5+\epsilon)$-approximation for Budgeted Red-Blue Median for any constant $\epsilon > 0$. This is an improvement over their single swap analysis and beats the previous best approximation guarantee of 8 by Swamy [2014]. We also present a matching lower bound showing that for every $p \geq 1$, there are instances of Budgeted Red-Blue Median with local optimum solutions for the $p$-swap heuristic whose cost is $5 + \Omega\left(\frac{1}{p}\right)$ times the optimum solution cost. Thus, our analysis is tight up to the lower order terms. In particular, for any $\epsilon > 0$ we show the single-swap heuristic admits local optima whose cost can be as bad as $7-\epsilon$ times the optimum solution cost.
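The multiple-swap heuristic described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the arbitrary starting solution, and the brute-force enumeration of swaps are all assumptions made for illustration, and the sketch accepts any strictly improving swap rather than using the improvement threshold a polynomial-time analysis would need. Facilities are assumed to be sortable identifiers, `red` and `blue` are assumed disjoint, and at least one facility is assumed to be open.

```python
from itertools import combinations

def connection_cost(clients, open_facilities, dist):
    """Total cost of connecting each client to its nearest open facility."""
    return sum(min(dist(c, f) for f in open_facilities) for c in clients)

def exchanges(open_set, all_set, max_size):
    """Yield (closed_now, opened_now) pairs that exchange at most max_size
    facilities of one color, keeping the number of open facilities fixed."""
    closed = sorted(all_set - open_set)
    opened = sorted(open_set)
    for size in range(max_size + 1):
        for out in combinations(opened, size):
            for inn in combinations(closed, size):
                yield set(out), set(inn)

def p_swap_local_search(clients, red, blue, k_r, k_b, dist, p=1):
    """Hypothetical p-swap local search: exchange up to p red and up to p blue
    facilities per step while keeping exactly k_r red and k_b blue open."""
    red, blue = set(red), set(blue)
    open_red = set(sorted(red)[:k_r])      # arbitrary feasible start (assumption)
    open_blue = set(sorted(blue)[:k_b])
    best = connection_cost(clients, open_red | open_blue, dist)
    improved = True
    while improved:
        improved = False
        for r_out, r_in in exchanges(open_red, red, p):
            for b_out, b_in in exchanges(open_blue, blue, p):
                cand_red = (open_red - r_out) | r_in
                cand_blue = (open_blue - b_out) | b_in
                cost = connection_cost(clients, cand_red | cand_blue, dist)
                if cost < best:            # accept the first improving swap
                    open_red, open_blue, best = cand_red, cand_blue, cost
                    improved = True
                    break
            if improved:
                break
    return open_red, open_blue, best

# Toy instance on the real line with d(x, y) = |x - y|.
result = p_swap_local_search(
    clients=[0, 1, 10, 11], red={-7, 0}, blue={10, 20},
    k_r=1, k_b=1, dist=lambda x, y: abs(x - y), p=1)
```

The returned solution is only a local optimum with respect to the chosen swap size; the abstract's guarantees concern how far such a local optimum can be from the global optimum.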

    Constant Approximation for $k$-Median and $k$-Means with Outliers via Iterative Rounding

    In this paper, we present a new iterative rounding framework for many clustering problems. Using this, we obtain an $(\alpha_1 + \epsilon \leq 7.081 + \epsilon)$-approximation algorithm for $k$-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [Chen, SODA 2018]. For $k$-means with outliers, we give an $(\alpha_2 + \epsilon \leq 53.002 + \epsilon)$-approximation, which is the first $O(1)$-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give $\alpha_1$- and $(\alpha_1 + \epsilon)$-approximation algorithms for matroid and knapsack median problems respectively, improving upon the previous best approximation ratios of $8$ [Swamy, ACM Trans. Algorithms] and $17.46$ [Byrka et al, ESA 2015]. The natural LP relaxation for the $k$-median/$k$-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which we have at most two fractionally open facilities. Thus, the LP integrality gap arises due to the gap between almost-integral and fully-integral solutions. Then, using a pre-processing procedure, we show how to convert an almost-integral solution to a fully-integral solution losing only a constant factor in the approximation ratio. By further using a sparsification technique, the additive factor loss incurred by the conversion can be reduced to any $\epsilon > 0$.

    Diversity-aware $k$-median: Clustering with fair center representation

    We introduce a novel problem for diversity-aware clustering. We assume that the potential cluster centers belong to a set of groups defined by protected attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost clustering of the data into $k$ clusters so that a specified minimum number of cluster centers are chosen from each group. We thus require that all groups are represented in the clustering solution as cluster centers, according to specified requirements. More precisely, we are given a set of clients $C$, a set of facilities $\pazocal{F}$, a collection $\mathcal{F}=\{F_1,\dots,F_t\}$ of facility groups $F_i \subseteq \pazocal{F}$, budget $k$, and a set of lower-bound thresholds $R=\{r_1,\dots,r_t\}$, one for each group in $\mathcal{F}$. The diversity-aware $k$-median problem asks to find a set $S$ of $k$ facilities in $\pazocal{F}$ such that $|S \cap F_i| \geq r_i$, that is, at least $r_i$ centers in $S$ are from group $F_i$, and the $k$-median cost $\sum_{c \in C} \min_{s \in S} d(c,s)$ is minimized. We show that in the general case where the facility groups may overlap, the diversity-aware $k$-median problem is NP-hard, fixed-parameter intractable, and inapproximable to any multiplicative factor. On the other hand, when the facility groups are disjoint, approximation algorithms can be obtained by reduction to the matroid median and red-blue median problems. Experimentally, we evaluate our approximation methods for the tractable cases, and present a relaxation-based heuristic for the theoretically intractable case, which can provide high-quality and efficient solutions for real-world datasets. Comment: To appear in ECML-PKDD 202
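The problem definition in this abstract (choose $k$ facilities, meet each group's lower bound $r_i$, minimize the $k$-median cost) can be made concrete with a small brute-force sketch. This is not one of the paper's algorithms: it enumerates every size-$k$ subset, so it is only usable on toy instances, and all function names are hypothetical.

```python
from itertools import combinations

def kmedian_cost(clients, centers, dist):
    """k-median cost: sum over clients of the distance to the nearest center."""
    return sum(min(dist(c, s) for s in centers) for c in clients)

def satisfies_lower_bounds(centers, groups, thresholds):
    """Check |S ∩ F_i| >= r_i for every facility group F_i."""
    chosen = set(centers)
    return all(len(chosen & set(g)) >= r for g, r in zip(groups, thresholds))

def brute_force_diverse_kmedian(clients, facilities, groups, thresholds, k, dist):
    """Exhaustive search over all size-k facility sets (exponential time).
    Returns (best_set, best_cost), or None if no feasible set exists."""
    best = None
    for candidate in combinations(sorted(facilities), k):
        if not satisfies_lower_bounds(candidate, groups, thresholds):
            continue
        cost = kmedian_cost(clients, candidate, dist)
        if best is None or cost < best[1]:
            best = (set(candidate), cost)
    return best

# Toy instance on the real line: the far-away facility 100 must be chosen
# because its group has a lower bound of 1.
line = lambda x, y: abs(x - y)
solution = brute_force_diverse_kmedian(
    clients=[0, 1, 1, 3], facilities={0, 1, 2, 3, 100},
    groups=[{1, 2}, {100}], thresholds=[1, 1], k=2, dist=line)
infeasible = brute_force_diverse_kmedian(
    clients=[0, 1], facilities={0, 1},
    groups=[{0}, {1}], thresholds=[1, 1], k=1, dist=line)
```

In the toy instance the lower bound forces a center that serves no client well, which is the kind of cost the diversity requirement can impose compared with unconstrained $k$-median.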

    Local search heuristics for the mobile facility location problem

    In the mobile facility location problem (MFLP), one seeks to relocate (or move) a set of existing facilities and assign clients to these facilities so that the sum of facility movement costs and the client travel costs (each to its assigned facility) is minimized. This paper studies formulations and develops local search heuristics for the MFLP. First, we develop an integer programming (IP) formulation for the MFLP by observing that for a given set of facility destinations the problem may be decomposed into two polynomially solvable subproblems. This IP formulation is quite compact in terms of the number of nonzero coefficients in the constraint matrix and the number of integer variables, and allows for the solution of large-scale MFLP instances. Using the decomposition observation, we propose two local search neighborhoods for the MFLP. We report on extensive computational tests of the new IP formulation and local search heuristics on a large range of instances. These tests demonstrate that the proposed formulation and local search heuristics significantly outperform the existing formulation and a previously developed local search heuristic for the problem.
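The decomposition observation can be sketched as follows. The abstract does not name the two subproblems, so this sketch assumes they are (1) matching facilities to the chosen destinations at minimum total movement cost and (2) sending each client to its nearest destination. It further assumes unweighted costs and exactly one facility per destination, and it solves the matching by brute force over permutations instead of a polynomial-time assignment algorithm. The function name `mflp_cost` is hypothetical.

```python
from itertools import permutations

def mflp_cost(facilities, clients, destinations, dist):
    """Evaluate a fixed destination set under the assumed decomposition.

    Assumes len(destinations) == len(facilities) and unit weights."""
    # Subproblem 1 (assumed): cheapest one-to-one move of facilities to
    # destinations, found here by trying every permutation (toy sizes only).
    movement = min(
        sum(dist(f, d) for f, d in zip(facilities, perm))
        for perm in permutations(destinations)
    )
    # Subproblem 2 (assumed): each client travels to its nearest destination.
    travel = sum(min(dist(c, d) for d in destinations) for c in clients)
    return movement + travel

# Toy instance on the real line with d(x, y) = |x - y|.
total = mflp_cost(
    facilities=[0, 10], clients=[2, 3, 9], destinations=[9, 2],
    dist=lambda x, y: abs(x - y))
```

A local search over destination sets could call an evaluation of this kind for each neighboring set; the paper's actual neighborhoods are not described in the abstract.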

    Large-scale optimization for data placement problem

    Large-scale optimization of combinatorial problems is one of the most challenging areas. These problems are characterized by large sets of data (variables and constraints). In this thesis, we study large-scale optimization of the data placement problem with zero storage cost. The goal in the data placement problem is to find the placement of data objects in a set of fixed-capacity caches in a network to optimize the latency of access. The data placement problem arises naturally in the design of content distribution networks. We report on an empirical study of the upper bound and the lower bound of this problem for large instances. We also study a semi-Lagrangean relaxation of a closely related $k$-median problem. In this thesis, we study the theory and practice of approximation algorithms for the data placement problem and the $k$-median problem.