5 research outputs found
Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median
Budgeted Red-Blue Median is a generalization of classic $k$-Median in that
there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can
be used to serve clients located in some metric space. The goal is to open
$k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for
some given bounds $k_r, k_b$ and connect each client to their nearest open
facility in a way that minimizes the total connection cost.
We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a
multiple-swap local search heuristic can be used to obtain a
$(5+\epsilon)$-approximation for Budgeted Red-Blue Median for any constant
$\epsilon > 0$. This is an improvement over their single swap analysis and
beats the previous best approximation guarantee of 8 by Swamy [2014].
We also present a matching lower bound showing that for every $p \geq 1$,
there are instances of Budgeted Red-Blue Median with local optimum solutions
for the $p$-swap heuristic whose cost is $5 + \Omega(1/p)$
times the optimum solution cost. Thus, our analysis is tight up to the lower
order terms. In particular, for any $\epsilon > 0$ we show the single-swap
heuristic admits local optima whose cost can be as bad as $7 - \epsilon$ times
the optimum solution cost.
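A minimal sketch of a multiple-swap local search of the kind the abstract above describes, not the authors' implementation. It assumes a "$p$-swap" exchanges up to `p` red and up to `p` blue facilities at once while keeping exactly `k_r` red and `k_b` blue facilities open, and it accepts any strictly improving move (a polynomial-time variant would normally require a minimum relative improvement). The names `local_search`, `swaps`, and `cost` are hypothetical.

```python
from itertools import combinations

def cost(clients, open_facs, dist):
    # Total connection cost: each client pays the distance to its nearest open facility.
    return sum(min(dist(c, f) for f in open_facs) for c in clients)

def swaps(open_set, closed_set, p):
    # Yield (out, in) pairs that exchange q <= p facilities of a single colour.
    for q in range(p + 1):
        for out in combinations(sorted(open_set), q):
            for inn in combinations(sorted(closed_set), q):
                yield set(out), set(inn)

def local_search(clients, red, blue, k_r, k_b, dist, p=1):
    red, blue = sorted(red), sorted(blue)
    R, B = set(red[:k_r]), set(blue[:k_b])  # arbitrary feasible starting solution
    best = cost(clients, R | B, dist)
    improved = True
    while improved:
        improved = False
        for r_out, r_in in swaps(R, set(red) - R, p):
            for b_out, b_in in swaps(B, set(blue) - B, p):
                new_R = (R - r_out) | r_in
                new_B = (B - b_out) | b_in
                c = cost(clients, new_R | new_B, dist)
                if c < best:  # assumption: accept any strict improvement
                    R, B, best, improved = new_R, new_B, c, True
                    break
            if improved:
                break  # restart the neighbourhood scan from the new solution
    return R, B, best
```

For example, with clients and facilities given as points on a line and `dist = lambda a, b: abs(a - b)`, the call `local_search([0, 1, 10, 11], [0, 5], [6, 10], 1, 1, dist)` moves the blue facility from 6 to 10 and stops at a solution of cost 2.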
Constant Approximation for $k$-Median and $k$-Means with Outliers via Iterative Rounding
In this paper, we present a new iterative rounding framework for many
clustering problems. Using this, we obtain an
$(\alpha_1 + \epsilon \leq 7.081 + \epsilon)$-approximation algorithm for
$k$-median with outliers, greatly
improving upon the large implicit constant approximation ratio of Chen [Chen,
SODA 2018]. For $k$-means with outliers, we give an
$(\alpha_2 + \epsilon \leq 53.002 + \epsilon)$-approximation, which is the
first $O(1)$-approximation for
this problem. The iterative algorithm framework is very versatile; we show how
it can be used to give $\alpha_1$- and $(\alpha_1 + \epsilon)$-approximation
algorithms for matroid and knapsack median problems respectively, improving
upon the previous best approximation ratios of $8$ [Swamy, ACM Trans.
Algorithms] and $17.46$ [Byrka et al, ESA 2015].
The natural LP relaxation for the $k$-median/$k$-means with outliers problem
has an unbounded integrality gap. In spite of this negative result, our
iterative rounding framework shows that we can round an LP solution to an
almost-integral solution of small cost, in which we have at most two
fractionally open facilities. Thus, the LP integrality gap arises due to the
gap between almost-integral and fully-integral solutions. Then, using a
pre-processing procedure, we show how to convert an almost-integral solution to
a fully-integral solution losing only a constant-factor in the approximation
ratio. By further using a sparsification technique, the additive factor loss
incurred by the conversion can be reduced to any $\epsilon > 0$.
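The abstract above does not spell out the objective, so the following is only a sketch under the standard definition of clustering with outliers: up to `m` clients may be discarded, and only the remaining clients pay their connection cost. The function name `outlier_cost` and the parameters `m` and `power` are hypothetical; `power=1` corresponds to the median objective and `power=2` to the means objective.

```python
def outlier_cost(clients, centers, m, dist, power=1):
    # Connection cost of every client to its nearest open center.
    d = sorted(min(dist(c, s) for s in centers) ** power for c in clients)
    # Discard the m most expensive clients (the outliers) and sum the rest.
    keep = max(0, len(d) - m)
    return sum(d[:keep])
```

With clients `[0, 1, 2, 100]`, a single center at `1`, and absolute difference as the distance, allowing one outlier drops the cost from 101 to 2, which illustrates why a single far-away point can dominate the objective when no outliers are allowed.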
Diversity-aware $k$-median: Clustering with fair center representation
We introduce a novel problem for diversity-aware clustering. We assume that
the potential cluster centers belong to a set of groups defined by protected
attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost
clustering of the data into $k$ clusters so that a specified minimum number of
cluster centers are chosen from each group. We thus require that all groups are
represented in the clustering solution as cluster centers, according to
specified requirements. More precisely, we are given a set of clients $C$, a
set of facilities $\pazocal{F}$, a collection
$\mathcal{F} = \{F_1, \dots, F_t\}$
of facility groups $F_i \subseteq \pazocal{F}$, budget $k$, and a set of
lower-bound thresholds $R = \{r_1, \dots, r_t\}$, one for each group in
$\mathcal{F}$. The diversity-aware $k$-median problem asks to find a set $S$
of $k$ facilities in $\pazocal{F}$ such that $|S \cap F_i| \geq r_i$, that
is, at least $r_i$ centers in $S$ are from group $F_i$, and the $k$-median cost
$\sum_{c \in C} \min_{s \in S} d(c, s)$ is minimized. We show that in the
general case where the facility groups may overlap, the diversity-aware
$k$-median problem is NP-hard, fixed-parameter intractable, and inapproximable
to any multiplicative factor. On the other hand, when the facility groups are
disjoint, approximation algorithms can be obtained by reduction to the
matroid median and red-blue median problems. Experimentally, we
evaluate our approximation methods for the tractable cases, and present a
relaxation-based heuristic for the theoretically intractable case, which can
provide high-quality and efficient solutions for real-world datasets.
Comment: To appear in ECML-PKDD 202
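As an illustration of the problem statement above, the sketch below checks the group constraints and solves a tiny instance by exhaustive search. It is not one of the paper's approximation methods or its relaxation-based heuristic; brute force is only feasible for very small inputs, and the names `is_diverse`, `kmedian_cost`, and `brute_force` are hypothetical.

```python
from itertools import combinations

def is_diverse(S, groups, thresholds):
    # Each group F_i must contribute at least r_i of the chosen centers.
    return all(len(S & F_i) >= r_i for F_i, r_i in zip(groups, thresholds))

def kmedian_cost(clients, S, dist):
    # Sum over clients of the distance to the nearest chosen center.
    return sum(min(dist(c, s) for s in S) for c in clients)

def brute_force(clients, facilities, groups, thresholds, k, dist):
    # Try every k-subset of facilities; keep the cheapest one that is diverse.
    best = None
    for cand in combinations(sorted(facilities), k):
        S = set(cand)
        if is_diverse(S, groups, thresholds):
            c = kmedian_cost(clients, S, dist)
            if best is None or c < best[0]:
                best = (c, S)
    return best  # None if no feasible set of k facilities exists
```

On the line instance with clients `[0, 0, 1, 9, 10]`, facilities `{0, 1, 5, 9}`, groups `[{0, 1, 9}, {5}]`, thresholds `[1, 1]`, and `k = 2`, the unconstrained optimum `{0, 9}` is infeasible because it contains no center from the second group; the search returns `{0, 5}` with cost 10.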
Local search heuristics for the mobile facility location problem
In the mobile facility location problem (MFLP), one seeks to relocate (or move) a set of existing facilities and assign clients to these facilities so that the sum of facility movement costs and the client travel costs (each to its assigned facility) is minimized. This paper studies formulations and develops local search heuristics for the MFLP. First, we develop an integer programming (IP) formulation for the MFLP by observing that for a given set of facility destinations the problem may be decomposed into two polynomially solvable subproblems. This IP formulation is quite compact in terms of the number of nonzero coefficients in the constraint matrix and the number of integer variables, and allows for the solution of large-scale MFLP instances. Using the decomposition observation, we propose two local search neighborhoods for the MFLP. We report on extensive computational tests of the new IP formulation and local search heuristics on a large range of instances. These tests demonstrate that the proposed formulation and local search heuristics significantly outperform the existing formulation and a previously developed local search heuristic for the problem.
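The abstract does not name the two subproblems, so the sketch below rests on an assumption: once the destination set is fixed, the movement part is a minimum-cost assignment of facilities to destinations, and the travel part sends each client to its nearest destination. Unit client weights are also assumed. The assignment is solved here by trying all permutations, which only works for tiny instances; a polynomial-time assignment algorithm would replace it in practice. The name `mflp_cost` is hypothetical.

```python
from itertools import permutations

def mflp_cost(facilities, clients, destinations, dist):
    # Assumption: every facility moves to a distinct destination.
    assert len(destinations) == len(facilities)
    # Subproblem 1: cheapest way to match facilities to destinations.
    move = min(
        sum(dist(f, d) for f, d in zip(facilities, perm))
        for perm in permutations(destinations)
    )
    # Subproblem 2: each client travels to its nearest destination.
    travel = sum(min(dist(c, d) for d in destinations) for c in clients)
    return move + travel
```

A local search over destination sets could call such an evaluation for each candidate set; for facilities at `[0, 10]`, clients at `[2, 3, 8]`, and destinations `[3, 8]` on a line, the movement cost is 5 and the travel cost is 1.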
Large-scale optimization for data placement problem
Large-scale optimization of combinatorial problems is one of the most challenging areas of optimization. These problems are characterized by large sets of data (variables and constraints). In this thesis, we study large-scale optimization of the data placement problem with zero storage cost. The goal in the data placement problem is to place data objects in a set of fixed-capacity caches in a network so as to optimize access latency. The data placement problem arises naturally in the design of content distribution networks. We report on an empirical study of upper and lower bounds for this problem on large instances. We also study a semi-Lagrangean relaxation of the closely related k-median problem. Overall, the thesis covers both the theory and practice of approximation algorithms for the data placement problem and the k-median problem.