Approximate Clustering via Metric Partitioning
In this paper we consider two metric covering/clustering problems: the
Minimum Cost Covering Problem (MCC) and k-clustering. In the MCC
problem, we are given a set of clients and a set of servers, together with a
metric on their union. We would like to cover the clients by balls centered at
the servers. The objective is to minimize the sum of the α-th powers of the
radii of the balls. Here the exponent α is a parameter of the
problem (but not of a problem instance). MCC is closely related to the
k-clustering problem. The main difference between k-clustering and MCC is
that in k-clustering one needs to select k balls to cover the clients.
For any ε > 0, we describe quasi-polynomial time (1 + ε)-approximation
algorithms for both problems. However, in the case of
k-clustering the algorithm uses (1 + ε)k balls. Prior to our work, a
and a approximation were achieved by
polynomial-time algorithms for MCC and k-clustering, respectively, where is an absolute constant. These two problems are thus interesting examples of
metric covering/clustering problems that admit a (1 + ε)-approximation
(using (1 + ε)k balls in the case of k-clustering), if one is willing to
settle for quasi-polynomial time. In contrast, for the variant of MCC where
the exponent α is part of the input, we show under standard assumptions that no
polynomial-time algorithm can achieve an approximation factor better than
for .
Comment: 19 pages
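To make the MCC objective concrete, here is a minimal sketch that evaluates a candidate set of balls: it checks that every client is covered and, if so, returns the sum of the radii raised to a fixed power. The names `mcc_cost`, `clients`, `balls`, and `alpha` are hypothetical, and the Euclidean plane is assumed here purely for illustration in place of a general metric.

```python
import math

def mcc_cost(clients, balls, alpha):
    """Return the covering cost of `balls`, or None if a client is uncovered.

    clients: list of (x, y) points
    balls:   list of ((x, y), radius) pairs, each centered at a server
    alpha:   exponent applied to every radius (hypothetical parameter name)
    """
    for c in clients:
        # A client is covered if it lies inside at least one ball.
        if not any(math.dist(c, center) <= r for center, r in balls):
            return None
    return sum(r ** alpha for _, r in balls)

clients = [(0, 0), (1, 0), (5, 0)]
balls = [((0, 0), 1.0), ((5, 0), 0.0)]
print(mcc_cost(clients, balls, alpha=2))  # 1.0
```

A ball of radius zero covers only a client located at its own center, which is why the second ball contributes nothing to the cost in this example.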
The Bane of Low-Dimensionality Clustering
In this paper, we give a conditional lower bound of on
running time for the classic k-median and k-means clustering objectives (where
n is the size of the input), even in low-dimensional Euclidean space of
dimension four, assuming the Exponential Time Hypothesis (ETH). We also
consider k-median (and k-means) with penalties where each point need not be
assigned to a center, in which case it must pay a penalty, and extend our lower
bound to at least three-dimensional Euclidean space.
This stands in stark contrast to many other geometric problems such as the
traveling salesman problem, or computing an independent set of unit spheres.
While these problems benefit from the so-called (limited) blessing of
dimensionality, as they can be solved in time or
in d dimensions, our work shows that widely-used clustering
objectives have a lower bound of , even in dimension four.
We complete the picture by considering the two-dimensional case: we show that
there is no algorithm that solves the penalized version in time less than
, and provide a matching upper bound of .
The main tool we use to establish these lower bounds is the placement of
points on the moment curve, which takes its inspiration from constructions of
point sets yielding Delaunay complexes of high complexity.
Average Case Network Lifetime on an Interval with Adjustable Sensing Ranges
Given n sensors on an interval, each of which is equipped with an adjustable sensing radius and a unit battery charge that drains in inverse linear proportion to its radius, what schedule will maximize the lifetime of a network that covers the entire interval? Trivially, any reasonable algorithm is at least a 2-approximation for this Sensor Strip Cover problem, so we focus on developing an efficient algorithm that maximizes the expected network lifetime under a random uniform model of sensor distribution. We demonstrate one such algorithm that achieves an expected network lifetime within 12% of the theoretical maximum. Most of the algorithms that we consider come from a particular family of RoundRobin coverage, in which sensors take turns covering predefined areas until their battery runs out.
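As a rough illustration of the turn-taking idea, the sketch below computes the lifetime of one very simple schedule: each sensor in turn covers the whole interval alone until its battery is empty. This is not the algorithm from the abstract. It assumes the interval is [0, 1], that a sensor at position x with radius r covers [x - r, x + r], and that a unit battery lasts 1/r time units at radius r. The function name `solo_round_robin_lifetime` is hypothetical.

```python
def solo_round_robin_lifetime(positions):
    """Total lifetime when sensors take turns covering [0, 1] on their own.

    positions: sensor locations in [0, 1]
    A sensor at x needs radius max(x, 1 - x) to reach both endpoints,
    and (by assumption) survives for 1 / radius time units.
    """
    return sum(1.0 / max(x, 1.0 - x) for x in positions)

print(solo_round_robin_lifetime([0.5, 0.5]))  # 4.0
print(solo_round_robin_lifetime([0.0]))       # 1.0
```

Under these assumptions a sensor at the midpoint lasts twice as long as one at an endpoint, since it needs only half the radius.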
Fast Fencing
We consider very natural "fence enclosure" problems studied by Capoyleas,
Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a
set of points in the plane, we aim at finding a set of closed curves
such that (1) each point is enclosed by a curve and (2) the total length of the
curves is minimized. We consider two main variants. In the first variant, we
pay a unit cost per curve in addition to the total length of the curves. An
equivalent formulation of this version is that we have to enclose unit
disks, paying only the total length of the enclosing curves. In the other
variant, we are allowed to use at most closed curves and pay no cost per
curve.
For the variant with at most closed curves, we present an algorithm that
is polynomial in both and . For the variant with unit cost per curve, or
unit disks, we present a near-linear time algorithm.
Capoyleas, Rote, and Woeginger solved the problem with at most curves in
time. Arkin, Khuller, and Mitchell used this to solve the unit cost
per curve version in exponential time. At the time, they conjectured that the
problem with curves is NP-hard for general . Our polynomial time
algorithm refutes this unless P equals NP.
Connecting a Set of Circles with Minimum Sum of Radii
We consider the problem of assigning radii to a given set of points in the plane, such that the resulting set of circles is connected, and the sum of radii is minimized. We show that the problem is polynomially solvable if a connectivity tree is given. If the connectivity tree is unknown, the problem is NP-hard if there are upper bounds on the radii and open otherwise. We give approximation guarantees for a variety of polynomial-time algorithms, describe upper and lower bounds (which are matching in some of the cases), provide polynomial-time approximation schemes, and conclude with experimental results and open problems.
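The feasibility condition can be sketched as a small checker: given points and candidate radii, it tests whether the circles form a connected set and reports the sum of radii. This is only a verifier, not one of the algorithms from the abstract. It assumes that two circles count as connected when the distance between their centers is at most the sum of their radii, and the name `is_connected` is hypothetical.

```python
import math

def is_connected(points, radii):
    """Check whether the disks (points[i], radii[i]) form one connected set."""
    n = len(points)
    if n == 0:
        return True
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            # Assumed rule: disks i and j are adjacent when they touch or overlap.
            if j not in seen and math.dist(points[i], points[j]) <= radii[i] + radii[j]:
                seen.add(j)
                stack.append(j)
    return len(seen) == n

points = [(0, 0), (2, 0), (5, 0)]
radii = [1.0, 1.0, 2.0]
print(is_connected(points, radii), sum(radii))  # True 4.0
```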
New Applications of the Nearest-Neighbor Chain Algorithm
The nearest-neighbor chain algorithm was proposed in the eighties as a way to speed up certain hierarchical clustering algorithms. In the first part of the dissertation, we show that its application is not limited to clustering. We apply it to a variety of geometric and combinatorial problems. In each case, we show that the nearest-neighbor chain algorithm finds the same solution as a preexisting greedy algorithm, but often with an improved runtime. We obtain speedups over greedy algorithms for Euclidean TSP, Steiner TSP in planar graphs, straight skeletons, a geometric coverage problem, and three stable matching models. In the second part, we study the stable-matching Voronoi diagram, a type of plane partition which combines properties of stable matchings and Voronoi diagrams. We propose political redistricting as an application. We also show that it is impossible to compute this diagram in an algebraic model of computation, and give three algorithmic approaches to overcome this obstacle. One of them is based on the nearest-neighbor chain algorithm, linking the two parts together.
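For readers unfamiliar with the technique, the following is a minimal sketch of the nearest-neighbor chain idea in its original hierarchical-clustering setting. It assumes complete linkage on points in the plane and recomputes cluster distances naively, so it illustrates the chain logic only and not the improved runtime. The function name `nn_chain` is hypothetical, and none of the dissertation's new applications are reproduced here.

```python
import math

def nn_chain(points):
    """Agglomerative clustering via nearest-neighbor chains (complete linkage).

    Returns the merges as pairs of frozensets of point indices.
    """
    def dist(a, b):
        # Complete linkage: largest distance between any two members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    clusters = {frozenset([i]) for i in range(len(points))}
    merges, chain = [], []
    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))  # start a new chain anywhere
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None
        # Nearest neighbor of the chain's top; ties are broken in favor of prev.
        best = min((c for c in clusters if c != top),
                   key=lambda c: (dist(top, c), c != prev))
        if best == prev:
            # top and prev are mutual nearest neighbors: merge them.
            del chain[-2:]
            clusters -= {top, prev}
            clusters.add(top | prev)
            merges.append((prev, top))
        else:
            chain.append(best)
    return merges

pts = [(0, 0), (1, 0), (3, 0), (7, 0)]
print(sorted(sorted(a | b) for a, b in nn_chain(pts)))
```

The chain follows nearest-neighbor links until two clusters point at each other, merges them, and resumes from what is left of the chain. The merges may be found in a different order than a greedy closest-pair algorithm would find them, but for a reducible linkage such as complete linkage the resulting hierarchy is the same.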
Approximation Algorithms for Clustering and Facility Location Problems
Facility location problems arise in a wide range of applications such as plant or warehouse location problems, cache placement problems, and network design problems, and have been widely studied in Computer Science and Operations Research literature. These problems typically involve an underlying set F of facilities that provide service, and an underlying set D of clients that require service, which need to be assigned to facilities in a cost-effective fashion. This abstraction is quite versatile and also captures clustering problems, where one typically seeks to partition a set of data points into k clusters, for some given k, in a suitable way, which themselves find applications in data mining, machine learning, and bioinformatics.
Basic variants of facility location problems are now relatively well understood, but we have much less understanding of more sophisticated models that better capture real-world concerns. In this thesis, we focus on three models inspired by real-world optimization scenarios.
In Chapter 2, we consider the mobile facility location (MFL) problem, wherein we seek to relocate a given set of facilities to destinations closer to the clients so as to minimize the sum of facility-movement and client-assignment costs. This abstracts facility-location settings where one has the flexibility of moving facilities from their current locations to other destinations so as to serve clients more efficiently by reducing their assignment costs. We give the first local-search based approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a (3+epsilon)-approximation for this problem for any constant epsilon > 0 using local search, which improves on the previous best guarantee, an 8-approximation due to [34] based on LP-rounding. Our results extend to the weighted generalization wherein each facility i has a non-negative weight w_i and the movement cost for i is w_i times the distance traveled by i.
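The MFL objective can be sketched as a short cost function: weighted movement cost plus client-assignment cost. This is only an evaluator for a given relocation, not the local-search algorithm. It assumes the Euclidean plane and that each client is served by its nearest destination. The names `mfl_cost`, `facilities`, `destinations`, `clients`, and `weights` are hypothetical.

```python
import math

def mfl_cost(facilities, destinations, clients, weights=None):
    """Movement cost plus assignment cost for one relocation.

    facilities[i] moves to destinations[i]; weights[i] scales its movement
    cost (unit weights if omitted). Each client pays the distance to its
    nearest destination (an assumption of this sketch).
    """
    if weights is None:
        weights = [1.0] * len(facilities)
    movement = sum(w * math.dist(f, d)
                   for w, f, d in zip(weights, facilities, destinations))
    assignment = sum(min(math.dist(c, d) for d in destinations)
                     for c in clients)
    return movement + assignment

facilities = [(0, 0)]
destinations = [(3, 4)]
clients = [(3, 4), (3, 0)]
print(mfl_cost(facilities, destinations, clients))  # 9.0
```

Here the facility travels a distance of 5, one client sits at the destination, and the other is 4 away, giving 5 + 0 + 4 = 9.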
In Chapter 3, we consider a facility-location problem that we call minimum-load k-facility location (MLkFL), which abstracts settings where the cost of serving the clients assigned to a facility is incurred by the facility. This problem was studied under the name of min-max star cover in [32,10], which (among other results) gave bicriteria approximation algorithms for MLkFL when F=D. MLkFL is rather poorly understood, and only an O(k)-approximation is currently known for MLkFL, even for line metrics. Our main result is the first polynomial-time approximation scheme (PTAS) for MLkFL on line metrics (note that no non-trivial true approximation of any kind was known for this metric). Complementing this, we prove that MLkFL is strongly NP-hard on line metrics.
In Chapter 4, we consider clustering problems with non-uniform lower bounds and outliers, and obtain the first approximation guarantees for these problems. We consider objective functions involving the radii of open facilities, where the radius of a facility i is the maximum distance between i and a client assigned to it. We consider two problems: minimizing the sum of the radii of the open facilities, which yields the lower-bounded min-sum-of-radii with outliers (LBkSRO) problem, and minimizing the maximum radius, which yields the lower-bounded k-supplier with outliers (LBkSupO) problem. We obtain an approximation factor of 12.365 for LBkSRO, which improves to 3.83 for the non-outlier version. These also constitute the first approximation bounds for the min-sum-of-radii objective when we consider lower bounds and outliers separately. We obtain approximation factors of 5 and 3 respectively for LBkSupO and its non-outlier version. These are the first approximation results for k-supplier with non-uniform lower bounds.
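The two radius-based objectives can be illustrated with a small evaluator: given an assignment of clients to open facilities, it computes each facility's radius and returns both the sum of radii and the maximum radius. It assumes the Euclidean plane, and it assumes that a lower bound means a minimum number of clients assigned to an open facility and that any client absent from the assignment is an outlier. The names `evaluate`, `assignment`, and `lower_bounds` are hypothetical.

```python
import math

def evaluate(assignment, lower_bounds):
    """Return (feasible, sum_of_radii, max_radius) for one assignment.

    assignment:   {facility point: [client points assigned to it]}
    lower_bounds: {facility point: minimum number of assigned clients}
    """
    # Radius of a facility: distance to its farthest assigned client.
    radii = {f: max((math.dist(f, c) for c in cs), default=0.0)
             for f, cs in assignment.items()}
    # Assumed feasibility rule: every open facility meets its lower bound.
    feasible = all(len(cs) >= lower_bounds.get(f, 0)
                   for f, cs in assignment.items())
    return feasible, sum(radii.values()), max(radii.values(), default=0.0)

assignment = {(0, 0): [(1, 0), (0, 2)], (10, 0): [(13, 4)]}
lower_bounds = {(0, 0): 2, (10, 0): 1}
print(evaluate(assignment, lower_bounds))  # (True, 7.0, 5.0)
```

The same assignment scores 7.0 under the sum-of-radii objective and 5.0 under the maximum-radius objective, which is the distinction between the two problems named above.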