16,936 research outputs found
Large-Scale Distributed Algorithms for Facility Location with Outliers
This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are:
- Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
- Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location problems with outliers. As shown in our previous work (Berns et al., ICALP 2012; Bandyapadhyay et al., ICDCN 2018), Mettu-Plaxton-style algorithms are more easily amenable to efficient implementation in distributed and large-scale models of computation.
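For orientation, the following is a minimal sketch of the classic sequential Mettu-Plaxton algorithm for plain (outlier-free) metric facility location. It is not the distributed, outlier-aware algorithm of the paper above. The names `mp_radius` and `mettu_plaxton`, the explicit distance matrix `dist`, and the opening-cost list `cost` are assumptions made for illustration.

```python
def mp_radius(i, dist, cost):
    """Return r_i such that sum_j max(0, r_i - dist[i][j]) == cost[i]."""
    ds = sorted(dist[i])  # distances from i, including dist[i][i] == 0
    prefix = 0.0
    for t, d in enumerate(ds, start=1):
        prefix += d
        # Radius if exactly the t nearest points contribute to the sum.
        r = (cost[i] + prefix) / t
        nxt = ds[t] if t < len(ds) else float("inf")
        if r <= nxt:  # the (t+1)-th nearest point lies outside the ball
            return r


def mettu_plaxton(dist, cost):
    """Scan points by nondecreasing radius; open i unless an already
    opened facility lies within distance 2 * r_i of it."""
    n = len(dist)
    radii = [mp_radius(i, dist, cost) for i in range(n)]
    opened = []
    for i in sorted(range(n), key=lambda v: radii[v]):
        if all(dist[i][j] > 2 * radii[i] for j in opened):
            opened.append(i)
    return opened, radii
```

Each point's decision depends only on its own radius and on nearby points of smaller radius, which is the kind of locality that makes this style of algorithm a natural fit for distributed models.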
Near-linear time approximations schemes for clustering in doubling metrics
We consider the classic Facility Location, k-Median, and k-Means problems in metric spaces of constant doubling dimension. We give the first nearly linear-time approximation schemes for each problem, making a significant improvement over the state-of-the-art algorithms. Moreover, we show how to extend the techniques used to get the first efficient approximation schemes for the problems of prize-collecting k-Median and k-Means, and efficient bicriteria approximation schemes for k-Median with outliers, k-Means with outliers and k-Center.
Approximation algorithms for clustering and facility location problems
In this thesis we design and analyze algorithms for various facility location and clustering problems. The problems we study are NP-hard and therefore, assuming P is not equal to NP, there do not exist polynomial-time algorithms to solve them optimally. One approach to cope with the intractability of these problems is to design approximation algorithms which run in polynomial time and output a near-optimal solution for all instances of the problem. However, these algorithms do not always work well in practice. Often heuristics with no explicit approximation guarantee perform quite well. To bridge this gap between theory and practice, and to design algorithms that are tuned for instances arising in practice, there is an increasing emphasis on beyond worst-case analysis. In this thesis we consider both these approaches.
In the first part we design worst case approximation algorithms for Uniform Submodular Facility Location (USFL), and Capacitated k-center (CapKCenter) problems. USFL is a generalization of the well-known Uncapacitated Facility Location problem. In USFL the cost of opening a facility is a submodular function of the clients assigned to it (the function is identical for all facilities). We show that a natural greedy algorithm (which gives constant factor approximation for Uncapacitated Facility Location and other facility location problems) has a lower bound of log(n), where n is the number of clients. We present an O(log^2 k) approximation algorithm where k is the number of facilities. The algorithm is based on rounding a convex relaxation. We further consider several special cases of the problem and give improved approximation bounds for them. The CapKCenter problem is an extension of the well-known k-center problem: each facility has a maximum capacity on the number of clients that can be assigned to it. We obtain a 9-approximation for this problem via a linear programming (LP) rounding procedure. Our result, combined with previously known lower bounds, almost settles the integrality gap for a natural LP relaxation.
In the second part we consider several well-known clustering problems like k-center, k-median, k-means and their corresponding outlier variants. We use beyond worst-case analysis due to the practical relevance of these problems. In particular we show that when the input instances are 2-perturbation resilient (i.e. the optimal solution does not change when the distances change by a multiplicative factor of 2), the LP integrality gap for k-center (and also asymmetric k-center) is 1. We further introduce a model of perturbation resilience for clustering with outliers. Under this new model, we show that previous results (including our LP integrality result) known for clustering under perturbation resilience also extend for clustering with outliers. This leads to a dynamic programming based heuristic for k-means with outliers (k-means-outlier) which gives an optimal solution when the instance is 2-perturbation resilient. We propose two more algorithms for k-means-outlier: a sampling based algorithm which gives an O(1) approximation when the optimal clusters are not "too small", and an LP rounding algorithm which gives an O(1) approximation at the expense of violating the number of clusters and outliers by a small constant. We empirically study our proposed algorithms on several clustering datasets.
Near-linear time approximation schemes for clustering in doubling metrics
We consider the classic Facility Location, k-Median, and k-Means problems in metric spaces of doubling dimension d. We give nearly linear-time approximation schemes for each problem. The complexity of our algorithms is $\tilde{O}(2^{(1/\varepsilon)^{O(d^2)}} n)$, making a significant improvement over the state-of-the-art algorithms that run in time $n^{(d/\varepsilon)^{O(d)}}$.
Moreover, we show how to extend the techniques used to get the first efficient approximation schemes for the problems of prize-collecting k-Median and k-Means and efficient bicriteria approximation schemes for k-Median with outliers, k-Means with outliers and k-Center.
FPT Approximations for Capacitated/Fair Clustering with Outliers
Clustering problems such as k-Median and k-Means are motivated by applications such as location planning and unsupervised learning, among others. In such applications, it is important to find a clustering of points that is not "skewed" in terms of the number of points, i.e., no cluster should contain too many points. This is modeled by capacity constraints on the sizes of clusters. In an orthogonal direction, another important consideration in clustering is how to handle the presence of outliers in the data. Indeed, these clustering problems have been generalized in the literature to separately handle capacity constraints and outliers. To the best of our knowledge, there has been very little work on studying the approximability of clustering problems that can simultaneously handle both capacities and outliers.
We initiate the study of the Capacitated k-Median with Outliers (CMO) problem. Here, we want to cluster all but a prescribed number of outlier points into at most k clusters, such that (i) the clusters respect the capacity constraints, and (ii) the cost of clustering, defined as the sum of distances of each non-outlier point to its assigned cluster-center, is minimized.
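The objective and constraints just described can be checked mechanically. The sketch below evaluates a candidate solution. The function name `cmo_cost`, its parameters, and the encoding (a dict mapping each point to its center, or to `None` for an outlier) are hypothetical choices made for illustration and do not come from the paper. It treats the outlier budget as an upper bound, which is also an assumption.

```python
from collections import Counter


def cmo_cost(dist, assignment, capacity, k, max_outliers):
    """Cost of a capacitated k-median solution with outliers.

    dist[p][c]    -- distance between point p and center c
    assignment[p] -- center serving p, or None if p is an outlier
    capacity[c]   -- maximum number of points center c may serve
    Raises ValueError if any constraint is violated.
    """
    served = {p: c for p, c in assignment.items() if c is not None}
    if len(assignment) - len(served) > max_outliers:
        raise ValueError("too many outliers")
    loads = Counter(served.values())  # points assigned to each center
    if len(loads) > k:
        raise ValueError("more than k clusters")
    for c, load in loads.items():
        if load > capacity[c]:
            raise ValueError(f"capacity of center {c} exceeded")
    # Objective: sum of distances from non-outlier points to their centers.
    return sum(dist[p][c] for p, c in served.items())
```

For four points at positions 0, 1, 2 and 100 on a line, serving the first three from the point at position 1 and discarding the last gives a cost of 2.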
We design the first constant-factor approximation algorithms for CMO. In particular, our algorithm returns a (3+\epsilon)-approximation for CMO in general metric spaces, and a (1+\epsilon)-approximation in Euclidean spaces of constant dimension, in fixed-parameter tractable (FPT) time with a polynomial dependence on the input size. We can also extend these results to a broader class of problems, including Capacitated k-Means/k-Facility Location with Outliers, and Size-Balanced Fair Clustering problems with Outliers. For each of these problems, we obtain an approximation ratio that matches the best known guarantee of the corresponding outlier-free problem.
Approximation Algorithms for Clustering and Facility Location Problems
Facility location problems arise in a wide range of applications such as plant or warehouse location problems, cache placement problems, and network design problems, and have been widely studied in Computer Science and Operations Research literature. These problems typically involve an underlying set F of facilities that provide service, and an underlying set D of clients that require service, which need to be assigned to facilities in a cost-effective fashion. This abstraction is quite versatile and also captures clustering problems, where one typically seeks to partition a set of data points into k clusters, for some given k, in a suitable way, which themselves find applications in data mining, machine learning, and bioinformatics.
Basic variants of facility location problems are now relatively well understood, but we have much less understanding of more sophisticated models that better capture real-world concerns. In this thesis, we focus on three models inspired by real-world optimization scenarios.
In Chapter 2, we consider the mobile facility location (MFL) problem, wherein we seek to relocate a given set of facilities to destinations closer to the clients so as to minimize the sum of facility-movement and client-assignment costs. This abstracts facility-location settings where one has the flexibility of moving facilities from their current locations to other destinations so as to serve clients more efficiently by reducing their assignment costs. We give the first local-search-based approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a (3+epsilon)-approximation for this problem for any constant epsilon > 0 using local search, which improves on the previous best guarantee, an LP-rounding-based 8-approximation algorithm due to [34]. Our results extend to the weighted generalization wherein each facility i has a non-negative weight w_i and the movement cost for i is w_i times the distance traveled by i.
In Chapter 3, we consider a facility-location problem that we call minimum-load k-facility location (MLkFL), which abstracts settings where the cost of serving the clients assigned to a facility is incurred by the facility. This problem was studied under the name of min-max star cover in [32,10], which (among other results) gave bicriteria approximation algorithms for MLkFL when F=D. MLkFL is rather poorly understood, and only an O(k)-approximation is currently known for MLkFL, even for line metrics. Our main result is the first polytime approximation scheme (PTAS) for MLkFL on line metrics (note that no non-trivial true approximation of any kind was known for this metric). Complementing this, we prove that MLkFL is strongly NP-hard on line metrics.
In Chapter 4, we consider clustering problems with non-uniform lower bounds and outliers, and obtain the first approximation guarantees for these problems. We consider objective functions involving the radii of open facilities, where the radius of a facility i is the maximum distance between i and a client assigned to it. We consider two problems: minimizing the sum of the radii of the open facilities, which yields the lower-bounded min-sum-of-radii with outliers (LBkSRO) problem, and minimizing the maximum radius, which yields the lower-bounded k-supplier with outliers (LBkSupO) problem. We obtain an approximation factor of 12.365 for LBkSRO, which improves to 3.83 for the non-outlier version. These also constitute the first approximation bounds for the min-sum-of-radii objective when we consider lower bounds and outliers separately. We obtain approximation factors of 5 and 3 respectively for LBkSupO and its non-outlier version. These are the first approximation results for k-supplier with non-uniform lower bounds.
Constant Factor Approximation for Capacitated k-Center with Outliers
The k-center problem is a classic facility location problem, where given an edge-weighted graph one is to find a subset of k vertices such that each vertex of the graph is "close" to some vertex in the subset. The approximation status of this basic problem is well understood, as a simple 2-approximation algorithm is known to be tight. Consequently, different extensions were studied.
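One well-known 2-approximation for the basic problem is farthest-first traversal (Gonzalez's greedy). The abstract does not say which algorithm it has in mind, so the following is only an illustrative sketch. It assumes the shortest-path metric of the graph has already been computed into a matrix `dist`. The name `greedy_k_center` and the `first` parameter are hypothetical.

```python
def greedy_k_center(dist, k, first=0):
    """Farthest-first traversal: repeatedly add the vertex farthest
    from the centers chosen so far. Returns (centers, radius)."""
    centers = [first]
    nearest = list(dist[first])  # distance of each vertex to its closest center
    while len(centers) < k:
        nxt = max(range(len(dist)), key=lambda v: nearest[v])
        centers.append(nxt)
        nearest = [min(a, b) for a, b in zip(nearest, dist[nxt])]
    return centers, max(nearest)
```

The returned radius is the largest distance from any vertex to its closest chosen center, which is the quantity the k-center objective minimizes.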
In the capacitated version of the problem each vertex is assigned a capacity, which is a strict upper bound on the number of clients a facility can serve when located at this vertex. A constant factor approximation for the capacitated k-center was obtained last year by Cygan, Hajiaghayi and Khuller [FOCS'12], which was recently improved to a 9-approximation by An, Bhaskara and Svensson [arXiv'13].
In a different generalization of the problem some clients (denoted as outliers) may be disregarded. Here we are additionally given an integer, and the goal is to serve exactly that many clients, which the algorithm is free to choose. In 2001 Charikar et al. [SODA'01] presented a 3-approximation for the k-center problem with outliers.
In this paper we consider a common generalization of the two extensions previously studied separately, i.e. we work with the capacitated k-center with outliers. We present the first constant factor approximation algorithm, with approximation ratio of 25, even for the case of non-uniform hard capacities. Comment: 15 pages, 3 figures, accepted to STACS 201
Capacitated Center Problems with Two-Sided Bounds and Outliers
In recent years, capacitated center problems have attracted a lot of research interest. Given a set of vertices, we want to find a subset of them, called centers, such that the maximum cluster radius is minimized. Moreover, each center should satisfy some capacity constraint, which could be an upper or lower bound on the number of vertices it can serve. Capacitated k-center problems with one-sided bounds (upper or lower) have been well studied in previous work, and a constant factor approximation was obtained.
We are the first to study the capacitated center problem with both capacity lower and upper bounds (with or without outliers). We assume each vertex has a uniform lower bound and a non-uniform upper bound. For the case of opening exactly k centers, we note that a generalization of a recent LP approach can achieve constant factor approximation algorithms for our problems. Our main contribution is a simple combinatorial algorithm for the case where there is no cardinality constraint on the number of open centers. Our combinatorial algorithm is simpler and achieves a better constant approximation factor compared to the LP approach.