On Correcting Inputs: Inverse Optimization for Online Structured Prediction
Algorithm designers typically assume that the input data is correct, and then
proceed to find "optimal" or "sub-optimal" solutions using this input data.
However this assumption of correct data does not always hold in practice,
especially in the context of online learning systems where the objective is to
learn appropriate feature weights given some training samples. Such scenarios
necessitate the study of inverse optimization problems where one is given an
input instance as well as a desired output and the task is to adjust the input
data so that the given output is indeed optimal. Motivated by learning
structured prediction models, in this paper we consider inverse optimization
with a margin, i.e., we require the given output to be better than all other
feasible outputs by a desired margin. We consider such inverse optimization
problems for maximum weight matroid basis, matroid intersection, perfect
matchings, minimum cost maximum flows, and shortest paths and derive the first
known results for such problems with a non-zero margin. The effectiveness of
these algorithmic approaches to online learning for structured prediction is
also discussed. Comment: Conference version to appear in FSTTCS, 201
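One way to write down the margin requirement described in this abstract is sketched below. The symbols (original weights $w$, adjusted weights $w'$, desired output $y^*$, feasible set $\mathcal{F}$, margin $\gamma > 0$) and the choice of norm are illustrative assumptions, not notation taken from the paper. The inequality is stated for a maximization problem; for a minimization problem such as shortest paths it would read $w'(y^*) + \gamma \le w'(y)$.

```latex
\min_{w'} \; \lVert w' - w \rVert
\quad \text{s.t.} \quad
w'(y^*) \;\ge\; w'(y) + \gamma
\qquad \forall\, y \in \mathcal{F} \setminus \{y^*\}
```

Setting $\gamma = 0$ gives the margin-free version, in which the given output only has to be optimal.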
Low-Degree Spanning Trees of Small Weight
The degree-d spanning tree problem asks for a minimum-weight spanning tree in
which the degree of each vertex is at most d. When d=2 the problem is TSP, and
in this case, the well-known Christofides algorithm provides a
1.5-approximation algorithm (assuming the edge weights satisfy the triangle
inequality).
In 1984, Christos Papadimitriou and Umesh Vazirani posed the challenge of
finding an algorithm with performance guarantee less than 2 for Euclidean
graphs (points in R^n) and d > 2. This paper gives the first answer to that
challenge, presenting an algorithm to compute a degree-3 spanning tree of cost
at most 5/3 times the MST. For points in the plane, the ratio improves to 3/2
and the algorithm can also find a degree-4 spanning tree of cost at most 5/4
times the MST. Comment: conference version in Symposium on Theory of Computing (1994)
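As a rough illustration of the quantities this abstract compares (and not the paper's algorithm), the following Python sketch builds a Euclidean MST with Prim's algorithm and reports its weight and maximum vertex degree. The function names `euclidean_mst` and `max_degree` and the sample points are assumptions made for this example.

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns (edges, weight), where edges is a list of index pairs.
    """
    n = len(points)
    if n == 0:
        return [], 0.0
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known cost to attach each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges, weight = [], 0.0
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u))
            weight += best[u]
        # Relax attachment costs through the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges, weight

def max_degree(n, edges):
    """Largest vertex degree in a graph on n vertices."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return max(deg, default=0)

# A centre point with four neighbours at unit distance: the MST is a star.
pts = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
tree, w = euclidean_mst(pts)
```

In this example the MST is the unique star of weight 4 with a degree-4 centre, so any degree-3 spanning tree must cost strictly more than the MST; the abstract's bound says that in the plane such a tree exists with cost at most 3/2 times the MST.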
Approximability of Connected Factors
Finding a d-regular spanning subgraph (or d-factor) of a graph is easy by
Tutte's reduction to the matching problem. By the same reduction, it is easy to
find a minimal or maximal d-factor of a graph. However, if we require that the
d-factor is connected, these problems become NP-hard - finding a minimal
connected 2-factor is just the traveling salesman problem (TSP).
Given a complete graph with edge weights that satisfy the triangle
inequality, we consider the problem of finding a minimal connected d-factor.
We give a 3-approximation for all d and improve this to an
(r+1)-approximation for even d, where r is the approximation ratio of the TSP.
This yields a 2.5-approximation for even d. The same algorithm yields an
(r+1)-approximation for the directed version of the problem, where r is the
approximation ratio of the asymmetric TSP. We also show that none of these
minimization problems can be approximated better than the corresponding TSP.
Finally, for the decision problem of deciding whether a given graph contains
a connected d-factor, we extend known hardness results. Comment: To appear in the proceedings of WAOA 201
Revisiting Connected Dominating Sets: An Optimal Local Algorithm?
In this paper we consider the classical Connected Dominating Set (CDS) problem. Twenty years ago, Guha and Khuller developed two algorithms for this problem - a centralized greedy approach with an approximation guarantee of H(D) + 2, and a local greedy approach with an approximation guarantee of 2(H(D) + 1) (where H() is the harmonic function, and D is the maximum degree in the graph). A local greedy algorithm uses significantly less information about the graph, and can be useful in a variety of contexts. However, a fundamental question remained - can we get a local greedy algorithm with the same performance guarantee as the global greedy algorithm without the penalty of the multiplicative factor of "2" in the approximation factor? In this paper, we answer that question in the affirmative.
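To make the two guarantees quoted above concrete, here is a small worked computation in Python. It only evaluates the formulas H(D) + 2 and 2(H(D) + 1) from the abstract; the function names are assumptions made for this example, and nothing here implements either greedy algorithm.

```python
from fractions import Fraction

def harmonic(d):
    """H(d) = 1 + 1/2 + ... + 1/d, computed exactly."""
    return sum(Fraction(1, i) for i in range(1, d + 1))

def centralized_bound(max_deg):
    """Guarantee quoted for the centralized greedy approach: H(D) + 2."""
    return harmonic(max_deg) + 2

def local_bound(max_deg):
    """Guarantee quoted for the earlier local greedy approach: 2(H(D) + 1)."""
    return 2 * (harmonic(max_deg) + 1)

# For maximum degree D = 3: H(3) = 11/6.
gap = local_bound(3) - centralized_bound(3)
```

Algebraically, 2(H(D) + 1) - (H(D) + 2) = H(D), so the gap between the two guarantees grows with the harmonic number of the maximum degree.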
Designing Multi-Commodity Flow Trees
The traditional multi-commodity flow problem assumes a given flow network in
which multiple commodities are to be maximally routed in response to given
demands. This paper considers the multi-commodity flow network-design problem:
given a set of multi-commodity flow demands, find a network subject to certain
constraints such that the commodities can be maximally routed.
This paper focuses on the case when the network is required to be a tree. The
main result is an approximation algorithm for the case when the tree is
required to be of constant degree. The algorithm reduces the problem to the
minimum-weight balanced-separator problem; the performance guarantee of the
algorithm is within a factor of 4 of the performance guarantee of the
balanced-separator procedure. If Leighton and Rao's balanced-separator
procedure is used, the performance guarantee is O(log n). This improves the
O(log^2 n) approximation factor that is trivial to obtain by a direct
application of the balanced-separator method. Comment: Conference version in WADS'9
On the Cost of Essentially Fair Clusterings
Clustering is a fundamental tool in data mining. It partitions points into
groups (clusters) and may be used to make decisions for each point based on its
group. However, this process may harm protected (minority) classes if the
clustering algorithm does not adequately represent them in desirable clusters
-- especially if the data is already biased.
At NIPS 2017, Chierichetti et al. proposed a model for fair clustering
requiring the representation in each cluster to (approximately) preserve the
global fraction of each protected class. Restricting to two protected classes,
they developed both a 4-approximation for the fair k-center problem and an
approximation for the fair k-median problem whose factor depends on a parameter
of the fairness model. For multiple protected classes, the best known result
is a 14-approximation for fair k-center.
We extend and improve the known results. Firstly, we give a 5-approximation
for the fair k-center problem with multiple protected classes. Secondly, we
propose a relaxed fairness notion under which we can give bicriteria
constant-factor approximations for all of the classical clustering objectives
k-center, k-supplier, k-median, k-means and facility location. The
latter approximations are achieved by a framework that takes an arbitrary
existing unfair (integral) solution and a fair (fractional) LP solution and
combines them into an essentially fair clustering with a weakly supervised
rounding scheme. In this way, a fair clustering can be established belatedly,
in a situation where the centers are already fixed.
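The idea that each cluster should roughly preserve the global fraction of each protected class can be illustrated with a small Python sketch. The deviation measure below is an assumption chosen for illustration, not the fairness definition used in the paper, and the function names are hypothetical. Clusters are assumed to be non-empty.

```python
from collections import Counter

def class_fractions(labels):
    """Fraction of each class label in a non-empty list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in counts}

def max_fraction_deviation(clusters):
    """Largest |cluster fraction - global fraction| over all clusters and classes.

    clusters: list of non-empty lists, each holding the class label of every
    point assigned to that cluster.
    """
    all_labels = [label for cluster in clusters for label in cluster]
    global_frac = class_fractions(all_labels)
    worst = 0.0
    for cluster in clusters:
        local = class_fractions(cluster)
        for c, g in global_frac.items():
            worst = max(worst, abs(local.get(c, 0.0) - g))
    return worst

balanced = [["a", "b"], ["a", "b"]]      # every cluster mirrors the 50/50 split
segregated = [["a", "a"], ["b", "b"]]    # each cluster holds a single class
```

A deviation of 0 means every cluster reproduces the global class fractions exactly; larger values indicate clusters in which some class is over- or under-represented.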
Scheduling Distributed Clusters of Parallel Machines: Primal-Dual and LP-based Approximation Algorithms
The Map-Reduce computing framework rose to prominence with datasets of such size that dozens of machines on a single cluster were needed for individual jobs. As datasets approach the exabyte scale, a single job may need distributed processing not only on multiple machines, but on multiple clusters. We consider a scheduling problem to minimize weighted average completion time of n jobs on m distributed clusters of parallel machines. In keeping with the scale of the problems motivating this work, we assume that (1) each job is divided into m "subjobs" and (2) distinct subjobs of a given job may be processed concurrently.
When each cluster is a single machine, this is the NP-Hard concurrent open shop problem. A clear limitation of such a model is that a serial processing assumption sidesteps the issue of how different tasks of a given subjob might be processed in parallel. Our algorithms explicitly model clusters as pools of resources and effectively overcome this issue.
Under a variety of parameter settings, we develop two constant factor approximation algorithms for this problem. The first algorithm uses an LP relaxation tailored to this problem from prior work. This LP-based algorithm provides strong performance guarantees. Our second algorithm exploits a surprisingly simple mapping to the special case of one machine per cluster. This mapping-based algorithm is combinatorial and extremely fast. These are the first constant factor approximations for this problem.
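For the special case mentioned above in which each cluster is a single machine (concurrent open shop), the objective can be evaluated for a given job order with a few lines of Python. This is a minimal sketch of the objective only, not either of the paper's algorithms. It assumes every machine processes jobs in the same order, and it reports the weighted sum of completion times, which differs from a weighted average only by a normalizing constant. The names `weighted_completion`, `proc` and `weights` are assumptions made for this example.

```python
def weighted_completion(order, proc, weights):
    """Weighted sum of completion times for a permutation schedule.

    order:   sequence of job indices, processed in this order on every machine
    proc:    proc[i][j] is the processing time of job j's subjob on machine i
    weights: weights[j] is the weight of job j
    """
    m = len(proc)
    elapsed = [0] * m           # busy time accumulated on each machine
    total = 0
    for j in order:
        for i in range(m):
            elapsed[i] += proc[i][j]
        # A job finishes when its last subjob finishes; machines on which
        # the job has no work (processing time 0) are ignored.
        finish = max((elapsed[i] for i in range(m) if proc[i][j] > 0), default=0)
        total += weights[j] * finish
    return total

proc = [[2, 1],     # machine 0: job 0 takes 2, job 1 takes 1
        [1, 3]]     # machine 1: job 0 takes 1, job 1 takes 3
weights = [1, 3]
cost_01 = weighted_completion([0, 1], proc, weights)
cost_10 = weighted_completion([1, 0], proc, weights)
```

With these numbers, running job 0 first costs 1*2 + 3*4 = 14, while running job 1 first costs 3*3 + 1*4 = 13, so the order of jobs matters even on two machines.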
Matroid and Knapsack Center Problems
In the classic k-center problem, we are given a metric graph, and the
objective is to open k nodes as centers such that the maximum distance from
any vertex to its closest center is minimized. In this paper, we consider two
important generalizations of k-center, the matroid center problem and the
knapsack center problem. Both problems are motivated by recent content
distribution network applications. Our contributions can be summarized as
follows:
1. We consider the matroid center problem in which the centers are required
to form an independent set of a given matroid. We show this problem is NP-hard
even on a line. We present a 3-approximation algorithm for the problem on
general metrics. We also consider the outlier version of the problem where a
given number of vertices can be excluded as the outliers from the solution. We
present a 7-approximation for the outlier version.
2. We consider the (multi-)knapsack center problem in which the centers are
required to satisfy one (or more) knapsack constraint(s). It is known that the
knapsack center problem with a single knapsack constraint admits a
3-approximation. However, when there are at least two knapsack constraints, we
show this problem is not approximable at all. To complement the hardness
result, we present a polynomial time algorithm that gives a 3-approximate
solution such that one knapsack constraint is satisfied and the others may be
violated by at most a factor of . We also obtain a 3-approximation
for the outlier version that may violate the knapsack constraint by
. Comment: A preliminary version of this paper is accepted to IPCO 201
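For readers unfamiliar with the classic k-center objective defined at the start of this abstract, the following Python sketch shows the well-known farthest-first greedy heuristic, which gives a 2-approximation for plain k-center on a metric. It is not the paper's algorithm and handles neither matroid nor knapsack constraints. The function name `farthest_first`, the `start` parameter and the sample points are assumptions made for this example, and k is assumed to be at least 1.

```python
def farthest_first(dist, k, start=0):
    """Greedy farthest-first traversal for the unconstrained k-center problem.

    dist:  symmetric matrix of pairwise metric distances
    k:     number of centers to open (assumed >= 1)
    start: index of the arbitrary first center
    Returns (centers, radius), where radius is the maximum distance from any
    vertex to its closest chosen center.
    """
    n = len(dist)
    centers = [start]
    nearest = list(dist[start])     # distance from each vertex to its closest center
    while len(centers) < min(k, n):
        # Open the vertex that is currently farthest from all chosen centers.
        nxt = max(range(n), key=lambda v: nearest[v])
        centers.append(nxt)
        nearest = [min(nearest[v], dist[nxt][v]) for v in range(n)]
    return centers, max(nearest)

# Five points on a line; distances are absolute differences.
xs = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in xs] for a in xs]
centers, radius = farthest_first(dist, k=2)
```

Here the heuristic opens the points at 0 and 11 and achieves radius 2, while opening the points at 1 and 10 would achieve radius 1, which is consistent with the factor-2 guarantee.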