Improved approximation algorithm for k-level UFL with penalties, a simplistic view on randomizing the scaling parameter
The state-of-the-art approximation algorithms for facility location
problems are complicated combinations of various techniques. In particular, the
currently best 1.488-approximation algorithm for the uncapacitated facility
location (UFL) problem by Shi Li is presented as a result of a non-trivial
randomization of a certain scaling parameter in the LP-rounding algorithm by
Chudak and Shmoys combined with a primal-dual algorithm of Jain et al. In this
paper we first give a simple interpretation of this randomization process in
terms of solving an auxiliary (factor-revealing) LP. Then, armed with this
simple viewpoint, we exercise the randomization on a more complicated
algorithm for the k-level version of the problem with penalties, in which the
planner has the option to pay a penalty instead of connecting chosen clients;
this results in an improved approximation algorithm.
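To make the "auxiliary (factor-revealing) LP" view concrete, here is a
minimal sketch of an LP of that general kind: choose a probability
distribution over a few candidate scaling parameters so as to minimize the
worst-case approximation factor across adversarial instance classes. The
performance table F and the factor values in it are made up for
illustration; only the LP structure reflects the idea, not the paper's
actual analysis.

    import numpy as np
    from scipy.optimize import linprog

    # Hypothetical performance table: F[j][i] = approximation factor
    # achieved on adversarial instance class j when scaling parameter
    # gamma_i is used. Purely illustrative numbers.
    F = np.array([
        [1.50, 1.52, 1.58],
        [1.57, 1.49, 1.51],
        [1.60, 1.53, 1.47],
    ])
    m = F.shape[1]  # number of candidate scaling parameters

    # Variables x = (p_1, ..., p_m, z); minimize the worst-case factor z.
    c = np.zeros(m + 1)
    c[-1] = 1.0
    # For every instance class j: sum_i F[j][i] * p_i - z <= 0.
    A_ub = np.hstack([F, -np.ones((F.shape[0], 1))])
    b_ub = np.zeros(F.shape[0])
    # The p_i form a probability distribution.
    A_eq = np.array([[1.0] * m + [0.0]])
    b_eq = np.array([1.0])
    bounds = [(0, 1)] * m + [(None, None)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    p, z = res.x[:m], res.x[-1]
    print("distribution over scaling parameters:", p.round(3))
    print("worst-case factor:", round(z, 3))

The optimum of this LP is never worse than the best single column of F,
which is exactly why randomizing the scaling parameter can beat any fixed
deterministic choice: different instance classes punish different
parameters.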
Approximation algorithms for node-weighted prize-collecting Steiner tree problems on planar graphs
We study the prize-collecting version of the Node-weighted Steiner Tree
problem (NWPCST) restricted to planar graphs. We give a new primal-dual
Lagrangian-multiplier-preserving (LMP) 3-approximation algorithm for planar
NWPCST. We then show a (2.88 + ε)-approximation which establishes a
new best approximation guarantee for planar NWPCST. This is done by combining
our LMP algorithm with a threshold rounding technique and utilizing the
2.4-approximation of Berman and Yaroslavtsev for the version without penalties.
We also give a primal-dual 4-approximation algorithm for the more general
forest version using techniques introduced by Hajiaghayi and Jain.
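As a rough illustration of the threshold-rounding step mentioned above,
the toy sketch below splits terminals by a hypothetical fractional LP
penalty value: terminals above the threshold simply pay their penalty, and
the rest are handed to a subroutine for the penalty-free version. All data
and the 0.5 threshold are made up, and a plain MST over the kept terminals
stands in for where the paper would invoke the 2.4-approximation of Berman
and Yaroslavtsev; this shows only the generic shape of the technique.

    import networkx as nx

    # Toy instance: terminals with penalties and invented fractional LP
    # penalty values (in the real algorithm these come from an LP
    # relaxation of the prize-collecting problem).
    penalties = {"a": 3.0, "b": 1.0, "c": 4.0, "d": 0.5}
    frac_pen = {"a": 0.1, "b": 0.9, "c": 0.2, "d": 0.7}
    coords = {"a": (0, 0), "b": (5, 5), "c": (1, 0), "d": (6, 0)}

    threshold = 0.5
    pay = {v for v, p in frac_pen.items() if p >= threshold}
    keep = set(penalties) - pay

    # Stand-in for the penalty-free subroutine: an MST on the kept
    # terminals with Euclidean weights.
    G = nx.Graph()
    for u in keep:
        for v in keep:
            if u < v:
                (x1, y1), (x2, y2) = coords[u], coords[v]
                G.add_edge(u, v,
                           weight=((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5)
    tree = nx.minimum_spanning_tree(G)

    cost = tree.size(weight="weight") + sum(penalties[v] for v in pay)
    print("pay penalty for:", sorted(pay), "total cost:", round(cost, 2))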
The Bane of Low-Dimensionality Clustering
In this paper, we give a conditional lower bound of 2^{Ω(n^{1-1/d})} on
running time for the classic k-median and k-means clustering objectives (where
n is the size of the input), even in low-dimensional Euclidean space of
dimension four, assuming the Exponential Time Hypothesis (ETH). We also
consider k-median (and k-means) with penalties where each point need not be
assigned to a center, in which case it must pay a penalty, and extend our lower
bound to at least three-dimensional Euclidean space.
This stands in stark contrast to many other geometric problems such as the
traveling salesman problem, or computing an independent set of unit spheres.
While these problems benefit from the so-called (limited) blessing of
dimensionality, as they can be solved in time 2^{O(n^{1-1/d})} or
n^{O(n^{1-1/d})} in d dimensions, our work shows that widely-used clustering
objectives have a lower bound of 2^{Ω(n^{1-1/d})}, even in dimension four.
We complete the picture by considering the two-dimensional case: we show that
there is no algorithm that solves the penalized version in time 2^{o(√n)},
and provide a matching upper bound of n^{O(√n)}.
The main tool we use to establish these lower bounds is the placement of
points on the moment curve, which takes its inspiration from constructions of
point sets yielding Delaunay complexes of high complexity.
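The moment-curve placement itself is easy to make concrete: in dimension d
it maps a parameter t to (t, t², ..., t^d), and point sets of this form
are the classical source of Delaunay complexes of high complexity. A small
sketch (the parameters below are illustrative):

    import numpy as np

    # Place n points on the moment curve in R^d: t -> (t, t^2, ..., t^d).
    def moment_curve_points(n, d=4):
        t = np.arange(1, n + 1, dtype=float)
        return np.stack([t ** k for k in range(1, d + 1)], axis=1)

    pts = moment_curve_points(8)
    print(pts.shape)  # (8, 4)
    print(pts[:3])    # (1,1,1,1), (2,4,8,16), (3,9,27,81)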
The Price of Information in Combinatorial Optimization
Consider a network design application where we wish to lay down a
minimum-cost spanning tree in a given graph; however, we only have stochastic
information about the edge costs. To learn the precise cost of any edge, we
have to conduct a study that incurs a price. Our goal is to find a spanning
tree while minimizing the disutility, which is the sum of the tree cost and the
total price that we spend on the studies. In a different application, each edge
gives a stochastic reward value. Our goal is to find a spanning tree while
maximizing the utility, which is the tree reward minus the prices that we pay.
Situations such as the above two often arise in practice where we wish to
find a good solution to an optimization problem, but we start with only some
partial knowledge about the parameters of the problem. The missing information
can be found only after paying a probing price, which we call the price of
information. What strategy should we adopt to optimize our expected
utility/disutility?
A classical example of the above setting is Weitzman's "Pandora's box"
problem, where we are given probability distributions on the values of n
independent random variables. The goal is to choose a single variable with a
large value, but we can find the actual outcomes only after paying a price. Our
work is a generalization of this model to other combinatorial optimization
problems such as matching, set cover, facility location, and prize-collecting
Steiner tree. We give a technique that reduces such problems to their non-price
counterparts, and use it to design exact/approximation algorithms to optimize
our utility/disutility. Our techniques extend to situations where there are
additional constraints on what parameters can be probed or when we can
simultaneously probe a subset of the parameters.
Comment: SODA 2018
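For the Pandora's box baseline that this paper generalizes, Weitzman's
policy can be sketched directly: each box gets a reservation value σ
solving E[(X − σ)^+] = price, boxes are probed in decreasing order of σ,
and probing stops once the best value in hand beats the highest remaining
σ. The sketch below implements this classical rule for discrete
distributions; the instance and the binary-search tolerance are made up.

    import random

    # Each box has a probing price and a discrete value distribution
    # given as (value, probability) pairs. Illustrative instance.
    boxes = {
        0: {"price": 1.0, "dist": [(0.0, 0.5), (10.0, 0.5)]},
        1: {"price": 0.5, "dist": [(4.0, 1.0)]},
        2: {"price": 2.0, "dist": [(0.0, 0.8), (30.0, 0.2)]},
    }

    def reservation_value(price, dist, lo=0.0, hi=1e6, iters=60):
        # sigma solves E[(X - sigma)^+] = price; the left side is
        # decreasing in sigma, so binary search works.
        for _ in range(iters):
            mid = (lo + hi) / 2
            if sum(p * max(v - mid, 0.0) for v, p in dist) > price:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    def run_pandora(boxes, rng):
        sigma = {i: reservation_value(b["price"], b["dist"])
                 for i, b in boxes.items()}
        best, paid = 0.0, 0.0
        for i in sorted(boxes, key=lambda i: -sigma[i]):
            if best >= sigma[i]:  # value in hand beats every remaining index
                break
            paid += boxes[i]["price"]  # pay the price of information ...
            vals, probs = zip(*boxes[i]["dist"])
            best = max(best, rng.choices(vals, probs)[0])  # ... observe
        return best - paid  # realized utility

    rng = random.Random(0)
    print(sum(run_pandora(boxes, rng) for _ in range(10000)) / 10000)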
Image Reconstruction in Optical Interferometry
This tutorial paper describes the problem of image reconstruction from
interferometric data with a particular focus on the specific problems
encountered at optical (visible/IR) wavelengths. The challenging issues in
image reconstruction from interferometric data are introduced in the general
framework of the inverse-problem approach. This framework is then used to describe
existing image reconstruction algorithms in radio interferometry and the new
methods specifically developed for optical interferometry.
Comment: accepted for publication in IEEE Signal Processing Magazine
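As a generic illustration of the inverse-problem framework, and only that
(real optical-interferometry data such as power spectra and closure phases
make the data-fidelity term nonconvex, which is a central difficulty the
tutorial discusses), here is a toy penalized least-squares reconstruction
with a made-up linear measurement operator and a finite-difference
smoothness prior:

    import numpy as np

    # Recover an "image" x from incomplete noisy measurements y = Hx + e
    # by minimizing a data-fidelity term plus a regularization term:
    #     x* = argmin_x ||H x - y||^2 + mu * ||D x||^2
    rng = np.random.default_rng(0)
    n = 50
    x_true = np.zeros(n)
    x_true[15:30] = 1.0                              # simple 1-D "image"
    H = rng.standard_normal((20, n)) / np.sqrt(n)    # underdetermined operator
    y = H @ x_true + 0.01 * rng.standard_normal(20)

    D = np.eye(n) - np.roll(np.eye(n), 1, axis=1)    # circular differences
    mu = 0.1
    # Solve the normal equations of the penalized least-squares problem.
    x_hat = np.linalg.solve(H.T @ H + mu * D.T @ D, H.T @ y)
    print("reconstruction error:", round(np.linalg.norm(x_hat - x_true), 3))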
Data-Collection for the Sloan Digital Sky Survey: a Network-Flow Heuristic
The goal of the Sloan Digital Sky Survey is "to map in detail one-quarter of
the entire sky, determining the positions and absolute brightnesses of more
than 100 million celestial objects". The survey will be performed by taking
"snapshots" through a large telescope. Each snapshot can capture up to 600
objects from a small circle of the sky. This paper describes the design and
implementation of the algorithm that is being used to determine the snapshots
so as to minimize their number. The problem is NP-hard in general; the
algorithm described is a heuristic based on Lagrangian relaxation and
min-cost network flow. It gets within 5-15% of a naive lower bound, whereas
using a "uniform" cover only gets within 25-35%.
Comment: proceedings version appeared in ACM-SIAM Symposium on Discrete Algorithms (1998)
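To illustrate the min-cost-flow side of such a heuristic, the toy below
routes each object to one covering snapshot under a per-snapshot capacity
(600 in the survey, 2 here). The candidate snapshots, coverage sets, and
per-snapshot costs are invented, and the Lagrangian-relaxation step that
selects candidate circles is omitted; only the flow formulation is shown.

    import networkx as nx

    objects = ["o1", "o2", "o3", "o4"]
    covers = {  # which candidate snapshots can capture each object
        "o1": ["s1"], "o2": ["s1", "s2"], "o3": ["s2"], "o4": ["s1", "s2"],
    }
    snap_cap = 2  # stand-in for the 600-object snapshot capacity

    G = nx.DiGraph()
    G.add_node("src", demand=-len(objects))
    G.add_node("sink", demand=len(objects))
    for o in objects:
        G.add_edge("src", o, capacity=1, weight=0)
        for s in covers[o]:
            G.add_edge(o, s, capacity=1, weight=0)
    # Made-up per-snapshot costs so the flow has something to optimize.
    for s, w in [("s1", 1), ("s2", 3)]:
        G.add_edge(s, "sink", capacity=snap_cap, weight=w)

    flow = nx.min_cost_flow(G)
    for o in objects:
        chosen = [s for s, f in flow[o].items() if f > 0]
        print(o, "->", chosen[0])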
A Fast Algorithm for Robust Regression with Penalised Trimmed Squares
The presence of groups containing high leverage outliers makes linear
regression a difficult problem due to the masking effect. The available high
breakdown estimators based on Least Trimmed Squares often do not succeed in
detecting masked high leverage outliers in finite samples.
An alternative to the LTS estimator, called Penalised Trimmed Squares (PTS)
estimator, was introduced by the authors in [ZiouAv:05, ZiAvPi:07] and
appears to be less sensitive to the masking problem. This estimator is defined
by a Quadratic Mixed Integer Programming (QMIP) problem whose objective
function includes a penalty cost for each observation, which serves as an
upper bound on the residual error for any feasible regression line. Since the
PTS does not require presetting the number of outliers to delete from the data
set, it achieves better efficiency than other estimators. However, due to the
high computational complexity of the resulting QMIP problem, computing exact
solutions for moderately large regression problems is infeasible.
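One plausible big-M reconstruction of such a QMIP (our notation; a sketch
of the general form, not necessarily the authors' exact formulation) is:

    \min_{\beta,\, s,\, z} \;\; \sum_{i=1}^{n} s_i^2 \;+\; \sum_{i=1}^{n} p_i z_i
    \text{s.t.}\quad -(s_i + M z_i) \,\le\, y_i - x_i^\top \beta \,\le\, s_i + M z_i,
    \qquad s_i \ge 0, \quad z_i \in \{0,1\}, \qquad i = 1, \dots, n.

Here z_i = 1 deletes observation i at penalty cost p_i (the big-M constant
releases its residual), while z_i = 0 forces s_i to cover the residual, so
each observation contributes min(r_i^2, p_i) to the objective. An
observation is dropped exactly when its squared residual would exceed its
penalty, which is how the penalty acts as an upper bound on the residual
error of any feasible regression line.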
In this paper we further establish the theoretical properties of the PTS
estimator, such as high breakdown and efficiency, and propose an approximate
algorithm called Fast-PTS to compute the PTS estimator for large data sets
efficiently. Extensive computational experiments on sets of benchmark instances
with varying degrees of outlier contamination indicate that the proposed
algorithm performs well in identifying groups of high leverage outliers in
reasonable computational time.
Comment: 27 pages