Towards explaining the speed of k-means
The k-means method is a popular algorithm for clustering, known for its speed in practice. This stands in contrast to its exponential worst-case running time. To explain the speed of the k-means method, a smoothed analysis has been conducted. We sketch this smoothed analysis and a generalization to Bregman divergences.
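For readers unfamiliar with the method the abstract refers to, the following is a minimal sketch of the standard Lloyd iteration (alternating assignment and mean-update steps). It is not taken from the paper: the function names, the random initialisation from input points, and the iteration cap are all illustrative assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Run Lloyd's iteration; return (centers, number of iterations used)."""
    rng = random.Random(seed)
    # Assumption: initialise with k input points chosen at random.
    centers = rng.sample(points, k)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [mean(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:  # no center moved, so the method has converged
            return centers, iteration
        centers = new_centers
    return centers, max_iter


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(lloyd_kmeans(data, 2))
```

The returned iteration count is the quantity whose worst-case and smoothed behaviour the abstract contrasts; on small, well-separated inputs such as the one above it is typically tiny.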
A bi-criteria approximation algorithm for k-Means
We consider the classical k-means clustering problem in the setting of
bi-criteria approximation, in which an algorithm is allowed to output more than k clusters, and must produce a clustering with cost at most a given factor times
the cost of the optimal set of k clusters. We argue that this approach is
natural in many settings, for which the exact number of clusters is a priori
unknown, or unimportant up to a constant factor. We give new bi-criteria
approximation algorithms, based on linear programming and local search,
respectively, which attain a guarantee depending on the number
of clusters that may be opened. Our guarantee is
always at most […] and improves rapidly with […] (for example:
[…], and […]). Moreover, our algorithms have only
polynomial dependence on the dimension of the input data, and so are applicable
in high-dimensional settings.
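To make the bi-criteria setting concrete, the sketch below computes the k-means cost of a set of centers and checks whether a solution that opens extra centers stays within a chosen factor of a reference solution. It does not implement the linear-programming or local-search algorithms of the paper. The names `kmeans_cost` and `within_factor` and the toy data are illustrative assumptions, and the reference centers merely stand in for an optimal k-clustering, which this sketch does not compute.

```python
def kmeans_cost(points, centers):
    """Sum over points of the squared distance to the nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )


def within_factor(points, centers, reference_centers, factor):
    """True if cost(centers) <= factor * cost(reference_centers)."""
    return kmeans_cost(points, centers) <= factor * kmeans_cost(points, reference_centers)


if __name__ == "__main__":
    # Hypothetical 1-D data set, written as 1-tuples.
    data = [(0,), (2,), (10,), (12,)]
    two_centers = [(1,), (11,)]          # reference solution with k = 2
    three_centers = [(0,), (2,), (11,)]  # solution allowed one extra center
    print(kmeans_cost(data, two_centers))    # 4
    print(kmeans_cost(data, three_centers))  # 2
    print(within_factor(data, three_centers, two_centers, 1.0))  # True
```

In this toy instance, opening one extra center halves the cost, which is the kind of trade-off between the number of clusters and the cost guarantee that the abstract describes.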
Bicriteria Network Design Problems
We study a general class of bicriteria network design problems. A generic
problem in this class is as follows: Given an undirected graph and two
minimization objectives (under different cost functions), with a budget
specified on the first, find a subgraph from a given subgraph-class that
minimizes the second objective subject to the budget on the first. We consider
three different criteria - the total edge cost, the diameter and the maximum
degree of the network. Here, we present the first polynomial-time approximation
algorithms for a large class of bicriteria network design problems for the
above mentioned criteria. The following general types of results are presented.
First, we develop a framework for bicriteria problems and their
approximations. Second, when the two criteria are the same (note that the cost
functions continue to be different), we present a "black box" parametric
search technique. This black box takes in as input an (approximation) algorithm
for the unicriterion situation and generates an approximation algorithm for the
bicriteria case with only a constant factor loss in the performance guarantee.
Third, when the two criteria are the diameter and the total edge costs we use a
cluster-based approach to devise approximation algorithms; the solutions
output violate both the criteria by a logarithmic factor. Finally, for the
class of treewidth-bounded graphs, we provide pseudopolynomial-time algorithms
for a number of bicriteria problems using dynamic programming. We show how
these pseudopolynomial-time algorithms can be converted to fully
polynomial-time approximation schemes using a scaling technique.
Comment: 24 pages, 1 figure
Minimum-Cost Coverage of Point Sets by Disks
We consider a class of geometric facility location problems in which the goal
is to determine a set X of disks given by their centers (t_j) and radii (r_j)
that cover a given set of demand points Y in the plane at the smallest possible
cost. We consider cost functions of the form sum_j f(r_j), where f(r)=r^alpha
is the cost of transmission to radius r. Special cases arise for alpha=1 (sum
of radii) and alpha=2 (total area); power consumption models in wireless
network design often use an exponent alpha>2. Different scenarios arise
according to possible restrictions on the transmission centers t_j, which may
be constrained to belong to a given discrete set or to lie on a line, etc. We
obtain several new results, including (a) exact and approximation algorithms
for selecting transmission points t_j on a given line in order to cover demand
points Y in the plane; (b) approximation algorithms (and an algebraic
intractability result) for selecting an optimal line on which to place
transmission points to cover Y; (c) a proof of NP-hardness for a discrete set
of transmission points in the plane and any fixed alpha>1; and (d) a
polynomial-time approximation scheme for the problem of computing a minimum
cost covering tour (MCCT), in which the total cost is a linear combination of
the transmission cost for the set of disks and the length of a tour/path that
connects the centers of the disks.
Comment: 10 pages, 4 figures, LaTeX, to appear in ACM Symposium on
Computational Geometry 200
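As a small illustration of the cost model sum_j f(r_j) with f(r)=r^alpha described in this abstract, the sketch below evaluates the cost of a set of disks and checks that they cover a set of demand points. The helper names `cover_cost` and `covers` and the example coordinates are assumptions made for illustration; none of the paper's algorithms is implemented here.

```python
import math


def cover_cost(radii, alpha):
    """Total cost sum_j r_j**alpha for the given disk radii."""
    return sum(r ** alpha for r in radii)


def covers(disks, points):
    """True if every point lies in at least one disk.

    disks is a list of ((x, y), r) pairs; points is a list of (x, y) pairs.
    """
    return all(
        any(math.dist(center, p) <= r for center, r in disks)
        for p in points
    )


if __name__ == "__main__":
    # Hypothetical demand points on a line.
    demand = [(0, 0), (2, 0), (4, 0)]
    one_big = [((2, 0), 2)]                # a single disk of radius 2
    two_small = [((1, 0), 1), ((3, 0), 1)]  # two disks of radius 1
    assert covers(one_big, demand) and covers(two_small, demand)
    for alpha in (1, 2):
        print(alpha,
              cover_cost([r for _, r in one_big], alpha),
              cover_cost([r for _, r in two_small], alpha))
    # alpha=1: both covers cost 2; alpha=2: the single disk costs 4, the pair costs 2.
```

The example shows why the exponent matters: with alpha=1 the two covers tie, while with alpha=2 several small disks become cheaper than one large disk.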