An algorithm and a core set result for the weighted euclidean one-center problem
Given a set A of m points in n-dimensional space with corresponding positive weights, the weighted Euclidean one-center problem, which is a generalization of the minimum enclosing ball problem, involves the computation of a point c_A ∈ R^n that minimizes the maximum weighted Euclidean distance from c_A to each point in A. In this paper, given ε > 0, we propose and analyze an algorithm that computes a (1 + ε)-approximate solution to the weighted Euclidean one-center problem. Our algorithm explicitly constructs a small subset X ⊆ A, called an ε-core set of A, for which the optimal solution of the corresponding weighted Euclidean one-center problem is a close approximation to that of A. In addition, we establish that |X| depends only on ε and on the ratio of the smallest and largest weights, but is independent of the number of points m and the dimension n. This result subsumes and generalizes the previously known core set results for the minimum enclosing ball problem. Our algorithm computes a (1 + ε)-approximate solution to the weighted Euclidean one-center problem for A in O(mn|X|) arithmetic operations. Our computational results indicate that the size of the ε-core set computed by the algorithm is, in general, significantly smaller than the theoretical worst-case estimate, which contributes to the efficiency of the algorithm, especially for large-scale instances. We shed some light on the possible reasons for this discrepancy between the theoretical estimate and the practical performance. © 2009 INFORMS
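To make the idea of approximating an enclosing ball concrete, here is a minimal sketch of a farthest-point iteration for the unweighted special case (all weights equal to 1), in the style of the Badoiu-Clarkson scheme for the minimum enclosing ball. It is not the algorithm of the paper above: the function name `approx_enclosing_ball`, the choice of ⌈1/ε²⌉ iterations, and the recorded index set are assumptions made for illustration only.

```python
import math

def approx_enclosing_ball(points, eps):
    """Farthest-point iteration for the UNWEIGHTED minimum enclosing ball.

    Illustrative sketch only (assumed helper, not the paper's method).
    Returns the center, its enclosing radius, and the indices of the
    points that were visited as farthest points.
    """
    center = list(points[0])          # start at an arbitrary input point
    visited = {0}                     # indices touched by the iteration
    iterations = math.ceil(1.0 / eps ** 2)
    for i in range(1, iterations + 1):
        # Find the point currently farthest from the center.
        far = max(range(len(points)),
                  key=lambda j: math.dist(center, points[j]))
        visited.add(far)
        # Move the center a shrinking fraction of the way toward it.
        step = 1.0 / (i + 1)
        center = [c + step * (q - c) for c, q in zip(center, points[far])]
    radius = max(math.dist(center, p) for p in points)
    return center, radius, sorted(visited)

# Usage: the four corners of a square with side length 2.
corners = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center, radius, visited = approx_enclosing_ball(corners, 0.1)
```

The visited indices are only loosely analogous to the subset X described in the abstract; the weighted problem and the core set guarantee require the analysis given in the paper.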
An interior point algorithm for minimum sum-of-squares clustering
Copyright © 2000 SIAM Publications. An exact algorithm is proposed for minimum sum-of-squares nonhierarchical clustering, i.e., for partitioning a given set of points from a Euclidean m-space into a given number of clusters in order to minimize the sum of squared distances from all points to the centroid of the cluster to which they belong. This problem is expressed as a constrained hyperbolic program in 0-1 variables. The resolution method combines an interior point algorithm, i.e., a weighted analytic center column generation method, with branch-and-bound. The auxiliary problem of determining the entering column (i.e., the oracle) is an unconstrained hyperbolic program in 0-1 variables with a quadratic numerator and linear denominator. It is solved through a sequence of unconstrained quadratic programs in 0-1 variables. To accelerate resolution, variable neighborhood search heuristics are used both to get a good initial solution and to solve the auxiliary problem quickly as long as global optimality is not reached. Estimated bounds for the dual variables are deduced from the heuristic solution and used in the resolution process as a trust region. Proved minimum sum-of-squares partitions are determined for the first time for several fairly large data sets from the literature, including Fisher's 150 iris. This research was supported by the Fonds
National de la Recherche Scientifique Suisse, NSERC-Canada, and FCAR-Quebec.
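The objective described above, the sum of squared distances from each point to the centroid of its cluster, can be evaluated for a given partition as in the following sketch. It only scores a partition; it does not implement the interior point, column generation, or branch-and-bound machinery of the paper. The function name `sum_of_squares` and the label encoding are assumptions for illustration.

```python
def sum_of_squares(points, labels):
    """Sum of squared Euclidean distances from each point to the
    centroid of the cluster it is assigned to (assumed helper)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2
                     for p in members for d in range(dim))
    return total

# Usage: two candidate partitions of three collinear points.
pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
good = sum_of_squares(pts, [0, 0, 1])   # groups the two nearby points
bad = sum_of_squares(pts, [0, 1, 1])    # groups the two distant points
```

An exact method such as the one in the abstract searches over all partitions for the one with the smallest such score; this sketch merely compares two of them.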
Algorithms for Stable Matching and Clustering in a Grid
We study a discrete version of a geometric stable marriage problem originally
proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which
points in the plane are stably matched to cluster centers, as prioritized by
their distances, so that each cluster center is apportioned a set of points of
equal area. We show that, for a discretization of the problem to an
grid of pixels with centers, the problem can be solved in time , and we experiment with two slower but more practical algorithms and
a hybrid method that switches from one of these algorithms to the other to gain
greater efficiency than either algorithm alone. We also show how to combine
geometric stable matchings with a k-means clustering algorithm, so as to
provide a geometric political-districting algorithm that views distance in
economic terms, and we experiment with weighted versions of stable k-means in
order to improve the connectivity of the resulting clusters.
Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th
International Workshop on Combinatorial Image Analysis, June 19-21, 2017,
Plovdiv, Bulgaria
Faster Clustering via Preprocessing
We examine the efficiency of clustering a set of points, when the
encompassing metric space may be preprocessed in advance. In computational
problems of this genre, there is a first stage of preprocessing, whose input is
a collection of points; the next stage receives as input a query set
and should report a clustering of the query set according to some
objective, such as 1-median, in which case the answer is a point
minimizing the sum of its distances to the query points.
We design fast algorithms that approximately solve such problems under
standard clustering objectives like k-center and k-median, when the metric
has low doubling dimension. By leveraging the preprocessing stage, our
algorithms achieve query time that is near-linear in the query size
and is (almost) independent of the total number of points.
Comment: 24 pages
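As a baseline for the 1-median objective mentioned above, the following sketch answers a query by brute force, scanning every candidate point. It uses no preprocessing, so its cost grows with the total number of points, which is exactly the dependence the paper aims to avoid. The names `one_median`, `candidates`, and `query` are assumptions for illustration.

```python
import math

def one_median(candidates, query, dist=math.dist):
    """Return the candidate point with the smallest sum of distances
    to the query points (brute force, assumed helper)."""
    return min(candidates,
               key=lambda a: sum(dist(a, q) for q in query))

# Usage: three candidate points on a line and a query set of three points.
candidates = [(0.0,), (5.0,), (10.0,)]
query = [(4.0,), (6.0,), (10.0,)]
best = one_median(candidates, query)
```

Passing a different `dist` function lets the same sketch run on any finite metric, not only Euclidean space.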
New Frameworks for Offline and Streaming Coreset Constructions
A coreset for a set of points is a small subset of weighted points that
approximately preserves important properties of the original set. Specifically,
given a set of points, a set of queries, and a cost function, a weighted subset
of the points is a coreset, for some error parameter, if its weighted cost
is a multiplicative approximation to the cost of the full point set
for all queries. Coresets are used to solve fundamental
problems in machine learning under various big data models of computation. Many
of the suggested coresets in the recent decade used, or could have used a
general framework for constructing coresets whose size depends quadratically on
what is known as total sensitivity .
In this paper we improve this bound from to . Thus our
results imply more space efficient solutions to a number of problems, including
projective clustering, k-line clustering, and subspace approximation.
Moreover, we generalize the notion of sensitivity sampling for sup-sampling
that supports non-multiplicative approximations, negative cost functions and
more. The main technical result is a generic reduction to the sample complexity
of learning a class of functions with bounded VC dimension. We show that
obtaining an -sample for this class of functions with appropriate
parameters and suffices to achieve space efficient
-coresets.
Our result implies more efficient coreset constructions for a number of
interesting problems in machine learning; we show applications to
k-median/k-means, k-line clustering, -subspace approximation, and the
integer -projective clustering problem.
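The coreset property described in this abstract can be checked directly on a finite set of queries, as in the sketch below. It only verifies a candidate coreset; it does not construct one, and it does not implement sensitivity sampling. The function name `is_coreset`, and the reading of "multiplicative approximation" as |weighted cost - full cost| ≤ eps · full cost, are assumptions for illustration.

```python
def is_coreset(points, subset, weights, queries, cost, eps):
    """Check whether the weighted subset approximates the total cost of
    `points` within relative error `eps` for every query (assumed helper)."""
    for q in queries:
        full = sum(cost(p, q) for p in points)
        approx = sum(w * cost(s, q) for s, w in zip(subset, weights))
        if abs(approx - full) > eps * full:
            return False
    return True

# Usage: squared distance to a query center on the real line.
points = [0.0, 0.0, 4.0, 4.0]
queries = [0.0, 1.0, 5.0]
sq = lambda p, q: (p - q) ** 2
exact = is_coreset(points, [0.0, 4.0], [2.0, 2.0], queries, sq, 0.0)
poor = is_coreset(points, [0.0], [4.0], queries, sq, 0.5)
```

Here the first candidate merges duplicate points and doubles their weights, so it reproduces every cost exactly, while the second drops half of the data and fails the check.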
An ETH-Tight Exact Algorithm for Euclidean TSP
We study exact algorithms for Euclidean TSP in . In the
early 1990s algorithms with running time were presented for
the planar case, and some years later an algorithm with
running time was presented for any . Despite significant interest in
subexponential exact algorithms over the past decade, there has been no
progress on Euclidean TSP, except for a lower bound stating that the
problem admits no algorithm unless ETH fails. Up to
constant factors in the exponent, we settle the complexity of Euclidean
TSP by giving a algorithm and by showing that a
algorithm does not exist unless ETH fails.
Comment: To appear in FOCS 201