Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction
We study the problem of k-center clustering with outliers in arbitrary metrics and Euclidean space. Though a number of methods have been developed in the past decades, it is still quite challenging to design quality-guaranteed algorithms with low complexity for this problem. Our idea is inspired by the greedy method, Gonzalez's algorithm, for solving the problem of ordinary k-center clustering. Based on some novel observations, we show that this greedy strategy can handle k-center clustering with outliers efficiently, in terms of both clustering quality and time complexity. We further show that the greedy approach yields a small coreset for the problem in doubling metrics, which reduces the time complexity significantly. Our algorithms are easy to implement in practice. We test our method on both synthetic and real datasets. The experimental results suggest that our algorithms can achieve near-optimal solutions and yield lower running times compared with existing methods.
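For orientation, the greedy method the abstract builds on can be sketched as follows. This is the classical Gonzalez farthest-first traversal for ordinary k-center (no outliers), not the outlier-aware variant or the coreset construction from the paper. The function name `gonzalez_k_center`, the choice of the first point as the initial center, and the Euclidean default distance are assumptions made for illustration.

```python
import math


def gonzalez_k_center(points, k, dist=math.dist):
    """Farthest-first traversal for ordinary k-center (illustrative sketch).

    Returns the chosen centers and the resulting covering radius.
    """
    if not points or k <= 0:
        return [], 0.0
    centers = [points[0]]  # assumption: any starting point may be used
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # Greedy step: the point farthest from all current centers
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[far])
        nearest = [min(nearest[i], dist(points[i], points[far]))
                   for i in range(len(points))]
    return centers, max(nearest)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(gonzalez_k_center(pts, 2))
```

Each round costs one pass over the points, so the sketch runs in O(nk) distance evaluations; the returned radius is within a factor 2 of the optimal k-center radius, which is the standard guarantee for this greedy rule.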
Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition
We provide efficient constant factor approximation algorithms for the
problems of finding a hierarchical clustering of a point set in any metric
space, minimizing the sum of minimum spanning tree lengths within each
cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of
cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can
also be used to provide a pants decomposition, that is, a set of disjoint
simple closed curves partitioning the plane minus the input points into subsets
with exactly three boundary components, with approximately minimum total
length. In the Euclidean case, these curves are squares; in the hyperbolic
case, they combine our Euclidean square pants decomposition with our tree
clustering method for general metric spaces.
Comment: 22 pages, 14 figures. This version replaces the proof of what is now
Lemma 5.2, as the previous proof was erroneous.
Next Generation Cluster Editing
This work aims at improving the quality of structural variant prediction from
the mapped reads of a sequenced genome. We suggest a new model based on cluster
editing in weighted graphs and introduce a new heuristic algorithm that
solves this problem quickly and with a good approximation on the huge graphs
that arise from biological datasets.
The Non-Uniform k-Center Problem
In this paper, we introduce and study the Non-Uniform k-Center problem
(NUkC). Given a finite metric space $(X, d)$ and a collection of balls of radii
$\{r_1 \geq \cdots \geq r_k\}$, the NUkC problem is to find a placement of their
centers on the metric space and find the minimum dilation $\alpha$, such that
the union of balls of radius $\alpha \cdot r_i$ around the $i$th center covers
all the points in $X$. This problem naturally arises as a min-max vehicle
routing problem with fleets of different speeds.
The NUkC problem generalizes the classic $k$-center problem when all the $k$
radii are the same (which can be assumed to be $1$ after scaling). It also
generalizes the $k$-center with outliers (kCwO) problem when there are $k$
balls of radius $1$ and $\ell$ balls of radius $0$. There are $2$-approximation
and $3$-approximation algorithms known for these problems respectively; the
former is best possible unless P=NP and the latter has remained unimproved for 15
years.
We first observe that no $O(1)$-approximation to the optimal dilation is
possible unless P=NP, implying that the NUkC problem is harder to approximate than
the above two problems. Our main algorithmic result is an
$(O(1), O(1))$-bi-criteria approximation: we give an $O(1)$-approximation
to the optimal dilation, but we may open $\Theta(1)$ centers of each
radius. Our techniques also allow us to prove a simple (uni-criteria), optimal
$2$-approximation to the kCwO problem, improving upon the long-standing
$3$-factor. Our main technical contribution is a connection between the NUkC
problem and the so-called firefighter problems on trees, which have been studied
recently in the TCS community.
Comment: Adjusted the figur
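The NUkC objective described in the abstract above can be made concrete with a small sketch that evaluates one fixed placement of centers. It assumes the notation alpha for the dilation and r_i for the radius of the i-th ball, so that ball i covers every point within distance alpha * r_i of its center. The function name `min_dilation` and the Euclidean default distance are hypothetical, and the sketch only scores a given placement; it does not search for an optimal one and is not the paper's algorithm.

```python
import math


def min_dilation(points, centers, radii, dist=math.dist):
    """Smallest alpha such that balls of radius alpha * radii[i] around
    centers[i] cover every point (illustrative sketch)."""
    alpha = 0.0
    for p in points:
        best = math.inf  # least dilation at which some ball reaches p
        for c, r in zip(centers, radii):
            d = dist(p, c)
            if d == 0:
                need = 0.0       # p is itself a center
            elif r > 0:
                need = d / r     # ball i reaches p once alpha >= d / r
            else:
                need = math.inf  # a radius-0 ball covers only its center
            best = min(best, need)
        alpha = max(alpha, best)
    return alpha


if __name__ == "__main__":
    pts = [(0, 0), (2, 0), (10, 0)]
    print(min_dilation(pts, [(0, 0), (10, 0)], [2, 1]))
```

With all radii equal to 1 the returned value is the ordinary k-center covering radius for that placement, which matches the abstract's remark that NUkC generalizes k-center.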