
24,486 research outputs found

Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction

Author: Ding Hu
Wang Zixiu
Yu Haikuo
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019

We study the problem of k-center clustering with outliers in arbitrary metrics and Euclidean space. Though a number of methods have been developed in the past decades, it is still quite challenging to design a quality-guaranteed algorithm with low complexity for this problem. Our idea is inspired by the greedy method, Gonzalez's algorithm, for solving the problem of ordinary k-center clustering. Based on some novel observations, we show that this greedy strategy can in fact handle k-center clustering with outliers efficiently, in terms of both clustering quality and time complexity. We further show that the greedy approach yields a small coreset for the problem in doubling metrics, which reduces the time complexity significantly. Our algorithms are easy to implement in practice. We test our method on both synthetic and real datasets. The experimental results suggest that our algorithms can achieve near-optimal solutions and yield lower running times compared with existing methods.
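
For orientation, the greedy method the abstract refers to (Gonzalez's farthest-first traversal for ordinary k-center, without outliers) can be sketched as follows. This is a minimal illustration, not the paper's outlier-aware variant or its coreset construction; the function name `gonzalez_k_center` and the parameters `first` and `dist` are assumptions made for this example.

```python
import math

def gonzalez_k_center(points, k, dist=math.dist, first=0):
    """Farthest-first traversal for ordinary k-center (no outliers).

    Returns the indices of the chosen centers and the resulting radius,
    i.e. the largest distance from any point to its nearest center.
    `first` (index of the starting center) is an illustrative parameter.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [first]
    # nearest[i] = distance from point i to its closest chosen center
    nearest = [dist(p, points[first]) for p in points]
    while len(centers) < k:
        # Greedy step: pick the point farthest from all current centers.
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(far)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], dist(p, points[far]))
    return centers, max(nearest)

centers, radius = gonzalez_k_center([(0, 0), (1, 0), (10, 0), (11, 0)], k=2)
```

Each round costs one pass over the points, so the sketch runs in O(nk) distance evaluations. On ordinary k-center this traversal is known to give a 2-approximation; how the paper adapts it to discard outliers is not shown here.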

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition

Author: Alstrup S.
Aluru S.
Bern M. W.
David Eppstein
Erickson J.
Saitou N.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/02/2008

We provide efficient constant factor approximation algorithms for the problems of finding a hierarchical clustering of a point set in any metric space, minimizing the sum of minimum spanning tree lengths within each cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can also be used to provide a pants decomposition, that is, a set of disjoint simple closed curves partitioning the plane minus the input points into subsets with exactly three boundary components, with approximately minimum total length. In the Euclidean case, these curves are squares; in the hyperbolic case, they combine our Euclidean square pants decomposition with our tree clustering method for general metric spaces. Comment: 22 pages, 14 figures. This version replaces the proof of what is now Lemma 5.2, as the previous proof was erroneous.

arXiv.org e-Print Archive

Crossref

Next Generation Cluster Editing

Author: Bellitto Thomas
Klau Gunnar W.
Marschall Tobias
Schönhuth Alexander
Publication date: 01/01/2013

This work aims at improving the quality of structural variant prediction from the mapped reads of a sequenced genome. We suggest a new model based on cluster editing in weighted graphs and introduce a new heuristic algorithm that solves this problem quickly and with a good approximation on the huge graphs that arise from biological datasets.

arXiv.org e-Print Archive

CWI's Institutional Repository

The Non-Uniform k-Center Problem

Author: Chakrabarty Deeparnab
Goyal Prachi
Krishnaswamy Ravishankar
Publication date: 01/01/2016

In this paper, we introduce and study the Non-Uniform k-Center problem (NUkC). Given a finite metric space $(X,d)$ and a collection of balls of radii $\{r_1 \geq \cdots \geq r_k\}$, the NUkC problem is to find a placement of their centers on the metric space and find the minimum dilation $\alpha$, such that the union of balls of radius $\alpha \cdot r_i$ around the $i$th center covers all the points in $X$. This problem naturally arises as a min-max vehicle routing problem with fleets of different speeds. The NUkC problem generalizes the classic $k$-center problem when all the $k$ radii are the same (which can be assumed to be $1$ after scaling). It also generalizes the $k$-center with outliers (kCwO) problem when there are $k$ balls of radius $1$ and $\ell$ balls of radius $0$. There are $2$-approximation and $3$-approximation algorithms known for these problems respectively; the former is best possible unless P=NP and the latter remains unimproved for 15 years. We first observe that no $O(1)$-approximation to the optimal dilation is possible unless P=NP, implying that the NUkC problem is more non-trivial than the above two problems. Our main algorithmic result is an $(O(1),O(1))$-bi-criteria approximation result: we give an $O(1)$-approximation to the optimal dilation; however, we may open $\Theta(1)$ centers of each radius. Our techniques also allow us to prove a simple (uni-criteria), optimal $2$-approximation to the kCwO problem, improving upon the long-standing $3$-factor. Our main technical contribution is a connection between the NUkC problem and the so-called firefighter problems on trees which have been studied recently in the TCS community. Comment: Adjusted the figur
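
To make the NUkC objective concrete, the sketch below evaluates the minimum dilation for one fixed placement of centers: the smallest value such that every point lies in some ball of radius (dilation times r_i) around the i-th center. It only scores a given placement; it does not search for the best placement, which is the hard part the paper addresses. The function name `min_dilation` and the Euclidean default metric are assumptions made for this example.

```python
import math

def min_dilation(points, centers, radii, dist=math.dist):
    """Smallest alpha such that balls of radius alpha * radii[i] around
    centers[i] together cover every point (for this fixed placement).

    A radius-0 ball covers only the point at its own center, so an
    uncovered point yields math.inf.
    """
    if len(centers) != len(radii):
        raise ValueError("need exactly one radius per center")
    alpha = 0.0
    for x in points:
        best = math.inf  # cheapest dilation that covers x
        for c, r in zip(centers, radii):
            d = dist(x, c)
            if d == 0:
                need = 0.0          # x sits on a center: covered for any alpha
            elif r > 0:
                need = d / r        # alpha * r >= d  <=>  alpha >= d / r
            else:
                need = math.inf     # radius-0 ball cannot reach x
            best = min(best, need)
        alpha = max(alpha, best)    # every point must be covered
    return alpha

pts = [(0, 0), (2, 0), (10, 0), (11, 0)]
non_uniform = min_dilation(pts, [(0, 0), (10, 0)], [2, 1])
uniform = min_dilation(pts, [(0, 0), (10, 0)], [1, 1])
```

With all radii equal to 1 the value reduces to the ordinary k-center radius of the placement, which matches the abstract's remark that NUkC generalizes classic k-center.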

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server