Clustering is difficult only when it does not matter

Authors: Daniely Amit, Linial Nati, Saks Michael
Publication date: 01/01/2012

Numerous papers ask how difficult it is to cluster data. We suggest that the more relevant and interesting question is how difficult it is to cluster data sets that can be clustered well. More generally, despite the ubiquity and the great importance of clustering, we still do not have a satisfactory mathematical theory of clustering. In order to properly understand clustering, it is clearly necessary to develop a solid theoretical basis for the area. For example, from the perspective of computational complexity theory the clustering problem seems very hard. Numerous papers introduce various criteria and numerical measures to quantify the quality of a given clustering. The resulting conclusions are pessimistic, since it is computationally difficult to find an optimal clustering of a given data set, if we go by any of these popular criteria. In contrast, the practitioners' perspective is much more optimistic. Our explanation for this disparity of opinions is that complexity theory concentrates on the worst case, whereas in reality we only care for data sets that can be clustered well. We introduce a theoretical framework of clustering in metric spaces that revolves around a notion of "good clustering". We show that if a good clustering exists, then in many cases it can be efficiently found. Our conclusion is that contrary to popular belief, clustering should not be considered a hard task.

arXiv.org e-Print Archive

CiteSeerX

On the practically interesting instances of MAXCUT

Authors: Bilu Yonatan, Daniely Amit, Linial Nati, Saks Michael
Publication date: 01/01/2012

The complexity of a computational problem is traditionally quantified based on the hardness of its worst case. This approach has many advantages and has led to a deep and beautiful theory. However, from the practical perspective, this leaves much to be desired. In application areas, practically interesting instances very often occupy just a tiny part of an algorithm's space of instances, and the vast majority of instances are simply irrelevant. Addressing these issues is a major challenge for theoretical computer science which may make theory more relevant to the practice of computer science. Following Bilu and Linial, we apply this perspective to MAXCUT, viewed as a clustering problem. Using a variety of techniques, we investigate practically interesting instances of this problem. Specifically, we show how to solve in polynomial time distinguished, metric, expanding and dense instances of MAXCUT under mild stability assumptions. In particular, $(1+\epsilon)$-stability (which is optimal) suffices for metric and dense MAXCUT. We also show how to solve in polynomial time $\Omega(\sqrt{n})$-stable instances of MAXCUT, substantially improving the best previously known result.

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Is Zero the best Price? Optimal Pricing of Mobile Applications

Authors: Buck Christoph, Graf Julia
Publication date: 01/01/2015

EPub Bayreuth

A correspondence between type checking via reduction and type checking via evaluation. Accompanying code overview

Authors: Clarke Dave, Sergey Ilya
Publication venue: Department of Computer Science, KU Leuven
Publication date: 01/01/2012

This is an accompanying technical report for the paper with the corresponding title, published in Information Processing Letters, volume 112, issues 1-2, pages 13-20. This document contains detailed listings of different semantic artifacts for type checking, with explanations of the performed transformations.
Number of pages: 18
Status: published

Lirias