Exact Algorithms and Lower Bounds for Stable Instances of Euclidean k-Means
We investigate the complexity of solving stable or perturbation-resilient
instances of k-Means and k-Median clustering in fixed dimension Euclidean
metrics (or more generally doubling metrics). The notion of stable or
perturbation resilient instances was introduced by Bilu and Linial [2010] and
Awasthi et al. [2012]. In our context we say a k-Means instance is
\alpha-stable if there is a unique OPT solution which remains unchanged if
distances are (non-uniformly) stretched by a factor of at most \alpha. Stable
clustering instances have been studied to explain why heuristics such as
Lloyd's algorithm perform well in practice. In this work we show that for any
fixed \epsilon>0, (1+\epsilon)-stable instances of k-Means in doubling metrics
can be solved in polynomial time. More precisely, we show that a natural
multiswap local search algorithm in fact finds the OPT solution for
(1+\epsilon)-stable instances of k-Means and k-Median in a polynomial number of
iterations. We
complement this result by showing that under a plausible PCP hypothesis this is
essentially tight: when the dimension d is part of the input, there is a
fixed \epsilon_0>0 such that there is not even a PTAS for (1+\epsilon_0)-stable
k-Means in R^d unless NP=RP. To do this, we consider a robust property of CSPs;
call an instance stable if there is a unique optimum solution x^* and for any
other solution x', the number of unsatisfied clauses is proportional to the
Hamming distance between x^* and x'. Dinur et al. have already shown that stable
QSAT is hard to approximate for some constant Q; our hypothesis is simply that
stable QSAT with bounded variable occurrence is also hard. Given this
hypothesis, we consider "stability-preserving" reductions to prove our hardness
for stable k-Means. Such reductions seem to be more fragile than standard
L-reductions and may be of further use to demonstrate other stable optimization
problems are hard.
Comment: 29 pages
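
To make the multiswap local search mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm as stated: the abstract does not give the swap size, the initialization, or the candidate set of centers, so the sketch assumes a discrete variant in which centers are restricted to input points, starts from the first k points, and uses a hypothetical parameter `swap_size` for the maximum number of centers exchanged per step. The function names are illustrative, and the brute-force enumeration is only practical on tiny inputs.

```python
from itertools import combinations


def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )


def multiswap_local_search(points, k, swap_size=2):
    """Brute-force multiswap local search for discrete k-means.

    Assumptions (not taken from the abstract): points are distinct tuples,
    centers must be input points, and the first k points form the start.
    """
    centers = list(points[:k])
    best = kmeans_cost(points, centers)
    improved = True
    while improved:
        improved = False
        outside = [p for p in points if p not in centers]
        # Try exchanging s current centers for s non-centers, s = 1..swap_size.
        for s in range(1, swap_size + 1):
            for removed in combinations(range(k), s):
                kept = [c for i, c in enumerate(centers) if i not in removed]
                for added in combinations(outside, s):
                    candidate = kept + list(added)
                    cost = kmeans_cost(points, candidate)
                    if cost < best:
                        # Accept the first strictly improving swap and restart.
                        centers, best, improved = candidate, cost, True
                        break
                if improved:
                    break
            if improved:
                break
    return centers, best


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(multiswap_local_search(pts, k=2, swap_size=2))
```

Each accepted swap strictly lowers the cost and there are finitely many center sets, so the loop terminates. The sketch makes no claim about the number of iterations; the polynomial bound in the abstract applies to the paper's algorithm on (1+\epsilon)-stable instances.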