A Quantum Approximation Scheme for k-Means

Abstract

We give a quantum approximation scheme (i.e., a $(1+\varepsilon)$-approximation for every $\varepsilon > 0$) for the classical $k$-means clustering problem in the QRAM model with a running time that has only polylogarithmic dependence on the number of data points. More specifically, given a dataset $V$ with $N$ points in $\mathbb{R}^d$ stored in a QRAM data structure, our quantum algorithm runs in time $\tilde{O}\left(2^{\tilde{O}(\frac{k}{\varepsilon})} \eta^2 d\right)$ and with high probability outputs a set $C$ of $k$ centers such that $cost(V, C) \leq (1+\varepsilon) \cdot cost(V, C_{OPT})$. Here $C_{OPT}$ denotes the optimal $k$ centers, $cost(\cdot)$ denotes the standard $k$-means cost function (i.e., the sum of the squared distances of points to their closest centers), and $\eta$ is the aspect ratio (i.e., the ratio of the maximum distance to the minimum distance). This is the first quantum algorithm with a polylogarithmic running time that gives a provable approximation guarantee of $(1+\varepsilon)$ for the $k$-means problem. Also, unlike previous works on unsupervised learning, our quantum algorithm does not require quantum linear algebra subroutines and has a running time independent of parameters (e.g., condition number) that appear in such procedures.
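
To make the two quantities in the guarantee concrete, the following is a minimal classical sketch in Python of the $k$-means cost function and the aspect ratio. It is not the quantum algorithm. The function names `kmeans_cost` and `aspect_ratio` are hypothetical, and the sketch assumes that the aspect ratio is taken over pairwise distances between distinct data points, which the abstract does not spell out.

```python
import math
from itertools import combinations


def kmeans_cost(points, centers):
    """Sum, over all points, of the squared Euclidean distance to the closest center."""
    total = 0.0
    for point in points:
        # Squared distance from this point to each candidate center.
        squared = [
            sum((p - c) ** 2 for p, c in zip(point, center))
            for center in centers
        ]
        total += min(squared)
    return total


def aspect_ratio(points):
    """Ratio of the largest to the smallest nonzero pairwise distance.

    Assumption: distances are between pairs of data points, and coincident
    points (distance zero) are ignored.
    """
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    positive = [d for d in distances if d > 0]
    return max(positive) / min(positive)


# Toy dataset V with N = 4 points in R^2 and a candidate set C of k = 2 centers.
V = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
C = [(0.0, 0.5), (4.0, 0.5)]

cost_value = kmeans_cost(V, C)
eta = aspect_ratio(V)
```

In this toy instance every point lies at distance 0.5 from its closest center, so the cost is four times 0.25. A set of centers $C$ meets the stated guarantee when `kmeans_cost(V, C)` is at most $(1+\varepsilon)$ times the cost of the optimal centers.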
