2 research outputs found

    High Dimensional Clustering with r-nets

    Clustering, a fundamental task in data science and machine learning, groups a set of objects in such a way that objects in the same cluster are closer to each other than to those in other clusters. In this paper, we consider a well-known structure, so-called $r$-nets, which rigorously captures the properties of clustering. We devise algorithms that improve the run-time of approximating $r$-nets in high-dimensional spaces with $\ell_1$ and $\ell_2$ metrics from $\tilde{O}(dn^{2-\Theta(\sqrt{\epsilon})})$ to $\tilde{O}(dn + n^{2-\alpha})$, where $\alpha = \Omega(\epsilon^{1/3}/\log(1/\epsilon))$. These algorithms are also used to improve a framework that provides approximate solutions to other high-dimensional distance problems. Using this framework, several important related problems can also be solved efficiently, e.g., $(1+\epsilon)$-approximate $k$th-nearest neighbor distance, $(4+\epsilon)$-approximate Min-Max clustering, and $(4+\epsilon)$-approximate $k$-center clustering. In addition, we build an algorithm that $(1+\epsilon)$-approximates greedy permutations in time $\tilde{O}((dn + n^{2-\alpha}) \cdot \log{\Phi})$, where $\Phi$ is the spread of the input. This algorithm is used to $(2+\epsilon)$-approximate $k$-center with the same time complexity.
    Comment: Accepted by AAAI201
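
For orientation, an r-net of a point set is commonly defined as a subset of "centers" such that any two centers are at least r apart (packing) and every input point lies within r of some center (covering). The sketch below is a naive quadratic-time greedy construction under the Euclidean metric, shown only to make the definition concrete. It is not the algorithm from the paper, which targets sub-quadratic running time, and the function name `naive_r_net` is a hypothetical one chosen for illustration.

```python
import math


def naive_r_net(points, r):
    """Greedy baseline (assumed, not the paper's method): O(n^2 * d) time.

    A point becomes a center only if it is at least r away from every
    center chosen so far, so centers are pairwise >= r apart and every
    skipped point is within distance < r of some center.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= r for c in centers):
            centers.append(p)
    return centers


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.2, 0.1), (10.0, 10.0)]
    print(naive_r_net(pts, 1.0))
```

Each center can then serve as a cluster representative, with every other point assigned to a center within distance r of it.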

    High Dimensional Clustering with r-nets

    No full text