5 research outputs found

    Testing Cluster Structure of Graphs

    We study the problem of recognizing the cluster structure of a graph in the framework of property testing in the bounded degree model. Given a parameter $\varepsilon$, a $d$-bounded degree graph is defined to be $(k,\phi)$-clusterable if it can be partitioned into no more than $k$ parts, such that the (inner) conductance of the induced subgraph on each part is at least $\phi$ and the (outer) conductance of each part is at most $c_{d,k}\varepsilon^4\phi^2$, where $c_{d,k}$ depends only on $d,k$. Our main result is a sublinear algorithm with running time $\widetilde{O}(\sqrt{n}\cdot\mathrm{poly}(\phi,k,1/\varepsilon))$ that takes as input a graph with maximum degree bounded by $d$ and parameters $k$, $\phi$, $\varepsilon$, and, with probability at least $\frac{2}{3}$, accepts the graph if it is $(k,\phi)$-clusterable and rejects the graph if it is $\varepsilon$-far from $(k,\phi^*)$-clusterable for $\phi^* = c'_{d,k}\frac{\phi^2\varepsilon^4}{\log n}$, where $c'_{d,k}$ depends only on $d,k$. By the lower bound of $\Omega(\sqrt{n})$ on the number of queries needed for testing graph expansion, which corresponds to $k=1$ in our problem, our algorithm is asymptotically optimal up to polylogarithmic factors.
    Comment: Full version of STOC 201
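To make the inner and outer conductance conditions in this abstract concrete, the following is a minimal sketch that checks them on a toy graph. It assumes one common convention for the bounded degree model, which the abstract itself does not spell out: the conductance of a set $S$ is the number of edges leaving $S$ divided by $d|S|$, and the inner conductance of a part is the minimum of that quantity over subsets of at most half the part, taken inside the induced subgraph. The helper names and the example graph are hypothetical, and the brute-force search is only feasible for very small parts; it is not the sublinear tester described above.

```python
from itertools import combinations

def outer_conductance(adj, part, d):
    """Edges leaving `part`, divided by d * |part| (assumed normalisation)."""
    part = set(part)
    crossing = sum(1 for u in part for v in adj[u] if v not in part)
    return crossing / (d * len(part))

def inner_conductance(adj, part, d):
    """Brute-force minimum conductance inside the subgraph induced by `part`.

    Minimises over all non-empty subsets S with |S| <= |part| / 2.
    Exponential time: for illustration on tiny graphs only.
    """
    part = set(part)
    if len(part) < 2:
        return 1.0  # convention assumed here for a single-vertex part
    induced = {u: [v for v in adj[u] if v in part] for u in part}
    nodes = sorted(part)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            best = min(best, outer_conductance(induced, subset, d))
    return best

# Hypothetical example: two triangles joined by the single edge 2-3, so d = 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
d = 3
parts = [{0, 1, 2}, {3, 4, 5}]

outer = [outer_conductance(adj, p, d) for p in parts]
inner = [inner_conductance(adj, p, d) for p in parts]
```

Under these assumptions each triangle has inner conductance 2/3 and outer conductance 1/9, so the partition has well-connected parts with few edges between them, which is the shape of the $(k,\phi)$-clusterable condition with $k=2$.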

    Partitioning Well-clustered Graphs with k-Means and Heat Kernel

    We study a suitable class of well-clustered graphs that admit good k-way partitions and present the first almost-linear time algorithm with almost-optimal approximation guarantees for partitioning such graphs. A good k-way partition is a partition of the vertices of a graph into disjoint clusters (subsets) $\{S_i\}_{i=1}^k$, such that each cluster is better connected on the inside than towards the outside. This problem is a key building block in algorithm design, and has wide applications in community detection and network analysis. Key to our result is a theorem on the multi-cut and eigenvector structure of the graph Laplacians of these well-clustered graphs. Based on this theorem, we give the first rigorous guarantees on the approximation ratios of the widely used k-means clustering algorithms. We also give an almost-linear time algorithm based on heat kernel embeddings and approximate nearest neighbor data structures.

    Spectral concentration, robust k-center, and simple clustering

    Non UBC; Unreviewed; Author affiliation: The Ohio State University; Facult

    Constant factor approximation for balanced cut in the PIE model Tasos Sidiropoulos: Spectral concentration, robust k-center, and simple clustering

    Non UBC; Unreviewed; Author affiliation: Microsoft Research; Othe