
    Optimal Bounds for the $k$-cut Problem

    In the $k$-cut problem, we want to find the smallest set of edges whose deletion breaks a given (multi)graph into $k$ connected components. Algorithms of Karger & Stein and Thorup showed how to find such a minimum $k$-cut in time approximately $O(n^{2k})$. The best lower bounds come from conjectures about the solvability of the $k$-clique problem, and show that solving $k$-cut is likely to require time $\Omega(n^k)$. Recent results of Gupta, Lee & Li have given special-purpose algorithms that solve the problem in time $n^{1.98k + O(1)}$, and ones that have better performance for special classes of graphs (e.g., for small integer weights). In this work, we resolve the problem for general graphs, by showing that the Contraction Algorithm of Karger outputs any fixed $k$-cut of weight $\alpha \lambda_k$ with probability $\Omega_k(n^{-\alpha k})$, where $\lambda_k$ denotes the minimum $k$-cut size. This also gives an extremal bound of $O_k(n^k)$ on the number of minimum $k$-cuts and an algorithm to compute a minimum $k$-cut in similar runtime. Both are tight up to lower-order factors, with the algorithmic lower bound assuming hardness of max-weight $k$-clique. The first main ingredient in our result is a fine-grained analysis of how the graph shrinks -- and how the average degree evolves -- in the Karger process. The second ingredient is an extremal bound on the number of cuts of size less than $2\lambda_k/k$, using the Sunflower lemma. Comment: Final version of arXiv:1911.09165 with new and more general proof
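
The contraction process named in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted multigraph given as an edge list over vertices 0..n-1; the function names `contract_to_k` and `min_k_cut` and the `trials` parameter are hypothetical, chosen here for illustration. It shows only the basic random-contraction loop, not the paper's analysis or its runtime guarantees.

```python
import random


def contract_to_k(n, edges, k, rng):
    """Contract uniformly random edges until k super-vertices remain.

    Returns the original edges that cross between the remaining components.
    """
    parent = list(range(n))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = n
    live = list(edges)
    while components > k:
        # Discard self-loops created by earlier contractions; parallel
        # edges are kept so that the edge choice stays uniform.
        live = [(u, v) for (u, v) in live if find(u) != find(v)]
        if not live:
            break  # the graph already has at least `components` pieces
        u, v = rng.choice(live)
        parent[find(u)] = find(v)
        components -= 1
    return [(u, v) for (u, v) in edges if find(u) != find(v)]


def min_k_cut(n, edges, k, trials, seed=0):
    """Repeat the contraction process and keep the smallest cut found."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        cut = contract_to_k(n, edges, k, rng)
        if best is None or len(cut) < len(best):
            best = cut
    return best
```

Each run succeeds only with some probability, so the sketch repeats the process and keeps the best result. How many repetitions are needed is exactly what the bound in the abstract addresses.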

    Diversifying Top-K Results

    Top-k query processing finds a list of k results that have the largest scores w.r.t. the user-given query, with the assumption that all the k results are independent of each other. In practice, some of the top-k results returned can be very similar to each other. As a result, some of the top-k results returned are redundant. In the literature, diversified top-k search has been studied to return k results that take both score and diversity into consideration. Most existing solutions for diversified top-k search assume that the scores of all the search results are given, and some works solve the diversity problem for a specific problem and can hardly be extended to general cases. In this paper, we study the diversified top-k search problem. We define a general diversified top-k search problem that only considers the similarity of the search results themselves. We propose a framework such that most existing solutions for top-k query processing can be extended easily to handle diversified top-k search, by simply applying three new functions: a sufficient stop condition sufficient(), a necessary stop condition necessary(), and an algorithm for diversified top-k search on the current set of generated results, div-search-current(). We propose three new algorithms, namely div-astar, div-dp, and div-cut, to solve the div-search-current() problem. div-astar is an A*-based algorithm; div-dp is an algorithm that decomposes the results into components, which are searched using div-astar independently and combined using dynamic programming. div-cut further decomposes the current set of generated results using cut points and combines the results using sophisticated operations. We conducted extensive performance studies using two real datasets, enwiki and reuters. Our div-cut algorithm finds the optimal solution for the diversified top-k search problem in seconds even for k as large as 2,000. Comment: VLDB201
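
To make the problem statement above concrete, here is a brute-force sketch under one assumed formulation: pick at most k results, no two of which are similar, so that the total score is maximal. The function name `diversified_top_k` and the `similar` predicate are hypothetical. This exhaustive search is not div-astar, div-dp, or div-cut, and is usable only on tiny inputs.

```python
from itertools import combinations


def diversified_top_k(scores, similar, k):
    """Exhaustively find the best set of at most k pairwise-dissimilar results.

    scores  -- list of non-negative scores, one per result
    similar -- function (i, j) -> bool, called with i < j
    """
    n = len(scores)
    best_set, best_score = (), 0.0
    for size in range(1, min(k, n) + 1):
        for subset in combinations(range(n), size):
            # Skip any subset that contains a pair of similar results.
            if any(similar(i, j) for i, j in combinations(subset, 2)):
                continue
            total = sum(scores[i] for i in subset)
            if total > best_score:
                best_set, best_score = subset, total
    return list(best_set), best_score
```

With scores `[10, 9, 8, 1]`, where result 0 is similar to both 1 and 2, and k = 2, taking the top-scoring result first leaves only result 3 as a partner (total 11). The optimum instead skips result 0 and returns results 1 and 2 (total 17).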

    Tight Continuous Relaxation of the Balanced $k$-Cut Problem

    Spectral Clustering as a relaxation of the normalized/ratio cut has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced $k$-cut of the graph, are either based on greedy techniques or heuristics which have weak connection to the original motivation of minimizing the normalized cut. In this paper we propose a new tight continuous relaxation for any balanced $k$-cut problem and show that a related recently proposed relaxation is in most cases loose, leading to poor performance in practice. For the optimization of our tight continuous relaxation we propose a new algorithm for the difficult sum-of-ratios minimization problem which achieves monotonic descent. Extensive comparisons show that our method outperforms all existing approaches for ratio cut and other balanced $k$-cut criteria. Comment: Long version of paper accepted at NIPS 201
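
As a reference point for the ratio cut criterion mentioned above, the following sketch evaluates it for a given partition. It assumes the common convention that the objective is the sum over clusters of (weight of edges leaving the cluster) / (cluster size); some texts include an extra constant factor. The function name `ratio_cut` and the input layout are hypothetical. It only scores a partition and is not the relaxation or the optimization algorithm from the abstract.

```python
def ratio_cut(edges, labels, k):
    """Evaluate the ratio cut of a partition into k non-empty clusters.

    edges  -- list of (u, v, weight) for an undirected graph
    labels -- labels[v] is the cluster index (0..k-1) of vertex v
    """
    sizes = [0] * k
    for c in labels:
        sizes[c] += 1
    boundary = [0.0] * k
    for u, v, w in edges:
        if labels[u] != labels[v]:
            # A cut edge leaves both of the clusters it touches.
            boundary[labels[u]] += w
            boundary[labels[v]] += w
    return sum(boundary[c] / sizes[c] for c in range(k))
```

On the unit-weight path 0-1-2-3, the balanced split `[0, 0, 1, 1]` scores 1.0, while the unbalanced split `[0, 1, 1, 1]` cuts the same number of edges but scores higher because of the size-1 cluster in the denominator.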

    Sparsest Cut on Bounded Treewidth Graphs: Algorithms and Hardness Results

    We give a 2-approximation algorithm for Non-Uniform Sparsest Cut that runs in time $n^{O(k)}$, where $k$ is the treewidth of the graph. This improves on the previous $2^{2^k}$-approximation in time $\mathrm{poly}(n) \, 2^{O(k)}$ due to Chlamtáč et al. To complement this algorithm, we show the following hardness results: If the Non-Uniform Sparsest Cut problem has a $\rho$-approximation for series-parallel graphs (where $\rho \geq 1$), then the Max Cut problem has an algorithm with approximation factor arbitrarily close to $1/\rho$. Hence, even for such restricted graphs (which have treewidth 2), the Sparsest Cut problem is NP-hard to approximate better than $17/16 - \epsilon$ for $\epsilon > 0$; assuming the Unique Games Conjecture the hardness becomes $1/\alpha_{GW} - \epsilon$. For graphs with large (but constant) treewidth, we show a hardness result of $2 - \epsilon$ assuming the Unique Games Conjecture. Our algorithm rounds a linear program based on (a subset of) the Sherali-Adams lift of the standard Sparsest Cut LP. We show that even for treewidth-2 graphs, the LP has an integrality gap close to 2 even after polynomially many rounds of Sherali-Adams. Hence our approach cannot be improved even on such restricted graphs without using a stronger relaxation.
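
For orientation, the Non-Uniform Sparsest Cut objective can be written down as a brute-force search, assuming the standard definition: minimize, over vertex subsets S, the capacity crossing S divided by the demand crossing S. The function name `sparsest_cut` and the input layout are hypothetical. The sketch enumerates all subsets and is exponential in n; it is unrelated to the LP-rounding algorithm described above.

```python
from itertools import combinations


def sparsest_cut(n, capacities, demands):
    """Return (best ratio, best side) over all proper non-empty vertex subsets.

    capacities -- list of (u, v, capacity) edges
    demands    -- list of (u, v, demand) pairs
    """
    best_ratio, best_side = float("inf"), None
    for size in range(1, n):
        for side in combinations(range(n), size):
            s = set(side)
            # An edge or demand pair crosses the cut if exactly one endpoint is in s.
            cap = sum(w for u, v, w in capacities if (u in s) != (v in s))
            dem = sum(d for u, v, d in demands if (u in s) != (v in s))
            if dem > 0 and cap / dem < best_ratio:
                best_ratio, best_side = cap / dem, s
    return best_ratio, best_side
```

Cuts that separate no demand are skipped, since their ratio is undefined.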