Optimal Bounds for the $k$-cut Problem

Authors: Anupam Gupta, David G. Harris, Euiwoong Lee, Jason Li
Publication date: 12/02/2021

In the $k$-cut problem, we want to find the smallest set of edges whose deletion breaks a given (multi)graph into $k$ connected components. Algorithms of Karger & Stein and Thorup showed how to find such a minimum $k$-cut in time approximately $O(n^{2k})$. The best lower bounds come from conjectures about the solvability of the $k$-clique problem, and show that solving $k$-cut is likely to require time $\Omega(n^k)$. Recent results of Gupta, Lee & Li have given special-purpose algorithms that solve the problem in time $n^{1.98k + O(1)}$, and ones that have better performance for special classes of graphs (e.g., for small integer weights). In this work, we resolve the problem for general graphs, by showing that the Contraction Algorithm of Karger outputs any fixed $k$-cut of weight $\alpha \lambda_k$ with probability $\Omega_k(n^{-\alpha k})$, where $\lambda_k$ denotes the minimum $k$-cut size. This also gives an extremal bound of $O_k(n^k)$ on the number of minimum $k$-cuts and an algorithm to compute a minimum $k$-cut in similar runtime. Both are tight up to lower-order factors, with the algorithmic lower bound assuming hardness of max-weight $k$-clique. The first main ingredient in our result is a fine-grained analysis of how the graph shrinks (and how the average degree evolves) in the Karger process. The second ingredient is an extremal bound on the number of cuts of size less than $2 \lambda_k/k$, using the Sunflower lemma.

Comment: Final version of arXiv:1911.09165 with new and more general proof
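To make the contraction process concrete, here is a minimal Python sketch of random edge contraction down to $k$ super-vertices, repeated over several trials. It is an illustration under stated assumptions, not the paper's analysis or implementation: the graph is assumed to be a connected, unweighted multigraph given as an edge list, and the names `contract_to_k` and `min_k_cut_estimate` as well as the trial count are hypothetical choices made for this example.

```python
import random


def contract_to_k(n, edges, k, rng):
    """One trial: contract uniformly random edges until k super-vertices remain.

    n     -- number of vertices, labelled 0..n-1
    edges -- list of (u, v) pairs; parallel edges are allowed
    Returns (cut_size, labels), where labels[v] identifies v's super-vertex.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    live = list(edges)
    while components > k:
        # Drop self-loops of the contracted multigraph.
        live = [(u, v) for (u, v) in live if find(u) != find(v)]
        if not live:
            break  # input was disconnected; nothing left to contract
        u, v = rng.choice(live)  # uniform over remaining (parallel) edges
        parent[find(u)] = find(v)
        components -= 1

    labels = [find(v) for v in range(n)]
    cut_size = sum(1 for (u, v) in edges if labels[u] != labels[v])
    return cut_size, labels


def min_k_cut_estimate(n, edges, k, trials, seed=0):
    """Keep the smallest k-cut seen over independent contraction trials."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        candidate = contract_to_k(n, edges, k, rng)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


# Hypothetical example: three triangles joined in a path by two bridges.
EXAMPLE_EDGES = [
    (0, 1), (1, 2), (0, 2),
    (3, 4), (4, 5), (3, 5),
    (6, 7), (7, 8), (6, 8),
    (2, 3), (5, 6),
]
```

Calling `min_k_cut_estimate(9, EXAMPLE_EDGES, 3, trials=200)` returns the best cut size found together with the vertex labelling. Because each trial is randomized, the result is only correct with high probability, and the number of trials needed grows with $n$ and $k$ in line with the success probability quoted in the abstract.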

arXiv.org e-Print Archive

Diversifying Top-K Results

Authors: Lijun Chang, Lu Qin, Jeffrey Xu Yu
Publication date: 01/01/2012

Top-k query processing finds a list of k results that have the largest scores with respect to the user-given query, under the assumption that all k results are independent of each other. In practice, some of the top-k results returned can be very similar to each other, and are therefore redundant. In the literature, diversified top-k search has been studied to return k results that take both score and diversity into consideration. Most existing solutions for diversified top-k search assume that the scores of all search results are given, and some works solve the diversity problem for one specific setting and can hardly be extended to general cases. In this paper, we study the diversified top-k search problem. We define a general diversified top-k search problem that only considers the similarity of the search results themselves. We propose a framework such that most existing solutions for top-k query processing can be extended easily to handle diversified top-k search, by simply applying three new functions: a sufficient stop condition sufficient(), a necessary stop condition necessary(), and an algorithm for diversified top-k search on the current set of generated results, div-search-current(). We propose three new algorithms, namely div-astar, div-dp, and div-cut, to solve the div-search-current() problem. div-astar is an A*-based algorithm. div-dp decomposes the results into components, which are searched independently using div-astar and combined using dynamic programming. div-cut further decomposes the current set of generated results using cut points and combines the results using sophisticated operations. We conducted extensive performance studies using two real datasets, enwiki and reuters. Our div-cut algorithm finds the optimal solution for the diversified top-k search problem in seconds even for k as large as 2,000.

Comment: VLDB201

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Tight Continuous Relaxation of the Balanced $k$-Cut Problem

Authors: Matthias Hein, Pramod Kaushik Mudrakarta, Syama Sundar Rangapuram
Publication date: 01/01/2014

Spectral Clustering as a relaxation of the normalized/ratio cut has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced $k$-cut of the graph, are either based on greedy techniques or on heuristics that have only a weak connection to the original motivation of minimizing the normalized cut. In this paper we propose a new tight continuous relaxation for any balanced $k$-cut problem and show that a related, recently proposed relaxation is in most cases loose, leading to poor performance in practice. For the optimization of our tight continuous relaxation we propose a new algorithm for the difficult sum-of-ratios minimization problem which achieves monotonic descent. Extensive comparisons show that our method outperforms all existing approaches for ratio cut and other balanced $k$-cut criteria.

Comment: Long version of paper accepted at NIPS 201
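For readers unfamiliar with the objectives named in this abstract, the following Python sketch evaluates the standard ratio cut, $\sum_i \mathrm{cut}(C_i, \overline{C_i}) / |C_i|$, and normalized cut, $\sum_i \mathrm{cut}(C_i, \overline{C_i}) / \mathrm{vol}(C_i)$, of a given partition. It only scores a fixed clustering and does not implement the paper's relaxation or its optimization algorithm. The function name `balanced_cut_scores` and the edge-dictionary input format are assumptions made for this example.

```python
def balanced_cut_scores(weights, labels):
    """Score a partition under the ratio cut and normalized cut criteria.

    weights -- dict mapping undirected edges (u, v), u != v, to positive weights
    labels  -- labels[v] is the cluster id of vertex v
    Returns (ratio_cut, normalized_cut).
    """
    clusters = sorted(set(labels))
    size = {c: 0 for c in clusters}   # |C_i|
    vol = {c: 0.0 for c in clusters}  # sum of weighted degrees in C_i
    cut = {c: 0.0 for c in clusters}  # weight leaving C_i

    for lab in labels:
        size[lab] += 1

    for (u, v), w in weights.items():
        vol[labels[u]] += w
        vol[labels[v]] += w
        if labels[u] != labels[v]:
            cut[labels[u]] += w
            cut[labels[v]] += w

    ratio_cut = sum(cut[c] / size[c] for c in clusters)
    # A cluster of isolated vertices has zero volume; skip it to avoid 0/0.
    normalized_cut = sum(cut[c] / vol[c] for c in clusters if vol[c] > 0)
    return ratio_cut, normalized_cut
```

As a small check, take two unit-weight triangles joined by a single edge and split them into the two triangles: each cluster has cut weight 1, size 3, and volume 7, so the ratio cut is $2/3$ and the normalized cut is $2/7$.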

arXiv.org e-Print Archive

CiteSeerX

CISPA – Helmholtz-Zentrum für Informationssicherheit

Sparsest Cut on Bounded Treewidth Graphs: Algorithms and Hardness Results

Authors: Anupam Gupta, Kunal Talwar, David Witmer
Publication date: 01/01/2013

We give a 2-approximation algorithm for Non-Uniform Sparsest Cut that runs in time $n^{O(k)}$, where $k$ is the treewidth of the graph. This improves on the previous $2^{2^k}$-approximation in time $\mathrm{poly}(n) \cdot 2^{O(k)}$ due to Chlamtáč et al. To complement this algorithm, we show the following hardness results: If the Non-Uniform Sparsest Cut problem has a $\rho$-approximation for series-parallel graphs (where $\rho \geq 1$), then the Max Cut problem has an algorithm with approximation factor arbitrarily close to $1/\rho$. Hence, even for such restricted graphs (which have treewidth 2), the Sparsest Cut problem is NP-hard to approximate better than $17/16 - \epsilon$ for $\epsilon > 0$; assuming the Unique Games Conjecture the hardness becomes $1/\alpha_{GW} - \epsilon$. For graphs with large (but constant) treewidth, we show a hardness result of $2 - \epsilon$ assuming the Unique Games Conjecture. Our algorithm rounds a linear program based on (a subset of) the Sherali-Adams lift of the standard Sparsest Cut LP. We show that even for treewidth-2 graphs, the LP has an integrality gap close to 2 even after polynomially many rounds of Sherali-Adams. Hence our approach cannot be improved even on such restricted graphs without using a stronger relaxation.
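As a point of reference for the objective being approximated, the sketch below solves Non-Uniform Sparsest Cut exactly by brute force on a tiny instance: it minimizes, over vertex subsets $S$, the total capacity crossing $S$ divided by the total demand separated by $S$. This exponential-time enumeration is only an illustration of the problem definition and is unrelated to the LP-rounding algorithm described in the abstract. The function name `sparsest_cut_bruteforce` and the dictionary-based input format are assumptions made for this example.

```python
from itertools import combinations


def sparsest_cut_bruteforce(n, capacity, demand):
    """Exact Non-Uniform Sparsest Cut by enumeration (tiny graphs only).

    n        -- number of vertices, labelled 0..n-1
    capacity -- dict mapping edges (u, v) to capacities
    demand   -- dict mapping vertex pairs (u, v) to demands
    Returns (sparsity, side) for the best cut, or (None, None) if no cut
    separates any demand pair.
    """
    best_value, best_side = None, None
    # Fix vertex 0 inside S so each cut is enumerated once, not twice.
    for extra_count in range(0, n - 1):
        for extra in combinations(range(1, n), extra_count):
            side = {0, *extra}
            crossing_capacity = sum(
                c for (u, v), c in capacity.items() if (u in side) != (v in side)
            )
            separated_demand = sum(
                d for (u, v), d in demand.items() if (u in side) != (v in side)
            )
            if separated_demand == 0:
                continue  # sparsity is undefined for this cut
            value = crossing_capacity / separated_demand
            if best_value is None or value < best_value:
                best_value, best_side = value, side
    return best_value, best_side
```

For example, on the unit-capacity path 0-1-2-3 with unit demands between the pairs (0, 3) and (1, 2), cutting the middle edge separates both demand pairs at capacity 1, giving sparsity $1/2$ with $S = \{0, 1\}$.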

arXiv.org e-Print Archive

CiteSeerX