Combinatorial persistency criteria for multicut and max-cut
In combinatorial optimization, partial variable assignments are called
persistent if they agree with some optimal solution. We propose persistency
criteria for the multicut and max-cut problems as well as fast combinatorial
routines to verify them. The criteria that we derive are based on mappings that
improve feasible multicuts, respectively cuts. Our elementary criteria can be
checked enumeratively. The more advanced ones rely on fast algorithms for upper
and lower bounds for the respective cut problems and max-flow techniques for
auxiliary min-cut problems. Our methods can be used as a preprocessing
technique for reducing problem sizes or for computing partial optimality
guarantees for solutions output by heuristic solvers. We show the efficacy of
our methods on instances of both problems from computer vision, biomedical
image analysis and statistical physics.
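To make the definition of persistency concrete, the following minimal sketch checks it by brute force on a tiny max-cut instance: it enumerates every labeling, keeps the optimal ones, and tests whether a partial assignment agrees with at least one of them. This is not one of the criteria proposed in the paper, only a direct check of the definition. The function names (`cut_value`, `optimal_cuts`, `is_persistent`) and the toy weights are assumptions made for illustration.

```python
from itertools import product


def cut_value(weights, labels):
    """Total weight of the edges whose endpoints receive different labels."""
    return sum(w for (u, v), w in weights.items() if labels[u] != labels[v])


def optimal_cuts(n, weights):
    """Enumerate all 2^n labelings and return (best value, list of optimal labelings)."""
    best, optima = None, []
    for labels in product((0, 1), repeat=n):
        value = cut_value(weights, labels)
        if best is None or value > best:
            best, optima = value, [labels]
        elif value == best:
            optima.append(labels)
    return best, optima


def is_persistent(n, weights, cut_edges, uncut_edges):
    """True if some optimal cut separates all cut_edges and joins all uncut_edges."""
    _, optima = optimal_cuts(n, weights)
    return any(
        all(lab[u] != lab[v] for u, v in cut_edges)
        and all(lab[u] == lab[v] for u, v in uncut_edges)
        for lab in optima
    )


# Hypothetical toy instance: a triangle with one negative edge weight.
weights = {(0, 1): 3.0, (1, 2): -2.0, (0, 2): 1.0}
print(is_persistent(3, weights, cut_edges=[(0, 1)], uncut_edges=[(1, 2)]))
print(is_persistent(3, weights, cut_edges=[(1, 2)], uncut_edges=[]))
```

Enumeration is only feasible for very small instances, which is why the abstract points to bounds and max-flow techniques for the more advanced criteria.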
Clustering Partially Observed Graphs via Convex Optimization
This paper considers the problem of clustering a partially observed
unweighted graph---i.e., one where for some node pairs we know there is an edge
between them, for some others we know there is no edge, and for the remaining
we do not know whether or not there is an edge. We want to organize the nodes
into disjoint clusters so that there is relatively dense (observed)
connectivity within clusters, and sparse across clusters.
We take a novel yet natural approach to this problem, by focusing on finding
the clustering that minimizes the number of "disagreements"---i.e., the sum of
the number of (observed) missing edges within clusters, and (observed) present
edges across clusters. Our algorithm uses convex optimization; its basis is a
reduction of disagreement minimization to the problem of recovering an
(unknown) low-rank matrix and an (unknown) sparse matrix from their partially
observed sum. We evaluate the performance of our algorithm on the classical
Planted Partition/Stochastic Block Model. Our main theorem provides sufficient
conditions for the success of our algorithm as a function of the minimum
cluster size, edge density and observation probability; in particular, the
results characterize the tradeoff between the observation probability and the
edge density gap. When there are a constant number of clusters of equal size,
our results are optimal up to logarithmic factors.
Comment: This is the final version published in Journal of Machine Learning
Research (JMLR). Partial results appeared in International Conference on
Machine Learning (ICML) 201
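The "disagreements" objective described in this abstract can be illustrated with a short sketch. It counts observed missing edges inside clusters plus observed present edges across clusters, and ignores unobserved pairs. It only evaluates the objective for a given clustering and does not implement the convex-optimization algorithm of the paper. The function name `count_disagreements`, the dictionary encoding of observations, and the toy data are assumptions made for illustration.

```python
def count_disagreements(observed, cluster_of):
    """Count disagreements of a clustering with a partially observed graph.

    observed:   dict mapping a node pair (u, v) to True (edge observed)
                or False (non-edge observed); unobserved pairs are absent.
    cluster_of: dict mapping each node to its cluster label.
    """
    total = 0
    for (u, v), has_edge in observed.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if same_cluster and not has_edge:
            total += 1  # observed missing edge within a cluster
        elif not same_cluster and has_edge:
            total += 1  # observed present edge across clusters
    return total


# Hypothetical toy data: the pair (1, 3) is unobserved and therefore not listed.
observed = {(0, 1): True, (0, 2): False, (1, 2): True, (2, 3): True, (0, 3): False}
two_clusters = {0: 0, 1: 0, 2: 1, 3: 1}
one_cluster = {0: 0, 1: 0, 2: 0, 3: 0}
print(count_disagreements(observed, two_clusters))
print(count_disagreements(observed, one_cluster))
```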
RAMA: A Rapid Multicut Algorithm on GPU
We propose a highly parallel primal-dual algorithm for the multicut (a.k.a. correlation clustering) problem, a classical graph clustering problem widely used in machine learning and computer vision. Our algorithm consists of three steps executed recursively: (1) finding conflicted cycles that correspond to violated inequalities of the underlying multicut relaxation, (2) performing message passing between the edges and cycles to optimize the Lagrange relaxation arising from the violated cycles found, producing reduced costs, and (3) contracting edges with high reduced costs through matrix-matrix multiplications. Our algorithm produces primal solutions and dual lower bounds that estimate the distance to the optimum. We implement our algorithm on GPUs and show improvements of one to two orders of magnitude in execution speed, without sacrificing solution quality, compared to traditional serial algorithms that run on CPUs. We can solve very large scale benchmark problems with up to variables in a few seconds with small primal-dual gaps. We make our code available at https://github.com/pawelswoboda/RAMA
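Step (3), edge contraction through matrix-matrix multiplication, can be sketched on a dense cost matrix. Assuming a symmetric matrix A of edge costs and an assignment matrix K whose entry (u, c) is 1 when node u is merged into cluster c, the contracted costs are given by the product K^T A K, with the diagonal (costs of edges that became internal) reset to zero. This is a plain CPU sketch under those assumptions, not the GPU implementation from the RAMA repository. The names `matmul` and `contract` and the toy matrix are hypothetical.

```python
def matmul(X, Y):
    """Dense matrix product of two list-of-lists matrices."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def contract(A, mapping):
    """Contract nodes according to mapping[u] = cluster index, via K^T A K."""
    n = len(A)
    k = max(mapping) + 1
    # Assignment matrix: K[u][c] = 1 if node u belongs to cluster c.
    K = [[1 if mapping[u] == c else 0 for c in range(k)] for u in range(n)]
    Kt = [list(col) for col in zip(*K)]
    C = matmul(matmul(Kt, A), K)
    for c in range(k):
        C[c][c] = 0  # drop costs of edges that are now inside a cluster
    return C


# Hypothetical toy graph on 4 nodes; a zero entry means "no edge".
A = [
    [0, 5, -1, 0],
    [5, 0, 2, -3],
    [-1, 2, 0, 4],
    [0, -3, 4, 0],
]
# Contract the edge (0, 1): nodes 0 and 1 both map to cluster 0.
print(contract(A, [0, 0, 1, 2]))
```

Parallel edges created by the contraction are merged by summing their costs, which is exactly what the matrix product computes.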