233 research outputs found
Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference
We propose a Branch-and-Cut (B&C) method for solving general MAP-MRF
inference problems. The core of our method is a very efficient bounding
procedure, which combines scalable semidefinite programming (SDP) and a
cutting-plane method for seeking violated constraints. In order to further
speed up the computation, several strategies have been exploited, including
model reduction, warm start and removal of inactive constraints.
We analyze the performance of the proposed method under different settings,
and demonstrate that our method either outperforms or performs on par with
state-of-the-art approaches. Especially when the connectivities are dense or
when the relative magnitudes of the unary costs are low, we achieve the best
reported results. Experiments show that the proposed algorithm achieves better
approximation than the state-of-the-art methods within a variety of time
budgets on challenging non-submodular MAP-MRF inference problems.Comment: 21 page
Scalable Semidefinite Relaxation for Maximum A Posterior Estimation
Maximum a posteriori (MAP) inference over discrete Markov random fields is a
fundamental task spanning a wide spectrum of real-world applications, which is
known to be NP-hard for general graphs. In this paper, we propose a novel
semidefinite relaxation formulation (referred to as SDR) to estimate the MAP
assignment. Algorithmically, we develop an accelerated variant of the
alternating direction method of multipliers (referred to as SDPAD-LR) that can
effectively exploit the special structure of the new relaxation. Encouragingly,
the proposed procedure allows solving SDR for large-scale problems, e.g.,
problems on a grid graph comprising hundreds of thousands of variables with
multiple states per node. Compared with prior SDP solvers, SDPAD-LR is capable
of attaining comparable accuracy while exhibiting remarkably improved
scalability, in contrast to the commonly held belief that semidefinite
relaxation can only been applied on small-scale MRF problems. We have evaluated
the performance of SDR on various benchmark datasets including OPENGM2 and PIC
in terms of both the quality of the solutions and computation time.
Experimental results demonstrate that for a broad class of problems, SDPAD-LR
outperforms state-of-the-art algorithms in producing better MAP assignment in
an efficient manner.Comment: accepted to International Conference on Machine Learning (ICML 2014
Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications
In computer vision, many problems such as image segmentation, pixel
labelling, and scene parsing can be formulated as binary quadratic programs
(BQPs). For submodular problems, cuts based methods can be employed to
efficiently solve large-scale problems. However, general nonsubmodular problems
are significantly more challenging to solve. Finding a solution when the
problem is of large size to be of practical interest, however, typically
requires relaxation. Two standard relaxation methods are widely used for
solving general BQPs--spectral methods and semidefinite programming (SDP), each
with their own advantages and disadvantages. Spectral relaxation is simple and
easy to implement, but its bound is loose. Semidefinite relaxation has a
tighter bound, but its computational complexity is high, especially for large
scale problems. In this work, we present a new SDP formulation for BQPs, with
two desirable properties. First, it has a similar relaxation bound to
conventional SDP formulations. Second, compared with conventional SDP methods,
the new SDP formulation leads to a significantly more efficient and scalable
dual optimization approach, which has the same degree of complexity as spectral
methods. We then propose two solvers, namely, quasi-Newton and smoothing Newton
methods, for the dual problem. Both of them are significantly more efficiently
than standard interior-point methods. In practice, the smoothing Newton solver
is faster than the quasi-Newton solver for dense or medium-sized problems,
while the quasi-Newton solver is preferable for large sparse/structured
problems. Our experiments on a few computer vision applications including
clustering, image segmentation, co-segmentation and registration show the
potential of our SDP formulation for solving large-scale BQPs.Comment: Fixed some typos. 18 pages. Accepted to IEEE Transactions on Pattern
Analysis and Machine Intelligenc
Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks
Bayesian Networks (BNs) represent conditional probability relations among a
set of random variables (nodes) in the form of a directed acyclic graph (DAG),
and have found diverse applications in knowledge discovery. We study the
problem of learning the sparse DAG structure of a BN from continuous
observational data. The central problem can be modeled as a mixed-integer
program with an objective function composed of a convex quadratic loss function
and a regularization penalty subject to linear constraints. The optimal
solution to this mathematical program is known to have desirable statistical
properties under certain conditions. However, the state-of-the-art optimization
solvers are not able to obtain provably optimal solutions to the existing
mathematical formulations for medium-size problems within reasonable
computational times. To address this difficulty, we tackle the problem from
both computational and statistical perspectives. On the one hand, we propose a
concrete early stopping criterion to terminate the branch-and-bound process in
order to obtain a near-optimal solution to the mixed-integer program, and
establish the consistency of this approximate solution. On the other hand, we
improve the existing formulations by replacing the linear "big-" constraints
that represent the relationship between the continuous and binary indicator
variables with second-order conic constraints. Our numerical results
demonstrate the effectiveness of the proposed approaches
Theta Bodies for Polynomial Ideals
Inspired by a question of Lov\'asz, we introduce a hierarchy of nested
semidefinite relaxations of the convex hull of real solutions to an arbitrary
polynomial ideal, called theta bodies of the ideal. For the stable set problem
in a graph, the first theta body in this hierarchy is exactly Lov\'asz's theta
body of the graph. We prove that theta bodies are, up to closure, a version of
Lasserre's relaxations for real solutions to ideals, and that they can be
computed explicitly using combinatorial moment matrices. Theta bodies provide a
new canonical set of semidefinite relaxations for the max cut problem. For
vanishing ideals of finite point sets, we give several equivalent
characterizations of when the first theta body equals the convex hull of the
points. We also determine the structure of the first theta body for all ideals.Comment: 26 pages, 3 figure
Relax, no need to round: integrality of clustering formulations
We study exact recovery conditions for convex relaxations of point cloud
clustering problems, focusing on two of the most common optimization problems
for unsupervised clustering: -means and -median clustering. Motivations
for focusing on convex relaxations are: (a) they come with a certificate of
optimality, and (b) they are generic tools which are relatively parameter-free,
not tailored to specific assumptions over the input. More precisely, we
consider the distributional setting where there are clusters in
and data from each cluster consists of points sampled from a
symmetric distribution within a ball of unit radius. We ask: what is the
minimal separation distance between cluster centers needed for convex
relaxations to exactly recover these clusters as the optimal integral
solution? For the -median linear programming relaxation we show a tight
bound: exact recovery is obtained given arbitrarily small pairwise separation
between the balls. In other words, the pairwise center
separation is . Under the same distributional model, the
-means LP relaxation fails to recover such clusters at separation as large
as . Yet, if we enforce PSD constraints on the -means LP, we get
exact cluster recovery at center separation .
In contrast, common heuristics such as Lloyd's algorithm (a.k.a. the -means
algorithm) can fail to recover clusters in this setting; even with arbitrarily
large cluster separation, k-means++ with overseeding by any constant factor
fails with high probability at exact cluster recovery. To complement the
theoretical analysis, we provide an experimental study of the recovery
guarantees for these various methods, and discuss several open problems which
these experiments suggest.Comment: 30 pages, ITCS 201
A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints
We propose a new algorithm to solve optimization problems of the form for a smooth function under the constraints that is positive
semidefinite and the diagonal blocks of are small identity matrices. Such
problems often arise as the result of relaxing a rank constraint (lifting). In
particular, many estimation tasks involving phases, rotations, orthonormal
bases or permutations fit in this framework, and so do certain relaxations of
combinatorial problems such as Max-Cut. The proposed algorithm exploits the
facts that (1) such formulations admit low-rank solutions, and (2) their
rank-restricted versions are smooth optimization problems on a Riemannian
manifold. Combining insights from both the Riemannian and the convex geometries
of the problem, we characterize when second-order critical points of the smooth
problem reveal KKT points of the semidefinite problem. We compare against state
of the art, mature software and find that, on certain interesting problem
instances, what we call the staircase method is orders of magnitude faster, is
more accurate and scales better. Code is available.Comment: 37 pages, 3 figure
- …