Physarum Powered Differentiable Linear Programming Layers and Applications
Consider a learning algorithm, which involves an internal call to an
optimization routine such as a generalized eigenvalue problem, a cone
programming problem or even sorting. Integrating such a method as layers within
a trainable deep network in a numerically stable way is not simple -- for
instance, only recently, strategies have emerged for eigendecomposition and
differentiable sorting. We propose an efficient and differentiable solver for
general linear programming problems which can be used in a plug and play manner
within deep neural networks as a layer. Our development is inspired by a
fascinating but not widely used link between dynamics of slime mold (physarum)
and mathematical optimization schemes such as steepest descent. We describe our
development and demonstrate the use of our solver in a video object
segmentation task and meta-learning for few-shot learning. We review the
relevant known results and provide a technical analysis describing its
applicability for our use cases. Our solver performs comparably with a
customized projected gradient descent method on the first task and outperforms
the very recently proposed differentiable CVXPY solver on the second task.
Experiments show that our solver converges quickly without the need for a
feasible initial point. Interestingly, our scheme is easy to implement and can
easily serve as layers whenever a learning procedure needs a fast approximate
solution to an LP within a larger network.
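The link between slime mold dynamics and linear programming can be illustrated with a minimal sketch. It assumes the standard-form problem min c^T x subject to Ax = b, x >= 0 with strictly positive costs c, and uses a discretized Physarum update known from the literature on these dynamics; the function name `physarum_lp`, the step size, and the iteration count are hypothetical choices for illustration, and the layer described in the abstract may differ in its details.

```python
import numpy as np

def physarum_lp(A, b, c, step=0.5, iters=300):
    """Approximate min c@x s.t. A@x = b, x >= 0 (assumes c > 0).

    Illustrative discretization of Physarum dynamics; not the authors' code.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    x = np.ones(A.shape[1])        # positive start; need not satisfy A@x = b
    for _ in range(iters):
        w = x / c                  # conductances, W = diag(x / c)
        L = (A * w) @ A.T          # A W A^T
        p = np.linalg.solve(L, b)  # node potentials
        q = w * (A.T @ p)          # induced flow, satisfies A@q = b
        x = (1.0 - step) * x + step * q
    return x

# Toy problem: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
x = physarum_lp(A=[[1.0, 1.0]], b=[1.0], c=[1.0, 2.0])
```

Every operation in the loop is a differentiable array operation, which is what would allow such an iteration to be unrolled inside a network. Note that the starting point `x = 1` is infeasible for the toy problem, consistent with the abstract's remark that no feasible initial point is needed.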
Efficient Relaxations for Dense CRFs with Sparse Higher Order Potentials
Dense conditional random fields (CRFs) have become a popular framework for
modelling several problems in computer vision such as stereo correspondence and
multi-class semantic segmentation. By modelling long-range interactions, dense
CRFs provide a labelling that captures finer detail than their sparse
counterparts. Currently, the state-of-the-art algorithm performs mean-field
inference using a filter-based method but fails to provide a strong theoretical
guarantee on the quality of the solution. A question naturally arises as to
whether it is possible to obtain a maximum a posteriori (MAP) estimate of a
dense CRF using a principled method. Within this paper, we show that this is
indeed possible. We will show that, by using a filter-based method, continuous
relaxations of the MAP problem can be optimised efficiently using
state-of-the-art algorithms. Specifically, we will solve a quadratic
programming (QP) relaxation using the Frank-Wolfe algorithm and a linear
programming (LP) relaxation by developing a proximal minimisation framework. By
exploiting labelling consistency in the higher-order potentials and utilising
the filter-based method, we are able to formulate the above algorithms such
that each iteration has a complexity linear in the number of classes and random
variables. The presented algorithms can be applied to any labelling problem
using a dense CRF with sparse higher-order potentials. In this paper, we use
semantic segmentation as an example application as it demonstrates the ability
of the algorithm to scale to dense CRFs with large dimensions. We perform
experiments on the Pascal dataset to indicate that the presented algorithms are
able to attain lower energies than the mean-field inference method.
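A minimal sketch of Frank-Wolfe on a QP relaxation of a labelling problem is given below. It assumes a Potts-style pairwise term and a small dense weight matrix `W` (symmetric, zero diagonal), and it uses the generic diminishing step size 2/(k+2). It does not reproduce the filter-based computation or the higher-order potentials from the abstract; the function name `frank_wolfe_potts` and the toy data are hypothetical.

```python
import numpy as np

def frank_wolfe_potts(unary, W, iters=50):
    """Frank-Wolfe on E(y) = sum(unary*y) - 0.5*sum(y*(W@y)),
    with each row of y constrained to the probability simplex.

    Illustrative only: W is an explicit dense matrix here.
    """
    unary = np.asarray(unary, dtype=float)
    W = np.asarray(W, dtype=float)
    n, num_labels = unary.shape
    y = np.full((n, num_labels), 1.0 / num_labels)  # uniform start
    for k in range(iters):
        grad = unary - W @ y                 # gradient of E (W symmetric)
        s = np.zeros_like(y)                 # conditional gradient:
        s[np.arange(n), grad.argmin(axis=1)] = 1.0  # best vertex per variable
        gamma = 2.0 / (k + 2.0)              # standard diminishing step
        y = (1.0 - gamma) * y + gamma * s
    return y

# Three fully connected variables, two labels. Variable 1 weakly prefers
# label 1 on its own, but its neighbours strongly prefer label 0.
unary = [[0.0, 2.0], [0.6, 0.5], [0.0, 2.0]]
W = 0.5 * (np.ones((3, 3)) - np.eye(3))
labels = frank_wolfe_potts(unary, W).argmax(axis=1)
```

The conditional-gradient step decomposes over variables, so each iteration only needs the product `W @ y`; that product is the part a filter-based method would approximate efficiently for dense CRFs.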
Efficient inference for fully-connected CRFs with stationarity
The Conditional Random Field (CRF) is a popular tool for object-based image segmentation. CRFs used in practice typically have edges only between adjacent image pixels. To represent object relationship statistics beyond adjacent pixels, prior work either represents only weak spatial information using the segmented regions, or encodes only global object co-occurrences. In this paper, we propose a unified model that augments the pixel-wise CRFs to capture object spatial relationships. To this end, we use a fully connected CRF, which has an edge for each pair of pixels. The edge potentials are defined to capture the spatial information and preserve the object boundaries at the same time. Traditional inference methods, such as belief propagation and graph cuts, are impractical in such a case where billions of edges are defined. Under only one assumption that the spatial relationships among different objects only depend on their relative positions (spatially stationary), we develop an efficient inference algorithm that converges in a few seconds on a standard resolution image, where belief propagation takes more than one hour for a single iteration.
Probabilistic Inference Based Message-Passing for Resource Constrained DCOPs
Distributed constraint optimization (DCOP) is an important framework for coordinated multiagent decision making. We address a practically useful variant of DCOP, called resource-constrained DCOP (RC-DCOP), which takes into account agents' consumption of shared limited resources. We present a promising new class of algorithms for RC-DCOPs by translating the underlying coordination problem to probabilistic inference. Using inference techniques such as expectation-maximization and convex optimization machinery, we develop a novel convergent message-passing algorithm for RC-DCOPs. Experiments on standard benchmarks show that our approach provides better quality than previous best DCOP algorithms and has much lower failure rate. Comparisons against an efficient centralized solver show that our approach provides near-optimal solutions, and is significantly faster on larger instances.
Message Passing Algorithms for MAP Estimation using DC Programming
We address the problem of finding the most likely assignment or MAP estimation in a Markov random field. We analyze the linear programming formulation of MAP through the lens of difference of convex functions (DC) programming, and use the concave-convex procedure (CCCP) to develop efficient message-passing solvers. The resulting algorithms are guaranteed to converge to a global optimum of the well-studied local polytope, an outer bound on the MAP marginal polytope. To tighten the outer bound, we show how to combine it with the mean-field based inner bound and, again, solve it using CCCP. We also identify a useful relationship between the DC formulations and some recently proposed algorithms based on Bregman divergence. Experimentally, this hybrid approach produces optimal solutions for a range of hard OR problems and near-optimal solutions for standard benchmarks.
Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications
In computer vision, many problems such as image segmentation, pixel
labelling, and scene parsing can be formulated as binary quadratic programs
(BQPs). For submodular problems, cut-based methods can be employed to
efficiently solve large-scale problems. However, general nonsubmodular problems
are significantly more challenging to solve. Solving problems that are large
enough to be of practical interest typically requires relaxation. Two standard
relaxation methods are widely used for
solving general BQPs--spectral methods and semidefinite programming (SDP), each
with their own advantages and disadvantages. Spectral relaxation is simple and
easy to implement, but its bound is loose. Semidefinite relaxation has a
tighter bound, but its computational complexity is high, especially for large
scale problems. In this work, we present a new SDP formulation for BQPs, with
two desirable properties. First, it has a similar relaxation bound to
conventional SDP formulations. Second, compared with conventional SDP methods,
the new SDP formulation leads to a significantly more efficient and scalable
dual optimization approach, which has the same degree of complexity as spectral
methods. We then propose two solvers, namely, quasi-Newton and smoothing Newton
methods, for the dual problem. Both of them are significantly more efficient
than standard interior-point methods. In practice, the smoothing Newton solver
is faster than the quasi-Newton solver for dense or medium-sized problems,
while the quasi-Newton solver is preferable for large sparse/structured
problems. Our experiments on a few computer vision applications including
clustering, image segmentation, co-segmentation and registration show the
potential of our SDP formulation for solving large-scale BQPs.
Comment: Fixed some typos. 18 pages. Accepted to IEEE Transactions on Pattern
Analysis and Machine Intelligence.
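As a point of comparison for the spectral baseline mentioned in the abstract above, the sketch below shows a textbook spectral relaxation of a BQP of the form min x^T A x with x in {-1, +1}^n. It is not the paper's SDP formulation. The integrality constraint is relaxed to ||x||^2 = n, so the relaxed optimum is n times the smallest eigenvalue of A, and a feasible point is recovered by taking signs of the corresponding eigenvector. The function name `spectral_relaxation` and the toy matrix are hypothetical.

```python
import numpy as np

def spectral_relaxation(A):
    """Spectral relaxation of min x^T A x over x in {-1, +1}^n.

    Returns the rounded solution, its objective value, and the
    lower bound n * lambda_min(A). A is assumed symmetric.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    lower_bound = n * eigvals[0]              # optimum of the relaxed problem
    x = np.where(eigvecs[:, 0] >= 0, 1, -1)   # sign rounding
    return x, float(x @ A @ x), lower_bound

# Two groups {0, 1} and {2, 3}: negative weights within a group reward
# equal signs, positive weights across groups reward opposite signs.
A = [[0, -1, 1, 1],
     [-1, 0, 1, 1],
     [1, 1, 0, -1],
     [1, 1, -1, 0]]
x, value, bound = spectral_relaxation(A)
```

On this toy instance the bound happens to be attained by the rounded solution; in general the spectral bound can be loose, which is the weakness the abstract attributes to spectral methods.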