Physarum Powered Differentiable Linear Programming Layers and Applications
Consider a learning algorithm that involves an internal call to an
optimization routine such as a generalized eigenvalue problem, a cone
programming problem, or even sorting. Integrating such a routine as a layer
within a trainable deep network in a numerically stable way is not simple --
for instance, strategies for differentiable eigendecomposition and sorting have
emerged only recently. We propose an efficient and differentiable solver for
general linear programming problems that can be used in a plug-and-play manner
within deep neural networks as a layer. Our development is inspired by a
fascinating but not widely used link between the dynamics of the slime mold
Physarum and mathematical optimization schemes such as steepest descent. We
describe our development and demonstrate the use of our solver in a video
object segmentation task and in meta-learning for few-shot learning. We review
the relevant known results and provide a technical analysis describing the
solver's applicability to our use cases. Our solver performs comparably to a
customized projected gradient descent method on the first task and outperforms
the very recently proposed differentiable CVXPY solver on the second task.
Experiments show that our solver converges quickly without the need for a
feasible initial point. Interestingly, our scheme is easy to implement and can
readily serve as a layer whenever a learning procedure within a larger network
needs a fast approximate solution to an LP.
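
To make the underlying idea concrete, here is a minimal NumPy sketch of the Physarum dynamics for a standard-form LP, the kind of continuous-time scheme the abstract alludes to. This is only the forward solve under the usual assumptions (c > 0, feasible and bounded LP); the function name physarum_lp, the step size, and the iteration count are illustrative and are not the authors' actual differentiable layer.

```python
import numpy as np

def physarum_lp(A, b, c, num_iters=500, step=0.1, x0=None):
    """Physarum-dynamics sketch for the standard-form LP
        min c^T x  s.t.  A x = b,  x >= 0,
    assuming c > 0 and a feasible, bounded problem.  The discretized dynamics is
        x_{k+1} = (1 - h) x_k + h * W A^T (A W A^T)^{-1} b,   W = diag(x / c).
    Any positive x0 works; it need not be feasible.
    """
    m, n = A.shape
    x = np.ones(n) if x0 is None else x0.astype(float)
    for _ in range(num_iters):
        W = x / c                         # diagonal of W, stored as a vector
        AWAt = (A * W) @ A.T              # A W A^T  (m x m)
        p = np.linalg.solve(AWAt, b)      # "potentials" p
        q = W * (A.T @ p)                 # q = W A^T p, the point the flow moves toward
        x = (1.0 - step) * x + step * q   # Euler step of the dynamics
    return x

# Tiny usage example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0  (optimum [1, 0]).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
print(physarum_lp(A, b, c).round(3))
```

Note that q always satisfies A q = b, so the constraint violation of the iterate decays geometrically regardless of the starting point, which is consistent with the abstract's remark that no feasible initial point is needed. Turning this forward iteration into a trainable layer additionally requires differentiating the fixed point with respect to (A, b, c), which is the paper's contribution and is not shown here.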
A Bayesian conjugate gradient method (with Discussion)
A fundamental task in numerical computation is the solution of large linear
systems. The conjugate gradient method is an iterative method which offers
rapid convergence to the solution, particularly when an effective
preconditioner is employed. However, for more challenging systems, a
substantial error can remain even after many iterations have been performed.
The estimates obtained in this case are of little value unless further
information can be provided about the numerical error. In this paper, we
propose a novel statistical model for this numerical error, set in a Bayesian
framework. Our
approach is a strict generalisation of the conjugate gradient method, which is
recovered as the posterior mean for a particular choice of prior. The estimates
obtained are analysed with Krylov subspace methods, and a contraction result
for the posterior is presented. The method is then analysed in a simulation
study and applied to a challenging problem in medical imaging.
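
To illustrate the kind of output such a probabilistic solver returns, here is a minimal NumPy sketch of the Gaussian conditioning that underlies it: with a Gaussian prior on the solution and observations of the form S^T A x = S^T b along search directions S, the posterior over the solution is available in closed form. The function name bayes_linear_solver_posterior, the choice Sigma0 = I, and the Krylov directions used below are illustrative assumptions, not the paper's exact recursion; the paper's point that conjugate gradient is recovered as the posterior mean corresponds to a particular choice of prior covariance.

```python
import numpy as np

def bayes_linear_solver_posterior(A, b, x0, Sigma0, S):
    """Gaussian posterior over the solution x* of A x* = b after observing the
    projections S^T A x* = S^T b.  Plain linear-Gaussian conditioning with
    prior x* ~ N(x0, Sigma0); a batch sketch, not the paper's iterative
    algorithm.  Column j of S is the j-th search direction.
    """
    r0 = b - A @ x0                          # prior residual
    ASig = A @ Sigma0                        # A Sigma0, reused below
    Lam = S.T @ ASig @ A.T @ S               # S^T A Sigma0 A^T S, observation Gram matrix
    gain = ASig.T @ S @ np.linalg.inv(Lam)   # Sigma0 A^T S Lam^{-1}
    mean = x0 + gain @ (S.T @ r0)            # posterior mean
    cov = Sigma0 - gain @ S.T @ ASig         # posterior covariance
    return mean, cov

# Usage on a small SPD system, with a few Krylov vectors as search directions.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)
x0, Sigma0 = np.zeros(5), np.eye(5)
S = np.column_stack([b, A @ b, A @ A @ b])
mean, cov = bayes_linear_solver_posterior(A, b, x0, Sigma0, S)
print(np.linalg.norm(S.T @ (A @ mean - b)))  # observed directions are matched exactly (~0)
print(np.trace(cov))                         # posterior uncertainty has contracted from trace(Sigma0) = 5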