5,880 research outputs found
On the relationship between bilevel decomposition algorithms and direct interior-point methods
Engineers have been using bilevel decomposition algorithms to solve certain nonconvex large-scale optimization problems arising in engineering design projects. These algorithms transform the large-scale problem into a bilevel program with one upperlevel problem (the master problem) and several lower-level problems (the subproblems). Unfortunately, there is analytical and numerical evidence that some of these commonly used bilevel decomposition algorithms may fail to converge even when the starting point is very close to the minimizer. In this paper, we establish a relationship between a particular bilevel decomposition algorithm, which only performs one iteration of an interior-point method when solving the subproblems, and a direct interior-point method, which solves the problem in its original (integrated) form. Using this relationship, we formally prove that the bilevel decomposition algorithm converges locally at a superlinear rate. The relevance of our analysis is that it bridges the gap between the incipient local convergence theory of bilevel decomposition algorithms and the mature theory of direct interior-point methods
Constrained Deep Networks: Lagrangian Optimization via Log-Barrier Extensions
This study investigates the optimization aspects of imposing hard inequality
constraints on the outputs of CNNs. In the context of deep networks,
constraints are commonly handled with penalties for their simplicity, and
despite their well-known limitations. Lagrangian-dual optimization has been
largely avoided, except for a few recent works, mainly due to the computational
complexity and stability/convergence issues caused by alternating explicit dual
updates/projections and stochastic optimization. Several studies showed that,
surprisingly for deep CNNs, the theoretical and practical advantages of
Lagrangian optimization over penalties do not materialize in practice. We
propose log-barrier extensions, which approximate Lagrangian optimization of
constrained-CNN problems with a sequence of unconstrained losses. Unlike
standard interior-point and log-barrier methods, our formulation does not need
an initial feasible solution. Furthermore, we provide a new technical result,
which shows that the proposed extensions yield an upper bound on the duality
gap. This generalizes the duality-gap result of standard log-barriers, yielding
sub-optimality certificates for feasible solutions. While sub-optimality is not
guaranteed for non-convex problems, our result shows that log-barrier
extensions are a principled way to approximate Lagrangian optimization for
constrained CNNs via implicit dual variables. We report comprehensive weakly
supervised segmentation experiments, with various constraints, showing that our
formulation outperforms substantially the existing constrained-CNN methods,
both in terms of accuracy, constraint satisfaction and training stability, more
so when dealing with a large number of constraints
Hessian barrier algorithms for linearly constrained optimization problems
In this paper, we propose an interior-point method for linearly constrained
optimization problems (possibly nonconvex). The method - which we call the
Hessian barrier algorithm (HBA) - combines a forward Euler discretization of
Hessian Riemannian gradient flows with an Armijo backtracking step-size policy.
In this way, HBA can be seen as an alternative to mirror descent (MD), and
contains as special cases the affine scaling algorithm, regularized Newton
processes, and several other iterative solution methods. Our main result is
that, modulo a non-degeneracy condition, the algorithm converges to the
problem's set of critical points; hence, in the convex case, the algorithm
converges globally to the problem's minimum set. In the case of linearly
constrained quadratic programs (not necessarily convex), we also show that the
method's convergence rate is for some
that depends only on the choice of kernel function (i.e., not on the problem's
primitives). These theoretical results are validated by numerical experiments
in standard non-convex test functions and large-scale traffic assignment
problems.Comment: 27 pages, 6 figure
- …