Randomized Algorithms for Nonconvex Nonsmooth Optimization
Nonsmooth optimization problems arise in a variety of applications, including robust control, robust optimization, eigenvalue optimization, compressed sensing, and decomposition methods for large-scale or complex optimization problems. When convexity is present, such problems are relatively easy to solve; optimization methods for convex nonsmooth optimization have been studied for decades, and bundle methods, for example, are a leading technique for convex nonsmooth minimization. However, these and other methods developed for convex problems are either inapplicable or inefficient when applied to nonconvex problems. The motivation of the work in this thesis is to design robust and efficient algorithms for solving nonsmooth optimization problems, particularly when nonconvexity is present.
First, we propose an adaptive gradient sampling (AGS) algorithm, based on the recently developed gradient sampling (GS) technique. Our AGS algorithm improves the computational efficiency of GS in critical ways. Then, we propose a BFGS gradient sampling (BFGS-GS) algorithm, a hybrid between a standard Broyden-Fletcher-Goldfarb-Shanno (BFGS) method and the GS method. Our BFGS-GS algorithm is more efficient than our previously proposed AGS algorithm and is competitive with (and in some ways outperforms) other contemporary solvers for nonsmooth nonconvex optimization. Finally, we propose a few additional extensions of the GS framework: one that merges GS ideas with those from bundle methods, one that incorporates smoothing techniques in order to minimize potentially non-Lipschitz objective functions, and one that tailors GS methods to regularization problems. We describe all the proposed algorithms in detail, and for every algorithm variant we prove global convergence guarantees under suitable assumptions.
Moreover, we perform numerical experiments to illustrate the efficiency of our algorithms. The test problems considered in our experiments include academic test problems as well as practical problems that arise in applications of nonsmooth optimization.
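The basic gradient sampling step underlying the AGS and BFGS-GS variants can be sketched as follows. This is a minimal, illustrative implementation (not the thesis's algorithms): sample gradients at randomly perturbed points near the current iterate, find the minimum-norm element of their convex hull as an approximate steepest-descent direction, and step along its negative. All names and parameter values here are assumptions, and GS properly samples in a ball with a line search, where this sketch uses a box and a fixed step.

```python
# Illustrative sketch of one gradient-sampling (GS) step; not the thesis's
# AGS or BFGS-GS algorithms. Sampling uses a box around x for simplicity
# (GS uses a ball) and a fixed step length (GS uses a line search).
import numpy as np
from scipy.optimize import minimize

def gs_step(grad, x, radius=1e-1, m=None, step=1e-1, rng=None):
    rng = np.random.default_rng(rng)
    n = x.size
    m = m or n + 1                                 # number of sampled points
    # Gradients at x and at m random points near x.
    pts = x + radius * rng.uniform(-1, 1, size=(m, n))
    G = np.vstack([grad(x)] + [grad(p) for p in pts])
    # Minimum-norm element of the convex hull of the sampled gradients:
    # minimize ||G^T w||^2 subject to w >= 0, sum(w) = 1.
    obj = lambda w: np.sum((G.T @ w) ** 2)
    w0 = np.full(len(G), 1.0 / len(G))
    res = minimize(obj, w0, method="SLSQP",
                   bounds=[(0, 1)] * len(G),
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
    d = -(G.T @ res.x)                             # approx. steepest descent
    return x + step * d

# Toy nonsmooth objective: f(x) = |x0| + x1^2.
f = lambda x: abs(x[0]) + x[1] ** 2
grad = lambda x: np.array([np.sign(x[0]), 2 * x[1]])
x = np.array([1.0, 1.0])
for _ in range(100):
    x = gs_step(grad, x)
```

Near the kink at x0 = 0 the sampled gradients have opposing signs, so their minimum-norm convex combination shrinks toward zero, which is exactly what gives GS a usable stationarity measure for nonsmooth functions.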
An Inequality Constrained SL/QP Method for Minimizing the Spectral Abscissa
We consider a problem in eigenvalue optimization: finding a local minimizer of the spectral abscissa, i.e., a value of a parameter that
results in the smallest value of the largest real part of the spectrum of a
matrix system. This is an important problem for the stabilization of control
systems. Many systems require the spectra to lie in the left half plane in
order for them to be stable. The optimization problem, however, is difficult to
solve because the underlying objective function is nonconvex, nonsmooth, and
non-Lipschitz. In addition, local minima tend to correspond to points of
non-differentiability and locally non-Lipschitz behavior. We present a
sequential linear and quadratic programming algorithm that solves a series of
linear or quadratic subproblems formed by linearizing the surfaces
corresponding to the largest eigenvalues. We present numerical results
comparing the algorithms to the state of the art.
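To make the objective concrete, the spectral abscissa of a matrix A is the largest real part among its eigenvalues, and stability requires it to be negative. The sketch below (all matrices and names are illustrative, and the crude grid scan stands in for the paper's SL/QP method) minimizes it over a scalar damping parameter; notice that the minimizer occurs exactly where two eigenvalues coalesce, a point of non-differentiability, as the abstract describes.

```python
# Spectral abscissa alpha(A) = max real part of the eigenvalues of A.
# Illustrative example: A(t) = A0 + t*B, minimized by a crude grid scan
# (the paper's SL/QP algorithm instead linearizes eigenvalue surfaces).
import numpy as np

def spectral_abscissa(A):
    return np.max(np.linalg.eigvals(A).real)

A0 = np.array([[0.0, 1.0],
               [-1.0, 0.5]])          # unstable: alpha(A0) = 0.25 > 0
B = np.array([[0.0, 0.0],
              [0.0, -1.0]])          # parameter t adds damping
ts = np.linspace(0.0, 3.0, 301)
vals = [spectral_abscissa(A0 + t * B) for t in ts]
best = ts[int(np.argmin(vals))]      # minimizer: eigenvalues coalesce there
```

At the minimizer t = 2.5 the two eigenvalues merge into a double eigenvalue at -1; for smaller t the abscissa decreases linearly, for larger t it climbs back toward zero with a square-root (locally non-Lipschitz) profile, illustrating why such problems are hard for smooth solvers.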
Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees
Asynchronous distributed algorithms are a popular way to reduce
synchronization costs in large-scale optimization, and in particular for neural
network training. However, for nonsmooth and nonconvex objectives, few
convergence guarantees exist beyond cases where closed-form proximal operator
solutions are available. As most popular contemporary deep neural networks lead
to nonsmooth and nonconvex objectives, there is now a pressing need for such
convergence guarantees. In this paper, we analyze for the first time the
convergence of stochastic asynchronous optimization for this general class of
objectives. In particular, we focus on stochastic subgradient methods allowing
for block variable partitioning, where the shared-memory-based model is
asynchronously updated by concurrent processes. To this end, we first introduce
a probabilistic model which captures key features of real asynchronous
scheduling between concurrent processes; under this model, we establish
convergence with probability one to an invariant set for stochastic subgradient
methods with momentum.
From the practical perspective, one issue with the family of methods we
consider is that it is not efficiently supported by machine learning
frameworks, as they mostly focus on distributed data-parallel strategies. To
address this, we propose a new implementation strategy for shared-memory based
training of deep neural networks, whereby concurrent parameter servers are
utilized to train a partitioned but shared model in single- and multi-GPU
settings. Based on this implementation, we achieve on average 1.2x speed-up in
comparison to state-of-the-art training methods for popular image
classification tasks without compromising accuracy.
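The block-partitioned, lock-free shared-memory scheme analyzed above can be sketched in a toy Hogwild-style form. This is an assumption-laden illustration, not the paper's parameter-server system: each worker thread owns a block of coordinates, reads the shared parameter vector without locking (so reads may be stale), and writes subgradient updates back in place; momentum is omitted for brevity.

```python
# Toy shared-memory, block-partitioned asynchronous subgradient method
# (Hogwild-style illustration; not the paper's multi-GPU parameter servers).
import threading
import numpy as np

dim, n_blocks, steps = 8, 4, 2000
x = np.ones(dim)                     # shared parameters, accessed lock-free
A = np.eye(dim)                      # toy nonsmooth objective f(x) = ||A x||_1

def subgrad(z):                      # a subgradient of ||A z||_1
    return A.T @ np.sign(A @ z)

def worker(block, lr=1e-3):
    idx = np.array_split(np.arange(dim), n_blocks)[block]
    for _ in range(steps):
        g = subgrad(x)               # possibly stale read of the shared model
        x[idx] -= lr * g[idx]        # in-place update of this block only

threads = [threading.Thread(target=worker, args=(b,)) for b in range(n_blocks)]
for t in threads: t.start()
for t in threads: t.join()
```

Because each block is written by exactly one worker, writes never race with each other; the asynchrony the paper's probabilistic model captures is in the stale reads of the other blocks.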
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
We study a stochastic and distributed algorithm for nonconvex problems whose
objective consists of a sum of N nonconvex L_i-smooth functions, plus a
nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT)
algorithm splits the problem into N subproblems and utilizes an augmented
Lagrangian based primal-dual scheme to solve it in a distributed and stochastic
manner. With a special non-uniform sampling, a version of NESTT achieves an
ε-stationary solution using O((Σ_{i=1}^N √(L_i/N))²/ε) gradient evaluations,
which can be up to O(N) times better than the (proximal) gradient
descent methods. It also achieves a Q-linear convergence rate for nonconvex
ℓ1-penalized quadratic problems with polyhedral constraints. Further, we
reveal a fundamental connection between primal-dual based methods and a few
primal-only methods such as IAG/SAG/SAGA.
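The proximal gradient baseline that NESTT is compared against can be sketched as follows. This is illustrative only: the smooth part here is a convex least-squares term for simplicity (NESTT targets nonconvex L_i-smooth terms), and the ℓ1 regularizer's proximal operator is the familiar soft-thresholding map. All names and values are assumptions.

```python
# Proximal gradient descent on f(x) = 0.5*||Ax - b||^2 + lam*||x||_1,
# the baseline the abstract's complexity bound is compared against.
# Soft-thresholding is the proximal operator of the l1 regularizer.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2           # step <= 1/L, L = ||A||_2^2

obj = lambda z: 0.5 * np.sum((A @ z - b) ** 2) + lam * np.sum(np.abs(z))
x = np.zeros(10)
for _ in range(500):
    grad = A.T @ (A @ x - b)                     # gradient of the smooth part
    x = soft_threshold(x - step * grad, step * lam)  # proximal (backward) step
```

NESTT distributes the smooth terms across nodes and replaces this centralized gradient with stochastically sampled, augmented-Lagrangian-coordinated local updates.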
A Simple and Efficient Algorithm for Nonlinear Model Predictive Control
We present PANOC, a new algorithm for solving optimal control problems
arising in nonlinear model predictive control (NMPC). A common approach to this
type of problem is sequential quadratic programming (SQP), which requires the
solution of a quadratic program at every iteration and, consequently, inner
iterative procedures. As a result, when the problem is ill-conditioned or the
prediction horizon is large, each outer iteration becomes computationally very
expensive. We propose a line-search algorithm that combines forward-backward
iterations (FB) and Newton-type steps over the recently introduced
forward-backward envelope (FBE), a continuous, real-valued, exact merit
function for the original problem. The curvature information of Newton-type
methods enables asymptotic superlinear rates under mild assumptions at the
limit point, and the proposed algorithm is based on very simple operations:
access to first-order information of the cost and dynamics and low-cost direct
linear algebra. Neither an inner iterative procedure nor Hessian evaluations
are required, making our approach computationally simpler than SQP methods. The
low memory requirements and simple implementation make our method particularly
suited for embedded NMPC applications.
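The forward-backward (FB) iteration that PANOC builds on reduces, for a box-constrained problem, to a projected-gradient step: a forward gradient step on the smooth cost followed by a backward projection onto the constraint set. The sketch below shows that basic step on a toy input-tracking problem; the problem, names, and values are illustrative, and PANOC's Newton-type acceleration on the forward-backward envelope is omitted.

```python
# Plain forward-backward (projected-gradient) iteration on a box-constrained
# toy problem -- the basic step that PANOC accelerates with Newton-type
# directions on the forward-backward envelope (not shown here).
import numpy as np

def project_box(u, lo, hi):          # backward step: prox of the box indicator
    return np.clip(u, lo, hi)

def forward_backward(grad, u0, lo, hi, gamma, iters=200):
    u = u0.copy()
    for _ in range(iters):
        u = project_box(u - gamma * grad(u), lo, hi)  # forward, then backward
    return u

# Toy input-tracking cost: 0.5*(u - r)^T Q (u - r) over the box [-1, 1]^3.
Q = np.diag([1.0, 2.0, 4.0])
r = np.array([2.0, -2.0, 0.5])
grad = lambda u: Q @ (u - r)
u = forward_backward(grad, np.zeros(3), lo=-1.0, hi=1.0,
                     gamma=1.0 / 4.0)     # gamma <= 1/L, L = max eig of Q
```

Each FB step needs only a gradient evaluation and a cheap projection, which is why the method suits embedded hardware; PANOC keeps exactly this per-iteration cost while adding curvature information through a line search on the FBE.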