1,721 research outputs found

    Randomized Algorithms for Nonconvex Nonsmooth Optimization

    Nonsmooth optimization problems arise in a variety of applications including robust control, robust optimization, eigenvalue optimization, compressed sensing, and decomposition methods for large-scale or complex optimization problems. When convexity is present, such problems are relatively easy to solve. Optimization methods for convex nonsmooth optimization have been studied for decades. For example, bundle methods are a leading technique for convex nonsmooth minimization. However, these and other methods that have been developed for solving convex problems are either inapplicable or can be inefficient when applied to nonconvex problems. The motivation of the work in this thesis is to design robust and efficient algorithms for solving nonsmooth optimization problems, particularly when nonconvexity is present. First, we propose an adaptive gradient sampling (AGS) algorithm, which is based on a recently developed technique known as the gradient sampling (GS) algorithm. Our AGS algorithm improves the computational efficiency of GS in critical ways. Then, we propose a BFGS gradient sampling (BFGS-GS) algorithm, which is a hybrid between a standard Broyden-Fletcher-Goldfarb-Shanno (BFGS) method and the GS method. Our BFGS-GS algorithm is more efficient than our previously proposed AGS algorithm and also competitive with (and in some ways outperforms) other contemporary solvers for nonsmooth nonconvex optimization. Finally, we propose a few additional extensions of the GS framework: one in which we merge GS ideas with those from bundle methods, one in which we incorporate smoothing techniques in order to minimize potentially non-Lipschitz objective functions, and one in which we tailor GS methods for solving regularization problems. We describe all the proposed algorithms in detail. In addition, for all the algorithm variants, we prove global convergence guarantees under suitable assumptions. Moreover, we perform numerical experiments to illustrate the efficiency of our algorithms. The test problems considered in our experiments include academic test problems as well as practical problems that arise in applications of nonsmooth optimization.
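
As a rough illustration of the basic gradient sampling (GS) idea that the abstract builds on, here is a minimal one-dimensional sketch. It is not the AGS or BFGS-GS algorithm from the thesis. The objective `f`, the helper names, and all parameter values are assumptions chosen for illustration. In one dimension, the minimum-norm element of the convex hull of the sampled gradients is either zero (when the gradients change sign) or the sampled gradient closest to zero, so no quadratic program is needed.

```python
import math
import random

def f(x):
    # Hypothetical test objective: nonconvex, nonsmooth at its minimizer x = 0.
    return abs(x) + 0.5 * math.sin(x)

def grad_f(x):
    # Gradient of f where it is differentiable (x != 0).
    # At x = 0 the right-hand slope is used, an arbitrary choice for this sketch.
    return (1.0 if x >= 0 else -1.0) + 0.5 * math.cos(x)

def gradient_sampling_1d(f, grad, x0, eps=0.1, m=4, iters=200, seed=0):
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        # Evaluate gradients at x and at m random points within radius eps.
        pts = [x] + [x + rng.uniform(-eps, eps) for _ in range(m)]
        grads = [grad(p) for p in pts]
        lo, hi = min(grads), max(grads)
        if lo <= 0.0 <= hi:
            # Zero lies in the convex hull of the sampled gradients:
            # x is approximately eps-stationary, so shrink the sampling radius.
            eps *= 0.5
            if eps < 1e-10:
                break
            continue
        # Minimum-norm element of the interval [lo, hi] when it excludes zero.
        g = lo if lo > 0.0 else hi
        # Armijo backtracking line search along the direction -g.
        t = 1.0
        while t > 1e-16 and f(x - t * g) > f(x) - 1e-4 * t * g * g:
            t *= 0.5
        if t > 1e-16:
            x -= t * g
    return x

x_final = gradient_sampling_1d(f, grad_f, x0=2.0)
```

In higher dimensions the minimum-norm step requires solving a small quadratic program over the sampled gradients, which is the main per-iteration cost that adaptive variants aim to reduce.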

    An Inequality Constrained SL/QP Method for Minimizing the Spectral Abscissa

    We consider a problem in eigenvalue optimization, in particular finding a local minimizer of the spectral abscissa, that is, a parameter value that yields the smallest value of the largest real part of the spectrum of a matrix system. This is an important problem for the stabilization of control systems. Many systems require the spectra to lie in the left half plane in order for them to be stable. The optimization problem, however, is difficult to solve because the underlying objective function is nonconvex, nonsmooth, and non-Lipschitz. In addition, local minima tend to correspond to points of non-differentiability and locally non-Lipschitz behavior. We present a sequential linear and quadratic programming algorithm that solves a series of linear or quadratic subproblems formed by linearizing the surfaces corresponding to the largest eigenvalues. We present numerical results comparing the algorithms to the state of the art.
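
To make the objective concrete, the following minimal sketch computes the spectral abscissa of a 2x2 matrix in closed form and evaluates it on a one-parameter family. The family `A(x) = [[0, 1], [x, 0]]` and the function names are assumptions for illustration and do not come from the paper. Its eigenvalues are the two square roots of `x`, so the abscissa behaves like `sqrt(x)` for `x > 0`, which shows the non-Lipschitz behavior near the minimizing set.

```python
import cmath

def spectral_abscissa_2x2(a, b, c, d):
    """Largest real part among the eigenvalues of [[a, b], [c, d]]."""
    tr = a + d
    det = a * d - b * c
    # Eigenvalues are the roots of lambda^2 - tr * lambda + det = 0.
    disc = cmath.sqrt(tr * tr - 4 * det)
    eigs = ((tr + disc) / 2, (tr - disc) / 2)
    return max(e.real for e in eigs)

def alpha(x):
    # Hypothetical one-parameter family A(x) = [[0, 1], [x, 0]].
    return spectral_abscissa_2x2(0.0, 1.0, x, 0.0)

def is_stable(a, b, c, d):
    # Continuous-time stability: the whole spectrum lies in the open left half plane.
    return spectral_abscissa_2x2(a, b, c, d) < 0.0
```

For this family, `alpha(x)` is zero for every `x <= 0` and grows like a square root for `x > 0`, so its slope is unbounded as `x` approaches zero from the right.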

    Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

    Asynchronous distributed algorithms are a popular way to reduce synchronization costs in large-scale optimization, and in particular for neural network training. However, for nonsmooth and nonconvex objectives, few convergence guarantees exist beyond cases where closed-form proximal operator solutions are available. As most popular contemporary deep neural networks lead to nonsmooth and nonconvex objectives, there is now a pressing need for such convergence guarantees. In this paper, we analyze for the first time the convergence of stochastic asynchronous optimization for this general class of objectives. In particular, we focus on stochastic subgradient methods allowing for block variable partitioning, where the shared-memory-based model is asynchronously updated by concurrent processes. To this end, we first introduce a probabilistic model which captures key features of real asynchronous scheduling between concurrent processes; under this model, we establish convergence with probability one to an invariant set for stochastic subgradient methods with momentum. From the practical perspective, one issue with the family of methods we consider is that it is not efficiently supported by machine learning frameworks, as they mostly focus on distributed data-parallel strategies. To address this, we propose a new implementation strategy for shared-memory-based training of deep neural networks, whereby concurrent parameter servers are utilized to train a partitioned but shared model in single- and multi-GPU settings. Based on this implementation, we achieve on average 1.2x speed-up in comparison to state-of-the-art training methods for popular image classification tasks without compromising accuracy.

    NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization

    We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves a Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between primal-dual based methods and a few primal only methods such as IAG/SAG/SAGA. Comment: 35 pages, 2 figures

    A Simple and Efficient Algorithm for Nonlinear Model Predictive Control

    We present PANOC, a new algorithm for solving optimal control problems arising in nonlinear model predictive control (NMPC). A common approach to this type of problem is sequential quadratic programming (SQP), which requires the solution of a quadratic program at every iteration and, consequently, inner iterative procedures. As a result, when the problem is ill-conditioned or the prediction horizon is large, each outer iteration becomes computationally very expensive. We propose a line-search algorithm that combines forward-backward iterations (FB) and Newton-type steps over the recently introduced forward-backward envelope (FBE), a continuous, real-valued, exact merit function for the original problem. The curvature information of Newton-type methods enables asymptotic superlinear rates under mild assumptions at the limit point, and the proposed algorithm is based on very simple operations: access to first-order information of the cost and dynamics and low-cost direct linear algebra. Neither an inner iterative procedure nor Hessian evaluation is required, making our approach computationally simpler than SQP methods. The low memory requirements and simple implementation make our method particularly suited for embedded NMPC applications.
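
The forward-backward (FB) iteration mentioned in the abstract can be sketched as a gradient step on the smooth cost followed by a proximal step on the nonsmooth term. The code below shows only plain FB, not PANOC itself: it has no Newton-type steps and no line search on the forward-backward envelope. The toy cost, the box constraint standing in for input bounds, and all names and parameter values are assumptions for illustration.

```python
def forward_backward(grad_f, prox_g, x0, gamma, iters=100):
    """Plain forward-backward splitting: x <- prox_g(x - gamma * grad_f(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        # Forward (gradient) step on the smooth part.
        y = [xi - gamma * gi for xi, gi in zip(x, g)]
        # Backward (proximal) step on the nonsmooth part.
        x = prox_g(y, gamma)
    return x

# Hypothetical smooth cost f(x) = 0.5 * ||x - target||^2.
TARGET = [3.0, -0.2]

def grad_f(x):
    return [xi - ti for xi, ti in zip(x, TARGET)]

def project_box(v, gamma):
    # Prox of the indicator function of the box [-1, 1]^n is the projection
    # onto the box; gamma is unused for indicator functions.
    return [min(1.0, max(-1.0, vi)) for vi in v]

x_star = forward_backward(grad_f, project_box, [0.0, 0.0], gamma=0.5)
```

In this toy problem the first coordinate settles on the active bound at 1 and the second converges to the unconstrained optimum at -0.2. Each iteration needs only a gradient and a cheap projection, which is the kind of simple operation the abstract highlights.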