1,398 research outputs found
Randomized Algorithms for Nonconvex Nonsmooth Optimization
Nonsmooth optimization problems arise in a variety of applications including robust control, robust optimization, eigenvalue optimization, compressed sensing, and decomposition methods for large-scale or complex optimization problems. When convexity is present, such problems are relatively easier to solve. Optimization methods for convex nonsmooth optimization have been studied for decades. For example, bundle methods are a leading technique for convex nonsmooth minimization. However, these and other methods that have been developed for solving convex problems are either inapplicable or can be inefficient when applied to solve nonconvex problems. The motivation of the work in this thesis is to design robust and efficient algorithms for solving nonsmooth optimization problems, particularly when nonconvexity is present.First, we propose an adaptive gradient sampling (AGS) algorithm, which is based on a recently developed technique known as the gradient sampling (GS) algorithm. Our AGS algorithm improves the computational efficiency of GS in critical ways. Then, we propose a BFGS gradient sampling (BFGS-GS) algorithm, which is a hybrid between a standard Broyden-Fletcher-Goldfarb-Shanno (BFGS) and the GS method. Our BFGS-GS algorithm is more efficient than our previously proposed AGS algorithm and also competitive with (and in some ways outperforms) other contemporary solvers for nonsmooth nonconvex optimization. Finally, we propose a few additional extensions of the GS framework---one in which we merge GS ideas with those from bundle methods, one in which we incorporate smoothing techniques in order to minimize potentially non-Lipschitz objective functions, and one in which we tailor GS methods for solving regularization problems. We describe all the proposed algorithms in detail. In addition, for all the algorithm variants, we prove global convergence guarantees under suitable assumptions. Moreover, we perform numerical experiments to illustrate the efficiency of our algorithms. The test problems considered in our experiments include academic test problems as well as practical problems that arise in applications of nonsmooth optimization
An Inequality Constrained SL/QP Method for Minimizing the Spectral Abscissa
We consider a problem in eigenvalue optimization, in particular finding a
local minimizer of the spectral abscissa - the value of a parameter that
results in the smallest value of the largest real part of the spectrum of a
matrix system. This is an important problem for the stabilization of control
systems. Many systems require the spectra to lie in the left half plane in
order for them to be stable. The optimization problem, however, is difficult to
solve because the underlying objective function is nonconvex, nonsmooth, and
non-Lipschitz. In addition, local minima tend to correspond to points of
non-differentiability and locally non-Lipschitz behavior. We present a
sequential linear and quadratic programming algorithm that solves a series of
linear or quadratic subproblems formed by linearizing the surfaces
corresponding to the largest eigenvalues. We present numerical results
comparing the algorithms to the state of the art
Multiobjective Robust Control with HIFOO 2.0
Multiobjective control design is known to be a difficult problem both in
theory and practice. Our approach is to search for locally optimal solutions of
a nonsmooth optimization problem that is built to incorporate minimization
objectives and constraints for multiple plants. We report on the success of
this approach using our public-domain Matlab toolbox HIFOO 2.0, comparing our
results with benchmarks in the literature
Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm
The nuclear norm is widely used as a convex surrogate of the rank function in
compressive sensing for low rank matrix recovery with its applications in image
recovery and signal processing. However, solving the nuclear norm based relaxed
convex problem usually leads to a suboptimal solution of the original rank
minimization problem. In this paper, we propose to perform a family of
nonconvex surrogates of -norm on the singular values of a matrix to
approximate the rank function. This leads to a nonconvex nonsmooth minimization
problem. Then we propose to solve the problem by Iteratively Reweighted Nuclear
Norm (IRNN) algorithm. IRNN iteratively solves a Weighted Singular Value
Thresholding (WSVT) problem, which has a closed form solution due to the
special properties of the nonconvex surrogate functions. We also extend IRNN to
solve the nonconvex problem with two or more blocks of variables. In theory, we
prove that IRNN decreases the objective function value monotonically, and any
limit point is a stationary point. Extensive experiments on both synthesized
data and real images demonstrate that IRNN enhances the low-rank matrix
recovery compared with state-of-the-art convex algorithms
Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees
Asynchronous distributed algorithms are a popular way to reduce
synchronization costs in large-scale optimization, and in particular for neural
network training. However, for nonsmooth and nonconvex objectives, few
convergence guarantees exist beyond cases where closed-form proximal operator
solutions are available. As most popular contemporary deep neural networks lead
to nonsmooth and nonconvex objectives, there is now a pressing need for such
convergence guarantees. In this paper, we analyze for the first time the
convergence of stochastic asynchronous optimization for this general class of
objectives. In particular, we focus on stochastic subgradient methods allowing
for block variable partitioning, where the shared-memory-based model is
asynchronously updated by concurrent processes. To this end, we first introduce
a probabilistic model which captures key features of real asynchronous
scheduling between concurrent processes; under this model, we establish
convergence with probability one to an invariant set for stochastic subgradient
methods with momentum.
From the practical perspective, one issue with the family of methods we
consider is that it is not efficiently supported by machine learning
frameworks, as they mostly focus on distributed data-parallel strategies. To
address this, we propose a new implementation strategy for shared-memory based
training of deep neural networks, whereby concurrent parameter servers are
utilized to train a partitioned but shared model in single- and multi-GPU
settings. Based on this implementation, we achieve on average 1.2x speed-up in
comparison to state-of-the-art training methods for popular image
classification tasks without compromising accuracy
- …