Global Solutions to Nonconvex Optimization of 4th-Order Polynomial and Log-Sum-Exp Functions
This paper presents a canonical dual approach for solving a nonconvex global
optimization problem governed by a sum of fourth-order polynomial and a
log-sum-exp function. Such a problem arises extensively in engineering and
sciences. Based on the canonical duality-triality theory, this nonconvex
problem is transformed into an equivalent dual problem, which can be solved
easily under certain conditions. We prove that both the global minimizer and the
largest local extrema of the primal problem can be obtained analytically from
the canonical dual solutions. As two special cases, a quartic polynomial
minimization and a minimax problem are discussed. Existence conditions are
derived, which can be used to classify easy and relatively hard instances.
Applications are illustrated by several nonconvex and nonsmooth examples.
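To make the problem class concrete, the following minimal Python sketch evaluates one representative objective of this form: a nonconvex quartic (double-well) term plus a convex log-sum-exp term. The matrices, vectors, and constants (A, b, f, nu, lam, mu) are illustrative placeholders, not the paper's notation, and the local solver is shown only to illustrate how different starting points land in different local minima, which is what motivates a global (canonical dual) approach.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.optimize import minimize

# Illustrative instance of the problem class only: a double-well quartic term
# plus a smoothed-max (log-sum-exp) term. All symbols are placeholders.
rng = np.random.default_rng(0)
n, m = 5, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
f = rng.standard_normal(n)
nu, lam, mu = 1.0, 2.0, 0.5

def objective(x):
    quartic = 0.5 * nu * (0.5 * x @ x - lam) ** 2   # nonconvex 4th-order term
    lse = mu * logsumexp((A @ x + b) / mu)          # convex log-sum-exp term
    return quartic + lse - f @ x

# A local solver started from different points typically returns different
# local minima of this nonconvex objective.
for x0 in [np.zeros(n), rng.standard_normal(n)]:
    res = minimize(objective, x0)
    print(res.fun, res.x.round(3))
```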
Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees
Asynchronous distributed algorithms are a popular way to reduce
synchronization costs in large-scale optimization, and in particular for neural
network training. However, for nonsmooth and nonconvex objectives, few
convergence guarantees exist beyond cases where closed-form proximal operator
solutions are available. As most popular contemporary deep neural networks lead
to nonsmooth and nonconvex objectives, there is now a pressing need for such
convergence guarantees. In this paper, we analyze for the first time the
convergence of stochastic asynchronous optimization for this general class of
objectives. In particular, we focus on stochastic subgradient methods allowing
for block variable partitioning, where the shared-memory-based model is
asynchronously updated by concurrent processes. To this end, we first introduce
a probabilistic model which captures key features of real asynchronous
scheduling between concurrent processes; under this model, we establish
convergence with probability one to an invariant set for stochastic subgradient
methods with momentum.
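A minimal sketch of the setting analysed here, assuming a toy nonsmooth objective (l1-regularized least squares): a shared parameter vector is partitioned into blocks, and concurrent threads apply momentum subgradient steps to their own block without locking. This is illustrative only; it is not the paper's algorithm or implementation, and Python threads do not provide true parallelism.

```python
import numpy as np
import threading

# Shared-memory model partitioned into blocks; each thread updates its own
# block asynchronously with a momentum subgradient step (illustrative sketch).
rng = np.random.default_rng(1)
n_blocks, block_dim, n_samples = 4, 10, 200
dim = n_blocks * block_dim
X = rng.standard_normal((n_samples, dim))
y = X @ rng.standard_normal(dim) + 0.1 * rng.standard_normal(n_samples)

w = np.zeros(dim)                  # shared (lock-free) model
momentum = np.zeros(dim)
lr, beta, reg = 1e-3, 0.9, 0.1

def worker(block):
    rng_local = np.random.default_rng(block)
    idx = slice(block * block_dim, (block + 1) * block_dim)
    for _ in range(2000):
        i = rng_local.integers(n_samples)
        resid = X[i] @ w - y[i]                        # reads the possibly stale shared model
        g = resid * X[i, idx] + reg * np.sign(w[idx])  # stochastic subgradient on own block
        momentum[idx] = beta * momentum[idx] + g
        w[idx] -= lr * momentum[idx]                   # asynchronous in-place block update

threads = [threading.Thread(target=worker, args=(b,)) for b in range(n_blocks)]
for t in threads: t.start()
for t in threads: t.join()
print("final loss:", 0.5 * np.mean((X @ w - y) ** 2))
```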
From the practical perspective, one issue with the family of methods we
consider is that it is not efficiently supported by machine learning
frameworks, as they mostly focus on distributed data-parallel strategies. To
address this, we propose a new implementation strategy for shared-memory based
training of deep neural networks, whereby concurrent parameter servers are
utilized to train a partitioned but shared model in single- and multi-GPU
settings. Based on this implementation, we achieve an average 1.2x speed-up
compared to state-of-the-art training methods on popular image classification
tasks without compromising accuracy.
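The partitioned parameter-server idea can be sketched as follows, again only as an illustration under assumed names and a toy gradient; the authors' single- and multi-GPU implementation is not reproduced here. Each "server" thread owns one block of the shared model and applies whatever gradient messages workers push for that block.

```python
import numpy as np
import queue, threading

# Hedged sketch of a partitioned parameter-server pattern (not the authors' code).
dim, n_blocks = 40, 4
block_dim = dim // n_blocks
w = np.zeros(dim)
queues = [queue.Queue() for _ in range(n_blocks)]

def server(b):
    idx = slice(b * block_dim, (b + 1) * block_dim)
    while True:
        g = queues[b].get()
        if g is None:                  # shutdown signal
            return
        w[idx] -= 0.01 * g             # the server applies updates to its block only

def worker(n_steps=500):
    rng = np.random.default_rng()
    for _ in range(n_steps):
        x = rng.standard_normal(dim)
        g = (x @ w - 1.0) * x          # toy gradient computed on the full model
        for b in range(n_blocks):      # route each block's slice to its server
            queues[b].put(g[b * block_dim:(b + 1) * block_dim])

servers = [threading.Thread(target=server, args=(b,)) for b in range(n_blocks)]
workers = [threading.Thread(target=worker) for _ in range(2)]
for t in servers + workers: t.start()
for t in workers: t.join()
for q in queues: q.put(None)
for t in servers: t.join()
print("model norm:", np.linalg.norm(w))
```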
Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations
In this paper, we propose a successive convex approximation framework for
sparse optimization where the nonsmooth regularization function in the
objective function is nonconvex and can be written as the difference of two
convex functions. The proposed framework is based on a nontrivial combination
of the majorization-minimization framework and the successive convex
approximation framework proposed in the literature for a convex regularization
function. The proposed framework has several attractive features, namely, i)
flexibility, as different choices of the approximate function lead to different
types of algorithms; ii) fast convergence, as the problem structure can be
better exploited by a proper choice of the approximate function and the
stepsize is calculated by line search; iii) low complexity, as the
approximate function is convex and the line search scheme is carried out over a
differentiable function; iv) guaranteed convergence to a stationary point. We
demonstrate these features by two example applications in subspace learning,
namely, the network anomaly detection problem and the sparse subspace
clustering problem. Customizing the proposed framework by adopting the
best-response type approximation, we obtain soft-thresholding with exact line
search algorithms for which all elements of the unknown parameter are updated
in parallel according to closed-form expressions. The attractive features of
the proposed algorithms are illustrated numerically.
Comment: Submitted to the IEEE Journal of Selected Topics in Signal Processing, special issue on Robust Subspace Learning.
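In the same spirit, the following sketch combines a reweighting (majorization-minimization) step for a capped-L1 regularizer, written as a difference of convex functions, with a parallel soft-thresholding best response and an exact line search on a toy sparse least-squares problem. The regularizer, weights, and constants are assumptions chosen for illustration and do not reproduce the paper's exact algorithm.

```python
import numpy as np

# Toy sparse least squares with a capped-L1 (difference-of-convex) regularizer.
rng = np.random.default_rng(2)
m, n, lam, theta = 80, 200, 0.1, 1.0
A = rng.standard_normal((m, n))
x_true = np.zeros(n); x_true[:5] = rng.standard_normal(5) * 3
y = A @ x_true + 0.01 * rng.standard_normal(m)

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(n)
d_diag = np.sum(A * A, axis=0)               # diagonal of A^T A
for it in range(100):
    w = lam * (np.abs(x) <= theta)           # reweighting (MM) step for the capped-L1 term
    grad = A.T @ (A @ x - y)
    bx = soft(x - grad / d_diag, w / d_diag) # parallel best response (soft-thresholding)
    d = bx - x
    Ad = A @ d
    num = grad @ d + np.sum(w * np.abs(bx)) - np.sum(w * np.abs(x))
    den = Ad @ Ad
    gamma = np.clip(-num / den, 0.0, 1.0) if den > 0 else 1.0   # exact line search
    x = x + gamma * d
print("support found:", np.flatnonzero(np.abs(x) > 1e-3))
```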
Optimizing condition numbers
In this paper we study the problem of minimizing condition numbers over a compact convex subset of the cone of symmetric positive semidefinite matrices. We show that the condition number is a Clarke regular strongly pseudoconvex function. We prove that a global solution of the problem can be approximated by an exact or an inexact solution of a nonsmooth convex program. This asymptotic analysis provides a valuable tool for designing an implementable algorithm for solving the problem of minimizing condition numbers.
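As a toy illustration of the objective being minimized, the sketch below evaluates the condition number kappa(A) = lambda_max(A) / lambda_min(A) along a one-dimensional compact convex subset of the positive semidefinite cone (the segment between two fixed positive definite matrices) and minimizes it by a bounded scalar search; the construction is purely illustrative and is not the algorithm proposed in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Minimize kappa((1 - t) * A0 + t * A1) over t in [0, 1], a simple compact
# convex subset of the PSD cone (illustrative only).
rng = np.random.default_rng(3)
n = 6
def random_spd():
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)
A0, A1 = random_spd(), random_spd()

def kappa(t):
    eig = np.linalg.eigvalsh((1 - t) * A0 + t * A1)  # eigenvalues, ascending order
    return eig[-1] / eig[0]

res = minimize_scalar(kappa, bounds=(0.0, 1.0), method="bounded")
print("best t:", round(res.x, 4), "condition number:", round(res.fun, 3))
```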