Bundle-Level Type Methods Uniformly Optimal for Smooth and Nonsmooth Convex Optimization
The main goal of this paper is to develop uniformly optimal first-order
methods for convex programming (CP). By uniform optimality we mean that the
first-order methods themselves do not require the input of any problem
parameters, but can still achieve the best possible iteration complexity
bounds. By incorporating a multi-step acceleration scheme into the well-known
bundle-level method, we develop an accelerated bundle-level (ABL) method, and
show that it can achieve the optimal complexity for solving a general class of
black-box CP problems without requiring the input of any smoothness
information, such as whether the problem is smooth, nonsmooth, or weakly
smooth, or the specific values of the Lipschitz constant and the smoothness
level. We then develop a more practical, restricted-memory version of this
method, namely the accelerated prox-level (APL) method. We investigate the
generalization of the APL method for solving certain composite CP problems and
an important class of saddle-point problems recently studied by Nesterov
[Mathematical Programming, 103 (2005), pp 127-152]. We present promising
numerical results for these new bundle-level methods applied to solve certain
classes of semidefinite programming (SDP) and stochastic programming (SP)
problems.

Comment: a combination of the previous two papers submitted to Mathematical
Programming, i.e., "Bundle-type methods uniformly optimal for smooth and
nonsmooth convex optimization" (December 2010) and "Level methods uniformly
optimal for composite and structured nonsmooth convex optimization" (April
2011).
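For reference, the classic bundle-level iteration that the ABL/APL methods accelerate proceeds as follows; this is a standard textbook form with level parameter $\lambda \in (0,1)$, and the notation here is illustrative rather than the paper's own:

```latex
% Classic bundle-level iteration: m_t is the cutting-plane model of f
% built from the subgradients g(x_i) collected so far.
\begin{align*}
  m_t(x)          &= \max_{1 \le i \le t} \{ f(x_i) + \langle g(x_i), x - x_i \rangle \}, \\
  \underline{f}_t &= \min_{x \in X} m_t(x), \qquad
  \bar{f}_t        = \min_{1 \le i \le t} f(x_i), \\
  \ell_t          &= \underline{f}_t + \lambda\,(\bar{f}_t - \underline{f}_t), \\
  x_{t+1}         &= \operatorname*{argmin}_{x \in X}
                     \bigl\{ \|x - x_t\|^2 : m_t(x) \le \ell_t \bigr\}.
\end{align*}
```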
Gradient Sliding for Composite Optimization
We consider in this paper a class of composite optimization problems whose
objective function is given by the summation of a general smooth and nonsmooth
component, together with a relatively simple nonsmooth term. We present a new
class of first-order methods, namely the gradient sliding algorithms, which can
skip the computation of the gradient for the smooth component from time to
time. As a consequence, these algorithms require only gradient evaluations for the smooth component in order
to find an -solution for the composite problem, while still
maintaining the optimal bound on the total number of
subgradient evaluations for the nonsmooth component. We then present a
stochastic counterpart for these algorithms and establish similar complexity
bounds for solving an important class of stochastic composite optimization
problems. Moreover, if the smooth component in the composite function is
strongly convex, the developed gradient sliding algorithms can significantly
reduce the number of gradient and subgradient evaluations for the smooth and
nonsmooth component to ${\cal O}(\log(1/\epsilon))$ and ${\cal O}(1/\epsilon)$, respectively. Finally, we generalize these algorithms to the
case when the smooth component is replaced by a nonsmooth one possessing a
certain bilinear saddle point structure.
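To make the sliding idea concrete, here is a minimal Python sketch under simplifying assumptions (unconstrained Euclidean setting, constant stepsizes, fixed inner-loop length); the function names and parameter schedule are illustrative and do not reproduce the paper's exact algorithm:

```python
import numpy as np

def gradient_sliding(grad_f, subgrad_h, x0, outer_iters=100, inner_iters=10,
                     stepsize=0.01):
    """Schematic gradient-sliding loop (illustrative, not the paper's scheme).

    Minimizes f(x) + h(x), with f smooth and h nonsmooth.  Each outer
    iteration calls grad_f once; the inner loop takes several cheap
    subgradient steps on h with that gradient held fixed ("sliding"),
    so gradient evaluations of f are skipped most of the time.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(outer_iters):
        g = grad_f(x)                 # single gradient of the smooth part
        u = x.copy()
        for _ in range(inner_iters):  # no grad_f calls inside this loop
            u -= stepsize * (g + subgrad_h(u))
        x = u
    return x

# Example: f(x) = 0.5*||x||^2 (smooth), h(x) = ||x||_1 (nonsmooth)
x_hat = gradient_sliding(lambda x: x, np.sign, np.ones(5))
```

Total oracle usage in the sketch is `outer_iters` gradient calls versus `outer_iters * inner_iters` subgradient calls, which mirrors the separation of the two complexity bounds above.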
Robust affine control of linear stochastic systems
In this work we provide a computationally tractable procedure for designing
affine control policies, applied to constrained, discrete-time, partially
observable, linear systems subject to set bounded disturbances, stochastic
noise and potentially Markovian switching over a finite horizon.
We investigate the situation when performance specifications are expressed
via averaged quadratic inequalities on the random state-control trajectory. Our
methodology also applies to steering the density of the state-control
trajectory under set bounded uncertainty. Our developments are based on
expanding the notion of affine policies that are functions of the so-called
"purified outputs", to the class of Markov jump linear systems. This
re-parametrization of the set of policies, induces a bi-affine structure in the
state and control variables that can further be exploited via robust
optimization techniques, with the approximate inhomogeneous -lemma being the
cornerstone. Tractability is understood in the sense that for each type of
performance specification considered, an explicit convex program for selecting
the parameters specifying the control policy is provided.
Our contribution to the existing literature on robust constrained control
lies in addressing a wider class of systems than those already studied, by
including Markovian switching, and in considering quadratic inequalities
rather than just linear ones. Our
work expands on the previous investigations on finite horizon covariance
control by addressing the robustness issue and the possibility that the full
state may not be available, thereby enabling the steering of the
state-control trajectory density in the presence of disturbances under partial
observation.
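For intuition, the purified-output construction can be sketched as follows in the nominal, non-switching case (notation illustrative, not the paper's): the key point is that $v_t$ depends on the disturbances and noise only, so the closed-loop state-control trajectory is bi-affine in the policy parameters $(h, H)$ and the uncertainties.

```latex
% Run a noise-free copy of the system in parallel, driven by the same
% controls, and subtract its output from the measured one.
\begin{align*}
  \hat{x}_{t+1} &= A_t \hat{x}_t + B_t u_t, \qquad \hat{y}_t = C_t \hat{x}_t, \\
  v_t &= y_t - \hat{y}_t
      && \text{purified output (independent of the policy)}, \\
  u_t &= h_t + \textstyle\sum_{s=0}^{t} H_{t,s}\, v_s
      && \text{affine policy in purified outputs.}
\end{align*}
```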
Random gradient extrapolation for distributed and stochastic optimization
In this paper, we consider a class of finite-sum convex optimization problems
defined over a distributed multiagent network with agents connected to a
central server. In particular, the objective function consists of the average
of $m$ ($m \ge 1$) smooth components associated with each network agent together
with a strongly convex term. Our major contribution is to develop a new
randomized incremental gradient algorithm, namely random gradient extrapolation
method (RGEM), which does not require any exact gradient evaluation even for
the initial point, but can achieve the optimal ${\cal O}(\log(1/\epsilon))$
complexity bound in terms of the total number of gradient evaluations of
component functions to solve the finite-sum problems. Furthermore, we
demonstrate that for stochastic finite-sum optimization problems, RGEM
maintains the optimal ${\cal O}(1/\epsilon)$ complexity (up to a certain
logarithmic factor) in terms of the number of stochastic gradient computations,
but attains an ${\cal O}(\log(1/\epsilon))$ complexity in terms of
communication rounds (each round involves only one agent). It is worth noting
that the former bound is independent of the number of agents $m$, while the
latter one depends only linearly on $m$, or even on $\sqrt{m}$ for ill-conditioned
problems. To the best of our knowledge, this is the first time that these
complexity bounds have been obtained for distributed and stochastic
optimization problems. Moreover, our algorithms were developed based on a novel
dual perspective of Nesterov's accelerated gradient method.
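The access pattern behind such randomized incremental methods can be sketched as follows; this is only a generic illustration (one component gradient per iteration, gradient memory, extrapolation), not RGEM's actual recursion or stepsize policy:

```python
import numpy as np

def rand_incremental_extrap(grads, x0, mu=1.0, eta=0.1, alpha=1.0, iters=2000,
                            seed=0):
    """Generic randomized incremental gradient loop with extrapolation.

    Schematic only -- RGEM's precise recursion and dual derivation differ.
    Each iteration queries ONE randomly chosen component gradient (one
    "agent" per round), and no full gradient is computed, not even at the
    initial point.  mu is the modulus of the simple term (mu/2)*||x||^2.
    """
    rng = np.random.default_rng(seed)
    m = len(grads)
    x = np.asarray(x0, dtype=float)
    g_table = [np.zeros_like(x) for _ in range(m)]  # remembered component grads
    g_avg = np.zeros_like(x)                        # running average of the table
    for _ in range(iters):
        i = rng.integers(m)                         # one agent per round
        g_new = grads[i](x)
        delta = g_new - g_table[i]
        g_avg += delta / m                          # keep the average in sync
        g_table[i] = g_new
        y = g_avg + (alpha / m) * delta             # extrapolated gradient estimate
        x = (x - eta * y) / (1.0 + eta * mu)        # prox step for (mu/2)||x||^2
    return x

# Example: f_i(x) = 0.5*||x - a_i||^2, plus the strongly convex term above
a = [np.full(4, c) for c in (1.0, 2.0, 3.0)]
x_hat = rand_incremental_extrap([lambda x, ai=ai: x - ai for ai in a], np.zeros(4))
```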
Asynchronous decentralized accelerated stochastic gradient descent
In this work, we introduce an asynchronous decentralized accelerated
stochastic gradient descent type of method for decentralized stochastic
optimization, considering communication and synchronization are the major
bottlenecks. We establish ${\cal O}(1/\epsilon)$ (resp.,
${\cal O}(1/\sqrt{\epsilon})$) communication complexity and
${\cal O}(1/\epsilon^2)$ (resp., ${\cal O}(1/\epsilon)$) sampling
complexity for solving general convex (resp., strongly convex) problems.
Algorithms for stochastic optimization with functional or expectation constraints
This paper considers the problem of minimizing an expectation function over a
closed convex set, coupled with a functional or expectation
constraint on either decision variables or problem parameters. We first present
a new stochastic approximation (SA) type algorithm, namely the cooperative SA
(CSA), to handle problems with the constraint on decision variables. We show
that this algorithm exhibits the optimal ${\cal O}(1/\epsilon^2)$ rate of
convergence, in terms of both optimality gap and constraint violation, when the
objective and constraint functions are generally convex, where $\epsilon$
denotes the targeted accuracy for both the optimality gap and the infeasibility.
Moreover, we show that this rate of convergence can be improved to
${\cal O}(1/\epsilon)$ if the objective and
constraint functions are strongly convex. We then present a variant of CSA,
namely the cooperative stochastic parameter approximation (CSPA) algorithm, to
deal with the situation when the constraint is defined over problem parameters
and show that it exhibits a similar optimal rate of convergence to CSA. It is
worth noting that CSA and CSPA are primal methods which do not require the
iterations on the dual space and/or the estimation on the size of the dual
variables. To the best of our knowledge, this is the first time that such
optimal SA methods for solving functional or expectation constrained stochastic
optimization are presented in the literature.
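The cooperative switching logic of CSA-type methods can be sketched as follows; the tolerance and stepsize choices below are illustrative placeholders rather than the paper's schedules:

```python
import numpy as np

def cooperative_sa(sgrad_f, sgrad_g, g_est, x0, project, iters=1000,
                   gamma=0.01, tol=1e-2):
    """Schematic cooperative-SA loop (illustrative parameters).

    Minimize E[F(x, xi)] s.t. g(x) <= 0 over a closed convex set.
    At each step: if the (estimated) constraint value exceeds a
    tolerance, take a stochastic subgradient step on the CONSTRAINT;
    otherwise step on the OBJECTIVE.  The output averages only the
    objective-step iterates, in the spirit of Polyak-style constrained SA.
    """
    x = np.asarray(x0, dtype=float)
    kept = []
    for _ in range(iters):
        if g_est(x) > tol:
            d = sgrad_g(x)      # reduce infeasibility
        else:
            d = sgrad_f(x)      # reduce the objective
            kept.append(x.copy())
        x = project(x - gamma * d)
    return np.mean(kept, axis=0) if kept else x

# Example: minimize E[||x - xi||^2] s.t. sum(x) - 1 <= 0 over the box [0, 2]^3
rng = np.random.default_rng(1)
n = 3
x_hat = cooperative_sa(
    sgrad_f=lambda x: 2 * (x - rng.normal(0.5, 0.1, n)),  # stochastic subgradient
    sgrad_g=lambda x: np.ones(n),                         # subgradient of sum(x)-1
    g_est=lambda x: x.sum() - 1.0,
    x0=np.full(n, 2.0),
    project=lambda x: np.clip(x, 0.0, 2.0),
)
```

Note that the method is purely primal, matching the abstract: no dual iterates or dual-size estimates appear anywhere in the loop.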
Accelerated Gradient Methods for Nonconvex Nonlinear and Stochastic Programming
In this paper, we generalize the well-known Nesterov's accelerated gradient
(AG) method, originally designed for convex smooth optimization, to solve
nonconvex and possibly stochastic optimization problems. We demonstrate that by
properly specifying the stepsize policy, the AG method exhibits the best known
rate of convergence for solving general nonconvex smooth optimization problems
by using first-order information, similarly to the gradient descent method. We
then consider an important class of composite optimization problems and show
that the AG method can solve them uniformly, i.e., by using the same aggressive
stepsize policy as in the convex case, even if the problem turns out to be
nonconvex. We demonstrate that the AG method exhibits an optimal rate of
convergence if the composite problem is convex, and improves the best known
rate of convergence if the problem is nonconvex. Based on the AG method, we
also present new nonconvex stochastic approximation methods and show that they
can improve a few existing rates of convergence for nonconvex stochastic
optimization. To the best of our knowledge, this is the first time that the
convergence of the AG method has been established for solving nonconvex
nonlinear programming in the literature.
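For reference, the AG method being generalized can be written in the two-stepsize form below, where $\lambda_k$ and $\beta_k$ are set by the stepsize policy; this is a standard way of writing the scheme, and the notation may differ from the paper's:

```latex
% Two-stepsize form of Nesterov's accelerated gradient (AG) method:
\begin{align*}
  x^{md}_k &= (1-\alpha_k)\, x^{ag}_{k-1} + \alpha_k\, x_{k-1}, \\
  x_k      &= x_{k-1} - \lambda_k \nabla f\!\bigl(x^{md}_k\bigr), \\
  x^{ag}_k &= x^{md}_k - \beta_k \nabla f\!\bigl(x^{md}_k\bigr).
\end{align*}
% The same recursion is run in both the convex and the nonconvex case;
% only the choice of the stepsizes lambda_k, beta_k is at issue.
```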
Dynamic Stochastic Approximation for Multi-stage Stochastic Optimization
In this paper, we consider multi-stage stochastic optimization problems with
convex objectives and conic constraints at each stage. We present a new
stochastic first-order method, namely the dynamic stochastic approximation
(DSA) algorithm, for solving these types of stochastic optimization problems.
We show that DSA can achieve an optimal ${\cal O}(1/\epsilon^4)$ rate of
convergence in terms of the total number of required scenarios when applied to
a three-stage stochastic optimization problem. We further show that this rate
of convergence can be improved to ${\cal O}(1/\epsilon^2)$ when the objective
function is strongly convex. We also discuss variants of DSA for solving more
general multi-stage stochastic optimization problems with the number of stages
$T > 3$. The developed DSA algorithms only need to go through the scenario tree
once in order to compute an $\epsilon$-solution of the multi-stage stochastic
optimization problem. As a result, the memory required by DSA only grows
linearly with respect to the number of stages. To the best of our knowledge,
this is the first time that stochastic approximation type methods are
generalized for multi-stage stochastic optimization with $T \ge 3$.
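For concreteness, a three-stage instance of the class considered has the nested form below, with the convex objectives $f_t$ and the conic constraints at each stage absorbed into the feasible sets (notation illustrative):

```latex
% Nested three-stage stochastic program; xi_2, xi_3 are the stage-wise
% random data and X_t encodes the conic constraints at stage t.
\begin{equation*}
  \min_{x_1 \in X_1} \; f_1(x_1)
  + \mathbb{E}_{\xi_2}\!\Bigl[ \min_{x_2 \in X_2(x_1,\xi_2)} f_2(x_2,\xi_2)
  + \mathbb{E}_{\xi_3}\!\bigl[ \min_{x_3 \in X_3(x_2,\xi_3)} f_3(x_3,\xi_3) \bigr] \Bigr].
\end{equation*}
```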
Randomized First-Order Methods for Saddle Point Optimization
In this paper, we present novel randomized algorithms for solving saddle
point problems whose dual feasible region is given by the direct product of
many convex sets. Our algorithms can achieve ${\cal O}(1/\epsilon)$ and ${\cal O}(1/\sqrt{\epsilon})$ rates of convergence, respectively, for general bilinear saddle point
and smooth bilinear saddle point problems based on a new primal-dual termination
criterion, and each iteration of these algorithms needs to solve only one
randomly selected dual subproblem. Moreover, these algorithms do not require
strongly convex assumptions on the objective function and/or the incorporation
of a strongly convex perturbation term. They do not necessarily require the
primal or dual feasible regions to be bounded or the estimation of the distance
from the initial point to the set of optimal solutions to be available either.
We show that when applied to linearly constrained problems, these randomized
primal-dual (RPD) methods are equivalent
to certain randomized variants of the alternating direction method of
multipliers (ADMM), while a direct extension of ADMM does not necessarily
converge when the number of blocks exceeds two.
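The problem class can be sketched as follows: the dual domain is a product of $p$ sets, and each iteration updates a single randomly selected dual block (notation illustrative, not the paper's):

```latex
% Saddle point problem with product dual domain Y = Y_1 x ... x Y_p;
% each iteration solves the subproblem for ONE random block y_i.
\begin{equation*}
  \min_{x \in X} \;
  \max_{y = (y_1,\dots,y_p) \in Y_1 \times \cdots \times Y_p} \;
  f(x) + \sum_{i=1}^{p} \langle A_i x,\, y_i \rangle
       - \sum_{i=1}^{p} h_i(y_i).
\end{equation*}
```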
An optimal randomized incremental gradient method
In this paper, we consider a class of finite-sum convex optimization problems
whose objective function is given by the summation of $m$ ($m \ge 1$) smooth
components together with some other relatively simple terms. We first introduce
a deterministic primal-dual gradient (PDG) method that can achieve the optimal
black-box iteration complexity for solving these composite optimization
problems using a primal-dual termination criterion. Our major contribution is
to develop a randomized primal-dual gradient (RPDG) method, which needs to
compute the gradient of only one randomly selected smooth component at each
iteration, but can possibly achieve better complexity than PDG in terms of the
total number of gradient evaluations. More specifically, we show that the total
number of gradient evaluations performed by RPDG can be ${\cal O}(\sqrt{m})$
times smaller, both in expectation and with high probability, than those
performed by deterministic optimal first-order methods under favorable
situations. We also show that the complexity of the RPDG method is not
improvable by developing a new lower complexity bound for a general class of
randomized methods for solving large-scale finite-sum convex optimization
problems. Moreover, through the development of PDG and RPDG, we introduce a
novel game-theoretic interpretation for these optimal methods for convex
optimization.
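The dual perspective mentioned above can be sketched via Fenchel conjugates: replacing each smooth component by its conjugate representation turns the problem into a primal-dual game, and randomizing over the dual blocks corresponds to computing one component gradient per iteration (notation illustrative; $h$ stands for the relatively simple terms):

```latex
% Fenchel-conjugate reformulation underlying PDG/RPDG-style methods:
% f_i(x) = max_{g_i} <x, g_i> - f_i^*(g_i) for each smooth component.
\begin{equation*}
  \min_{x \in X} \; h(x) + \frac{1}{m}\sum_{i=1}^{m} f_i(x)
  \;=\;
  \min_{x \in X} \; \max_{g_1,\dots,g_m} \; h(x)
  + \frac{1}{m}\sum_{i=1}^{m} \bigl( \langle x, g_i \rangle - f_i^{*}(g_i) \bigr).
\end{equation*}
% Ascending in one randomly chosen dual block g_i per iteration amounts
% to evaluating the gradient of a single component f_i.
```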