1,362 research outputs found
Accelerated Optimization Landscape of Linear-Quadratic Regulator
The linear-quadratic regulator (LQR), a landmark problem in the field of
optimal control, is the subject of this paper. Generally, LQR is
classified into state-feedback LQR (SLQR) and output-feedback LQR (OLQR) based
on whether the full state is available. The existing
literature suggests that both the SLQR and the OLQR can be viewed as
\textit{constrained nonconvex matrix optimization} problems in which the only
variable to be optimized is the feedback gain matrix. In this paper, we
introduce a first-order accelerated optimization framework for handling the LQR
problem, and give its convergence analysis for the cases of SLQR and OLQR,
respectively.
Specifically, a Lipschitz Hessian property of the LQR performance criterion is
presented, which turns out to be a crucial property for the application of
modern optimization techniques. For the SLQR problem, a continuous-time hybrid
dynamic system is introduced, whose solution trajectory is shown to converge
exponentially to the optimal feedback gain with Nesterov-optimal order
(in terms of the condition number). Then, the
symplectic Euler scheme is utilized to discretize the hybrid dynamic system,
and a Nesterov-type method with a restarting rule is proposed that preserves
the continuous-time convergence rate, i.e., the discretized algorithm admits
the Nesterov-optimal convergence order. For the OLQR problem, a Hessian-free
accelerated framework is proposed, which is a two-procedure method consisting
of semiconvex function optimization and negative curvature exploitation. The
method can find an $\epsilon$-stationary point of the performance criterion
with a time complexity that improves upon that of vanilla
gradient descent. Moreover, our method provides a second-order guarantee for
the stationary point.
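To make the idea of accelerated descent on the feedback gain concrete, the following is a minimal sketch on a toy scalar continuous-time LQR problem. It is not the paper's algorithm: it uses a generic Nesterov-type momentum step with a function-value restart instead of the symplectic Euler discretization of the hybrid system, and the system parameters, step size, and function names are all illustrative assumptions.

```python
import math

# Toy scalar continuous-time LQR (all values are illustrative assumptions):
#   dx/dt = A*x + B*u,  u = -k*x,  cost = integral of (Q*x^2 + R*u^2) dt,  x(0) = 1.
A, B, Q, R = 1.0, 1.0, 3.0, 1.0


def is_stabilizing(k):
    # The closed loop dx/dt = (A - B*k)*x is stable iff B*k - A > 0.
    return B * k - A > 0.0


def cost(k):
    # Closed-form cost of gain k for x(0) = 1; valid only for stabilizing k.
    return (Q + R * k * k) / (2.0 * (B * k - A))


def grad(k):
    # Derivative of cost(k) with respect to the gain k.
    d = B * k - A
    return (2.0 * R * k * d - B * (Q + R * k * k)) / (2.0 * d * d)


def accelerated_descent(k0, step=0.2, iters=500):
    k_prev = k = k0
    t = 0  # momentum counter, reset to zero on every restart
    for _ in range(iters):
        beta = t / (t + 3.0)
        y = k + beta * (k - k_prev)  # extrapolated (look-ahead) point
        if not is_stabilizing(y):
            y, t = k, 0
        cand = y - step * grad(y)
        # Restart rule: if the momentum step leaves the stabilizing set or
        # increases the cost, drop the momentum and take a plain gradient step.
        if not is_stabilizing(cand) or cost(cand) > cost(k):
            t = 0
            cand = k - step * grad(k)
        k_prev, k = k, cand
        t += 1
    return k


k_opt = accelerated_descent(k0=5.0)
# Reference value from the scalar algebraic Riccati equation.
k_riccati = (A + math.sqrt(A * A + B * B * Q / R)) / B
print(k_opt, k_riccati)
```

The restart keeps the cost non-increasing and the gain stabilizing at every iteration, which is why the sketch can compare its result against the Riccati-based gain directly.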
Decentralized Stochastic Linear-Quadratic Optimal Control with Risk Constraint and Partial Observation
This paper addresses a risk-constrained decentralized stochastic
linear-quadratic optimal control problem with one remote controller and one
local controller, where the risk constraint is posed on the cumulative state
weighted variance in order to reduce the oscillation of the system trajectory. In
this model, the local controller can only partially observe the system state and
sends its state estimate to the remote controller through an unreliable channel,
whereas the channel from the remote controller to the local controller is perfect. For
the considered constrained optimization problem, we first incorporate the risk
constraint into the cost function via the Lagrange multiplier method, and the
resulting augmented cost function includes a quadratic mean-field term of the
state. Subsequently, for any fixed multiplier, explicit solutions to
finite-horizon and infinite-horizon mean-field decentralized linear-quadratic
problems are derived, together with a necessary and sufficient condition for the
mean-square stability of the optimal system. Then, an approach for finding the optimal
Lagrange multiplier based on the bisection method is presented. Finally, two
numerical examples are given to show the efficiency of the obtained results.
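The bisection search over the Lagrange multiplier can be illustrated on a much simpler problem than the one in the paper. The sketch below assumes a toy scalar problem, minimize (x - a)^2 subject to x^2 <= bound, whose inner minimizer for a fixed multiplier is available in closed form. The problem, the function names, and the numbers are hypothetical stand-ins for the decentralized control problem.

```python
def inner_minimizer(lam, a):
    # Minimizer of the augmented cost (x - a)^2 + lam * (x^2 - bound) for fixed lam.
    return a / (1.0 + lam)


def constraint_gap(lam, a, bound):
    # Positive when the constraint x^2 <= bound is violated at the inner minimizer.
    x = inner_minimizer(lam, a)
    return x * x - bound


def bisect_multiplier(a, bound, tol=1e-10):
    lo, hi = 0.0, 1.0
    if constraint_gap(lo, a, bound) <= 0.0:
        return lo  # constraint inactive, so a zero multiplier is optimal
    while constraint_gap(hi, a, bound) > 0.0:
        hi *= 2.0  # grow the bracket until the constraint is satisfied
    # The gap is decreasing in lam, so bisection finds its zero crossing.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if constraint_gap(mid, a, bound) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_star = bisect_multiplier(a=2.0, bound=1.0)
x_star = inner_minimizer(lam_star, a=2.0)
print(lam_star, x_star)
```

Bisection applies here because the constraint value at the inner minimizer decreases monotonically as the multiplier grows. Whether the same monotonicity holds in the paper's setting is established there, not by this sketch.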
Policy Optimization of Finite-Horizon Kalman Filter with Unknown Noise Covariance
This paper studies learning the Kalman gain via policy optimization.
First, we reformulate the finite-horizon Kalman filter as a policy
optimization problem of the dual system. Second, we obtain the global linear
convergence of the exact gradient descent method in the setting of known
parameters. Third, gradient estimation and a stochastic gradient descent
method are proposed to solve the policy optimization problem, and the
global linear convergence and sample complexity of stochastic gradient descent
are provided for the setting of unknown noise covariance matrices and known
model parameters.
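As a small illustration of treating the filter gain as an optimization variable, the sketch below runs exact gradient descent on the posterior error variance of a one-step scalar estimator and compares the result with the closed-form Kalman gain. It assumes known variances and a single scalar measurement y = x + v, so it covers only the known-parameter case in a heavily simplified form. The names and numbers are illustrative and not taken from the paper.

```python
def error_variance(gain, prior_var, noise_var):
    # Variance of x - x_hat for the update x_hat = x_prior + gain * (y - x_prior).
    return (1.0 - gain) ** 2 * prior_var + gain ** 2 * noise_var


def variance_gradient(gain, prior_var, noise_var):
    # Derivative of error_variance with respect to the gain.
    return -2.0 * (1.0 - gain) * prior_var + 2.0 * gain * noise_var


def learn_gain(prior_var, noise_var, step=0.1, iters=200, gain=0.0):
    # Plain gradient descent on the gain; the objective is a convex quadratic.
    for _ in range(iters):
        gain -= step * variance_gradient(gain, prior_var, noise_var)
    return gain


prior_var, noise_var = 2.0, 1.0
learned_gain = learn_gain(prior_var, noise_var)
kalman_gain = prior_var / (prior_var + noise_var)
print(learned_gain, kalman_gain)
```

Because the objective is a strongly convex quadratic in the gain, the error contracts by a fixed factor per step, which is the scalar analogue of linear convergence.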
Continuous-time Mean-Variance Portfolio Selection with Stochastic Parameters
This paper studies a continuous-time market under a stochastic environment
where an agent, having specified an investment horizon and a target terminal
mean return, seeks to minimize the variance of the return with multiple stocks
and a bond. In the considered model, first proposed in [3], the mean returns
of individual assets are explicitly affected by underlying Gaussian economic
factors. Using past and present information of the asset prices, a
partial-information stochastic optimal control problem with random coefficients
is formulated. Here, the partial information is due to the fact that the
economic factors cannot be directly observed. Via dynamic programming theory,
the optimal portfolio strategy can be constructed by solving a deterministic
forward Riccati-type ordinary differential equation and two linear
deterministic backward ordinary differential equations.
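For readers unfamiliar with forward Riccati-type equations, the following sketch integrates a generic scalar filtering Riccati ODE, dP/dt = 2aP + q - (cP)^2 / r, with the forward Euler method. This is a textbook form chosen for illustration; the paper's actual equation, its coefficients, and the two backward linear equations are not reproduced here, and every parameter value below is an assumption.

```python
def riccati_rhs(p, a, c, q, r):
    # Right-hand side of the scalar filtering Riccati ODE.
    return 2.0 * a * p + q - (c * p) ** 2 / r


def integrate_forward(p0, a, c, q, r, horizon, dt):
    # Forward Euler from t = 0 to t = horizon, starting at P(0) = p0.
    steps = int(round(horizon / dt))
    p = p0
    for _ in range(steps):
        p += dt * riccati_rhs(p, a, c, q, r)
    return p


# With a = -1, c = 1, q = 3, r = 1 the positive equilibrium solves
# P^2 + 2P - 3 = 0, i.e. P = 1.
p_final = integrate_forward(p0=0.0, a=-1.0, c=1.0, q=3.0, r=1.0,
                            horizon=10.0, dt=1e-3)
print(p_final)
```

The equation is solved forward from an initial condition, in contrast to the control Riccati equation, which runs backward from a terminal condition.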