The Modified MSA, a Gradient Flow and Convergence
The modified Method of Successive Approximations (MSA) is an iterative scheme
for approximating solutions to stochastic control problems in continuous time,
based on Pontryagin's optimality principle, which, starting from an initial
open-loop control, solves the forward equation, the backward adjoint equation and
then performs a static minimization step. We observe that this is an implicit
Euler scheme for a gradient flow system. We prove that appropriate
interpolations of the iterates of the modified MSA converge to a gradient flow
at an explicit rate. We then study the convergence of this gradient flow as time
goes to infinity. In the general (non-convex) case we prove that the gradient
term itself converges to zero. This is a consequence of an energy identity
which shows that the optimization objective decreases along the gradient flow.
Moreover, in the convex case, when Pontryagin's optimality principle provides a
sufficient condition for optimality, we prove that the optimization objective
converges to its optimal value at an explicit rate, and at an exponential rate
under strong convexity. The main technical difficulties lie in obtaining
appropriate properties of the Hamiltonian (growth, continuity). These are
obtained by utilising the theory of Bounded Mean Oscillation (BMO) martingales,
which is required for estimates on the adjoint backward stochastic differential
equation (BSDE).
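For orientation, one sweep of an MSA-type scheme can be sketched in generic
Pontryagin notation, with drift $b$, diffusion $\sigma$, running cost $f$,
terminal cost $g$, control $\alpha$ and step size $\tau>0$ (the notation is
illustrative, and the quadratic proximal term below only mirrors the
implicit-Euler reading; the paper's precise modified minimization step is
defined there):
\begin{align*}
& H(t,x,a,y,z) := b(t,x,a)\cdot y + \mathrm{tr}\big(\sigma(t,x,a)^{\top} z\big) + f(t,x,a),\\
& \text{forward: } dX^{n}_t = b(t,X^{n}_t,\alpha^{n}_t)\,dt + \sigma(t,X^{n}_t,\alpha^{n}_t)\,dW_t, \quad X^{n}_0 = x_0,\\
& \text{adjoint: } dY^{n}_t = -\partial_x H(t,X^{n}_t,\alpha^{n}_t,Y^{n}_t,Z^{n}_t)\,dt + Z^{n}_t\,dW_t, \quad Y^{n}_T = \partial_x g(X^{n}_T),\\
& \text{update: } \alpha^{n+1}_t \in \arg\min_{a}\Big[ H(t,X^{n}_t,a,Y^{n}_t,Z^{n}_t) + \tfrac{1}{2\tau}\,|a - \alpha^{n}_t|^2 \Big].
\end{align*}
Read this way, one iteration is a proximal (implicit Euler) step for the
underlying gradient flow, which is the viewpoint developed in the paper.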
Gradient Flows for Regularized Stochastic Control Problems
This paper studies stochastic control problems regularized by the relative
entropy, where the action space is the space of measures. This setting includes
relaxed control problems, problems of finding Markovian controls with the
control function replaced by an idealized infinitely wide neural network, and
can be extended to the search for causal optimal transport maps. By exploiting
the Pontryagin optimality principle, we identify a suitable metric space on which
we construct a gradient flow for the measure-valued control process, along which
the cost functional is guaranteed to decrease. It is shown that under
appropriate conditions, this gradient flow has an invariant measure which is
the optimal control for the regularized stochastic control problem. If the
problem we work with is sufficiently convex, the gradient flow converges
exponentially fast. Furthermore, the optimal measure-valued control admits a
Bayesian interpretation, which means that one can incorporate prior knowledge
when solving stochastic control problems. This work is motivated by a desire to
extend the theoretical underpinning for the convergence of stochastic gradient
type algorithms widely used in the reinforcement learning community to solve
control problems.
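To fix ideas, an entropy-regularized relaxed control objective of the kind
described above can be written schematically as follows (illustrative notation;
the paper's exact formulation may differ):
\begin{align*}
dX_t &= \int_A b(t,X_t,a)\,m_t(da)\,dt + \sigma(t,X_t)\,dW_t,\\
J^{\tau}(m) &= \mathbb{E}\Big[\int_0^T\!\!\int_A f(t,X_t,a)\,m_t(da)\,dt + g(X_T)\Big]
 + \tau\,\mathbb{E}\Big[\int_0^T \mathrm{KL}\big(m_t \,\|\, \pi\big)\,dt\Big],
\end{align*}
where $m=(m_t)_{t\in[0,T]}$ is the measure-valued control, $\pi$ is a fixed
reference measure on the action space $A$ and $\tau>0$ is the regularization
strength. The relative-entropy term is what gives the optimal control its
Bayesian reading: $\pi$ plays the role of a prior which the optimal $m_t$
reweights.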
Itô formula for processes taking values in intersection of finitely many Banach spaces
Motivated by applications to SPDEs we extend the Itô formula for the square
of the norm of a semimartingale from Gyöngy and Krylov (Stochastics
6(3):153-173, 1982) to the case \begin{equation*} \sum_{i=1}^m \int_{(0,t]}
v_i^{\ast}(s)\,dA(s) + h(t)=:y(t)\in V \quad dA\times\mathbb{P}\text{-a.e.},
\end{equation*} where $A$ is an increasing right-continuous adapted process,
$v_i^{\ast}$ is a progressively measurable process with values in $V_i^{\ast}$,
the dual of a Banach space $V_i$, $h$ is a cadlag martingale with values in a
Hilbert space $H$, identified with its dual $H^{\ast}$, and $V := V_1 \cap V_2
\cap \ldots \cap V_m$ is continuously and densely embedded in $H$.
The formula is proved under the condition that $\|y\|_{V_i}^{p_i}$ and
$\|v_i^{\ast}\|_{V_i^{\ast}}^{q_i}$ are almost surely locally integrable with
respect to $dA$ for some conjugate exponents $p_i, q_i$. This condition is
essentially weaker than the one which would arise in application of the results
in Gyöngy and Krylov (Stochastics 6(3):153-173, 1982) to the semimartingale
above.
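For orientation, in the classical special case where $dA(s)=ds$ and $h$ is a
continuous martingale (so that $y$ is continuous and no jump terms appear), an
Itô formula for the square of the norm takes the familiar form
\begin{equation*}
\|y(t)\|_H^2 = \|h(0)\|_H^2
+ 2\sum_{i=1}^m \int_0^t \big\langle v_i^{\ast}(s),\, y(s)\big\rangle\,ds
+ 2\int_0^t \big( y(s),\, dh(s)\big)_H + \langle h\rangle_t,
\end{equation*}
where $\langle\cdot,\cdot\rangle$ denotes the duality pairing between
$V_i^{\ast}$ and $V_i$ and $\langle h\rangle$ the quadratic variation of $h$.
The cadlag setting treated in the paper involves additional jump terms and the
weaker integrability condition stated above.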
A modified MSA for stochastic control problems
The classical Method of Successive Approximations (MSA) is an iterative
method for solving stochastic control problems and is derived from Pontryagin's
optimality principle. It is known that the MSA may fail to converge. Using
careful estimates for the backward stochastic differential equation (BSDE), this
paper suggests a modification to the MSA algorithm. This modified MSA is shown
to converge for general stochastic control problems with control in both the
drift and diffusion coefficients. Under some additional assumptions, a rate of
convergence is established. The results are valid without restrictions on the time
horizon of the control problem, in contrast to iterative methods based on the
theory of forward-backward stochastic differential equations.
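As a purely illustrative sketch of the loop structure (forward pass, backward
adjoint pass, minimization step), here is a deterministic, discretized toy
example in Python; it is not the paper's algorithm (no diffusion control, no
BSDE), and the quadratic damping term merely stands in for the modification
discussed above:

import numpy as np

# Illustrative toy: minimise J(alpha) = int_0^T (0.5*x_t^2 + 0.5*alpha_t^2) dt
# subject to dx/dt = alpha_t, x_0 = 1.  Hamiltonian H(x,a,y) = a*y + 0.5*x^2 + 0.5*a^2.
T, N, tau = 1.0, 200, 0.5          # horizon, time steps, damping parameter (all illustrative)
dt = T / N
alpha = np.zeros(N)                # initial open-loop control guess
cost = None

for it in range(50):
    # forward pass: Euler scheme for the controlled state
    x = np.empty(N + 1)
    x[0] = 1.0
    for k in range(N):
        x[k + 1] = x[k] + alpha[k] * dt
    cost = 0.5 * dt * np.sum(x[:N] ** 2 + alpha ** 2)

    # backward pass: adjoint equation dy/dt = -dH/dx = -x, with y(T) = 0
    y = np.empty(N + 1)
    y[N] = 0.0
    for k in reversed(range(N)):
        y[k] = y[k + 1] + x[k + 1] * dt

    # minimisation step with a quadratic damping/proximal penalty:
    # alpha_new = argmin_a [ a*y + 0.5*a^2 + (a - alpha_old)^2 / (2*tau) ]
    alpha = (alpha - tau * y[:N]) / (1.0 + tau)

print(f"cost after MSA-type iterations: {cost:.4f}")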
Coercivity condition for higher moment a priori estimates for nonlinear SPDEs and existence of a solution under local monotonicity
Higher-order moment estimates for solutions to nonlinear SPDEs governed by
locally monotone operators are obtained under an appropriate coercivity condition.
These are then used to extend known existence and uniqueness results for
nonlinear SPDEs under local monotonicity conditions to allow derivatives in the
operator acting on the solution under the stochastic integral.
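For context, the classical coercivity condition of the variational framework
for SPDEs of the form $du = A(t,u)\,dt + B(t,u)\,dW_t$, which yields
second-moment a priori estimates, reads in one common formulation (the
condition used in the paper for higher moments is a suitable strengthening):
\begin{equation*}
2\,\langle A(t,v), v\rangle + \|B(t,v)\|_{L_2(U,H)}^2
\le K\,\|v\|_H^2 - \theta\,\|v\|_V^{\alpha} + f(t), \qquad v \in V,
\end{equation*}
for constants $K\ge 0$, $\theta>0$, $\alpha>1$ and an integrable adapted
process $f$, where $\langle\cdot,\cdot\rangle$ denotes the duality pairing
between $V^{\ast}$ and $V$ and $L_2(U,H)$ the Hilbert-Schmidt operators from
$U$ to $H$.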