
    The Modified MSA, a Gradient Flow and Convergence

    The modified Method of Successive Approximations (MSA) is an iterative scheme for approximating solutions to stochastic control problems in continuous time, based on the Pontryagin optimality principle, which, starting from an initial open-loop control, solves the forward equation and the backward adjoint equation and then performs a static minimization step. We observe that this is an implicit Euler scheme for a gradient flow system. We prove that appropriate interpolations of the iterates of the modified MSA converge to a gradient flow at rate $\tau$. We then study the convergence of this gradient flow as time goes to infinity. In the general (non-convex) case we prove that the gradient term itself converges to zero. This is a consequence of an energy identity which shows that the optimization objective decreases along the gradient flow. Moreover, in the convex case, when the Pontryagin optimality principle provides a sufficient condition for optimality, we prove that the optimization objective converges at rate $1/S$ to its optimal value, and at an exponential rate under strong convexity. The main technical difficulties lie in obtaining appropriate properties of the Hamiltonian (growth, continuity). These are obtained by utilising the theory of Bounded Mean Oscillation (BMO) martingales, required for estimates on the adjoint backward stochastic differential equation (BSDE).
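    The iteration described above (forward solve, backward adjoint solve, static minimization of an augmented Hamiltonian) can be sketched on a toy deterministic control problem; this is an illustrative analogue only, with all names and the specific problem chosen here, not taken from the paper. In the stochastic setting the two ODE solves become an SDE and a BSDE.

    ```python
    import numpy as np

    # Toy problem (hypothetical): minimise J(u) = int_0^1 (x^2 + u^2) dt
    # subject to x' = x + u, x(0) = 1, with Hamiltonian
    # H(x, u, p) = p*(x + u) + x^2 + u^2.
    T, N = 1.0, 200
    dt = T / N
    rho = 1.0  # weight of the augmentation term that makes the MSA "modified"

    def forward(u):
        x = np.empty(N + 1); x[0] = 1.0
        for k in range(N):                       # Euler step for x' = x + u
            x[k + 1] = x[k] + dt * (x[k] + u[k])
        return x

    def backward(x):
        p = np.empty(N + 1); p[N] = 0.0
        for k in range(N, 0, -1):                # adjoint: p' = -dH/dx = -(p + 2x)
            p[k - 1] = p[k] + dt * (p[k] + 2.0 * x[k])
        return p

    def cost(x, u):
        return dt * np.sum(x[:-1] ** 2 + u ** 2)

    u, costs = np.zeros(N), []
    for _ in range(30):
        x = forward(u)                           # 1. solve the forward equation
        p = backward(x)                          # 2. solve the backward adjoint equation
        costs.append(cost(x, u))
        # 3. static minimisation of the augmented Hamiltonian, pointwise in t:
        #    argmin_v [ p v + v^2 + rho (v - u)^2 ] = (2 rho u - p) / (2 + 2 rho)
        u = (2.0 * rho * u - p[:-1]) / (2.0 + 2.0 * rho)

    # the cost decreases along the iterations, as the energy identity suggests
    ```

    The augmentation weight `rho` damps the Hamiltonian-minimization step; with `rho = 0` one recovers the classical MSA, which may fail to converge.
    
    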

    Gradient Flows for Regularized Stochastic Control Problems

    This paper studies stochastic control problems regularized by the relative entropy, where the action space is a space of measures. This setting includes relaxed control problems, problems of finding Markovian controls with the control function replaced by an idealized infinitely wide neural network, and can be extended to the search for causal optimal transport maps. By exploiting the Pontryagin optimality principle, we identify a suitable metric space on which we construct a gradient flow for the measure-valued control process along which the cost functional is guaranteed to decrease. It is shown that, under appropriate conditions, this gradient flow has an invariant measure which is the optimal control for the regularized stochastic control problem. If the problem is sufficiently convex, the gradient flow converges exponentially fast. Furthermore, the optimal measure-valued control admits a Bayesian interpretation, which means that one can incorporate prior knowledge when solving stochastic control problems. This work is motivated by a desire to extend the theoretical underpinning for the convergence of stochastic gradient type algorithms, widely used in the reinforcement learning community, to solve control problems.
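    A static analogue conveys the mechanism: minimising $\mu \mapsto \int f\,d\mu + \tau\,\mathrm{KL}(\mu\,\|\,\pi)$ over probability measures has the Gibbs measure $\mu^\ast \propto \pi\, e^{-f/\tau}$ as minimiser, and a gradient flow of measures decreasing this objective is realised by Langevin particle dynamics. The sketch below (all choices of $f$, prior, and parameters are illustrative, not from the paper) checks that the particle system settles at the Gibbs measure.

    ```python
    import numpy as np

    # Hypothetical static analogue: minimise E_{a~mu}[f(a)] + tau * KL(mu || pi)
    # with f(a) = a^2 / 2 and prior pi = N(0, 1). The minimiser is the Gibbs
    # measure mu* ∝ pi(a) exp(-f(a)/tau) = N(0, tau / (1 + tau)), and the
    # associated gradient flow is the Langevin dynamics
    #   da_t = -(f'(a_t) + tau * a_t) dt + sqrt(2 tau) dW_t,
    # whose invariant measure is mu*.
    rng = np.random.default_rng(0)
    tau, dt, steps, n = 1.0, 0.01, 1000, 20000

    a = rng.standard_normal(n)              # particles initialised from the prior
    for _ in range(steps):
        drift = -(a + tau * a)              # -(f'(a) + tau * grad(-log pi)(a))
        a = a + drift * dt + np.sqrt(2.0 * tau * dt) * rng.standard_normal(n)

    # the empirical law of the particles approximates the optimal Gibbs measure:
    # mean ≈ 0, variance ≈ tau / (1 + tau) = 0.5
    print(a.mean(), a.var())
    ```

    The convexity of $f$ plus the entropy term is what drives the exponentially fast relaxation of the particle law to the invariant measure.
    
    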

    Itô formula for processes taking values in intersection of finitely many Banach spaces

    Motivated by applications to SPDEs, we extend the Itô formula for the square of the norm of a semimartingale $y(t)$ from Gyöngy and Krylov (Stochastics 6(3):153-173, 1982) to the case \begin{equation*} \sum_{i=1}^m \int_{(0,t]} v_i^{\ast}(s)\,dA(s) + h(t)=:y(t)\in V \quad dA\times\mathbb{P}\text{-a.e.}, \end{equation*} where $A$ is an increasing right-continuous adapted process, $v_i^{\ast}$ is a progressively measurable process with values in $V_i^{\ast}$, the dual of a Banach space $V_i$, $h$ is a càdlàg martingale with values in a Hilbert space $H$, identified with its dual $H^{\ast}$, and $V:=V_1\cap V_2 \cap \ldots \cap V_m$ is continuously and densely embedded in $H$. The formula is proved under the condition that $\|y\|_{V_i}^{p_i}$ and $\|v_i^\ast\|_{V_i^\ast}^{q_i}$ are almost surely locally integrable with respect to $dA$ for some conjugate exponents $p_i, q_i$. This condition is essentially weaker than the one which would arise in an application of the results in Gyöngy and Krylov (Stochastics 6(3):153-173, 1982) to the semimartingale above. Comment: Updated to the version published in Stochastics and Partial Differential Equations: Analysis and Computation.
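    In the simplest scalar instance of this setting ($m = 1$, $A(s) = s$, $h = W$ a Brownian motion, all spaces equal to $\mathbb{R}$), the formula reduces to the classical Itô identity $y(t)^2 = 2\int_0^t y v\,ds + 2\int_0^t y\,dW + [W]_t$. The following sketch (an illustrative numerical check, not part of the paper) verifies this on a simulated path with $v(s) = \sin(s)$.

    ```python
    import numpy as np

    # Numerical sanity check of the Ito formula for the square of a scalar
    # semimartingale y(t) = int_0^t v(s) ds + W(t), with v(s) = sin(s):
    #   y(T)^2 = 2 int_0^T y v ds + 2 int_0^T y dW + [W]_T,   [W]_T = T.
    rng = np.random.default_rng(1)
    T, N = 1.0, 100_000
    dt = T / N
    s = np.linspace(0.0, T, N + 1)
    dW = np.sqrt(dt) * rng.standard_normal(N)

    y = np.empty(N + 1); y[0] = 0.0
    for k in range(N):                          # Euler path of the semimartingale
        y[k + 1] = y[k] + np.sin(s[k]) * dt + dW[k]

    lebesgue = 2.0 * np.sum(y[:-1] * np.sin(s[:-1]) * dt)  # 2 int y v dA
    ito      = 2.0 * np.sum(y[:-1] * dW)       # 2 int y dW (left-endpoint sums)
    qv       = T                               # quadratic variation [W]_T
    # the two sides agree up to the discretisation error, O(sqrt(dt))
    print(y[-1] ** 2, lebesgue + ito + qv)
    ```
    
    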

    A modified MSA for stochastic control problems

    The classical Method of Successive Approximations (MSA) is an iterative method for solving stochastic control problems and is derived from Pontryagin's optimality principle. It is known that the MSA may fail to converge. Using careful estimates for the backward stochastic differential equation (BSDE), this paper suggests a modification to the MSA algorithm. This modified MSA is shown to converge for general stochastic control problems with control in both the drift and diffusion coefficients. Under some additional assumptions, a rate of convergence is established. The results are valid without restrictions on the time horizon of the control problem, in contrast to iterative methods based on the theory of forward-backward stochastic differential equations.

    Coercivity condition for higher moment a priori estimates for nonlinear SPDEs and existence of a solution under local monotonicity

    Higher order moment estimates for solutions to nonlinear SPDEs governed by locally monotone operators are obtained under an appropriate coercivity condition. These are then used to extend known existence and uniqueness results for nonlinear SPDEs under local monotonicity conditions to allow derivatives in the operator acting on the solution under the stochastic integral. Comment: 32 pages.
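    For orientation, the classical coercivity condition in the variational framework for SPDEs $dv = A(t,v)\,dt + B(t,v)\,dW_t$ reads, schematically (this is the standard condition yielding second-moment a priori estimates; the paper works with a version adapted to higher-order moments):

    ```latex
    \begin{equation*}
    2\langle A(t,v), v\rangle + \|B(t,v)\|_{L_2(U,H)}^2
    \le c\,\|v\|_H^2 - \theta\,\|v\|_V^{\alpha} + f(t),
    \qquad v \in V,
    \end{equation*}
    ```

    with $\theta > 0$, $\alpha > 1$ and $f$ an integrable process; strengthening such a condition appropriately is what allows control of higher moments of the solution.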