Inverse stochastic optimal controls
We study an inverse problem of the stochastic optimal control of general
diffusions with performance index having the quadratic penalty term of the
control process. Under mild conditions on the drift, the volatility, the cost
functions of the state, and under the assumption that the optimal control
belongs to the interior of the control set, we show that our inverse problem is
well-posed using a stochastic maximum principle. Then, with the well-posedness,
we reduce the inverse problem to some root finding problem of the expectation
of a random variable involved with the value function, which has a unique
solution. Based on this result, we propose a numerical method for our inverse
problem by replacing the expectation above with arithmetic mean of observed
optimal control processes and the corresponding state processes. The recent
progress of numerical analyses of Hamilton-Jacobi-Bellman equations enables the
proposed method to be implementable for multi-dimensional cases. In particular,
with the help of the kernel-based collocation method for
Hamilton-Jacobi-Bellman equations, our method for the inverse problems still
works well even when an explicit form of the value function is unavailable.
Several numerical experiments show that the numerical method recovers the
unknown weight parameter with high accuracy.
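For intuition, the abstract's reduction (replace an expectation by the arithmetic mean of observed optimal control and state processes, then root-find for the unknown weight) can be mimicked on a toy scalar linear-quadratic problem where the optimal feedback gain is known in closed form. Everything below, including the dynamics, discount rate, and bisection bracket, is an illustrative assumption, not the paper's construction:

```python
import math
import random

def gain(lam, rho=0.1):
    # Optimal feedback gain k(lam) for the scalar discounted LQ problem
    #   min E ∫ e^{-rho t} (x^2 + lam u^2) dt,  dx = u dt + sigma dW.
    # The HJB equation gives V(x) = a x^2 + b with a^2/lam + rho*a - 1 = 0
    # and optimal control u* = -(a/lam) x =: -k x.
    a = lam * (-rho + math.sqrt(rho**2 + 4.0 / lam)) / 2.0
    return a / lam

# Synthetic "observations": optimal state/control pairs generated under an
# unknown true penalty weight lam_true.
random.seed(0)
lam_true = 2.0
k_true = gain(lam_true)
xs = [random.gauss(0.0, 1.0) for _ in range(500)]
us = [-k_true * x for x in xs]

# Replace the expectation by an arithmetic mean over observations:
# least-squares estimate of the feedback gain from (x, u) pairs.
k_hat = -sum(u * x for u, x in zip(us, xs)) / sum(x * x for x in xs)

# Root finding: k(lam) is strictly decreasing in lam, so bisect
# f(lam) = gain(lam) - k_hat on a bracketing interval.
lo, hi = 1e-3, 100.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if gain(mid) > k_hat:
        lo = mid
    else:
        hi = mid
lam_hat = 0.5 * (lo + hi)  # ≈ 2.0, the true penalty weight
```

The point of the sketch is the pipeline, not the model: estimate a statistic of the observed optimal controls, then invert a monotone map from the unknown weight to that statistic.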
Optimal Reinforcement Learning for Gaussian Systems
The exploration-exploitation trade-off is among the central challenges of
reinforcement learning. The optimal Bayesian solution is intractable in
general. This paper studies to what extent analytic statements about optimal
learning are possible if all beliefs are Gaussian processes. A first order
approximation of learning of both loss and dynamics, for nonlinear,
time-varying systems in continuous time and space, subject to a relatively weak
restriction on the dynamics, is described by an infinite-dimensional partial
differential equation. An approximate finite-dimensional projection gives an
impression of how this result may be helpful. (Final pre-conference version of
a NIPS 2011 paper.)
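The central object in the abstract above is a Gaussian-process belief over an unknown function (e.g. the loss) that is updated by conditioning on observations. A minimal sketch of such a belief update, with an illustrative squared-exponential kernel and made-up data rather than anything from the paper:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel on scalar inputs.
    d = np.subtract.outer(a, b)
    return np.exp(-0.5 * d**2 / ell**2)

X = np.array([0.0, 1.0, 2.0])   # inputs where the loss was observed
y = np.array([0.0, 1.0, 0.0])   # noisy loss observations
sn2 = 1e-4                      # observation-noise variance

K = rbf(X, X) + sn2 * np.eye(len(X))
alpha = np.linalg.solve(K, y)

def belief(xstar):
    """Posterior mean and variance of the GP belief at test inputs."""
    Ks = rbf(np.atleast_1d(xstar), X)   # cross-covariances to the data
    mean = Ks @ alpha
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, var

m, v = belief(1.0)  # near the observed point: mean ≈ 1, small variance
```

The posterior variance is what drives the exploration side of the trade-off: it quantifies where the belief is still uncertain.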
Nonparametric Infinite Horizon Kullback-Leibler Stochastic Control
We present two nonparametric approaches to Kullback-Leibler (KL) control, or
the linearly-solvable Markov decision problem (LMDP), based on Gaussian
processes (GP) and the Nyström approximation. Compared to recently developed parametric
methods, the proposed data-driven frameworks feature accurate function
approximation and efficient on-line operations. Theoretically, we derive the
mathematical connection of KL control based on dynamic programming with earlier
work in control theory which relies on information theoretic dualities for the
infinite time horizon case. Algorithmically, we give explicit optimal control
policies in nonparametric forms, and propose on-line update schemes with
budgeted computational costs. Numerical results demonstrate the effectiveness
and usefulness of the proposed frameworks.
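The Nyström approximation mentioned above replaces the full kernel (Gram) matrix by a low-rank factorization built from a small set of landmark points, which is what makes budgeted on-line operation possible. A generic sketch, with illustrative data and landmark choice (nothing here is taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))      # sampled states (illustrative)
m = 20
Z = X[:m]                          # landmark (inducing) points, m << n

def rbf(A, B, ell=1.0):
    # Squared-exponential kernel between two point sets.
    d2 = ((A[:, None, :] - B[None, :, :])**2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

K_nm = rbf(X, Z)                   # n x m cross-kernel
K_mm = rbf(Z, Z)                   # m x m landmark kernel
# Nystrom low-rank approximation of the full n x n Gram matrix:
#   K ≈ K_nm K_mm^+ K_nm^T, costing O(n m^2) instead of O(n^2) storage/work.
K_hat = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T
```

A useful sanity property: the approximation is exact on the landmark block, since K_nm restricted to the landmarks equals K_mm.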
A Probabilistic Numerical Method for Fully Nonlinear Parabolic PDEs
We consider the probabilistic numerical scheme for fully nonlinear PDEs
suggested in \cite{cstv}, and show that it can be introduced naturally as a
combination of Monte Carlo and finite difference schemes without appealing to
the theory of backward stochastic differential equations. Our first main result
provides the convergence of the discrete-time approximation and derives a bound
on the discretization error in terms of the time step. An explicitly
implementable scheme requires approximating the conditional expectation
operators involved in the discretization. This induces a further Monte Carlo
error. Our second main result is to prove the convergence of the latter
approximation scheme, and to derive an upper bound on the approximation error.
Numerical experiments are performed for the approximation of the solution of
the mean curvature flow equation in dimensions two and three, and for two and
five-dimensional (plus time) fully-nonlinear Hamilton-Jacobi-Bellman equations
arising in the theory of portfolio optimization in financial mathematics.
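The building block behind such schemes is a conditional expectation estimated by Monte Carlo, with spatial derivatives extracted from the same draws via Gaussian integration by parts rather than by differencing a noisy estimate. A minimal sketch of that single step; the function g, step size, and parameters are illustrative stand-ins, not the paper's scheme:

```python
import math
import random

random.seed(1)
sigma, dt, x = 1.0, 0.01, 0.3
g = math.sin        # stand-in for the solution at the next time level
N = 200000

est, dest = 0.0, 0.0
for _ in range(N):
    dW = random.gauss(0.0, math.sqrt(dt))   # Brownian increment, var = dt
    gX = g(x + sigma * dW)
    est += gX                               # estimates E[g(x + sigma dW)]
    dest += gX * dW / (sigma * dt)          # d/dx E[g(x + sigma dW)], via the
                                            # Gaussian weight dW / (sigma^2 dt) * sigma
est /= N
dest /= N
# For g = sin: E[g(x + sigma dW)] = exp(-sigma^2 dt / 2) sin(x),
# and its x-derivative is exp(-sigma^2 dt / 2) cos(x).
```

The same draws thus yield both the value and its derivative, which is what lets the nonlinearity in the PDE be evaluated at each time step.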
Backward Ornstein-Uhlenbeck transition operators and mild solutions of non-autonomous Hamilton-Jacobi equations in Banach spaces
In this paper we revisit the mild-solution approach to second-order
semi-linear PDEs of Hamilton-Jacobi type in infinite-dimensional spaces. We
show that a well-known result on existence of mild solutions in Hilbert spaces
can be easily extended to non-autonomous Hamilton-Jacobi equations in Banach
spaces. The main tool is the regularizing property of Ornstein-Uhlenbeck
transition evolution operators for stochastic Cauchy problems in Banach spaces
with time-dependent coefficients.
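Schematically, the mild-solution approach trades the PDE for a variation-of-constants fixed-point equation; the form below uses generic symbols, not the paper's precise setting:

```latex
% Terminal-value Hamilton-Jacobi equation (schematic):
%   \partial_t u + L(t)\,u + F\bigl(t, x, Du\bigr) = 0, \qquad u(T,\cdot) = \varphi,
% where L(t) generates the Ornstein--Uhlenbeck evolution family P(t,s).
% Its mild formulation is the fixed-point equation
u(t,\cdot) \;=\; P(t,T)\,\varphi
  \;+\; \int_t^T P(t,s)\, F\bigl(s, \cdot, D u(s,\cdot)\bigr)\, ds,
```

and the regularizing (smoothing) property of P(t,s) is what makes the gradient term Du on the right-hand side tractable.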
Feynman-Kac representation of fully nonlinear PDEs and applications
The classical Feynman-Kac formula states the connection between linear
parabolic partial differential equations (PDEs), like the heat equation, and
expectations of stochastic processes driven by Brownian motion. It thus gives a
method for solving linear PDEs by Monte Carlo simulation of random processes.
The extension to (fully) nonlinear PDEs led in recent years to important
developments in stochastic analysis and the emergence of the theory of backward
stochastic differential equations (BSDEs), which can be viewed as nonlinear
Feynman-Kac formulas. We review in this paper the main ideas and results in
this area, and present implications of these probabilistic representations for
the numerical resolution of nonlinear PDEs, together with some applications to
stochastic control problems and model uncertainty in finance.
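The linear case the abstract starts from is easy to demonstrate directly: for the heat equation $u_t = \tfrac{1}{2} u_{xx}$ with $u(0,\cdot) = f$, the Feynman-Kac formula gives $u(t,x) = \mathbb{E}[f(x + W_t)]$, which plain Monte Carlo can evaluate. A self-contained sketch with an illustrative choice of $f$:

```python
import math
import random

random.seed(2)

def heat_mc(f, t, x, n=200000):
    # Feynman-Kac for the heat equation u_t = (1/2) u_xx, u(0,.) = f:
    # u(t, x) = E[f(x + W_t)] with W_t ~ N(0, t), estimated by Monte Carlo.
    s = math.sqrt(t)
    return sum(f(x + random.gauss(0.0, s)) for _ in range(n)) / n

# For f = cos the exact solution is u(t, x) = exp(-t/2) * cos(x),
# which lets us check the estimate.
t, x = 0.5, 0.7
approx = heat_mc(math.cos, t, x)
exact = math.exp(-t / 2) * math.cos(x)
```

The nonlinear (BSDE) representations reviewed in the paper generalize exactly this recipe, replacing the plain expectation by a backward recursion.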
Mean Field Games and Applications
This text is inspired by a “Cours Bachelier” held in January 2009 and taught by Jean-Michel Lasry. The course was based upon the articles of the three authors and upon unpublished materials they developed. Proofs were not presented during the lectures and are now available, as are some issues that were only briefly tackled during class.