An Optimal Control Derivation of Nonlinear Smoothing Equations
The purpose of this paper is to review and highlight some connections between
the problem of nonlinear smoothing and optimal control of the Liouville
equation. The latter has been an active area of recent research interest owing
to work in mean-field games and optimal transportation theory. The nonlinear
smoothing problem is considered here for continuous-time Markov processes. The
observation process is modeled as a nonlinear function of a hidden state with
additive Gaussian measurement noise. A variational formulation is described
based upon the relative entropy formula introduced by Newton and Mitter. The
resulting optimal control problem is formulated on the space of probability
distributions. Hamilton's equations for the optimal control are related to the Zakai equation of nonlinear smoothing via the log transformation. The overall procedure is shown to generalize Mortensen's classical minimum-energy estimator for the linear Gaussian problem.
Comment: 7 pages, 0 figures, under peer review
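To make the log transformation concrete, here is a minimal sketch in a scalar setting with unit diffusion and unit observation noise (these simplifications, and the use of the filtering form of the Zakai equation rather than the paper's smoothing setting, are assumptions for illustration). With state and observation models
\[
dX_t = a(X_t)\,dt + dB_t, \qquad dZ_t = h(X_t)\,dt + dW_t,
\]
an unnormalized density $q(x,t)$ solving the Zakai equation
\[
dq = \Big( \tfrac{1}{2}\,\partial_{xx} q - \partial_x (a\,q) \Big)\,dt + h\,q\,dZ_t
\]
is mapped by $V(x,t) = -\log q(x,t)$ to an equation of Hamilton-Jacobi-Bellman type,
\[
dV = \Big( \tfrac{1}{2}\,\partial_{xx} V - \tfrac{1}{2}\,(\partial_x V)^2 - a\,\partial_x V + \partial_x a + \tfrac{1}{2}\,h^2 \Big)\,dt - h\,dZ_t .
\]
The quadratic term $\tfrac{1}{2}(\partial_x V)^2$ is what links the linear Zakai dynamics to an optimal control problem, and in the deterministic limit one recovers the value function of Mortensen's minimum-energy estimation.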
Relative entropy-regularized robust optimal order execution
The problem of order execution is cast as a relative entropy-regularized
robust optimal control problem in this article. The order execution agent's
goal is to maximize an objective functional associated with his profit-and-loss
of trading and simultaneously minimize the execution risk and the market's
liquidity and uncertainty. We model the market's liquidity and uncertainty by
the principle of least relative entropy associated with the market volume. The
problem of order execution is thereby formulated as a relative entropy-regularized stochastic differential game. A standard dynamic programming argument shows that the value function of the differential game satisfies a relative entropy-regularized Hamilton-Jacobi-Isaacs (rHJI) equation. Under the assumptions of a linear-quadratic model with a Gaussian prior, the rHJI equation reduces to a system of Riccati and linear differential equations. If the corresponding coefficients are further assumed constant, the system of differential equations can be solved in closed form, yielding analytical expressions for the optimal strategy and trajectory as well as the posterior distribution of the market volume. Numerical examples are presented that illustrate the optimal strategies and compare them with conventional trading strategies.
Comment: 32 pages, 8 figures
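As a hedged illustration of the kind of reduction described above, the sketch below numerically integrates a generic finite-horizon matrix Riccati ODE; the matrices A, B, Q, R, QT are illustrative placeholders, not the coefficients produced by the paper's rHJI reduction.

```python
# Sketch: backward matrix Riccati ODE of the kind LQ reductions produce.
# All matrices below are illustrative placeholders, not the paper's model.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, -0.5]])   # state dynamics
B = np.array([[0.0], [1.0]])              # control input map
Q = np.eye(2)                             # running state cost
R = np.array([[1.0]])                     # running control cost
QT = np.eye(2)                            # terminal cost P(T)
T = 1.0

def riccati_rhs(s, p_flat):
    """Riccati ODE in reversed time s = T - t:
    dP/ds = A'P + P A - P B R^{-1} B' P + Q, with P(s = 0) = QT."""
    P = p_flat.reshape(2, 2)
    dP = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
    return dP.flatten()

sol = solve_ivp(riccati_rhs, (0.0, T), QT.flatten())
P0 = sol.y[:, -1].reshape(2, 2)            # P at t = 0 (i.e., s = T)
K0 = np.linalg.solve(R, B.T @ P0)          # time-0 feedback gain, u = -K0 x
print("P(0) =\n", P0)
print("K(0) =", K0)
```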
Discretizing Distributions with Exact Moments: Error Estimate and Convergence Analysis
The maximum entropy principle is a powerful tool for solving underdetermined
inverse problems. This paper considers the problem of discretizing a continuous
distribution, which arises in various applied fields. We obtain the
approximating distribution by minimizing the Kullback-Leibler information
(relative entropy) of the unknown discrete distribution relative to an initial
discretization based on a quadrature formula, subject to some moment constraints. We study the theoretical error bound and the convergence of this
approximation method as the number of discrete points increases. We prove that
(i) the theoretical error bound of the approximate expectation of any bounded
continuous function has at most the same order as the quadrature formula we
start with, and (ii) the approximate discrete distribution weakly converges to
the given continuous distribution. Moreover, we present numerical examples that demonstrate the advantage of the method and apply it to the numerical solution of an optimal portfolio problem.
Comment: 20 pages, 14 figures
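A minimal sketch of this kind of construction (not the paper's exact algorithm): minimize KL(p || w) over discrete weights p at quadrature nodes, subject to moment constraints, by solving the convex dual for the Lagrange multipliers. The target distribution (standard normal), the Gauss-Legendre rule, and the choice of four moments are assumptions for illustration.

```python
# Sketch: minimum-KL discretization with exact moment constraints.
# Setup (standard normal target, Gauss-Legendre nodes, 4 moments) is illustrative.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Initial discretization from a quadrature rule on a truncated support [-4, 4].
nodes, gl_weights = np.polynomial.legendre.leggauss(20)
x = 4.0 * nodes
w = 4.0 * gl_weights * norm.pdf(x)
w /= w.sum()

# Moment constraints: match the first four moments of N(0, 1).
g = np.vstack([x, x**2, x**3, x**4])      # constraint functions g_k(x_i)
m = np.array([0.0, 1.0, 0.0, 3.0])        # target moments

def dual(lam):
    """Convex dual of min_p KL(p || w) s.t. g p = m: log-partition minus lam . m."""
    return np.log(np.sum(w * np.exp(lam @ g))) - lam @ m

lam = minimize(dual, np.zeros(4), method="BFGS").x
p = w * np.exp(lam @ g)
p /= p.sum()                              # minimum-KL weights with exact moments

print("recovered moments:", g @ p)        # should match m
```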
Optimal control formulation of transition path problems for Markov Jump Processes
The effective computation of transition paths connecting metastable states of a stochastic model is an important rare-event problem. This paper proposes a stochastic optimal control formulation of transition path problems over an infinite time horizon for Markov jump processes on a Polish space. An unbounded terminal cost at a stopping time and a controlled
transition rate for the jump process regulate the transition from one
metastable state to another. The running cost is taken as an entropy form of
the control velocity, in contrast to the quadratic form for diffusion
processes. Using the Girsanov transformation for Markov jump processes, the
optimal control problems in both the finite and infinite time horizons with stopping times fit into one framework: the optimal change of measure in the càdlàg path space via minimization of the relative entropy. We prove that the
committor function, solved from the backward equation with appropriate boundary
conditions, yields an explicit formula for the optimal path measure and the
associated optimal control for the transition path problem. The unbounded
terminal cost leads to a singular transition rate (unbounded control velocity), for which the Gamma-convergence technique is applied to pass to the limit of a regularized optimal path measure. The limiting path measure is proved to solve a martingale problem with an optimally controlled transition rate, and the associated optimal control is given by the Doob h-transform. The resulting optimally controlled process realizes the transitions almost surely.
Comment: 31 pages
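The committor / Doob h-transform construction can be sketched on a small finite-state example (the 4-state rate matrix and the choice of metastable sets below are illustrative assumptions, not taken from the paper):

```python
# Sketch: committor and Doob h-transformed (optimally controlled) jump rates
# for a small finite-state Markov jump process. The rate matrix is illustrative.
import numpy as np

# Generator L: off-diagonal entries are jump rates, rows sum to zero.
L = np.array([
    [-1.0,  1.0,  0.0,  0.0],
    [ 0.5, -1.5,  1.0,  0.0],
    [ 0.0,  1.0, -1.5,  0.5],
    [ 0.0,  0.0,  1.0, -1.0],
])
A, B = [0], [3]                  # transition from metastable set A to set B
interior = [1, 2]

# Committor: (L q)(x) = 0 for x outside A and B, with q = 0 on A and q = 1 on B.
q = np.zeros(4)
q[B] = 1.0
rhs = -L[np.ix_(interior, B)].sum(axis=1)
q[interior] = np.linalg.solve(L[np.ix_(interior, interior)], rhs)

# Doob h-transform with h = q: controlled rates L*(x, y) = L(x, y) q(y) / q(x),
# which condition the process to reach B before A.
Lstar = np.zeros_like(L)
for x in range(4):
    if x in B or q[x] == 0.0:
        continue                 # states in B (or with q = 0) are left untouched here
    for y in range(4):
        if y != x:
            Lstar[x, y] = L[x, y] * q[y] / q[x]
    Lstar[x, x] = -Lstar[x].sum()

print("committor q =", q)
print("controlled rates L* =\n", Lstar)
```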
A Minimum Relative Entropy Principle for Learning and Acting
This paper proposes a method to construct an adaptive agent that is universal
with respect to a given class of experts, where each expert is an agent that
has been designed specifically for a particular environment. This adaptive
control problem is formalized as the problem of minimizing the relative entropy
of the adaptive agent from the expert that is most suitable for the unknown
environment. If the agent is a passive observer, then the optimal solution is
the well-known Bayesian predictor. However, if the agent is active, then its
past actions need to be treated as causal interventions on the I/O stream
rather than ordinary probability conditioning. Here it is shown that the solution
to this new variational problem is given by a stochastic controller called the
Bayesian control rule, which implements adaptive behavior as a mixture of
experts. Furthermore, it is shown that under mild assumptions, the Bayesian
control rule converges to the control law of the most suitable expert.
Comment: 36 pages, 11 figures
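A hedged sketch of the mixture-of-experts idea, in the simplest setting where it reduces to Thompson-style sampling over two Bernoulli-bandit experts (the environment, experts, and horizon below are illustrative assumptions, not the paper's construction). The point highlighted by the abstract is that actions are treated as interventions and therefore do not update the posterior, while observations do.

```python
# Sketch: posterior over experts updated by observations only; actions are
# causal interventions and leave the posterior unchanged. Setup is illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Two hypotheses about a two-armed Bernoulli bandit; each expert always pulls
# the arm that is optimal under its own hypothesis.
hypotheses = [np.array([0.8, 0.2]), np.array([0.2, 0.8])]
best_arm = [int(np.argmax(h)) for h in hypotheses]
true_env = hypotheses[1]                      # unknown to the agent
posterior = np.array([0.5, 0.5])              # prior over experts

for t in range(200):
    # Act: sample an expert from the posterior and follow its action.
    k = rng.choice(len(hypotheses), p=posterior)
    a = best_arm[k]
    # The chosen action is an intervention: it does NOT change the posterior.

    r = rng.random() < true_env[a]            # observe a Bernoulli reward

    # Observations DO update the posterior through each expert's likelihood.
    lik = np.array([h[a] if r else 1.0 - h[a] for h in hypotheses])
    posterior = posterior * lik
    posterior /= posterior.sum()

print("posterior over experts:", posterior)   # concentrates on the true expert
```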