MDP Algorithms for portfolio optimization problems in pure jump markets
We consider the problem of maximizing the expected utility of the terminal wealth of a portfolio in a continuous-time pure jump market with a general utility function. This leads to an optimal control problem for Piecewise Deterministic Markov Processes. Using an embedding procedure we solve the problem by looking at a discrete-time contracting Markov Decision Process. Our aim is to show that this point of view has a number of advantages, in particular as far as computational aspects are concerned. We characterize the value function as the unique fixed point of the dynamic programming operator and prove the existence of optimal portfolios. Moreover, we show that value iteration as well as Howard's policy improvement algorithm work. Finally, we give error bounds when the utility function is approximated and when we discretize the state space. A numerical example is presented and our approach is compared to the approximating Markov chain method.
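The fixed-point and value-iteration machinery this abstract refers to can be illustrated on a toy discrete-time contracting MDP. The three-state chain, two actions, rewards, and discount factor below are illustrative assumptions, not the paper's market model; the sketch only shows how iterating the dynamic programming operator converges to its unique fixed point:

```python
import numpy as np

# Illustrative contracting MDP (assumed data, not the paper's model):
# P[a, s, s'] are transition kernels for actions a = 0, 1 over 3 states.
P = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
])
r = np.array([[1.0, 0.0, 2.0], [0.5, 1.5, 0.0]])  # r[a, s]: one-step reward
beta = 0.9  # discount factor < 1 makes the operator a sup-norm contraction

def bellman(v):
    """Apply the dynamic programming operator T to a value vector v."""
    return np.max(r + beta * P @ v, axis=0)

v = np.zeros(3)
for _ in range(1000):  # value iteration: Banach gives geometric convergence
    v_new = bellman(v)
    if np.max(np.abs(v_new - v)) < 1e-10:
        break
    v = v_new
v = bellman(v)
policy = np.argmax(r + beta * P @ v, axis=0)  # greedy policy at the fixed point
```

Because the operator is a beta-contraction in the sup norm, the error after k iterations decays like beta^k, which is one reason the discrete-time reformulation is computationally attractive.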
Markov decision process algorithms for wealth allocation problems with defaultable bonds
This paper is concerned with analysing optimal wealth allocation techniques within a defaultable financial market similar to Bielecki and Jang (2007). It studies a portfolio optimization problem combining a continuous-time jump market and a defaultable security, and presents numerical solutions through conversion into a Markov decision process and characterization of its value function as the unique fixed point of a contracting operator. This work analyses allocation strategies under several families of utility functions, and highlights significant portfolio selection differences from previously reported results.
Hedging of Financial Derivative Contracts via Monte Carlo Tree Search
The construction of approximate replication strategies for derivative contracts in incomplete markets is a key problem of financial engineering. Recently, Reinforcement Learning algorithms for pricing and hedging under realistic market conditions have attracted significant interest. While financial research has mostly focused on variations of Q-learning, in Artificial Intelligence Monte Carlo Tree Search is the recognized state-of-the-art method for various planning problems, such as the games of Hex, Chess, and Go. This article introduces Monte Carlo Tree Search as a method to solve the stochastic optimal control problem underlying the pricing and hedging of financial derivatives. Compared to Q-learning, it combines reinforcement learning with tree search techniques. As a consequence, Monte Carlo Tree Search has higher sample efficiency, is less prone to over-fitting to specific market models, and generally learns stronger policies faster. In our experiments we find that Monte Carlo Tree Search, the method behind world-champion programs in games like Chess and Go, is easily capable of directly maximizing the utility of the investor's terminal wealth without an intermediate mathematical theory.
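The selection rule at the heart of Monte Carlo Tree Search can be sketched at a single root node, where it reduces to a UCB1 bandit over candidate actions. The one-period binomial stock move, short-call payoff, and exponential utility below are illustrative assumptions, not the article's market setting:

```python
import math
import random

random.seed(0)

def simulate(hedge):
    """One Monte Carlo rollout: utility of terminal wealth when a short call
    (strike 100, premium 10) is hedged with `hedge` units of stock under an
    assumed one-period binomial move. All numbers are illustrative."""
    s0, strike, premium = 100.0, 100.0, 10.0
    s1 = s0 * (1.2 if random.random() < 0.5 else 0.85)
    wealth = premium + hedge * (s1 - s0) - max(s1 - strike, 0.0)
    return -math.exp(-0.1 * wealth)  # exponential utility of terminal wealth

actions = [0.0, 0.25, 0.5, 0.75, 1.0]  # candidate hedge ratios
counts = [0] * len(actions)
totals = [0.0] * len(actions)
for t in range(1, 5001):
    # UCB1: exploit high running means, but keep exploring rarely tried arms
    ucb = [float("inf") if counts[i] == 0 else
           totals[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
           for i in range(len(actions))]
    i = ucb.index(max(ucb))
    totals[i] += simulate(actions[i])  # rollout and backpropagate to the root
    counts[i] += 1
best = actions[max(range(len(actions)), key=lambda i: counts[i])]  # most-visited arm
```

A full Monte Carlo Tree Search applies this same selection rule recursively down a tree of multi-period hedging decisions; at depth one it is exactly this bandit.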
Results in stochastic control: optimal prediction problems and Markov decision processes
The following thesis is divided into two main topics. The first part studies variations of optimal prediction problems introduced in Shiryaev, Zhou and Xu (2008) and Du Toit and Peskir (2009) in a randomized terminal-time setup and under different families of utility measures. The work presents optimal stopping rules that apply under different criteria, introduces a numerical technique to build approximations of stopping boundaries for fixed terminal-time problems, and suggests that previously reported stopping rules extend to certain generalizations of these measures.
The second part of the thesis is concerned with analysing optimal wealth allocation techniques within a defaultable financial market similar to Bielecki and Jang (2007). It studies a portfolio optimization problem combining a continuous-time jump market and a defaultable security, and presents numerical solutions through conversion into a Markov Decision Process and characterization of its value function as the unique fixed point of a contracting operator. This work analyses allocation strategies under several families of utility functions, and highlights significant portfolio selection differences from previously reported results.
Optimal control of piecewise deterministic Markov processes with finite time horizon
In this paper we study controlled Piecewise Deterministic Markov Processes with finite time horizon and unbounded rewards. Using an embedding procedure we reduce these problems to discrete-time Markov Decision Processes. Under some continuity and compactness conditions we establish the existence of an optimal policy and show that the value function is the unique solution of the Bellman equation. It is remarkable that this statement is true for unbounded rewards and without any contraction assumptions. Further conditions imply the existence of optimal nonrelaxed controls. We highlight our findings with two examples from financial mathematics.
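With a finite time horizon, the discrete-time Markov Decision Process obtained by such an embedding is typically solved by backward induction on the Bellman equation rather than by contraction-based fixed-point iteration. A minimal sketch on an assumed two-state, two-action model (not the paper's PDMP) looks like:

```python
import numpy as np

# Assumed finite-horizon MDP data for illustration only.
N = 5  # number of decision epochs
P = np.array([
    [[0.7, 0.3], [0.4, 0.6]],  # transition kernel for action 0
    [[0.2, 0.8], [0.9, 0.1]],  # transition kernel for action 1
])  # P[a, s, s']
r = np.array([[1.0, 0.0], [0.0, 2.0]])  # r[a, s]: one-step reward
v = np.zeros(2)                         # terminal value (e.g. terminal utility)
policy = np.zeros((N, 2), dtype=int)
for n in reversed(range(N)):            # solve the Bellman equation backwards
    q = r + P @ v                       # q[a, s]: action-wise reward-to-go
    policy[n] = np.argmax(q, axis=0)    # maximizer at epoch n
    v = np.max(q, axis=0)               # value function at epoch n
```

Note that no discounting or contraction is needed here: the recursion terminates after exactly N steps, which mirrors why the finite-horizon result can dispense with contraction assumptions.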