181 research outputs found

    MDP Algorithms for portfolio optimization problems in pure jump markets

    We consider the problem of maximizing the expected utility of the terminal wealth of a portfolio in a continuous-time pure jump market with a general utility function. This leads to an optimal control problem for Piecewise Deterministic Markov Processes. Using an embedding procedure, we solve the problem by looking at a discrete-time contracting Markov Decision Process. Our aim is to show that this point of view has a number of advantages, in particular as far as computational aspects are concerned. We characterize the value function as the unique fixed point of the dynamic programming operator and prove the existence of optimal portfolios. Moreover, we show that value iteration as well as Howard's policy improvement algorithm work. Finally, we give error bounds when the utility function is approximated and when we discretize the state space. A numerical example is presented and our approach is compared to the approximating Markov chain method.
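
    As a rough illustration of the two algorithms named in this abstract, the sketch below runs value iteration and Howard's policy improvement on a small, randomly generated finite MDP with discount factor beta < 1, so that the dynamic programming operator is a contraction. This is not the paper's embedded PDMP model: the toy MDP, the names random_mdp, value_iteration and policy_iteration, and all parameters are assumptions made for the example.

```python
import numpy as np


def random_mdp(seed, n_actions=3, n_states=5):
    """Hypothetical toy MDP: P has shape (A, S, S), r has shape (A, S)."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_actions, n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)  # make each row a probability vector
    r = rng.random((n_actions, n_states))
    return P, r


def value_iteration(P, r, beta, tol=1e-10, max_iter=10_000):
    """Iterate v <- max_a [r_a + beta * P_a v] until the sup-norm change is below tol."""
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        v_new = (r + beta * (P @ v)).max(axis=0)
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    greedy = (r + beta * (P @ v)).argmax(axis=0)
    return v, greedy


def policy_iteration(P, r, beta, max_iter=100):
    """Howard's policy improvement: exact policy evaluation, then a greedy update."""
    n_states = P.shape[1]
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        P_pi = P[policy, states]  # (S, S): row s is P[policy[s], s, :]
        r_pi = r[policy, states]  # (S,)
        # Policy evaluation: solve (I - beta * P_pi) v = r_pi.
        v = np.linalg.solve(np.eye(n_states) - beta * P_pi, r_pi)
        new_policy = (r + beta * (P @ v)).argmax(axis=0)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return v, policy


P, r = random_mdp(seed=0)
v_vi, pi_vi = value_iteration(P, r, beta=0.9)
v_pi, pi_pi = policy_iteration(P, r, beta=0.9)
print("max difference between the two value functions:", np.max(np.abs(v_vi - v_pi)))
```

    Both routines should approach the same fixed point of the operator v -> max_a [r_a + beta * P_a v]. Value iteration converges geometrically at rate beta, while policy iteration stops after finitely many improvement steps on a finite MDP.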

    Markov decision process algorithms for wealth allocation problems with defaultable bonds

    This paper analyses optimal wealth allocation techniques within a defaultable financial market similar to that of Bielecki and Jang (2007). It studies a portfolio optimization problem combining a continuous-time jump market and a defaultable security, and presents numerical solutions obtained by converting the problem into a Markov decision process and characterizing its value function as the unique fixed point of a contracting operator. This work analyses allocation strategies under several families of utility functions, and highlights significant portfolio selection differences with previously reported results.

    Hedging of Financial Derivative Contracts via Monte Carlo Tree Search

    The construction of approximate replication strategies for derivative contracts in incomplete markets is a key problem of financial engineering. Recently, Reinforcement Learning algorithms for pricing and hedging under realistic market conditions have attracted significant interest. While financial research has mostly focused on variations of Q-learning, in Artificial Intelligence Monte Carlo Tree Search is the recognized state-of-the-art method for various planning problems, such as the games of Hex, Chess, and Go. This article introduces Monte Carlo Tree Search as a method to solve the stochastic optimal control problem underlying the pricing and hedging of financial derivatives. In contrast to Q-learning, it combines reinforcement learning with tree search techniques. As a consequence, Monte Carlo Tree Search has higher sample efficiency, is less prone to over-fitting to specific market models, and generally learns stronger policies faster. In our experiments we find that Monte Carlo Tree Search, being the world champion in games like Chess and Go, is easily capable of directly maximizing the utility of the investor's terminal wealth without an intermediate mathematical theory. Comment: Added figures. Added references. Corrected typos. 15 pages, 5 figures

    Results in stochastic control: optimal prediction problems and Markov decision processes

    This thesis is divided into two main parts. The first part studies variations of the optimal prediction problems introduced in Shiryaev, Zhou and Xu (2008) and Du Toit and Peskir (2009), extending them to a randomized terminal-time setup and to different families of utility measures. The work presents optimal stopping rules that apply under different criteria, introduces a numerical technique to build approximations of stopping boundaries for fixed terminal time problems, and suggests that previously reported stopping rules extend to certain generalizations of these measures. The second part of the thesis analyses optimal wealth allocation techniques within a defaultable financial market similar to that of Bielecki and Jang (2007). It studies a portfolio optimization problem combining a continuous-time jump market and a defaultable security, and presents numerical solutions obtained by converting the problem into a Markov Decision Process and characterizing its value function as the unique fixed point of a contracting operator. This work analyses allocation strategies under several families of utility functions, and highlights significant portfolio selection differences with previously reported results.

    Optimal control of piecewise deterministic Markov processes with finite time horizon

    In this paper we study controlled Piecewise Deterministic Markov Processes with finite time horizon and unbounded rewards. Using an embedding procedure, we reduce these problems to discrete-time Markov Decision Processes. Under some continuity and compactness conditions, we establish the existence of an optimal policy and show that the value function is the unique solution of the Bellman equation. It is remarkable that this statement is true for unbounded rewards and without any contraction assumptions. Further conditions imply the existence of optimal nonrelaxed controls. We illustrate our findings with two examples from financial mathematics.