Dynkin games with Poisson random intervention times
This paper introduces a new class of Dynkin games, where the two players are
allowed to make their stopping decisions at a sequence of exogenous Poisson
arrival times. The value function and the associated optimal stopping strategy
are characterized by the solution of a backward stochastic differential
equation. The paper further applies the model to study the optimal conversion
and calling strategies of convertible bonds, and their asymptotics when the
Poisson intensity goes to infinity.
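As a schematic illustration (the notation here is ours, not the paper's): writing $L \le U$ for the payoff processes of the two players and $\mathcal{T}^{\lambda}$ for the stopping times constrained to take values in the arrival times of an independent Poisson process with intensity $\lambda$, the value of such a game has the form

\[
V \;=\; \inf_{\sigma \in \mathcal{T}^{\lambda}} \sup_{\tau \in \mathcal{T}^{\lambda}} \mathbb{E}\left[ L_{\tau}\, \mathbf{1}_{\{\tau \le \sigma\}} + U_{\sigma}\, \mathbf{1}_{\{\sigma < \tau\}} \right],
\]

and the regime $\lambda \to \infty$, in which stopping opportunities become dense, is the asymptotic limit studied in the paper.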
Modeling continuous-time financial markets with capital gains taxes
We formulate a model of a continuous-time financial market consisting of a bank account with constant interest rate and one risky asset subject to capital gains taxes. We consider the problem of maximizing expected utility from future consumption over an infinite horizon. This is the continuous-time version of the model introduced by Dammon, Spatt and Zhang [11]. The taxation rule is linear, so that it allows for tax credits when capital losses are experienced. In this context, wash sales are optimal. Our main contribution is to derive lower and upper bounds on the value function in terms of the corresponding value in a tax-free and frictionless model. While the upper bound corresponds to the value function in a tax-free model, the lower bound is a consequence of wash sales. As an important implication of these bounds, we derive an explicit first-order expansion of our value function for small interest rate and tax rate coefficients. In order to examine the accuracy of this approximation, we provide a characterization of the value function in terms of the associated dynamic programming equation, and we suggest a numerical approximation scheme based on finite differences and the Howard algorithm. The numerical results show that the first-order Taylor expansion is reasonably accurate for realistic market data.
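The Howard algorithm invoked here is policy iteration: alternate between solving the linear system that evaluates the current control and improving the control pointwise. Below is a minimal self-contained sketch for a generic finite-state discounted problem; the transition matrices, costs, and discount factor are illustrative placeholders, not the paper's market model.

    import numpy as np

    def howard(P, c, beta):
        """Policy iteration (Howard algorithm) for the dynamic programming
        equation V(x) = min_a [ c(x, a) + beta * sum_y P[a, x, y] * V(y) ].

        P: (A, N, N) transition matrices, c: (N, A) running costs,
        beta: discount factor in (0, 1). Returns the value V and the policy."""
        A, N, _ = P.shape
        policy = np.zeros(N, dtype=int)
        while True:
            # Policy evaluation: solve (I - beta * P_policy) V = c_policy.
            P_pol = P[policy, np.arange(N), :]
            c_pol = c[np.arange(N), policy]
            V = np.linalg.solve(np.eye(N) - beta * P_pol, c_pol)
            # Policy improvement: greedy control at every state.
            Q = c + beta * np.einsum('any,y->na', P, V)
            new_policy = Q.argmin(axis=1)
            if np.array_equal(new_policy, policy):
                return V, policy
            policy = new_policy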
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), namely a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces "best available" control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if such policies exist. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches, whenever available.

Comment: This article draws from arXiv:1801.08099 and arXiv:1809.0782
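The reward-shaping mechanism at the heart of this framework, advancing the automaton alongside the MDP and paying reward when an accepting LDBA state is visited, can be sketched as follows. All interfaces here (env, ldba, the 0/1 reward values) are hypothetical stand-ins for illustration, not the authors' implementation.

    import random
    from collections import defaultdict

    def q_learn_ldba(env, ldba, episodes=1000, alpha=0.1, gamma=0.99, eps=0.1):
        """Q-learning on the synchronous product of an MDP and an LDBA.

        env: reset() -> state, step(a) -> (state, done), actions (list);
        ldba: init (state), delta(q, label) -> q', label(s), accepting (set).
        The reward is shaped on the fly from the automaton transitions."""
        Q = defaultdict(float)
        for _ in range(episodes):
            s, q = env.reset(), ldba.init
            done = False
            while not done:
                a = (random.choice(env.actions) if random.random() < eps
                     else max(env.actions, key=lambda b: Q[(s, q, b)]))
                s2, done = env.step(a)
                q2 = ldba.delta(q, ldba.label(s2))        # advance the automaton
                r = 1.0 if q2 in ldba.accepting else 0.0  # shaped reward
                best = 0.0 if done else max(Q[(s2, q2, b)] for b in env.actions)
                Q[(s, q, a)] += alpha * (r + gamma * best - Q[(s, q, a)])
                s, q = s2, q2
        return Q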
Optimal control of continuous-time Markov chains with noise-free observation
We consider an infinite horizon optimal control problem for a continuous-time
Markov chain in a finite set with noise-free partial observation. The
observation process is defined as $Y_t = h(X_t)$, $t \ge 0$, where $h$ is a given map defined on the state space of the chain. The observation is noise-free in the sense that the only source of randomness is the chain $X$ itself. The aim is to minimize a discounted cost functional and study the associated value function $V$. After transforming the control problem with partial observation into one with complete observation (the separated problem) using filtering equations, we provide a link between the value function $v$ associated with the latter control problem and the original value function $V$. Then, we present two different characterizations of $v$ (and indirectly of $V$): on the one hand as the unique fixed point of a suitably defined contraction mapping, and on the other hand as
the unique constrained viscosity solution (in the sense of Soner) of an HJB integro-differential equation. Under suitable assumptions, we finally prove the existence of an optimal control.
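Schematically (the symbols here are ours): the objective is a discounted cost of the form

\[
J(\alpha) \;=\; \mathbb{E}\left[\int_0^{\infty} e^{-\beta t}\, c\big(X_t, \alpha_t\big)\, dt\right], \qquad V \;=\; \inf_{\alpha} J(\alpha), \qquad \beta > 0,
\]

and the separated problem replaces the unobserved chain $X$ by its filter, i.e. the conditional distribution of $X_t$ given the observations up to time $t$, so that the problem becomes one with complete observation on the simplex of probability measures over the finite state space.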
Understanding Model-Based Reinforcement Learning and its Application in Safe Reinforcement Learning
Model-based reinforcement learning algorithms have been shown to achieve successful results on various continuous control benchmarks, but the understanding of model-based methods remains limited. We interpret how model-based methods work through novel experiments on state-of-the-art algorithms, with an emphasis on the model-learning component. We evaluate the role of model learning in policy optimization and propose methods to learn a more accurate model. With a better understanding of model-based reinforcement learning, we then apply model-based methods to solve safe reinforcement learning (RL) problems with near-zero violation of hard constraints throughout training. Drawing an analogy with how humans and animals learn to perform safe actions, we break down the safe RL problem into three stages. First, we train agents in a constraint-free environment to learn a performant policy for reaching high rewards, and simultaneously learn a model of the dynamics. Second, we use model-based methods to plan safe actions and train a safeguarding policy from these actions through imitation. Finally, we propose a factored framework to train an overall policy that mixes the performant policy and the safeguarding policy. This three-stage curriculum ensures near-zero violation of safety constraints at all times. As an advantage of the model-based approach, the sample complexity required in the second and third stages is significantly lower than that of model-free methods, which enables online safe learning. We demonstrate the effectiveness of our methods in various continuous control problems and analyze the advantages over state-of-the-art approaches.
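A minimal sketch of the final policy-mixing step (the third stage); the gating function, the two policies, and their interfaces are hypothetical stand-ins, not the architecture used in the work.

    import numpy as np

    def mixed_action(state, pi_perf, pi_safe, gate):
        """Blend a performant policy with a safeguarding policy.

        pi_perf, pi_safe: state -> continuous action vector;
        gate: state -> weight in [0, 1], high where the safety risk is high.
        The blend defers to the safeguarding policy near constraint boundaries."""
        w = float(np.clip(gate(state), 0.0, 1.0))
        return (1.0 - w) * pi_perf(state) + w * pi_safe(state)

    # Toy usage on a 1-D state with placeholder policies:
    pi_perf = lambda s: np.array([1.0])    # push toward high reward
    pi_safe = lambda s: np.array([-1.0])   # retreat from the unsafe region
    gate = lambda s: 1.0 / (1.0 + np.exp(-10.0 * (s - 0.8)))  # risk rises near s = 1
    print(mixed_action(0.9, pi_perf, pi_safe, gate))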