6,524 research outputs found

    Dynamic programming principle for classical and singular stochastic control with discretionary stopping

    We prove the dynamic programming principle (DPP) in a class of problems where an agent controls a d-dimensional diffusive dynamics via both classical and singular controls and, moreover, is able to terminate the optimisation at a time of her choosing, prior to a given maturity. The time-horizon of the problem is random: it is the smaller of a fixed terminal time and the first exit time of the state dynamics from a Borel set. We consider both the case in which the total available fuel for the singular control is bounded and the case in which it is unbounded. We build upon existing proofs of the DPP and extend results available in the traditional literature on singular control (e.g., Haussmann and Suo, SIAM J. Control Optim., 33, 1995) by relaxing some key assumptions and including the discretionary stopping feature. We also connect with more general versions of the DPP (e.g., Bouchard and Touzi, SIAM J. Control Optim., 49, 2011) by showing in detail how our class of problems meets the abstract requirements therein.

    Interbank lending with benchmark rates: Pareto optima for a class of singular control games

    We analyze a class of stochastic differential games of singular control, motivated by the study of a dynamic model of interbank lending with benchmark rates. We describe Pareto optima for this game and show how they may be achieved through the intervention of a regulator, whose policy is a solution to a singular stochastic control problem. Pareto optima are characterized in terms of the solutions to a new class of Skorokhod problems with piecewise-continuous free boundary. Pareto optimal policies are shown to correspond to the enforcement of endogenous bounds on interbank lending rates. Analytical comparison between Pareto optima and Nash equilibria provides insight into the impact of regulatory intervention on the stability of interbank rates. Comment: 31 pages; 1 figure

    Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition

    This paper presents the MAXQ approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. The paper defines the MAXQ hierarchy, proves formal results on its representational power, and establishes five conditions for the safe use of state abstractions. The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction. The paper evaluates the MAXQ representation and MAXQ-Q through a series of experiments in three domains and shows experimentally that MAXQ-Q (with state abstractions) converges to a recursively optimal policy much faster than flat Q learning. The fact that MAXQ learns a representation of the value function has an important benefit: it makes it possible to compute and execute an improved, non-hierarchical policy via a procedure similar to the policy improvement step of policy iteration. The paper demonstrates the effectiveness of this non-hierarchical execution experimentally. Finally, the paper concludes with a comparison to related work and a discussion of the design tradeoffs in hierarchical reinforcement learning. Comment: 63 pages, 15 figures
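As an illustration of the additive decomposition described in this abstract, the following is a minimal sketch, not the paper's algorithm. It assumes the usual MAXQ form in which the value of choosing subtask a inside parent task i is the value of a plus a completion term, Q(i, s, a) = V(a, s) + C(i, s, a). The task names, the single state "s0", and all numbers are hypothetical and chosen only to make the recursion visible.

```python
from typing import Dict, List, Tuple

# Hypothetical task hierarchy: composite tasks map to their child tasks.
# Tasks that do not appear as keys are treated as primitive actions.
children: Dict[str, List[str]] = {
    "root": ["navigate", "wait"],
    "navigate": ["north", "south"],
}

# Hypothetical expected one-step rewards V(a, s) for primitive actions.
primitive_value: Dict[Tuple[str, str], float] = {
    ("north", "s0"): -1.0,
    ("south", "s0"): -2.0,
    ("wait", "s0"): -5.0,
}

# Hypothetical completion values C(i, s, a): expected reward for finishing
# parent task i after child a has terminated.
completion: Dict[Tuple[str, str, str], float] = {
    ("root", "s0", "navigate"): 3.0,
    ("root", "s0", "wait"): 0.0,
    ("navigate", "s0", "north"): 0.5,
    ("navigate", "s0", "south"): 4.0,
}


def value(task: str, state: str) -> float:
    """V(task, state): table lookup for primitives, max over children otherwise."""
    if task not in children:
        return primitive_value[(task, state)]
    return max(q_value(task, state, child) for child in children[task])


def q_value(task: str, state: str, child: str) -> float:
    """Q(task, state, child) = V(child, state) + C(task, state, child)."""
    return value(child, state) + completion[(task, state, child)]


if __name__ == "__main__":
    print(value("navigate", "s0"))
    print(value("root", "s0"))
```

The root value is never stored directly in this sketch; it is assembled on demand from the primitive values and the completion terms of each level, which is the sense in which the value function is an additive combination of the values of smaller tasks.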

    On two-sided controls of a linear diffusion

    Transferred from Doria

    Numerical investigation of the heterogeneous combustion processes of solid fuels

    A two-phase computational model based on the Euler–Euler approach was developed to investigate the heterogeneous combustion processes of biomass, in the solid carbon phase, inside a newly designed combustion chamber (Model 1). A transient simulation was carried out for a small amount of carbon powder situated in a cup located at the centre of the combustion chamber. A heat source was provided to initiate the combustion, with the air supplied by three injection nozzles. The results show that the combustion is sustained in the chamber, as evidenced by the flame temperature. An axisymmetric combustion model (Model 2) based on the Euler–Lagrange approach was formulated to model the combustion of pulverized coal. Three cases with three different char oxidation models are presented. The predicted results agree well with the available experimental data and show that the combustion inside the reactor is affected by the particle size. A number of simulations were carried out to find the parameter values best suited to predicting NOx pollutants.