5 research outputs found

    Optimal Stopping of a Risk Process when Claims are Covered immediately

    The optimal stopping problem for a risk process with interest rates, in which claims are covered immediately, is considered. An insurance company receives premiums and pays out claims that occur according to a renewal process and that the company has recognized. The capital of the company is invested at some interest rate, and the size of claims increases at a given rate according to an inflation process. The immediate payment of claims decreases the company's investment by a given rate. The aim is to find the stopping time that maximizes the capital of the company. The known models are improved by taking into account a different scheme of claims payment and the possibility that the insurance company rejects a request. This leads to an essentially new risk process, and the solution of the optimal stopping problem is different

    Optimal stopping under partial observation: Near-value iteration

    We propose a new approximate value iteration method, namely near-value iteration (NVI), to solve continuous-state optimal stopping problems under partial observation, which in general cannot be solved analytically and also pose a great challenge to numerical solutions. NVI is motivated by the expression of the value function as the supremum over an uncountable set of linear functions in the belief state. After a smart manipulation of the operations in the updating equation for the value function, we reduce the set to only two functions at every time step, so as to achieve significant computational savings. NVI yields a value function approximation bounded by the tightest lower and upper bounds that can be achieved by existing algorithms in the same class, so the NVI approximation is closer to the true value function than at least one of these bounds. We demonstrate the effectiveness of our approach on an example of pricing American options under stochastic volatility
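
The representation that motivates NVI, a value function written as a supremum of functions that are linear in the belief state, can be illustrated in a small finite setting. The sketch below is not the NVI algorithm. It assumes a hidden variable with only two states and a hypothetical set of three linear functions (`alphas`, invented for illustration), and evaluates the value at a belief as their pointwise maximum, which is convex in the belief.

```python
from typing import Sequence, Tuple


def value(belief: Sequence[float], alphas: Sequence[Tuple[float, ...]]) -> float:
    """Value at a belief: the maximum over linear functions <alpha, belief>."""
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in alphas)


# Hypothetical coefficient vectors, one per linear function (assumption).
alphas = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]

if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        belief = (p, 1.0 - p)  # probability of hidden state 0 and state 1
        print(f"belief={belief} value={value(belief, alphas):.3f}")
```

In the continuous-state problem described in the abstract the set of linear functions is uncountable, which is why reducing it to two functions per time step matters for computation.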

    A Simulation Approach to Optimal Stopping Under Partial Information

    We study the numerical solution of nonlinear partially observed optimal stopping problems. The system state is taken to be a multi-dimensional diffusion and drives the drift of the observation process, which is another multi-dimensional diffusion with correlated noise. Such models where the controller is not fully aware of her environment are of interest in applied probability and financial mathematics. We propose a new approximate numerical algorithm based on the particle filtering and regression Monte Carlo methods. The algorithm maintains a continuous state-space and yields an integrated approach to the filtering and control sub-problems. Our approach is entirely simulation-based and therefore allows for a robust implementation with respect to model specification. We carry out the error analysis of our scheme and illustrate with several computational examples. An extension to discretely observed stochastic volatility models is also considered
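
The filtering half of such a scheme can be sketched with a bootstrap particle filter. This is a toy under stated assumptions, not the authors' algorithm: it uses a hypothetical one-dimensional, discrete-time random-walk state observed in Gaussian noise rather than correlated multi-dimensional diffusions, and it omits the regression Monte Carlo step that would estimate the stopping rule. The function names and all parameter values are illustrative.

```python
import math
import random


def simulate(n_steps, state_sd=0.1, obs_sd=0.5, seed=1):
    """Simulate a hidden random walk and its noisy observations (toy model)."""
    rng = random.Random(seed)
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x += rng.gauss(0.0, state_sd)
        states.append(x)
        observations.append(x + rng.gauss(0.0, obs_sd))
    return states, observations


def bootstrap_filter(observations, n_particles=1000, state_sd=0.1, obs_sd=0.5, seed=0):
    """Return the filtered posterior mean of the hidden state after each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # prior sample
    means = []
    for y in observations:
        # Propagate each particle through the state dynamics.
        particles = [x + rng.gauss(0.0, state_sd) for x in particles]
        # Weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow
            weights, total = [1.0] * n_particles, float(n_particles)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample with replacement in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


if __name__ == "__main__":
    states, observations = simulate(50)
    means = bootstrap_filter(observations)
    print(f"true final state {states[-1]:.3f}, filtered mean {means[-1]:.3f}")
```

In a combined scheme, the particle cloud at each step would serve as the belief state on which continuation values are regressed.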

    Intention Inference and Decision Making with Hierarchical Gaussian Process Dynamics Models

    Anticipation is crucial for fluent human-robot interaction, which allows a robot to independently coordinate its actions with human beings in joint activities. An anticipatory robot relies on a predictive model of its human partners, and selects its own action according to the model's predictions. Intention inference and decision making are key elements towards such anticipatory robots. In this thesis, we present a machine-learning approach to intention inference and decision making, based on Hierarchical Gaussian Process Dynamics Models (H-GPDMs). We first introduce the H-GPDM, a class of generic latent-variable dynamics models. The H-GPDM represents the generative process of complex human movements that are directed by exogenous driving factors. Incorporating the exogenous variables in the dynamics model, the H-GPDM achieves improved interpretation, analysis, and prediction of human movements. While exact inference of the exogenous variables and the latent states is intractable, we introduce an approximate method using variational Bayesian inference, and demonstrate the merits of the H-GPDM in three different applications of human movement analysis. The H-GPDM lays a foundation for the following studies on intention inference and decision making. Intention inference is an essential step towards anticipatory robots. For this purpose, we consider a special case of the H-GPDM, the Intention-Driven Dynamics Model (IDDM), which considers the human partners' intention as exogenous driving factors. The IDDM is applicable to intention inference from observed movements using Bayes' theorem, where the latent state variables are marginalized out. As most robotics applications are subject to real-time constraints, we introduce an efficient online algorithm that allows for real-time intention inference. 
We show that the IDDM achieved state-of-the-art performance in intention inference using two human-robot interaction scenarios, i.e., target prediction for robot table tennis and action recognition for interactive robots. Decision making based on a time series of predictions allows a robot to be proactive in its action selection, which involves a trade-off between the accuracy and confidence of the prediction and the time for executing a selected action. To address the problem of action selection and optimal timing for initiating the movement, we formulate the anticipatory action selection as a Partially Observable Markov Decision Process, where the H-GPDM is adopted to update the belief state and to estimate the transition model. We present two approaches to policy learning and decision making, and show their effectiveness using human-robot table tennis. In addition, we consider decision making solely based on the preference of the human partners, where observations are not sufficient for reliable intention inference. We formulate it as a repeated game and present a learning approach to safe strategies that exploit the humans' preferences. The learned strategy enables action selection when reliable intention inference is not available due to insufficient observation, e.g., for a robot to return served balls from a human table tennis player. In this thesis, we use human-robot table tennis as a running example, where a key bottleneck is the limited amount of time for executing a hitting movement. Movement initiation usually requires an early decision on the type of action, such as a forehand or backhand hitting movement, at least 80 ms before the opponent has hit the ball. The robot, therefore, needs to anticipate the opponent's intended target and act proactively. Using the proposed methods, the robot can predict the intended target of the opponent and initiate an appropriate hitting movement according to the prediction. 
Experimental results show that the proposed intention inference and decision making methods can substantially enhance the capability of the robot table tennis player, using both a physically realistic simulation and a real Barrett WAM robot arm with seven degrees of freedom
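
The Bayes'-theorem step behind intention inference can be sketched as a recursive posterior update over a discrete set of intentions. This is a toy under stated assumptions, not the IDDM: the thesis marginalizes latent states of a Gaussian process dynamics model, whereas the sketch assumes a hypothetical scalar movement feature with a known Gaussian likelihood per intention. The feature, its means, and the standard deviation are invented for illustration; only the forehand/backhand labels come from the abstract.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def update_posterior(prior, observation, means, sd):
    """One Bayes update: posterior(g) is proportional to prior(g) * p(observation | g)."""
    unnormalized = {g: prior[g] * gaussian_pdf(observation, means[g], sd) for g in prior}
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


# Hypothetical mean of the observed feature under each intention (assumption).
means = {"forehand": -0.5, "backhand": 0.5}

posterior = {"forehand": 0.5, "backhand": 0.5}  # uniform prior
for z in (0.4, 0.7, 0.3):  # made-up observations arriving over time
    posterior = update_posterior(posterior, z, means, sd=0.5)

if __name__ == "__main__":
    print(posterior)
```

Because the posterior sharpens as observations accumulate, a decision rule can trade off waiting for more confidence against the time left to execute the movement, which is the timing problem the thesis addresses.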