30 research outputs found

    Uniqueness of Kusuoka Representations

    This paper addresses law invariant coherent risk measures and their Kusuoka representations. By establishing the existence of a minimal representation, we show that every Kusuoka representation can be reduced to its minimal representation. Uniqueness of the risk measure's Kusuoka representation (in a sense specified in the paper) is derived from this initial result. Further, stochastic order relations are employed to identify the minimal Kusuoka representation. It is shown that measures in the minimal representation are extremal with respect to the order relations. The tools are finally employed to provide the minimal representation for important practical examples. Although the Kusuoka representation is usually given only for nonatomic probability spaces, this presentation closes the gap to spaces with atoms.
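
For orientation, a Kusuoka representation is usually written in the form below. This is the standard textbook statement for a nonatomic probability space, not a formula quoted from the paper; the symbols $\rho$, $\mathcal{M}$, $\mu$ and $F_Y^{-1}$ are notational assumptions introduced here.

```latex
% rho: law invariant coherent risk measure, Y: random variable (loss),
% M: a set of probability measures on [0,1], F_Y^{-1}: quantile function of Y.
\rho(Y) \;=\; \sup_{\mu \in \mathcal{M}} \int_{[0,1]} \mathrm{AVaR}_{\alpha}(Y)\,\mu(d\alpha),
\qquad
\mathrm{AVaR}_{\alpha}(Y) \;=\; \frac{1}{1-\alpha}\int_{\alpha}^{1} F_Y^{-1}(t)\,dt
\quad (0 \le \alpha < 1),
\qquad
\mathrm{AVaR}_{1}(Y) \;=\; \operatorname{ess\,sup} Y .
```

Read this way, the question of a minimal representation is the question of how small the set $\mathcal{M}$ can be made while the supremum still reproduces $\rho$.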

    Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes

    We study the minmax optimization problem introduced in [22] for computing policies for batch mode reinforcement learning in a deterministic setting. First, we show that this problem is NP-hard. In the two-stage case, we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. We also prove theoretically and illustrate empirically that both relaxation schemes provide better results than those given in [22].

    Building up time-consistency for risk measures and dynamic optimization

    In stochastic optimal control, one deals with sequential decision-making under uncertainty; with dynamic risk measures, one assesses stochastic processes (costs) as time goes on and information accumulates. Under the same term of time-consistency (or dynamic consistency), the two theories coin two different notions: the latter is consistency between successive evaluations of a stochastic process by a dynamic risk measure (a form of monotonicity); the former is consistency between solutions to intertemporal stochastic optimization problems. Interestingly, both notions meet in their use of dynamic programming, or nested, equations. We provide a theoretical framework that offers (i) basic ingredients to jointly define dynamic risk measures and corresponding intertemporal stochastic optimization problems, and (ii) common sets of assumptions that lead to time-consistency for both. We highlight the role of time and risk preferences (materialized in one-step aggregators) in time-consistency. Depending on how one moves from one-step time and risk preferences to intertemporal time and risk preferences, and depending on their compatibility (commutation), one will or will not observe time-consistency. We also shed light on the relevance of information structure by giving an explicit role to a state control dynamical system, with a state that parameterizes risk measures and is the input to optimal policies.

    Some Remarks on Stochastic Versions of the Ramsey Growth Model *

    Abstract. In this note we focus on stochastic versions of the Ramsey growth model in which either the expected value of the considered utility function is to be maximized over a given time horizon, or the maximal average utility is to be obtained over an infinite time horizon. In contrast to the standard Ramsey economic growth model, we assume that the production function considered in the economy model is influenced by a random factor with some specific properties. The aim is to discuss various approaches suitable for finding an optimal policy for the "stochasticized" Ramsey model. To this end, we summarize basic features of multistage stochastic programming and stochastic dynamic programming, the two main methodologies that can be used to handle the above problem. Finally, we show how these approaches can be employed for finding optimal control policies for the "stochasticized" versions of the Ramsey problem if full or only partial information on the development of the economy over time is available.

    Keywords: Economic dynamics, Ramsey growth model with disturbance, multistage stochastic programs, stochastic dynamic programming, finding optimal policies.

    JEL classification: C61, E21, E22

    AMS classification: 90C40, 91B15, 91B16

    Ramsey growth model

    The heart of the seminal paper of F. Ramsey [15] on the mathematical theory of saving is an economy producing output from labour and capital; the task is to decide how to divide production between consumption and capital accumulation so as to maximize the global utility of consumption. Ramsey's model is purely deterministic and was originally considered in a continuous-time setting; Ramsey suggested some variational methods for finding an optimal policy for dividing production between consumption and capital accumulation. In the present section we formulate the Ramsey model in the discrete-time setting, similarly to the recent literature on economic growth models (see e.g. Le Van and Dana [4], Heer and Mauße

    Minimax decision rules for planning under uncertainty: drawbacks and remedies

    It is common to use minimax rules to make planning decisions when there is great uncertainty about what may happen in the future. Using minimax rules avoids the need to determine probabilities for each future scenario, which is an attractive feature in many public sector settings. However, there are potential problems in the application of a minimax approach. In this paper our aim is to give guidance for planners considering a minimax approach, including minimax regret, which is one popular version of it. We analyse the behaviour of minimax rules in the case of a finite set of possible future scenarios. Minimax rules are sensitive to the choice of a small number of scenarios. When regret-based rules are used, further problems arise because the independence of irrelevant alternatives property fails, which can lead to opportunities to game the process. We analyse these phenomena considering cases where the decision variables are chosen from a convex set in Rⁿ, as well as cases with a finite set of decision choices. We show that the drawbacks of minimax regret hold even when restrictions are placed on the problem setup, and we show how working with a structured set of scenarios can ameliorate the difficulty of having a final decision depend on the characteristics of just a handful of extreme scenarios.
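
The failure of independence of irrelevant alternatives mentioned in this abstract can be illustrated with a small finite example. The sketch below is not taken from the paper: the function name `minimax_regret`, the decision labels and the payoff numbers are all hypothetical, and payoffs are assumed to be "higher is better".

```python
def minimax_regret(payoffs):
    """Pick the decision with the smallest worst-case regret.

    payoffs maps each decision label to a list of payoffs, one per scenario
    (hypothetical data; higher payoff is better).
    Returns (chosen_decision, worst_regret_by_decision).
    """
    n_scenarios = len(next(iter(payoffs.values())))
    # Best achievable payoff in each scenario over the available decisions.
    best = [max(p[s] for p in payoffs.values()) for s in range(n_scenarios)]
    # Regret = shortfall against that best payoff; keep the worst scenario.
    worst_regret = {
        d: max(best[s] - p[s] for s in range(n_scenarios))
        for d, p in payoffs.items()
    }
    choice = min(worst_regret, key=worst_regret.get)
    return choice, worst_regret


# Two decisions, two scenarios: A has worst regret 3, B has worst regret 4.
two_options = {"A": [8, 2], "B": [4, 5]}
# Adding C changes the scenario-wise benchmarks: A and C now have worst
# regret 8 and B has worst regret 5, although C itself is not selected.
three_options = {"A": [8, 2], "B": [4, 5], "C": [0, 10]}

print(minimax_regret(two_options))
print(minimax_regret(three_options))
```

In this toy instance the preferred decision switches from A to B once C is put on the table, even though C is never chosen, which is the kind of behaviour that could be exploited to game a planning process.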

    Best-Arm Identification for Quantile Bandits with Privacy

    We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private) successive elimination algorithm for strictly optimal best-arm identification; we show that our algorithm is δ-PAC and we characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem: as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support size, and we characterize its sample complexity as well. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand. Comment: 24 pages, 4 figures
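
To make the idea of successive elimination on quantiles concrete, here is a generic, non-private sketch. It is not the paper's algorithm: the confidence bands come from the Dvoretzky-Kiefer-Wolfowitz inequality with a union bound over arms and rounds, which is an assumption made here for illustration, and the names `successive_elimination`, `empirical_quantile`, `tau`, `delta` and `max_rounds` are hypothetical.

```python
import bisect
import math
import random


def empirical_quantile(sorted_xs, p):
    """Smallest sample whose empirical CDF value is at least p (0 < p <= 1)."""
    n = len(sorted_xs)
    k = min(n, max(1, math.ceil(p * n)))
    return sorted_xs[k - 1]


def successive_elimination(arms, tau, delta, max_rounds=5000):
    """Generic quantile successive elimination (illustrative sketch only).

    arms: list of zero-argument callables returning one reward sample each.
    tau: quantile level in (0, 1); delta: allowed failure probability.
    Returns (index of the surviving arm or None, number of rounds used).
    """
    n_arms = len(arms)
    active = list(range(n_arms))
    samples = {i: [] for i in active}  # kept sorted via bisect.insort
    for t in range(1, max_rounds + 1):
        for i in active:
            bisect.insort(samples[i], arms[i]())
        # DKW band width; the t*t term pays for a union bound over rounds.
        eps = math.sqrt(math.log(4 * n_arms * t * t / delta) / (2 * t))
        bounds = {}
        for i in active:
            xs = samples[i]
            lo = empirical_quantile(xs, tau - eps) if tau - eps > 0 else -math.inf
            hi = empirical_quantile(xs, tau + eps) if tau + eps <= 1 else math.inf
            bounds[i] = (lo, hi)
        best_lo = max(lo for lo, _ in bounds.values())
        # Drop arms whose upper bound falls below the best lower bound.
        active = [i for i in active if bounds[i][1] >= best_lo]
        if len(active) == 1:
            return active[0], t
    return None, max_rounds


# Hypothetical usage: three Gaussian arms with medians 0, 1 and 3.
rng = random.Random(0)
arms = [lambda m=m: rng.gauss(m, 1.0) for m in (0.0, 1.0, 3.0)]
best, rounds = successive_elimination(arms, tau=0.5, delta=0.1)
print(best, rounds)
```

If two arms share the same quantile, the loop above may never separate them and returns `None` after `max_rounds`, which is consistent with the abstract's remark that identification becomes impossible as the gap approaches zero.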