Uniqueness of Kusuoka Representations
This paper addresses law invariant coherent risk measures and their Kusuoka
representations. By elaborating the existence of a minimal representation we
show that every Kusuoka representation can be reduced to its minimal
representation. Uniqueness -- in a sense specified in the paper -- of the risk
measure's Kusuoka representation is derived from this initial result.
Further, stochastic order relations are employed to identify the minimal
Kusuoka representation. It is shown that measures in the minimal representation
are extremal with respect to the order relations. The tools are finally
employed to provide the minimal representation for important practical
examples. Although the Kusuoka representation is usually given only for
nonatomic probability spaces, this presentation closes the gap to spaces with
atoms.
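For orientation, a Kusuoka representation writes a law invariant coherent risk measure as a supremum, over a set of probability measures on [0, 1), of mixtures of Average Value-at-Risk (AVaR). The Python sketch below evaluates such a supremum on an equiprobable finite sample, that is, on a space with atoms. The function names `avar` and `kusuoka_sup`, the convention that larger losses are worse, and the two mixing measures are illustrative assumptions, not taken from the paper; the sketch does not address minimality of the representation.

```python
def avar(losses, alpha):
    """Average Value-at-Risk at level alpha in [0, 1) of an equiprobable sample.

    Computes (1 / (1 - alpha)) * integral of the empirical quantile
    function over [alpha, 1]; larger losses are treated as worse.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    xs = sorted(losses)
    n = len(xs)
    total = 0.0
    for k, x in enumerate(xs):
        # The empirical quantile function equals x on (k/n, (k+1)/n].
        lo, hi = k / n, (k + 1) / n
        total += x * max(0.0, hi - max(lo, alpha))
    return total / (1.0 - alpha)


def kusuoka_sup(losses, measures):
    """Supremum over finitely many discrete mixing measures.

    Each measure is a dict {alpha: weight} whose weights sum to one
    (hypothetical input format chosen for this sketch).
    """
    return max(
        sum(w * avar(losses, a) for a, w in mu.items())
        for mu in measures
    )


losses = [1.0, 2.0, 3.0, 4.0]
measures = [{0.0: 1.0}, {0.0: 0.5, 0.5: 0.5}]
print(kusuoka_sup(losses, measures))
```

Here the first measure gives the plain expectation and the second mixes the expectation with AVaR at level 0.5, so the supremum is attained by the second, more conservative measure.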
Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes
We study the minmax optimization problem introduced in [22] for computing
policies for batch mode reinforcement learning in a deterministic setting.
First, we show that this problem is NP-hard. In the two-stage case, we provide
two relaxation schemes. The first relaxation scheme works by dropping some
constraints in order to obtain a problem that is solvable in polynomial time.
The second relaxation scheme, based on a Lagrangian relaxation where all
constraints are dualized, leads to a conic quadratic programming problem. We
also theoretically prove and empirically illustrate that both relaxation
schemes provide better results than those given in [22].
Building up time-consistency for risk measures and dynamic optimization
In stochastic optimal control, one deals with sequential decision-making under uncertainty; with dynamic risk measures, one assesses stochastic processes (costs) as time goes on and information accumulates. Under the same term of time-consistency (or dynamic-consistency), the two theories coin different notions: the latter is consistency between successive evaluations of a stochastic process by a dynamic risk measure (a form of monotonicity); the former is consistency between solutions to intertemporal stochastic optimization problems. Interestingly, both notions meet in their use of dynamic programming, or nested, equations. We provide a theoretical framework that offers i) basic ingredients to jointly define dynamic risk measures and corresponding intertemporal stochastic optimization problems and ii) common sets of assumptions that lead to time-consistency for both. We highlight the role of time and risk preferences, materialized in one-step aggregators, in time-consistency. Depending on how one moves from one-step time and risk preferences to intertemporal time and risk preferences, and depending on their compatibility (commutation), one will or will not observe time-consistency. We also shed light on the relevance of information structure by giving an explicit role to a state control dynamical system, with a state that parameterizes risk measures and is the input to optimal policies.
Some Remarks on Stochastic Versions of the Ramsey Growth Model
Abstract. In this note we focus on stochastic versions of the Ramsey growth model in which either the expected value of the considered utility function is to be maximized over a given time horizon, or the maximal average utility is to be obtained over an infinite time horizon. In contrast to the standard Ramsey economy growth model, we assume that the production function considered in the economy model is influenced by some random factor with some specific properties. The aim is to discuss various approaches suitable for finding an optimal policy of the "stochasticized" Ramsey model. To this end, we summarize basic features of multistage stochastic programming and stochastic dynamic programming, the two main methodologies that can be used to handle the above problem. Finally, we show how these approaches can be employed for finding optimal control policies for the "stochasticized" versions of the Ramsey problem if full or only partial information on the development of the economy over time is available. Keywords: Economic dynamics, Ramsey growth model with disturbance, multistage stochastic programs, stochastic dynamic programming, finding optimal policies. JEL classification: C61, E21, E22. AMS classification: 90C40, 91B15, 91B16. Ramsey growth model. The heart of the seminal paper of F. Ramsey [15] on the mathematical theory of saving is an economy producing output from labour and capital, and the task is to decide how to divide production between consumption and capital accumulation to maximize the global utility of the consumption. Ramsey's model is purely deterministic and was originally considered in a continuous-time setting; Ramsey suggested some variational methods for finding an optimal policy for dividing the production between consumption and capital accumulation. In the present section we formulate the Ramsey model in the discrete-time setting similarly as in the recent literature on economic growth models (see e.g. Le Van and Dana [4], Heer and Mauße
Minimax decision rules for planning under uncertainty: drawbacks and remedies
It is common to use minimax rules to make planning decisions when there is great uncertainty about what may happen in the future. Using minimax rules avoids the need to determine probabilities for each future scenario, which is an attractive feature in many public sector settings. However, there are potential problems in the application of a minimax approach. In this paper our aim is to give guidance for planners considering a minimax approach, including minimax regret, which is one popular version of this. We give an analysis of the behaviour of minimax rules in the case with a finite set of possible future scenarios. Minimax rules are sensitive to the choice of a small number of scenarios. When regret-based rules are used, further problems arise because the independence of irrelevant alternatives property fails, which can lead to opportunities to game the process. We analyse these phenomena considering cases where the decision variables are chosen from a convex set in Rⁿ, as well as cases with a finite set of decision choices. We show that the drawbacks of minimax regret hold even when restrictions are placed on the problem setup, and we show how working with a structured set of scenarios can ameliorate the difficulty of having a final decision depend on the characteristics of just a handful of extreme scenarios.
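The failure of independence of irrelevant alternatives under minimax regret can be seen on a small finite example. The Python sketch below is an illustration only: the function name `minimax_regret`, the cost tables, and the decision labels A, B, C are assumptions made up for this example and do not come from the paper. Costs are to be minimized, and the regret of a decision in a scenario is its cost minus the best cost achievable in that scenario.

```python
def minimax_regret(costs):
    """Return (chosen decision, max regret per decision).

    costs maps each decision name to a list of costs, one per scenario
    (lower is better). All lists must have the same length.
    """
    n_scenarios = len(next(iter(costs.values())))
    # Best achievable cost in each scenario over the available decisions.
    best = [min(c[s] for c in costs.values()) for s in range(n_scenarios)]
    # Worst-case regret of each decision across scenarios.
    max_regret = {
        d: max(c[s] - best[s] for s in range(n_scenarios))
        for d, c in costs.items()
    }
    choice = min(max_regret, key=max_regret.get)
    return choice, max_regret


two_options = {"A": [2, 8], "B": [5, 6]}
three_options = {"A": [2, 8], "B": [5, 6], "C": [9, 1]}

print(minimax_regret(two_options))    # A is chosen (max regrets: A 2, B 3)
print(minimax_regret(three_options))  # B is chosen (max regrets: A 7, B 5, C 7)
```

Adding option C, which is never selected, lowers the best cost in the second scenario and so raises the regret of A more than that of B, flipping the choice from A to B. A party able to propose such an alternative could therefore steer the outcome.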
Best-Arm Identification for Quantile Bandits with Privacy
We study the best-arm identification problem in multi-armed bandits with
stochastic, potentially private rewards, when the goal is to identify the arm
with the highest quantile at a fixed, prescribed level. First, we propose a
(non-private) successive elimination algorithm for strictly optimal best-arm
identification; we show that our algorithm is δ-PAC and we characterize
its sample complexity. Further, we provide a lower bound on the expected number
of pulls, showing that the proposed algorithm is essentially optimal up to
logarithmic factors. Both upper and lower complexity bounds depend on a special
definition of the associated suboptimality gap, designed in particular for the
quantile bandit problem; as we show, when the gap approaches zero, best-arm
identification is impossible. Second, motivated by applications where the
rewards are private, we provide a differentially private successive elimination
algorithm whose sample complexity is finite even for distributions with
infinite support-size, and we characterize its sample complexity as well. Our
algorithms do not require prior knowledge of either the suboptimality gap or
other statistical information related to the bandit problem at hand. Comment: 24 pages, 4 figures
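To make the idea of successive elimination on quantiles concrete, here is a minimal non-private Python sketch. It is not the algorithm from the paper: the confidence intervals come from the Dvoretzky-Kiefer-Wolfowitz inequality with a union bound over arms and rounds, and the function names, the `batch` and `max_rounds` parameters, and the Gaussian test arms are all assumptions chosen for illustration.

```python
import math
import random


def quantile_bounds(sorted_xs, q, eps):
    """Lower and upper confidence bounds on the q-quantile from order statistics."""
    n = len(sorted_xs)
    lo_level, hi_level = q - eps, q + eps
    # If the shifted level leaves (0, 1), no finite bound is available.
    lower = -math.inf if lo_level <= 0 else sorted_xs[math.ceil(lo_level * n) - 1]
    upper = math.inf if hi_level >= 1 else sorted_xs[math.ceil(hi_level * n) - 1]
    return lower, upper


def quantile_successive_elimination(arms, q, delta, batch=50, max_rounds=200):
    """Return the index of the arm believed to have the highest q-quantile.

    arms is a list of zero-argument callables, each returning one reward.
    """
    samples = {i: [] for i in range(len(arms))}
    active = set(samples)
    for r in range(1, max_rounds + 1):
        for i in active:
            samples[i].extend(arms[i]() for _ in range(batch))
        n = r * batch
        # DKW radius with failure probability delta / (2 * K * r^2) per arm and round.
        eps = math.sqrt(math.log(4 * len(arms) * r * r / delta) / (2 * n))
        bounds = {i: quantile_bounds(sorted(samples[i]), q, eps) for i in active}
        best_lower = max(lower for lower, _ in bounds.values())
        # Keep only arms whose upper bound still reaches the best lower bound.
        active = {i for i in active if bounds[i][1] >= best_lower}
        if len(active) == 1:
            return next(iter(active))
    # Budget exhausted: fall back to the best empirical quantile among survivors.
    return max(active, key=lambda i: quantile_bounds(sorted(samples[i]), q, 0.0)[1])


rng = random.Random(0)
arms = [lambda: rng.gauss(0, 1), lambda: rng.gauss(1, 1), lambda: rng.gauss(5, 1)]
print(quantile_successive_elimination(arms, q=0.5, delta=0.05))
```

The sketch needs no prior knowledge of the gap between arms: the confidence radius shrinks with the number of pulls, and an arm is dropped as soon as its interval falls entirely below that of another arm.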