8,928 research outputs found

Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach

Author: Pitis Silviu
Publication date: 07/02/2019

Reinforcement learning (RL) agents have traditionally been tasked with maximizing the value function of a Markov decision process (MDP), either in continuous settings, with fixed discount factor $\gamma < 1$, or in episodic settings, with $\gamma = 1$. While this has proven effective for specific tasks with well-defined objectives (e.g., games), it has never been established that fixed discounting is suitable for general purpose use (e.g., as a model of human preferences). This paper characterizes rationality in sequential decision making using a set of seven axioms and arrives at a form of discounting that generalizes traditional fixed discounting. In particular, our framework admits a state-action dependent "discount" factor that is not constrained to be less than 1, so long as there is eventual long run discounting. Although this broadens the range of possible preference structures in continuous settings, we show that there exists a unique "optimizing MDP" with fixed $\gamma < 1$ whose optimal value function matches the true utility of the optimal policy, and we quantify the difference between value and utility for suboptimal policies. Our work can be seen as providing a normative justification for (a slight generalization of) Martha White's RL task formalism (2017) and other recent departures from traditional RL, and is relevant to task specification in RL, inverse RL and preference-based RL.

Comment: 8 pages + 1 page supplement. In proceedings of AAAI 2019. Slides, poster and bibtex available at https://silviupitis.com/#rethinking-the-discount-factor-in-reinforcement-learning-a-decision-theoretic-approac
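As a rough illustration of the state-action dependent discounting described in the abstract above, the sketch below computes the return of a finite trajectory under a discount function. It is not taken from the paper: the trajectory representation, the function name `discounted_return`, and the convention that each reward is weighted by the product of the discounts of all earlier steps are assumptions made for this example.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical representation of one step: (state, action, reward).
Step = Tuple[str, str, float]


def discounted_return(
    trajectory: Sequence[Step],
    discount: Callable[[str, str], float],
) -> float:
    """Sum rewards, weighting each by the product of earlier discounts.

    Assumed convention: the first reward has weight 1, and the weight is
    multiplied by discount(state, action) after each step.
    """
    total = 0.0
    weight = 1.0
    for state, action, reward in trajectory:
        total += weight * reward
        weight *= discount(state, action)
    return total


trajectory = [("s0", "a", 1.0), ("s1", "a", 1.0), ("s2", "a", 1.0)]

# Traditional fixed discounting: the same factor at every step.
fixed = discounted_return(trajectory, lambda s, a: 0.5)

# State-dependent factor that exceeds 1 in s0 but discounts afterwards.
variable = discounted_return(
    trajectory, lambda s, a: 1.2 if s == "s0" else 0.5
)
```

With a constant discount function this reduces to the usual fixed-discount return; the second call shows a factor above 1 in one state, followed by discounting in later states.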

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

The cost of information

Author: Pomatto Luciano, Strack Philipp, Tamuz Omer
Publication date: 04/02/2019

We develop an axiomatic theory of information acquisition that captures the idea of constant marginal costs in information production: the cost of generating two independent signals is the sum of their costs, and generating a signal with probability half costs half its original cost. Together with monotonicity and continuity conditions, these axioms determine the cost of a signal up to a vector of parameters. These parameters have a clear economic interpretation and determine the difficulty of distinguishing states. We argue that this cost function is a versatile modeling tool that leads to more realistic predictions than mutual information.

Comment: 52 pages, 4 figures
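The two constant-marginal-cost conditions stated in the abstract above can be written compactly as follows. The notation is an assumption made for illustration and may differ from the paper's: $C$ denotes the cost of a signal, $\mu \otimes \nu$ the joint observation of two independent signals $\mu$ and $\nu$, and $\tfrac{1}{2} \cdot \mu$ the signal that produces $\mu$ with probability one half and is uninformative otherwise.

```latex
\begin{align*}
  C(\mu \otimes \nu) &= C(\mu) + C(\nu)
    && \text{(independent signals: costs add)} \\
  C\!\left(\tfrac{1}{2} \cdot \mu\right) &= \tfrac{1}{2}\, C(\mu)
    && \text{(signal generated with probability half: half the cost)}
\end{align*}
```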

arXiv.org e-Print Archive

Expected Multi-Utility Theorems with Topological Continuity Axioms

Author: Evren Özgür

Research Papers in Economics

A unified theory of cone metric spaces and its applications to the fixed point theory

Author: Proinov Petko D.
Publication venue: Springer Science and Business Media LLC
Publication date: 21/11/2011

In this paper we develop a unified theory for cone metric spaces over a solid vector space. As an application of the new theory we present full statements of the iterated contraction principle and the Banach contraction principle in cone metric spaces over a solid vector space.

Comment: 51 pages

arXiv.org e-Print Archive

Springer - Publisher Connector