Discounting of reward sequences: a test of competing formal models of hyperbolic discounting

Authors: Alexander William; Brown Joshua W; Zarr Noah
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2014

Humans are known to discount future rewards hyperbolically in time. Nevertheless, a formal recursive model of hyperbolic discounting has been elusive until recently, with the introduction of the hyperbolically discounted temporal difference (HDTD) model. Prior to that, models of learning (especially reinforcement learning) have relied on exponential discounting, which generally provides poorer fits to behavioral data. Recently, it has been shown that hyperbolic discounting can also be approximated by a summed distribution of exponentially discounted values, instantiated in the μAgents model. The HDTD model and the μAgents model differ in one key respect, namely how they treat sequences of rewards. The μAgents model is a particular implementation of a Parallel discounting model, which values sequences based on the summed value of the individual rewards, whereas the HDTD model contains a non-linear interaction. To discriminate among these models, we observed how subjects discounted a sequence of three rewards, and then we tested how well each candidate model fit the subject data. The results show that the Parallel model generally provides a better fit to the human data.
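The discounting ideas named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default parameters `k` and `gamma`, and the uniform grid of discount factors are all assumptions chosen for illustration. The mixture uses the identity that the average of γ^D over γ uniform on (0, 1) equals 1/(1 + D), which is one way a sum of exponential discounters can reproduce a hyperbolic curve. The HDTD model's non-linear interaction is not sketched, since the abstract does not give its form.

```python
def hyperbolic(amount, delay, k=1.0):
    """Hyperbolic discounting: value falls off as 1 / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, gamma=0.9):
    """Exponential discounting with a single per-step factor gamma."""
    return amount * gamma ** delay


def mixture_of_exponentials(amount, delay, n_agents=10000):
    """Average of many exponential discounters (hypothetical setup).

    The discount factors are the midpoints of n_agents equal bins on
    (0, 1), so the average approximates the integral of gamma**delay
    over (0, 1), which equals 1 / (1 + delay).
    """
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]
    return amount * sum(g ** delay for g in gammas) / n_agents


def parallel_sequence_value(rewards, k=1.0):
    """Parallel model: a sequence is worth the sum of its rewards,
    each discounted independently. `rewards` is a list of
    (amount, delay) pairs."""
    return sum(hyperbolic(a, d, k) for a, d in rewards)


if __name__ == "__main__":
    for d in (0, 1, 3, 10):
        print(d, hyperbolic(1.0, d), mixture_of_exponentials(1.0, d),
              exponential(1.0, d))
    # A sequence of three rewards, as in the experiment described above.
    print(parallel_sequence_value([(10, 0), (10, 1), (10, 2)]))
```

With `k = 1`, the mixture and the hyperbolic curve agree closely at every delay, while the single exponential does not.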

Directory of Open Access Journals

Frontiers - Publisher Connector

Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet

Author: Hutter Marcus
Publication date: 01/01/2002

Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, and Solomonoff's prediction scheme in particular, will be studied. The probability of observing $x_t$ at time $t$, given past observations $x_1...x_{t-1}$, can be computed with the chain rule if the true generating distribution $\mu$ of the sequences $x_1x_2x_3...$ is known. If $\mu$ is unknown, but known to belong to a countable or continuous class $\mathcal{M}$, one can base one's prediction on the Bayes-mixture $\xi$ defined as a $w_\nu$-weighted sum or integral of distributions $\nu\in\mathcal{M}$. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on $\xi$ is shown to be close to the loss of the Bayes-optimal, but infeasible prediction scheme based on $\mu$. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of $\xi$ and give an Occam's razor argument that the choice $w_\nu\sim 2^{-K(\nu)}$ for the weights is optimal, where $K(\nu)$ is the length of the shortest program describing $\nu$. The results are applied to games of chance, defined as a sequence of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets, partial, delayed and probabilistic prediction, classification, and more active systems are briefly discussed. Comment: 34 page
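As a rough illustration of the Bayes-mixture idea in this abstract, the following Python sketch predicts the next bit of a binary sequence from a small finite class of Bernoulli models. The finite class, the binary alphabet, and the names `mixture_predict`, `thetas`, and `weights` are assumptions made for this example; the paper itself covers general alphabets, losses, and countable or continuous classes. The mixture probability of a history is the prior-weighted sum of each model's likelihood, and the prediction is the ratio of the mixture probabilities with and without the next symbol appended.

```python
def mixture_predict(thetas, weights, history):
    """Probability that the next symbol is 1 under a Bayes mixture.

    thetas  : P(x = 1) for each candidate Bernoulli model (hypothetical class)
    weights : prior weights w_nu, assumed positive and summing to 1
    history : list of observed bits (0 or 1)
    """
    joint = []
    for theta, w in zip(thetas, weights):
        likelihood = 1.0
        for x in history:
            likelihood *= theta if x == 1 else 1.0 - theta
        # prior weight times the likelihood of the history under this model
        joint.append(w * likelihood)
    xi_history = sum(joint)  # mixture probability of the observed history
    # mixture probability of (history followed by 1), divided by xi(history)
    xi_extended = sum(j * theta for j, theta in zip(joint, thetas))
    return xi_extended / xi_history


if __name__ == "__main__":
    thetas = [0.2, 0.8]
    weights = [0.5, 0.5]
    print(mixture_predict(thetas, weights, []))         # prior prediction
    print(mixture_predict(thetas, weights, [1, 1, 1]))  # after three ones
```

After a few observations, the prediction moves toward the model that best explains the data, which is the behaviour the loss bounds in the abstract quantify.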

arXiv.org e-Print Archive

CiteSeerX

Source Coding When the Side Information May Be Delayed

Authors: Permuter Haim H.; Simeone Osvaldo
Publication date: 17/07/2012

For memoryless sources, delayed side information at the decoder does not improve the rate-distortion function. However, this is not the case for more general sources with memory, as demonstrated by a number of works focusing on the special case of (delayed) feedforward. In this paper, a setting is studied in which the encoder is potentially uncertain about the delay with which measurements of the side information are acquired at the decoder. Assuming a hidden Markov model for the sources, at first, a single-letter characterization is given for the set-up where the side information delay is arbitrary and known at the encoder, and the reconstruction at the destination is required to be (near) lossless. Then, with delay equal to zero or one source symbol, a single-letter characterization is given of the rate-distortion region for the case where side information may be delayed or not, unbeknownst to the encoder. The characterization is further extended to allow for additional information to be sent when the side information is not delayed. Finally, examples for binary and Gaussian sources are provided. Comment: revised July 201

arXiv.org e-Print Archive

CiteSeerX