
    Discounting of reward sequences: a test of competing formal models of hyperbolic discounting

    Humans are known to discount future rewards hyperbolically in time. Nevertheless, a formal recursive model of hyperbolic discounting has been elusive until recently, with the introduction of the hyperbolically discounted temporal difference (HDTD) model. Prior to that, models of learning (especially reinforcement learning) had relied on exponential discounting, which generally provides poorer fits to behavioral data. Recently, it has been shown that hyperbolic discounting can also be approximated by a summed distribution of exponentially discounted values, instantiated in the μAgents model. The HDTD model and the μAgents model differ in one key respect, namely how they treat sequences of rewards. The μAgents model is a particular implementation of a Parallel discounting model, which values sequences based on the summed value of the individual rewards, whereas the HDTD model contains a non-linear interaction. To discriminate among these models, we observed how subjects discounted a sequence of three rewards, and then we tested how well each candidate model fit the subject data. The results show that the Parallel model generally provides a better fit to the human data.
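The discounting ideas named in this abstract can be illustrated with a small sketch. It is not the paper's HDTD or μAgents implementation. It assumes the standard hyperbolic form V = A / (1 + kD), an exponential form V = A·γ^D, and, purely for illustration, a uniform spread of discount factors γ over (0, 1) for the summed-exponential approximation. All function names and parameter values are hypothetical.

```python
def hyperbolic(amount, delay, k=1.0):
    """Hyperbolically discounted value A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, gamma=0.9):
    """Exponentially discounted value A * gamma ** D."""
    return amount * gamma ** delay


def mixture_of_exponentials(amount, delay, n_agents=10000):
    """Average of exponential discounters with gamma spread uniformly on (0, 1).

    Assumption for illustration: a uniform distribution of gamma. In that case
    the average approximates the integral of gamma ** D over [0, 1], which
    equals 1 / (1 + D), i.e. hyperbolic discounting with k = 1.
    """
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]  # bin midpoints
    return amount * sum(g ** delay for g in gammas) / n_agents


def parallel_sequence_value(rewards, k=1.0):
    """Parallel model: value of a sequence is the sum of the individually
    discounted rewards. `rewards` is a list of (amount, delay) pairs."""
    return sum(hyperbolic(a, d, k) for a, d in rewards)


if __name__ == "__main__":
    for d in (0, 1, 3, 10):
        print(d, hyperbolic(1.0, d), mixture_of_exponentials(1.0, d))
    print(parallel_sequence_value([(10, 0), (10, 1), (10, 3)]))
```

Under these assumptions the averaged exponentials track the hyperbolic curve closely at integer delays, and the Parallel valuation of a three-reward sequence is just the sum of three single-reward values. A model with a non-linear interaction, as the abstract describes for HDTD, would not decompose this way.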

    Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet

    Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, and Solomonoff's prediction scheme in particular, will be studied. The probability of observing $x_t$ at time $t$, given past observations $x_1 \ldots x_{t-1}$, can be computed with the chain rule if the true generating distribution $\mu$ of the sequences $x_1 x_2 x_3 \ldots$ is known. If $\mu$ is unknown, but known to belong to a countable or continuous class $\mathcal{M}$, one can base one's prediction on the Bayes-mixture $\xi$ defined as a $w_\nu$-weighted sum or integral of distributions $\nu \in \mathcal{M}$. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on $\xi$ is shown to be close to the loss of the Bayes-optimal, but infeasible prediction scheme based on $\mu$. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of $\xi$ and give an Occam's razor argument that the choice $w_\nu \sim 2^{-K(\nu)}$ for the weights is optimal, where $K(\nu)$ is the length of the shortest program describing $\nu$. The results are applied to games of chance, defined as a sequence of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets, partial, delayed and probabilistic prediction, classification, and more active systems are briefly discussed. Comment: 34 page
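As a rough illustration of the Bayes-mixture idea in this abstract, the sketch below predicts a binary sequence with a weighted sum over a small finite class of Bernoulli distributions. This is a toy stand-in, not Solomonoff's scheme: the class, the uniform prior weights, and every function name are assumptions chosen for the example, and the weights are not the $2^{-K(\nu)}$ weights discussed in the paper.

```python
def mixture_predict(thetas, weights):
    """Mixture probability that the next bit is 1, given current weights.

    Each theta is the probability of a 1 under one Bernoulli model nu, and
    each weight is the current posterior weight of that model.
    """
    return sum(w * th for th, w in zip(thetas, weights))


def update(thetas, weights, bit):
    """Bayes update of the weights after observing one bit (0 or 1)."""
    likes = [th if bit == 1 else 1.0 - th for th in thetas]
    evidence = sum(w * l for w, l in zip(weights, likes))
    return [w * l / evidence for w, l in zip(weights, likes)]


def run(thetas, prior, bits):
    """Predict each bit before seeing it, then update the weights."""
    weights = list(prior)
    predictions = []
    for b in bits:
        predictions.append(mixture_predict(thetas, weights))
        weights = update(thetas, weights, b)
    return predictions, weights


if __name__ == "__main__":
    thetas = [0.2, 0.5, 0.8]          # hypothetical model class
    prior = [1.0 / 3] * 3             # hypothetical uniform prior weights
    preds, post = run(thetas, prior, [1] * 20)
    print(preds[0], preds[-1], post)
```

On a run of ones, the posterior weight shifts toward the model with the largest theta and the mixture's prediction moves toward that model's prediction, which is the qualitative behavior the abstract's loss bounds make precise.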

    Source Coding When the Side Information May Be Delayed

    For memoryless sources, delayed side information at the decoder does not improve the rate-distortion function. However, this is not the case for more general sources with memory, as demonstrated by a number of works focusing on the special case of (delayed) feedforward. In this paper, a setting is studied in which the encoder is potentially uncertain about the delay with which measurements of the side information are acquired at the decoder. Assuming a hidden Markov model for the sources, at first, a single-letter characterization is given for the set-up where the side information delay is arbitrary and known at the encoder, and the reconstruction at the destination is required to be (near) lossless. Then, with delay equal to zero or one source symbol, a single-letter characterization is given of the rate-distortion region for the case where side information may be delayed or not, unbeknownst to the encoder. The characterization is further extended to allow for additional information to be sent when the side information is not delayed. Finally, examples for binary and Gaussian sources are provided. Comment: revised July 201