
    Utilities in nonstationary 2-armed bandit.

    <p>A. Utility as a function of the mean of option 1 and the mean of option 2, with the standard deviation of both options set to 4 and discount rate γ = 0.90. B. Utility as a function of standard deviation for two discount values, when the mean is 50 for both options and the standard deviation of option 2 is 4. C. Estimates of the mean and standard deviation of options 1 and 2 as they are sampled, under a condition where the means are fixed at 45 and 55. Black line indicates choice of option 1 (y = 5) or option 2 (y = 15). Discount rate γ = 0.90. D. Same as panel C, except γ = 0.99. E. Plot of action value for the two options for the data plotted in C, γ = 0.90. F. Action value for the two options for the data plotted in panel D, γ = 0.99. G. Example sequence of samples and estimates of mean and variance, γ = 0.90, for means drawn from the generative model. H. Example sequence of samples and estimates of mean and variance, γ = 0.99.</p>
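The running estimates of each option's mean and standard deviation described in panels C-H can be sketched with a recency-weighted update rule. This is a minimal illustration, not the paper's estimator; the learning rate `alpha` is a hypothetical choice, and in a nonstationary bandit it controls how quickly old samples are forgotten.

```python
# Sketch: recency-weighted running estimates of an option's mean and
# variance, as one might track them in a nonstationary bandit.
# `alpha` is an illustrative learning rate, not a value from the figure.

def update(mean, var, reward, alpha=0.1):
    """One recency-weighted update of the mean and variance estimates."""
    delta = reward - mean                      # prediction error
    mean = mean + alpha * delta                # shift mean toward sample
    var = (1 - alpha) * (var + alpha * delta ** 2)  # decay old variance
    return mean, var
```

Applied to a stream of rewards, the estimates track drifting option means while the variance estimate reflects recent surprise.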

    Example MDP.

    <p>Note that from state 1, picking action 1 leads to a reward of 1000 and a deterministic transition to state 2. Picking action 2 from state 1 leads to a reward of 1 and a deterministic transition to state 2. Only one action is available in state 2; it leads to a reward of 1 and a deterministic self-transition.</p>
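The caption's numbers make the role of discounting concrete. A minimal sketch, assuming a discount rate of γ = 0.9 (a value used in the other figures; the caption does not fix one): state 2 is an absorbing loop paying 1 per step, so its value is a geometric series, and the two actions from state 1 differ only in their immediate reward.

```python
# Discounted action values for the two-state MDP in the figure.
# gamma = 0.9 is an assumed discount rate (the caption does not specify one).
gamma = 0.9

# State 2 self-transitions forever with reward 1: V(s2) = 1 / (1 - gamma).
v_s2 = 1.0 / (1.0 - gamma)

# From state 1, both actions lead deterministically to state 2;
# action 1 pays 1000 immediately, action 2 pays 1.
q_a1 = 1000.0 + gamma * v_s2
q_a2 = 1.0 + gamma * v_s2
```

With γ = 0.9 this gives Q(s1, a1) = 1009 vs. Q(s1, a2) = 10, so the large immediate reward dominates regardless of the shared continuation value.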

    Patch leaving foraging task.

    <p>A. State space model for the task. The full state space has been collapsed; the state space shown would be repeated once for each combination of juice and travel delay. We indicate this here by indexing the choice state by these variables. B. Difference in action value for staying in the patch vs. traveling, as a function of current juice and current travel delay. Yellow line indicates the point of indifference between staying and traveling. C. Difference in action value (same data as plotted in B), with each line representing a different travel delay. D. Average time in patch as a function of current travel delay. Note that the curve is discontinuous because of the discretization of the problem. E. Difference in utility with an infinite, undiscounted time horizon. Yellow line indicates the point of indifference between staying and traveling. F. Difference in action values with an undiscounted, infinite time horizon. Note that the travel delay does not affect value, as would be expected.</p>

    Sampling foraging task.

    <p>A. State space model for the foraging task. The numbers in the circles indicate one of the offer pairs. As there were 6 available individual gambles, 15 offer pairs were possible in each foraging round. The bottom of the panel shows the gambles that would be available in a specific foraging bout. On each trial subjects are shown a randomly sampled pair from the 6. If they accept the pair, they move on to the decision stage. If they sample again, a new pair is shown, and they must again decide whether to accept the pair or sample again, and so on. B. Expected value of accepting the current gamble or sampling again, for an example sequence of draws. The option pair below each trial number is the pair presented on that trial.</p>
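The accept-vs-sample comparison in panel B can be written as a fixed point: the value of sampling again is the average, over the offer pairs, of the better of accepting that pair or sampling once more, minus the sampling cost. The sketch below is illustrative only; the pair values and cost are hypothetical stand-ins, not the task's actual gambles.

```python
# Sketch: value of "sample again" as a fixed point. Each draw presents a
# uniformly random offer pair; the subject accepts it or pays a cost and
# samples again. Pair values and cost below are hypothetical.

def sampling_value(pair_values, cost, iters=1000):
    """Iterate V <- mean_i max(value_i, V) - cost to convergence."""
    v = 0.0
    for _ in range(iters):
        v = sum(max(x, v) for x in pair_values) / len(pair_values) - cost
    return v
```

An ideal forager accepts the current pair exactly when its expected value exceeds `sampling_value(...)`, which is the indifference rule traced out in panel B.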

    Beads task.

    <p>A. Example distribution of beads in the beads-in-the-jar task. B. State space for the beads task. The example sequence of draws is taken from panel C. C-F. Action value for the three choice options as a function of draws for example sequences. Bead outcomes are shown as orange and blue beads. The star indicates the first trial on which the expected value of choosing an urn is greater than the expected value of drawing again; on this trial an ideal observer would guess the urn with the highest value. Note that this is the value after seeing the bead shown in the corresponding trial. In panels C-E, the cost to sample is C(s<sub>t</sub>,a) = -0.005. In panel F, C(s<sub>t</sub>,a) = -0.025.</p>
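The belief underlying these action values is a simple Bayesian posterior over the two urns. A minimal sketch, assuming the common 80/20 bead split (an assumption for illustration; the caption does not state the figure's proportions):

```python
# Sketch: posterior probability of each urn after a sequence of bead draws.
# An 80/20 majority/minority split is assumed for illustration only.

def urn_posterior(draws, p_majority=0.8, prior=0.5):
    """P(urn 1 | draws). `draws` is a string like 'OOB' (O = orange,
    B = blue); urn 1 is mostly orange, urn 2 mostly blue."""
    p1, p2 = prior, 1.0 - prior
    for bead in draws:
        l1 = p_majority if bead == 'O' else 1.0 - p_majority
        l2 = 1.0 - p_majority if bead == 'O' else p_majority
        p1, p2 = p1 * l1, p2 * l2          # accumulate likelihoods
    return p1 / (p1 + p2)                  # normalize
```

An ideal observer would guess an urn once the expected accuracy of guessing exceeds the expected value of drawing again net of the per-sample cost (e.g. 0.005 in panels C-E), which is what the starred trials mark.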

    Bandit state space.

    <p>A. A portion of the reward distribution tree, starting from a Beta(1,1) prior for one of the bandit options. As one of the options is chosen, the outcomes traverse this tree. The number at each node indicates the posterior over the number of rewards (numerator) and the number of times the option has been sampled (denominator). B. Product space across both bandit options. Blue lines (and fractions) indicate choice of option 1, red lines (and fractions) indicate choice of option 2. The numerator and denominator of the fractions are as in panel A and define the posterior probability of a reward. Thick lines show actions that would be taken from each node by an optimal policy; thin dashed lines show options that are not taken by an optimal policy. C. Distribution of reward probabilities (i.e. rewards/choices) over a finite horizon (N = 8 choices) starting from two different beta priors (Option 1: Beta(1,1) and Option 2: Beta(2,2)), which can be interpreted as different amounts of experience with the options. These priors correspond to being in the state 1/2:2/4 indicated in panel B with a box. The solid black bar under the x axis indicates q values for which p(q) is identical. Asterisks superimposed on the plots show the means of the two distributions (0.575 and 0.585 for option 2 and option 1, respectively).</p>
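The node labels in panels A-B read directly as Beta posteriors: starting from Beta(a, b), observing r rewards in n samples gives a posterior Beta(a + r, b + n − r), whose mean is the fraction written at the node. A minimal sketch of that bookkeeping:

```python
# Sketch: Beta-Bernoulli posterior update behind the node fractions in
# panels A-B. Starting from a Beta(a, b) prior, after `rewards` successes
# in `samples` pulls the posterior mean is (a + rewards) / (a + b + samples).

def posterior_mean(a, b, rewards, samples):
    """Posterior mean reward probability of one bandit option."""
    return (a + rewards) / (a + b + samples)

# The boxed state 1/2 : 2/4 corresponds to these two priors, whose means
# are both 0.5 despite different amounts of implied experience:
p_option1 = posterior_mean(1, 1, 0, 0)  # Beta(1,1) prior, mean 1/2
p_option2 = posterior_mean(2, 2, 0, 0)  # Beta(2,2) prior, mean 2/4
```

Because the Beta(1,1) option carries less implied experience, its posterior moves more per observation, which is why the two equal-mean options are not equally attractive under an optimal finite-horizon policy.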

    Mean and standard errors across participants of the biases for each condition in Experiments 1, 2, 3 and 5.

    <p>Mean and standard errors across participants of the biases for each condition in Experiments 1, 2, 3 and 5.</p>

    Emotionally-valenced image pairs used as stimuli in Experiments 1–3.

    <p>Emotionally-valenced image pairs used as stimuli in Experiments 1–3.</p>

    Behavioral results.

    <p>All error bars are 1 s.e.m. A. Emotion bias for each group. B. Evidence vs. choice curve showing increased selection of happy faces when the evidence supports the angry face. C. Percent correct for each group.</p>

    Demographic characteristics.

    <p>UPDRS = Unified Parkinson's Disease Rating Scale; LEU = L-dopa equivalent units; DA = dopamine agonists. All values are mean ± SD. Significant differences are labelled with “*”. P-values refer to the columns indicated in brackets: Controls (column 1), PD+ICB (column 2), PD−ICB (column 3).</p>