
    DA release on successful vs. failed No-Go large-reward (NGL) trials.

    Average cue-aligned (at 0 s) change in DA release (±SEM) on successful vs. failed No-Go large-reward (NGL) trials in [15]; triangles indicate average RT for success (filled) vs. failure (unfilled) trials. Inset: average integrated DA signal (±SEM) over the first 1 s following cue onset on successful (s) vs. failed (f) trials; t(8) = 1.04, p = .16 (n.s.), one-tailed. (EPS)

    Two conceptions of a cued lever press.

    <p>(A) A latency <i>τ</i> with which to press the lever is selected in an initial cued state (‘1’), leading to completion of the press <i>τ</i> seconds later (‘2’). (B) A latency <i>τ</i> with which to press the lever is selected in an initial cued state (‘1’), leading to a state of preparedness to press <i>τ</i> seconds later (‘2’). Completion of the press (‘3’) occurs only after a subsequent interval <i>τ</i><sub><i>post</i></sub>. After a further inter-trial interval <i>τ</i><sub><i>I</i></sub>, the process begins anew.</p>

    Constant and variable hazard functions.

    <p>(A) Two different gamma densities of the time <i>T</i> at which the critic receives notification of an impending lever press. (B) Corresponding hazard functions <math><mrow><mi>h</mi><mrow><mo>(</mo><mi>t</mi><mo>^</mo><mo>)</mo></mrow><mo>=</mo><msub><mo>lim</mo><mrow><mo>Δ</mo><mi>t</mi><mo>^</mo><mo>→</mo><mn>0</mn></mrow></msub><mrow><mo>{</mo><mi>P</mi><mrow><mo>(</mo><mi>T</mi><mo>≤</mo><mi>t</mi><mo>^</mo><mo>+</mo><mo>Δ</mo><mi>t</mi><mo>^</mo><mo>|</mo><mi>T</mi><mo>&gt;</mo><mi>t</mi><mo>^</mo><mo>)</mo></mrow><mo>/</mo><mo>Δ</mo><mi>t</mi><mo>^</mo><mo>}</mo></mrow></mrow></math>. Note that the hazard function is constant in the <math><mrow><mi>G</mi><mo>(</mo><mn>1</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow></math> case, but increases with time in the <math><mrow><mi>G</mi><mo>(</mo><mn>2</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow></math> case.</p>
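
The contrast between the two hazard functions can be checked numerically. The following is a minimal sketch, not code from the paper: it assumes integer shape parameters (so the gamma survival function has a closed Erlang form) and uses the hypothetical helper names `gamma_pdf`, `gamma_survival`, and `hazard`. Because the second parameter equals 1 in both cases, it makes no difference here whether it is read as a rate or a scale.

```python
import math


def gamma_pdf(t, shape, rate=1.0):
    """Density of a Gamma(shape, rate) variable, integer shape only (Erlang)."""
    return (rate ** shape) * (t ** (shape - 1)) * math.exp(-rate * t) / math.factorial(shape - 1)


def gamma_survival(t, shape, rate=1.0):
    """P(T > t) for integer shape, using the closed-form Erlang survival function."""
    partial_sum = sum((rate * t) ** k / math.factorial(k) for k in range(shape))
    return math.exp(-rate * t) * partial_sum


def hazard(t, shape, rate=1.0):
    """Hazard h(t) = f(t) / P(T > t), the limit written in the caption."""
    return gamma_pdf(t, shape, rate) / gamma_survival(t, shape, rate)


if __name__ == "__main__":
    for t in (0.5, 1.0, 3.0, 10.0):
        print(f"t = {t:5.1f}   h_G(1,1) = {hazard(t, 1):.4f}   h_G(2,1) = {hazard(t, 2):.4f}")
```

Under these assumptions the G(1,1) hazard equals 1 at every time point (the exponential case), whereas the G(2,1) hazard reduces to t/(1 + t), which rises from 0 towards 1 as time passes.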

    General model.

    Pavlovian influences notoriously interfere with operant behaviour. Evidence suggests this interference sometimes coincides with the release of the neuromodulator dopamine in the nucleus accumbens. Suppressing such interference is one of the targets of cognitive control. Here, using the examples of active avoidance and omission behaviour, we examine the possibility that direct manipulation of the dopamine signal is an instrument of control itself. In particular, when instrumental and Pavlovian influences come into conflict, dopamine levels might be affected by the controlled deployment of a reframing mechanism that recasts the prospect of possible punishment as an opportunity to approach safety, and the prospect of future reward in terms of a possible loss of that reward. We operationalize this reframing mechanism and fit the resulting model to rodent behaviour from two paradigmatic experiments in which accumbens dopamine release was also measured. We show that in addition to matching animals’ behaviour, the model predicts dopamine transients that capture some key features of observed dopamine release at the time of discriminative cues, supporting the idea that modulation of this neuromodulator is amongst the repertoire of cognitive control strategies.

    Model parameters used for mixed-valence task and best-fitting free parameter values for good- and poor-avoiders.

    {rrew, rneu, rshk}: utilities of outcomes on press trials for each trial type. rO: utility rate associated with action other. cp: vigour cost of pressing. bshk: baseline/control signal optionally applied on shock trials. α: mixture weight determining relationship between full TD error and its dopaminergic component. τdelay: putative delay associated with application of control/reframing. {wi, }: instrumental weights. {wp+, wp−, , }: Pavlovian weights. κ: probability of deploying control/reframing.

    Pattern of prediction errors depends on the nature of communication between actor and critic.

    <p>In each case (A–C), we consider signals for three particular times of <i>T</i> at which the critic receives notice of the impending lever press: 1 s (blue), 3 s (red), and 10 s (green). Parts of the signal where there is overlap between two or more different times of <i>T</i> are plotted in black. In each case, we plot TD errors (top), TD errors convolved with symmetric kernel (middle), and TD errors convolved with ‘asymmetric’ kernel (bottom). (A) Indirect communication (<i>a</i>′′) only, <math><mrow><mi>T</mi><mo>∼</mo><mi>G</mi><mo>(</mo><mn>1</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow></math>. (B) Indirect communication (<i>a</i>′′) only, <math><mrow><mi>T</mi><mo>∼</mo><mi>G</mi><mo>(</mo><mn>2</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow></math>. (C) Both direct and indirect communication (<i>a</i>′; <i>a</i>′′), <math><mrow><mi>T</mi><mo>∼</mo><mi>G</mi><mo>(</mo><mn>2</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow></math>, with timing uncertainty (uncertainty scaling constant <i>k</i> = 0.1). Vertical dashed lines indicate times of observable events, i.e. cue presentation (<i>t</i> = 0, black) and lever presses (<i>t</i> = <i>T</i>+<i>τ</i><sub><i>post</i></sub>, coloured). Note the difference in <i>y</i>-axis scaling between (A;B) and (C). Model parameters: <i>a</i> = −1, <i>b</i> = 0, <i>r</i> = 1, <i>τ</i><sub><i>post</i></sub> = 0.5 s, <i>τ</i><sub><i>I</i></sub> = 30 s.</p>

    Optimal latency <i>τ</i>* as a function of discount factor <i>γ</i> and cost <i>a</i>.

    <p>The optimal latency <i>τ</i>* tends to decrease as either the discount factor <i>γ</i> or the cost of acting quickly <i>a</i> decreases. (A) Terminating SMDP, <i>V</i><sup><i>γ</i></sup>(<i>s</i><sub><i>t</i> + <i>τ</i></sub>) = 1, ∀<i>τ</i>, <i>γ</i>. There exists a limit on the cost of acting <i>a</i><sub>lim</sub> below which there is no solution for <i>τ</i>* (solid red line). (B) Difference between the optimal <i>τ</i>* for the cases of continuing and terminating SMDP for the case that <i>τ</i><sub><i>I</i></sub> = 30 s. As <i>τ</i><sub><i>I</i></sub> is large relative to −1/log <i>γ</i>, there is little difference <i>Δτ</i>* from the terminating case. (C) Difference between the optimal <i>τ</i>* for the cases of continuing and terminating SMDP for the case that <i>τ</i><sub><i>I</i></sub> = 1 s. In this case, future rewards hasten lever pressing as seen in the more prevalent decreases in <i>τ</i>*.</p>
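
The qualitative trend in panel (A) can be illustrated with a small grid search. This is a sketch under assumptions that the caption does not state: the value of pressing with latency τ in the terminating case is taken to be Q(τ) = a/τ + γ^τ · V, with a hyperbolic vigour cost a/τ (a < 0) and V = 1. The functions `q_value` and `optimal_latency`, and the search bounds `tau_max` and `step`, are hypothetical names chosen for illustration only.

```python
def q_value(tau, gamma, a, v_next=1.0):
    """Assumed value of pressing after latency tau: vigour cost plus discounted value."""
    return a / tau + (gamma ** tau) * v_next


def optimal_latency(gamma, a, v_next=1.0, tau_max=60.0, step=0.001):
    """Grid search for the latency maximising q_value on (0, tau_max].

    Returns None when the maximum sits at the upper edge of the grid, which is
    treated here as 'no interior solution' for the optimal latency.
    """
    n_steps = int(round(tau_max / step))
    best_index, best_q = 1, q_value(step, gamma, a, v_next)
    for i in range(2, n_steps + 1):
        q = q_value(i * step, gamma, a, v_next)
        if q > best_q:
            best_index, best_q = i, q
    if best_index == n_steps:
        return None
    return best_index * step


if __name__ == "__main__":
    for gamma, a in [(0.9, -0.1), (0.9, -0.05), (0.8, -0.1), (0.9, -10.0)]:
        print(f"gamma = {gamma}, a = {a}: tau* = {optimal_latency(gamma, a)}")
```

Under these assumptions, a smaller cost magnitude or a smaller discount factor yields a shorter optimal latency, and a sufficiently large cost leaves no interior maximum, which is in the spirit of the limit <i>a</i><sub>lim</sub> described in the caption.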

    Model parameters used for Go/No-Go task and best-fitting free parameter values.

    {rS, rL}: utilities of small and large rewards. bngl: baseline/control signal optionally applied on NGL trials. α: mixture weight determining relationship between full TD error and its dopaminergic component. τdelay: putative delay associated with application of control/reframing. : instrumental weight. : Pavlovian weights. {βτ, β}: inverse temperatures of softmax functions. κ: probability of deploying control/reframing. rO: utility rate associated with action other. {cf, cl, cp}: costs respectively associated with maintaining fixation, leaving, and pressing.

    Go/No-Go task of Syed et al. [15].

    (a) On each trial, a tone indicated whether the animal should leave the nose-poke and press a lever (‘Go’) to gain a small (GS) or large (GL) reward; or remain in the nose-poke until the tone turned off (‘No-Go’) to gain a small (NGS) or large (NGL) reward. On No-Go trials, the duration of the tone was randomly jittered between 1.7 and 1.9 s on each trial. (b,c) Average success rates and RTs (±SEM) for each trial type; the latter are split further into successful (s) and failed (f) trials (*indicates significance, p < …). (d) Average cue-aligned change in DA (±SEM) for each trial type on successful trials; triangles indicate mean RTs. Again, our focus is on DA release associated with the cue, and in this case we simply focus on the first 3 s following cue onset (i.e., before the shaded region). (e) Putative shifting of origin rightwards in the affective circumplex to suppress action in light of predicted reward (adapted from [18]). (f) Cue-evoked TD errors predicted by model. (g) Average DA predicted by model on success trials. Inset: average DA predicted by model on successful vs. failed NGL trials. (Figures b–d adapted from [15].)

    Mixed-valence task of Gentry et al. [14].

    (a) On each trial, a tone indicated whether a lever press within 10 s would yield reward, have no consequence, or avoid a scheduled shock. (b,c) Good-avoiders (upper) pressed often and quickly; poor-avoiders (lower) pressed less often and more slowly on shock and neutral trials (*indicates significance, p < …). (d) Average cue-aligned (nanomolar) NAc DA release (±SEM) on press trials for each trial type (vertical dashed lines indicate cue onset at 0 s and lever insertion at 5 s). Our focus is on DA release arising in response to the tone; the shaded region covers lever insertion and subsequent events. (e) Cue-evoked TD errors predicted by model (arbitrary units). (f) Average DA release predicted by model on press trials (arbitrarily scaled). (g) Putative shifting of origin leftwards in the affective circumplex to promote DA release and approach to safety in the active avoidance case (adapted from [18]). (h) Predicted DA release for press vs. no-press shock trials. (Figures b–d adapted from [14].)