53 research outputs found

    Reinforcement Learning Describes the Computational and Neural Processes Underlying Flexible Learning of Values and Attentional Selection

    Attention and learning are cognitive control processes that are closely related. This thesis investigates this interrelatedness by using computational models to describe the mechanisms that are shared between the two processes. Computational models describe the transformation of stimuli into observable variables (behaviour) and contain the latent mechanisms that affect this transformation. Here, I captured these mechanisms with the reinforcement learning (RL) framework, applied in two task contexts across three projects, to show 1) how attentional selection of stimuli involves the learning of values for stimuli, 2) how the learning of stimulus values is influenced by previously learned rules, and 3) how explorations of value-related mechanisms in the brain benefit from using intracranial EEG to investigate the strength of oscillatory activity in ventromedial prefrontal cortex. In the first project, the RL framework is applied to a feature-based attention task that required macaques to learn the value of stimulus features while ignoring non-relevant information. By comparing different RL schemes, I found that trial-by-trial covert attentional selections were best predicted by a model that only represents expected values for the task-relevant feature dimension. In the second project, I explore mechanisms of stimulus-feature value learning in humans in order to understand the influence of learned rules on the flexible, ongoing learning of expected values. I test the hypothesis that naive subjects will show enhanced learning of feature-specific reward associations by switching to the use of an abstract rule that associates stimuli by feature type. I found that two-thirds of subjects (n=22/32) exhibited behaviour that was best fit by a ‘flexible-rule-selection’ model.
Low-frequency oscillatory activity in frontal cortex has been associated with cognitive control and integrative brain functions; however, the relationship between expected values for stimuli and band-limited, rhythmic neural activity in the human brain is largely unknown. In the third project, I used intracranial electrocorticography (ECoG) in a proof-of-principle study to reveal spectral power signatures in the ventromedial prefrontal cortex (vmPFC) related to the expected values of stimuli predicted by an RL model for a single human subject.
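The winning RL scheme from the first project, tracking expected values only for the task-relevant feature dimension and selecting among features via a softmax, can be sketched in a few lines. This is a minimal illustration, not the thesis's actual implementation; the parameter values, and the simplification that one feature is always rewarded, are assumptions:

```python
import numpy as np

def softmax(values, beta):
    """Turn feature values into choice probabilities; beta is the inverse temperature."""
    exp_v = np.exp(beta * (values - np.max(values)))
    return exp_v / exp_v.sum()

# Expected values are tracked only for the features of the task-relevant
# dimension; features of irrelevant dimensions are simply not represented.
v = np.zeros(2)                      # two features, e.g. two colors
alpha, beta = 0.2, 3.0               # assumed learning rate and choice noise
rng = np.random.default_rng(0)
for _ in range(200):
    choice = rng.choice(2, p=softmax(v, beta))
    reward = 1.0 if choice == 0 else 0.0       # simplification: feature 0 always pays
    v[choice] += alpha * (reward - v[choice])  # Rescorla-Wagner update
print(v)  # v[0] approaches 1.0 while v[1] stays at 0.0
```

The covert attentional selection is the softmax choice itself: features with higher learned value are selected more often, which is what lets the model predict trial-by-trial selections.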

    Feature-specific prediction errors and surprise across macaque fronto-striatal circuits

    To adjust expectations efficiently, prediction errors need to be associated with the precise features that gave rise to the unexpected outcome, but this credit assignment may be problematic if stimuli differ on multiple dimensions and it is ambiguous which feature dimension caused the outcome. Here, we report a potential solution: neurons in four recorded areas of the anterior fronto-striatal networks encode prediction errors that are specific to feature values of different dimensions of attended multidimensional stimuli. The most ubiquitous prediction error occurred for the reward-relevant dimension. Feature-specific prediction error signals a) emerge on average shortly after non-specific prediction error signals, b) arise earliest in the anterior cingulate cortex and later in dorsolateral prefrontal cortex, caudate and ventral striatum, and c) contribute to feature-based stimulus selection after learning. Thus, a widely distributed feature-specific eligibility trace may be used to update synaptic weights for improved feature-based attention.
    This work was supported by grant MOP 102482 from the Canadian Institutes of Health Research (T.W.) and the Natural Sciences and Engineering Research Council of Canada (T.W.), as well as by the Brain in Action CREATE-IRTG program (M.O. and T.W.), and by grant LPDS 2012-08 from the Deutsche Akademie der Naturforscher Leopoldina (S.W.). Imaging data provided by the Duke Center for In Vivo Microscopy, an NIH Biomedical Technology Resource (NIH P41 EB015897, 1S10OD010683-01). The funders had no role in study design, data collection and analysis, the decision to publish, or the preparation of this manuscript. The authors would like to thank Hongying Wang for technical support.
    Oemisch, M.; Westendorff, S.; Azimi, M.; Hassani, SA.; Ardid-Ramírez, JS.; Tiesinga, P.; Womelsdorf, T. (2019). Feature-specific prediction errors and surprise across macaque fronto-striatal circuits. Nature Communications. 10:1-15.
https://doi.org/10.1038/s41467-018-08184-9
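The core computational idea, one prediction error per feature dimension of the attended stimulus, can be sketched directly. This is a toy illustration under assumed names and parameters, not the paper's neural or behavioral model:

```python
import numpy as np

def update(values, stimulus, reward, alpha):
    """Credit each shown feature of each dimension with its own prediction
    error, acting like a feature-specific eligibility trace.
    values: dim -> array of per-feature expected values.
    stimulus: dim -> index of the feature shown on this trial."""
    pes = {dim: reward - values[dim][feat] for dim, feat in stimulus.items()}
    for dim, feat in stimulus.items():
        values[dim][feat] += alpha * pes[dim]
    return values, pes

# Two dimensions (color, motion); only color predicts reward, so only the
# color features develop strongly separated values.
values = {"color": np.zeros(2), "motion": np.zeros(2)}
for _ in range(50):
    values, _ = update(values, {"color": 0, "motion": 0}, reward=1.0, alpha=0.2)
    values, _ = update(values, {"color": 1, "motion": 0}, reward=0.0, alpha=0.2)
print(values["color"], values["motion"])  # color separates (~1.0 vs 0.0); motion stays intermediate
```

Because every shown feature carries its own error signal, credit lands on the dimension that actually predicts the outcome, which is the credit-assignment problem the abstract describes.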

    A computational psychiatry approach identifies how alpha-2A noradrenergic agonist Guanfacine affects feature-based reinforcement learning in the macaque

    Noradrenaline is believed to support cognitive flexibility through the alpha-2A noradrenergic receptor (a2A-NAR) acting in prefrontal cortex. Enhanced flexibility has been inferred from improved working memory with the a2A-NAR agonist Guanfacine. But it has been unclear whether Guanfacine improves specific attention and learning mechanisms beyond working memory, and whether the drug effects can be formalized computationally to allow single-subject predictions. We tested and confirmed these suggestions in a case study with a healthy nonhuman primate performing a feature-based reversal learning task, evaluating performance using Bayesian and reinforcement learning models. In an initial dose-testing phase we found a Guanfacine dose that increased performance accuracy, decreased distractibility and improved learning. In a second experimental phase using only that dose, we examined the faster feature-based reversal learning with Guanfacine through single-subject computational modeling. Parameter estimation suggested that improved learning is not accounted for by varying a single reinforcement learning mechanism, but by a shift of the parameter set towards higher learning rates and stronger suppression of non-chosen over chosen feature information. These findings provide an important starting point for developing nonhuman primate models to discern the synaptic mechanisms of attention and learning functions within the context of a computational neuropsychiatry framework.
    This research was supported by grants from the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Ontario Ministry of Economic Development and Innovation (MEDI). We thank Dr. Hongying Wang for invaluable help with drug administration and animal care.
    Hassani, SA.; Oemisch, M.; Balcarras, M.; Westendorff, S.; Ardid-Ramírez, JS.; Van Der Meer, MA.; Tiesinga, P.... (2017).
A computational psychiatry approach identifies how alpha-2A noradrenergic agonist Guanfacine affects feature-based reinforcement learning in the macaque. Scientific Reports. 7:1-19. https://doi.org/10.1038/srep40606
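Single-subject computational modeling of the kind used in this study typically means fitting free parameters, such as the learning rate, to one subject's trial-by-trial choices by maximum likelihood, then comparing fitted parameter sets across drug conditions. A minimal sketch with a simulated subject; the task structure, parameter values, fixed inverse temperature, and grid-search fit are all illustrative assumptions:

```python
import numpy as np

def neg_log_likelihood(alpha, beta, choices, rewards, n_options=2):
    """Negative log-likelihood of one subject's choice sequence under a
    simple RL model with learning rate alpha and fixed inverse temperature beta."""
    v = np.zeros(n_options)
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        p = np.exp(beta * v) / np.exp(beta * v).sum()
        nll -= np.log(p[choice] + 1e-12)
        v[choice] += alpha * (reward - v[choice])
    return nll

# Simulate a subject with a known learning rate...
rng = np.random.default_rng(1)
true_alpha, beta = 0.4, 4.0
v, choices, rewards = np.zeros(2), [], []
for _ in range(300):
    p = np.exp(beta * v) / np.exp(beta * v).sum()
    c = int(rng.choice(2, p=p))
    r = 1.0 if rng.random() < (0.75 if c == 0 else 0.25) else 0.0
    choices.append(c)
    rewards.append(r)
    v[c] += true_alpha * (r - v[c])

# ...then estimate the learning rate by grid search over the likelihood.
grid = np.arange(0.05, 1.0, 0.05)
fits = [neg_log_likelihood(a, beta, choices, rewards) for a in grid]
best_alpha = grid[int(np.argmin(fits))]
print(best_alpha)  # maximum-likelihood estimate of the learning rate
```

A drug effect like the one reported would then appear as a systematic change in the fitted parameters (e.g. a higher learning rate on-drug), rather than as a change in any single mechanism in isolation.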

    Habits without values

    Habits form a crucial component of behavior. In recent years, key computational models have conceptualized habits as arising from model-free reinforcement learning (RL) mechanisms, which typically select between available actions based on the future value expected to result from each. Traditionally, however, habits have been understood as behaviors that can be triggered directly by a stimulus, without requiring the animal to evaluate expected outcomes. Here, we develop a computational model instantiating this traditional view, in which habits develop through the direct strengthening of recently taken actions rather than through the encoding of outcomes. We demonstrate that this model accounts for key behavioral manifestations of habits, including insensitivity to outcome devaluation and contingency degradation, as well as the effects of reinforcement schedule on the rate of habit formation. The model also explains the prevalent observation of perseveration in repeated-choice tasks as an additional behavioral manifestation of the habit system. We suggest that mapping habitual behaviors onto value-free mechanisms provides a parsimonious account of existing behavioral and neural data. This mapping may provide a new foundation for building robust and comprehensive models of the interaction of habits with other, more goal-directed types of behaviors, and help to better guide research into the neural mechanisms underlying the control of instrumental behavior more generally.
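The value-free mechanism described above, strengthening recently taken actions directly rather than via predicted outcomes, is easy to state computationally. A minimal sketch with an assumed step size and variable names, not the authors' full model:

```python
import numpy as np

def update_habit(habit, action, step):
    """Value-free habit update: strengthen whatever action was just taken and
    let the others decay. No reward or outcome term enters the equation."""
    taken = np.zeros_like(habit)
    taken[action] = 1.0
    return habit + step * (taken - habit)

# Repetition alone builds habit strength, so devaluing the outcome later
# cannot touch it: insensitivity to outcome devaluation falls out directly.
habit = np.zeros(2)
for _ in range(100):
    habit = update_habit(habit, action=0, step=0.1)
print(habit)  # habit strength for action 0 near 1.0, action 1 still 0.0
```

The contrast with model-free RL is the update target: an RL model moves values toward the received reward, whereas this update moves habit strength toward a simple indicator of which action was taken.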

    The neuro-computational role of uncertainty in anxiety

    Anxiety disorders are the most common mental health disorders and account for a large number of years lost to disability. The work in this thesis is oriented towards understanding anxiety using a computational approach, focusing on uncertainty estimation as a key process. Chapter 1 introduces the role of uncertainty within anxiety and motivates the subsequent experimental chapters. Chapter 2 is a review of the computational role of the amygdala in humans, a key area for uncertainty computation. Chapter 3 is an experimental chapter that aimed to address gaps in the literature highlighted in the preceding chapters, namely the link between sensory uncertainty processing and anxiety and the role of the amygdala in this process. This chapter focuses on the development of a novel computational hierarchical Bayesian model to quantify sensory uncertainty and its application to neuroimaging data, with intolerance of uncertainty relating to greater neural activation in the insula but not the amygdala. Chapter 4 targets the computational mechanisms underlying the negative self-bias observed in subclinical social anxiety. Again, this chapter focuses on the development of novel computational belief-update models which explicitly model uncertainty. Here, we see that a reduced trait self-positivity underpins this negative social evaluation process. The final experimental chapter, Chapter 5, investigates the link between different computational mechanisms, such as uncertainty, and a range of mood and anxiety symptomatology. This study revealed cognitive, social and somatic computational profiles that share a threat bias mechanism but have distinct negative self-bias and aversive learning signatures. Contrary to expectations, none of the uncertainty measures showed any associations with anxiety symptom subtypes.
Finally, Chapter 6 draws together the work in this thesis, discusses its limitations, and considers how these experiments contribute to our understanding of anxiety and the role of uncertainty across the anxiety spectrum.
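
    As a toy illustration of how estimated uncertainty can gate learning, consider a single-level Gaussian observer — a deliberately minimal, non-hierarchical sketch with invented numbers, not the models developed in the thesis. The less certain the current belief, the larger the update, and each observation shrinks the posterior variance:

```python
# Minimal Gaussian belief update: mu is the current estimate of a
# stimulus feature, sigma2 its uncertainty (posterior variance).
mu, sigma2 = 0.0, 4.0     # prior belief (illustrative values)
obs_noise = 1.0           # assumed sensory noise variance

for y in [0.8, 1.1, 0.9, 1.2]:          # noisy observations
    k = sigma2 / (sigma2 + obs_noise)   # uncertainty-weighted learning rate
    mu += k * (y - mu)                  # larger update when more uncertain
    sigma2 *= (1 - k)                   # uncertainty shrinks with each datum
```

    A hierarchical model of the kind described above would additionally learn the noise and volatility terms themselves, rather than fixing them as constants.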

    Decisions, decisions, decisions: the development and plasticity of reinforcement learning, social and temporal decision making in children

    Get PDF
    Human decision-making is the flexible way people respond to their environment, take actions, and plan toward long-term goals. It is commonly thought that humans rely on distinct decision-making systems, which are either more habitual and reflexive or deliberate and calculated. How we make decisions can provide insight into our social functioning, mental health and underlying psychopathology, and ability to consider the consequences of our actions. Notably, the ability to choose appropriately between habitual and deliberate decision modes depending on the context, here referred to as metacontrol, remains underexplored in developmental samples. This thesis aims to investigate the development of different decision-making mechanisms in middle childhood (ages 5-13) and to illuminate the potential neurocognitive mechanisms underlying value-based decision-making. Using a novel sequential decision-making task, the first experimental chapter presents robust markers of model-based decision-making in childhood (N = 85), reflecting the ability to plan through a sequential task structure, in contrast to previous developmental studies. Using the same paradigm in a new sample, with both behavioral (N = 69) and MRI-based (N = 44) measures, the second experimental chapter explores the neurocognitive mechanisms that may underlie model-based decision-making and its metacontrol in childhood and links individual differences in inhibition and cortical thickness to metacontrol. The third experimental chapter explores the potential plasticity of social and intertemporal decision-making in a longitudinal executive function training paradigm (N = 205) and initial relationships with executive functions. Finally, I critically discuss the results presented in this thesis and their implications and outline directions for future research in the neurocognitive underpinnings of decision-making during development.
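
    The core of model-based evaluation in sequential tasks of this kind can be written compactly: first-stage action values are computed by combining a transition model with learned second-stage values, rather than cached directly from experienced outcomes. This is a schematic sketch with invented numbers, not the specific task model used in the thesis:

```python
import numpy as np

# Illustrative two-step structure: two first-stage actions lead
# probabilistically to two second-stage states.
T = np.array([[0.7, 0.3],        # P(state | action 0)
              [0.3, 0.7]])       # P(state | action 1)
V_stage2 = np.array([1.0, 0.2])  # learned second-stage state values

# Model-based first-stage values: expected second-stage value
# under the transition model.
Q_mb = T @ V_stage2
```

    A purely model-free learner would instead cache first-stage values from experienced outcomes, so the two systems make different predictions when transitions or second-stage values change — the dissociation such tasks exploit.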

    The brain as a generative model: information-theoretic surprise in learning and action

    Get PDF
    Our environment is rich with statistical regularities, such as a sudden cold gust of wind indicating a potential change in weather. A combination of theoretical work and empirical evidence suggests that humans embed this information in an internal representation of the world. This generative model is used to perform probabilistic inference, which may be approximated through surprise minimization. This process rests on current beliefs enabling predictions, with expectation violation amounting to surprise. Through repeated interaction with the world, beliefs become more accurate and grow more certain over time. Perception and learning may be accounted for by minimizing surprise of current observations, while action is proposed to minimize expected surprise of future events. This framework thus shows promise as a common formulation for different brain functions. The work presented here adopts information-theoretic quantities of surprise to investigate both perceptual learning and action. We recorded electroencephalography (EEG) from participants in a somatosensory roving-stimulus paradigm and performed trial-by-trial modeling of cortical dynamics. Bayesian model selection indicates that early processing in somatosensory cortices encodes confidence-corrected surprise and, subsequently, Bayesian surprise, suggesting that the somatosensory system signals the surprise of observations and updates a probabilistic model that learns transition probabilities. We also extended this framework to include audition and vision in a multi-modal roving-stimulus study. Next, we studied action by investigating sensitivity to expected Bayesian surprise. Interestingly, this quantity is also known as information gain and arises as an incentive to reduce uncertainty in the active inference framework, which can correspond to surprise minimization.
In comparing active inference to a classical reinforcement learning model on the two-step decision-making task, we provided initial evidence that active inference better accounts for human model-based behaviour. This appeared to relate to participants’ sensitivity to expected Bayesian surprise and contributed to explaining exploration behaviour not accounted for by the reinforcement learning model. Overall, our findings support information-theoretic surprise as a model of perceptual learning signals that also guide human action.
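
    The surprise quantities named above can be made concrete for a simple Beta-Bernoulli observer of a binary stimulus stream — a schematic numerical sketch, not the models fitted in the study. Predictive (Shannon) surprise is the negative log probability of the observation under the current belief, while Bayesian surprise is the KL divergence from prior to posterior, i.e. how far the observation moves the belief:

```python
import numpy as np

# Beliefs over P(stimulus = 1) represented on a grid for simplicity.
grid = np.linspace(1e-4, 1 - 1e-4, 10001)
dp = grid[1] - grid[0]

def beta_density(a, b):
    """Beta(a, b) density evaluated and normalized on the grid."""
    logpdf = (a - 1) * np.log(grid) + (b - 1) * np.log(1 - grid)
    pdf = np.exp(logpdf - logpdf.max())
    return pdf / (pdf.sum() * dp)

a, b = 1.0, 1.0              # uniform prior over P(stimulus = 1)
x = 1                        # observe stimulus "1"

# Shannon (predictive) surprise: -log P(x) under the current belief.
shannon = -np.log(a / (a + b))

# Bayesian surprise: KL(posterior || prior).
prior = beta_density(a, b)
post = beta_density(a + x, b + (1 - x))
bayesian = np.sum(post * np.log(post / prior)) * dp
```

    In a roving-stimulus setting, the analogous quantities would be computed trial by trial over stimulus transition probabilities rather than a single Bernoulli rate.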