69 research outputs found
Homeostatic Reinforcement Theory Accounts for Sodium Appetitive State- and Taste-Dependent Dopamine Responding
Seeking and consuming nutrients is essential to survival and the maintenance of life. Dynamic and volatile environments require that animals learn complex behavioral strategies to obtain the necessary nutritive substances. While this has been classically viewed in terms of homeostatic regulation, recent theoretical work proposed that such strategies result from reinforcement learning processes. This theory proposed that phasic dopamine (DA) signals play a key role in signaling potentially need-fulfilling outcomes. To examine links between homeostatic and reinforcement learning processes, we focus on sodium appetite as sodium depletion triggers state- and taste-dependent changes in behavior and DA signaling evoked by sodium-related stimuli. We find that both the behavior and the dynamics of DA signaling underlying sodium appetite can be accounted for by a homeostatically regulated reinforcement learning framework (HRRL). We first optimized HRRL-based agents to sodium-seeking behavior measured in rodents. Agents successfully reproduced the state and the taste dependence of behavioral responding for sodium as well as for lithium and potassium salts. We then showed that these same agents account for the regulation of DA signals evoked by sodium tastants in a taste- and state-dependent manner. Our models quantitatively describe how DA signals evoked by sodium decrease with satiety and increase with deprivation. Lastly, our HRRL agents assigned equal preference for sodium versus the lithium-containing salts, accounting for similar behavioral and neurophysiological observations in rodents. We propose that animals use orosensory signals as predictors of the internal impact of the consumed good and our results pose clear targets for future experiments. In sum, this work suggests that appetite-driven behavior may be driven by reinforcement learning mechanisms that are dynamically tuned by homeostatic need.
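The state dependence described in this abstract can be illustrated with a minimal sketch. It assumes the convention common in homeostatic reinforcement learning that reward equals the reduction in "drive", where drive is the distance of an internal variable from its setpoint. The quadratic drive function, the setpoint of 1.0, the intake size, and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
def drive(sodium_level, setpoint=1.0):
    """Hypothetical drive: squared distance of internal sodium from its setpoint."""
    return (setpoint - sodium_level) ** 2


def reward(level_before, intake, setpoint=1.0):
    """Reward as drive reduction produced by consuming `intake` units of sodium."""
    level_after = level_before + intake
    return drive(level_before, setpoint) - drive(level_after, setpoint)


# The same sodium intake is worth more the further the agent is from its setpoint.
depleted = reward(0.2, 0.1)  # large deficit: strongly rewarding
mild = reward(0.8, 0.1)      # small deficit: weakly rewarding
sated = reward(1.0, 0.1)     # at setpoint: overshoot, so reward is negative

print(depleted, mild, sated)
```

Under these assumptions the reward for an identical outcome falls as the deficit shrinks, which is the qualitative pattern the abstract reports for sodium-evoked DA signals across deprivation and satiety.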
Nucleus Accumbens Neurons Are Innately Tuned for Rewarding and Aversive Taste Stimuli, Encode Their Predictors, and Are Linked to Motor Output
The nucleus accumbens (NAc) is a key component of the brain's reward pathway, yet little is known of how NAc cells respond to primary rewarding or aversive stimuli. Here, naive rats received brief intraoral infusions of sucrose and quinine paired with cues in a classical conditioning paradigm while the electrophysiological activity of individual NAc neurons was recorded. NAc neurons (102) were typically inhibited by sucrose (39 of 52, 75%) or excited by quinine (30 of 40, 75%) infusions. Changes in firing rate were correlated with the oromotor response to intraoral infusions. Most taste-responsive neurons responded to only one of the stimuli. NAc neurons developed responses to the cues paired with sucrose and quinine. Thus, NAc neurons are innately tuned to rewarding and aversive stimuli and rapidly develop responses to predictive cues. The results indicate that the output of the NAc is very different when rats taste rewarding versus aversive stimuli.
Prolonged High Fat Diet Reduces Dopamine Reuptake without Altering DAT Gene Expression
The development of diet-induced obesity (DIO) can potently alter multiple aspects of dopamine signaling, including dopamine transporter (DAT) expression and dopamine reuptake. However, the time-course of diet-induced changes in DAT expression and function and whether such changes are dependent upon the development of DIO remain unresolved. Here, we fed rats a high (HFD) or low (LFD) fat diet for 2 or 6 weeks. Following diet exposure, rats were anesthetized with urethane and striatal DAT function was assessed by electrically stimulating the dopamine cell bodies in the ventral tegmental area (VTA) and recording resultant changes in dopamine concentration in the ventral striatum using fast-scan cyclic voltammetry. We also quantified the effect of HFD on membrane-associated DAT in striatal cell fractions from a separate group of rats following exposure to the same diet protocol. Notably, none of our treatment groups differed in body weight. We found a deficit in the rate of dopamine reuptake in HFD rats relative to LFD rats after 6 but not 2 weeks of diet exposure. Additionally, the increase in evoked dopamine following a pharmacological challenge of cocaine was significantly attenuated in HFD relative to LFD rats. Western blot analysis revealed that there was no effect of diet on total DAT protein. However, 6 weeks of HFD exposure significantly reduced the 50 kDa DAT isoform in a synaptosomal membrane-associated fraction, but not in a fraction associated with recycling endosomes. Our data provide further evidence for diet-induced alterations in dopamine reuptake independent of changes in DAT production and demonstrate that such changes can manifest without the development of DIO.
Real-time chemical responses in the nucleus accumbens differentiate rewarding and aversive stimuli
Rewarding and aversive stimuli evoke very different patterns of behavior and are rapidly discriminated. Here taste stimuli of opposite hedonic valence evoked opposite patterns of dopamine and metabolic activity within milliseconds in the nucleus accumbens. This rapid encoding may serve to guide ongoing behavioral responses and promote plastic changes in underlying circuitry.
Dopamine Operates as a Subsecond Modulator of Food Seeking
The dopamine projection to the nucleus accumbens has been implicated in behaviors directed toward the acquisition and consumption of natural rewards. The neurochemical studies that established this link made time-averaged measurements over minutes, and so the precise temporal relationship between dopamine changes and these behaviors is not known. To resolve this, we sampled dopamine every 100 msec using fast-scan cyclic voltammetry at carbon-fiber microelectrodes in the nucleus accumbens of rats trained to press a lever for sucrose. Cues that signal the opportunity to respond for sucrose evoked dopamine release (67 ± 20 nM) with short latency (0.2 ± 0.1 sec onset). When the same cues were presented to rats naive to the cue-sucrose pairing, similar dopamine signals were not observed. Thus, cue-evoked increases in dopamine in trained rats reflected a learned association between the cues and sucrose availability. Lever presses for sucrose occurred at the peak of the dopamine surges. After lever presses, and while sucrose was delivered and consumed, no further increases in dopamine were detected. Rather, dopamine returned to baseline levels. Together, the results strongly implicate subsecond dopamine signaling in the nucleus accumbens as a real-time modulator of food-seeking behavior.
Regional specificity in the real-time development of phasic dopamine transmission patterns during acquisition of a cue-cocaine association in rats
Drug seeking is significantly regulated by drug-associated cues, and associative learning between environmental cues and cocaine reward is mediated by dopamine transmission within the nucleus accumbens (NAc). However, dopamine transmission during early acquisition of a cue-cocaine association has never been assessed because of the technical difficulties associated with resolving cue-evoked and cocaine-evoked dopamine release within the same conditioning trial. Here, we used fast-scan cyclic voltammetry to measure sub-second fluctuations in dopamine concentration within the NAc core and shell during the initial acquisition of a cue-cocaine Pavlovian association. Within the NAc core, cue-evoked dopamine release developed during conditioning. However, within the NAc shell, the predictive cue appeared to cause an unconditioned decrease in dopamine concentration. The pharmacological effects of cocaine also differed between sub-regions, as cocaine increased phasic dopamine release events within the NAc shell but not the core. Thus, real-time measurements not only revealed the initial development of a conditioned neurochemical response but also demonstrated differential phasic dopamine transmission patterns across NAc sub-regions during the acquisition of a cue-cocaine association.
Sources contributing to the average extracellular concentration of dopamine in the nucleus accumbens
Mesolimbic dopamine neurons fire in both tonic and phasic modes resulting in detectable extracellular levels of dopamine in the nucleus accumbens (NAc). In the past, different techniques have targeted dopamine levels in the NAc to establish a basal concentration. In this study, we used in vivo fast-scan cyclic voltammetry (FSCV) in the NAc of awake, freely moving rats. The experiments were primarily designed to capture changes in dopamine caused by phasic firing - that is, the measurement of dopamine 'transients'. These FSCV measurements revealed for the first time that spontaneous dopamine transients constitute a major component of extracellular dopamine levels in the NAc. A series of experiments were designed to probe regulation of extracellular dopamine. Lidocaine was infused into the ventral tegmental area, the site of dopamine cell bodies, to arrest neuronal firing. While there was virtually no instantaneous change in dopamine concentration, longer sampling revealed a decrease in dopamine transients and a time-averaged decrease in the extracellular level. Dopamine transporter inhibition using intravenous GBR12909 injections increased extracellular dopamine levels, changing both the frequency and size of dopamine transients in the NAc. To further unmask the mechanics governing extracellular dopamine levels we used intravenous injection of the vesicular monoamine transporter (VMAT2) inhibitor, tetrabenazine, to deplete dopamine storage and increase cytoplasmic dopamine in the nerve terminals. Tetrabenazine almost abolished phasic dopamine release but increased extracellular dopamine to ~500 nM, presumably by inducing reverse transport by the dopamine transporter (DAT). Taken together, data presented here show that average extracellular dopamine in the NAc is low (20-30 nM) and largely arises from phasic dopamine transients.
Dynamic excitatory and inhibitory gain modulation can produce flexible, robust and optimal decision-making
Behavioural and neurophysiological studies in primates have increasingly shown the involvement of urgency signals during the temporal integration of sensory evidence in perceptual decision-making. Neuronal correlates of such signals have been found in the parietal cortex, and in separate studies, demonstrated attention-induced gain modulation of both excitatory and inhibitory neurons. Although previous computational models of decision-making have incorporated gain modulation, their abstract forms do not permit an understanding of the contribution of inhibitory gain modulation. Thus, the effects of co-modulating both excitatory and inhibitory neuronal gains on decision-making dynamics and behavioural performance remain unclear. In this work, we incorporate time-dependent co-modulation of the gains of both excitatory and inhibitory neurons into our previous biologically based decision circuit model. We base our computational study in the context of two classic motion-discrimination tasks performed in animals. Our model shows that by simultaneously increasing the gains of both excitatory and inhibitory neurons, a variety of the observed dynamic neuronal firing activities can be replicated. In particular, the model can exhibit winner-take-all decision-making behaviour with higher firing rates and within a significantly more robust model parameter range. It also exhibits short-tailed reaction time distributions even when operating near a dynamical bifurcation point. The model further shows that neuronal gain modulation can compensate for weaker recurrent excitation in a decision neural circuit, and support decision formation and storage. Higher neuronal gain is also suggested in the more cognitively demanding reaction time than in the fixed delay version of the task. Using the exact temporal delays from the animal experiments, fast recruitment of gain co-modulation is shown to maximize reward rate, with a timescale that is surprisingly near the experimentally fitted value. Our work provides insights into the simultaneous and rapid modulation of excitatory and inhibitory neuronal gains, which enables flexible, robust, and optimal decision-making.
Temporal-Difference Reinforcement Learning with Distributed Representations
Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents” (µAgents), each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
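The claim that a population of exponential discounters behaves hyperbolically can be checked numerically. The sketch below assumes the discount factors γ are spread uniformly over (0, 1); the abstract does not state the distribution, so this is an illustrative assumption, as are the function name and the number of agents. Under that assumption the population average of γ^d equals the integral of γ^d over [0, 1], which is 1/(1 + d), a hyperbolic curve.

```python
def mean_discount(delay, n_agents=10000):
    """Average of gamma**delay over agents with evenly spaced gammas in (0, 1)."""
    # Bin midpoints approximate a uniform distribution of discount factors (assumption).
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]
    return sum(g ** delay for g in gammas) / n_agents


delays = list(range(0, 21))
averaged = [mean_discount(d) for d in delays]        # population-level discounting
hyperbolic = [1.0 / (1.0 + d) for d in delays]       # hyperbolic curve 1 / (1 + d)

# Largest absolute difference between the two curves over the tested delays.
max_gap = max(abs(a - h) for a, h in zip(averaged, hyperbolic))
print(max_gap)
```

Each individual agent still discounts exponentially; only the average across agents follows the hyperbolic form. A different distribution of γ would give a different aggregate curve.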