46 research outputs found

    Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system

    Recent experiments have shown that animals and humans have a remarkable ability to adapt their learning rate according to the volatility of the environment. Yet the neural mechanism responsible for such adaptive learning has remained unclear. To fill this gap, we investigated a biophysically inspired, metaplastic synaptic model within the context of a well-studied decision-making network, in which synapses can change their rate of plasticity in addition to their efficacy according to a reward-based learning rule. We found that our model, which assumes that synaptic plasticity is guided by a novel surprise detection system, captures a wide range of key experimental findings and performs as well as a Bayes optimal model, with remarkably little parameter tuning. Our results further demonstrate the computational power of synaptic plasticity, and provide insights into the circuit-level computation which underlies adaptive decision-making.

    Discounting Future Reward in an Uncertain World

    Humans discount delayed relative to more immediate reward. A plausible explanation is that impatience arises partly from uncertainty, or risk, implicit in delayed reward. Existing theories of discounting-as-risk focus on a probability that delayed reward will not materialize. By contrast, we examine how uncertainty in the magnitude of delayed reward contributes to delay discounting. We propose a model wherein reward is discounted proportional to the rate of random change in its magnitude across time, termed volatility. We find evidence to support this model across three experiments (total N = 158). First, using a task where participants chose when to sell products, whose price dynamics they previously learned, we show discounting increases in line with price volatility. Second, we show that this effect pertains over naturalistic delays of up to 4 months. Using functional magnetic resonance imaging, we observe a volatility-dependent decrease in functional hippocampal–prefrontal coupling during intertemporal choice. Third, we replicate these effects in a larger online sample, finding that volatility discounting within each task correlates with baseline discounting outside of the task. We conclude that delay discounting partly reflects time-dependent uncertainty about reward magnitude, that is, volatility. Our model captures how discounting adapts to volatility, thereby partly accounting for individual differences in impatience. Our imaging findings suggest a putative mechanism whereby uncertainty reduces prospective simulation of future outcomes.

    An effect of serotonergic stimulation on learning rates for rewards apparent after long intertrial intervals

    Serotonin has widespread, but computationally obscure, modulatory effects on learning and cognition. Here, we studied the impact of optogenetic stimulation of dorsal raphe serotonin neurons in mice performing a non-stationary, reward-driven decision-making task. Animals showed two distinct choice strategies. Choices after short inter-trial intervals (ITIs) depended only on the last trial outcome and followed a win-stay-lose-switch pattern. In contrast, choices after long ITIs reflected outcome history over multiple trials, as described by reinforcement learning models. We found that optogenetic stimulation during a trial significantly boosted the rate of learning that occurred due to the outcome of that trial, but these effects were only exhibited on choices after long ITIs. This suggests that serotonin neurons modulate reinforcement learning rates, and that this influence is masked by alternate, unaffected, decision mechanisms. These results provide insight into the role of serotonin in treating psychiatric disorders, particularly its modulation of neural plasticity and learning.

    Coarse-Grained Finite-Temperature Theory for the Condensate in Optical Lattices

    In this work, we derive a coarse-grained finite-temperature theory for a Bose condensate in a one-dimensional optical lattice, in addition to a confining harmonic trap potential. We start from a two-particle irreducible (2PI) effective action on the Schwinger-Keldysh closed-time contour path. In principle, this action involves all information of equilibrium and non-equilibrium properties of the condensate and noncondensate atoms. By assuming an ansatz for the variational function, i.e., the condensate order parameter in an effective action, we derive a coarse-grained effective action, which describes the dynamics on the length scale much longer than a lattice constant. Using the variational principle, coarse-grained equations of motion for the condensate variables are obtained. These equations include a dissipative term due to collisions between condensate and noncondensate atoms, as well as noncondensate mean-field. To illustrate the usefulness of our formalism, we discuss a Landau instability of the condensate in optical lattices by using the coarse-grained generalized Gross-Pitaevskii hydrodynamics. We found that the collisional damping rate due to collisions between the condensate and noncondensate atoms changes sign when the condensate velocity exceeds a renormalized sound velocity, leading to a Landau instability consistent with the Landau criterion. Our results in this work give an insight into the microscopic origin of the Landau instability.
    Comment: 38 pages, 2 figures. Submitted to Journal of Low Temperature Physics.

    Cognitive Bias in Ambiguity Judgements: Using Computational Models to Dissect the Effects of Mild Mood Manipulation in Humans

    Positive and negative moods can be treated as prior expectations over future delivery of rewards and punishments. This provides an inferential foundation for the cognitive (judgement) bias task, now widely used for assessing affective states in non-human animals. In the task, information about affect is extracted from the optimistic or pessimistic manner in which participants resolve ambiguities in sensory input. Here, we report a novel variant of the task aimed at dissecting the effects of affect manipulations on perceptual and value computations for decision-making under ambiguity in humans. Participants were instructed to judge which way a Gabor patch (250 ms presentation) was leaning. If the stimulus leant one way (e.g. left), pressing the REWard key yielded a monetary WIN whilst pressing the SAFE key failed to acquire the WIN. If it leant the other way (e.g. right), pressing the SAFE key avoided a LOSS whilst pressing the REWard key incurred the LOSS. The size (0-100 UK pence) of the offered WIN and threatened LOSS, and the ambiguity of the stimulus (vertical being completely ambiguous) were varied on a trial-by-trial basis, allowing us to investigate how decisions were affected by differing combinations of these factors. Half the subjects performed the task in a 'Pleasantly' decorated room and were given a gift (bag of sweets) prior to starting, whilst the other half were in a bare 'Unpleasant' room and were not given anything. Although these treatments had little effect on self-reported mood, they did lead to differences in decision-making. All subjects were risk averse under ambiguity, consistent with the notion of loss aversion. Analysis using a Bayesian decision model indicated that Unpleasant Room subjects were ('pessimistically') biased towards choosing the SAFE key under ambiguity, but also weighed WINs more heavily than LOSSes compared to Pleasant Room subjects. These apparently contradictory findings may be explained by the influence of affect on different processes underlying decision-making, and the task presented here offers opportunities for further dissecting such processes.

    A neural integrator model for planning and value-based decision making of a robotics assistant

    Modern manufacturing and assembly environments are characterized by a high variability in the built process which challenges human–robot cooperation. To reduce the cognitive workload of the operator, the robot should not only be able to learn from experience but also to plan and decide autonomously. Here, we present an approach based on Dynamic Neural Fields that apply brain-like computations to endow a robot with these cognitive functions. A neural integrator is used to model the gradual accumulation of sensory and other evidence as time-varying persistent activity of neural populations. The decision to act is modeled by a competitive dynamics between neural populations linked to different motor behaviors. They receive the persistent activation pattern of the integrators as input. In the first experiment, a robot learns rapidly by observation the sequential order of object transfers between an assistant and an operator to subsequently substitute the assistant in the joint task. The results show that the robot is able to proactively plan the series of handovers in the correct order. In the second experiment, a mobile robot searches at two different workbenches for a specific object to deliver it to an operator. The object may appear at the two locations in a certain time period with independent probabilities unknown to the robot. The trial-by-trial decision under uncertainty is biased by the accumulated evidence of past successes and choices. The choice behavior over a longer period reveals that the robot achieves a high search efficiency in stationary as well as dynamic environments.
    The work received financial support from FCT through the PhD fellowships PD/BD/128183/2016 and SFRH/BD/124912/2016, the project “Neurofield” (PTDC/MAT-APL/31393/2017) and the research centre CMAT within the project UID/MAT/00013/2013.