46 research outputs found

Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system

Author: Iigaya K
Publication venue: ELIFE SCIENCES PUBLICATIONS LTD
Publication date: 09/08/2016

Recent experiments have shown that animals and humans have a remarkable ability to adapt their learning rate according to the volatility of the environment. Yet the neural mechanism responsible for such adaptive learning has remained unclear. To fill this gap, we investigated a biophysically inspired, metaplastic synaptic model within the context of a well-studied decision-making network, in which synapses can change their rate of plasticity in addition to their efficacy according to a reward-based learning rule. We found that our model, which assumes that synaptic plasticity is guided by a novel surprise detection system, captures a wide range of key experimental findings and performs as well as a Bayes optimal model, with remarkably little parameter tuning. Our results further demonstrate the computational power of synaptic plasticity, and provide insights into the circuit-level computation which underlies adaptive decision-making.

UCL Discovery

PubMed Central

Discounting Future Reward in an Uncertain World

Author: Blain B
Dolan RJ
Hauser TU
Iigaya K
Kurth-Nelson Z
Moutoussis M
Story GW
Vlaev I
Will GJ
Publication venue: 'American Psychological Association (APA)'
Publication date: 29/06/2023

Humans discount delayed relative to more immediate reward. A plausible explanation is that impatience arises partly from uncertainty, or risk, implicit in delayed reward. Existing theories of discounting-as-risk focus on a probability that delayed reward will not materialize. By contrast, we examine how uncertainty in the magnitude of delayed reward contributes to delay discounting. We propose a model wherein reward is discounted proportional to the rate of random change in its magnitude across time, termed volatility. We find evidence to support this model across three experiments (total N = 158). First, using a task where participants chose when to sell products, whose price dynamics they previously learned, we show discounting increases in line with price volatility. Second, we show that this effect pertains over naturalistic delays of up to 4 months. Using functional magnetic resonance imaging, we observe a volatility-dependent decrease in functional hippocampal–prefrontal coupling during intertemporal choice. Third, we replicate these effects in a larger online sample, finding that volatility discounting within each task correlates with baseline discounting outside of the task. We conclude that delay discounting partly reflects time-dependent uncertainty about reward magnitude, that is, volatility. Our model captures how discounting adapts to volatility, thereby partly accounting for individual differences in impatience. Our imaging findings suggest a putative mechanism whereby uncertainty reduces prospective simulation of future outcomes.

UCL Discovery

An effect of serotonergic stimulation on learning rates for rewards apparent after long intertrial intervals

Author: A Soltani
AC Butler
AG Collins
AJ Yu
AM Bornstein
AS Hart
B Lau
B Seymour
BL Jacobs
CJCH Watkins
CR Gallistel
D Lee
DA Worthy
DJ Barraclough
G Aston-Jones
GS Corrado
H Clarke
HF Kim
HS Seung
J Deakin
J Schweimer
JFM Vetencourt
JW Deakin
JY Cohen
K Doya
K Iigaya
K Preuschoff
KP Kording
KT Kishida
KW Miyazaki
LP Sugrue
M Guitart-Masip
M Luo
M Tops
MJ Crockett
MR Nassar
MS Fonseca
ND Daw
P Ashourian
P Dayan
P Deurwaerdère De
P Soubrie
PA Correia
PJ Fletcher
PR Montague
QJ Huys
RD Hawkins
RJ Herrnstein
RS Sutton
S Fusi
S Gong
S Walker
S Xu
SW Lee
TE Behrens
W Schultz
Y Loewenstein
Y Sakai
YL Boureau
Z Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Serotonin has widespread, but computationally obscure, modulatory effects on learning and cognition. Here, we studied the impact of optogenetic stimulation of dorsal raphe serotonin neurons in mice performing a non-stationary, reward-driven decision-making task. Animals showed two distinct choice strategies. Choices after short inter-trial-intervals (ITIs) depended only on the last trial outcome and followed a win-stay-lose-switch pattern. In contrast, choices after long ITIs reflected outcome history over multiple trials, as described by reinforcement learning models. We found that optogenetic stimulation during a trial significantly boosted the rate of learning that occurred due to the outcome of that trial, but these effects were only exhibited on choices after long ITIs. This suggests that serotonin neurons modulate reinforcement learning rates, and that this influence is masked by alternate, unaffected, decision mechanisms. These results provide insight into the role of serotonin in treating psychiatric disorders, particularly its modulation of neural plasticity and learning.

Retrospective model-based inference guides model-free credit assignment

Author: A Lak
BB Doll
BW Balleine
CD Adams
CD Gipson
CK Starkweather
EJ Wagenmakers
ES Bromberg-Martin
F Cushman
HH Yin
J Gläscher
K Iigaya
M Keramati
M Vasconcelos
N Kriegeskorte
ND Daw
P Smittenaar
R Kiani
R Moran
RA Rescorla
RJ Dolan
RPN Rao
S Kakade
S Killcross
S Wan Lee
SJ Gershman
SP Singh
TR Zentall
VV Valentin
W Schultz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

An extensive reinforcement learning literature shows that organisms assign credit efficiently, even under conditions of state uncertainty. However, little is known about credit-assignment when state uncertainty is subsequently resolved. Here, we address this problem within the framework of an interaction between model-free (MF) and model-based (MB) control systems. We present and support experimentally a theory of MB retrospective-inference. Within this framework, an MB system resolves uncertainty that prevailed when actions were taken, thus guiding an MF credit-assignment. Using a task in which there was initial uncertainty about the lotteries that were chosen, we found that when participants’ momentary uncertainty about which lottery had generated an outcome was resolved by provision of subsequent information, participants preferentially assigned credit within an MF system to the lottery they retrospectively inferred was responsible for this outcome. These findings extend our knowledge about the range of MB functions and the scope of system interactions.

City Research Online

Crossref

Directory of Open Access Journals

UCL Discovery

MPG.PuRe

Coarse-Grained Finite-Temperature Theory for the Condensate in Optical Lattices

Author: A. Griffin
A.A. Abrikosov
A.M. Rey
B. Wu
C. Kollath
C. Menotti
C.D. Fertig
C.J. Pethick
C.W. Gardiner
D. Jaksch
E. Altman
E. Taylor
E. Zaremba
F. Dalfovo
F. Ferlaino
F.S. Cataliotti
H. Haug
H. Shi
H.T.C. Stoof
I. Bloch
I. Danshita
J.E. Williams
J.M. Cornwall
J.S. Schwinger
K. Iigaya
K. Temme
K. Xu
L. Fallani
L. Pitaevskii
L. Sarlo De
L.D. Landau
L.P. Kadanoff
L.P. Pitaevskii
L.V. Keldysh
M. Albiez
M. Greiner
M. Krämer
M. Machholm
M. Modugno
N.P. Proukakis
O. Morsch
P. Danielewicz
P. Navez
P.O. Fedichev
R. Walser
R.A. Duine
S. Burger
S. Giorgini
S. Konabe
S. Stringari
S. Tsuchiya
T. Gasenzer
T. Nikuni
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/07/2007

In this work, we derive a coarse-grained finite-temperature theory for a Bose condensate in a one-dimensional optical lattice, in addition to a confining harmonic trap potential. We start from a two-particle irreducible (2PI) effective action on the Schwinger-Keldysh closed-time contour path. In principle, this action involves all information of equilibrium and non-equilibrium properties of the condensate and noncondensate atoms. By assuming an ansatz for the variational function, i.e., the condensate order parameter in an effective action, we derive a coarse-grained effective action, which describes the dynamics on the length scale much longer than a lattice constant. Using the variational principle, coarse-grained equations of motion for the condensate variables are obtained. These equations include a dissipative term due to collisions between condensate and noncondensate atoms, as well as noncondensate mean-field. To illustrate the usefulness of our formalism, we discuss a Landau instability of the condensate in optical lattices by using the coarse-grained generalized Gross-Pitaevskii hydrodynamics. We found that the collisional damping rate due to collisions between the condensate and noncondensate atoms changes sign when the condensate velocity exceeds a renormalized sound velocity, leading to a Landau instability consistent with the Landau criterion. Our results in this work give an insight into the microscopic origin of the Landau instability. Comment: 38 pages, 2 figures. Submitted to Journal of Low Temperature Physics.

arXiv.org e-Print Archive

Crossref

Cognitive Bias in Ambiguity Judgements: Using Computational Models to Dissect the Effects of Mild Mood Manipulation in Humans

Author: A Schick
AM Isen
Aurelie Jolivald
BM Spruijt
CA Hales
CE Hernandez
D Kahneman
D Nettle
D Watson
E Bethell
E Eldar
EJ Bethell
EJ Harding
EL Gibson
Elizabeth Paul
ES Paul
Giuseppe di Pellegrino
H Anisman
I Blanchette
Iain D. Gilchrist
IH Gotlib
J Papciak
J Van der Harst
JA Russel
K Berridge
K Iigaya
K Mogg
Kiyohito Iigaya
L Gygax
L Whiteley
M Guitart-Masip
M Mendl
MH Anderson
Michael Mendl
MJ Crockett
OH Burman
P Bongers
P Willner
Peter Dayan
QJ Huys
R Ratcliff
RE Doyle
RM Nesse
S Lissek
S Mineka
SM Matheson
SM Tom
TE Nygren
Wittawat Jitkrittum
X Zhang
Y Bar-Haim
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/11/2016

Positive and negative moods can be treated as prior expectations over future delivery of rewards and punishments. This provides an inferential foundation for the cognitive (judgement) bias task, now widely-used for assessing affective states in non-human animals. In the task, information about affect is extracted from the optimistic or pessimistic manner in which participants resolve ambiguities in sensory input. Here, we report a novel variant of the task aimed at dissecting the effects of affect manipulations on perceptual and value computations for decision-making under ambiguity in humans. Participants were instructed to judge which way a Gabor patch (250ms presentation) was leaning. If the stimulus leant one way (e.g. left), pressing the REWard key yielded a monetary WIN whilst pressing the SAFE key failed to acquire the WIN. If it leant the other way (e.g. right), pressing the SAFE key avoided a LOSS whilst pressing the REWard key incurred the LOSS. The size (0-100 UK pence) of the offered WIN and threatened LOSS, and the ambiguity of the stimulus (vertical being completely ambiguous) were varied on a trial-by-trial basis, allowing us to investigate how decisions were affected by differing combinations of these factors. Half the subjects performed the task in a 'Pleasantly' decorated room and were given a gift (bag of sweets) prior to starting, whilst the other half were in a bare 'Unpleasant' room and were not given anything. Although these treatments had little effect on self-reported mood, they did lead to differences in decision-making. All subjects were risk averse under ambiguity, consistent with the notion of loss aversion. Analysis using a Bayesian decision model indicated that Unpleasant Room subjects were ('pessimistically') biased towards choosing the SAFE key under ambiguity, but also weighed WINS more heavily than LOSSes compared to Pleasant Room subjects. 
These apparently contradictory findings may be explained by the influence of affect on different processes underlying decision-making, and the task presented here offers opportunities for further dissecting such processes.

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

MPG.PuRe

Explore Bristol Research

FigShare

A neural integrator model for planning and value-based decision making of a robotics assistant

Author: A Agostini
A Bannat
A Billard
A Zunino
AA Koulakov
B Lau
BJ Rhodes
BR Cox
C Faubel
CD Brody
CE Curtis
CR Laing
E Bicho
E Sousa
EC Silva
ED Remington
Estela Bicho
F Ferreira
Flora Ferreira
G Schöner
HS Seung
J Krüger
J Rankin
J Wang
K Iigaya
LP Sugrue
Luís Louro
M Haller
M Pardowitz
MH Histed
MP Mayer
N Cain
N Sünderhauf
O Lomp
P Choe
P Tsarouchi
P Wang
Paulo Vicente
R Kozma
R Silva
R Wilcox
RJ Herrnstein
S Amari
S Coombes
S Lemaignan
SJ Hu
T Machado
W Erlhagen
W Wei
Weronika Wojtak
Wolfram Erlhagen
Y LeCun
Y Lin
Y Sakai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Modern manufacturing and assembly environments are characterized by a high variability in the built process which challenges human–robot cooperation. To reduce the cognitive workload of the operator, the robot should not only be able to learn from experience but also to plan and decide autonomously. Here, we present an approach based on Dynamic Neural Fields that apply brain-like computations to endow a robot with these cognitive functions. A neural integrator is used to model the gradual accumulation of sensory and other evidence as time-varying persistent activity of neural populations. The decision to act is modeled by a competitive dynamics between neural populations linked to different motor behaviors. They receive the persistent activation pattern of the integrators as input. In the first experiment, a robot learns rapidly by observation the sequential order of object transfers between an assistant and an operator to subsequently substitute the assistant in the joint task. The results show that the robot is able to proactively plan the series of handovers in the correct order. In the second experiment, a mobile robot searches at two different workbenches for a specific object to deliver it to an operator. The object may appear at the two locations in a certain time period with independent probabilities unknown to the robot. The trial-by-trial decision under uncertainty is biased by the accumulated evidence of past successes and choices. The choice behavior over a longer period reveals that the robot achieves a high search efficiency in stationary as well as dynamic environments. The work received financial support from FCT through the PhD fellowships PD/BD/128183/2016 and SFRH/BD/124912/2016, the project “Neurofield” (PTDC/MAT-APL/31393/2017) and the research centre CMAT within the project UID/MAT/00013/2013.

Universidade do Minho: RepositoriUM

Crossref