5,274 research outputs found
A reinforcement learning theory for homeostatic regulation
Reinforcement learning models address an animal's behavioral adaptation to its changing "external" environment, and are based on the assumption that Pavlovian, habitual, and goal-directed responses seek to maximize reward acquisition. Negative-feedback models of homeostatic regulation, on the other hand, are concerned with behavioral adaptation in response to the "internal" state of the animal, and assume that an animal's behavioral objective is to minimize deviations of key physiological variables from their hypothetical setpoints. Building upon the drive-reduction theory of reward, we propose a new analytical framework that integrates learning and regulatory systems, such that the two seemingly unrelated objectives of reward maximization and physiological stability prove to be identical. The proposed theory captures behavioral adaptation to both internal and external states in a disciplined way. We further show that the proposed framework allows for a unified explanation of several behavioral patterns, such as the motivational sensitivity of different associative learning mechanisms, anticipatory responses, interactions among competing motivational systems, and risk aversion.
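The drive-reduction idea at the heart of this framework can be sketched in a few lines (a minimal illustration, not the paper's full model; the one-dimensional state, setpoint, and outcome values are hypothetical):

```python
def drive(state, setpoint):
    """Drive: distance of the internal state vector from its setpoint."""
    return sum(abs(s - x) for x, s in zip(state, setpoint))

def reward(before, after, setpoint):
    """Drive-reduction reward: positive when an outcome moves the
    internal state closer to the setpoint, negative when it moves away."""
    return drive(before, setpoint) - drive(after, setpoint)

# An outcome that corrects a deficit is rewarding:
r_correct = reward([3.0], [4.0], [5.0])    # drive 2.0 -> 1.0, reward +1.0
# Overshooting the setpoint earns nothing, which is why reward
# maximization and physiological stability coincide here:
r_overshoot = reward([4.0], [6.0], [5.0])  # drive 1.0 -> 1.0, reward 0.0
```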
Cocaine Addiction as a Homeostatic Reinforcement Learning Disorder
Drug addiction implicates both reward learning and homeostatic regulation mechanisms of the brain. This has stimulated two partially successful theoretical perspectives on addiction. Many important aspects of addiction, however, remain to be explained within a single, unified framework that integrates the two mechanisms. Building upon a recently developed homeostatic reinforcement learning theory, the authors focus on a key transition stage of addiction that is well modeled in animals, escalation of drug use, and propose a computational theory of cocaine addiction in which cocaine reinforces behavior due to its rapid homeostatic corrective effect, whereas its chronic use induces slow and long-lasting changes in the homeostatic setpoint. Simulations show that the new theory accounts for key behavioral and neurobiological features of addiction, most notably escalation of cocaine use, drug-primed craving and relapse, individual differences underlying dose-response curves, and dopamine D2-receptor downregulation in addicts. The theory also generates unique predictions about cocaine self-administration behavior in rats that are confirmed by new experimental results. Viewing addiction as a homeostatic reinforcement learning disorder coherently explains many behavioral and neurobiological aspects of the transition to cocaine addiction, and suggests a new perspective toward understanding addiction.
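The escalation mechanism described above — reinforcement by rapid homeostatic correction combined with a slow upward drift of the setpoint — can be caricatured in a few lines (a toy sketch in made-up integer units, not the paper's actual simulation):

```python
def session(setpoint, dose=1, drift=1):
    """One self-administration session: the agent doses until the
    drug-related internal variable reaches its setpoint; chronic use
    then produces a slow, long-lasting upward shift of that setpoint."""
    state, n_doses = 0, 0
    while state < setpoint:
        state += dose        # rapid homeostatic corrective effect
        n_doses += 1
    return n_doses, setpoint + drift

setpoint, doses_per_session = 2, []
for _ in range(5):
    n, setpoint = session(setpoint)
    doses_per_session.append(n)
print(doses_per_session)  # [2, 3, 4, 5, 6]: escalation of drug use
```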
Homeostatic Reinforcement Theory Accounts for Sodium Appetitive State- and Taste-Dependent Dopamine Responding
Seeking and consuming nutrients is essential to survival and the maintenance of life. Dynamic and volatile environments require that animals learn complex behavioral strategies to obtain the necessary nutritive substances. While this has been classically viewed in terms of homeostatic regulation, recent theoretical work proposed that such strategies result from reinforcement learning processes. This theory proposed that phasic dopamine (DA) signals play a key role in signaling potentially need-fulfilling outcomes. To examine links between homeostatic and reinforcement learning processes, we focus on sodium appetite, as sodium depletion triggers state- and taste-dependent changes in behavior and in DA signaling evoked by sodium-related stimuli. We find that both the behavior and the dynamics of DA signaling underlying sodium appetite can be accounted for by a homeostatically regulated reinforcement learning framework (HRRL). We first fit HRRL-based agents to sodium-seeking behavior measured in rodents. Agents successfully reproduced the state and taste dependence of behavioral responding for sodium as well as for lithium and potassium salts. We then showed that these same agents account for the regulation of DA signals evoked by sodium tastants in a taste- and state-dependent manner. Our models quantitatively describe how DA signals evoked by sodium decrease with satiety and increase with deprivation. Lastly, our HRRL agents assigned equal preference to sodium and the lithium-containing salts, accounting for similar behavioral and neurophysiological observations in rodents. We propose that animals use orosensory signals as predictors of the internal impact of the consumed good, and our results pose clear targets for future experiments. In sum, this work suggests that appetite-driven behavior may be driven by reinforcement learning mechanisms that are dynamically tuned by homeostatic need.
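The state- and taste-dependence can be illustrated with drive-reduction logic, treating the orosensory cue as a predictor of the internal impact of the consumed good (a hypothetical one-dimensional sketch; the setpoint and per-intake impact values are invented):

```python
def drive(state, setpoint):
    """Drive: distance of the internal sodium level from its setpoint."""
    return abs(setpoint - state)

def taste_evoked_signal(state, setpoint, predicted_impact):
    """DA-like response to a tastant: the drive reduction that the
    orosensory cue predicts the consumed good will produce."""
    return drive(state, setpoint) - drive(state + predicted_impact, setpoint)

SETPOINT, IMPACT = 1.0, 0.3   # hypothetical setpoint and per-intake impact
depleted = taste_evoked_signal(0.2, SETPOINT, IMPACT)  # deprivation: positive response
replete = taste_evoked_signal(1.0, SETPOINT, IMPACT)   # satiety: response flips sign
```

The same taste thus evokes a positive signal under depletion and a null or negative one at satiety, with no change to the learning rule itself.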
A deep reinforcement learning based homeostatic system for unmanned position control
Deep Reinforcement Learning (DRL) has been shown to be capable of learning optimal control policies by minimising error in dynamic systems. In many real-world operations, however, the exact behaviour of the environment is unknown: random changes cause the system to reach different states for the same action. Applying DRL in such unpredictable environments is therefore difficult, as the states of the world cannot be known under non-stationary transition and reward functions. In this paper, a mechanism to encapsulate the randomness of the environment is suggested using a novel bio-inspired homeostatic approach based on a hybrid of the Receptor Density Algorithm (an artificial-immune-system-based anomaly detection application) and a plastic spiking neuronal model. DRL is then introduced to run in conjunction with this hybrid model. The system is tested on a vehicle autonomously re-positioning itself in an unpredictable environment. Our results show that the DRL-based process control raised the accuracy of the hybrid model by 32%.
Bayesian Learning Models of Pain: A Call to Action
Learning is fundamentally about action, enabling the successful navigation of a changing and uncertain environment. The experience of pain is central to this process, indicating the need for a change in action so as to mitigate potential threat to bodily integrity. This review considers the application of Bayesian models of learning in pain, which inherently accommodate uncertainty and action and which, we propose, are essential in understanding learning in both acute and persistent cases of pain.
Chaotic exploration and learning of locomotion behaviours
We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and stabilizes onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between the initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments, and can readapt in real time after sustaining damage.
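The adaptive-bifurcation idea — chaotic dynamics as the search engine, with a parameter change that stabilizes the orbit once it matches a goal criterion — can be illustrated with a single logistic map standing in for the oscillator network (a deliberately stripped-down sketch; the goal value here is chosen to coincide with the stable regime's fixed point):

```python
def chaotic_search(goal, tol=0.05, steps=2000):
    """Explore with chaotic dynamics (logistic map at r = 4.0); once
    the orbit satisfies the goal criterion, lower the bifurcation
    parameter so the dynamics stabilize onto the matching state."""
    x, r = 0.3, 4.0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        if r == 4.0 and abs(x - goal) < tol:
            r = 2.5   # adaptive bifurcation: chaotic -> stable regime
    return x, r < 4.0

# the r = 2.5 regime has the fixed point 1 - 1/r = 0.6; set the goal there
x, stabilized = chaotic_search(goal=0.6)
```

The chaotic phase plays the role of goal-directed exploration; the parameter drop plays the role of stabilizing onto a discovered pattern.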