112 research outputs found
Short-term plasticity as cause-effect hypothesis testing in distal reward learning
Asynchrony, overlaps and delays in sensory-motor signals introduce ambiguity
as to which stimuli, actions, and rewards are causally related. Only the
repetition of reward episodes helps distinguish true cause-effect relationships
from coincidental occurrences. In the model proposed here, a novel plasticity
rule employs short and long-term changes to evaluate hypotheses on cause-effect
relationships. Transient weights represent hypotheses that are consolidated in
long-term memory only when they consistently predict or cause future rewards.
The main objective of the model is to preserve existing network topologies when
learning with ambiguous information flows. Learning is also improved by biasing
the exploration of the stimulus-response space towards actions that in the past
occurred before rewards. The model indicates under which conditions beliefs can
be consolidated in long-term memory, suggests a solution to the
plasticity-stability dilemma, and proposes an interpretation of the role of
short-term plasticity. Comment: Biological Cybernetics, September 201
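The idea of transient weights that are consolidated only under consistent reward can be illustrated with a minimal sketch. This is not the plasticity rule from the paper: the function names, the decay factor, the learning rate, and the consolidation threshold below are all assumptions chosen for illustration.

```python
def update_synapse(w_short, w_long, trace, reward,
                   decay=0.9, lr=0.5, threshold=1.0, step=0.1):
    """One update of a hypothetical two-timescale synapse.

    w_short: transient weight (the current cause-effect hypothesis)
    w_long:  consolidated long-term weight
    trace:   eligibility of this synapse (recent co-activity), in [0, 1]
    reward:  scalar reward signal at this step
    All constants are illustrative assumptions, not values from the paper.
    """
    # The transient weight decays unless rewards keep confirming it.
    w_short = decay * w_short + lr * reward * trace
    # Consolidate only once the hypothesis has been confirmed repeatedly.
    if w_short > threshold:
        w_long += step
    return w_short, w_long


def run(rewards, trace=1.0):
    """Apply update_synapse over a sequence of rewards, starting from zero."""
    w_short, w_long = 0.0, 0.0
    for r in rewards:
        w_short, w_long = update_synapse(w_short, w_long, trace, r)
    return w_short, w_long


# Sporadic (coincidental) rewards never push the transient weight
# over the threshold, so nothing is consolidated.
sporadic = run([1, 0, 0, 0, 1, 0, 0, 0])
# Repeated rewards accumulate and trigger consolidation.
consistent = run([1, 1, 1, 1, 1])
```

In this sketch the decay plays the role of forgetting an unconfirmed hypothesis, which leaves the long-term weight untouched when rewards are coincidental.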
Confidence and psychosis: a neuro-computational account of contingency learning disruption by NMDA blockade.
A state of pathological uncertainty about environmental regularities might represent a key step in the pathway to psychotic illness. Early psychosis can be investigated in healthy volunteers under ketamine, an NMDA receptor antagonist. Here, we explored the effects of ketamine on contingency learning using a placebo-controlled, double-blind, crossover design. During functional magnetic resonance imaging, participants performed an instrumental learning task, in which cue-outcome contingencies were probabilistic and reversed between blocks. Bayesian model comparison indicated that in such an unstable environment, reinforcement learning parameters are downregulated depending on confidence level, an adaptive mechanism that was specifically disrupted by ketamine administration. Drug effects were underpinned by altered neural activity in a fronto-parietal network, which reflected the confidence-based shift to exploitation of learned contingencies. Our findings suggest that an early characteristic of psychosis lies in a persistent doubt that undermines the stabilization of behavioral policy, resulting in a failure to exploit regularities in the environment. FV was supported by the Groupe Pasteur Mutualité. RG was supported by the Fondation pour la Recherche Médicale and the Fondation Bettencourt Schueller. SP is supported by a Marie Curie Intra-European fellowship (FP7-PEOPLE-2012-IEF). AF was supported by National Health and Medical Research Council grants (IDs: 1050504 and 1066779) and an Australian Research Council Future Fellowship (ID: FT130100589). This work was supported by the Wellcome Trust and the Bernard Wolfe Health Neuroscience Fund. This is the final version of the article. It first appeared from the Nature Publishing Group via http://dx.doi.org/10.1038/mp.2015.7
Deep Reinforcement Learning with Modulated Hebbian plus Q Network Architecture
This paper presents a new neural architecture that combines a modulated
Hebbian network (MOHN) with DQN, which we call modulated Hebbian plus Q network
architecture (MOHQA). The hypothesis is that such a combination allows MOHQA to
solve difficult partially observable Markov decision process (POMDP) problems
which impair temporal difference (TD)-based RL algorithms such as DQN, as the
TD error cannot be easily derived from observations. The key idea is to use a
Hebbian network with bio-inspired neural traces in order to bridge temporal
delays between actions and rewards when confounding observations and sparse
rewards result in inaccurate TD errors. In MOHQA, DQN learns low level features
and control, while the MOHN contributes to the high-level decisions by
associating rewards with past states and actions. Thus the proposed
architecture combines two modules with significantly different learning
algorithms, a Hebbian associative network and a classical DQN pipeline,
exploiting the advantages of both. Simulations on a set of POMDPs and on the
MALMO environment show that the proposed algorithm improved DQN's results and
even outperformed control tests with A2C, QRDQN+LSTM and REINFORCE algorithms
on some POMDPs with confounding stimuli and sparse rewards.
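How decaying neural traces can bridge the delay between an action and a later reward can be sketched as follows. This is a generic reward-modulated trace update, not the MOHN or MOHQA implementation; the function name, the trace decay, and the learning rate are assumptions made for the example.

```python
def delayed_reward_update(events, reward_time, n_steps,
                          trace_decay=0.8, lr=1.0):
    """Credit past (state, action) pairs when a delayed reward arrives.

    events:      dict mapping a time step to the (state, action) pair
                 that was active at that step (hypothetical encoding)
    reward_time: the single step at which a reward of 1.0 is delivered
    Returns a dict of association weights per (state, action) pair.
    """
    traces = {}
    weights = {}
    for t in range(n_steps):
        # Every existing trace fades a little at each step.
        for key in traces:
            traces[key] *= trace_decay
        # A newly active pair gets a fresh, full-strength trace.
        if t in events:
            traces[events[t]] = 1.0
        # The reward modulates a Hebbian-style update: pairs with a
        # stronger remaining trace receive more credit.
        if t == reward_time:
            for key, e in traces.items():
                weights[key] = weights.get(key, 0.0) + lr * e
    return weights


weights = delayed_reward_update(
    events={0: ("s0", "a1"), 3: ("s3", "a0")},
    reward_time=4,
    n_steps=5,
)
```

Both pairs receive credit even though neither coincides with the reward, and the more recent pair receives more of it.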
A Hormone-Driven Epigenetic Mechanism for Adaptation in Autonomous Robots
Different epigenetic mechanisms provide biological organisms with the ability to adjust their physiology and/or morphology and adapt to a wide range of challenges posed by their environments. In particular, one type of epigenetic process, in which hormone concentrations are linked to the regulation of hormone receptors, has been shown to have implications for behavioral development. In this paper, taking inspiration from these biological processes, we investigate whether an epigenetic model based on the concept of hormonal regulation of receptors can provide a similarly robust and general adaptive mechanism for autonomous robots. We have implemented our model using a Koala robot, and tested it in a series of experiments in six different environments with varying challenges to negotiate. Our results, including the emergence of varied behaviors that permit the robot to exploit its current environment, demonstrate the potential of our epigenetic model as a general mechanism for adaptation in autonomous robots. Peer reviewed
On affect and self-adaptation: potential benefits of valence-controlled action-selection
Computer Systems, Imagery and Media; Algorithms and the Foundations of Software technology
Evolutionary and Computational Advantages of Neuromodulated Plasticity
The integration of modulatory neurons into evolutionary artificial neural networks is proposed here. A model of modulatory neurons was devised to describe a plasticity mechanism at the low level of synapses and neurons. No initial assumptions were made on the network structures or on the system level dynamics. The work of this thesis studied the outset of high level system dynamics that emerged employing the low level mechanism of neuromodulated plasticity. Fully-fledged control networks were designed by simulated evolution: an evolutionary algorithm could evolve networks with arbitrary size and topology using standard and modulatory neurons as building blocks. A set of dynamic, reward-based environments was implemented with the purpose of eliciting the outset of learning and memory in networks. The evolutionary time and the performance of solutions were compared for networks that could or could not use modulatory neurons. The experimental results demonstrated that modulatory neurons provide an evolutionary advantage that increases with the complexity of the control problem. Networks with modulatory neurons were also observed to evolve alternative neural control structures with respect to networks without neuromodulation. Different network topologies were observed to lead to a computational advantage such as faster input-output signal processing. The evolutionary and computational advantages induced by modulatory neurons strongly suggest the important role of neuromodulated plasticity for the evolution of networks that require temporal neural dynamics, adaptivity and memory functions
Linear and non linear measures of pupil size as a function of hypnotizability
Higher arousal and cortical excitability have been observed in high hypnotizable individuals (highs) with respect to low hypnotizables (lows), which may be due to differences in the activation of ascending activating systems. The present study investigated the possible hypnotizability-related difference in the cortical noradrenergic tone sustained by the activity of the Locus Coeruleus, which is strongly related to pupil size. Pupil size was measured during relaxation in three groups of participants, highs (N = 15), lows (N = 15) and medium hypnotizable individuals (mediums, N = 11), in the time and frequency domains and through Recurrence Quantification Analysis. ECG and Skin Conductance (SC) were monitored to extract autonomic indices of relaxation: heart interbeat intervals, the parasympathetic component of heart rate variability (RMSSD) and tonic SC (MeanTonicSC). Most variables indicated that participants relaxed throughout the session. Pupil features did not show significant differences between highs, mediums and lows, except for the spectral Band Median Frequency, which was higher in mediums than in lows and highs at the beginning, but not at the end, of the session. Thus, the present findings on pupil size cannot account for the differences in arousal and motor cortex excitability observed between highs and lows in resting conditions.
Embodied Decisions and the Predictive Brain
Decision-making has traditionally been modelled as a serial process, consisting of a number of distinct stages. The traditional account assumes that an agent first acquires the necessary perceptual evidence, by constructing a detailed inner representation of the environment, in order to deliberate over a set of possible options. Next, the agent considers her goals and beliefs, and subsequently commits to the best possible course of action. This process then repeats once the agent has learned from the consequences of her actions and subsequently updated her beliefs. Under this interpretation, the agent's body is considered merely as a means to report the decision, or to acquire the relevant goods. However, embodied cognition argues that an agent's body should be understood as a proper part of the decision-making process. Accepting this principle challenges a number of commonly held beliefs in the cognitive sciences, but may lead to a more unified account of decision-making.
This thesis explores an embodied account of decision-making using a recent framework known as predictive processing. This framework has been proposed by some as a functional description of neural activity. However, if it is approached from an embodied perspective, it can also offer a novel account of decision-making that extends the scope of our explanatory considerations out beyond the brain and the body. We explore work in the cognitive sciences that supports this view, and argue that decision theory can benefit from adopting an embodied and predictive perspective.