112 research outputs found

    Short-term plasticity as cause-effect hypothesis testing in distal reward learning

    Asynchrony, overlaps and delays in sensory-motor signals introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause-effect relationships from coincidental occurrences. In the model proposed here, a novel plasticity rule employs short- and long-term changes to evaluate hypotheses on cause-effect relationships. Transient weights represent hypotheses that are consolidated in long-term memory only when they consistently predict or cause future rewards. The main objective of the model is to preserve existing network topologies when learning with ambiguous information flows. Learning is also improved by biasing the exploration of the stimulus-response space towards actions that in the past occurred before rewards. The model indicates under which conditions beliefs can be consolidated in long-term memory, suggests a solution to the plasticity-stability dilemma, and proposes an interpretation of the role of short-term plasticity. Comment: Biological Cybernetics, September 201

    Confidence and psychosis: a neuro-computational account of contingency learning disruption by NMDA blockade.

    A state of pathological uncertainty about environmental regularities might represent a key step in the pathway to psychotic illness. Early psychosis can be investigated in healthy volunteers under ketamine, an NMDA receptor antagonist. Here, we explored the effects of ketamine on contingency learning using a placebo-controlled, double-blind, crossover design. During functional magnetic resonance imaging, participants performed an instrumental learning task, in which cue-outcome contingencies were probabilistic and reversed between blocks. Bayesian model comparison indicated that in such an unstable environment, reinforcement learning parameters are downregulated depending on confidence level, an adaptive mechanism that was specifically disrupted by ketamine administration. Drug effects were underpinned by altered neural activity in a fronto-parietal network, which reflected the confidence-based shift to exploitation of learned contingencies. Our findings suggest that an early characteristic of psychosis lies in a persistent doubt that undermines the stabilization of behavioral policy, resulting in a failure to exploit regularities in the environment. FV was supported by the Groupe Pasteur Mutualité. RG was supported by the Fondation pour la Recherche Médicale and the Fondation Bettencourt Schueller. SP is supported by a Marie Curie Intra-European fellowship (FP7-PEOPLE-2012-IEF). AF was supported by National Health and Medical Research Council grants (IDs: 1050504 and 1066779) and an Australian Research Council Future Fellowship (ID: FT130100589). This work was supported by the Wellcome Trust and the Bernard Wolfe Health Neuroscience Fund. This is the final version of the article. It first appeared from the Nature Publishing Group via http://dx.doi.org/10.1038/mp.2015.7

    Deep Reinforcement Learning with Modulated Hebbian plus Q Network Architecture

    This paper presents a new neural architecture that combines a modulated Hebbian network (MOHN) with DQN, which we call modulated Hebbian plus Q network architecture (MOHQA). The hypothesis is that such a combination allows MOHQA to solve difficult partially observable Markov decision process (POMDP) problems which impair temporal difference (TD)-based RL algorithms such as DQN, as the TD error cannot be easily derived from observations. The key idea is to use a Hebbian network with bio-inspired neural traces in order to bridge temporal delays between actions and rewards when confounding observations and sparse rewards result in inaccurate TD errors. In MOHQA, DQN learns low level features and control, while the MOHN contributes to the high-level decisions by associating rewards with past states and actions. Thus the proposed architecture combines two modules with significantly different learning algorithms, a Hebbian associative network and a classical DQN pipeline, exploiting the advantages of both. Simulations on a set of POMDPs and on the MALMO environment show that the proposed algorithm improved DQN's results and even outperformed control tests with A2C, QRDQN+LSTM and REINFORCE algorithms on some POMDPs with confounding stimuli and sparse rewards

    A Hormone-Driven Epigenetic Mechanism for Adaptation in Autonomous Robots

    Different epigenetic mechanisms provide biological organisms with the ability to adjust their physiology and/or morphology and adapt to a wide range of challenges posed by their environments. In particular, one type of epigenetic process, in which hormone concentrations are linked to the regulation of hormone receptors, has been shown to have implications for behavioral development. In this paper, taking inspiration from these biological processes, we investigate whether an epigenetic model based on the concept of hormonal regulation of receptors can provide a similarly robust and general adaptive mechanism for autonomous robots. We have implemented our model using a Koala robot, and tested it in a series of experiments in six different environments with varying challenges to negotiate. Our results, including the emergence of varied behaviors that permit the robot to exploit its current environment, demonstrate the potential of our epigenetic model as a general mechanism for adaptation in autonomous robots. Peer reviewed.

    On affect and self-adaptation: potential benefits of valence-controlled action-selection

    Computer Systems, Imagery and Media; Algorithms and the Foundations of Software Technology

    Evolutionary and Computational Advantages of Neuromodulated Plasticity

    The integration of modulatory neurons into evolutionary artificial neural networks is proposed here. A model of modulatory neurons was devised to describe a plasticity mechanism at the low level of synapses and neurons. No initial assumptions were made on the network structures or on the system level dynamics. The work of this thesis studied the outset of high level system dynamics that emerged employing the low level mechanism of neuromodulated plasticity. Fully-fledged control networks were designed by simulated evolution: an evolutionary algorithm could evolve networks with arbitrary size and topology using standard and modulatory neurons as building blocks. A set of dynamic, reward-based environments was implemented with the purpose of eliciting the outset of learning and memory in networks. The evolutionary time and the performance of solutions were compared for networks that could or could not use modulatory neurons. The experimental results demonstrated that modulatory neurons provide an evolutionary advantage that increases with the complexity of the control problem. Networks with modulatory neurons were also observed to evolve alternative neural control structures with respect to networks without neuromodulation. Different network topologies were observed to lead to a computational advantage such as faster input-output signal processing. The evolutionary and computational advantages induced by modulatory neurons strongly suggest the important role of neuromodulated plasticity for the evolution of networks that require temporal neural dynamics, adaptivity and memory functions

    Linear and non linear measures of pupil size as a function of hypnotizability

    Higher arousal and cortical excitability have been observed in high hypnotizable individuals (highs) with respect to low hypnotizables (lows), which may be due to differences in the activation of ascending activating systems. The present study investigated the possible hypnotizability-related difference in the cortical noradrenergic tone sustained by the activity of the Locus Coeruleus, which is strongly related to pupil size. This was measured during relaxation in three groups of participants, namely highs (N = 15), lows (N = 15) and medium hypnotizable individuals (mediums, N = 11), in the time and frequency domains and through Recurrence Quantification Analysis. ECG and Skin Conductance (SC) were monitored to extract autonomic indices of relaxation: heart interbeat intervals, the parasympathetic component of heart rate variability (RMSSD) and tonic SC (MeanTonicSC). Most variables indicated that participants relaxed throughout the session. Pupil features did not show significant differences between highs, mediums and lows, except for the spectral Band Median Frequency, which was higher in mediums than in lows and highs at the beginning, but not at the end, of the session. Thus, the present findings on pupil size cannot account for the differences in arousal and motor cortex excitability observed between highs and lows in resting conditions

    Embodied Decisions and the Predictive Brain

    Decision-making has traditionally been modelled as a serial process, consisting of a number of distinct stages. The traditional account assumes that an agent first acquires the necessary perceptual evidence, by constructing a detailed inner representation of the environment, in order to deliberate over a set of possible options. Next, the agent considers her goals and beliefs, and subsequently commits to the best possible course of action. This process then repeats once the agent has learned from the consequences of her actions and subsequently updated her beliefs. Under this interpretation, the agent’s body is considered merely as a means to report the decision, or to acquire the relevant goods. However, embodied cognition argues that an agent’s body should be understood as a proper part of the decision-making process. Accepting this principle challenges a number of commonly held beliefs in the cognitive sciences, but may lead to a more unified account of decision-making. This thesis explores an embodied account of decision-making using a recent framework known as predictive processing. This framework has been proposed by some as a functional description of neural activity. However, if it is approached from an embodied perspective, it can also offer a novel account of decision-making that extends the scope of our explanatory considerations out beyond the brain and the body. We explore work in the cognitive sciences that supports this view, and argue that decision theory can benefit from adopting an embodied and predictive perspective